USRE37811E1 - Distributed application load distribution aid tool - Google Patents

Distributed application load distribution aid tool Download PDF

Info

Publication number
USRE37811E1
Authority
US
United States
Prior art keywords
master
load
machine
machines
load data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US09/614,223
Inventor
Gerard Sitbon
Francois Urbain
Therese Saliba
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bull SA
Original Assignee
Bull SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bull SA filed Critical Bull SA
Priority to US09/614,223 priority Critical patent/USRE37811E1/en
Application granted granted Critical
Publication of USRE37811E1 publication Critical patent/USRE37811E1/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5083 - Techniques for rebalancing the load in a distributed system


Abstract

A tool at the service of a distributed application running on machines of a distributed data processing system running in a local area network, intended for balancing the load on each of the machines of the system, includes a master daemon and a plurality of agent daemons. The master and each of the agents calculate the load of the machine on which they are running. The master collects the load data of each of the agents at a first sampling interval and sends that collected load data to all of the agents. At the request of the distributed application, the local agent closest to the application indicates to the application which machine has the lightest load. The application then makes the decision to request the machine with the lightest load to execute the services the application requires. As necessary, the tool selects a master from the agents, thereby ensuring the existence and uniqueness of a master at all times, regardless of failures affecting one or more machines in the data processing system.

Description

The present invention relates to a toolkit for balancing the load of an application distributed among several machines belonging to a distributed data processing system in a local area network.
FIELD OF THE INVENTION
The current trend in the development of data processing systems is to form a data processing system through the association of a plurality of machines connected to one another through a network, for example a local area network. Any user can run applications of widely varied types on this set of machines. These applications call services, offered by all or some of these machines, which supply the information required to handle the problem or problems they are working on.
When an application in the process of running requires the use of a particular service, in current practice, it proceeds in the following manner:
either it chooses, in a purely random manner, the machine which will provide it with this service and assigns the work to this machine,
or it makes a circular choice among all the machines, which means that it assigns by turns, always in the same chronological order, the work of providing the services it requires in succession: thus, if the system has three machines, it assigns the work of providing the services it requires first to machine No. 1, those it requires second to machine No. 2, those it requires third to machine No. 3, and so on in the following order: No. 1, No. 2, No. 3, No. 1, etc.
No matter which of the two cases described above applies, none of the work of any of the machines is optimized in terms of time, and the capabilities of these machines in terms of speed and performance are used only at a level far below their maximum.
There are some known solutions which make it possible to eliminate these drawbacks: one of these is described in French patent application No. 94 08764, filed Jul. 13, 1994 by the Assignee of the subject application, under the title “Open Data Processing System with Multiple Servers”. In a system of this type, formed by the association of a central system called a client with several servers, each server calculates its own load based on the criteria specific to each application running on the client, as well as its foreseeable development over time, and transmits these two factors to the client. The latter, when a particular application requires the services of a server, chooses the one with the lightest load during the period of time in which the services must be rendered and assigns it the work of supplying the services requested.
SUMMARY OF THE INVENTION
The present invention constitutes an improvement and a generalization of the preceding solution.
According to the invention, the tool at the service of a distributed application running on the machines of a distributed data processing system in a local area network, which is intended for balancing the load on each of these machines, is characterized in that it comprises a plurality of data processing modules called DAEMONs which run on these machines, one of which is called the master, the others being agents,
the master and the agents each having means for calculating the load of the machines on which they are running, at first predetermined sampling instants, and means for storing the load data of the master and the agents,
the master containing:
means for collecting the load data of each agent, at second predetermined sampling instants,
means for sending the load data of each agent to all of the other agents,
each agent containing:
means for receiving the load data of the other agents,
the local agent closest to the application indicating to the latter, at the request of the latter, the machine with the lightest load, the application then making the decision to request this machine to execute the services it needs.
BRIEF DESCRIPTION OF THE DRAWING
Other characteristics and advantages of the present invention will become apparent from the following description given as a non-limiting example in reference to the appended drawings. In these drawings:
FIG. 1 shows a distributed data processing system which includes the load balancing toolkit according to the invention.
FIG. 2 shows the controller for distributing the master-agent roles among the various elements which constitute the load balancing toolkit according to the invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS 1) ESSENTIAL CHARACTERISTICS OF THE TOOL ACCORDING TO THE INVENTION
A) Structure:
The various essential characteristic elements of the load balancing toolkit ORC in a distributed data processing system according to the invention (for simplicity's sake, from this point forward, it will be called the "toolkit") appear in FIG. 1.
As shown in FIG. 1, the distributed data processing system, which can be of any type whatsoever, here called SID, comprises four data processing machines of any size and shape whatsoever, namely MC0, MC1, MC2, MC3. Each of these machines (small-, medium-, or large-scale computers) comprises the usual elements, namely one or more central processor units (CPUs), memories associated with the latter, input/output units (I/O units), and means for connecting to the network RE. These means are symbolically represented by two-way arrows which represent the data links between the four machines MC0 through MC3 in FIG. 1.
The toolkit ORC itself comprises the master MS0 and the three agents A1, A2, A3. Any agent can also be a master, depending on conditions which will be explained below. Both the master and the agents are constituted by data processing tools known to one skilled in the art as DAEMONs. A DAEMON is a data processing tool or entity running on a machine, which is capable of responding to a question.
Inside each of the machines MC0 through MC3, the DAEMONs MS0, A1, A2, A3 are respectively associated with shared memories MP0, MP1, MP2, MP3. Each of them contains the load of the corresponding machine, but also the loads of the other machines in SID.
FIG. 1 assumes that the distributed application is running on the machine MC2 and that it requires services supplied by the other machines MC0, MC1, MC3. This application is designated APU. The places where the master and the agents are located are independent from the place where APU is running.
B) Operation:
The following are the main lines of operation of the tool ORC, it being understood that, once communication has been established between all the machines in the system SID, MS0 is assumed to be the master and A1, A2, A3 the agents. Refer again to FIG. 1, and specifically to the arrows and the circled numbers which accompany them, which respectively indicate the direction of the information flowing between the master and the agents, and the sequence of operations.
OPERATION 1: Each agent, as well as the master, collects for the machine on which it is running, at given time intervals which constitute the first determined sampling instants ti, the load data of this machine for each of the elements which constitute it (the CPU load, the loads of the associated memories, the loads of the I/Os, the network load, etc.). From the load of each element, expressed as a percentage of its maximum allowable load, the total load of the machine in question is calculated. This is accomplished by load calculation means, respectively MCC0 for MS0, MCC1 for A1, MCC2 for A2, MCC3 for A3. These means are simply constituted by calculation programs which implement the load calculation method described below, in paragraph 2: "Method for load calculation by each of the agents." These means are naturally an integral part of the master and of each agent MS0, A1 through A3, and for this reason they are not represented per se in FIG. 1, for simplicity's sake. Once the total load of the machine in question is calculated, a set of statistical data on the load of this machine, namely DSC, is obtained. In FIG. 1, this operation is shown for the agent A2 only, obviously for the purpose of preserving the clarity of this figure.
OPERATION 2: At regular intervals, the agents send MS0 the statistical load data from the corresponding machine, through the network (for A2, through the link L2 between MC2 and MC0).
OPERATION 3: The master MS0 centralizes, at practically the same regular intervals defined for Operation 2, which constitute the second determined sampling instants Ti, all the statistical load data of all the agents, as well as its own, at the level of its associated shared memory, in this case MP0. This centralization is in fact an operation for collecting the load data. It is therefore executed by load data collection means, respectively MRC0 for MS0, MRC1 for A1, MRC2 for A2, MRC3 for A3, which are in fact collection programs integrated into the master and into each of the agents A1 through A3 and are therefore not represented in FIG. 1 for simplicity's sake.
OPERATION 4: The master MS0 sends, using sending means MTC0, all this data to each agent A1, A2, A3 through the network RE, namely through the links L1 between MC0 and MC1, L2 between MC0 and MC2, and L3 between MC0 and MC3. MTC0 is an integral part of MS0 and is therefore not represented in FIG. 1 for simplicity's sake.
OPERATION 5: Each agent receives this load data and copies it into its associated shared memory: MP1 for A1, MP2 for A2, MP3 for A3. This is accomplished by the means MRCC1 through MRCC3 for A1 through A3 respectively, which means are integral parts of the latter and are not represented in FIG. 1 for simplicity's sake.
OPERATION 6: The application APU scans the shared memory of the machine on which it is running in search of the estimated load of each of the machines; at the moment it needs determined services to be rendered, it deduces which machine has the lightest load at this moment and requests that machine to render these services.
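To make the sequence of Operations 1 through 6 more concrete, the following minimal Python sketch (not part of the patent) simulates one cycle of the exchange with plain in-memory data structures instead of real DAEMONs, shared memories or network links; the machine samples and the equal weighting factors are assumptions chosen only for the example.

    # Minimal simulation of Operations 1-6 (illustrative only: real DAEMONs would
    # sample live counters and exchange data over the network RE).

    # Operation 1: each DAEMON samples the element loads of its own machine (in %)
    # and reduces them to a total load Wt (here with equal, assumed weights).
    def total_load(elements, weights=(0.25, 0.25, 0.25, 0.25)):
        return sum(k * w for k, w in zip(weights, elements))

    # Assumed per-machine samples (CPU, memory, network, I/O), in percent.
    samples = {
        "MC0": (56, 40, 20, 30),   # the master MS0 runs here
        "MC1": (23, 30, 10, 25),
        "MC2": (32, 35, 15, 20),   # the application APU runs here
        "MC3": (96, 80, 40, 70),
    }

    # Operation 2: each agent sends its statistical load data DSC to the master.
    # Operation 3: the master centralizes all load data in its shared memory MP0.
    mp0 = {name: total_load(elems) for name, elems in samples.items()}

    # Operation 4: the master broadcasts the centralized data to every agent.
    # Operation 5: each agent copies it into its own shared memory (MP1..MP3).
    shared_memories = {name: dict(mp0) for name in samples}

    # Operation 6: APU, running on MC2, scans its local shared memory and asks
    # the machine with the lightest load to render the services it needs.
    lightest = min(shared_memories["MC2"], key=shared_memories["MC2"].get)
    print(f"APU requests services from {lightest} "
          f"(load {shared_memories['MC2'][lightest]:.1f} %)")   # MC1 with these figures

In the real tool, Operations 1 through 3 repeat at the sampling instants ti and Ti, and the broadcast of Operation 4 travels over the links L1 through L3.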
2) METHOD FOR LOAD CALCULATION BY EACH OF THE AGENTS:
Examples of loads on the CPU, memory, Input/Output, and network RE elements are described below.
The description of the method for load calculation by each of the means MCC0 through MCC3 is given in reference to Tables 1 through 4, which appear below and in which the loads are given as percentages.
TABLE 1
Sample load data stored in any shared memory associated with an agent (expressed as a %)
                            t1    t2    t3    t4    t5    t6    t7
CPU load (W1)               35    12    42    73    92    65    33
Memory load (W2)            45    32    33    67    46    32    40
Network load (W3)           12     6    33    20    12    38     5
Input/output load (W4)      25    30    56    46    78    44    32

TABLE 2
Synthesis of the series of global load data for each machine
        t1    t2    t3    t4    t5    t6    t7
MC0     56    32    67    63    79    82    54
MC1     23    34    45    56    67    62    79
MC2     32    38    34    42    35    32    36
MC3     96    94    79    82    74    79    68

TABLE 3
Extrapolation of the value of the global load after a time T for each machine
        t1    t2    t3    t4    t5    t6    t7    t8 = t7 + T (estimated)
MC0     56    32    67    63    79    82    54    73
MC1     23    34    45    56    67    62    79    82
MC2     32    38    34    42    35    32    36    36
MC3     96    94    79    82    74    79    68    76

TABLE 4
Application of the power coefficient, comparison and selection
                                               MC0    MC1    MC2    MC3
Estimated capacity                              73     82     36     76
Power coefficient                               2.5    2      0.8    1.5
Coefficient of the available capacity          67.5   36     51.2   36
  (100 − estimated load) * power coefficient
The calculation of the load for each agent and master is identical to that described in the above-mentioned French patent application. It is briefly summarized here.
The total load Wt of an agent (and also of the master) is obtained using the following formula:
Wt=k1*W1+k2*W2+k3*W3+k4*W4, in which:
W1 is the percentage of the utilization of the central processor of the agent in terms of time,
W2 is the percentage of the utilization of the memory of the agent, that is, the ratio between the storage capacity actually used and its total capacity,
W3 is the percentage of the utilization of the network by the agent, that is, the ratio between the number of pieces of information sent and received by the agent and the maximum allowable rate in the network,
W4 is the percentage of the utilization of the input/output units by the agent.
k1, k2, k3, k4 are specific weighting factors of the processor, the memory, the network, and the input/output units. Their sum is equal to 1. Their values depend on the nature of the application in the process of running, in this case APU on the machine MC2.
The loads W1, W2, W3, W4 are measured and Wt is calculated as shown in Tables 1 through 4 above, at determined sampling instants t1, t2, t3, t4, t5, t6, t7, etc. of the period T (which are, in fact, the instants ti mentioned above in the description of Operation 1).
Table 1 gives an example of load data collected by any agent, for example A1, relative to the corresponding machine MC1, for all the instants t1 through t7. This data is, of course, stored in the shared memory MP1 of the machine MC1 on which A1 is running, before it is sent to MS0.
This table shows, for example, that W1 is equal to 35 at the instant t1, W2 to 67 at the instant t4, W3 to 38 at t6, W4 to 32 at t7, and so on.
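As a small worked example of the formula Wt = k1*W1 + k2*W2 + k3*W3 + k4*W4, the following Python sketch (not part of the patent) applies assumed, equal weighting factors to the t1 column of Table 1; the actual factors used for APU are application-specific and are not published, so the result is illustrative rather than a reproduction of Table 2.

    # Table 1, column t1, for machine MC1: W1 (CPU), W2 (memory), W3 (network), W4 (I/O).
    W = {"W1": 35, "W2": 45, "W3": 12, "W4": 25}

    # Assumed weighting factors; in the patent they depend on the application APU
    # and must sum to 1, but their actual values are not given.
    k = {"W1": 0.25, "W2": 0.25, "W3": 0.25, "W4": 0.25}

    Wt = sum(k[name] * W[name] for name in W)
    print(f"Wt = {Wt:.2f} %")   # 29.25 with these assumed, equal weights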
A calculation program API associated with APU, which runs on MC2, then applies the weighting factors k1 through k4 specific to the corresponding machines for the application APU to the load data of each agent and of the master which, after the execution of Operation 3, is contained in the shared memory MP0 of MC0 associated with MS0.
Thus Table 2 is obtained, which shows, for each of the machines MC0 through MC3, the global load value Wt at the instants t1 through t7. It may be seen that, for MC0, Wt is equal to 56 at the instant t1, 32 at t2, 67 at t3, etc. For MC1, Wt is equal to 23 at t1, 34 at t2, etc., and so on for the other machines.
The following step for calculating the load for all the machines consists of estimating, by extrapolation using the known mathematical method of least squares, the value of the estimated load Wt at the instant t8=(t7+T).
Thus, Table 3 is obtained. This table makes it possible to read, for example, that the estimated values of the loads of MC0 through MC3 at this instant t8 are 73, 82, 36, and 76, respectively.
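One plausible reading of this extrapolation step, offered only as an illustration and not as the patent's exact procedure, is an ordinary linear least-squares fit over the seven samples, evaluated one period ahead. The Python sketch below does that for the MC2 row of Table 2; since the patent does not specify the fitting window or any weighting, its output only approximates the estimates of Table 3.

    def least_squares_extrapolate(values):
        """Fit y = a*t + b over t = 1..n by least squares and predict t = n + 1."""
        n = len(values)
        ts = range(1, n + 1)
        t_mean = sum(ts) / n
        y_mean = sum(values) / n
        sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, values))
        sxx = sum((t - t_mean) ** 2 for t in ts)
        slope = sxy / sxx
        intercept = y_mean - slope * t_mean
        return slope * (n + 1) + intercept

    # Global loads Wt of MC2 at t1..t7 (Table 2).
    mc2 = [32, 38, 34, 42, 35, 32, 36]
    print(f"estimated load of MC2 at t8: {least_squares_extrapolate(mc2):.1f} %")
    # about 35.7 %, close to the value of 36 shown for MC2 in Table 3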
Next, a power coefficient Cp specific to each machine is applied to the total load of each machine in order to obtain its actual available capacity rate C1, using the formula:
C1=(100−Wt(estimated))*Cp
In effect, it is important to take into account the characteristics of each machine, given that this is a heterogeneous data processing environment wherein the power, the size and the type of the machines which compose it are different. Thus, if a machine has a light load but at the same time does not have enough processing power to provide the services requested of it by the application APU at a given moment, it is obvious that another machine must provide these services. Hence the necessity for a correction factor in order to define the load, and hence the existence of the power coefficient Cp which corresponds to this purpose.
The coefficient Cp of a given machine is calculated by generating a synthesis of the power of the central processor CPU, the capacity of the memories, the processing power of the input/output units, etc. It is recalculated each time the hardware configuration of the machine is changed or its operating system is modified. Likewise, each time the general configuration of the distributed data processing system SID is changed, all the coefficients Cp of all the machines in the system are redefined. A Cp equal to 1 corresponds to a medium-scale machine, which is defined by the user.
In Table 4, it is possible to read examples of actual available capacity rates C1 for each machine MC0 through MC3. Thus, for MC0, with an estimated load of 73 and a power coefficient Cp of 2.5, the actual available capacity rate is 67.5. The corresponding figures are respectively 82, 2 and 36 for MC1, and so on for MC2 and MC3.
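The figures of Table 4 follow directly from the formula C1=(100−Wt(estimated))*Cp, as the following Python sketch shows; the final selection rule (taking the machine with the largest corrected available capacity, here MC0) is an assumption consistent with the heading of Table 4 rather than a quotation from the text.

    # Estimated loads at t8 (Table 3) and power coefficients Cp (Table 4).
    estimated_load = {"MC0": 73, "MC1": 82, "MC2": 36, "MC3": 76}
    power_coeff    = {"MC0": 2.5, "MC1": 2.0, "MC2": 0.8, "MC3": 1.5}

    # C1 = (100 - estimated load) * Cp, i.e. the available capacity corrected
    # for the power of each (heterogeneous) machine.
    available = {m: (100 - estimated_load[m]) * power_coeff[m] for m in estimated_load}
    print(available)   # {'MC0': 67.5, 'MC1': 36.0, 'MC2': 51.2, 'MC3': 36.0}

    # Assumed selection rule: request services from the machine with the largest
    # corrected available capacity.
    best = max(available, key=available.get)
    print(f"services are requested from {best}")   # MC0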
3) METHOD FOR SELECTING THE MASTER MS0:
The basic philosophy is that any DAEMON running on any machine can be a master. It is therefore important to develop a mechanism which makes it possible to define which one of them will be the master and the conditions for its selection, as well as the modalities for its replacement should it fail.
The selection mechanism must ensure that at least one DAEMON is acting as the master and that two of them cannot be masters simultaneously (especially if they start up at the same time).
It is composed of the following 5 main phases:
Phase 1: When a DAEMON starts up, it generates a unique identifier ID in conformity with the protocol used in the network RE, for example the TCP/IP protocol used in the exemplary embodiment of the invention described herein. This identifier is composed of the Ethernet address (Ethernet is the part of the TCP/IP protocol suite related to local area networks used in the exemplary embodiment described herein; it is standardized and therefore known to one skilled in the art), the moment at which the identifier is transmitted, and a random value. At the same time, the DAEMON puts itself into an intermediate state and sends these two pieces of information (its current state, its ID) through the network RE to all the machines in the network.
Phase 2: It waits to receive the identical information from the other DAEMONs, for a determined time interval Tr (on the order of 5 to 10 seconds). It is a candidate for the role of master.
Phase 3: As soon as it receives this information, it analyzes it.
If it comes from a DAEMON which is in fact a master, that is, considered to be MS0, it considers itself to be an agent.
If it comes from a DAEMON in an intermediate state, it compares its own identifier with the one it receives.
If its own identifier is lower than the one it receives, it retains the right to be the master MS0.
If its own identifier is higher than or equal to the one it receives, it cedes the position. It then retransmits the two pieces of information (its own ID, its state) and again waits for responses during the time interval Tr, also called the timer.
Phase 4: Once this time interval has elapsed, the DAEMON in question tries again. In order to avoid a loss of messages, which is always possible in the network RE, it uses the following procedure:
The transmission and the listening for responses are repeated 5 times.
If the DAEMON in question receives a response from another DAEMON which reveals itself to be an agent, it can be sure that a master MS0 exists and it waits for the response from the latter to reach it.
Phase 5: When 5 repetitions have taken place, and the DAEMON in question has not received any response from the other DAEMONs, it then decides that it is the master MS0.
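When no master exists yet and all the DAEMONs start at roughly the same time, the net effect of Phases 1 through 5 is that every DAEMON broadcasts an identifier and the one with the lowest identifier ends up as the master while the others become agents. The following Python sketch (not part of the patent) simulates that outcome; the identifier layout follows Phase 1, but the concrete MAC addresses, the absence of message loss and the single election round are simplifying assumptions.

    import random
    import time

    def make_identifier(mac):
        # Phase 1: ID built from the Ethernet (MAC) address, the moment of
        # transmission and a random value; compared here as a plain tuple.
        return (mac, time.time(), random.random())

    # Assumed MAC addresses of the machines MC0..MC3.
    daemons = {name: make_identifier(mac) for name, mac in [
        ("MC0", "08:00:20:01:00:00"),
        ("MC1", "08:00:20:01:00:01"),
        ("MC2", "08:00:20:01:00:02"),
        ("MC3", "08:00:20:01:00:03"),
    ]}

    # Phases 2-5 collapsed into one round: every DAEMON sees every other ID and,
    # per Phase 3, only the DAEMON whose own ID is lower than all received IDs
    # retains the right to be the master MS0; the others become agents.
    master = min(daemons, key=daemons.get)
    roles = {name: ("master MS0" if name == master else "agent") for name in daemons}
    print(roles)   # MC0 becomes the master with these assumed addresses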
When one of the three agents A1 through A3 realizes that the master MS0 is no longer communicating with it, it restarts the above procedure in all its phases, which results in the choice of a new master from among the three.
Moreover, the master periodically notifies all the machines in the system SID of its existence. If the master detects the existence of another master, the procedure is restarted by the one whose ID is lower.
FIG. 2, which shows the controller AUT which distributes the master-agent roles among the various DAEMONs running on the machines of SID, will make it easier to understand the sequence of the different phases 1 through 5 described above.
This controller AUT comprises 5 states:
State I0: The DAEMON in question sends the two pieces of information (its own ID, its state), which corresponds to Phase 1.
State I1: This DAEMON listens for responses from the other DAEMONs, which corresponds to Phases 2 and 3.
State I2: This DAEMON waits for the time interval Tr to elapse, and for a possible response from the master MS0.
State A: The DAEMON in question becomes an agent A1, A2 or A3.
State M: The DAEMON in question becomes the master MS0.
The events which correspond to this controller, which are denominated e1 through e9, are the following:
e1: The DAEMON in question has disseminated its ID and its state and has set a time interval Tr.
e2: An ID is received, and the local identifier ID (that of the DAEMON in question) is lower than the identifier it receives.
e3: An ID is received, and the local ID is greater than or equal to the ID received.
e4: The time interval Tr has expired.
e5: The time interval Tr has expired and the number of attempts is less than 5, or an agent has just responded.
e6: The master has just responded.
e7: The time interval Tr has expired, the number of attempts is equal to 5, and no agent has responded.
e8: The connection with the master is lost.
e9: Detection by a master of the existence of another master with a higher ID.
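Since FIG. 2 itself is not reproduced in this text, the transition targets in the following Python sketch are an assumed reading of Phases 1 through 5 and of events e1 through e9, not a transcription of the drawing; it is included only to show how the controller AUT can be encoded as a small transition table.

    # One assumed reading of the controller AUT as a transition table
    # (state, event) -> next state. Several targets (notably for e3, e4, e5
    # and e9) are interpretations of the phase descriptions, not facts taken
    # from the figure.
    TRANSITIONS = {
        ("I0", "e1"): "I1",  # ID and state disseminated, timer Tr armed
        ("I1", "e2"): "I1",  # lower local ID: keep listening, still a candidate
        ("I1", "e3"): "I2",  # higher/equal local ID: cede, wait for a master
        ("I1", "e4"): "I2",  # timer Tr expired
        ("I2", "e5"): "I0",  # Tr expired with attempts < 5, or an agent answered: retry
        ("I1", "e6"): "A",   # the master answered: become an agent
        ("I2", "e6"): "A",
        ("I2", "e7"): "M",   # 5 attempts, no agent and no master: become the master
        ("A",  "e8"): "I0",  # connection with the master lost: restart the election
        ("M",  "e9"): "I0",  # another master with a higher ID detected: restart
    }

    def run(events, state="I0"):
        for event in events:
            state = TRANSITIONS.get((state, event), state)  # unknown pair: stay put
        return state

    # Example runs under the assumed table.
    print(run(["e1", "e3", "e6"]))   # A: ceded, then the master answered
    print(run(["e1", "e4", "e7"]))   # M: no response after the retries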
While this invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, the preferred embodiments of the invention as set forth herein, are intended to be illustrative, not limiting. Various changes may be made without departing from the true spirit and full scope of the invention as set forth herein and defined in the claims.

Claims (20)

We claim:
1. A tool (ORC) at the service of a distributed application (APU) running on machines (MC0 through MC3) of a distributed data processing system (SID) in a local area network (RE), intended for balancing a load on each of said machines, said tool comprising a plurality of data processing modules (MS0, A1 through A3) called DAEMONs which run on said machines, one of which is a master (MS0), the others being agents (A1 through A3),
the master (MS0) and the agents (A1 through A3) having respective means (MCC0 through MCC3) for calculating at first determined sampling instants ti the load of the machines on which the DAEMONs are running and respective storage means (MP0 through MP3) for storing the load data of the master and the agents,
the master (MS0) containing:
means (MRC0 through MRC3) for collecting load data of the respective agents at second determined sampling instants Ti,
means (MTC0) for sending the load data of each agent to all of the agents,
each agent (A1 through A3) containing:
respective means (MRCC1 through MRCC3) for receiving the load data of the other agents, whereby at a request of the application the local agent DAEMON closest to the application indicates to the application which machine has the lightest load, the application then making the decision to request said machine with the lightest load to execute services the application requires, each machine comprising a state machine (AUT) for choosing a master from among the DAEMONs for ensuring existence and uniqueness of said chosen master at start-up of said DAEMONs and after loss of the chosen master during the execution of the tool (ORC), each state machine generating a DAEMON state (master, slave) of the DAEMON which is running on the respective machine, said DAEMON state being a function of information received from all other DAEMONs which are running on other machines.
2. The tool according to claim 1, further comprising selection means (AUT) for choosing a master from the DAEMONs, which ensures the existence and the uniqueness of said master at the start-up of said DAEMONs and after loss of a master during the execution of the tool (ORC).
3. The tool according to claim 1, further comprising means (AUT, e8) for ensuring continuity of the service rendered to the application by the tool in case of a failure affecting at least one machine in the data processing system.
4. The tool according to claim 1, further comprising means (AUT, e1, State I0) wherein each machine (MC0 through MC3) comprises (AUT, e1, State I1) for automatically detecting receiving from the network which machines will allow the network to retrieve the addresses of all the machines at the start-up of the DAEMONS.
5. A process for implementing a tool at the service of a distributed application (APU) running on machines of a data processing system including data processing modules (MS0, A1 through A3) called DAEMONs which run on said machines, one of which is a master (MS0), the others being agents (A1 through A3), the process comprising:
1) choosing a master from among the DAEMONs for ensuring existence and uniqueness of said chosen master at start-up of said DAEMONs and after loss of the chosen master during the execution of the tool (ORC), each state machine generating a DAEMON state (master, slave) of the DAEMON which is running on the respective machine, said DAEMON state being a function of information received from all other DAEMONs which are running on other machines,
[1)] 2) Collecting load data for the machines at first time intervals ti on which the applications are running, and calculating the total load of each machine from said load data and from its power;
[2)] 3) At second time intervals Ti, sending the load data from agent data processing modules (A1 through A3) of the corresponding machines to a master data processing module;
[3)] 4) At said second intervals Ti, centralizing the load data of the agents and the master using collection means (MRC0 through MRC3);
[4)] 5) Sending all of said data to each agent (A1 through A3);
[5)] 6) Copying all said load data from each agent into an associated shared memory (MP1 through MP3), and
[6)] 7) At the moment at which the application (APU) estimates it will need a particular service to be rendered to it, searching in shared memory (MP0 through MP3) of the machine on which the application (APU) is running for the estimated load of each of the machines (MC0 through MC3), deducing the machine with the lightest load at this moment, and then requesting said machine to render the service the application requires.
6. A process for implementing a tool at the service of a distributed application (APU) running on machines of a data processing system including data processing modules (MS0, A1 through A3) called DAEMONs which run on said machines, one of which is a master (MS0), the others being agents (A1 through A3), the process comprising:
1) choosing a master from among the DAEMONs for ensuring existence and uniqueness of said chosen master at start-up of said DAEMONs and after loss of the chosen master during the execution of the tool (ORC), each state machine generating a DAEMON state (master, slave) of the DAEMON which is running on the respective machine, said DAEMON state being a function of information received from all other DAEMONs which are running on other machines,
[1)] 2) collecting load data for machines on which the agents and the master (MS0, A1 through A3) are running, at first time intervals ti;
[2)] 3) calculating a total load of a respective machine from said data and from the respective machine's power;
[3)] 4) sending the master (MS0) via the respective agents (A1 through A3), at second time intervals Ti, the load data of a respective machine;
[4)] 5) centralizing load data of all the agents as well as load data of the master (MS0) via the master (MS0), at said second time intervals Ti;
[5)] 6) sending all of said centralized load data to each agent (A1 through A3) via the master (MS0);
[6)] 7) copying all said centralized load data into respective shared memories (MP1 through MP3) by each agent (A1 through A3); and
[7)] 8) performing the following steps by the application (APU) at a moment at which the application (APU) estimates the application will need a particular service to be rendered:
a) searching a shared memory (MP0 through MP3) of a machine on which the application (APU) is running for an estimated load of each of the machines (MC0 through MC3),
b) deducing a machine with the lightest load at said moment, and
c) requesting said machine with the lightest load to render the application (APU) said service.
7. The invention as set forth in claim 6 wherein the step of calculating a total load is performed by a calculation program.
8. The invention as set forth in claim 7 wherein the step of centralizing load data is performed by a collection program.
9. The invention as set forth in claim 7 wherein the step of sending all of said centralized load data is effected by an integral part (MTC0) of said master (MS0).
10. The invention as set forth in claim 8 wherein the step of sending all of said centralized load data is effected by an integral part (MTC0) of said master (MS0).
11. The invention as set forth in claim 9 wherein the step of sending all of said centralized load data is effected by an integral part (MTC0) of said master (MS0).
12. The invention as set forth in claim 9 wherein the step of sending all of said centralized load data is effected by an integral part (MTC0) of said master (MS0).
13. The invention as set forth in claim 6 wherein the step of centralizing load data is performed by a collection program.
14. The invention as set forth in claim 13 wherein the step of centralizing load data is performed by a collection program.
15. The invention as set forth in claim 13 wherein the step of sending all of said centralized load data is effected by an integral part (MTC0) of said master (MS0).
16. The invention as set forth in claim 14 wherein the step of sending all of said centralized load data is effected by an integral part (MTC0) of said master (MS0).
17. The invention as set forth in claim 6 wherein the step of sending all of said centralized load data is effected by an integral part (MTC0) of said master (MS0).
18. The invention as set forth in claim 17 wherein the step of centralizing load data is performed by a collection program.
19. The invention as set forth in claim 17 wherein the step of sending all of said centralized load data is effected by an integral part (MTC0) of said master (MS0).
20. The invention as set forth in claim 18 wherein the step of sending all of said centralized load data is effected by an integral part (MTC0) of said master (MS0).
US09/614,223 1994-11-30 1995-11-17 Distributed application load distribution aid tool Expired - Lifetime USRE37811E1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/614,223 USRE37811E1 (en) 1994-11-30 1995-11-17 Distributed application load distribution aid tool

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
FR9414386A FR2727540B1 (en) 1994-11-30 1994-11-30 LOAD DISTRIBUTION ASSISTANCE TOOL FOR A DISTRIBUTED APPLICATION
FR9414386 1994-11-30
PCT/FR1995/001514 WO1996017297A1 (en) 1994-11-30 1995-11-17 Distributed application load distribution aid tool
US08/836,483 US5993038A (en) 1994-11-30 1995-11-17 Distributed application load distribution aid tool
US09/614,223 USRE37811E1 (en) 1994-11-30 1995-11-17 Distributed application load distribution aid tool

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US08/836,483 Reissue US5993038A (en) 1994-11-30 1995-11-17 Distributed application load distribution aid tool

Publications (1)

Publication Number Publication Date
USRE37811E1 true USRE37811E1 (en) 2002-07-30

Family

ID=9469329

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/614,223 Expired - Lifetime USRE37811E1 (en) 1994-11-30 1995-11-17 Distributed application load distribution aid tool
US08/836,483 Ceased US5993038A (en) 1994-11-30 1995-11-17 Distributed application load distribution aid tool

Family Applications After (1)

Application Number Title Priority Date Filing Date
US08/836,483 Ceased US5993038A (en) 1994-11-30 1995-11-17 Distributed application load distribution aid tool

Country Status (6)

Country Link
US (2) USRE37811E1 (en)
EP (1) EP0715257B1 (en)
JP (1) JPH10507024A (en)
DE (1) DE69521271D1 (en)
FR (1) FR2727540B1 (en)
WO (1) WO1996017297A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030153994A1 (en) * 2002-02-13 2003-08-14 Li-Jie Jin Method of load balancing a distributed workflow management system
US20050160276A1 (en) * 2004-01-16 2005-07-21 Capital One Financial Corporation System and method for a directory secured user account
US20080034093A1 (en) * 2006-08-01 2008-02-07 Hiromi Sutou System and method for managing resources
US8930508B2 (en) * 2011-08-09 2015-01-06 Fujitsu Limited Device management apparatus, device management method, and computer-readable storage medium

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5862348A (en) * 1996-02-09 1999-01-19 Citrix Systems, Inc. Method and apparatus for connecting a client node to a server node based on load levels
US6078943A (en) * 1997-02-07 2000-06-20 International Business Machines Corporation Method and apparatus for dynamic interval-based load balancing
US5925102A (en) * 1997-03-28 1999-07-20 International Business Machines Corporation Managing processor resources in a multisystem environment in order to provide smooth real-time data streams, while enabling other types of applications to be processed concurrently
US6263368B1 (en) * 1997-06-19 2001-07-17 Sun Microsystems, Inc. Network load balancing for multi-computer server by counting message packets to/from multi-computer server
US6134216A (en) * 1997-10-29 2000-10-17 Lucent Technologies Inc. Integrated overload control for overload control for distributed real time systems
US6397252B1 (en) * 1997-12-19 2002-05-28 Electronic Data Systems Corporation Method and system for load balancing in a distributed object system
US6324580B1 (en) 1998-09-03 2001-11-27 Sun Microsystems, Inc. Load balancing for replicated services
US6327622B1 (en) 1998-09-03 2001-12-04 Sun Microsystems, Inc. Load balancing in a network environment
US7058727B2 (en) 1998-09-28 2006-06-06 International Business Machines Corporation Method and apparatus load balancing server daemons within a server
JP2000172657A (en) * 1998-12-08 2000-06-23 Fujitsu Ltd System and method for distributed processing, computer- readable recording medium with program for computer to execute the same method recorded therein, server device and client device
US6256698B1 (en) 1999-01-11 2001-07-03 Sony Corporation Method of and apparatus for providing self-sustained even arbitration within an IEEE 1394 serial bus network of devices
EP1022658A1 (en) * 1999-01-21 2000-07-26 Siemens Aktiengesellschaft Multiprocessor system and load balancing method in a multiprocessor system
EP1061710B1 (en) * 1999-06-17 2010-12-08 Level 3 Communications, LLC System and method for integrated load distribution and resource management on internet environment
JP4183871B2 (en) * 2000-01-26 2008-11-19 株式会社日立製作所 Load balancing method and apparatus
US6542930B1 (en) 2000-03-08 2003-04-01 International Business Machines Corporation Distributed file system with automated file management achieved by decoupling data analysis and movement operations
US6687735B1 (en) 2000-05-30 2004-02-03 Tranceive Technologies, Inc. Method and apparatus for balancing distributed applications
US6785726B1 (en) 2000-05-08 2004-08-31 Citrix Systems, Inc. Method and apparatus for delivering local and remote server events in a similar fashion
US6785713B1 (en) 2000-05-08 2004-08-31 Citrix Systems, Inc. Method and apparatus for communicating among a network of servers utilizing a transport mechanism
US6789112B1 (en) 2000-05-08 2004-09-07 Citrix Systems, Inc. Method and apparatus for administering a server having a subsystem in communication with an event channel
WO2002023337A2 (en) * 2000-09-12 2002-03-21 Falcon Asset Acquisition Group Method and apparatus for flash load balancing
EP1229445A1 (en) * 2001-02-02 2002-08-07 Cluster Labs GmbH Method and apparatus for operating a computer system
US7222245B2 (en) * 2002-04-26 2007-05-22 Hewlett-Packard Development Company, L.P. Managing system power based on utilization statistics
US6996728B2 (en) * 2002-04-26 2006-02-07 Hewlett-Packard Development Company, L.P. Managing power consumption based on utilization statistics
US7426570B2 (en) * 2003-07-25 2008-09-16 Hewlett-Packard Development Company, L.P. Determining placement of distributed application onto distributed resource infrastructure
EP1927921A1 (en) * 2003-08-08 2008-06-04 Teamon Systems, Inc. Communications system providing server load balancing based upon weighted health metrics and related method
US20050033809A1 (en) * 2003-08-08 2005-02-10 Teamon Systems, Inc. Communications system providing server load balancing based upon weighted health metrics and related methods
TW200532466A (en) * 2004-02-03 2005-10-01 Sony Corp Information processing device, information processing method, information processing system and information processing program of accessible media
JP2007041720A (en) * 2005-08-01 2007-02-15 Fujitsu Ltd Job step execution program and job step execution method
WO2008021024A2 (en) * 2006-08-13 2008-02-21 Mentor Graphics Corporation Multiprocessor architecture with hierarchical processor organization
US8838779B2 (en) * 2009-11-04 2014-09-16 International Business Machines Corporation Multi-level offload of model-based adaptive monitoring for systems management

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63274741A (en) * 1987-05-01 1988-11-11 Kawasaki Steel Corp Austenitic stainless steel for seamless steel tube
JPH01258166A (en) 1988-04-08 1989-10-16 Nec Corp Network job execution monitoring and controlling system
JPH0241535A (en) 1988-08-02 1990-02-09 Seiichi Miyazaki Data processor
JPH0273437A (en) 1988-09-09 1990-03-13 Mitsubishi Electric Corp Master down-line loading system
JPH0423058A (en) * 1990-05-18 1992-01-27 Fujitsu Ltd Multi-medium data transfer method and system using the method
JPH0463771A (en) * 1990-06-30 1992-02-28 Suzuki Motor Corp Four-wheel steering device
JPH04286681A (en) * 1991-03-15 1992-10-12 Nippon Soda Co Ltd Recording material
JPH0573393A (en) 1991-09-11 1993-03-26 Nec Corp Decentralized file control system
JPH05334161A (en) 1992-05-28 1993-12-17 Matsushita Electric Ind Co Ltd Multi-server system
US5339392A (en) * 1989-07-27 1994-08-16 Risberg Jeffrey S Apparatus and method for creation of a user definable video displayed document showing changes in real time data
US5442789A (en) * 1994-03-31 1995-08-15 International Business Machines Corporation System and method for efficiently loading and removing selected functions on digital signal processors without interrupting execution of other functions on the digital signal processors
US5787160A (en) * 1994-12-08 1998-07-28 Mci Communications Corporation Intelligent routing of special service calls
US5790789A (en) * 1996-08-02 1998-08-04 Suarez; Larry Method and architecture for the creation, control and deployment of services within a distributed computer environment

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63274741A (en) * 1987-05-01 1988-11-11 Kawasaki Steel Corp Austenitic stainless steel for seamless steel tube
JPH01258166A (en) 1988-04-08 1989-10-16 Nec Corp Network job execution monitoring and controlling system
JPH0241535A (en) 1988-08-02 1990-02-09 Seiichi Miyazaki Data processor
JPH0273437A (en) 1988-09-09 1990-03-13 Mitsubishi Electric Corp Master down-line loading system
US5339392A (en) * 1989-07-27 1994-08-16 Risberg Jeffrey S Apparatus and method for creation of a user definable video displayed document showing changes in real time data
JPH0423058A (en) * 1990-05-18 1992-01-27 Fujitsu Ltd Multi-medium data transfer method and system using the method
JPH0463771A (en) * 1990-06-30 1992-02-28 Suzuki Motor Corp Four-wheel steering device
JPH04286681A (en) * 1991-03-15 1992-10-12 Nippon Soda Co Ltd Recording material
JPH0573393A (en) 1991-09-11 1993-03-26 Nec Corp Decentralized file control system
JPH05334161A (en) 1992-05-28 1993-12-17 Matsushita Electric Ind Co Ltd Multi-server system
US5442789A (en) * 1994-03-31 1995-08-15 International Business Machines Corporation System and method for efficiently loading and removing selected functions on digital signal processors without interrupting execution of other functions on the digital signal processors
US5787160A (en) * 1994-12-08 1998-07-28 Mci Communications Corporation Intelligent routing of special service calls
US5790789A (en) * 1996-08-02 1998-08-04 Suarez; Larry Method and architecture for the creation, control and deployment of services within a distributed computer environment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
IEEE Transactions on Software Engineering, vol. 15, No. 11, Nov. 1989, New York, US, pp. 1444-1458.* *
Jian Xu: "Heuristic Methods for Dynamic Load Balancing in a Message-Passing Multicomputer" see p. 2, left col. line 12-14; figure 1; p. 2, right col., line 33-page 3, right col., line 36; p. 4, left col., line 33-right col. line 6 ; page 5, left col.* *
Journal of Parallel and Distributed Computing, vol. 18, No. 1, May 1993, Duluth, MN US, pp. 1-13.* *
M. Theimer et al.: "Finding Idle Machines in a Workstation-Based Distributed System" see p. 1446, right column, line 52-p. 1447, left col., line 1, see p. 1451, right col., line 15-p. 1452, left col., line 5. *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030153994A1 (en) * 2002-02-13 2003-08-14 Li-Jie Jin Method of load balancing a distributed workflow management system
US7127716B2 (en) * 2002-02-13 2006-10-24 Hewlett-Packard Development Company, L.P. Method of load balancing a distributed workflow management system
US20050160276A1 (en) * 2004-01-16 2005-07-21 Capital One Financial Corporation System and method for a directory secured user account
US20080034093A1 (en) * 2006-08-01 2008-02-07 Hiromi Sutou System and method for managing resources
US8046466B2 (en) * 2006-08-01 2011-10-25 Hitachi, Ltd. System and method for managing resources
US8930508B2 (en) * 2011-08-09 2015-01-06 Fujitsu Limited Device management apparatus, device management method, and computer-readable storage medium

Also Published As

Publication number Publication date
WO1996017297A1 (en) 1996-06-06
EP0715257B1 (en) 2001-06-13
JPH10507024A (en) 1998-07-07
EP0715257A1 (en) 1996-06-05
FR2727540A1 (en) 1996-05-31
DE69521271D1 (en) 2001-07-19
FR2727540B1 (en) 1997-01-03
US5993038A (en) 1999-11-30

Similar Documents

Publication Publication Date Title
USRE37811E1 (en) Distributed application load distribution aid tool
US7185096B2 (en) System and method for cluster-sensitive sticky load balancing
JP4087903B2 (en) Network service load balancing and failover
US7843811B2 (en) Method of solving a split-brain condition
US7124062B2 (en) Services search method
US6249800B1 (en) Apparatus and accompanying method for assigning session requests in a multi-server sysplex environment
US5603029A (en) System of assigning work requests based on classifying into an eligible class where the criteria is goal oriented and capacity information is available
US5136708A (en) Distributed office automation system with specific task assignment among workstations
US6496941B1 (en) Network disaster recovery and analysis tool
US6330605B1 (en) Proxy cache cluster
CN104685852B (en) For using the system and method for the small batch processing of request
CN100359508C (en) Merge protocol for schooling computer system
US20010034752A1 (en) Method and system for symmetrically distributed adaptive matching of partners of mutual interest in a computer network
US20120134299A1 (en) Automated topology formation in dynamic distributed environments
AU2399299A (en) Distributed chassis agent for network management
US8296772B2 (en) Customer information control system workload management based upon target processors requesting work from routers
WO2007073429A2 (en) Distributed and replicated sessions on computing grids
US5857076A (en) Program product for obtaining the state of network resources in A distributed computing environment
CN1330124C (en) Method and apparatus for virtualizing network resources
Jiao et al. Design and Implementation of Load Balancing of Distributed-system-based Web server
US6675259B2 (en) Method and apparatus for validating and ranking disk units for switching
CN111835809B (en) Work order message distribution method, work order message distribution device, server and storage medium
US9727457B2 (en) Minimizing latency due to garbage collection in a distributed system
JP4083049B2 (en) Distributed processing system, request distribution apparatus and method
CN111865639B (en) Method and device for collecting information of snmp service equipment and electronic equipment

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FPAY Fee payment

Year of fee payment: 12