WO2003005192A1 - A system and a method for selecting a preconfigured operating system for a server - Google Patents

A system and a method for selecting a preconfigured operating system for a server

Info

Publication number
WO2003005192A1
Authority
WO
WIPO (PCT)
Prior art keywords
server
operating system
application
servers
version
Application number
PCT/SE2002/001353
Other languages
French (fr)
Inventor
Markus Reimer
Magnus Ossiansson
Original Assignee
Idékapital Ab
Application filed by Idékapital Ab
Publication of WO2003005192A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 9/5044 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 - Error detection; Error correction; Monitoring
    • G06F 11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/16 - Error detection or correction of the data by redundancy in hardware
    • G06F 11/20 - Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F 11/202 - Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F 11/2023 - Failover techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 - Error detection; Error correction; Monitoring
    • G06F 11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/16 - Error detection or correction of the data by redundancy in hardware
    • G06F 11/20 - Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F 11/202 - Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F 11/2041 - Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant with more than one idle spare processing component
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 - Arrangements for executing specific programs
    • G06F 9/4401 - Bootstrapping
    • G06F 9/4406 - Loading of operating system
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/34 - Network arrangements or protocols for supporting network services or applications involving the movement of software or configuration parameters
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 - Error detection; Error correction; Monitoring
    • G06F 11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/14 - Error detection or correction of the data by redundancy in operation
    • G06F 11/1479 - Generic software techniques for error detection or fault masking
    • G06F 11/1482 - Generic software techniques for error detection or fault masking by means of middleware or OS functionality
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 - Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 - Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/10015 - Access to distributed or replicated servers, e.g. using brokers
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 - Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1004 - Server selection for load balancing
    • H04L 67/1014 - Server selection for load balancing based on the content of a request
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 - Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1034 - Reaction to server failures by a load balancer
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00 - Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/30 - Definitions, standards or architectural aspects of layered protocol stacks
    • H04L 69/32 - Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L 69/322 - Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L 69/329 - Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]

Definitions

  • Figure 6 is a flow chart describing the process of the invention, applied to the embodiment shown in Figure 3. The process starts in step 60 and proceeds to step 61.
  • Step 61: The servers S1 and S2 are monitored by the monitoring unit MU, measuring up-time, number of users, load, unauthorised use etc. of each server, as described in connection with figures 1 and 5.
  • Step 62: The control unit CU uses the information obtained in step 61 to determine if a server needs to be rebooted and which preconfigured operating system should be used.
  • Step 63: The control unit removes present users from the selected server by performing the following: 1) send an order to the load balance unit, via the monitoring unit, to direct new users to an alternative server having the same operating system as the server being shut down. If no alternative server is available, the control unit may boot a spare server (not shown), by using the process described in connection with figure 5, to create a server with the same preconfigured operating system as the one that should be shut down and rebooted. If there is no spare server available, new users will be denied access to any server until the server is rebooted.
  • When the server is shut down in box 63c, the process continues to step 64.
  • Step 64: A preconfigured operating system is selected, as will be discussed in more detail below.
  • Step 65: The booting data needed to reboot is copied from the uncorrupted second memory location to the first memory location C:\ of the server.
  • Step 66: The server is booted using the booting data stored in the first memory location C:\. The process returns to step 61 after the server has been booted up.
  • Steps 62 and 64 are essential for determining if a server should be rebooted and which operating system should be installed during the rebooting process. These steps are preferably implemented in a rule matrix, where the input parameters may be up-time, number of users, load and unauthorised use and/or others, as discussed above. By combining the decision of the rule matrix with the change of load on the input to the servers, appropriate measures may be taken. The measures taken always depend on the result of the monitoring process of the servers.
  • A server should be rebooted with the same preconfigured operating system if, for instance, the up-time has exceeded a predetermined time, as described earlier, or if unauthorised use has been detected.
  • A server should be rebooted with another preconfigured operating system if, for instance, there is a need for more capacity in another preconfigured operating system.
  • One such example is when the number of users and/or the load of a server exceeds a predetermined level. If there is no spare server available, the server having a low number of users is selected to be rebooted with another preconfigured operating system to increase the capacity.
  • The present users of the server being rebooted may be transferred to another server, if available, or thrown out if the need is urgent.
  • Figure 7 is a flow chart of the method according to the invention, applied to the embodiment shown in Figure 4. The following will happen in case of a failure in one of the primary servers, e.g. server S3 (a simplified code sketch of this takeover sequence is given after this list).
  • The process starts at step 70 and proceeds to step 71.
  • Step 71: The monitoring unit MU monitors all servers (S1-Sn) as described earlier. When server S3 fails the monitoring unit registers this and alerts the control unit CU. Step 72: A decision to boot up one of the spare servers, e.g. S5, is taken, since the server S3 may be experiencing a hardware failure.
  • Step 73: Any present users are removed from the server S3. This may be done rapidly if needed, or during a longer period of time as described in connection with Fig. 6, and the server is shut down in box 73a.
  • Step 74: A standard operating system is selected, in this example OS2.
  • Step 75: The booting data of the selected standard operating system is copied from the second memory location to the first memory location of server S5.
  • In this example only one second memory location is provided for all the servers. This saves memory space and makes it easier to update the different operating systems.
  • Step 76: An application configuration, in this example AP1(OS2), is selected to replace the server S3.
  • Step 77: The information regarding the selected application configuration is thereafter copied to an additional memory location D:\.
  • Any new users requiring the second type of operating system with the first type of applications, i.e. AP1(OS2), will be directed to the spare server S5 by the load balance unit, which is controlled by the control unit CU via the monitoring unit MU. No new users are directed to the failing server S3.
  • Another advantage of this system is that it reduces the energy consumption, since fewer spare servers are needed, and the servers are used in a more cost-efficient and environmentally friendly manner, since a spare server may substitute for a number of different servers.
  • Figure 7 may also be used to illustrate what happens during an unexpected peak load, that is, when the capacity for a particular application should be increased.
  • Step 71: The monitoring unit indicates that a specific server, e.g. server S2, experiences a number of users exceeding a predetermined level, e.g. 400 users.
  • Step 72: If there are any spare servers available in idle mode, the control unit CU decides to boot up (at least) one spare server. In this example only S6 is available and idle, since S5 has previously taken over the workload of server S3, which has failed.
  • Step 73: If necessary, users are removed from the spare server. In this case, no users have to be removed from the spare server S6 and the server does not need to be shut down in step 73a.
  • Step 74: If the control unit CU has not already selected the operating system, in this example OS1, this is done here.
  • Step 75: Then the booting data is copied from the second memory location to the first memory location of server S6.
  • Step 76: If the application configuration has not already been selected, in this example AP2, this is done here.
  • Step 77: The application configuration is copied to the additional memory location D:\.
  • Step 78: The spare server S6 is thereafter booted up using the booting data from the C drive and the application configuration on the D drive. The process returns to step 71. If there are no available idle spare servers, any other server being idle, i.e. having no users, may be reconfigured by first shutting down the server and rebooting the server with the desired operating system and application configuration.
  • A flexible system has thus been achieved where it is possible to change the operating system of any server, reboot a server that has a long up-time on a regular basis, meet peak loads, reduce the number of back-up servers and increase the security of a server system.
  • The described invention may be implemented as a server sharing system for a service provider having a large number of servers to fulfil the needs and requirements of the clients, but the invention should not be limited to this. It is fully possible to implement the invention in any network where there is a need to regularly reboot the server(s) and sometimes even to change the operating system of a server, such as a company network.
  • The invention also has the advantage that the life span of a server is increased due to the cycling of the servers (rebooting due to long up-time), and it also increases the security of the system since an uncorrupted version of the preconfigured operating system is downloaded each time the server is rebooted.
  • The word "users" is intended to cover both persons and machines using the resources and services provided by the servers.
  • The methods described above are preferably implemented as a software application that generally includes instructions to carry out the method steps described in connection with figures 5, 6 and 7.
  • The software application is preferably stored in the control unit CU during operation, but may naturally be provided on a CD-ROM, diskettes or other memory storage media, which the control unit may access for running the application.
  • The function of the control unit, the monitoring unit and the load balance unit is controlled by computer programs run on the appropriate unit.
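As a rough, non-authoritative illustration of the Figure 7 takeover sequence referenced above, the copy-and-boot steps could be summarised as follows. This is a Python sketch only; the function signature, the dictionary-based image stores and the print statements are placeholders invented for the example and stand in for the copy and boot operations described in the text.

```python
from pathlib import Path


def take_over(spare: str, failed_os: str, failed_app: str,
              second_memory: dict[str, Path],
              app_store: dict[tuple[str, str], Path],
              c_drive: Path, d_drive: Path) -> None:
    """Steps 72-78: configure an idle spare server to replace a failed primary."""
    # Steps 74-75: copy the booting data of the selected OS to the spare's C drive.
    boot_image = second_memory[failed_os]
    print(f"copy {boot_image} -> {c_drive} on {spare}")
    # Steps 76-77: copy the matching application configuration to the D drive.
    app_image = app_store[(failed_app, failed_os)]
    print(f"copy {app_image} -> {d_drive} on {spare}")
    # Step 78: boot the spare; the load balance unit then directs new users to it,
    # and no new users are directed to the failed server.
    print(f"boot {spare} with {failed_os} and {failed_app}({failed_os})")
```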

Abstract

In a pool of servers, the capacity may be used in a better way if the servers can be rebooted with different operating systems and provided with different sets of applications depending on the current needs in the network. It may be decided to allocate more machine capacity to an application, remove machine capacity from an application or move capacity from one application to another. The servers can be allocated dynamically to a particular combination of operating system and applications. Therefore, a service provider can guarantee access to an application at all times without keeping excess capacity at times when the load on the application is low. Also, the service provider does not have to provide a spare server for each combination of operating system and application in case of failure. One spare server can be used to replace different servers depending on the needs at any given time.

Description

A SYSTEM AND A METHOD FOR SELECTING A PRECONFIGURED OPERATING SYSTEM FOR A SERVER
Technical Field
The present invention relates to an arrangement for use in a computer network for providing at least one service to at least one client computer, as defined in the preamble of claim 1. The invention also relates to a method for use in a computer network for providing at least one service to at least one client computer as defined in the preamble of claim 7 and a computer program product as defined in the preamble of claim 13.
Background and Prior Art
A service provider usually provides different services, such as software applications, to customers using several types of operating systems, such as Windows, Unix, Linux, etc. A customer using a particular operating system cannot normally use a service provided by a server using another operating system. The service provider must therefore provide several servers with different operating systems to provide services to different customers. Supporting only one type of operating system would be less expensive, but would not enable providing services to computers using other operating systems.
The load in a system varies with time. For example, during the night the load is lower than during the day. In most cases the load has a peak, considerably higher than during the rest of the day, over a relatively short time span. The load pattern varies depending on the application and on other factors. To ensure access to the system at all times, including at peak load, the service provider must ensure a capacity that is higher than what is needed most of the time.
Usually, the service provider also has a spare system in case one of the servers fails and must be repaired. At least one spare server must be available for each supported operating system. This is an expensive solution for securing up-time of the system. All these three factors cause the need for keeping excess capacity; that is, an excessive number of servers are idle most of the time.
Another major concern is security. A user accessing the server can perform unauthorized actions that may cause damage to the server and thereby affect the function of the server, which in turn will affect other users currently accessing the same server. This problem does not only apply to service providers as such, but could apply to any entity managing a network having at least one server that is accessed by many users.
Summary of the Invention
It is therefore an object of the present invention to minimize the need for hardware installations while ensuring the capacity needed to meet the requirements on a system at any given time.
This object is achieved according to the invention by an arrangement for use in a computer network for providing at least one service to at least one client computer, said client computer using a first operating system, said arrangement comprising at least a first (S1) and a second server (S2) accessible by the client computer, each of said servers being connectable to a first memory location (C:\), said arrangement comprising: storage means comprising at least a first (OS1) and a second (OS2) preconfigured operating system stored in such a way that it can be retrieved by the at least one server, but cannot be altered by an unauthorized user, control means for monitoring the function of the at least first and second server, control means for initiating a reboot of the first server, control means for selecting an operating system with which to reboot the first server, control means for downloading the booting information of the selected preconfigured operating system to the first memory location of the first server and booting the server using the downloaded booting information.
The object is also achieved according to the invention by a method for use in a computer network for providing at least one service to at least one client computer, said client computer using a first operating system, said arrangement comprising at least a first (S1) and a second server (S2) accessible by the client computer, each one of said servers being connectable to a first memory location (C:\), each one of said first and second server being connectable to storage means comprising at least a first (OS1) and a second (OS2) preconfigured operating system stored in such a way that it can be retrieved by the at least one server, but cannot be altered by an unauthorized user, monitoring the function of the at least first and second server, determining if said first server needs to be rebooted, if the first server needs to be rebooted, selecting an operating system with which to reboot the first server, downloading the booting information of the selected preconfigured operating system to the first memory location of the first server and booting the server using the downloaded booting information.
The object is also achieved according to the invention by a computer program product for use in a computer network for providing at least one service to at least one client computer, said client computer using a first operating system, said arrangement comprising at least a first (S1) and a second server (S2) accessible by the client computer, each of said servers being connectable to a first memory location (C:\), said first and second server being connectable to first storage means comprising at least a first (OS1) and a second (OS2) preconfigured operating system stored in such a way that it can be retrieved by the at least one server, said computer program product comprising code means which, when executed in a computer, will make the computer perform the following functions: determining if said first server needs to be rebooted, if the first server needs to be rebooted, selecting an operating system with which to reboot the first server, downloading the booting information of the selected operating system to the first memory location of the first server and booting the server using the downloaded booting information.
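Purely as an illustration of the claimed sequence (monitor the servers, decide that a reboot is needed, select a preconfigured operating system, download its booting information to the first memory location and boot), the control means could be sketched as follows. This is a minimal Python sketch, not the patented implementation: the class names, the use of plain directory paths for the memory locations and the shutil-based copy are assumptions made for the example.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Server:
    """A managed server with its own first memory location (the 'C drive')."""
    name: str
    boot_location: Path          # first memory location, here modelled as a directory
    current_os: str | None = None
    needs_reboot: bool = False


class ControlUnit:
    """Sketch of the control means: decides on reboots and restores boot data."""

    def __init__(self, image_store: dict[str, Path]):
        # image_store maps an OS name (e.g. "OS1", "OS2") to the protected
        # second memory location holding its preconfigured booting data.
        self.image_store = image_store

    def select_os(self, server: Server, requested_os: str | None) -> str:
        # Keep the current OS unless a different one has been requested.
        return requested_os or server.current_os or next(iter(self.image_store))

    def reboot(self, server: Server, requested_os: str | None = None) -> None:
        os_name = self.select_os(server, requested_os)
        source = self.image_store[os_name]
        # "Download" uncorrupted booting data into the first memory location.
        shutil.copytree(source, server.boot_location, dirs_exist_ok=True)
        server.current_os = os_name
        server.needs_reboot = False
        print(f"{server.name} booted with preconfigured {os_name}")
```

In this sketch a reboot with the same operating system simply restores an uncorrupted copy of the current image, while passing a different OS name switches the server to another preconfigured operating system.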
With the inventive arrangement, method and program the preconfigured operating system for a server can easily be restored or changed depending on the need of the users. The number of servers running a particular operating system can be easily adapted to the need at any given time.
Preferably the arrangement further comprises storage means comprising at least a first and a second version of a first application, said first version being adapted to the first operating system (OS1) and said second version being adapted to the second operating system (OS2), control means for selecting the version of said at least one application adapted to the selected preconfigured operating system to be downloaded to the first server, and control means for downloading and installing the at least one application to the first memory location of the first server.
In this embodiment the method further comprises the steps of selecting the version of said at least one application adapted to the selected preconfigured operating system to be downloaded to the first server, and downloading and installing the at least one application to the first memory location of the first server.
Further, the code means of the computer program product is arranged to make the computer perform the following steps: selecting the version of said at least one application adapted to the selected preconfigured operating system to be downloaded to the first server, downloading and installing the at least one application to the first memory location of the first server.
In this embodiment, one or more applications may be provided to the server after rebooting the server, so that the server can provide a particular set of applications to users accessing the network. This makes the system more flexible regarding the capacity for providing different applications. Since the peak load may occur at different times depending on the type of application, several applications can share a pool of servers. The peak load times also vary geographically, because of the time zones. The excess capacity can be allocated to the application that currently has a peak. When the load changes, the excess capacity can be allocated to another application currently experiencing a peak.
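One way to picture the per-operating-system application versions described above is a small catalogue keyed by (application, operating system). The sketch below is illustrative only; the catalogue contents, the paths and the installer behaviour are invented for the example and are not taken from the patent.

```python
from pathlib import Path

# Hypothetical catalogue: one installable version of each application per OS.
APP_VERSIONS: dict[tuple[str, str], Path] = {
    ("AP1", "OS1"): Path("/images/apps/ap1_os1"),
    ("AP1", "OS2"): Path("/images/apps/ap1_os2"),
    ("AP2", "OS1"): Path("/images/apps/ap2_os1"),
    ("AP2", "OS2"): Path("/images/apps/ap2_os2"),
}


def select_application_version(app: str, selected_os: str) -> Path:
    """Return the version of the application adapted to the selected OS."""
    try:
        return APP_VERSIONS[(app, selected_os)]
    except KeyError:
        raise ValueError(f"No version of {app} is preconfigured for {selected_os}")


def install_application(app: str, selected_os: str, first_memory_location: Path) -> None:
    """Copy the matching application version to the server's first memory location."""
    source = select_application_version(app, selected_os)
    target = first_memory_location / app
    # In a real system this would be an image copy or a package installation.
    print(f"installing {source} -> {target}")
```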
In a preferred embodiment the arrangement also comprises a load balance unit (LBU) connected to each of the servers, for directing new users to one of the servers.
The decision to reboot one or more servers may be taken in dependence of the load in the network. The operating system and, if applicable, application or applications, may be selected in dependence on the load in at least one of the servers.
A service provider typically signs a service level agreement with a customer, i.e. a content provider, where the service provider offers to ensure access through the Internet to the content provider's services. With the arrangement and method according to the invention, the servers can be allocated dynamically to a particular combination of operating system and applications. Therefore, a service provider can guarantee access to an application at all times without keeping excess capacity at times when the load on this particular application is low. Also, the service provider does not have to provide a spare server for each combination of operating system and application in case of failure. Instead, the same spare server can be used to replace different servers depending on what is needed at any given time. In other words, fewer spare servers are needed than the supported number of operating systems, which in turn will reduce the cost for providing redundancy functionality for the system.
Another advantage is that the servers may be rebooted on a regular basis with an uncorrupted preconfigured operating system to minimize the danger of malfunction in the servers caused by user violation etc.
In this document the term "machine" means either a hardware unit having certain properties, such as an operating system and support functions for an application, or a virtual machine existing on one or more hardware units. Further, several virtual machines can share the same hardware.
The invention therefore takes advantage of the fact that different applications have different requirements for machine resources. According to the invention three different types of decision may be made: decisions to allocate more machine capacity to an application, decisions to remove machine capacity from an application, and decisions to move capacity from one application to another.
These decisions may be based on one or more of the following criteria, or other criteria:
■ The number of users of a particular application. If there are many users, or the number of users is increasing, the hardware resources allocated to the application may have to be increased. If there are few users, or the number of users is decreasing, it may be possible to reduce the hardware resources allocated to the application.
■ The response time for the application. If the response times for an application are long, it may be necessary to increase the machine capacity allocated to the application.
■ The number of machines currently allocated to the application. A maximum or minimum number may be specified in the service level agreement, or may be made dependent on the load.
■ The geographical location of the application.
■ The logical location of the application.
■ The load on a particular machine. If the load is high, the possibility to allocate more resources to the applications running on this machine should be investigated. If the load is low, it should be considered whether to set part of the machine resources allocated to the application in standby mode instead, or to allocate the machine resource to another application.
■ Security alarm on a physical or logical machine. In this case the machine may have to be taken down, which means that it should be replaced by another machine if necessary.
■ Time in operation. A reboot can be performed automatically after a certain up-time.
■ Planned events, such as backup or maintenance, may call for a reallocation of applications to another machine.
■ An operator may decide to allocate machine resources to an application, to remove machine resources from an application or to move machine resources from one application to another.
■ An indication of an increase or decrease in the need for resources.
■ An indication of excess capacity for a particular application may lead to a reduction in the hardware machines allocated to the application. The hardware removed from the application may be put in standby mode or allocated to another application. The time for releasing a machine may be taken into account. If it would take a very long time to make a machine ready to be taken down, it will in most cases be better to choose another machine.
■ The time for starting a new application may also be taken into account when deciding whether to start the application or not.
Some or all of these criteria may be specified in a service level agreement, as mentioned above. For example, the service provider may guarantee a maximum response time at any given time. The service level agreement can also specify a maximum capacity that is to be guaranteed regardless of the load.
The result of any of these decisions can be a request for the allocation of a new machine resource to an application. It can also be a request to remove a hardware resource with or without allocating it to another application. Allocation of resources can be denied, for example because the maximum amount of resources specified in the service level agreement has already been allocated. For any of these decisions a message may be sent to the customer.
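The decision criteria listed above lend themselves to a simple rule evaluation. The following sketch shows how such allocate / remove / move decisions might be combined with service level agreement limits; the thresholds, the Metrics fields and the SLA structure are invented for the example and are not specified by the patent.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOCATE = "allocate more capacity"
    REMOVE = "remove capacity"
    MOVE = "move capacity to another application"
    NONE = "no change"


@dataclass
class Metrics:
    users: int
    response_time_s: float
    machines_allocated: int


@dataclass
class ServiceLevelAgreement:
    max_response_time_s: float
    max_machines: int
    min_machines: int = 1


def decide(metrics: Metrics, sla: ServiceLevelAgreement) -> Decision:
    """Very small rule matrix combining load, response time and SLA limits."""
    overloaded = (metrics.response_time_s > sla.max_response_time_s
                  or metrics.users > 400)
    if overloaded and metrics.machines_allocated < sla.max_machines:
        return Decision.ALLOCATE
    if overloaded:
        # SLA ceiling reached: a request for more resources would be denied.
        return Decision.NONE
    if metrics.users == 0 and metrics.machines_allocated > sla.min_machines:
        # Excess capacity can be put on standby or moved to another application.
        return Decision.MOVE
    return Decision.NONE
```

Under these invented thresholds, `decide(Metrics(users=450, response_time_s=2.0, machines_allocated=2), ServiceLevelAgreement(max_response_time_s=1.0, max_machines=4))` returns `Decision.ALLOCATE`, while the same metrics with `max_machines=2` return `Decision.NONE`, i.e. the request for more resources is denied.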
Brief Description of the drawings
Fig. 1 shows a first embodiment of a system according to the present invention.
Fig. 2 shows a second embodiment of a system according to the present invention.
Fig. 3 shows a third embodiment of a system including two servers.
Fig. 4 shows a fourth embodiment of a system including an undefined number of servers.
Fig. 5 shows a flow chart of a method for rebooting a server according to an embodiment of the invention.
Fig. 6 shows a flow chart of a method for rebooting a server, including moving users according to an embodiment of the invention.
Fig. 7 shows a flow chart of the third embodiment of the present invention.
Detailed description of preferred embodiments
Figure 1 shows a first embodiment of a system 10, including a server S1, having a first memory location C:\, the C drive, which may be a separate hard drive or a partition of a hard drive. The server is connected to a control unit CU, and to a storage unit SU. The storage unit may contain memory circuits, hard drives etc. The purpose of the storage unit is to store selected information that must be accessible to the system 10. The control unit CU is also connected to the storage unit SU. The first memory location C:\ may also be located in the storage unit SU, as shown in fig. 2.
A plurality of users may access the server S1 and store user information, such as documents, presentations, etc., at a predetermined location H:\ (H drive) in the system. The user information may be stored in the server S1 (not shown) on a separate hard drive or as a partition of a hard drive, but is preferably stored outside the server, e.g. in the storage unit SU, as indicated by H:\ in Fig. 1, or in a separate file server (see fig. 2). Each user will only see the drives that the server S1 has mapped up, in this example the C and H drives.
The storage unit SU includes a second memory location comprising booting data of a plurality of preconfigured operating systems OS1, OS2, in this example two standard operating systems, each being preconfigured with a specific type of applications, such as word processing applications. Examples of standard operating systems are MS Windows NT, Linux, Unix etc. The second memory location may also be implemented in the server S1 as one or several separate hard drives (not shown).
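The essential property of the second memory location is that the servers can retrieve the preconfigured booting data while users cannot alter it. On an ordinary file system this property could be approximated as sketched below; the paths and the use of POSIX permission bits are assumptions made for illustration, and a real deployment might instead rely on a read-only mount or a dedicated storage unit.

```python
import stat
from pathlib import Path

# Hypothetical layout of the second memory location (the protected image store).
SECOND_MEMORY_LOCATION = Path("/images/os")
PRECONFIGURED_OS = {
    "OS1": SECOND_MEMORY_LOCATION / "os1",
    "OS2": SECOND_MEMORY_LOCATION / "os2",
}


def protect_image_store() -> None:
    """Make every stored OS image readable but not writable by ordinary users."""
    for image_dir in PRECONFIGURED_OS.values():
        for path in image_dir.rglob("*"):
            mode = path.stat().st_mode
            # Drop the group/other write bits so users cannot alter the images;
            # servers can still read (retrieve) the booting data.
            path.chmod(mode & ~stat.S_IWGRP & ~stat.S_IWOTH)
```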
The users will not have access to the second memory location, to avoid any accidental or intentional damage to the server S1.
Figure 2 shows a second embodiment 20 of the present invention including a server S1, a control unit CU, a storage unit SU, a separate file server FS and a monitoring unit MU. The first memory location C:\ is, in this embodiment, located in the storage unit SU together with the preconfigured operating systems OS1, OS2 in the second memory location. The control unit CU is connected to the storage unit SU and the monitoring unit MU. The server S1 is connected to the monitoring unit MU, the storage unit SU and the file server FS.
The monitoring unit helps the control unit CU to monitor the number of users accessing the server, monitor the load in the server, measure the up-time of the server, monitor any accidental or voluntary unauthorised action in the server by monitoring data consistency and/or detecting server data intrusion, etc. The control unit may also control the server S1 via the monitoring unit MU.
Figure 3 shows a third embodiment 30 of the present invention comprising two servers S1 and S2. Both servers are connected to a load balance unit LBU, which controls the users' access to each server, as described in more detail below. Each server is also connected to a storage unit SU, where the first memory location C:\ of each server is located. The storage unit SU has, in this example, booting data for three different preconfigured operating systems OS1, OS2, OS3 for each server S1, S2.
As described in connection with figure 2, the system is also provided with a monitoring unit MU and a control unit CU, connected to the storage unit SU and the monitoring unit MU. The monitoring unit MU is also connected to each server and to the load balance unit LBU. The function of the load balance unit is well known in the field, and it assists with directing new users to an appropriate server. The access to a server for new users may easily be controlled by the load balance unit. A new user will be directed to a particular server offering the desired service with the appropriate operating system. If more than one server fulfils the requirements, the load on each of these servers will be considered when directing the user. The embodiment 30 is also provided with means to store user information, such as a file server or similar, but since this is not a part of the invention it is not shown for the sake of clarity.
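The directing behaviour attributed to the load balance unit can be pictured as a simple selection over the currently booted servers, for example as in the sketch below; the server fields and the user-count load measure are invented for the example and are not prescribed by the patent.

```python
from dataclasses import dataclass


@dataclass
class BootedServer:
    name: str
    operating_system: str
    application: str
    current_users: int
    accepting_new_users: bool = True


def direct_new_user(servers: list[BootedServer],
                    wanted_os: str, wanted_app: str) -> BootedServer:
    """Pick the least-loaded server offering the requested OS and application."""
    candidates = [s for s in servers
                  if s.accepting_new_users
                  and s.operating_system == wanted_os
                  and s.application == wanted_app]
    if not candidates:
        raise LookupError("no server currently offers this service; "
                          "a spare may need to be booted")
    return min(candidates, key=lambda s: s.current_users)
```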
Figure 4 shows a fourth embodiment of the present invention, comprising a large number of servers (S1-Sn), where n is larger than 1 (in this example n=6), all connected to the load balance unit LBU, as described in connection with figure 3, and to the storage unit SU. The function of the control unit CU and the monitoring unit MU is as described above, but the storage unit SU in this embodiment comprises a third memory location storing information regarding different applications for each standard operating system. In this example there are two different types of application configurations for each operating system: AP1(OS1), AP2(OS1), AP1(OS2) and AP2(OS2). Examples of application configurations may be word processing applications, database applications, economy applications, etc. Alternatively, for service providers on the internet, the applications may be e-commerce applications, games or other programs, or applications providing information, such as news services.
In figure 4 only two servers are shown, but the dotted line between them indicates that there may be more servers, and in this example n is assumed to be 6. Servers S1-S4 are the primary servers, where the servers S1 and S2 have a first standard operating system OS1, e.g. MS Windows NT, and the servers S3 and S4 have a second standard operating system OS2, e.g. Linux. Servers S1 and S3 are preconfigured with applications AP1 of a first type and servers S2 and S4 are preconfigured with applications AP2 of a second type. In this way the system provides support for two different standard operating systems, each having two application configurations. The load balance unit only directs users to the appropriate server meeting the needs of each user.
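For illustration only, the assignment of the servers of figure 4 to operating systems and application configurations may be represented as a simple table; the Python structures below are hypothetical and merely mirror the example given above:

    # Illustrative mapping of the six servers of figure 4 to their
    # preconfigured standard operating system and application configuration.
    PRIMARY_SERVERS = {
        "S1": ("OS1", "AP1"),   # e.g. MS Windows NT with the first application type
        "S2": ("OS1", "AP2"),
        "S3": ("OS2", "AP1"),   # e.g. Linux with the first application type
        "S4": ("OS2", "AP2"),
    }
    SPARE_SERVERS = ["S5", "S6"]   # kept idle until a failure or a peak load occurs

    # Servers able to serve a user who needs AP1 under OS2:
    print([s for s, cfg in PRIMARY_SERVERS.items() if cfg == ("OS2", "AP1")])   # -> ['S3']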
Servers S5 and S6 are maintained as spare servers, in case one of the primary servers S1-S4 breaks down or if there is an unexpected peak load for a specific application in one standard operating system, e.g. AP1(OS1), i.e. word processing applications under the first standard operating system. If the load on the server S1 becomes too high, one of the spare servers, S5, may be configured to provide the same service. In this case, the spare server S5 is booted using OS1 and AP1, as will be discussed in detail in connection with Fig. 7.
A flow diagram, shown in figure 5, describes the method for selecting a preconfigured operating system in the embodiments shown in Figures 1 and 2, that is, when selecting only an operating system. The process starts in step 50.
Step 51: The control unit CU selects a preconfigured operating system (OS1 or OS2). This selection may be determined in several ways, as will be discussed below. Step 52: The next step of the process to reboot the server is to copy the booting data of the selected preconfigured operating system from the secure second memory location to the first memory location C:\. Step 53: The server boots up by using the uncorrupted data that was copied to the first memory location C:\. The flow ends in step 54.
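For illustration only, the three steps of figure 5 may be sketched as follows; the image paths and the boot_server() placeholder are assumptions and do not correspond to any particular implementation:

    import shutil

    # Sketch of the reboot sequence of figure 5. The image paths and the
    # boot_server() call are illustrative placeholders, not actual APIs.
    BOOT_IMAGES = {"OS1": "/secure/os1_boot.img", "OS2": "/secure/os2_boot.img"}
    FIRST_MEMORY_LOCATION = "/mnt/s1/c_drive/boot.img"   # the server's C:\ image

    def boot_server(image_path: str) -> None:
        print(f"booting from {image_path}")               # placeholder for step 53

    def reboot_with_clean_image(selected_os: str) -> None:
        # Step 51: the control unit has selected a preconfigured operating system.
        source = BOOT_IMAGES[selected_os]
        # Step 52: copy the uncorrupted booting data from the protected second
        # memory location to the first memory location C:\.
        shutil.copyfile(source, FIRST_MEMORY_LOCATION)
        # Step 53: boot the server from the freshly copied data; the flow then
        # ends in step 54.
        boot_server(FIRST_MEMORY_LOCATION)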
The most obvious use of the process shown in Figure 5 is to reboot the server S1 regularly by measuring the up-time of the server. When a first predetermined time has passed, preferably less than 1 hour, e.g. 10-15 minutes (but the predetermined time may be extended to, for instance, 100 hours or more), a flag is set in the control unit CU, indicating that the server should be rebooted as soon as possible. This is preferably performed when no users are accessing the server S1. If there are users accessing the server all the time, and the up-time of the server passes a second predetermined time, e.g. twice as long as the first predetermined time, a second flag may be set in the control unit CU, indicating that a request should be sent to the present users to log out and that no new users may be logged on to the server. The rebooting will take place as soon as the users have logged out or after a predetermined time, e.g. 10 minutes, after the request is sent, thereafter throwing out any users that have not logged out yet.
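This up-time policy may, for illustration, be condensed into a small decision routine; the minute values below merely reuse the example figures mentioned above and are not fixed by the method:

    # Sketch of the up-time policy described above. The thresholds are the
    # example values from the text (in minutes); they are illustrative only.
    FIRST_LIMIT = 15     # first predetermined time: flag "reboot when idle"
    SECOND_LIMIT = 30    # second predetermined time: ask users to log out
    GRACE_PERIOD = 10    # minutes allowed after the log-out request

    def uptime_action(uptime_minutes: int, active_users: int) -> str:
        if uptime_minutes < FIRST_LIMIT:
            return "no action"
        if active_users == 0:
            return "reboot now"              # first flag set and the server is idle
        if uptime_minutes < SECOND_LIMIT:
            return "reboot as soon as the server is idle"
        # Second flag: request log-out, refuse new log-ons and force the reboot
        # after the grace period even if some users remain.
        return f"request log-out, deny new users, reboot within {GRACE_PERIOD} minutes"

    print(uptime_action(uptime_minutes=45, active_users=3))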
If the control unit detects an accidental or voluntary unauthorised action in the server, e.g. someone trying to gain unauthorised access to the server, the server is shut down immediately and the users are thrown out, to minimise the damage the intruder may cause. The control unit thereafter selects the same preconfigured operating system as before to reboot the server using uncorrupted booting data. The server may also be rebooted for other reasons, for example to change the operating system.
If a privileged user has access to the server, the user may request a change of preconfigured operating system, thereby causing the control unit CU to select the desired preconfigured operating system and to reboot the server after the user has logged out from the server.
Figure 6 is a flow chart describing the process of the invention, applied to the embodiment shown in Figure 3. The process starts in step 60 and proceeds to step 61.
Step 61: The servers S1 and S2 are monitored by the monitoring unit MU, measuring up-time, number of users, load, unauthorised use, etc. of each server, as described in connection with figures 1 and 5. Step 62: The control unit CU uses the information obtained in step 61 to determine if a server needs to be rebooted and which preconfigured operating system should be used.
The decision regarding whether any server should be rebooted is taken in this step, which will be discussed in more detail below. If there is no server to reboot, the process returns to step 61, where the monitoring unit continues to monitor the servers. If the control unit CU makes a decision to reboot a server, the process continues to step 63. Step 63: The control unit removes present users from the selected server by performing the following: 1) Send an order to the load balance unit, via the monitoring unit, to direct new users to an alternative server having the same operating system as the server being shut down. If no alternative server is available, the control unit may boot a spare server (not shown), by using the process described in connection with figure 5, to create a server with the same preconfigured operating system as the one that should be shut down and rebooted. If there is no spare server available, new users will be denied access to any server until the server is rebooted.
2) Wait until present users accessing the server have logged out or, if more urgent, send a request to the present users to log out as soon as possible or, if extremely urgent, throw out the present users. This is indicated in boxes 63a and 63b, where a decision is made in box 63a to proceed to box 63c if no users are logged on to the server or if an urgent need to throw out present users has occurred (priority changed). On the other hand, if there are still users present on the server and the priority is not changed, a notification is sent to the users requesting that they log off, see box 63b. An alternative solution is to hand over present users and their running applications to an alternative server (if available).
3) Shut down the server.
When the server is shut down in box 63c, the process continues to step 64.
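For illustration, the removal and shutdown sequence of step 63 just described may be sketched as follows; all helper functions are placeholders standing in for the load balance unit, the monitoring unit and the server itself:

    import time

    # Sketch of step 63: removing users before the selected server is shut down.
    def redirect_new_users_away_from(server): print(f"LBU: stop sending users to {server}")
    def request_log_out(server): print(f"{server}: users asked to log out")
    def server_has_users(server): return False    # placeholder query
    def force_log_out_remaining_users(server): print(f"{server}: remaining users thrown out")
    def shut_down(server): print(f"{server}: shut down (box 63c)")

    def remove_users_and_shut_down(server: str, urgent: bool, grace_minutes: int = 10) -> None:
        # 1) New users are directed to an alternative server (or denied access).
        redirect_new_users_away_from(server)
        # 2) Wait for present users, ask them to log out, or throw them out at
        #    once if the priority has changed to urgent (boxes 63a and 63b).
        if not urgent:
            request_log_out(server)
            deadline = time.time() + grace_minutes * 60
            while server_has_users(server) and time.time() < deadline:
                time.sleep(10)
        force_log_out_remaining_users(server)
        # 3) Shut down the server (box 63c).
        shut_down(server)

    remove_users_and_shut_down("S1", urgent=False)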
Step 64: A preconfigured operating system is selected, as will be discussed in more detail below.
Step 65: The booting data needed to reboot is copied from the uncorrupted second memory location to the first memory location C:\ of the server. Step 66: The server is booted using the booting data stored in the first memory location C:\. The process returns to step 61 after the server has been booted up. Steps 62 and 64 are essential for determining if a server should be rebooted and which operating system should be installed during the rebooting process. These steps are preferably implemented in a rule matrix, where the input parameters may be up-time, number of users, load, unauthorised use and/or others, as discussed above. By combining the decision of the rule matrix with the change of load on the input to the servers, appropriate measures may be taken. The measures taken always depend on the result of the monitoring process of the servers.
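One possible, purely illustrative, form of such a rule matrix is sketched below; the thresholds and the choice of input parameters are assumptions and would in practice be tuned for the installation:

    from typing import Tuple

    # Illustrative rule matrix for steps 62 and 64.
    def decide(uptime_minutes: int, users: int, load: float,
               intrusion_detected: bool, current_os: str,
               demand_for_other_os: bool) -> Tuple[bool, str]:
        """Return (reboot needed?, operating system to reboot with)."""
        if intrusion_detected:
            return True, current_os       # reboot at once with uncorrupted data
        if uptime_minutes > 15 and users == 0:
            return True, current_os       # routine reboot while the server is idle
        if demand_for_other_os and users < 5 and load < 0.2:
            other = "OS2" if current_os == "OS1" else "OS1"
            return True, other            # free capacity for the other operating system
        return False, current_os          # keep running and keep monitoring

    # Example: an idle server with a long up-time is rebooted with the same OS.
    print(decide(uptime_minutes=45, users=0, load=0.0,
                 intrusion_detected=False, current_os="OS1",
                 demand_for_other_os=False))    # -> (True, 'OS1')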
A server should be rebooted with the same preconfigured operating system if, for instance, the up-time has exceeded a predetermined time, as described earlier, or if an unauthorised use has been detected.
A server should be rebooted with another preconfigured operating system if, for instance, there is a need for more capacity in another preconfigured operating system. One such example could be when the number of users and/or the load of a server exceeds a predetermined level. If there is no spare server available, a server having a low number of users is selected to be rebooted with another preconfigured operating system to increase the capacity. The present users of the server being rebooted may be transferred to another server, if available, or thrown out if the need is urgent.
Users will be denied access to the server if there is an escalating trend in the number of users, etc., which indicates that the capacity in the other preconfigured operating system will not be sufficient in the near future, and the server is rebooted to increase the capacity.
Figure 7 is a flow chart of the method according to the invention, applied to the embodiment shown in Figure 4. The following will happen in case of a failure in one of the primary servers, e.g. server S3. The process starts at step 70 and proceeds to step 71.
Step 71: The monitoring unit MU monitors all the servers (S1-Sn) as described earlier. When server S3 fails, the monitoring unit registers this and alerts the control unit CU. Step 72: A decision to boot up one of the spare servers, e.g. S5, is taken, since the server S3 may be experiencing a hardware failure.
Step 73: Any present users are removed from the server S3. This may be done rapidly if needed or during a longer period of time, as described in connection with Fig. 6, and the server is shut down in box 73a. Step 74: A standard operating system is selected, in this example OS2.
Step 75: The booting data of the selected standard operating system is copied from the second memory location to the first memory location of server S5. In this example only one second memory location is provided for all the servers. This saves memory space and makes it easier to update the different operating systems.
Step 76: An application configuration, in this example AP1(OS2), is selected to replace the server S3.
Step 77: The information regarding the selected application configuration is thereafter copied to an additional memory location D:\. Step 78: The server is thereafter booted using the booting data from the C drive and the application configuration on the D drive. The process returns to step 71.
When the spare server is up and running, any new users requiring the second type of operating system for the first type of applications, i.e. AP1(OS2), are directed to server S5 by the load balance unit, which is controlled by the control unit CU via the monitoring unit MU. No new users are directed to the failing server S3.
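For illustration, the take-over sequence of steps 74-78 may be sketched as follows; the file paths and helper names are assumptions introduced only for this sketch:

    import shutil

    # Sketch of the fail-over of figure 7, where spare server S5 takes over the
    # role of the failed server S3. Paths and names are illustrative assumptions.
    BOOT_IMAGES = {"OS1": "/secure/os1_boot.img", "OS2": "/secure/os2_boot.img"}
    APP_CONFIGS = {("AP1", "OS2"): "/secure/ap1_for_os2.cfg"}

    def replace_failed_server(spare: str, failed_os: str, failed_app: str) -> None:
        c_drive = f"/mnt/{spare}/c_drive/boot.img"   # first memory location C:\
        d_drive = f"/mnt/{spare}/d_drive/app.cfg"    # additional memory location D:\
        # Steps 74-75: select the operating system of the failed server and copy
        # its booting data from the shared second memory location.
        shutil.copyfile(BOOT_IMAGES[failed_os], c_drive)
        # Steps 76-77: select and copy the matching application configuration.
        shutil.copyfile(APP_CONFIGS[(failed_app, failed_os)], d_drive)
        # Step 78: boot the spare; the load balance unit is then told to direct
        # new AP1(OS2) users to it instead of to the failed server.
        print(f"booting {spare} with {failed_os} and {failed_app}")

    # replace_failed_server("S5", "OS2", "AP1")   # steps 71-73 precede this call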
As may be seen from this example, there is no need to provide a separate spare server for each server having a specific operating system and application configuration. Only one spare server is needed, but preferably half as many spare servers as primary servers should be provided, to achieve a higher quality-of-service level.
Another advantage of this system is that it reduces energy consumption, since fewer spare servers are needed, and the servers are used in a more cost-efficient and environmentally friendly manner, since one spare server may substitute for a number of different servers.
Figure 7 may also be used to illustrate what happens during an unexpected peak load, that is, when the capacity for a particular application should be increased. Step 71: The monitoring unit indicates that a specific server, e.g. server S2, experiences a number of users exceeding a predetermined level, e.g. 400 users. Step 72: If there are any spare servers available in idle mode, the control unit CU decides to boot up (at least) one spare server. In this example only S6 is available and idle, since S5 has previously taken over the workload of server S3, which has failed. Step 73: If necessary, users are removed from the spare server. In this case, no users have to be removed from the spare server S6 and the server does not need to be shut down in step 73a. Step 74: If the control unit CU has not already selected the operating system, in this example OS1, this is done here. Step 75: Then the booting data is copied from the second memory location to the first memory location of server S6. Step 76: If the application configuration has not already been selected, in this example AP2, this is done here.
Step 77: The application configuration is copied to the additional memory location D:\. Step 78: The spare server S6 is thereafter booted up using the booting data from the C drive and the application configuration on the D drive. The process returns to step 71. If there are no available idle spare servers, any other server being idle, i.e. having no users, may be reconfigured by first shutting down that server and then rebooting it with the desired operating system and application configuration.
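The peak-load variant may thus, for illustration, be reduced to finding an idle spare server (or any other idle server) for the overloaded configuration; the sketch below uses an assumed threshold of 400 users, matching the example above:

    from typing import Dict, List, Optional

    # Minimal sketch of the peak-load handling described above. The threshold
    # and the data structures are illustrative assumptions only.
    USER_LIMIT = 400    # example "predetermined level" for the number of users

    def overloaded_servers(user_counts: Dict[str, int]) -> List[str]:
        """Servers whose number of users exceeds the predetermined level."""
        return [s for s, n in user_counts.items() if n > USER_LIMIT]

    def pick_server_for_extra_capacity(idle_spares: List[str],
                                       other_idle_servers: List[str]) -> Optional[str]:
        """Prefer an idle spare; otherwise reuse any other idle server, which is
        first shut down and then rebooted with the needed OS and applications."""
        if idle_spares:
            return idle_spares[0]
        if other_idle_servers:
            return other_idle_servers[0]
        return None     # no extra capacity can be added at the moment

    # Example: S2 exceeds the limit and the only idle spare, S6, is selected.
    print(overloaded_servers({"S1": 120, "S2": 450}))       # -> ['S2']
    print(pick_server_for_extra_capacity(["S6"], []))       # -> 'S6'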
A flexible system has thus been achieved, in which it is possible to change the operating system of any server, to regularly reboot a server that has a long up-time, to meet peak loads, to reduce the number of back-up servers and to increase the security of a server system.
The described invention may be implemented as a server sharing system for a service provider having a large number of servers to fulfil the needs and requirements of the clients, but the invention should not be limited to this. It is fully possible to implement the invention in any network where there is a need to regularly reboot the server(s) and sometimes even to change the operating system of a server, such as a company network.
The invention also has the advantage that the life span of a server is increased due to the cycling of the servers (rebooting due to long up-time) and it also increases the security of the system since an uncorrupted version of the preconfigured operating system is downloaded each time the server is rebooted.
In this description the word "users" is intended to cover both persons and machines using the resources and services provided by the servers. The methods described above are preferably implemented as a software application that generally includes instructions to carry out the method steps described in connection with figures 5, 6 and 7. The software application is preferably stored in the control unit CU during operation, but may naturally be provided on a CD-ROM, diskettes or other memory storage media, which the control unit may access for running the application. In all the embodiments discussed here, the function of the control unit, the monitoring unit and the load balance unit is controlled by computer programs run on the appropriate unit.

Claims

1. An arrangement for use in a computer network for providing at least one service to at least one client computer, said client computer using a first operating system, said arrangement comprising at least a first (S1) and a second server (S2) accessible by the client computer, each of said servers being connectable to a first memory location (C:\), said arrangement comprising: storage means comprising at least a first (OS1) and a second (OS2) preconfigured operating system stored in such a way that it can be retrieved by the at least one server, but cannot be altered by an unauthorized user, control means for monitoring the function of the at least first and second server, control means for initiating a reboot of the first server, control means for selecting an operating system with which to reboot the first server, control means for downloading the booting information of the selected preconfigured operating system to the first memory location of the first server and booting the server using the downloaded booting information.
2. An arrangement according to claim 1, further comprising storage means comprising at least a first and a second version of a first application, said first version being adapted to the first operating system (OS1) and said second version being adapted to the second operating system (OS2), control means for selecting the version of said at least one application adapted to the selected preconfigured operating system to be downloaded to the first server, and control means for downloading and installing the at least one application to the first memory location of the first server.
3. An arrangement according to claim 1 or 2, further comprising a load balance unit (LBU) connected to each of the servers, for directing new users to one of the servers.
4. An arrangement according to any one of the preceding claims, further comprising control means arranged to initiate a reboot of said first server in dependence of the load in the network.
5. An arrangement according to any one of the preceding claims, further comprising control means arranged to select an operating system in dependence on the load in at least one of the servers.
6. An arrangement according to any one of the claims 2-5, further comprising control means arranged to select the version of the application in dependence on the load on at least one application in the server.
7. A method for use in a computer network for providing at least one service to at least one client computer, said client computer using a first operating system, said arrangement comprising at least a first (S1) and a second server (S2) accessible by the client computer, each one of said servers being connectable to a first memory location (C:\), each one of said first and second server being connectable to storage means comprising at least a first (OS1) and a second (OS2) preconfigured operating system stored in such a way that it can be retrieved by the at least one server, but cannot be altered by an unauthorized user, said method comprising the steps of monitoring the function of the at least first and second server, determining if said first server needs to be rebooted, if the first server needs to be rebooted, selecting an operating system with which to reboot the first server, downloading the booting information of the selected preconfigured operating system to the first memory location of the first server and booting the server using the downloaded booting information.
8. A method according to claim 7, wherein said servers are connectable to storage means comprising at least a first and a second version of a first application, said first version being adapted to the first operating system (OS1) and said second version being adapted to the second operating system (OS2), said method further comprising the steps of selecting the version of said at least one application adapted to the selected preconfigured operating system to be downloaded to the first server, downloading and installing the at least one application to the first memory location of the first server.
9. A method according to claim 7 or 8, further comprising the step of determining the need for reboot in dependence on the load in the network.
10. A method according to any one of the claims 7-9, further comprising the step of selecting the operating system with which to reboot the first server in dependence of the load on at least one of the operating systems.
11. A method according to any one of the claims 8-10, further comprising the step of selecting the version of the application in dependence of the load on the at least one application.
12. A method according to any one of the claims 7-11, further comprising the step of directing a user accessing the network to one of said servers by means of a load balancing unit (LBU).
13. A computer program product for use in a computer network for providing at least one service to at least one client computer, said client computer using a first operating system, said arrangement comprising at least a first (S1) and a second server (S2) accessible by the client computer, each of said servers being connectable to a first memory location (C:\), said first and second server being connectable to first storage means comprising at least a first (OS1) and a second (OS2) preconfigured operating system stored in such a way that it can be retrieved by the at least one server, said computer program product comprising code means which, when it is executed in a computer, will make the computer perform the following functions: determining if said first server needs to be rebooted, if the first server needs to be rebooted, selecting an operating system with which to reboot the first server, downloading the booting information of the selected operating system to the first memory location of the first server and booting the server using the downloaded booting information.
14. A computer program product according to claim 13, wherein said network further comprises storage means comprising at least a first and a second version of a first application, said first version being adapted to the first operating system (OS1) and said second version being adapted to the second operating system (OS2), said code means being arranged to make the computer perform the following steps: selecting the version of said at least one application adapted to the selected preconfigured operating system to be downloaded to the first server, downloading and installing the at least one application to the first memory location of the first server.
15. A computer program product according to claim 13 or 14, wherein the code means will initiate a reboot of the first server in dependence of the load in the network.
16. A computer program product according to any one of claims 13-15, wherein the operating system will be selected in dependence of the load on at least one operating system.
PCT/SE2002/001353 2001-07-04 2002-07-04 A system and a method for selecting a preconfigured operating system for a server WO2003005192A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE0102405A SE0102405D0 (en) 2001-07-04 2001-07-04 A system and a method for selecting a preconfigured operating system for a server
SE0102405-8 2001-07-04

Publications (1)

Publication Number Publication Date
WO2003005192A1 true WO2003005192A1 (en) 2003-01-16

Family

ID=20284744

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2002/001353 WO2003005192A1 (en) 2001-07-04 2002-07-04 A system and a method for selecting a preconfigured operating system for a server

Country Status (2)

Country Link
SE (1) SE0102405D0 (en)
WO (1) WO2003005192A1 (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004092951A2 (en) * 2003-04-18 2004-10-28 Sap Ag Managing a computer system with blades
EP1489498A1 (en) * 2003-06-16 2004-12-22 Sap Ag Managing a computer system with blades
EP1594057A1 (en) * 2004-04-15 2005-11-09 Raytheon Company System and method for computer cluster virtualization using dynamic boot images and virtual disk
GB2444287A (en) * 2006-12-02 2008-06-04 David Peter Neupert Backup Server System
US7433931B2 (en) 2004-11-17 2008-10-07 Raytheon Company Scheduling in a high-performance computing (HPC) system
US7475274B2 (en) 2004-11-17 2009-01-06 Raytheon Company Fault tolerance and recovery in a high-performance computing (HPC) system
US7590683B2 (en) 2003-04-18 2009-09-15 Sap Ag Restarting processes in distributed applications on blade servers
US7711977B2 (en) 2004-04-15 2010-05-04 Raytheon Company System and method for detecting and managing HPC node failure
US8086693B2 (en) 2004-01-20 2011-12-27 International Business Machines Corporation System and method for replacing an application on a server
US8458324B2 (en) 2009-08-25 2013-06-04 International Business Machines Corporation Dynamically balancing resources in a server farm
US8910175B2 (en) 2004-04-15 2014-12-09 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
US9037833B2 (en) 2004-04-15 2015-05-19 Raytheon Company High performance computing (HPC) node having a plurality of switch coupled processors
US9178784B2 (en) 2004-04-15 2015-11-03 Raytheon Company System and method for cluster management based on HPC architecture

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5675723A (en) * 1995-05-19 1997-10-07 Compaq Computer Corporation Multi-server fault tolerance using in-band signalling
US5742829A (en) * 1995-03-10 1998-04-21 Microsoft Corporation Automatic software installation on heterogeneous networked client computer systems
EP0838753A1 (en) * 1996-10-28 1998-04-29 Sun Microsystems, Inc. Method and apparatus for initializing a device
GB2346715A (en) * 1999-02-11 2000-08-16 Su Chen Chang A computer with multiple operating systems
EP1037133A1 (en) * 1999-03-15 2000-09-20 International Business Machines Corporation Method and apparatus for alternation between instances of operating systems in computer systems
US6134673A (en) * 1997-05-13 2000-10-17 Micron Electronics, Inc. Method for clustering software applications


Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004092951A2 (en) * 2003-04-18 2004-10-28 Sap Ag Managing a computer system with blades
WO2004092951A3 (en) * 2003-04-18 2005-01-27 Sap Ag Managing a computer system with blades
US7610582B2 (en) * 2003-04-18 2009-10-27 Sap Ag Managing a computer system with blades
US7590683B2 (en) 2003-04-18 2009-09-15 Sap Ag Restarting processes in distributed applications on blade servers
EP1489498A1 (en) * 2003-06-16 2004-12-22 Sap Ag Managing a computer system with blades
US8086693B2 (en) 2004-01-20 2011-12-27 International Business Machines Corporation System and method for replacing an application on a server
US10621009B2 (en) 2004-04-15 2020-04-14 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
KR101173712B1 (en) * 2004-04-15 2012-08-13 레이던 컴퍼니 System and method for computer cluster virtualization using dynamic boot images and virtual disk
US10769088B2 (en) 2004-04-15 2020-09-08 Raytheon Company High performance computing (HPC) node having a plurality of switch coupled processors
US11093298B2 (en) 2004-04-15 2021-08-17 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
WO2005106655A1 (en) * 2004-04-15 2005-11-10 Raytheon Company System and method for computer cluster virtualization using dynamic boot images and virtual disk
US7711977B2 (en) 2004-04-15 2010-05-04 Raytheon Company System and method for detecting and managing HPC node failure
US9832077B2 (en) 2004-04-15 2017-11-28 Raytheon Company System and method for cluster management based on HPC architecture
EP1594057A1 (en) * 2004-04-15 2005-11-09 Raytheon Company System and method for computer cluster virtualization using dynamic boot images and virtual disk
US10289586B2 (en) 2004-04-15 2019-05-14 Raytheon Company High performance computing (HPC) node having a plurality of switch coupled processors
CN100447746C (en) * 2004-04-15 2008-12-31 雷西昂公司 System and method for computer cluster virtualization using dynamic boot images and virtual disk
US9928114B2 (en) 2004-04-15 2018-03-27 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
US8910175B2 (en) 2004-04-15 2014-12-09 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
US8984525B2 (en) 2004-04-15 2015-03-17 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
US9037833B2 (en) 2004-04-15 2015-05-19 Raytheon Company High performance computing (HPC) node having a plurality of switch coupled processors
US9178784B2 (en) 2004-04-15 2015-11-03 Raytheon Company System and method for cluster management based on HPC architecture
US9189278B2 (en) 2004-04-15 2015-11-17 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
US9189275B2 (en) 2004-04-15 2015-11-17 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
US9904583B2 (en) 2004-04-15 2018-02-27 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
US9594600B2 (en) 2004-04-15 2017-03-14 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
US7475274B2 (en) 2004-11-17 2009-01-06 Raytheon Company Fault tolerance and recovery in a high-performance computing (HPC) system
US8209395B2 (en) 2004-11-17 2012-06-26 Raytheon Company Scheduling in a high-performance computing (HPC) system
US7433931B2 (en) 2004-11-17 2008-10-07 Raytheon Company Scheduling in a high-performance computing (HPC) system
GB2444287B (en) * 2006-12-02 2011-04-27 David Peter Neupert Server backup system
GB2444287A (en) * 2006-12-02 2008-06-04 David Peter Neupert Backup Server System
US9288147B2 (en) 2009-08-25 2016-03-15 International Business Machines Corporation Dynamically balancing resources in a server farm
US8458324B2 (en) 2009-08-25 2013-06-04 International Business Machines Corporation Dynamically balancing resources in a server farm

Also Published As

Publication number Publication date
SE0102405D0 (en) 2001-07-04

Similar Documents

Publication Publication Date Title
US9971621B1 (en) Hotpooling virtual machines
US9424023B2 (en) Unified system lifecycle for components in an integrated software and hardware system
JP4897710B2 (en) How to determine the optimal order for restarting multiple servers in a data center
JP5011073B2 (en) Server switching method and server system
EP1202175B1 (en) System and method for distributed management of shared computers
US7900206B1 (en) Information technology process workflow for data centers
US8276208B2 (en) Security-level enforcement in virtual-machine fail-over
EP2016491B1 (en) System and method for separating multiple workloads processing in a single computer operating environment
US20030126202A1 (en) System and method for dynamic server allocation and provisioning
US20040054780A1 (en) Dynamic adaptive server provisioning for blade architectures
US20050108703A1 (en) Proactive policy-driven service provisioning framework
WO2003005192A1 (en) A system and a method for selecting a preconfigured operating system for a server
US20100186010A1 (en) Dynamic Checking of Hardware Resources for Virtual Environments
US10007586B2 (en) Deferred server recovery in computing systems
CN107404524A (en) A kind of method and device of distributed type assemblies node visit
US20070124573A1 (en) Method for rapid startup of a computer system
CN100421382C (en) Maintaining unit structure of high extending internet superserver and its method
US20200409806A1 (en) Virtual-machine-specific failover protection
JP2020129184A (en) Cluster system, control method thereof, server, and program
JP2007264904A (en) Automatic program update system
US10153937B1 (en) Layered datacenter components
JP5486038B2 (en) Server switching method and server system
US9461873B1 (en) Layered datacenter
WO2002073438A1 (en) Method and system for management of remote devices
Smarts The big picture

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: COMMUNICATION UNDER RULE 69 EPC ( EPO FORM 1205A DATED 19/07/04 )

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP