US20070233825A1 - Reconfigurable, virtual processing system, cluster, network and method - Google Patents

Reconfigurable, virtual processing system, cluster, network and method

Info

Publication number
US20070233825A1
Authority
US
United States
Prior art keywords
virtual
logic
node
network
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/759,078
Inventor
Vern Brownell
Pete Manca
Ben Sprachman
Paul Curtis
Ewan Milne
Max Smith
Alan Greenspan
Scott Geng
Dan Busby
Edward Duffy
Peter Schulter
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed (https://patents.darts-ip.com/?family=21899447&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=US20070233825(A1)). "Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.
Application filed by Individual
Priority to US11/759,078
Publication of US20070233825A1
Assigned to SILICON VALLEY BANK: security agreement (assignor: EGENERA, INC.)
Assigned to PHAROS CAPITAL PARTNERS II-A, L.P., as collateral agent: security agreement (assignor: EGENERA, INC.)
Assigned to EGENERA, INC.: release of security interest (assignor: SILICON VALLEY BANK)
Legal status: Abandoned

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 12/00 Data switching networks
    • H04L 12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00 Packet switching elements
    • H04L 49/35 Switches specially adapted for specific applications
    • H04L 49/351 Switches specially adapted for specific applications for local area network [LAN], e.g. Ethernet switches
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00 Packet switching elements
    • H04L 49/35 Switches specially adapted for specific applications
    • H04L 49/356 Switches specially adapted for specific applications for storage area networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00 Packet switching elements
    • H04L 49/70 Virtual switches
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 61/00 Network arrangements, protocols or services for addressing or naming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/131 Protocols for games, networked simulations or virtual reality
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/22 Parsing or analysis of headers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/30 Definitions, standards or architectural aspects of layered protocol stacks
    • H04L 69/32 Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L 69/322 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L 69/329 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00 Packet switching elements
    • H04L 49/35 Switches specially adapted for specific applications
    • H04L 49/351 Switches specially adapted for specific applications for local area network [LAN], e.g. Ethernet switches
    • H04L 49/352 Gigabit ethernet switching [GBPS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00 Packet switching elements
    • H04L 49/35 Switches specially adapted for specific applications
    • H04L 49/354 Switches specially adapted for specific applications for supporting virtual local area networks [VLAN]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00 Packet switching elements
    • H04L 49/55 Prevention, detection or correction of errors
    • H04L 49/557 Error correction, e.g. fault recovery or fault tolerance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1034 Reaction to server failures by a load balancer

Definitions

  • the present invention relates to computing systems for enterprises and application service providers and, more specifically, to processing systems having virtualized communication networks and storage for quick deployment and reconfiguration.
  • Deployment is also problematic. For example, when deploying 24 conventional servers, more than 100 discrete connections may be required to configure the overall system. Managing these cables is an ongoing challenge, and each represents a failure point. Attempting to mitigate the risk of failure by adding redundancy can double the cabling, exacerbating the problem while increasing complexity and costs.
  • the present invention features a platform and method for computer processing in which virtual processing area networks may be configured and deployed.
  • a computer processing platform includes a plurality of computer processors connected to an internal communication network. At least one control node is in communication with an external communication network and an external storage network having an external storage address space. The at least one control node is connected to the internal network and thereby communicates with the plurality of computer processors.
  • Configuration logic defines and establishes a virtual processing area network having a corresponding set of computer processors from the plurality of processors, a virtual local area communication network providing communication among the set of computer processors but excluding the processors from the plurality not in the defined set, and a virtual storage space with a defined correspondence to the address space of the storage network.
  • FIG. 1 is a system diagram illustrating one embodiment of the invention.
  • FIGS. 2 A-C are diagrams illustrating the communication links established according to one embodiment of the invention.
  • FIGS. 3 A-B are diagrams illustrating the networking software architecture of certain embodiments of the invention.
  • FIGS. 4 A-C are flowcharts illustrating driver logic according to certain embodiments of the invention.
  • FIG. 5 illustrates service clusters according to certain embodiments of the invention.
  • FIG. 6 illustrates the storage software architecture of certain embodiments of the invention.
  • FIG. 7 illustrates the processor-side storage logic of certain embodiments of the invention.
  • FIG. 8 illustrates the storage address mapping logic of certain embodiments of the invention.
  • FIG. 9 illustrates the cluster management logic of certain embodiments of the invention.
  • Preferred embodiments of the invention provide a processing platform from which virtual systems may be deployed through configuration commands.
  • the platform provides a large pool of processors from which a subset may be selected and configured through software commands to form a virtualized network of computers (“processing area network” or “processor clusters”) that may be deployed to serve a given set of applications or customer.
  • the virtualized processing area network (PAN) may then be used to execute customer specific applications, such as web-based server applications.
  • the virtualization may include virtualization of local area networks (LANs) or the virtualization of I/O storage.
  • a preferred hardware platform 100 includes a set of processing nodes 105 a-n connected to switch fabrics 115 a,b via high-speed interconnects 110 a,b.
  • the switch fabrics 115 a,b are also connected to at least one control node 120 a,b that is in communication with an external IP network 125 (or other data communication network), and with a storage area network (SAN) 130.
  • a management application 135 may access one or more of the control nodes via the IP network 125 to assist in configuring the platform 100 and deploying virtualized PANs.
  • processing nodes 105 a - n are contained in a single chassis and interconnected with a fixed, pre-wired mesh of point-to-point (PtP) links.
  • Each processing node 105 is a board that includes one or more (e.g., 4) processors 106, one or more network interface cards (NICs) 107, and local memory (e.g., greater than 4 Gbytes) that, among other things, includes some BIOS firmware for booting and initialization.
  • Each control node 120 is a single board that includes one or more (e.g., 4) processors, local memory, and local disk storage for holding independent copies of the boot image and initial file system that is used to boot operating system software for the processing nodes 105 and for the control nodes 120.
  • Each control node communicates with SAN 130 via 100 megabyte/second fibre channel adapter cards 128 connected to fibre channel links 122 , 124 and communicates with the Internet (or any other external network) 125 via an external network interface 129 having one or more Gigabit Ethernet NICs connected to Gigabit Ethernet links 121 , 123 .
  • Each control node includes a low speed Ethernet port (not shown) as a dedicated management port, which may be used instead of remote, web-based management via management application 135 .
  • each switch fabric is composed of one or more 30-port Giganet switches 115, such as the NIC-CLAN 1000 and cLAN 5300 switches, and the various processing and control nodes use corresponding NICs for communication with such a fabric module.
  • Giganet switch fabrics have the semantics of a Non-Broadcast Multiple Access (NBMA) network. All inter-node communication is via a switch fabric. Each link is formed as a serial connection between a NIC 107 and a port in the switch fabric 115 . Each link operates at 112 megabytes/second.
  • multiple cabinets or chassis may be connected together to form larger platforms. And in other embodiments the configuration may differ; for example, redundant connections, switches and control nodes may be eliminated.
  • the platform supports multiple, simultaneous and independent processing area networks (PANs).
  • Each PAN, through software commands, is configured to have a corresponding subset of processors 106 that may communicate via a virtual local area network that is emulated over the PtP mesh.
  • Each PAN is also configured to have a corresponding virtual I/O subsystem. No physical deployment or cabling is needed to establish a PAN.
  • software logic executing on the processor nodes and/or the control nodes emulates switched Ethernet semantics; other software logic executing on the processor nodes and/or the control nodes provides virtual storage subsystem functionality that follows SCSI semantics and that provides independent I/O address spaces for each PAN.
  • Certain preferred embodiments allow an administrator to build virtual, emulated LANs using virtual components, interfaces, and connections.
  • Each of the virtual LANs can be internal and private to the platform 100 , or multiple processors may be formed into a processor cluster externally visible as a single IP address.
  • the virtual networks so created emulate a switched Ethernet network, though the physical, underlying network is a PtP mesh.
  • the virtual network utilizes IEEE MAC addresses, and the processing nodes support IETF ARP processing to identify and associate IP addresses with MAC addresses. Consequently, a given processor node replies to an ARP request consistently whether the ARP request came from a node internal or external to the platform.
  • FIG. 2A shows an exemplary network arrangement that may be modeled or emulated.
  • a first subnet 202 is formed by processing nodes PN 1 , PN 2 , and PN k that may communicate with one another via switch 206 .
  • a second subnet 204 is formed by processing nodes PN k and PN m that may communicate with one another via switch 208 .
  • one node on a subnet may communicate directly with another node on the subnet; for example, PN 1 may send a message to PN 2 .
  • the semantics also allow one node to communicate with a set of the other nodes; for example PN 1 may send a broadcast message to other nodes.
  • PN 1 and PN 2 cannot directly communicate with PN m because PN m is on a different subnet.
  • higher-layer networking software, which would have a fuller understanding of both subnets, would need to be utilized.
  • a given switch may communicate via an “uplink” to another switch or the like.
  • the need for such uplinks differs from the case in which the switches are physical. Specifically, since the switches are virtual and modeled in software, they may scale horizontally as wide as needed. (In contrast, physical switches have a fixed number of physical ports, so uplinks are sometimes needed to provide horizontal scalability.)
  • FIG. 2B shows exemplary software communication paths and logic used under certain embodiments to model the subnets 202 and 204 of FIG. 2A .
  • the communication paths 212 connect processing nodes PN 1 , PN 2 , PN k , and PN m , specifically their corresponding processor-side network communication logic 210 , and they also connect processing nodes to control nodes. (Though drawn as a single instance of logic for the purpose of clarity, PN k may have multiple instances of the corresponding processor logic, one per subnet, for example.)
  • management logic and the control node logic are responsible for establishing, managing and destroying the communication paths. The individual processing nodes are not permitted to establish such paths.
  • the processor logic and the control node logic together emulate switched Ethernet semantics over such communication paths.
  • the control nodes have control node-side virtual switch logic 214 to emulate some (but not necessarily all) of the semantics of an Ethernet switch
  • the processor logic includes logic to emulate some (but not necessarily all) of the semantics of an Ethernet driver.
  • one processor node may communicate directly with another via a corresponding virtual interface 212 .
  • a processor node may communicate with the control node logic via a separate virtual interface.
  • the underlying switch fabric and associated logic (e.g., switch fabric manager logic, not shown) establish the virtual interfaces (VIs) and reliable virtual interfaces (RVIs) over which the processor nodes and control nodes communicate.
  • if node PN 1 is to communicate with node PN 2, it ordinarily does so via virtual interface 212 1-2.
  • preferred embodiments allow communication between PN 1 and PN 2 to occur via switch emulation logic, if for example VI 212 1-2 is not operating satisfactorily. In this case a message may be sent via VI 212 1-switch206 and via VI 212 switch206-2 . If PN 1 is to broadcast or multicast a message to other nodes in the subnet 202 it does so by sending the message to control node-side logic 214 via virtual interface 212 1-switch206 .
  • Control node-side logic 214 then emulates the broadcast or multicast functionality by cloning and sending the message to the other relevant nodes using the relevant VIs.
  • the same or analogous VIs may be used to convey other messages requiring control node-side logic.
  • control node-side logic includes logic to support the address resolution protocol (ARP), and VIs are used to communicate ARP replies and requests to the control node.
  • FIG. 2C shows the exemplary physical connections of certain embodiments to realize the subnets of FIGS. 2A and B.
  • each instance of processing network logic 210 communicates with the switch fabric 115 via a PtP link 216 of interconnect 110.
  • the control node has multiple instances of switch logic 214 and each communicates over a PtP connection 216 to the switch fabric.
  • the virtual interfaces of FIG. 2B include the logic to convey information over these physical links, as will be described further below.
  • an administrator defines the network topology of a PAN and specifies (e.g., via a utility within the management software 135 ) MAC address assignments of the various nodes.
  • the MAC address is virtual, identifying a virtual interface, and not tied to any specific physical node.
  • MAC addresses follow the IEEE 48-bit address format, but their contents include a "locally administered" bit (set to 1), the serial number of the control node 120 on which the virtual interface was originally defined (more below), and a count value from a persistent sequence counter kept in NVRAM on the control node.
  • These MACs will be used to identify the nodes (as is conventional) at a layer 2 level. For example, in replying to ARP requests (whether from a node internal to the PAN or on an external network) these MACs will be included in the ARP reply.
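
To make the address layout above concrete, the following is a minimal sketch of how such a virtual MAC might be assembled. The patent text does not give the field widths, so the 16-bit serial field, the 24-bit counter field, and the helper name are assumptions for illustration only.

    def make_virtual_mac(control_node_serial: int, sequence_count: int) -> str:
        """Assemble a 48-bit, locally administered virtual MAC (field widths assumed)."""
        first_octet = 0x02                     # "locally administered" bit set; unicast
        serial = control_node_serial & 0xFFFF  # assumed 16-bit control-node serial field
        count = sequence_count & 0xFFFFFF      # assumed 24-bit persistent (NVRAM) counter field
        octets = [
            first_octet,
            (serial >> 8) & 0xFF, serial & 0xFF,
            (count >> 16) & 0xFF, (count >> 8) & 0xFF, count & 0xFF,
        ]
        return ":".join(f"{o:02x}" for o in octets)

    if __name__ == "__main__":
        print(make_virtual_mac(control_node_serial=0x1234, sequence_count=7))  # 02:12:34:00:00:07
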
  • the control node-side networking logic maintains data structures that contain information reflecting the connectivity of the LAN (e.g., which nodes may communicate to which other nodes).
  • the control node logic also allocates and assigns VI (or RVI) mappings to the defined MAC addresses and allocates and assigns VIs (or RVIs) between the control nodes and between the control nodes and the processing nodes.
  • the logic would allocate and assign VIs 212 of FIG. 2B .
  • the naming of the VIs and RVIs in some embodiments is a consequence of the switching fabric and the switch fabric manager logic employed.
  • BIOS-based boot logic initializes each processor 106 of the node 105 and, among other things, establishes a (or discovers the) VI 212 to the control node logic.
  • the processor node then obtains from the control node relevant data link information, such as the processor node's MAC address, and the MAC identities of other devices within the same data link configuration.
  • Each processor registers its IP address with the control node, which then binds the IP address to the node and an RVI (e.g., the RVI on which the registration arrived). In this fashion, the control node will be able to bind IP addresses for each virtual MAC for each node on a subnet.
  • the processor node also obtains the RVI or VI-related information for its connections to other nodes or to control node networking logic.
  • FIG. 3A details the processor-side networking logic 210 and FIG. 3B details the control node-side networking 310 logic of certain embodiments.
  • the processor side logic 210 includes IP stack 305 , virtual network driver 310 , ARP logic 350 , RCLAN layer 315 , and redundant Giganet drivers 320 a,b .
  • the control node-side logic 310 includes redundant Giganet drivers 325 a,b , RCLAN layer 330 , virtual Cluster proxy logic 360 , virtual LAN server 335 , ARP server logic 355 , virtual LAN proxy 340 , and physical LAN drivers 345 .
  • the IP stack 305 is the communication protocol stack provided with the operating system (e.g., Linux) used by the processing nodes 106 .
  • the IP stack provides a layer 3 interface for the applications and operating system executing on a processor 106 to communicate with the simulated Ethernet network.
  • the IP stack provides packets of information to the virtual Ethernet layer 310 in conjunction with providing a layer 3, IP address as a destination for that packet.
  • the IP stack logic is conventional except that certain embodiments avoid checksum calculations and logic.
  • the virtual Ethernet driver 310 will appear to the IP stack 305 like a “real” Ethernet driver.
  • the virtual Ethernet driver 310 receives IP packets or datagrams from the IP stack for subsequent transmission on the network, and it receives packet information from the network to be delivered to the stack as an IP packet.
  • the stack builds the MAC header.
  • the “normal” Ethernet code in the stack may be used.
  • the virtual Ethernet driver receives the packet with the MAC header already built and the correct MAC address already in the header.
  • the virtual Ethernet driver 310 dequeues 405 outgoing IP datagrams so that the packet may be sent on the network.
  • the standard IP stack ARP logic is used.
  • the driver intercepts all ARP packets entering and leaving the system to modify them so that the proper information ends up in each node's ARP tables.
  • the normal ARP logic places the correct MAC address in the link layer header of the outgoing packet before the packet is queued to the Ethernet driver.
  • the driver then just examines the link layer header and destination MAC to determine how to send the packet. The driver does not directly manipulate the ARP table (except for the occasional invalidation of ARP entries).
  • the driver 310 determines 415 whether ARP logic 350 has MAC address information (more below) associated with the IP address in the dequeued packet. If the ARP logic 350 has the information, the information is used to send 420 the packet accordingly. If the ARP logic 350 does not have the information, the driver needs to determine such information, and in certain preferred embodiments, this information is obtained as a result of an implementation of the ARP protocol as discussed in connection with FIGS. 4 B-C.
  • the driver analyzes the information returned from the ARP logic 350 to determine where and how to send the packet. Specifically, the driver looks at the address to determine whether the MAC address is in a valid format or in a particular invalid format. For example, in one embodiment, internal nodes (i.e., PAN nodes internal to the platform) are signaled through a combination of setting the locally administered bit, the multicast bit, and another predefined bit pattern in the first byte of the MAC address.
  • the overall pattern is one that is highly improbable to occur as a valid MAC address.
  • the IP address associated with that MAC address is for a node external at least to the relevant subnet and in preferred embodiments is external to the platform.
  • the driver prepends the packet with a TLV (type-length-value) header.
  • the logic then sends the packet to the control node over a pre-established VI.
  • the control node then handles the rest of the transmission as appropriate.
  • the invalid format signals that the IP-addressed node is an internal node, and the information in the MAC address is used to help identify the VI (or RVI) directly connecting the two processing nodes.
  • the ARP table entry may hold information identifying the RVI 212 to use to send the packet, e.g., 212 1-2 , to another processing node.
  • the driver prepends the packet with a TLV header. It then places address information into the header as well as information identifying the Ethernet protocol type. The logic then selects the appropriate VI (or RVI) on which to send the encapsulated packet.
  • though the ARP table may contain information that actually specifies the RVI to use, many other techniques may be employed. For example, the information in the table may provide such information indirectly, e.g., by pointing to the information of interest or otherwise identifying it without containing it.
  • the driver sends the message to the control node on a defined VI.
  • the control node then clones the packet and sends it to all nodes (excluding the sending node) and the uplink accordingly.
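
The transmit-path decision described in the preceding items can be summarized with the sketch below. The "invalid" marker pattern, the TLV layout, and the VI/RVI objects are assumptions made only to illustrate the dispatch between a direct node-to-node RVI, the control node acting as a switch for broadcasts, and the control node acting as a gateway for external destinations.

    import struct

    INTERNAL_MARKER = 0x03   # assumed: locally administered + multicast bits flag an internal node
    BROADCAST = b"\xff" * 6

    def build_tlv_header(dest_mac: bytes, ethertype: int) -> bytes:
        # Hypothetical TLV layout: type (1 byte) | length (1 byte) | dest MAC (6) | ethertype (2)
        return struct.pack("!BB6sH", 1, 10, dest_mac, ethertype)

    def is_internal(dest_mac: bytes) -> bool:
        return dest_mac != BROADCAST and (dest_mac[0] & INTERNAL_MARKER) == INTERNAL_MARKER

    def transmit(packet: bytes, dest_mac: bytes, mac_to_rvi: dict, control_node_vi) -> None:
        frame = build_tlv_header(dest_mac, 0x0800) + packet
        if dest_mac == BROADCAST:
            control_node_vi.send(frame)       # control node clones to all other subnet members
        elif is_internal(dest_mac):
            mac_to_rvi[dest_mac].send(frame)  # "invalid" MAC encodes the direct RVI to use
        else:
            control_node_vi.send(frame)       # valid external MAC: control node forwards onward
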
  • the packet is put aside until ARP resolution is completed. Once the ARP layer has finished ARPing, the packets held back pending ARP get their datalink headers built and the packets are then sent to the driver.
  • If the ARP logic has no mapping for the IP address of an IP packet from the IP stack and, consequently, the driver 310 is unable to determine the associated addressing information (i.e., MAC address or RVI-related information), the driver obtains such information by following the ARP protocol. Referring to FIGS. 4 B-C, the driver builds 425 an ARP request packet containing the relevant IP address for which there is no MAC mapping in the local ARP table. The node then prepends 430 the ARP packet with a TLV-type header. The ARP request is then sent via a dedicated RVI to the control node-side networking logic, specifically the virtual LAN server 335.
  • the ARP request packet is processed 435 by the control node and broadcast 440 to the relevant nodes.
  • the control node will flag whether the requesting node is part of an IP service cluster.
  • the Ethernet driver logic 310 at the relevant nodes receives 445 the ARP request, and determines 450 if it is the target of the ARP request by comparing the target IP address with a list of locally configured IP addresses by making calls to the node's IP stack. If it is not the target, it passes up the packet without modification. If it is the target, the driver creates 460 a local MAC header from the TLV header, updates 465 the local ARP table, and creates an ARP reply. The driver modifies the information in the ARP request (mainly the source MAC) and then passes the ARP request up normally for the upper layers to handle. It is the upper layers that form the ARP reply when necessary.
  • the reply among other things contains the MAC address of the replying node and has a bit set in the TLV header indicating that the reply is from a local node.
  • the node responds according to IETF-type ARP semantics (in contrast to ATM ARP protocols in which ARP replies are handled centrally).
  • the reply is then sent 470 .
  • control node logic 335 receives 473 the reply and modifies it. For example, the control node may substitute the MAC address of a replying, internal node with information identifying the source cabinet, processing node number, RVI connection number, channel, virtual interface number, and virtual LAN name. Once the ARP reply is modified the control node logic then sends 475 the ARP reply to an appropriate node, i.e., the node that sent the ARP request, or in specific instances to the load balancer in an IP service cluster, discussed below.
  • an encapsulated ARP reply is received 480 . If the replying node is an external node, the ARP reply contains the MAC address of the replying node. If the replying node is an internal node, the ARP reply instead contains information identifying the relevant RVI to communicate with the node. In either case, the local table is updated 485 .
  • the pending datagram is dequeued 487 , and the appropriate RVI is selected 493 . As discussed above, the appropriate RVI is selected based on whether the target node is internal or external.
  • a TLV header is prepended to the packet and sent 495 .
  • the maximum transmission unit is configured as 16896 bytes. Even though the configured MTU is 16896 bytes, the Ethernet driver 310 recognizes when a packet is being sent to an external network. Through the use of path MTU discovery, ICMP and IP stack changes, the path MTU is changed at the source node 105 . This mechanism is also used to trigger packet check summing.
  • Certain embodiments of the invention support promiscuous mode through a combination of logic at the virtual LAN server 335 and in the virtual LAN drivers 310 .
  • the message contains information about the identity of the receiver desiring to enter promiscuous mode. This information includes the receiver's location (cabinet, node, etc), the interface number of the promiscuous virtual interface 310 on the receiver (required for demultiplexing packets), and the name of the virtual LAN to which the receiver belongs. This information is then used by the driver 310 to determine how to send promiscuous packets to the receiver (which RVI or other mechanism to use to send the packets).
  • the virtual interface 310 maintains a list of promiscuous listeners on the same virtual LAN. When a sending node receives a promiscuous mode message it will update its promiscuous list accordingly.
  • the header TLV includes extra information the destination can use to demultiplex and validate the incoming packet. Part of this information is the destination virtual Ethernet interface number (destination device number on the receiving node). Since these can be different between the actual packet destination and the promiscuous destination, this header cannot simply be cloned. Thus, memory will have to be allocated for each header for each packet clone to each promiscuous listener. When the packet header for a promiscuous packet is built the packet type will be set to indicate that the packet was a promiscuous transmission rather than a unicast transmission.
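
The per-listener cloning just described might look like the sketch below: every promiscuous copy gets a freshly built header because the destination interface number differs per listener, and the copy is marked as a promiscuous rather than unicast transmission. The header layout and listener fields are assumptions.

    from dataclasses import dataclass

    PKT_UNICAST, PKT_PROMISC = 0, 1

    @dataclass
    class Listener:
        cabinet: int
        node: int
        interface_no: int   # destination virtual Ethernet device number on the receiver
        rvi: object         # RVI (or other mechanism) used to reach the receiver

    def build_header(interface_no: int, pkt_type: int) -> bytes:
        return bytes([pkt_type, interface_no & 0xFF])   # hypothetical 2-byte header

    def send_with_promiscuous_copies(payload: bytes, dest_rvi, dest_if: int, listeners):
        dest_rvi.send(build_header(dest_if, PKT_UNICAST) + payload)
        for listener in listeners:
            # a separate header is built for each clone; the headers cannot simply be shared
            hdr = build_header(listener.interface_no, PKT_PROMISC)
            listener.rvi.send(hdr + payload)
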
  • the virtual Ethernet driver 310 is also responsible for handling the redundant control node connections. For example, the virtual Ethernet drivers will periodically test end-to-end connectivity by sending a heartbeat TLV to each connected RVI. This will allow virtual Ethernet drivers to determine if a node has stopped responding or whether a stopped node has started to respond again. When an RVI or control node 120 is determined to be down, the Ethernet driver will send traffic through the surviving control node. If both control nodes are functional the driver 310 will attempt to load balance traffic between the two nodes.
  • Certain embodiments of the invention provide performance improvements. For example, with modifications to the IP stack 305 , packets sent only within the platform 100 are not check summed since all elements of the platform 100 provide error detection and guaranteed data delivery.
  • the RVI may be configured so that the packets may be larger than the maximum size permitted by Ethernet.
  • that is, the Ethernet maximum packet size may be exceeded to improve performance.
  • the actual packet size will be negotiated as part of the data link layer.
  • Failure of a control node is detected either by a notification from the RCLAN layer, or by a failure of heartbeat TLVs. If a control node fails the Ethernet driver 310 will send traffic only to the remaining control node. The Ethernet driver 310 will recognize the recovery of a control node via notification from the RCLAN layer or the resumption of heartbeat TLVs. Once a control node has recovered, the Ethernet driver 310 will resume load balancing.
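
A minimal sketch of the heartbeat and fail-over behaviour is shown below. The miss threshold and the heartbeat call are assumptions; the text only states that heartbeat TLVs (or RCLAN notifications) reveal a dead control node and that traffic then flows through the survivor until the peer recovers, at which point load balancing resumes.

    MISS_LIMIT = 3   # assumed number of consecutive missed heartbeats before a path is marked down

    class ControlNodePath:
        def __init__(self, rvi):
            self.rvi = rvi
            self.missed = 0

        @property
        def up(self) -> bool:
            return self.missed < MISS_LIMIT

        def run_heartbeat(self) -> None:
            ok = self.rvi.send_heartbeat_tlv()   # hypothetical call; returns False on no response
            self.missed = 0 if ok else self.missed + 1

    def paths_for_traffic(paths):
        """Both control nodes when healthy (load balanced), otherwise only the survivor."""
        live = [p for p in paths if p.up]
        return live if live else list(paths)     # last resort: keep trying both
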
  • when a node detects that it cannot communicate with another node via a direct RVI (as outlined above), the node attempts to communicate via the control node, acting as a switch.
  • failure may be signaled by the lower RCLAN layer, for example from failure to receive a virtual interface acknowledgement or from failures detected through heartbeat mechanisms.
  • the driver marks bits in the TLV header accordingly to indicate that the message is to be unicast and sends the packet to the control node so that it can send the packet to the desired node (e.g., based on the IP address, if necessary).
  • the RCLAN layer 315 is responsible for handling the redundancy, fail-over and load balancing logic of the redundant interconnect NICs 107 . This includes detecting failures, re-routing traffic over a redundant connection on failures, load balancing, and reporting inability to deliver traffic back to the virtual network drivers 310 .
  • the virtual Ethernet drivers 310 expect to be notified asynchronously when there is a fatal error on any RVI that makes the RVI unusable or if any RVI is taken down for any reason.
  • the virtual network driver 310 on each processor will attempt to load balance outgoing packets between available control nodes. This can be done via simple round-robin alternation between available control nodes, or by keeping track of how many bytes have been transmitted on each and always transmitting on the control nodes through which fewest bytes have been sent.
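
Both balancing policies mentioned above are easy to sketch; the class and method names below are illustrative only.

    import itertools

    class RoundRobinBalancer:
        """Alternate between the available control nodes."""
        def __init__(self, control_nodes):
            self._cycle = itertools.cycle(control_nodes)
        def pick(self, packet_len: int):
            return next(self._cycle)

    class FewestBytesBalancer:
        """Always transmit on the control node through which the fewest bytes have been sent."""
        def __init__(self, control_nodes):
            self._sent = {cn: 0 for cn in control_nodes}
        def pick(self, packet_len: int):
            choice = min(self._sent, key=self._sent.get)
            self._sent[choice] += packet_len
            return choice
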
  • the RCLAN provides high bandwidth (224 MB/sec each way) low latency reliable asynchronous point-to-point communication between kernels.
  • the sender of the data is notified if the data cannot be delivered and a best effort will be made to deliver it.
  • the RCLAN uses two Giganet clan 1000 cards to provide redundant communication paths between kernels. It seamlessly recovers single failures in the clan 1000 cards or the Giganet switches. It detects lost data and data errors and resends the data if needed. Communication will not be disrupted as long as one of the connections is partially working, e.g., the error rate does not exceed 5%.
  • Clients of the RCLAN include the RPC mechanism, the remote SCSI mechanism, and remote Ethernet.
  • the RCLAN also provide a simple form of flow control. Low latency and high concurrency are achieved by allowing multiple simultaneous requests for each device to be sent by the processor node to the control node, so that they can be forwarded to the device as soon as possible or, alternatively so that they can be queued for completion as close to the device as possible as opposed to queuing all requests on the processor node.
  • the RCLAN layer 330 on the control node-side operates analogously to the above.
  • the Giganet driver logic 320 is the logic responsible for providing an interface to the Giganet NIC 107 , whether on a processor 106 or control node 120 .
  • the Giganet driver logic establishes VI connections, associated by VI id's, so that the higher layers, e.g., RCLAN 315 and Ethernet driver 310 , need only understand the semantics of VI's.
  • Giganet driver logic 320 is responsible for allocating memory in each node for buffers and queues for the VI's, and for conditioning the NIC 107 to know about the connection and its memory allocation. Certain embodiments use VI connections provided by the Giganet driver.
  • the Giganet NIC driver code establishes a Virtual Interface pair (i.e., VI) and assigns it to a corresponding virtual interface id.
  • Each VI is a bi-directional connection established between one Giganet port and another, or more precisely between memory buffers and memory queues on one node to buffers and queues on another.
  • the allocation of ports and memory is handled by the NIC drivers as stated above. Data is transmitted by placing it into a buffer the NIC knows about and triggering action by writing to a specific memory-mapped register. On the receiving side, the data appears in a buffer and completion status appears in a queue. The data never need be copied if the sending and receiving programs are capable of producing and consuming messages in the connection's buffers. The transmission can even be direct from application program to application program if the operating system memory-maps the connection's buffers and control registers into application address space.
  • Each Giganet port can support 1024 simultaneous VI connections over it and keep them separate from each other with hardware protection, so the operating system as well as disparate applications can safely share a single port.
  • 14 VI connections may be established simultaneously from every port to every other port.
  • the NIC drivers establish VI connections in redundant pairs, with one connection of the pair going through one of the two switch fabrics 115 a,b and the other through the other switch. Moreover, in preferred embodiments, data is sent alternately on the two legs of the pair, equalizing load on the switches. Alternatively, the redundant pairs may be used in fail-over manner.
  • connection pairs established by the node persist as long as the operating system remains up. Establishment of a connection pair to simulate an Ethernet connection is intended to be analogous to, and as persistent as, physically plugging in a cable between network interface cards. If a node's defined configuration changes while its operating system is running, then applicable redundant Virtual Interface connection pairs will be established or discarded at the time of the change.
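
A redundant connection pair of the kind described above might behave as in the sketch below, with one leg through each switch fabric, alternate sending on the two legs to equalize load, and a simple fall-back to the other leg if one is down. The is_up() health check is an assumption.

    class RedundantVIPair:
        def __init__(self, leg_fabric_a, leg_fabric_b):
            self.legs = [leg_fabric_a, leg_fabric_b]   # one VI through each of the two switch fabrics
            self._next = 0

        def send(self, data: bytes) -> bool:
            for _ in range(len(self.legs)):
                leg = self.legs[self._next]
                self._next = (self._next + 1) % len(self.legs)   # alternate legs on successive sends
                if leg.is_up():                                  # assumed health check
                    leg.send(data)
                    return True
            return False                                         # neither leg usable; caller is told
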
  • the Giganet driver logic 325 on the control node-side operates analogously to the above.
  • the virtual LAN server logic 335 facilitates the emulation of an Ethernet network over the underlying NBMA network.
  • IP addresses on virtual LANs may be configured in the same way as on an “ordinary” subnet.
  • the choice of IP addresses to use is dependent on the external visibility of nodes on a virtual LAN. If the virtual LAN is not globally visible (either not visible outside the platform 100 , or from the Internet), private IP addresses should be used. Otherwise, IP addresses must be configured from the range provided by the internet service provider (ISP) that provides the Internet connectivity. In general, virtual LAN IP address assignment must be treated the same as normal LAN IP address assignment. Configuration files stored on the local disks of the control node 120 define the IP addresses within a virtual LAN.
  • an IP alias just creates another IP to RVI mapping on the virtual LAN server logic 335 .
  • Each processor may configure multiple virtual interfaces as needed.
  • the primary restrictions on the creation and configuration of virtual network interfaces are IP address allocation and configuration.
  • Each virtual LAN has a corresponding instance of server logic 335 that executes on both of the control nodes 120, and a number of corresponding client driver instances executing on the processor nodes 105.
  • the topology is defined by the administrator.
  • Each virtual LAN server 335 is configured to manage exactly one broadcast domain, and any number of layer 3 (IP) subnets may be present on the given layer 2 broadcast domain.
  • the servers 335 are configured and created in response to administrator commands to create virtual LANs.
  • when a processor 106 boots and configures its virtual networks, it connects to the virtual LAN server 335 via a special management RVI. The processors then obtain their data link configuration information, such as the virtual MAC addresses assigned to them, virtual LAN membership information and the like. The virtual LAN server 335 will determine and confirm that the processor attempting to connect to it is properly a member of the virtual LAN that that server 335 is servicing. If the processor is not a virtual LAN member, the connection to the server is rejected. If it is a member, the virtual network driver 310 registers its IP address with the virtual LAN server. (The IP address is provided by the IP stack 305 when the driver 310 is configured.) The virtual LAN server then binds that IP address to the RVI on which the registration arrived.
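
The boot-time exchange above can be sketched from the server side as follows. The message shapes and method names are assumptions; the essential behaviour is the membership check, the hand-out of data link configuration, and the binding of a registered IP address to the RVI on which the registration arrived.

    class VirtualLanServer:
        def __init__(self, members, mac_assignments, lan_name):
            self.members = set(members)              # processors configured as members of this virtual LAN
            self.mac_assignments = mac_assignments   # node id -> administrator-assigned virtual MACs
            self.lan_name = lan_name
            self.ip_to_rvi = {}                      # bindings built up as drivers register

        def handle_connect(self, node_id, mgmt_rvi):
            if node_id not in self.members:
                mgmt_rvi.reject()                    # non-members are refused
                return None
            return {"macs": self.mac_assignments[node_id], "lan": self.lan_name}

        def handle_ip_registration(self, ip_addr, arriving_rvi):
            self.ip_to_rvi[ip_addr] = arriving_rvi   # bind the IP address to the registering RVI
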
  • RVIs are used to connect nodes at the data link layer and to form control connections. Some of these connections are created and assigned as part of control node booting and initialization.
  • the data link layer connections are used for the reasons described above.
  • the control connections are used to exchange management, configuration, and health information.
  • Some RVI connections are between nodes for unicast traffic, e.g., 212 1-2.
  • Other RVI connections are to the virtual LAN server logic 335 so that the server can handle the requests, e.g., ARP traffic, broadcasts, and so on.
  • the virtual LAN server 335 creates and removes RVIs through calls to a Giganet switch manager 360 (provided with the switch fabric and Giganet NICs).
  • the switch manager may execute on the control nodes 120 and cooperates with the Giganet drivers to create the RVIs.
  • the virtual LAN server creates and assigns virtual MAC addresses for the nodes, as described above.
  • the virtual LAN server logic maintains data structures reflecting the topology and MAC assignments for the various nodes.
  • the virtual LAN server logic then creates corresponding RVIs for the unicast paths between nodes. These RVIs are subsequently allocated and made known to the nodes during the nodes booting.
  • the RVIs are also associated with IP addresses during the virtual LAN server's handling of ARP traffic. The RVI connections are torn down if a node is removed from the topology.
  • if a node 106 at one end of an established RVI connection is rebooted, the operating systems at each end of the connection and the RVI management logic re-establish the connection.
  • Software using the connection on the processing node that remained up will be unaware that anything happened to the connection itself. Whether or not the software notices or cares that the software at the other end was rebooted depends upon what it is using the connection for and the extent to which the rebooted end is able to re-establish its state from persistent storage. For example, any software communicating via Transmission Control Protocol (TCP) will notice that all TCP sessions are closed by a reboot.
  • the virtual LAN server 335 can also serve as the packet relay mechanism of last resort.
  • certain embodiments use virtual Ethernet drivers 310 that algorithmically determine the RVI each should use to connect to its associated virtual LAN server 335.
  • the algorithm may need to consider identification information such as cabinet number to identify the RVI.
  • the virtual Ethernet drivers 310 of certain embodiments support ARP.
  • ARP processing is used to advantage to create mappings at the nodes between IP addresses and RVIs that may be used to carry unicast traffic, including IP packets, between nodes.
  • the virtual Ethernet drivers 310 send ARP packet requests and replies to the virtual LAN server 335 via a dedicated RVI.
  • the virtual LAN server 335 and specifically ARP server logic 355 , handles the packets by adding information to the packet header. As was explained above, this information facilitates identification of the source and target and identifies the RVI that may be used between the nodes.
  • the ARP server logic 355 receives the ARP requests, processes the TLV header, and broadcasts the request to all relevant nodes on the internal platform and the external network if appropriate. Among other things, the server logic 355 determines who should receive the ARP reply, resulting from the request. For example, if the source is a clustered IP address, the reply should be sent to the cluster load balancer, not necessarily the source of the ARP request. The server logic 355 indicates such by including information in the TLV header of the ARP request, so that the target of the ARP replies accordingly. The server 335 will process the ARP packet by including further information in the appended header and broadcast the packet to the nodes in the relevant domain. For example, the modified header may include information identifying the source cabinet, processing node number, RVI connection number, channel, virtual interface number, and virtual LAN name (some of which is only known by the server 335 ).
  • the ARP replies are received by the server logic 355 , which then maps the MAC information in the reply to corresponding RVI related information.
  • the RVI-related information is placed in the target MAC entry of the reply and sent to the appropriate source node (e.g., may be the sender of the request, but in some instances such as with clustered IP addresses may be a different node).
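
The reply-side rewriting might look like the sketch below: for an internal replier, the MAC in the reply is swapped for RVI-related information before the reply is forwarded either to the original requester or, for clustered IP addresses, to the cluster load balancer. The record fields follow the list given earlier (cabinet, node, RVI connection, channel, interface number, virtual LAN name), but the dictionary shape and names are assumptions.

    def rewrite_and_route_arp_reply(reply: dict, mac_to_rvi_info: dict, cluster_load_balancer=None):
        """Swap an internal replier's MAC for RVI info and choose who receives the reply."""
        rvi_info = mac_to_rvi_info.get(reply["sender_mac"])
        if rvi_info is not None:                       # internal node: requester will use a direct RVI
            reply = dict(reply, target_hw_info=rvi_info)
        destination = cluster_load_balancer or reply["requester"]
        return reply, destination

    # Example of the RVI-related record an internal reply might carry:
    example_rvi_info = {"cabinet": 1, "node": 4, "rvi_connection": 212,
                        "channel": 0, "interface": 1, "vlan": "vlan-a"}
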
  • broadcasts are handled by receiving the packet on a dedicated RVI.
  • the packet is then cloned by the server 335 and unicast to all virtual interfaces 310 in the relevant broadcast domain.
  • the same approach may be used for multicast. All multicast packets will be reflected off the virtual LAN server. Under some alternative embodiments, the virtual LAN server will treat multicast the same as broadcast and rely on IP filtering on each node to filter out unwanted packets.
  • the processor virtual network driver 310 sends a join request to the virtual LAN server 335 via a dedicated RVI.
  • the virtual LAN server then configures a specific multicast MAC address on the interface and informs the LAN Proxy 340 , discussed below, as necessary.
  • the Proxy 340 will have to keep track of use counts on specific multicast groups so a multicast address is only removed when no processor belongs to that multicast group.
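
The use counting mentioned above amounts to adding a multicast address on the first join and removing it only when the last member leaves, roughly as sketched below (the method names on the external interface are assumptions).

    from collections import Counter

    class MulticastUseCounts:
        def __init__(self, external_interface):
            self.counts = Counter()
            self.external_interface = external_interface

        def join(self, group_mac: str) -> None:
            if self.counts[group_mac] == 0:
                self.external_interface.add_multicast(group_mac)    # first member: program the address
            self.counts[group_mac] += 1

        def leave(self, group_mac: str) -> None:
            if self.counts[group_mac] == 0:
                return                                              # not joined; nothing to do
            self.counts[group_mac] -= 1
            if self.counts[group_mac] == 0:
                self.external_interface.remove_multicast(group_mac) # last member left: remove it
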
  • the external network 125 may operate in one of two modes: filtered or unfiltered.
  • In filtered mode, a single MAC address for the entire system is used for all outgoing packets. This hides the virtual MAC addresses of a processing node 107 behind the Virtual LAN Proxy 340 and makes the system appear as a single node on the network 125 (or as multiple nodes behind a bridge or proxy). Because this does not expose unique link layer information for each internal node 107, some other unique identifier is required to properly deliver incoming packets.
  • In filtered mode, the destination IP address of each incoming packet is used to uniquely identify the intended recipient, since the MAC address will only identify the system.
  • In unfiltered mode, the virtual MACs of a node 107 are visible outside the system so that they may be used to direct incoming traffic. That is, filtered mode mandates layer 3 switching while unfiltered mode allows layer 2 switching. Filtered mode requires that some component (in this case the Virtual LAN Proxy 340) perform replacement of node virtual MAC addresses with the MAC address of the external network 125 on all outgoing packets.
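
The difference between the two modes can be sketched as follows: in filtered mode the proxy rewrites the source MAC of outgoing frames to the single system MAC and demultiplexes incoming frames by destination IP address, while in unfiltered mode the per-node virtual MACs are exposed and used directly. The frame fields and the system MAC value are assumptions.

    SYSTEM_MAC = "02:00:00:00:00:01"   # hypothetical single externally visible MAC

    def rewrite_outgoing(frame: dict, filtered: bool) -> dict:
        if filtered:
            return dict(frame, src_mac=SYSTEM_MAC)    # hide the node's virtual MAC behind the proxy
        return frame                                  # unfiltered: the virtual MAC is exposed as-is

    def incoming_destination(frame: dict, filtered: bool, ip_to_node: dict, mac_to_node: dict):
        if filtered:
            return ip_to_node[frame["dst_ip"]]        # layer 3: the MAC only identifies the system
        return mac_to_node[frame["dst_mac"]]          # layer 2: the virtual MAC identifies the node
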
  • Some embodiments support the ability for a virtual LAN to be connected to external networks. Consequently, the virtual LAN will have to handle IP addresses not configured locally. To address this, one embodiment imposes a limit that each virtual LAN so connected be restricted to one external broadcast domain. IP addresses and subnet assignments for the internal nodes of the virtual LAN will have to be in accordance with the external domain.
  • the virtual LAN server 335 services the external connection by effectively acting as a data link layer bridge in that it moves packets between the external Ethernet driver 345 and internal processors and performs no IP processing.
  • the server cannot always rely on distinctive layer two addresses from the external network to internal nodes and instead the connection may use layer 3 (IP) information to make the bridging decisions.
  • the external connection software extracts IP address information from incoming packets and it uses this information to identify the correct node 106 so that it may move the packet to that node.
  • a virtual LAN server 335 having an attached external broadcast domain has to intercept and process packets from and to the external domain so that external nodes have a consistent view of the subnet(s) in the broadcast domain.
  • when a virtual LAN server 335 having an attached external broadcast domain receives an ARP request from an external node, it will relay the request to all internal nodes. The correct node will then compose the reply and send it back to the requester through the virtual LAN server 335.
  • the virtual LAN server cooperates with the virtual LAN Proxy 340 so that the Proxy may handle any necessary MAC address translation on outgoing requests. All ARP Replies and ARP advertisements from external sources will be relayed directly to the target nodes.
  • Virtual Ethernet interfaces 310 will send all unicast packets with an external destination to the virtual LAN server 335 over the control connection RVI. (External destinations may be recognized by the driver by the MAC address format.) The virtual LAN server will then move the packet to the external network 125 accordingly.
  • if the virtual LAN server 335 receives a broadcast or multicast packet from an internal node, it relays the packet to the external network in addition to relaying the packet to all internal virtual LAN members. If the virtual LAN server 335 receives a broadcast or multicast packet from an external source, it relays the packet to all attached internal nodes.
  • interconnecting virtual LANs through the use of IP routers or firewalls is accomplished using analogous mechanisms to those used in interconnecting physical LANs.
  • One processor is configured on both LANs, and the Linux kernel on that processor must have routing (and possibly IP masquerading) enabled. Normal IP subnetting and routing semantics will always be maintained, even for two nodes located in the same platform.
  • a processor could be configured as a router between two external subnets, between an external and an internal subnet, or between two internal subnets.
  • when an internal node is sending a packet through a router, there are no problems because of the point-to-point topology of the internal network.
  • the sender will send directly to the router (i.e., processor so configured with routing logic) without the intervention of the virtual LAN server (i.e., typical processor to processor communication, discussed above).
  • the destination MAC address of the incoming packet will be that of the platform 100 .
  • the MAC address can not be used to uniquely identify the packet destination node.
  • the destination IP address in the IP header is used to direct the packet to the proper destination node.
  • the destination IP address in the IP header is that of the final destination rather than that of the next hop (which is the internal router).
  • consequently, there is nothing in the incoming packet that can be used to direct it to the correct internal node.
  • one embodiment imposes a limit of no more than one router exposed to an external network on a virtual LAN.
  • This router is registered with the virtual LAN server 335 as a default destination so that incoming packets with no valid destination will be directed to this default node.
  • the destination MAC address of the incoming packet will be the virtual MAC address of the internal destination node.
  • the LAN Server 335 will then use this virtual MAC to send the packet directly to the destination internal node.
  • any number of internal nodes may be functioning as routers as the incoming packet's MAC address will uniquely identify the destination node.
  • if a configuration requires multiple routers on a subnet, one router can be picked as the exposed router. This router in turn could route to the other routers as necessary.
  • router redundancy is provided by making a router a clustered service and load balancing or failing over on a stateless basis (i.e., per IP packet rather than per TCP connection).
  • Certain embodiments of the invention support promiscuous mode functionality by providing switch semantics in which a given port may be designated as a promiscuous port so that all traffic passing through the switch is repeated on the promiscuous port.
  • the nodes that are allowed to listen in promiscuous mode will be assigned administratively at the virtual LAN server.
  • when a virtual Ethernet interface 310 enters promiscuous receive mode, it will send a message to the virtual LAN server 335 over the management RVI. This message will contain all the information about the virtual Ethernet interface 310 entering promiscuous mode.
  • when the virtual LAN Server receives a promiscuous mode message from a node, it will check its configuration information to determine if the node is allowed to listen promiscuously. If not, the virtual LAN Server will drop the promiscuous mode message without further processing. If the node is allowed to enter promiscuous mode, the virtual LAN server will broadcast the promiscuous mode message to all other nodes on the virtual LAN. The virtual LAN server will also mark the node as being promiscuous so that it can forward copies of incoming external packets to it.
  • when a promiscuous listener detects any change in its RVI configuration, it will send a promiscuous mode message to the virtual LAN server to update the state of all other nodes on the relevant broadcast domain. This will update any nodes entering or leaving a virtual LAN.
  • when a virtual Ethernet interface 310 leaves promiscuous mode, it will send the virtual LAN server a message informing it that the interface is leaving promiscuous mode. The virtual LAN server will then send this message to all other nodes on the virtual LAN.
  • Promiscuous settings will allow for placing an external connection in promiscuous mode when any internal virtual interface is a promiscuous listener. This will make the traffic external to the platform (but on the same virtual LAN) available to the promiscuous listener.
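  • The admission check performed by the virtual LAN server for promiscuous-mode messages, described above, can be modeled as below. This is a sketch under assumed names (struct vlan, handle_promisc_enter, vlan_broadcast); the real server operates over the management RVIs rather than in-memory structures.

        /* Illustrative model of the promiscuous-mode admission check. */
        #include <stdbool.h>

        #define MAX_NODES 24

        struct vlan_node {
            int  node_id;
            bool allowed_promiscuous;   /* assigned administratively         */
            bool is_promiscuous;        /* currently listening promiscuously */
        };

        struct vlan {
            struct vlan_node nodes[MAX_NODES];
            int              node_count;
        };

        /* Stub standing in for the relay over the management RVIs. */
        static void vlan_broadcast(struct vlan *v, int from, const void *msg, int len)
        {
            (void)v; (void)from; (void)msg; (void)len;
        }

        /* Returns true if the message was accepted and propagated. */
        bool handle_promisc_enter(struct vlan *v, int from, const void *msg, int len)
        {
            for (int i = 0; i < v->node_count; i++) {
                if (v->nodes[i].node_id != from)
                    continue;
                if (!v->nodes[i].allowed_promiscuous)
                    return false;                  /* drop without further processing     */
                v->nodes[i].is_promiscuous = true; /* copy incoming external packets here */
                vlan_broadcast(v, from, msg, len); /* inform all other nodes on the LAN   */
                return true;
            }
            return false;                           /* unknown node */
        }
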
  • a service cluster is a set of services available at one or more IP addresses (or host names). Examples of these services are HTTP, FTP, telnet, NFS, etc.
  • An IP address and port number pair represents a specific service type (though not a service instance) offered by the cluster to clients, including clients on the external network 125 .
  • FIG. 5 shows how certain embodiments present a virtual cluster 405 of services as a single virtual host to the Internet or other external network 125 via a cluster IP address. All the services of the cluster 505 are addressed through a single IP address, through different ports at that IP address.
  • service B is a load balanced service.
  • a virtual cluster proxy (VCP) 360 on the control node handles traffic addressed to the cluster IP address.
  • When a packet arrives on the virtual cluster IP address, the virtual LAN Proxy logic 340 will send the packet to the VCP 360 for processing. The VCP will then decide where to send the packet based on the packet contents, its internal connection state cache, any load balancing algorithms being applied to incoming traffic, and the availability of configured services. The VCP will relay incoming packets based on both the destination IP address as well as the TCP or UDP port number. Further, it will only distribute packets destined for port numbers known to the VCP (or for existing TCP connections). It is the configuration of these ports, and the mapping of the port number to one or more processors, that creates the virtual cluster and makes specific service instances available in the cluster. If multiple instances of the same service from multiple application processors are configured then the VCP can load balance between the service instances.
  • the VCP 360 maintains a cache of all active connections that exist on the cluster's IP address. Load balancing decisions are made only when a new connection is established between the client and a service. Once the connection has been set up, the VCP will use the source and destination information in the incoming packet header to make sure all packets in a TCP stream get routed to the same processor 106 configured to provide the service. In the absence of the ability to determine a client session (for example, HTTP sessions), the connection/load balancing mapping cache will route packets based on client address so that subsequent connections from the same client go to the same processor (making a client session persistent or “sticky”). Session persistence should be selectable on a service port number basis since only certain types of services require session persistence. A simplified model of this connection-tracking dispatch follows this item.
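  • A simplified model of the VCP's connection cache and load balancing follows. The round-robin selection, table sizes, and names (vcp_dispatch, struct conn) are assumptions for illustration; as noted above, the actual VCP may apply other load balancing algorithms.

        /* Illustrative model of VCP dispatch: sticky routing for existing
         * connections, load balancing only at connection setup. */
        #include <stdint.h>

        #define MAX_SERVICES  16
        #define MAX_INSTANCES 8
        #define MAX_CONNS     1024

        struct service {
            uint16_t port;                      /* service port on the cluster IP  */
            int      instances[MAX_INSTANCES];  /* processors offering the service */
            int      instance_count;
            int      next;                      /* round-robin cursor              */
        };

        struct conn {
            uint32_t client_ip;
            uint16_t client_port, svc_port;
            int      processor;                 /* sticky target for this stream   */
        };

        static struct service services[MAX_SERVICES];
        static int            service_count;
        static struct conn    conns[MAX_CONNS];
        static int            conn_count;

        /* Returns the processor to receive the packet, or -1 if the port is not
         * a configured service (such packets are not distributed by the VCP). */
        int vcp_dispatch(uint32_t client_ip, uint16_t client_port, uint16_t svc_port)
        {
            /* Existing connection: keep the whole TCP stream on one processor. */
            for (int i = 0; i < conn_count; i++)
                if (conns[i].client_ip == client_ip &&
                    conns[i].client_port == client_port &&
                    conns[i].svc_port == svc_port)
                    return conns[i].processor;

            /* New connection: balance among the configured service instances. */
            for (int i = 0; i < service_count; i++) {
                if (services[i].port != svc_port || services[i].instance_count == 0)
                    continue;
                int target = services[i].instances[services[i].next];
                services[i].next = (services[i].next + 1) % services[i].instance_count;
                if (conn_count < MAX_CONNS) {
                    struct conn c = { client_ip, client_port, svc_port, target };
                    conns[conn_count++] = c;
                }
                return target;
            }
            return -1;
        }
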
  • Replies to ARP requests, and routing of ARP replies, are handled by the VCP.
  • When a processor sends any ARP packet, it will send it out through the Virtual Ethernet driver 310.
  • the packet will then be sent to the virtual LAN Server 335 for normal ARP processing.
  • the virtual LAN server will broadcast the packet as usual, but will make sure it doesn't get broadcast to any member of the cluster (not just the sender). It will also place information in the packet header TLV that indicates to the ARP target that the ARP source can only be reached through the virtual LAN server and specifically through the load balancer.
  • the ARP target whether internal or external, will process the ARP request normally and send a reply back through the virtual LAN server.
  • the virtual LAN server will be unable to determine which processor sent out the original request. Thus, the virtual LAN Server will send the reply to each cluster member so that they can handle it properly.
  • the virtual LAN server will send the request to every cluster member.
  • Each cluster member will receive the ARP request and process it normally. They will then compose an ARP reply and send it back to the source via the virtual LAN server.
  • When the virtual LAN server receives an ARP reply from a cluster member it will drop that reply, but the virtual LAN server will compose and send an ARP reply to the ARP source. The ARP handling for a cluster is sketched after this group of items.
  • the virtual LAN Server will respond to all ARPs of the cluster IP address.
  • the ARP reply will contain the information necessary for the ARP source to send all packets for the cluster IP address to the VCP.
  • this will simply be an ARP reply with the external MAC address as the source hardware address.
  • For internal ARP sources this will be the information necessary to tell the source to send packets for the cluster IP address down the virtual LAN management RVI rather than through a directly connected RVI.
  • Any gratuitous ARP packets that are received will be forwarded to all cluster members. Any gratuitous ARP packets sent by a cluster member will be sent normally.
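  • The ARP decisions described in the preceding items can be summarized as a classification step at the virtual LAN server. The sketch below is illustrative: the packet fields, the action names, and the stubbed membership check are assumptions, not the server's actual interfaces.

        /* Illustrative classification of ARP traffic for a service cluster. */
        #include <stdbool.h>
        #include <stdint.h>

        enum arp_action {
            ARP_RELAY_NORMALLY,        /* ordinary ARP relay, no cluster involved      */
            ARP_ANSWER_AND_FORWARD,    /* server replies for the cluster IP and also   */
                                       /* forwards the request to all cluster members  */
            ARP_DROP_AND_SUBSTITUTE    /* drop a member's reply; server sends its own  */
        };

        struct arp_pkt {
            bool     is_request;
            uint32_t target_ip;        /* IP address being resolved                    */
            int      source_node;      /* internal node id, or -1 if external          */
        };

        static bool is_cluster_member(int node_id)
        {
            /* Stub: the real server consults its cluster configuration. */
            return node_id >= 0 && node_id < 4;
        }

        enum arp_action classify_arp(const struct arp_pkt *p, uint32_t cluster_ip)
        {
            if (p->is_request && p->target_ip == cluster_ip)
                return ARP_ANSWER_AND_FORWARD;   /* server responds to all ARPs of the cluster IP */

            if (!p->is_request && is_cluster_member(p->source_node))
                return ARP_DROP_AND_SUBSTITUTE;  /* server composes the reply itself              */

            return ARP_RELAY_NORMALLY;
        }
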
  • the virtual LAN Proxy 340 performs the basic co-ordination of the physical network resources among all the processors that have virtual interfaces to the external physical network 125. It bridges the virtual LAN server 335 to the external network 125. When the external network 125 is running in filtered mode the Virtual LAN Proxy 340 will convert the internal virtual MAC addresses from each node to the single external MAC assigned to the system 100. When the external network 125 is operating in unfiltered mode no such MAC translation is required. The Virtual LAN Proxy 340 also performs insertion and removal of IEEE 802.1Q Virtual LAN ID tagging information, and demultiplexes packets based on their VLAN IDs. It also serializes access to the physical Ethernet interface 129 and co-ordinates the allocation and removal of MAC addresses, such as multicast addresses, on the physical network.
  • When the external network 125 is running in filtered mode and the virtual LAN Proxy 340 receives outgoing packets (ARP or otherwise) from a virtual LAN server 335, it replaces the internal-format MAC address with the MAC address of the physical Ethernet device 129 as the source MAC address. When the External Network 125 is running in unfiltered mode no such replacement is required. A sketch of this egress rewrite follows this group of items.
  • When the virtual LAN Proxy 340 receives incoming ARP packets, it moves the packet to the virtual LAN server 335, which handles the packet and relays it on to the correct destination(s). If the ARP packet is a broadcast packet then the packet is relayed to all internal nodes on the Virtual LAN. If the packet is a unicast packet the packet is sent only to the destination node. The destination node is determined by the IP address in the ARP packet when the External Network 125 is running in filtered mode, or by the MAC address in the Ethernet header of the ARP packet (not the MAC address in the ARP packet) when it is running in unfiltered mode.
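  • The egress rewrite performed by the virtual LAN Proxy in filtered mode, together with 802.1Q tag insertion, is sketched below. The frame handling is simplified and the function name is an assumption; the real proxy also handles the reverse (ingress) direction and tag removal.

        /* Illustrative egress processing: replace the internal virtual source
         * MAC with the physical interface's MAC (filtered mode only) and
         * insert an IEEE 802.1Q VLAN tag. */
        #include <stdint.h>
        #include <string.h>

        #define ETH_ALEN    6
        #define ETH_P_8021Q 0x8100

        /* 'frame' is dst MAC | src MAC | EtherType | payload.  'out' must have
         * room for 4 extra bytes for the VLAN tag.  Returns the new length. */
        size_t proxy_egress(const uint8_t *frame, size_t len,
                            const uint8_t phys_mac[ETH_ALEN],
                            uint16_t vlan_id, int filtered_mode, uint8_t *out)
        {
            if (len < 2 * ETH_ALEN + 2)
                return 0;                                   /* not a valid frame    */

            memcpy(out, frame, 2 * ETH_ALEN);               /* copy dst and src MAC */
            if (filtered_mode)                              /* hide internal MACs   */
                memcpy(out + ETH_ALEN, phys_mac, ETH_ALEN);

            out[12] = ETH_P_8021Q >> 8;                     /* 802.1Q tag           */
            out[13] = ETH_P_8021Q & 0xff;
            out[14] = (vlan_id >> 8) & 0x0f;                /* PCP/DEI left as zero */
            out[15] = vlan_id & 0xff;

            memcpy(out + 16, frame + 12, len - 12);         /* EtherType + payload  */
            return len + 4;
        }
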
  • connection to the external network 125 is via Gigabit or 100/10baseT Ethernet links connected to the control node.
  • Physical LAN drivers 345 are responsible for interfacing with such links. Packets being sent on the interface will be queued to the device in the normal manner, including placing the packets in socket buffers. The queue used to queue the packets is the one used by the protocol stack to queue packets to the device's transmit routine. For incoming packets, the socket buffer containing the packets will be passed around and the packet data will never be copied (though it will be cloned if needed for multicast operations).
  • generic Linux network device drivers may be used in the control node without modification. This facilitates the addition of new devices to the platform without requiring additional device driver work.
  • the physical network interface 345 is in communication only with the virtual LAN proxy 340 . This prevents the control node from using the external connection in any way that would interfere with the operation of the virtual LANs and improves security and isolation of user data, i.e., an administrator may not “sniff” any user's packets.
  • the redundant connections to the external network 125 will be used alternately to load balance packet transmission between two redundant interfaces to the external network 125 .
  • Other embodiments load balance by configuring each virtual network interface on alternating control nodes so the virtual interfaces are evenly distributed between the two control nodes.
  • Another embodiment transmits through one control node and receives through another.
  • load balancing is performed by allowing transmission on both control nodes but only reception through one.
  • the failover case is both send and receive through the same control node.
  • the recovery case is transmission through the recovered control node since that doesn't require any MAC manipulation.
  • the control node doing reception has IP information for filtering and multicast address information for multicast MAC configuration. This information is needed to process incoming packets and should be failed over should the receiving control node fail. If the transmitting control node fails, virtual network drivers need only start sending outgoing packets to the receiving control node. No special failover processing is required other than the recognition that the transmitting control node has failed. If the failed control node recovers, the virtual network drivers can resume sending outgoing packets to the recovered control node without any additional special recovery processing. If the receiving control node fails then the transmitting control node must assume the receiving interface role. To do this, it must configure all MAC addresses on its physical interface to enable packet reception. Alternately, both control nodes could have the same MAC address configured on their interfaces, but receives could be physically disabled on the Ethernet device by the device driver until a control node is ready to receive packets. Then failover would simply enable receives on the device.
  • multicast information must be shared between control nodes so that failover will be transparent to the processor. Since the virtual network drivers will have to keep track of multicast group membership anyway, this information will always be available to a LAN Proxy via the virtual LAN server when needed. Thus, a receive failover will result in multicast group membership being queried from virtual network drivers to rebuild the local multicast group membership tables. This operation is low overhead, requires no special processing except during failover and recovery, and doesn't require any special replication of data between control nodes. When receive has failed over and the failed control node recovers, only transmissions will be moved over to the recovered control node. Thus, the algorithm for recovery on virtual network interfaces is to always move transmissions to the recovered control node and leave receive processing where it is. This failover and recovery logic is sketched below.
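  • The failover and recovery rules for the two external connections, as described in the two preceding items, are sketched below. The structure and helper names are illustrative; the real recovery path reprograms the physical Ethernet interface and queries the virtual network drivers over the fabric.

        /* Illustrative model of receive/transmit role failover between control nodes. */
        #include <stdbool.h>

        struct control_node {
            bool alive;
            bool receiving;      /* currently owns the receive role     */
            bool transmitting;   /* currently used for outgoing packets */
        };

        static void enable_receive_macs(struct control_node *cn)
        {
            cn->receiving = true;     /* configure all MACs on the physical interface */
        }

        static void rebuild_multicast_membership(struct control_node *cn)
        {
            (void)cn;                 /* query the virtual network drivers */
        }

        void on_control_node_failure(struct control_node *failed, struct control_node *survivor)
        {
            failed->alive = false;

            if (failed->transmitting && !failed->receiving) {
                /* Transmit-side failure: drivers simply send to the survivor. */
                survivor->transmitting = true;
                return;
            }

            if (failed->receiving) {
                /* Receive-side failure: survivor assumes the receive role and
                 * rebuilds multicast group membership from the drivers. */
                enable_receive_macs(survivor);
                rebuild_multicast_membership(survivor);
                survivor->transmitting = true;
            }
        }

        void on_control_node_recovery(struct control_node *recovered, struct control_node *current)
        {
            /* Recovery always moves transmission to the recovered node and
             * leaves receive processing where it is. */
            recovered->alive = true;
            recovered->transmitting = true;
            current->transmitting = false;
        }
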
  • Virtual service clusters may also use load balancing and failover.
  • Each cabinet will have at least one control node which will be used for inter-cabinet connections.
  • Each control node will include a virtual LAN server 335 to handle local connections and traffic.
  • One of the servers is configured to be a master, such as the one located on the control node with the external connection for the virtual LAN.
  • the other virtual LAN servers will act as proxy servers, or slaves, so that the local processors of those cabinets can participate.
  • the master maintains all virtual LAN state and control while the proxies relay packets between the processors and masters.
  • Each virtual LAN server proxy maintains a RVI to each master virtual LAN Server.
  • Each local processor will connect to the virtual LAN Server Proxy server just as if it were a master.
  • When a processor connects and registers an IP and MAC address, the proxy will register that IP and MAC address with the master. This will cause the master to bind the addresses to the RVI from the proxy.
  • the master will contain RVI bindings for all internal nodes, but proxies will contain bindings only for nodes in the same cabinet.
  • When a processor anywhere in a multicabinet virtual LAN sends any packet to its virtual LAN server, the packet will be relayed to the master for processing. The master will then do normal processing on the packet. The master will relay packets to the proxies as necessary for multicast and broadcast. The master will also relay unicast packets based on the destination IP address of the unicast packet and registered IP addresses on the proxies. Note that on the master, a proxy connection looks very much like a node with many configured IP addresses. This registration and relay behavior is sketched below.
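  • The master/proxy registration and unicast relay for a multicabinet virtual LAN can be modeled as below. The table layout and function names are assumptions; in the real system the bindings refer to RVIs established over the switch fabrics between cabinets.

        /* Illustrative model of master-side bindings for a multicabinet virtual LAN. */
        #include <stdint.h>

        #define MAX_BINDINGS 256

        struct binding {
            uint32_t ip;
            uint8_t  mac[6];
            int      rvi;            /* RVI to the proxy (or directly to a local node) */
        };

        struct master_vlan_server {
            struct binding bindings[MAX_BINDINGS];
            int            count;
        };

        /* Called when a proxy relays a node's IP/MAC registration to the master. */
        int master_register(struct master_vlan_server *m, uint32_t ip,
                            const uint8_t mac[6], int proxy_rvi)
        {
            if (m->count >= MAX_BINDINGS)
                return -1;
            struct binding *b = &m->bindings[m->count++];
            b->ip = ip;
            for (int i = 0; i < 6; i++)
                b->mac[i] = mac[i];
            b->rvi = proxy_rvi;
            return 0;
        }

        /* Unicast relay: the master selects the RVI by registered destination IP.
         * A proxy therefore looks to the master like a node with many IP addresses. */
        int master_relay_rvi(const struct master_vlan_server *m, uint32_t dst_ip)
        {
            for (int i = 0; i < m->count; i++)
                if (m->bindings[i].ip == dst_ip)
                    return m->bindings[i].rvi;
            return -1;               /* unknown destination */
        }
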
  • the node's serial console traffic and boot image requests are routed by switch driver code located in the processing node's kernel debugging software or BIOS to management software running on a control node (not shown). From there, the console traffic can again be accessed either from the high-speed external network 125 or through the control node's management ports.
  • the boot image requests can be satisfied from either the control node's local disks or from partitions out on the external SAN 130 .
  • the control node 120 is preferably booted and running normally before anything can be done to a processing node.
  • the control node is itself booted or debugged from its management ports.
  • Some customers may wish to restrict booting and debugging of controllers to local access only, by plugging their management ports into an on-site computer when needed. Others may choose to allow remote booting and debugging by establishing a secure network segment for management purposes, suitably isolated from the Internet, into which to plug their management ports. Once a controller is booted and running normally, all other management functions for it and for the rest of the platform can be accessed from the high-speed external network 125 as well as the management ports, if permitted by the administrator.
  • Serial console traffic to and from each processing node 105 is sent by an operating system kernel driver over the switch fabric 115 to management software running on a control node 120 . From there, any node's console traffic can be accessed either from the normal, high-speed external network 125 or through either of the control node's management ports.
  • Each virtual PAN has its own virtualized I/O space and issues SCSI commands and status within such space.
  • Logic at the control node translates or transforms the addresses and commands as necessary from a PAN and transmits them accordingly to the SAN 130 which services the commands.
  • the client is the platform 100 and the actual PANs that issued the commands are hidden and anonymous.
  • because the SAN address space is virtualized, one PAN operating on the platform 100 may have device numbering starting with a device number 1, and a second PAN may also have a device number 1. Yet each of the device number 1s will correspond to a different, unique portion of SAN storage.
  • an administrator can build virtual storage.
  • Each of the PANs will have its own independent perspective of mass storage.
  • a first PAN may have a given device/LUN address map to a first location in the SAN
  • a second PAN may have the same given device/LUN map to a second, different location in the SAN.
  • Each processor maps a device/LUN address into a major and minor device number, to identify a disk and a partition, for example. Though the major and minor device numbers are perceived as a physical address by the PAN and the processors within a PAN, in effect they are treated by the platform as a virtual address into the mass storage provided by the SAN. That is, the major and minor device numbers of each processor are mapped to corresponding SAN locations; a sketch of this per-PAN mapping follows this item.
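  • The per-PAN storage namespace described above can be illustrated with a small example: two PANs each have a "device 1", but the two device 1s resolve to different SAN extents. The table contents and numbering below are invented for illustration only.

        /* Illustrative per-PAN device-number to SAN-extent mapping. */
        #include <stdint.h>
        #include <stdio.h>

        struct san_extent {
            uint32_t san_target;     /* target on the external SAN (example values) */
            uint32_t san_lun;
        };

        struct pan_storage_map {
            const char        *pan_name;
            struct san_extent  by_device[4];   /* indexed by the PAN-visible device number */
        };

        static const struct pan_storage_map maps[] = {
            /* PAN A: its device 1 lives at SAN target 7, LUN 0 */
            { "pan-a", { {0, 0}, {7, 0}, {7, 1}, {7, 2} } },
            /* PAN B: its device 1 lives somewhere else entirely */
            { "pan-b", { {0, 0}, {9, 4}, {9, 5}, {9, 6} } },
        };

        int main(void)
        {
            for (unsigned p = 0; p < 2; p++) {
                const struct san_extent *e = &maps[p].by_device[1];
                printf("%s device 1 -> SAN target %u, LUN %u\n",
                       maps[p].pan_name, (unsigned)e->san_target, (unsigned)e->san_lun);
            }
            return 0;
        }
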
  • FIG. 6 illustrates the software components used to implement the storage architecture of certain embodiments.
  • a configuration component 605 typically executed on a control node 120 , is in communication with external SAN 130 .
  • a management interface component 610 provides an interface to the configuration component 605 and is in communication with IP network 125 and thus with remote management logic 135 (see FIG. 1 ).
  • Each processor 106 in the system 100 includes an instance of processor-side storage logic 620 .
  • Each such instance 620 communicates via 2 RVI connections 625 to a corresponding instance of control node-side storage logic 615 .
  • the configuration component 605 and interface 610 are responsible for discovering those portions of SAN storage that are allocated to the platform 100 and for allowing an administrator to suballocate portions to specific PANs or processors 106 .
  • Storage configuration logic 605 is also responsible for communicating the SAN storage allocations to control node-side logic 615 .
  • the processor-side storage logic 620 is responsible for communicating the processor's storage requests over the internal interconnect 110 and storage fabric 115 via dedicated RVIs 625 to the control node-side logic 615 .
  • the requests will contain, under certain embodiments, virtual storage addresses and SCSI commands.
  • the control node-side logic is responsible for receiving and handling such commands by identifying the corresponding actual address for the SAN and converting the commands and protocol to the appropriate form for the SAN, for example, including but not limited to, fibre channel (Gigabit Ethernet with iSCSI is another exemplary connectivity).
  • the configuration component 605 determines which elements in the SAN 130 are visible to each individual processor 106 . It provides a mapping function that translates the device numbers (e.g., SCSI target and LUN) that the processor uses into the device numbers visible to the control nodes through their attached SCSI and Fibre Channel I/O interfaces 128 . It also provides an access control function, which prevents processors from accessing external storage devices which are attached to the control nodes but not included in the processors' configuration. The model that is presented to the processor (and to the system administrator and applications/users on that processor) makes it appear as if each processor has its own mass storage devices attached to interfaces on the processor.
  • this functionality allows the software on a processor 106 to be moved to another processor easily.
  • the control node via software may change the PAN configurations to allow a new processor to access the required devices.
  • a new processor may be made to inherit the storage personality of another.
  • control nodes appear as hosts on the SANs, though alternative embodiments allow the processors to act as such.
  • the configuration logic discovers the SAN storage allocated to the platform 100 (for example, during platform boot) and this pool is subsequently allocated by an administrator. If discovery is activated later, the control node that performs the discovery operation compares the new view with the prior view. Newly available storage is added to the pool of storage that may be allocated by an administrator. Partitions that disappear and were not assigned are removed from the available pool of storage that may be allocated to PANs. Partitions that disappear and were assigned trigger error messages.
  • the configuration component 605 allows management software to access and update the information which describes the device mapping between the devices visible to the control nodes 120 and the virtual devices visible to the individual processors 106 . It also allows access to control information.
  • the assignments may be identified by the processing node in conjunction with an identification of the simulated SCSI disks, e.g., by name of the simulated controller, cable, unit, or logical unit number (LUN).
  • the interface component 610 cooperates with the configuration component to gather and monitor information and statistics, such as:
  • the processor-side logic 620 of the protocol is implemented as a host adapter module that emulates a SCSI subsystem by providing a low-level virtual interface to the operating system on the processors 106.
  • the processors 106 use this virtual interface to send SCSI I/O commands to the control nodes 120 for processing.
  • each processing node 105 will include one instance of logic 620 per control node 120 .
  • the processors refer to storage using physical device numbering, rather than logical. That is, the address is specified as a device name to identify the LUN, the SCSI target, channel, host adapter, and control node 120 (e.g., node 120a or 120b). As shown in FIG. 8, one embodiment maps the target (T) and LUN (L) to a host adapter (H), channel (C), mapped target (mT), and mapped LUN (mL). A sketch of this mapping follows this item.
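  • The FIG. 8 style mapping can be modeled as a per-processor lookup from the (target, LUN) pair the processor uses to the (host adapter, channel, mapped target, mapped LUN) tuple used by the control node. The table contents below are invented example values; a lookup miss models the access-control function.

        /* Illustrative (T, L) -> (H, C, mT, mL) mapping with access control. */
        #include <stdint.h>

        struct mapped_addr {
            uint8_t host;        /* H: control-node host adapter */
            uint8_t channel;     /* C                            */
            uint8_t m_target;    /* mT                           */
            uint8_t m_lun;       /* mL                           */
        };

        struct map_entry {
            uint8_t target;      /* T: target as seen by the processor */
            uint8_t lun;         /* L                                  */
            struct mapped_addr out;
        };

        /* Example per-processor map (assumed values set by the configuration component). */
        static const struct map_entry proc_map[] = {
            { 0, 0, { 0, 0, 4, 2 } },
            { 0, 1, { 0, 0, 4, 3 } },
            { 1, 0, { 1, 0, 6, 0 } },
        };

        /* Returns 0 on success, -1 if the processor is not permitted to reach the
         * requested device (the access-control function described above). */
        int map_device(uint8_t target, uint8_t lun, struct mapped_addr *out)
        {
            for (unsigned i = 0; i < sizeof proc_map / sizeof proc_map[0]; i++) {
                if (proc_map[i].target == target && proc_map[i].lun == lun) {
                    *out = proc_map[i].out;
                    return 0;
                }
            }
            return -1;
        }
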
  • FIG. 7 shows an exemplary architecture for processor side logic 720 .
  • Logic 720 includes a device-type-specific driver (e.g., a disk driver) 705 , a mid-level SCSI I/O driver 710 , and wrapper and interconnect logic 715 .
  • the device-type-specific driver 705 is a conventional driver provided with the operating system and associated with specific device types.
  • the mid-level SCSI I/O driver 710 is a conventional mid-level driver that is called by the device-type-specific driver 705 once the driver 705 determines that the device is a SCSI device.
  • the wrapper and interconnect logic 715 is called by the mid-level SCSI I/O driver 710 . This logic provides the SCSI subsystem interface and thus emulates the SCSI subsystem. In certain embodiments that use the Giganet fabric, logic 715 is responsible for wrapping the SCSI commands as necessary and for interacting with the Giganet and RCLAN interface to cause the NIC to send the packets to the control nodes via the dedicated RVIs to the control nodes, described above. The header information for the Giganet packet is modified to indicate that this is a storage packet and includes other information, described below in context. Though not shown in FIG. 7 , wrapper logic 715 may use the RCLAN layer to support and utilize redundant interconnects 110 and fabrics 115 .
  • the RVIs of connection 725 are assigned virtual interface (VI) numbers from the range of 1024 available VIs.
  • the switch 115 is programmed with a bi-directional path between the pair (control node switch port, control node VI number), (processor node 105 switch port, processor node VI number).
  • a separate RVI is used for each type of message sent in either direction.
  • the receive buffers posted to each of the RVI channels can be sized appropriately for the maximum message length that the protocol will use for that type of message.
  • in other embodiments, all of the possible message types may be multiplexed onto a single RVI, rather than using 2 RVIs.
  • the protocol and the message format do not specifically require the use of 2 RVIs, and the messages themselves have message type information in their header so that they could be demultiplexed.
  • One of the two channels is used to exchange SCSI commands (CMD) and status (STAT) messages.
  • the other channel is used to exchange buffer (BUF) and transmit (TRAN) messages.
  • This channel is also used to handle data payloads of SCSI commands.
  • CMD messages contain control information, the SCSI command to be performed, and the virtual addresses and sizes of I/O buffers in the node 105 .
  • STAT messages contain control information and a completion status code reflecting any errors that may have occurred while processing the SCSI command.
  • BUF messages contain control information and the virtual addresses and sizes of I/O buffers in the control node 120 .
  • TRAN messages contain control information and are used to confirm successful transmission of data from node 105 to the control node 120. These message formats are sketched below.
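  • The four message types and the information they carry, as listed above, are sketched below as C structures. The exact wire layout is not given in the text, so these field names and widths are assumptions that only capture what the description mentions.

        /* Illustrative layouts for the CMD/STAT/BUF/TRAN protocol messages. */
        #include <stdint.h>

        enum msg_type { MSG_CMD, MSG_STAT, MSG_BUF, MSG_TRAN };

        struct msg_header {
            uint8_t  type;          /* one of enum msg_type                        */
            uint8_t  flags;         /* e.g., data direction, protocol-control bit  */
            uint16_t index;         /* segment index used by BUF/TRAN messages     */
            uint32_t sequence;      /* matches STAT/BUF/TRAN messages to their CMD */
        };

        struct buf_region {          /* one region of I/O buffer memory            */
            uint64_t virt_addr;
            uint32_t length;
        };

        struct cmd_msg {             /* processor -> control node                  */
            struct msg_header h;
            uint8_t  scsi_cdb[16];   /* the SCSI command to be performed           */
            uint8_t  target, lun;
            uint8_t  region_count;
            struct buf_region regions[8];   /* I/O buffers in the processor node   */
        };

        struct stat_msg {            /* control node -> processor                  */
            struct msg_header h;
            uint32_t status;         /* completion status of the SCSI command      */
            uint8_t  sense[32];      /* REQUEST SENSE data on CHECK CONDITION      */
        };
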
  • the processor side wrapper logic 715 examines the SCSI command to be sent to determine if the command requires the transfer of data and, if so, in what direction. Depending on the analysis, the wrapper logic 715 sets appropriate flag information in the message header accordingly.
  • the section describing the control node-side logic describes how the flag information is utilized.
  • the link 725 between processor-side storage logic 720 and control node-side storage logic 715 may be used to convey control messages, not part of the SCSI protocol and not to be communicated to the SAN 130 . Instead, these control messages are to be handled by the control node-side logic 715 .
  • the protocol control messages are always generated by the processor-side of the protocol and sent to the control node-side of the protocol over one of two virtual interfaces (VIs) connecting the processor-side logic 720 to the control node-side storage logic 715 .
  • the message header used for protocol control operations is the same as a command message header, except that different flag bits are used to distinguish the message as a protocol control message.
  • the control node 120 performs the requested operation and responds over the RVI with a message header that is the same as is used by a status message. In this fashion, a separate RVI for the infrequently used protocol control operations is not needed.
  • the processor-side logic 720 detects certain errors from issued commands and in response re-issues the command to the other control node. This retry may be implemented in a mid-level driver 710 .
  • control node-side storage logic 715 is implemented as a device driver module.
  • the logic 715 provides a device-level interface to the operating system on the control nodes 120 . This device-level interface is also used to access the configuration component 705 .
  • When this device driver module is initialized, it responds to protocol messages from all of the processors 106 in the platform 100. All of the configuration activity is introduced through the device-level interface. All of the I/O activity is introduced through messages that are sent and received through the interconnect 110 and switch fabric 115.
  • On the control node 120 there will be one instance of logic 715 per processor node 105 (though it is only shown as one box in FIG. 7 ).
  • the control node-side logic 715 communicates with the SAN 130 via FCP or FCP-2 protocols, or iSCSI or other protocols that use the SCSI-2 or SCSI-3 command set over various media.
  • the processor-side logic sets flags in the RVI message headers indicating whether data flow is associated with the command and, if so, in which direction.
  • the control node-side storage logic 715 receives messages from the processor-side logic and then analyzes the header information to determine how to act, e.g., to allocate buffers or the like.
  • the logic translates the address information contained in the messages from the processor to the corresponding, mapped SAN address and issues the commands (e.g., via FCP or FCP-2) to the SAN 130 .
  • the control node-side logic 715 allocates temporary memory buffers to store the data from the SCSI operation while the SCSI command is executing on the control node. After the control node-side logic 715 has sent the SCSI command to the SAN 130 for processing and the command has completed it sends the data back to the processor 105 memory with a sequence of one or more RDMA WRITE operations. It then constructs a status message with a standard message header, the sequence number for this command, the status of the completed command, and optionally the REQUEST SENSE data if the command completed with a SCSI CHECK CONDITION status.
  • the use of the BUF messages to communicate the location of temporary buffer memory in the control node to the processor-side storage logic and the use of TRAN messages to indicate completion of the RDMA WRITE data transfer is due to the lack of RDMA READ capability in the underlying Giganet fabric. If the underlying fabric supports RDMA READ operations, a different sequence of corresponding actions may be employed. More specifically, the processor-side logic 720 constructs a CMD message with a standard message header, a new sequence number for this command, the desired SCSI target and LUN, and the SCSI command to be executed. The control node-side logic 715 allocates temporary memory buffers to store the data from the SCSI operation while the SCSI command is executing on the control node.
  • the control node-side of the protocol then constructs a BUF message with a standard message header, the sequence number for this command, and a list of regions of virtual memory which are used for the temporary memory buffers on the control node.
  • the processor-side logic 720 then sends the data over to the control node memory with a sequence of one or more RDMA WRITE operations.
  • After the control node-side logic has sent the SCSI command to the SAN 130 for processing and has received the command completion, it constructs a STAT message with a standard message header, the sequence number for this command, the status of the completed command, and optionally the REQUEST SENSE data if the command completed with a CHECK CONDITION status.
  • the CMD message contains a list of regions of virtual memory from where the data for the command is stored.
  • the BUF and TRAN messages also contain an index field, which allows the control node-side of the protocol to send a separate BUF message for each entry in the region list in the CMD message.
  • the processor-side of the protocol would respond to such a message by performing RDMA WRITE operations for the amount of data described in the BUF message, followed by a TRAN message to indicate the completion of a single segment of data transfer.
  • the protocol between the processor-side logic 720 and the control node-side logic 715 allows for scatter-gather I/O operations. This functionality allows the data involved in an I/O request to be read from or written to several distinct regions of virtual and/or physical memory. This allows multiple, non-contiguous buffers to be used for the request on the control node.
  • the configuration logic 705 is responsible for discovering the SAN storage allocated to the platform and for interacting with the interface logic 710 so that an administrator may suballocate the storage to specific PANs.
  • the configuration component 705 creates and maintains a storage data structure 915 that includes information identifying the correspondence between processor addresses and actual SAN addresses.
  • FIG. 7 shows such a structure.
  • the correspondence as described above, may be between the processing node and the identification of the simulated SCSI disks, e.g., by name of the simulated controller, cable, unit, or logical unit number (LUN).
  • Management logic 135 is used to interface to control node software to provision the PANs.
  • the logic 135 allows an administrator to establish the virtual network topology of a PAN, its visibility to the external network (e.g., as a service cluster), and to establish the types of devices on the PAN, e.g., bridges and routing.
  • the logic 135 also interfaces with the storage management interface logic 710 so that an administrator may define the storage for a PAN during initial allocation or subsequently.
  • the configuration definition includes the storage correspondence (SCSI to SAN) discussed above and access control permissions.
  • each of the PANs and each of the processors will have a personality defined by its virtual networking (including a virtual MAC address) and virtual storage.
  • the structures that record such personality may be accessed by management logic, as described below, to implement processor clustering.
  • they may be accessed by an administrator as described above or with an agent administrator.
  • An agent for example may be used to re-configure a PAN in response to certain events, such as time of day or year, or in response to certain loads on the system.
  • the operating system software at a processor includes serial console driver code to route console I/O traffic for the node over the Giganet switch 115 to management software running on a control node. From there, the management software can make any node's console I/O stream accessible via the control node's management ports (its low-speed Ethernet port and its Emergency Management Port) or via the high-speed external network 125 , as permitted by an administrator. Console traffic can be logged for audit and history purposes.
  • FIG. 9 illustrates the cluster management logic of certain embodiments.
  • the cluster management logic 905 accesses the data structures 910 that record the networking information described above, such as the network topologies of PANs, the MAC address assignments within a PAN and so on.
  • the cluster management logic 905 accesses the data structures 915 that record the storage correspondence of the various processors 106 .
  • the cluster management logic 905 accesses a data structure 920 that records free resources such as unallocated processors within the platform 100 .
  • the cluster management logic 905 can change the data structures to cause the storage and networking personalities of a given processor to “migrate” to a new processor. In this fashion, the new processor “inherits” the personality of the former processor.
  • the cluster management logic 905 may be caused to do this to swap a new processor into a PAN to replace a failing one.
  • the new processor will inherit the MAC address of a former processor and act like the former.
  • the control node will communicate the connectivity information when the new processor boots, and will update the connectivity information for the non-failing processors as needed.
  • the RVI connections for the other processors are updated transparently; that is, the software on the other processors does not need to be involved in establishing connectivity to the newly swapped in processor.
  • the new processor will inherit the storage correspondence of the former and consequently inherit the persisted state of the former processor. A sketch of this personality migration follows this item.
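  • The migration of a processor's "personality" by the cluster management logic 905 can be modeled as re-binding the recorded networking and storage identity to a spare processor taken from the free pool 920. The structures below are a simplified stand-in for the data structures 910/915/920 referenced above.

        /* Illustrative model of personality migration between processors. */
        #include <stdint.h>
        #include <string.h>

        struct personality {
            uint8_t virtual_mac[6];     /* from the networking structures (910)    */
            int     storage_map_id;     /* index into the storage structures (915) */
        };

        struct processor_slot {
            int                assigned;   /* 0 = in the free pool (920) */
            struct personality p;
        };

        /* Moves the personality of 'failing' onto 'spare'.  In the real system the
         * control node then pushes updated connectivity to the other processors
         * when the spare boots. */
        int migrate_personality(struct processor_slot *failing, struct processor_slot *spare)
        {
            if (spare->assigned)
                return -1;                  /* spare must come from the free pool */

            spare->p = failing->p;          /* inherit MAC and storage mapping    */
            spare->assigned = 1;

            memset(&failing->p, 0, sizeof failing->p);
            failing->assigned = 0;          /* failed unit returns to the pool    */
            return 0;
        }
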
  • Because each Giganet port of the switch fabric 115 can support 1024 simultaneous Virtual Interface connections over it and keep them separate from each other with hardware protection, the operating system can safely share a node's Giganet ports with application programs. This would allow direct connection between application programs without the need to run through the full stack of driver code. To do this, an operating system call would establish a Virtual Interface channel and memory-map its buffers and queues into application address space. In addition, a library to encapsulate the low-level details of interfacing to the channel would facilitate use of such Virtual Interface connections. The library could also automatically establish redundant Virtual Interface channel pairs and manage sharing or failing over between them, without requiring any effort or awareness from the calling application.
  • the embodiments described above emulated Ethernet internally over an ATM-like fabric.
  • the design may be changed to use an internal Ethernet fabric which would simplify much of the architecture, e.g., obviating the need for emulation features.
  • if the external network communicates according to ATM, another variation would use ATM internally without emulation of Ethernet, and the ATM traffic could be communicated externally to the external network when so addressed.
  • Another variation would allow ATM internally to the platform (i.e., without emulation of Ethernet) while only external communications are transformed to Ethernet. This would streamline internal communications but require emulation logic at the controller.
  • Certain embodiments deploy PANs based on software configuration commands. It will be appreciated that deployment may be based on programmatic control. For example, more processors may be deployed under software control during peak hours of operation for that PAN, or correspondingly more or less storage space for a PAN may be deployed under software algorithmic control.

Abstract

A platform and method of deploying virtual processing area networks are described. A plurality of computer processors are connected to an internal communication network. At least one control node is in communication with an external communication network and an external storage network having an external storage address space. The at least one control node is connected to the internal network and thereby is in communication with the plurality of computer processors. Configuration logic defines and establishes a virtual processing area network having a corresponding set of computer processors from the plurality of processors, a virtual local area communication network providing communication among the set of computer processors, and a virtual storage space with a defined correspondence to the address space of the storage network.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of and claims priority under 35 U.S.C. §120 to U.S. patent application Ser. No. 10/038,353, filed on Jan. 4, 2002, entitled Reconfigurable, Virtual Processing System, Cluster, Network and Method, which claims the benefit under 35 U.S.C. §119(e) of U.S. Provisional Patent Application No. 60/285,296, filed on Apr. 20, 2001, entitled Process Area Network, both of which are incorporated herein by reference in their entirety.
  • BACKGROUND
  • 1. Field of the Invention
  • The present invention relates to computing systems for enterprises and application service providers and, more specifically, to processing systems having virtualized communication networks and storage for quick deployment and reconfiguration.
  • 2. Discussion of Related Art
  • In current enterprise computing and application service provider environments, personnel from multiple information technology (IT) functions (electrical, networking, etc.) must participate to deploy processing and networking resources. Consequently, because of scheduling and other difficulties in coordinating activities from multiple departments, it can take weeks or months to deploy a new computer server. This lengthy, manual process increases both human and equipment costs, and delays the launch of applications.
  • Moreover, because it is difficult to anticipate how much processing power applications will require, managers typically over-provision the amount of computational power. As a result, data-center computing resources often go unutilized or under-utilized.
  • If more processing power is eventually needed than originally provisioned, the various IT functions will again need to coordinate activities to deploy more or improved servers, connect them to the communication and storage networks and so forth. This task gets increasingly difficult as the systems become larger.
  • Deployment is also problematic. For example, when deploying 24 conventional servers, more than 100 discrete connections may be required to configure the overall system. Managing these cables is an ongoing challenge, and each represents a failure point. Attempting to mitigate the risk of failure by adding redundancy can double the cabling, exacerbating the problem while increasing complexity and costs.
  • Provisioning for high availability with today's technology is a difficult and costly proposition. Generally, a failover server must be deployed for every primary server. In addition, complex management software and professional services are usually required.
  • Generally, it is not possible to adjust the processing power or upgrade the CPUs on a legacy server. Instead, scaling processor capacity and/or migrating to a vendor's next-generation architecture often requires a “forklift upgrade,” meaning more hardware/software systems are added, needing new connections and the like.
  • Consequently, there is a need for a system and method of providing a platform for enterprise and ASP computing that addresses the above shortcomings.
  • SUMMARY
  • The present invention features a platform and method for computer processing in which virtual processing area networks may be configured and deployed.
  • According to one aspect of the invention, a computer processing platform includes a plurality of computer processors connected to an internal communication network. At least one control node is in communication with an external communication network and an external storage network having an external storage address space. The at least one control node is connected to the internal network and thereby communicates with the plurality of computer processors. Configuration logic defines and establishes a virtual processing area network having a corresponding set of computer processors from the plurality of processors, a virtual local area communication network providing communication among the set of computer processors but excluding the processors from the plurality not in the defined set, and a virtual storage space with a defined correspondence to the address space of the storage network.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • In the Drawing,
  • FIG. 1 is a system diagram illustrating one embodiment of the invention;
  • FIGS. 2A-C are diagrams illustrating the communication links established according to one embodiment of the invention;
  • FIGS. 3A-B are diagrams illustrating the networking software architecture of certain embodiments of the invention;
  • FIGS. 4A-C are flowcharts illustrating driver logic according to certain embodiments of the invention;
  • FIG. 5 illustrates service clusters according to certain embodiments of the invention;
  • FIG. 6 illustrates the storage software architecture of certain embodiments of the invention;
  • FIG. 7 illustrates the processor-side storage logic of certain embodiments of the invention;
  • FIG. 8 illustrates the storage address mapping logic of certain embodiments of the invention; and
  • FIG. 9 illustrates the cluster management logic of certain embodiments of the invention.
  • DETAILED DESCRIPTION
  • Preferred embodiments of the invention provide a processing platform from which virtual systems may be deployed through configuration commands. The platform provides a large pool of processors from which a subset may be selected and configured through software commands to form a virtualized network of computers (“processing area network” or “processor clusters”) that may be deployed to serve a given set of applications or customer. The virtualized processing area network (PAN) may then be used to execute customer specific applications, such as web-based server applications. The virtualization may include virtualization of local area networks (LANs) or the virtualization of I/O storage. By providing such a platform, processing resources may be deployed rapidly and easily through software via configuration commands, e.g., from an administrator, rather than through physically providing servers, cabling network and storage connections, providing power to each server and so forth.
  • Overview of the Platform and Its Behavior
  • As shown in FIG. 1, a preferred hardware platform 100 includes a set of processing nodes 105 a-n connected to switch fabrics 115 a,b via high-speed interconnects 110 a,b. The switch fabric 115 a,b is also connected to at least one control node 120 a,b that is in communication with an external IP network 125 (or other data communication network), and with a storage area network (SAN) 130. A management application 135, for example, executing remotely, may access one or more of the control nodes via the IP network 125 to assist in configuring the platform 100 and deploying virtualized PANs.
  • Under certain embodiments, about 24 processing nodes 105 a-n, two control nodes 120, and two switch fabrics 115 a,b are contained in a single chassis and interconnected with a fixed, pre-wired mesh of point-to-point (PtP) links. Each processing node 105 is a board that includes one or more (e.g., 4) processors 106 j-1, one or more network interface cards (NICs) 107, and local memory (e.g., greater than 4 Gbytes) that, among other things, includes some BIOS firmware for booting and initialization. There is no local disk for the processors 106; instead all storage, including storage needed for paging, is handled by SAN storage devices 130.
  • Each control node 120 is a single board that includes one or more (e.g., 4) processors, local memory, and local disk storage for holding independent copies of the boot image and initial file system that is used to boot operating system software for the processing nodes 105 and for the control nodes 120. Each control node communicates with SAN 130 via 100 megabyte/second fibre channel adapter cards 128 connected to fibre channel links 122, 124 and communicates with the Internet (or any other external network) 125 via an external network interface 129 having one or more Gigabit Ethernet NICs connected to Gigabit Ethernet links 121,123. (Many other techniques and hardware may be used for SAN and external network connectivity.) Each control node includes a low speed Ethernet port (not shown) as a dedicated management port, which may be used instead of remote, web-based management via management application 135.
  • The switch fabric is composed of one or more 30-port Giganet switches 115, such as the NIC-CLAN 1000 and CLAN 5300 switches, and the various processing and control nodes use corresponding NICs for communication with such a fabric module. Giganet switch fabrics have the semantics of a Non-Broadcast Multiple Access (NBMA) network. All inter-node communication is via a switch fabric. Each link is formed as a serial connection between a NIC 107 and a port in the switch fabric 115. Each link operates at 112 megabytes/second.
  • In some embodiments, multiple cabinets or chassises may be connected together to form larger platforms. And in other embodiments the configuration may differ; for example, redundant connections, switches and control nodes may be eliminated.
  • Under software control, the platform supports multiple, simultaneous and independent processing area networks (PANs). Each PAN, through software commands, is configured to have a corresponding subset of processors 106 that may communicate via a virtual local area network that is emulated over the PtP mesh. Each PAN is also configured to have a corresponding virtual I/O subsystem. No physical deployment or cabling is needed to establish a PAN. Under certain preferred embodiments, software logic executing on the processor nodes and/or the control nodes emulates switched Ethernet semantics; other software logic executing on the processor nodes and/or the control nodes provides virtual storage subsystem functionality that follows SCSI semantics and that provides independent I/O address spaces for each PAN.
  • Network Architecture
  • Certain preferred embodiments allow an administrator to build virtual, emulated LANs using virtual components, interfaces, and connections. Each of the virtual LANs can be internal and private to the platform 100, or multiple processors may be formed into a processor cluster externally visible as a single IP address.
  • Under certain embodiments, the virtual networks so created emulate a switched Ethernet network, though the physical, underlying network is a PtP mesh. The virtual network utilizes IEEE MAC addresses, and the processing nodes support IETF ARP processing to identify and associate IP addresses with MAC addresses. Consequently, a given processor node replies to an ARP request consistently whether the ARP request came from a node internal or external to the platform.
  • FIG. 2A shows an exemplary network arrangement that may be modeled or emulated. A first subnet 202 is formed by processing nodes PN1, PN2, and PNk that may communicate with one another via switch 206. A second subnet 204 is formed by processing nodes PNk and PNm that may communicate with one another via switch 208. Under switched Ethernet semantics, one node on a subnet may communicate directly with another node on the subnet; for example, PN1 may send a message to PN2. The semantics also allow one node to communicate with a set of the other nodes; for example, PN1 may send a broadcast message to other nodes. The processing nodes PN1 and PN2 cannot directly communicate with PNm because PNm is on a different subnet. For PN1 and PN2 to communicate with PNm, higher layer networking software would need to be utilized, which software would have a fuller understanding of both subnets. Though not shown in the figure, a given switch may communicate via an “uplink” to another switch or the like. As will be appreciated given the description below, the need for such uplinks is different than when the switches are physical. Specifically, since the switches are virtual and modeled in software they may scale horizontally as wide as needed. (In contrast, physical switches have a fixed number of physical ports, so sometimes uplinks are needed to provide horizontal scalability.)
  • FIG. 2B shows exemplary software communication paths and logic used under certain embodiments to model the subnets 202 and 204 of FIG. 2A. The communication paths 212 connect processing nodes PN1, PN2, PNk, and PNm, specifically their corresponding processor-side network communication logic 210, and they also connect processing nodes to control nodes. (Though drawn as a single instance of logic for the purpose of clarity, PNk may have multiple instances of the corresponding processor logic, one per subnet, for example.) Under preferred embodiments, management logic and the control node logic are responsible for establishing, managing and destroying the communication paths. The individual processing nodes are not permitted to establish such paths.
  • As will be explained in detail below, the processor logic and the control node logic together emulate switched Ethernet semantics over such communication paths. For example, the control nodes have control node-side virtual switch logic 214 to emulate some (but not necessarily all) of the semantics of an Ethernet switch, and the processor logic includes logic to emulate some (but not necessarily all) of the semantics of an Ethernet driver.
  • Within a subnet, one processor node may communicate directly with another via a corresponding virtual interface 212. Likewise, a processor node may communicate with the control node logic via a separate virtual interface. Under certain embodiments, the underlying switch fabric and associated logic (e.g., switch fabric manager logic, not shown) provides the ability to establish and manage such virtual interfaces (VIs) over the point-to-point mesh. Moreover, these virtual interfaces may be established in a reliable, redundant fashion and are referred to herein as RVIs. At points in this description, the terms virtual interface (VI) and reliable virtual interface (RVI) are used interchangeably, as the choice between a VI versus an RVI largely depends on the amount of reliability desired by the system at the expense of system resources.
  • Referring conjointly to FIGS. 2A-B, if node PN1 is to communicate with node PN2 it does so ordinarily by virtual interface 212 1-2. However, preferred embodiments allow communication between PN1 and PN2 to occur via switch emulation logic if, for example, VI 212 1-2 is not operating satisfactorily. In this case a message may be sent via VI 212 1-switch206 and via VI 212 switch206-2. If PN1 is to broadcast or multicast a message to other nodes in the subnet 202 it does so by sending the message to control node-side logic 214 via virtual interface 212 1-switch206. Control node-side logic 214 then emulates the broadcast or multicast functionality by cloning and sending the message to the other relevant nodes using the relevant VIs. The same or analogous VIs may be used to convey other messages requiring control node-side logic. For example, as will be described below, control node-side logic includes logic to support the address resolution protocol (ARP), and VIs are used to communicate ARP replies and requests to the control node. Though the above description suggests just one VI between processor logic and control logic, many embodiments employ several such connections. Moreover, though the figures suggest symmetry in the software communication paths, the architecture actually allows asymmetric communication. For example, as will be discussed below, for communications involving clustered services the packets would be routed via the control node. However, return communication may be direct between nodes.
  • Notice that, as in the network of FIG. 2A, there is no mechanism for communication between node PN2 and node PNm. Moreover, by having communication paths managed and created centrally (instead of via the processing nodes) such a path is not creatable by the processing nodes, and the defined subnet connectivity cannot be violated by a processor.
  • FIG. 2C shows the exemplary physical connections of certain embodiments to realize the subnets of FIGS. 2A and B. Specifically, each instance of processing network logic 210 communicates with the switch fabric 115 via PtP links 216 of interconnect 110. Likewise, the control node has multiple instances of switch logic 214 and each communicates over a PtP connection 216 to the switch fabric. The virtual interfaces of FIG. 2B include the logic to convey information over these physical links, as will be described further below.
  • To create and configure such networks, an administrator defines the network topology of a PAN and specifies (e.g., via a utility within the management software 135) MAC address assignments of the various nodes. The MAC address is virtual, identifying a virtual interface, and not tied to any specific physical node. Under certain embodiments, MAC addresses follow the IEEE 48 bit address format, but the contents include a “locally administered” bit (set to 1), the serial number of the control node 120 on which the virtual interface was originally defined (more below), and a count value from a persistent sequence counter on the control node that is kept in NVRAM in the control node. These MACs will be used to identify the nodes (as is conventional) at a layer 2 level. For example, in replying to ARP requests (whether from a node internal to the PAN or on an external network) these MACs will be included in the ARP reply. A sketch of this MAC construction follows this item.
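  • A sketch of virtual MAC construction from the fields named above follows. The field packing (24 bits of control-node serial number and 24 bits of sequence counter) is an assumption; the text specifies only which values are combined.

        /* Illustrative construction of a locally administered virtual MAC. */
        #include <stdint.h>

        #define MAC_LOCALLY_ADMIN 0x02     /* "locally administered" bit in byte 0 */

        void make_virtual_mac(uint32_t control_node_serial, uint32_t sequence, uint8_t mac[6])
        {
            /* Keep the multicast bit (0x01) clear so the address stays unicast. */
            mac[0] = MAC_LOCALLY_ADMIN | ((control_node_serial >> 16) & 0xfc);
            mac[1] = (control_node_serial >> 8) & 0xff;
            mac[2] = control_node_serial & 0xff;
            mac[3] = (sequence >> 16) & 0xff;   /* persistent NVRAM sequence counter */
            mac[4] = (sequence >> 8) & 0xff;
            mac[5] = sequence & 0xff;
        }
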
  • The control node-side networking logic maintains data structures that contain information reflecting the connectivity of the LAN (e.g., which nodes may communicate to which other nodes). The control node logic also allocates and assigns VI (or RVI) mappings to the defined MAC addresses and allocates and assigns VIs or (RVIs) between the control nodes and between the control nodes and the processing nodes. In the example of FIG. 2A, the logic would allocate and assign VIs 212 of FIG. 2B. (The naming of the VIs and RVIs in some embodiments is a consequence of the switching fabric and the switch fabric manager logic employed.)
  • As each processor boots, BIOS-based boot logic initializes each processor 106 of the node 105 and, among other things, establishes a (or discovers the) VI 212 to the control node logic. The processor node then obtains from the control node relevant data link information, such as the processor node's MAC address, and the MAC identities of other devices within the same data link configuration. Each processor then registers its IP address with the control node, which then binds the IP address to the node and an RVI (e.g., the RVI on which the registration arrived). In this fashion, the control node will be able to bind IP addresses for each virtual MAC for each node on a subnet. In addition to the above, the processor node also obtains the RVI or VI-related information for its connections to other nodes or to control node networking logic.
  • Thus, after boot and initialization, the various processor nodes should understand their layer 2, data link connectivity. As will be explained below, layer 3 (IP) connectivity and specifically layer 3 to layer 2 associations are determined during normal processing of the processors as a consequence of the address resolution protocol.
  • FIG. 3A details the processor-side networking logic 210 and FIG. 3B details the control node-side networking logic 310 of certain embodiments. The processor-side logic 210 includes IP stack 305, virtual network driver 310, ARP logic 350, RCLAN layer 315, and redundant Giganet drivers 320 a,b. The control node-side logic 310 includes redundant Giganet drivers 325 a,b, RCLAN layer 330, virtual cluster proxy logic 360, virtual LAN server 335, ARP server logic 355, virtual LAN proxy 340, and physical LAN drivers 345.
  • IP Stack
  • The IP stack 305 is the communication protocol stack provided with the operating system (e.g., Linux) used by the processing nodes 106. The IP stack provides a layer 3 interface for the applications and operating system executing on a processor 106 to communicate with the simulated Ethernet network. The IP stack provides packets of information to the virtual Ethernet layer 310 in conjunction with providing a layer 3, IP address as a destination for that packet. The IP stack logic is conventional, except that certain embodiments avoid checksum calculations and logic.
  • Virtual Ethernet Driver
  • The virtual Ethernet driver 310 will appear to the IP stack 305 like a “real” Ethernet driver. In this regard, the virtual Ethernet driver 310 receives IP packets or datagrams from the IP stack for subsequent transmission on the network, and it receives packet information from the network to be delivered to the stack as an IP packet.
  • The stack builds the MAC header. The “normal” Ethernet code in the stack may be used. The virtual Ethernet driver receives the packet with the MAC header already built and the correct MAC address already in the header.
  • In material part and with reference to FIGS. 4A-C, the virtual Ethernet driver 310 dequeues 405 outgoing IP datagrams so that the packet may be sent on the network. The standard IP stack ARP logic is used. The driver, as will be explained below, intercepts all ARP packets entering and leaving the system to modify them so that the proper information ends up in each node's ARP tables. The normal ARP logic places the correct MAC address in the link layer header of the outgoing packet before the packet is queued to the Ethernet driver. The driver then just examines the link layer header and destination MAC to determine how to send the packet. The driver does not directly manipulate the ARP table (except for the occasional invalidation of ARP entries).
  • The driver 310 determines 415 whether ARP logic 350 has MAC address information (more below) associated with the IP address in the dequeued packet. If the ARP logic 350 has the information, the information is used to send 420 the packet accordingly. If the ARP logic 350 does not have the information, the driver needs to determine such information, and in certain preferred embodiments, this information is obtained as a result of an implementation of the ARP protocol as discussed in connection with FIGS. 4B-C.
  • If the ARP logic 350 has the MAC address information, the driver analyzes the information returned from the ARP logic 350 to determine where and how to send the packet. Specifically, the driver looks at the address to determine whether the MAC address is in a valid format or in a particular invalid format. For example, in one embodiment, internal nodes (i.e., PAN nodes internal to the platform) are signaled through a combination of setting the locally administered bit, the multicast bit, and another predefined bit pattern in the first byte of the MAC address. The resulting pattern is highly unlikely to be a valid MAC address.
  • If the MAC address returned from the ARP logic is in a valid format, the IP address associated with that MAC address is for a node external at least to the relevant subnet and in preferred embodiments is external to the platform. To deliver such a packet, the driver prepends the packet with a TLV (type-length-value) header. The logic then sends the packet to the control node over a pre-established VI. The control node then handles the rest of the transmission as appropriate.
  • If the MAC address information returned from the ARP logic 350 is in a particular invalid format, the invalid format signals that the IP-addressed node is an internal node, and the information in the MAC address information is used to help identify the VI (or RVI) directly connecting the two processing nodes. For example, the ARP table entry may hold information identifying the RVI 212 to use to send the packet, e.g., 212 1-2, to another processing node. The driver prepends the packet with a TLV header. It then places address information into the header as well as information identifying the Ethernet protocol type. The logic then selects the appropriate VI (or RVI) on which to send the encapsulated packet. If that VI (or RVI) is operating satisfactorily it is used to carry the packet; if it is operating unsatisfactorily the packet is sent to the control node switch logic (more below) so that the switch logic can send it to the appropriate node. Though the ARP table may contain information to actually specify the RVI to use, many other techniques may be employed. For example, the information in the table may indirectly provide such information, e.g., by pointing to the information of interest or otherwise identifying the information of interest though not containing it.
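  • By way of illustration only, the send-path decision described above, using the direct RVI for internal destinations (with the control node as fallback) and the control node for external destinations, might be sketched in C as follows. The structure fields and helper routines are assumptions introduced for the sketch, not the actual driver interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

struct arp_entry {                 /* illustrative ARP-table entry */
    uint8_t  mac[6];
    uint32_t rvi_id;               /* meaningful only for internal nodes */
};

struct packet;                     /* opaque outgoing IP datagram */

/* Hypothetical primitives provided elsewhere in the driver. */
void prepend_tlv_header(struct packet *p, const struct arp_entry *e);
void send_on_rvi(struct packet *p, uint32_t rvi_id);
void send_to_control_node(struct packet *p);
bool rvi_is_healthy(uint32_t rvi_id);

/* Internal nodes are signaled by a deliberately invalid MAC pattern:
 * locally administered bit, multicast bit, and an additional pattern in
 * the first byte (the exact extra pattern is not specified here). */
static bool mac_has_internal_pattern(const uint8_t mac[6])
{
    return (mac[0] & 0x03) == 0x03;     /* locally administered + multicast */
}

void virtual_ethernet_send(struct packet *p, const struct arp_entry *e)
{
    prepend_tlv_header(p, e);
    if (mac_has_internal_pattern(e->mac)) {
        /* Internal node: use the direct RVI if it is up; otherwise fall
         * back to the control node acting as a switch. */
        if (rvi_is_healthy(e->rvi_id))
            send_on_rvi(p, e->rvi_id);
        else
            send_to_control_node(p);
    } else {
        /* Valid MAC format: the destination is external and the control
         * node handles the rest of the transmission. */
        send_to_control_node(p);
    }
}
```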
  • For any multicast or broadcast type messages, the driver sends the message to the control node on a defined VI. The control node then clones the packet and sends it to all nodes (excluding the sending node) and the uplink accordingly.
  • If there is no ARP mapping then the upper layers would never have sent the packet to the driver. If there is no data link layer mapping available, the packet is put aside until ARP resolution is completed. Once the ARP layer has finished ARPing, the packets held back pending ARP get their data link headers built and are then sent to the driver.
  • If the ARP logic has no mapping for an IP address of an IP packet from the IP stack and, consequently, the driver 310 is unable to determine the associated addressing information (i.e., MAC address or RVI-related information), the driver obtains such information by following the ARP protocol. Referring to FIGS. 4B-C, the driver builds 425 an ARP request packet containing the relevant IP address for which there is no MAC mapping in the local ARP table. The node then prepends 430 the ARP packet with a TLV-type header. The ARP request is then sent via a dedicated RVI to the control node-side networking logic—specifically, the virtual LAN server 335.
  • As will be discussed in more detail below, the ARP request packet is processed 435 by the control node and broadcast 440 to the relevant nodes. For example, the control node will flag whether the requesting node is part of an IP service cluster.
  • The Ethernet driver logic 310 at the relevant nodes receives 445 the ARP reply, and determines 450 if it is the target of the ARP request by comparing the target IP address with a list of locally configured IP addresses by making calls to the node's IP stack. If it is not the target, it passes up the packet without modification. If it is the target, the driver creates 460 a local MAC header from the TLV header and updates 465 the local ARP table and creates an ARP reply. The driver modifies the information in the ARP request (mainly the source MAC) and then passes the ARP request up normally for the upper layers to handle. It is the upper layers that form the ARP reply when necessary. The reply among other things contains the MAC address of the replying node and has a bit set in the TLV header indicating that the reply is from a local node. In this regard, the node responds according to IETF-type ARP semantics (in contrast to ATM ARP protocols in which ARP replies are handled centrally). The reply is then sent 470.
  • As will be explained in more detail below, the control node logic 335 receives 473 the reply and modifies it. For example, the control node may substitute the MAC address of a replying, internal node with information identifying the source cabinet, processing node number, RVI connection number, channel, virtual interface number, and virtual LAN name. Once the ARP reply is modified the control node logic then sends 475 the ARP reply to an appropriate node, i.e., the node that sent the ARP request, or in specific instances to the load balancer in an IP service cluster, discussed below.
  • Eventually, an encapsulated ARP reply is received 480. If the replying node is an external node, the ARP reply contains the MAC address of the replying node. If the replying node is an internal node, the ARP reply instead contains information identifying the relevant RVI to communicate with the node. In either case, the local table is updated 485.
  • The pending datagram is dequeued 487, and the appropriate RVI is selected 493. As discussed above, the appropriate RVI is selected based on whether the target node is internal or external. A TLV header is prepended to the packet and sent 495.
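  • The receive side of this exchange, updating the local table from an encapsulated ARP reply and releasing datagrams held pending resolution, might be sketched as follows. The structure and helper names (arp_table_update, dequeue_pending_for, and so on) are hypothetical stand-ins for the driver's actual primitives.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct arp_reply {            /* decoded, encapsulated ARP reply */
    uint32_t ip;
    uint8_t  mac[6];          /* real MAC (external) or special pattern (internal) */
    bool     from_internal_node;
    uint32_t rvi_id;          /* valid only when from_internal_node */
};

struct packet;

/* Hypothetical primitives provided elsewhere in the driver. */
void arp_table_update(uint32_t ip, const uint8_t mac[6], uint32_t rvi_id);
struct packet *dequeue_pending_for(uint32_t ip);
void prepend_tlv_and_send(struct packet *p, const struct arp_reply *r);

void handle_arp_reply(const struct arp_reply *r)
{
    /* Bind the IP either to the external MAC or to the RVI-related
     * information identifying the direct connection to an internal node. */
    arp_table_update(r->ip, r->mac, r->from_internal_node ? r->rvi_id : 0);

    /* Release datagrams that were parked while ARP was outstanding. */
    struct packet *p;
    while ((p = dequeue_pending_for(r->ip)) != NULL)
        prepend_tlv_and_send(p, r);
}
```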
  • For communications within a virtual LAN the maximum transmission unit (MTU) is configured as 16896 bytes. Even though the configured MTU is 16896 bytes, the Ethernet driver 310 recognizes when a packet is being sent to an external network. Through the use of path MTU discovery, ICMP, and IP stack changes, the path MTU is changed at the source node 105. This mechanism is also used to trigger packet checksumming.
  • Certain embodiments of the invention support promiscuous mode through a combination of logic at the virtual LAN server 335 and in the virtual LAN drivers 310. When a virtual LAN driver 310 receives a promiscuous mode message from the virtual LAN server 335, the message contains information about the identity of the receiver desiring to enter promiscuous mode. This information includes the receiver's location (cabinet, node, etc), the interface number of the promiscuous virtual interface 310 on the receiver (required for demultiplexing packets), and the name of the virtual LAN to which the receiver belongs. This information is then used by the driver 310 to determine how to send promiscuous packets to the receiver (which RVI or other mechanism to use to send the packets). The virtual interface 310 maintains a list of promiscuous listeners on the same virtual LAN. When a sending node receives a promiscuous mode message it will update its promiscuous list accordingly.
  • When a packet is transmitted over a virtual Ethernet driver 310, this list will be examined. If the list is not empty, then the virtual Ethernet interface 310 will do the following:
      • If the outgoing packet is being broadcast or multicast, no promiscuous copy will be sent. The normal broadcast operation will transmit the packet to the promiscuous listener(s).
      • If the packet is a unicast packet with a destination other than the promiscuous listener, the packet will be cloned and sent to the promiscuous listeners.
  • The header TLV includes extra information the destination can use to demultiplex and validate the incoming packet. Part of this information is the destination virtual Ethernet interface number (destination device number on the receiving node). Since these can be different between the actual packet destination and the promiscuous destination, this header cannot simply be cloned. Thus, memory will have to be allocated for each header for each packet clone to each promiscuous listener. When the packet header for a promiscuous packet is built the packet type will be set to indicate that the packet was a promiscuous transmission rather than a unicast transmission.
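  • A minimal sketch of the promiscuous cloning described above, allocating a fresh header per listener because the destination interface number differs for each clone, might look like the following; the field names and type code are assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

struct tlv_header {             /* illustrative subset of the TLV fields */
    uint16_t dest_if_number;    /* destination virtual Ethernet interface */
    uint16_t packet_type;       /* unicast vs. promiscuous transmission */
};

struct listener {               /* one promiscuous listener on the same virtual LAN */
    uint16_t if_number;
    uint32_t rvi_id;
    struct listener *next;
};

struct packet;
struct packet *clone_packet(const struct packet *p);   /* data cloned, header not */
void attach_header_and_send(struct packet *p, const struct tlv_header *h,
                            uint32_t rvi_id);

#define PKT_TYPE_PROMISC 0x2    /* hypothetical type code */

void send_promiscuous_copies(const struct packet *p, const struct listener *list)
{
    for (const struct listener *l = list; l != NULL; l = l->next) {
        /* Each clone needs its own header: the destination interface
         * number differs per listener, so the header cannot be shared. */
        struct tlv_header *h = malloc(sizeof(*h));
        if (h == NULL)
            continue;
        h->dest_if_number = l->if_number;
        h->packet_type    = PKT_TYPE_PROMISC;
        attach_header_and_send(clone_packet(p), h, l->rvi_id);
    }
}
```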
  • The virtual Ethernet driver 310 is also responsible for handling the redundant control node connections. For example, the virtual Ethernet drivers will periodically test end-to-end connectivity by sending a heartbeat TLV to each connected RVI. This will allow virtual Ethernet drivers to determine if a node has stopped responding or whether a stopped node has started to respond again. When an RVI or control node 120 is determined to be down, the Ethernet driver will send traffic through the surviving control node. If both control nodes are functional the driver 310 will attempt to load balance traffic between the two nodes.
  • Certain embodiments of the invention provide performance improvements. For example, with modifications to the IP stack 305, packets sent only within the platform 100 are not checksummed, since all elements of the platform 100 provide error detection and guaranteed data delivery.
  • In addition, for communications within a PAN (or even within a platform 100) the RVI may be configured so that the packets may be larger than the maximum size permitted by Ethernet. Thus, while the model emulates Ethernet behavior, in certain embodiments the maximum packet size may be exceeded to improve performance. The actual packet size will be negotiated as part of the data link layer.
  • Failure of a control node is detected either by a notification from the RCLAN layer, or by a failure of heartbeat TLVs. If a control node fails the Ethernet driver 310 will send traffic only to the remaining control node. The Ethernet driver 310 will recognize the recovery of a control node via notification from the RCLAN layer or the resumption of heartbeat TLVs. Once a control node has recovered, the Ethernet driver 310 will resume load balancing.
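  • The per-control-node health tracking described above might be sketched as follows; the timeout value and field names are assumptions, and in practice the state would be driven by RCLAN notifications as well as heartbeat TLVs.

```c
#include <stdbool.h>
#include <stdint.h>

#define HEARTBEAT_TIMEOUT_MS 3000   /* assumed timeout, not from the source */

struct control_node_state {
    bool     up;
    uint64_t last_heartbeat_ms;     /* last heartbeat TLV or RCLAN notification */
};

static struct control_node_state cnode[2];

/* Called when a heartbeat TLV arrives or the RCLAN reports recovery. */
void cnode_mark_alive(int idx, uint64_t now_ms)
{
    cnode[idx].last_heartbeat_ms = now_ms;
    cnode[idx].up = true;           /* recovered nodes re-enter load balancing */
}

/* Periodic check: a silent control node is treated as down and all
 * traffic is directed to the survivor until it responds again. */
void cnode_check(uint64_t now_ms)
{
    for (int i = 0; i < 2; i++)
        if (now_ms - cnode[i].last_heartbeat_ms > HEARTBEAT_TIMEOUT_MS)
            cnode[i].up = false;
}
```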
  • If a node detects that it cannot communicate with another node via a direct RVI (as outlined above) the node attempts to communicate via the control node, acting as a switch. Such failure may be signaled by the lower RCLAN layer, for example from failure to receive a virtual interface acknowledgement or from failures detected through heartbeat mechanisms. In this instance, the driver marks bits in the TLV header accordingly to indicate that the message is to be unicast and sends the packet to the control node so that it can send the packet to the desired node (e.g., based on the IP address, if necessary).
  • RCLAN Layer
  • The RCLAN layer 315 is responsible for handling the redundancy, fail-over, and load balancing logic of the redundant interconnect NICs 107. This includes detecting failures, re-routing traffic over a redundant connection on failures, load balancing, and reporting inability to deliver traffic back to the virtual network drivers 310. The virtual Ethernet drivers 310 expect to be notified asynchronously when there is a fatal error on any RVI that makes the RVI unusable or if any RVI is taken down for any reason.
  • Under normal circumstances the virtual network driver 310 on each processor will attempt to load balance outgoing packets between available control nodes. This can be done via simple round-robin alternation between available control nodes, or by keeping track of how many bytes have been transmitted on each and always transmitting on the control nodes through which fewest bytes have been sent.
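  • Both load balancing policies mentioned above, simple round-robin alternation and transmitting on the control node with the fewest bytes sent, can be sketched briefly; the function names are illustrative only.

```c
#include <stdint.h>

/* Simple round-robin alternation between control nodes 0 and 1. */
int pick_round_robin(void)
{
    static int next;
    int chosen = next;
    next = (next + 1) % 2;
    return chosen;
}

/* Least-bytes: always transmit on the control node through which the
 * fewest bytes have been sent so far. */
int pick_least_bytes(const uint64_t bytes_sent[2])
{
    return bytes_sent[0] <= bytes_sent[1] ? 0 : 1;
}
```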
  • The RCLAN provides high bandwidth (224 MB/sec each way), low latency, reliable, asynchronous point-to-point communication between kernels. A best effort is made to deliver data, and the sender is notified if the data cannot be delivered. The RCLAN uses two Giganet clan 1000 cards to provide redundant communication paths between kernels. It seamlessly recovers from single failures in the clan 1000 cards or the Giganet switches. It detects lost data and data errors and resends the data if needed. Communication will not be disrupted as long as one of the connections is at least partially working, e.g., the error rate does not exceed 5%. Clients of the RCLAN include the RPC mechanism, the remote SCSI mechanism, and remote Ethernet. The RCLAN also provides a simple form of flow control. Low latency and high concurrency are achieved by allowing multiple simultaneous requests for each device to be sent by the processor node to the control node, so that they can be forwarded to the device as soon as possible or, alternatively, so that they can be queued for completion as close to the device as possible, as opposed to queuing all requests on the processor node.
  • The RCLAN layer 330 on the control node-side operates analogously to the above.
  • Giganet Driver
  • The Giganet driver logic 320 is the logic responsible for providing an interface to the Giganet NIC 107, whether on a processor 106 or control node 120. In short, the Giganet driver logic establishes VI connections, identified by VI IDs, so that the higher layers, e.g., RCLAN 315 and Ethernet driver 310, need only understand the semantics of VIs.
  • Giganet driver logic 320 is responsible for allocating memory in each node for buffers and queues for the VIs, and for conditioning the NIC 107 to know about the connection and its memory allocation. Certain embodiments use VI connections provided by the Giganet driver. The Giganet NIC driver code establishes a Virtual Interface pair (i.e., VI) and assigns it to a corresponding virtual interface ID.
  • Each VI is a bi-directional connection established between one Giganet port and another, or more precisely between memory buffers and memory queues on one node to buffers and queues on another. The allocation of ports and memory is handled by the NIC drivers as stated above. Data is transmitted by placing it into a buffer the NIC knows about and triggering action by writing to a specific memory-mapped register. On the receiving side, the data appears in a buffer and completion status appears in a queue. The data never need be copied if the sending and receiving programs are capable of producing and consuming messages in the connection's buffers. The transmission can even be direct from application program to application program if the operating system memory-maps the connection's buffers and control registers into application address space. Each Giganet port can support 1024 simultaneous VI connections over it and keep them separate from each other with hardware protection, so the operating system as well as disparate applications can safely share a single port. Under one embodiment of the invention, 14 VI connections may be established simultaneously from every port to every other port.
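  • A simplified model of a VI endpoint and its send path, pre-registered buffers plus a memory-mapped doorbell register, is sketched below. The structure layout and doorbell write format are assumptions and do not reflect the actual Giganet programming interface.

```c
#include <stdint.h>
#include <string.h>

#define VI_BUF_SIZE 4096

/* Illustrative VI endpoint: a buffer the NIC knows about and a
 * memory-mapped register that triggers the transfer. */
struct vi_endpoint {
    uint8_t            send_buf[VI_BUF_SIZE];
    volatile uint32_t *doorbell;   /* memory-mapped trigger register */
    uint32_t           vi_id;      /* one of up to 1024 VIs per port */
};

/* Post a message: produce (or copy) the data into the registered buffer,
 * then ring the doorbell so the NIC starts the transfer. No further copy
 * is needed if producer and consumer work directly in these buffers. */
int vi_send(struct vi_endpoint *vi, const void *data, uint32_t len)
{
    if (len > VI_BUF_SIZE)
        return -1;
    memcpy(vi->send_buf, data, len);
    *vi->doorbell = len;           /* hypothetical doorbell write format */
    return 0;
}
```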
  • In preferred embodiments, the NIC drivers establish VI connections in redundant pairs, with one connection of the pair going through one of the two switch fabrics 115 a,b and the other through the other switch. Moreover, in preferred embodiments, data is sent alternately on the two legs of the pair, equalizing load on the switches. Alternatively, the redundant pairs may be used in fail-over manner.
  • All the connection pairs established by the node persist as long as the operating system remains up. Establishment of a connection pair to simulate an Ethernet connection is intended to be analogous to, and as persistent as, physically plugging in a cable between network interface cards. If a node's defined configuration changes while its operating system is running, then applicable redundant Virtual Interface connection pairs will be established or discarded at the time of the change.
  • The Giganet driver logic 325 on the control node-side operates analogously to the above.
  • Virtual LAN Server
  • The virtual LAN server logic 335 facilitates the emulation of an Ethernet network over the underlying NBMA network. The virtual LAN server logic
      • 1. manages membership in a corresponding virtual LAN;
      • 2. provides RVI mapping and management;
      • 3. provides ARP processing and IP-to-RVI mapping;
      • 4. provides broadcast and multicast services;
      • 5. facilitates bridging and routing to other domains; and
      • 6. manages service clusters.
    1. Virtual LAN Membership Management
  • Administrators configure the virtual LANs using management application 135. Assignment and configuration of IP addresses on virtual LANs may be done in the same way as on an "ordinary" subnet. The choice of IP addresses to use depends on the external visibility of nodes on a virtual LAN. If the virtual LAN is not globally visible (either not visible outside the platform 100, or not visible from the Internet), private IP addresses should be used. Otherwise, IP addresses must be configured from the range provided by the internet service provider (ISP) that provides the Internet connectivity. In general, virtual LAN IP address assignment must be treated the same as normal LAN IP address assignment. Configuration files stored on the local disks of the control node 120 define the IP addresses within a virtual LAN. For the purposes of a virtual network interface, an IP alias simply creates another IP-to-RVI mapping on the virtual LAN server logic 335. Each processor may configure multiple virtual interfaces as needed. The primary restrictions on the creation and configuration of virtual network interfaces are IP address allocation and configuration.
  • Each virtual LAN has a corresponding instance of server logic 335 that executes on both of the control nodes 120, and a number of member nodes executing on the processor nodes 105. The topology is defined by the administrator.
  • Each virtual LAN server 335 is configured to manage exactly one broadcast domain, and any number of layer 3 (IP) subnets may be present on the given layer 2 broadcast domain. The servers 335 are configured and created in response to administrator commands to create virtual LANs.
  • When a processor 106 boots and configures its virtual networks, it connects to the virtual LAN server 335 via a special management RVI. The processors then obtain their data link configuration information, such as the virtual MAC addresses assigned to it, virtual LAN membership information and the like. The virtual LAN server 335 will determine and confirm that the processor attempting to connect to it is properly a member of the virtual LAN that that server 335 is servicing. If the processor is not a virtual LAN member, the connection to the server is rejected. If it is a member, the virtual network driver 310 registers its IP address with the virtual LAN server. (The IP address is provided by the IP stack 305 when the driver 310 is configured.) The virtual LAN server then binds that IP address to an RVI on which the registration arrived. This enables the virtual LAN server to find the processor associated with a specific IP address. Additionally, the association of IP addresses with a processor can be performed via the virtual LAN management interface 135. The latter method is necessary to properly configure cluster IP addresses or IP addresses with special handling, discussed below.
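  • The server-side registration check described above, rejecting non-members and otherwise binding the registered IP address to the RVI on which the registration arrived, might be sketched as follows with illustrative data structures.

```c
#include <stdbool.h>
#include <stdint.h>

struct vlan_server {
    /* Opaque membership and binding tables maintained by the server. */
    void *membership;
    void *ip_to_rvi;
};

/* Hypothetical lookups/updates provided elsewhere. */
bool is_vlan_member(const struct vlan_server *s, uint32_t node_id);
void bind_ip_to_rvi(struct vlan_server *s, uint32_t ip, uint32_t rvi_id);

/* Handle a registration arriving on rvi_id from node_id with address ip.
 * Returns false (connection rejected) if the node is not a member of the
 * virtual LAN this server instance is servicing. */
bool handle_registration(struct vlan_server *s, uint32_t node_id,
                         uint32_t ip, uint32_t rvi_id)
{
    if (!is_vlan_member(s, node_id))
        return false;
    bind_ip_to_rvi(s, ip, rvi_id);  /* lets the server find the processor for an IP */
    return true;
}
```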
  • 2. RVI Mapping and Management
  • As outlined above, certain embodiments use RVIs to connect nodes at the data link layer and to form control connections. Some of these connections are created and assigned as part of control nodes booting and initialization. The data link layer connections are used for the reasons described above. The control connections are used to exchange management, configuration, and health information.
  • Some RVI connections are between nodes for unicast traffic, e.g., 212 1-2. Other RVI connections are to the virtual LAN server logic 335 so that the server can handle the requests, e.g., ARP traffic, broadcasts, and so on. To create the RVI the virtual LAN server 335 creates and removes RVIs through calls to a Giganet switch manager 360 (provided with the switch fabric and Giganet NICs). The switch manager may execute on the control nodes 120 and cooperates with the Giganet drivers to create the RVIs.
  • With regard to processor connections, as nodes register with the virtual LAN server 335, the virtual LAN server creates and assigns virtual MAC addresses for the nodes, as described above. In conjunction with this, the virtual LAN server logic maintains data structures reflecting the topology and MAC assignments for the various nodes. The virtual LAN server logic then creates corresponding RVIs for the unicast paths between nodes. These RVIs are subsequently allocated and made known to the nodes during the nodes booting. Moreover, the RVIs are also associated with IP addresses during the virtual LAN server's handling of ARP traffic. The RVI connections are torn down if a node is removed from the topology.
  • If a node 106 at one end of an established RVI connection is rebooted, the operating systems at each end of the connection, together with the RVI management logic, re-establish the connection. Software using the connection on the processing node that remained up will be unaware that anything happened to the connection itself. Whether or not the software notices or cares that the software at the other end was rebooted depends upon what it is using the connection for and the extent to which the rebooted end is able to re-establish its state from persistent storage. For example, any software communicating via Transmission Control Protocol (TCP) will notice that all TCP sessions are closed by a reboot. On the other hand, Network File System (NFS) access is stateless and not affected by a reboot if it occurs within an allowed timeout period.
  • Should a node be unable to send a packet on a direct RVI at any time, it can always attempt to send the packet to a destination via the virtual LAN server 335. Since the virtual LAN server 335 is connected to all virtual Ethernet driver 310 interfaces on the virtual LAN via the control connections, virtual LAN server 335 can also serve as the packet relay mechanism of last resort.
  • With regard to the connections to the virtual LAN server 335, certain embodiments use virtual Ethernet drivers 310 that algorithmically determine the RVI to use to connect to the associated virtual LAN server 335. The algorithm, depending on the embodiment, may need to consider identification information such as cabinet number to identify the RVI.
  • 3. ARP Processing and IP Mapping to RVIs
  • As explained above, the virtual Ethernet drivers 310 of certain embodiments support ARP. In these embodiments, ARP processing is used to advantage to create mappings at the nodes between IP addresses and RVIs that may be used to carry unicast traffic, including IP packets, between nodes.
  • To do this, the virtual Ethernet drivers 310 send ARP packet requests and replies to the virtual LAN server 335 via a dedicated RVI. The virtual LAN server 335, and specifically ARP server logic 355, handles the packets by adding information to the packet header. As was explained above, this information facilitates identification of the source and target and identifies the RVI that may be used between the nodes.
  • The ARP server logic 355 receives the ARP requests, processes the TLV header, and broadcasts the request to all relevant nodes on the internal platform and, if appropriate, to the external network. Among other things, the server logic 355 determines who should receive the ARP reply resulting from the request. For example, if the source is a clustered IP address, the reply should be sent to the cluster load balancer, not necessarily the source of the ARP request. The server logic 355 indicates such by including information in the TLV header of the ARP request, so that the target of the ARP replies accordingly. The server 335 will process the ARP packet by including further information in the appended header and broadcast the packet to the nodes in the relevant domain. For example, the modified header may include information identifying the source cabinet, processing node number, RVI connection number, channel, virtual interface number, and virtual LAN name (some of which is known only by the server 335).
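  • The kind of information the ARP server adds to the TLV header before broadcasting a request might be modeled as below. The field layout, sizes, and names are assumptions; the description specifies only the categories of information carried (cabinet, node number, RVI connection, channel, virtual interface number, virtual LAN name, and the clustered-source indication).

```c
#include <stdint.h>
#include <string.h>

/* Illustrative TLV header carrying the routing information the ARP server
 * adds before broadcasting a request. */
struct arp_tlv {
    uint16_t src_cabinet;
    uint16_t src_node;
    uint32_t rvi_connection;
    uint16_t channel;
    uint16_t virtual_if;
    char     vlan_name[32];
    uint8_t  reply_to_load_balancer;  /* set when the source is a clustered IP */
};

void annotate_arp_request(struct arp_tlv *tlv,
                          uint16_t cabinet, uint16_t node,
                          uint32_t rvi, uint16_t chan, uint16_t vif,
                          const char *vlan, int src_is_clustered)
{
    tlv->src_cabinet    = cabinet;
    tlv->src_node       = node;
    tlv->rvi_connection = rvi;
    tlv->channel        = chan;
    tlv->virtual_if     = vif;
    strncpy(tlv->vlan_name, vlan, sizeof(tlv->vlan_name) - 1);
    tlv->vlan_name[sizeof(tlv->vlan_name) - 1] = '\0';
    /* Direct the eventual reply to the cluster load balancer rather than
     * the original requester when the source address is clustered. */
    tlv->reply_to_load_balancer = src_is_clustered ? 1 : 0;
}
```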
  • The ARP replies are received by the server logic 355, which then maps the MAC information in the reply to corresponding RVI related information. The RVI-related information is placed in the target MAC entry of the reply and sent to the appropriate source node (e.g., may be the sender of the request, but in some instances such as with clustered IP addresses may be a different node).
  • 4. Broadcast and Multicast Services
  • As outlined above, broadcasts are handled by receiving the packet on a dedicated RVI. The packet is then cloned by the server 335 and unicast to all virtual interfaces 310 in the relevant broadcast domain.
  • The same approach may be used for multicast. All multicast packets will be reflected off the virtual LAN server. Under some alternative embodiments, the virtual LAN server will treat multicast the same as broadcast and rely on IP filtering on each node to filter out unwanted packets.
  • When an application wishes to send or receive multicast traffic, it must first join a multicast group. When a process on a processor performs a multicast join, the processor's virtual network driver 310 sends a join request to the virtual LAN server 335 via a dedicated RVI. The virtual LAN server then configures a specific multicast MAC address on the interface and informs the LAN Proxy 340, discussed below, as necessary. The Proxy 340 must keep track of use counts on specific multicast groups so that a multicast address is removed only when no processor belongs to that multicast group.
  • 5. Bridging and Routing to Other Domains
  • From the perspective of system 100, the external network 125 may operate in one of two modes: filtered or unfiltered. In filtered mode a single MAC address for the entire system is used for all outgoing packets. This hides the virtual MAC addresses of a processing node 107 behind the Virtual LAN Proxy 340 and makes the system appear as a single node on the network 125 (or as multiple nodes behind a bridge or proxy). Because this does not expose unique link layer information for each internal node 107, some other unique identifier is required to properly deliver incoming packets. When running in filtered mode, the destination IP address of each incoming packet is used to uniquely identify the intended recipient, since the MAC address will only identify the system. In unfiltered mode the virtual MACs of a node 107 are visible outside the system so that they may be used to direct incoming traffic. That is, filtered mode mandates layer 3 switching while unfiltered mode allows layer 2 switching. Filtered mode requires that some component (in this case the Virtual LAN Proxy 340) replace node virtual MAC addresses with the MAC address of the external network 125 on all outgoing packets.
  • Some embodiments support the ability for a virtual LAN to be connected to external networks. Consequently, the virtual LAN will have to handle IP addresses not configured locally. To address this, one embodiment imposes a limit that each virtual LAN so connected be restricted to one external broadcast domain. IP addresses and subnet assignments for the internal nodes of the virtual LAN will have to be in accordance with the external domain.
  • The virtual LAN server 335 services the external connection by effectively acting as a data link layer bridge, in that it moves packets between the external Ethernet driver 345 and internal processors and performs no IP processing. However, unlike a data link layer bridge, the server cannot always rely on distinctive layer 2 addresses from the external network to identify internal nodes; instead, the connection may use layer 3 (IP) information to make the bridging decisions. To do this, the external connection software extracts IP address information from incoming packets and uses this information to identify the correct node 106 so that it may move the packet to that node.
  • A virtual LAN server 335 having an attached external broadcast domain has to intercept and process packets from and to the external domain so that external nodes have a consistent view of the subnet(s) in the broadcast domain.
  • When virtual LAN server 335 having an attached external broadcast domain receives an ARP request from an external node it will relay the request to all internal nodes. The correct node will then compose the reply and send the reply back to the requester through the virtual LAN server 335. The virtual LAN server cooperates with the virtual LAN Proxy 340 so that the Proxy may handle any necessary MAC address translation on outgoing requests. All ARP Replies and ARP advertisements from external sources will be relayed directly to the target nodes.
  • Virtual Ethernet interfaces 310 will send all unicast packets with an external destination to the virtual LAN server 335 over the control connection RVI. (External destinations may be recognized by the driver by the MAC address format.) The virtual LAN server will then move the packet to the external network 125 accordingly.
  • If the virtual LAN server 335 receives a broadcast or multicast packet from an internal node it relays the packet to the external network in addition to relaying the packet to all internal virtual LAN members. If the virtual LAN server 335 receives a broadcast or multicast packet from an external source it relays the packet to all attached internal nodes.
  • Under certain embodiments, interconnecting virtual LANs through the use of IP routers or firewalls is accomplished using analogous mechanisms to those used in interconnecting physical LANs. One processor is configured on both LANs, and the Linux kernel on that processor must have routing (and possibly IP masquerading) enabled. Normal IP subnetting and routing semantics will always be maintained, even for two nodes located in the same platform.
  • A processor could be configured as a router between two external subnets, between an external and an internal subnet, or between two internal subnets. When an internal node is sending a packet through a router there are no problems because of the point-to-point topology of the internal network. The sender will send directly to the router (i.e., the processor configured with routing logic) without the intervention of the virtual LAN server (i.e., typical processor-to-processor communication, discussed above).
  • When an external node sends a packet to an internal router, and the external network 125 is running in filtered mode, the destination MAC address of the incoming packet will be that of the platform 100. Thus the MAC address cannot be used to uniquely identify the packet destination node. For a packet whose destination is an internal node on the virtual LAN, the destination IP address in the IP header is used to direct the packet to the proper destination node. However, because routers are not final destinations, the destination IP address in the IP header is that of the final destination rather than that of the next hop (which is the internal router). Thus, there is nothing in the incoming packet that can be used to direct it to the correct internal node. To handle this situation, one embodiment imposes a limit of no more than one router exposed to an external network on a virtual LAN. This router is registered with the virtual LAN server 335 as a default destination so that incoming packets with no valid destination will be directed to this default node.
  • When an external node sends a packet to an internal router and the external network 125 is running in unfiltered mode, the destination MAC address of the incoming packet will be the virtual MAC address of the internal destination node. The LAN Server 335 will then use this virtual MAC to send the packet directly to the destination internal node. In this case any number of internal nodes may be functioning as routers as the incoming packet's MAC address will uniquely identify the destination node.
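  • The steering of incoming external packets in the two modes, by destination IP (with the registered default router as fallback) in filtered mode and by virtual MAC in unfiltered mode, might be sketched as follows with hypothetical lookup routines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct node;                              /* internal destination node */

/* Hypothetical lookups into the virtual LAN server's tables. */
struct node *lookup_by_ip(uint32_t dest_ip);
struct node *lookup_by_virtual_mac(const uint8_t dest_mac[6]);
struct node *registered_default_router(void);   /* at most one per virtual LAN */

/* Decide which internal node receives an incoming external packet.
 * In filtered mode only the destination IP is usable (the MAC names the
 * platform); packets with no matching IP go to the registered default
 * router. In unfiltered mode the virtual MAC identifies the node directly. */
struct node *steer_incoming(bool filtered_mode, uint32_t dest_ip,
                            const uint8_t dest_mac[6])
{
    if (filtered_mode) {
        struct node *n = lookup_by_ip(dest_ip);
        return n != NULL ? n : registered_default_router();
    }
    return lookup_by_virtual_mac(dest_mac);
}
```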
  • If a configuration requires multiple routers on a subnet, one router can be picked as the exposed router. This router in turn could route to the other routers as necessary.
  • Under certain embodiments, router redundancy is provided by making a router a clustered service and load balancing or failing over on a stateless basis (i.e., every IP packet rather than per-TCP connection).
  • Certain embodiments of the invention support promiscuous mode functionality by providing switch semantics in which a given port may be designated as a promiscuous port so that all traffic passing through the switch is repeated on the promiscuous port. The nodes that are allowed to listen in promiscuous mode will be assigned administratively at the virtual LAN server.
  • When a virtual Ethernet interface 310 enters promiscuous receive mode it will send a message to the virtual LAN server 335 over the management RVI. This message will contain all the information about the virtual Ethernet interface 310 entering promiscuous mode. When the virtual LAN Server receives a promiscuous mode message from a node, it will check its configuration information to determine if the node is allowed to listen promiscuously. If not, the virtual LAN Server will drop the promiscuous mode message without further processing. If the node is allowed to enter promiscuous mode, the virtual LAN server will broadcast the promiscuous mode message to all other nodes on the virtual LAN. The virtual LAN server will also mark the node as being promiscuous so that it can forward copies of incoming external packets to it. When a promiscuous listener detects any change in its RVI configuration it will send a promiscuous mode message to the virtual LAN server to update the state of all other nodes on the relevant broadcast domain. This will update any nodes entering or leaving a virtual LAN. When a virtual Ethernet interface 310 leaves promiscuous mode it will send the virtual LAN server a message informing it that the interface is leaving promiscuous mode. The virtual LAN server will then send this message to all other nodes on the virtual LAN. Promiscuous settings will allow for placing an external connection in promiscuous mode when any internal virtual interface is a promiscuous listener. This will make the traffic external to the platform (but on the same virtual LAN) available to the promiscuous listener.
  • 6. Managing Service Clusters
  • A service cluster is a set of services available at one or more IP addresses (or host names). Examples of these services are HTTP, FTP, telnet, NFS, etc. An IP address and port number pair represents a specific service type (though not a service instance) offered by the cluster to clients, including clients on the external network 125.
  • FIG. 5 shows how certain embodiments present a virtual cluster 505 of services as a single virtual host to the Internet or other external network 125 via a cluster IP address. All the services of the cluster 505 are addressed through a single IP address, through different ports at that IP address. In the example of FIG. 5, service B is a load balanced service.
  • With reference to FIG. 3B, virtual clusters are supported by the inclusion of virtual cluster proxy (VCP) logic 360 which cooperates with the virtual LAN server 335. In short, VCP 360 is responsible for handling distribution of incoming connections, port filters, and real server connections for each configured virtual IP address. There will be one VCP for each clustered IP address configured.
  • When a packet arrives on the virtual cluster IP address, the virtual LAN Proxy logic 340 will send the packet to the VCP 360 for processing. The VCP will then decide where to send the packet based on the packet contents, its internal connection state cache, any load balancing algorithms being applied to incoming traffic, and the availability of configured services. The VCP will relay incoming packets based on both the destination IP address as well as the TCP or UDP port number. Further, it will only distribute packets destined for port numbers known to the VCP (or for existing TCP connections). It is the configuration of these ports, and the mapping of the port number to one or more processors that creates the virtual cluster and makes specific service instances available in the cluster. If multiple instances of the same service from multiple application processors are configured then the VCP can load balance between the service instances.
  • The VCP 360 maintains a cache of all active connections that exist on the cluster's IP address. Any load balancing decisions are made only when a new connection is established between the client and a service. Once the connection has been set up, the VCP will use the source and destination information in the incoming packet header to make sure all packets in a TCP stream get routed to the same processor 106 configured to provide the service. In the absence of the ability to determine a client session (for example, HTTP sessions), the connection/load balancing mapping cache will route packets based on client address so that subsequent connections from the same client go to the same processor (making a client session persistent or "sticky"). Session persistence should be selectable on a service port number basis, since only certain types of services require session persistence.
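  • The VCP's routing decision, existing connections kept on their processor, optional client-address stickiness, and load balancing for new connections, might be sketched as follows; the cache and balancing primitives are assumed, not actual interfaces.

```c
#include <stddef.h>
#include <stdint.h>

struct conn_key {          /* identifies a flow through the cluster IP */
    uint32_t client_ip;
    uint16_t client_port;
    uint16_t service_port;
};

struct service_instance;   /* a processor configured to provide the service */

/* Hypothetical cache and balancing primitives provided elsewhere. */
struct service_instance *cache_lookup(const struct conn_key *k);
void cache_insert(const struct conn_key *k, struct service_instance *s);
struct service_instance *cache_lookup_by_client(uint32_t client_ip); /* sticky sessions */
struct service_instance *load_balance_pick(uint16_t service_port);

/* Route a packet arriving on the cluster IP: existing connections keep
 * their processor; new connections are balanced, with optional
 * client-address stickiness for services that need persistent sessions. */
struct service_instance *vcp_route(const struct conn_key *k, int sticky)
{
    struct service_instance *s = cache_lookup(k);
    if (s != NULL)
        return s;
    if (sticky) {
        s = cache_lookup_by_client(k->client_ip);
        if (s != NULL) {
            cache_insert(k, s);
            return s;
        }
    }
    s = load_balance_pick(k->service_port);
    if (s != NULL)
        cache_insert(k, s);
    return s;
}
```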
  • Replies to ARP requests, and routing of ARP replies, are handled by the VCP. When a processor sends any ARP packet, it will send it out through the Virtual Ethernet driver 310. The packet will then be sent to the virtual LAN Server 335 for normal ARP processing. The virtual LAN server will broadcast the packet as usual, but will make sure it doesn't get broadcast to any member of the cluster (not just the sender). It will also place information in the packet header TLV that indicates to the ARP target that the ARP source can only be reached through the virtual LAN server and specifically through the load balancer. The ARP target, whether internal or external, will process the ARP request normally and send a reply back through the virtual LAN server. Because the source of the ARP was a cluster IP address, the virtual LAN server will be unable to determine which processor sent out the original request. Thus, the virtual LAN Server will send the reply to each cluster member so that they can handle it properly. When an ARP packet is sent by a source with a cluster IP address as the target, the virtual LAN server will send the request to every cluster member. Each cluster member will receive the ARP request and process it normally. They will then compose an ARP reply and send it back to the source via the virtual LAN server. When the virtual LAN server receives any ARP reply from a cluster member it will drop that reply, but the virtual LAN server will compose and send an ARP reply to the ARP source. Thus, the virtual LAN Server will respond to all ARPs of the cluster IP address. The ARP reply will contain the information necessary for the ARP source to send all packets for the cluster IP address to the VCP. For external ARP sources, this will simply be an ARP reply with the external MAC address as the source hardware address. For internal ARP sources this will be the information necessary to tell the source to send packets for the cluster IP address down the virtual LAN management RVI rather than through a directly connected RVI. Any gratuitous ARP packets that are received will be forwarded to all cluster members. Any gratuitous ARP packets sent by a cluster member will be sent normally.
  • Virtual LAN Proxy
  • The virtual LAN Proxy 340 performs the basic coordination of the physical network resources among all the processors that have virtual interfaces to the external physical network 125. It bridges the virtual LAN server 335 to the external network 125. When the external network 125 is running in filtered mode the Virtual LAN Proxy 340 will convert the internal virtual MAC addresses from each node to the single external MAC assigned to the system 100. When the external network 125 is operating in unfiltered mode no such MAC translation is required. The Virtual LAN Proxy 340 also performs insertion and removal of IEEE 802.1Q virtual LAN ID tagging information, and demultiplexes packets based on their VLAN IDs. It also serializes access to the physical Ethernet interface 129 and coordinates the allocation and removal of MAC addresses, such as multicast addresses, on the physical network.
  • When the external network 125 is running in filtered mode and the virtual LAN Proxy 340 receives outgoing packets (ARP or otherwise) from a virtual LAN server 335, it replaces the internal-format MAC address with the MAC address of the physical Ethernet device 129 as the source MAC address. When the external network 125 is running in unfiltered mode no such replacement is required.
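  • The filtered-mode source MAC replacement performed by the Proxy might be sketched as below, operating on a standard Ethernet header; this is an illustration only.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* In filtered mode, replace the internal-format source MAC of an outgoing
 * frame with the single MAC of the physical Ethernet device; in unfiltered
 * mode the frame is passed through untouched. Offsets follow the standard
 * Ethernet header (destination MAC, then source MAC). */
void proxy_fix_source_mac(uint8_t *frame, size_t frame_len,
                          bool filtered_mode, const uint8_t external_mac[6])
{
    if (!filtered_mode || frame_len < 12)
        return;
    memcpy(frame + 6, external_mac, 6);   /* bytes 6..11 are the source MAC */
}
```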
  • When the virtual LAN Proxy 340 receives incoming ARP packets, it moves the packets to the virtual LAN server 335, which handles each packet and relays it on to the correct destination(s). If the ARP packet is a broadcast packet then the packet is relayed to all internal nodes on the virtual LAN. If the packet is a unicast packet the packet is sent only to the destination node. The destination node is determined by the IP address in the ARP packet when the external network 125 is running in filtered mode, or by the MAC address in the Ethernet header of the ARP packet (not the MAC address in the ARP packet) when it is running in unfiltered mode.
  • Physical LAN Driver
  • Under certain embodiments, the connection to the external network 125 is via Gigabit or 100/10baseT Ethernet links connected to the control node. Physical LAN drivers 345 are responsible for interfacing with such links. Packets being sent on the interface will be queued to the device in the normal manner, including placing the packets in socket buffers. The queue used to queue the packets is the one used by the protocol stack to queue packets to the device's transmit routine. For incoming packets, the socket buffer containing the packets will be passed around and the packet data will never be copied (though it will be cloned if needed for multicast operations). Under these embodiments, generic Linux network device drivers may be used in the control node without modification. This facilitates the addition of new devices to the platform without requiring additional device driver work.
  • The physical network interface 345 is in communication only with the virtual LAN proxy 340. This prevents the control node from using the external connection in any way that would interfere with the operation of the virtual LANs and improves security and isolation of user data, i.e., an administrator may not “sniff” any user's packets.
  • Load Balancing and Failover
  • Under some embodiments, the redundant connections to the external network 125 will be used alternately to load balance packet transmission between two redundant interfaces to the external network 125. Other embodiments load balance by configuring each virtual network interface on alternating control nodes so the virtual interfaces are evenly distributed between the two control nodes. Another embodiment transmits through one control node and receives through another.
  • When in filtered mode, there will be one externally visible MAC address to which external nodes transmit packets for a set of virtual network interfaces. If that adapter goes down, then not only do the virtual network interfaces have to fail over to the other control node, but the MAC address must fail over too, so that external nodes can continue to send packets to the MAC address already in their ARP caches. Under one embodiment of the invention, when a failed control node recovers, only a single MAC address is manipulated, and the MAC address does not have to be remapped on recovery.
  • Under another embodiment of the invention, load balancing is performed by allowing transmission on both control nodes but only reception through one. The failover case is both send and receive through the same control node. The recovery case is transmission through the recovered control node since that doesn't require any MAC manipulation.
  • The control node doing reception has IP information for filtering and multicast address information for multicast MAC configuration. This information is needed to process incoming packets and should be failed over should the receiving control node fail. If the transmitting control node fails, virtual network drivers need only start sending outgoing packets to the receiving control node. No special failover processing is required other than the recognition that the transmitting control node has failed. If the failed control node recovers, the virtual network drivers can resume sending outgoing packets to the recovered control node without any additional special recovery processing. If the receiving control node fails then the transmitting control node must assume the receiving interface role. To do this, it must configure all MAC addresses on its physical interface to enable packet reception. Alternately, both control nodes could have the same MAC address configured on their interfaces, but receives could be physically disabled on the Ethernet device by the device driver until a control node is ready to receive packets. Then failover would simply enable receives on the device.
  • Because the interfaces must be configured with multicast MAC addresses when any processor has joined a multicast group, multicast information must be shared between control nodes so that failover will be transparent to the processor. Since the virtual network drivers will have to keep track of multicast group membership anyway, this information will always be available to a LAN Proxy via the virtual LAN server when needed. Thus, a receive failover will result in multicast group membership being queried from the virtual network drivers to rebuild the local multicast group membership tables. This operation is low overhead, requires no special processing except during failover and recovery, and doesn't require any special replication of data between control nodes. When receive has failed over and the failed control node recovers, only transmissions will be moved over to the recovered control node. Thus, the algorithm for recovery on virtual network interfaces is to always move transmissions to the recovered control node and leave receive processing where it is.
  • Virtual service clusters may also use load balancing and failover.
  • Multicabinet Platforms
  • Some embodiments allow cabinets to be connected together to form larger platforms. Each cabinet will have at least one control node, which will be used for inter-cabinet connections. Each control node will include a virtual LAN server 335 to handle local connections and traffic. One of the servers is configured to be a master, such as the one located on the control node with the external connection for the virtual LAN. The other virtual LAN servers will act as proxy servers, or slaves, so that the local processors of those cabinets can participate. The master maintains all virtual LAN state and control while the proxies relay packets between the processors and masters.
  • Each virtual LAN server proxy maintains an RVI to each master virtual LAN Server. Each local processor will connect to the virtual LAN Server proxy just as if it were a master. When a processor connects and registers an IP and MAC address, the proxy will register that IP and MAC address with the master. This will cause the master to bind the addresses to the RVI from the proxy. Thus, the master will contain RVI bindings for all internal nodes, but proxies will contain bindings only for nodes in the same cabinet.
  • When a processor anywhere in a multicabinet virtual LAN sends any packet to its virtual LAN server, the packet will be relayed to the master for processing. The master will then do normal processing on the packet. The master will relay packets to the proxies as necessary for multicast and broadcast. The master will also relay unicast packets based on the destination IP address of the unicast packet and the registered IP addresses on the proxies. Note that on the master, a proxy connection looks very much like a node with many configured IP addresses.
  • Networking Management Logic
  • During times when there is no operating system running on a processing node, such as during booting or kernel debugging, the node's serial console traffic and boot image requests are routed by switch driver code located in the processing node's kernel debugging software or BIOS to management software running on a control node (not shown). From there, the console traffic can again be accessed either from the high-speed external network 125 or through the control node's management ports. The boot image requests can be satisfied from either the control node's local disks or from partitions out on the external SAN 130. The control node 120 is preferably booted and running normally before anything can be done to a processing node. The control node is itself booted or debugged from its management ports.
  • Some customers may wish to restrict booting and debugging of controllers to local access only, by plugging their management ports into an on-site computer when needed. Others may choose to allow remote booting and debugging by establishing a secure network segment for management purposes, suitably isolated from the Internet, into which to plug their management ports. Once a controller is booted and running normally, all other management functions for it and for the rest of the platform can be accessed from the high-speed external network 125 as well as the management ports, if permitted by the administrator.
  • Serial console traffic to and from each processing node 105 is sent by an operating system kernel driver over the switch fabric 115 to management software running on a control node 120. From there, any node's console traffic can be accessed either from the normal, high-speed external network 125 or through either of the control node's management ports.
  • Storage Architecture
  • Certain embodiments follow a SCSI model of storage. Each virtual PAN has its own virtualized I/O space and issues SCSI commands and status within such space. Logic at the control node translates or transforms the addresses and commands as necessary from a PAN and transmits them accordingly to the SAN 130 which services the commands. From the perspective of the SAN, the client is the platform 100 and the actual PANs that issued the commands are hidden and anonymous. Because the SAN address space is virtualized, one PAN operating on the platform 100 may have device numbering starting with a device number 1, and a second PAN may also have a device number 1. Yet each of the device number 1s will correspond to a different, unique portion of SAN storage.
  • Under preferred embodiments, an administrator can build virtual storage. Each of the PANs will have its own independent perspective of mass storage. Thus, as will be explained below, a first PAN may have a given device/LUN address map to a first location in the SAN, and a second PAN may have the same given device/LUN map to a second, different location in the SAN. Each processor maps a device/LUN address into a major and minor device number, to identify a disk and a partition, for example. Though the major and minor device numbers are perceived as a physical address by the PAN and the processors within a PAN, in effect they are treated by the platform as a virtual address to the mass storage provided by the SAN. That is, the major and minor device numbers of each processor are mapped to corresponding SAN locations.
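  • A per-PAN mapping table of the kind described, translating the (major, minor) device numbers a PAN perceives as physical into SAN locations, might be modeled as follows; the structure fields are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative per-PAN table mapping a processor's (major, minor) device
 * numbers to locations on the external SAN. Two PANs can both expose a
 * "device 1" that resolves to different SAN storage. */
struct san_location {
    uint32_t san_target;      /* e.g., Fibre Channel target */
    uint32_t san_lun;
    uint64_t offset;          /* start of the allocated partition */
};

struct device_mapping {
    uint16_t major, minor;    /* virtual address as seen by the PAN */
    struct san_location loc;
};

struct pan_storage_map {
    struct device_mapping *entries;
    size_t count;
};

/* Translate a PAN-visible device number to its SAN location, or NULL if
 * the PAN has no such device configured (access control). */
const struct san_location *resolve_device(const struct pan_storage_map *m,
                                          uint16_t major, uint16_t minor)
{
    for (size_t i = 0; i < m->count; i++)
        if (m->entries[i].major == major && m->entries[i].minor == minor)
            return &m->entries[i].loc;
    return NULL;
}
```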
  • FIG. 6 illustrates the software components used to implement the storage architecture of certain embodiments. A configuration component 605, typically executed on a control node 120, is in communication with external SAN 130. A management interface component 610 provides an interface to the configuration component 605 and is in communication with IP network 125 and thus with remote management logic 135 (see FIG. 1). Each processor 106 in the system 100 includes an instance of processor-side storage logic 620. Each such instance 620 communicates via two RVI connections 625 with a corresponding instance of control node-side storage logic 615.
  • In short, the configuration component 605 and interface 610 are responsible for discovering those portions of SAN storage that are allocated to the platform 100 and for allowing an administrator to suballocate portions to specific PANs or processors 106. Storage configuration logic 605 is also responsible for communicating the SAN storage allocations to control node-side logic 615. The processor-side storage logic 620 is responsible for communicating the processor's storage requests over the internal interconnect 110 and storage fabric 115 via dedicated RVIs 625 to the control node-side logic 615. The requests will contain, under certain embodiments, virtual storage addresses and SCSI commands. The control node-side logic is responsible for receiving and handling such commands by identifying the corresponding actual address for the SAN and converting the commands and protocol to the appropriate form for the SAN, for example Fibre Channel (Gigabit Ethernet with iSCSI is another exemplary form of connectivity).
  • Configuration Component
  • The configuration component 605 determines which elements in the SAN 130 are visible to each individual processor 106. It provides a mapping function that translates the device numbers (e.g., SCSI target and LUN) that the processor uses into the device numbers visible to the control nodes through their attached SCSI and Fibre Channel I/O interfaces 128. It also provides an access control function, which prevents processors from accessing external storage devices which are attached to the control nodes but not included in the processors' configuration. The model that is presented to the processor (and to the system administrator and applications/users on that processor) makes it appear as if each processor has its own mass storage devices attached to interfaces on the processor.
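  • The sketch below illustrates, under assumed names (StorageConfig, grant, translate) that are not the platform's actual interfaces, the two roles just described: translating the (target, LUN) numbers a processor uses into the device numbers visible to the control node, and denying access to attached devices that are outside the processor's configuration.

      class StorageConfig:
          """Per-processor view of SAN storage, maintained at a control node."""
          def __init__(self):
              # (processor_id, target, lun) -> (control-node device, control-node LUN)
              self._map = {}

          def grant(self, processor_id, target, lun, ctrl_device, ctrl_lun):
              self._map[(processor_id, target, lun)] = (ctrl_device, ctrl_lun)

          def translate(self, processor_id, target, lun):
              """Translate a processor's (target, lun) address, or reject it."""
              try:
                  return self._map[(processor_id, target, lun)]
              except KeyError:
                  # The device may be attached to the control node, but it is not
                  # part of this processor's configuration: access is denied.
                  raise PermissionError(f"processor {processor_id} may not access "
                                        f"target {target} LUN {lun}")

      cfg = StorageConfig()
      cfg.grant(processor_id=3, target=0, lun=1, ctrl_device="fc0", ctrl_lun=17)
      print(cfg.translate(3, 0, 1))   # ('fc0', 17)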
  • Among other things, this functionality allows the software on a processor 106 to be moved to another processor easily. For example, in certain embodiments, the control node via software (without any physical re-cabling) may change the PAN configurations to allow a new processor to access the required devices. Thus, a new processor may be made to inherit the storage personality of another.
  • Under certain embodiments, the control nodes appear as hosts on the SANs, though alternative embodiments allow the processors to act as such.
  • As outlined above, the configuration logic discovers the SAN storage allocated to the platform 100 (for example, during platform boot) and this pool is subsequently allocated by an administrator. If discovery is activated later, the control node that performs the discovery operation compares the new view with the prior view. Newly available storage is added to the pool of storage that may be allocated by an administrator. Partitions that disappear and were not assigned are removed from the available pool of storage that may be allocated to PANs. Partitions that disappear but were assigned trigger error messages.
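  • One way to picture that reconciliation step is the set-based sketch below; the function and variable names (reconcile, free_pool, assigned) are assumptions for illustration, not the control node's implementation.

      def reconcile(prior_view, new_view, free_pool, assigned):
          """Compare a fresh SAN discovery against the prior view.

          prior_view, new_view : sets of partition identifiers visible on the SAN
          free_pool            : partitions available for allocation to PANs
          assigned             : partitions already allocated to PANs
          """
          appeared = new_view - prior_view
          disappeared = prior_view - new_view

          free_pool |= appeared                    # newly available storage joins the pool
          free_pool -= (disappeared - assigned)    # unassigned partitions vanish quietly

          errors = disappeared & assigned          # assigned partitions vanishing is an error
          for part in errors:
              print(f"error: assigned partition {part} is no longer visible on the SAN")
          return free_pool, errors

      free, errs = reconcile({"p1", "p2", "p3"}, {"p2", "p3", "p4"},
                             free_pool={"p1"}, assigned={"p3"})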
  • Management Interface Component
  • The configuration component 605 allows management software to access and update the information which describes the device mapping between the devices visible to the control nodes 120 and the virtual devices visible to the individual processors 106. It also allows access to control information. The assignments may be identified by the processing node in conjunction with an identification of the simulated SCSI disks, e.g., by name of the simulated controller, cable, unit, or logical unit number (LUN).
  • Under certain embodiments the interface component 610 cooperates with the configuration component to gather and monitor information and statistics, such as the following (a sketch of one possible record format appears after this list):
      • Total number of I/O operations performed
      • Total number of bytes transferred
      • Total number of read operations performed
      • Total number of write operations performed
      • Total amount of time I/O was in progress
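  • One plausible shape for such a per-device record, sketched with assumed field names rather than the interface component's actual schema, is:

      from dataclasses import dataclass
      import time

      @dataclass
      class DeviceStats:
          """Counters the interface component might keep per virtual device."""
          io_operations: int = 0
          bytes_transferred: int = 0
          reads: int = 0
          writes: int = 0
          busy_seconds: float = 0.0

          def record(self, op, nbytes, elapsed):
              self.io_operations += 1
              self.bytes_transferred += nbytes
              if op == "read":
                  self.reads += 1
              elif op == "write":
                  self.writes += 1
              self.busy_seconds += elapsed

      stats = DeviceStats()
      start = time.monotonic()
      stats.record("read", 4096, time.monotonic() - start)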
    Processor-side Storage Logic
  • The processor-side logic 620 of the protocol is implemented as a host adapter module that emulates a SCSI subsystem by providing a low-level virtual interface to the operating system on the processors 106. The processors 106 use this virtual interface to send SCSI I/O commands to the control nodes 120 for processing.
  • Under embodiments employing redundant control nodes 120, each processing node 105 will include one instance of logic 620 per control node 120. Under certain embodiments, the processors refer to storage using physical device numbering, rather than logical. That is, the address is specified as a device name to identify the LUN, the SCSI target, channel, host adapter, and control node 120 (e.g., node 120a or 120b). As shown in FIG. 8, one embodiment maps the target (T) and LUN (L) to a host adapter (H), channel (C), mapped target (mT), and mapped LUN (mL).
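  • The FIG. 8 mapping can be pictured as a lookup keyed on the (target, LUN) pair that the processor names; the table contents below are invented for illustration.

      # (T, L) as named by the processor -> (H, C, mT, mL) used toward the SAN.
      TL_TO_HCML = {
          (0, 0): ("host0", "chan0", 12, 0),   # e.g. boot disk
          (0, 1): ("host0", "chan0", 12, 1),   # e.g. data partition
          (1, 0): ("host1", "chan2", 5, 3),    # e.g. reached via the other control node
      }

      def map_address(target, lun):
          host, channel, mapped_target, mapped_lun = TL_TO_HCML[(target, lun)]
          return host, channel, mapped_target, mapped_lun

      print(map_address(0, 1))   # ('host0', 'chan0', 12, 1)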
  • FIG. 7 shows an exemplary architecture for processor side logic 720. Logic 720 includes a device-type-specific driver (e.g., a disk driver) 705, a mid-level SCSI I/O driver 710, and wrapper and interconnect logic 715.
  • The device-type-specific driver 705 is a conventional driver provided with the operating system and associated with specific device types.
  • The mid-level SCSI I/O driver 710 is a conventional mid-level driver that is called by the device-type-specific driver 705 once the driver 705 determines that the device is a SCSI device.
  • The wrapper and interconnect logic 715 is called by the mid-level SCSI I/O driver 710. This logic provides the SCSI subsystem interface and thus emulates the SCSI subsystem. In certain embodiments that use the Giganet fabric, logic 715 is responsible for wrapping the SCSI commands as necessary and for interacting with the Giganet and RCLAN interface to cause the NIC to send the packets to the control nodes via the dedicated RVIs to the control nodes, described above. The header information for the Giganet packet is modified to indicate that this is a storage packet and includes other information, described below in context. Though not shown in FIG. 7, wrapper logic 715 may use the RCLAN layer to support and utilize redundant interconnects 110 and fabrics 115.
  • For embodiments that use Giganet fabric 115, the RVIs of connection 725 are assigned virtual interface (VI) numbers from the range of 1024 available VIs. For the two endpoints to communicate, the switch 115 is programmed with a bi-directional path between the pair (control node switch port, control node VI number) and the pair (processor node 105 switch port, processor node VI number).
  • A separate RVI is used for each type of message sent in either direction. Thus, there is always a receive buffer pending on each RVI for a message that can be sent from the other side of the protocol. In addition, since only one type of message is sent in either direction on each RVI, the receive buffers posted to each of the RVI channels can be sized appropriately for the maximum message length that the protocol will use for that type of message. Under other embodiments, all of the possible message types are multiplexed onto a single RVI, rather than using two RVIs. The protocol and the message format do not specifically require the use of two RVIs, and the messages themselves have message type information in their headers so that they could be demultiplexed.
  • One of the two channels is used to exchange SCSI commands (CMD) and status (STAT) messages. The other channel is used to exchange buffer (BUF) and transmit (TRAN) messages. This channel is also used to handle data payloads of SCSI commands.
  • CMD messages contain control information, the SCSI command to be performed, and the virtual addresses and sizes of I/O buffers in the node 105. STAT messages contain control information and a completion status code reflecting any errors that may have occurred while processing the SCSI command. BUF messages contain control information and the virtual addresses and sizes of I/O buffers in the control node 120. TRAN messages contain control information and are used to confirm successful transmission of data from node 105 to the control node 120.
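  • The four message types can be summarized with dataclasses such as the following; the field names are assumptions drawn from the descriptions above, not a wire format defined by the platform.

      from dataclasses import dataclass
      from typing import List, Tuple

      Region = Tuple[int, int]        # (virtual address, size) of an I/O buffer

      @dataclass
      class CmdMsg:                   # processor -> control node
          sequence: int
          scsi_cdb: bytes             # the SCSI command to be performed
          regions: List[Region]       # I/O buffers in the processing node

      @dataclass
      class StatMsg:                  # control node -> processor
          sequence: int
          status: int                 # completion status of the SCSI command
          sense: bytes = b""          # REQUEST SENSE data on CHECK CONDITION

      @dataclass
      class BufMsg:                   # control node -> processor
          sequence: int
          regions: List[Region]       # temporary I/O buffers in the control node

      @dataclass
      class TranMsg:                  # processor -> control node
          sequence: int               # confirms completion of the RDMA WRITE transfer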
  • The processor side wrapper logic 715 examines the SCSI command to be sent to determine if the command requires the transfer of data and, if so, in what direction. Depending on the analysis, the wrapper logic 715 sets appropriate flag information in the message header accordingly. The section describing the control node-side logic describes how the flag information is utilized.
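  • In outline, that analysis amounts to classifying the SCSI opcode before the message is sent. The opcode values below are standard SCSI; the flag names and bit assignments are illustrative assumptions.

      FLAG_DATA_IN  = 0x1   # device -> host (e.g. READ): control node RDMA WRITEs data back
      FLAG_DATA_OUT = 0x2   # host -> device (e.g. WRITE): expect the BUF/TRAN exchange

      READ_OPCODES  = {0x08, 0x28, 0xA8}   # READ(6), READ(10), READ(12)
      WRITE_OPCODES = {0x0A, 0x2A, 0xAA}   # WRITE(6), WRITE(10), WRITE(12)

      def direction_flags(cdb: bytes) -> int:
          """Choose header flags from the data-transfer phase of the command."""
          opcode = cdb[0]
          if opcode in READ_OPCODES:
              return FLAG_DATA_IN
          if opcode in WRITE_OPCODES:
              return FLAG_DATA_OUT
          return 0                         # e.g. TEST UNIT READY: no data phase

      assert direction_flags(bytes([0x28, 0, 0, 0, 0, 0, 0, 0, 8, 0])) == FLAG_DATA_IN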
  • Under certain embodiments of the invention, the link 725 between processor-side storage logic 720 and control node-side storage logic 715 may be used to convey control messages that are not part of the SCSI protocol and are not to be communicated to the SAN 130. Instead, these control messages are handled by the control node-side logic 715.
  • The protocol control messages are always generated by the processor-side of the protocol and sent to the control node-side of the protocol over one of two virtual interfaces (VIs) connecting the processor-side logic 720 to the control node-side storage logic 715. The message header used for protocol control operations is the same as a command message header, except that different flag bits are used to distinguish the message as a protocol control message. The control node 120 performs the requested operation and responds over the RVI with a message header that is the same as is used by a status message. In this fashion, a separate RVI for the infrequently used protocol control operations is not needed.
  • Under certain embodiments using redundant control nodes, the processor-side logic 720 detects certain errors from issued commands and in response re-issues the command to the other control node. This retry may be implemented in a mid-level driver 710.
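  • A minimal sketch of that retry behavior, with invented helper names (send_to, RETRYABLE) standing in for the driver's actual error handling:

      RETRYABLE = {"link_down", "no_response"}     # illustrative error classes

      def issue_with_failover(command, control_nodes, send_to):
          """Try the command on each redundant control node in turn.

          send_to(node, command) is assumed to return the command result or to
          raise an exception whose text names one of the RETRYABLE error classes.
          """
          last_error = None
          for node in control_nodes:
              try:
                  return send_to(node, command)
              except Exception as err:      # broad catch, sketch only
                  if str(err) not in RETRYABLE:
                      raise
                  last_error = err          # fall through to the other control node
          raise RuntimeError(f"command failed on all control nodes: {last_error}")

      result = issue_with_failover({"cdb": "TEST UNIT READY"}, ["cn-a", "cn-b"],
                                   send_to=lambda node, cmd: ("GOOD", b""))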
  • Control Node-side Storage Logic
  • Under certain embodiments, the control node-side storage logic 715 is implemented as a device driver module. The logic 715 provides a device-level interface to the operating system on the control nodes 120. This device-level interface is also used to access the configuration component 605. When this device driver module is initialized, it responds to protocol messages from all of the processors 106 in the platform 100. All of the configuration activity is introduced through the device-level interface. All of the I/O activity is introduced through messages that are sent and received through the interconnect 110 and switch fabric 115. On the control node 120, there will be one instance of logic 715 per processor node 105 (though it is only shown as one box in FIG. 7). Under certain embodiments, the control node-side logic 715 communicates with the SAN 130 via FCP or FCP-2 protocols, or via iSCSI or other protocols that use the SCSI-2 or SCSI-3 command set over various media.
  • As described above, the processor-side logic sets flags in the RVI message headers indicating whether data flow is associated with the command and, if so, in which direction. The control node-side storage logic 715 receives messages from the processor-side logic and then analyzes the header information to determine how to act, e.g., to allocate buffers or the like. In addition, the logic translates the address information contained in the messages from the processor to the corresponding, mapped SAN address and issues the commands (e.g., via FCP or FCP-2) to the SAN 130.
  • A SCSI command such as a TEST UNIT READY command, which does not require a SCSI data transfer phase, is handled by the processor-side logic 720 sending a single command message on the RVI used for command messages, and by the control node-side logic sending a single status message back over the same RVI. More specifically, the processor-side of the protocol constructs the message with a standard message header, a new sequence number for this command, the desired SCSI target and LUN, the SCSI command to be executed, and a list size of zero. The control node-side logic receives the message, extracts the SCSI command information and conveys it to the SAN 130 via interface 128. After the control node has received the command completion callback, it constructs a status message to the processor using a standard message header, the sequence number for this command, the status of the completed command, and optionally the REQUEST SENSE data if the command completed with a CHECK CONDITION status.
  • A SCSI command such as a READ command, which requires a SCSI data transfer phase to transfer data from the SCSI device into the host memory, is handled by the processor-side logic sending a command message to the control node-side logic 715, the control node responding with one or more RDMA WRITE operations into memory in the processor node 105, and a single status message from the control node-side logic. More specifically, the processor-side logic 720 constructs a command message with a standard message header, a new sequence number for this command, the desired SCSI target and LUN, the SCSI command to be executed, and a list of regions of memory where the data from the command is to be stored. The control node-side logic 715 allocates temporary memory buffers to store the data from the SCSI operation while the SCSI command is executing on the control node. After the control node-side logic 715 has sent the SCSI command to the SAN 130 for processing and the command has completed, it sends the data back to the processor node 105 memory with a sequence of one or more RDMA WRITE operations. It then constructs a status message with a standard message header, the sequence number for this command, the status of the completed command, and optionally the REQUEST SENSE data if the command completed with a SCSI CHECK CONDITION status.
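  • Reduced to a toy model, the READ exchange looks roughly like the sketch below; the queues, rdma_write and issue_to_san are stand-ins for the Giganet primitives and SAN interface, not actual driver calls.

      def read_flow(cmd_queue, stat_queue, rdma_write, issue_to_san):
          """Toy model of the READ sequence between the two sides of the protocol."""
          # Processor side: one CMD message naming the destination buffer regions.
          cmd = {"seq": 42, "cdb": "READ(10)", "regions": [(0x7F000000, 8192)]}
          cmd_queue.append(cmd)

          # Control-node side: run the command on the SAN, then push the data back
          # into processor memory with one or more RDMA WRITE operations.
          req = cmd_queue.pop(0)
          data = issue_to_san(req["cdb"])
          for addr, size in req["regions"]:
              rdma_write(addr, data[:size])
              data = data[size:]
          stat_queue.append({"seq": req["seq"], "status": "GOOD"})

          # Processor side: a single STAT message completes the command.
          return stat_queue.pop(0)

      result = read_flow([], [], rdma_write=lambda addr, buf: None,
                         issue_to_san=lambda cdb: b"\0" * 8192)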
  • A SCSI command such as a WRITE command, which requires a SCSI data transfer phase to transfer data from the host memory to the SCSI device, is handled by the processor-side logic 720 sending a single command message to the control node-side logic 715, one or more BUF messages from the control node-side logic 715 to the processor-side logic, one or more RDMA WRITE operations from the processor-side storage logic into memory in the control node, one or more TRAN messages from the processor-side logic to the control node-side logic, and a single status message from the control node-side logic back to the processor-side logic. The use of BUF messages to communicate the location of temporary buffer memory in the control node to the processor-side storage logic, and the use of TRAN messages to indicate completion of the RDMA WRITE data transfer, is due to the lack of RDMA READ capability in the underlying Giganet fabric. If the underlying fabric supports RDMA READ operations, a different sequence of corresponding actions may be employed. More specifically, the processor-side logic 720 constructs a CMD message with a standard message header, a new sequence number for this command, the desired SCSI target and LUN, and the SCSI command to be executed. The control node-side logic 715 allocates temporary memory buffers to store the data from the SCSI operation while the SCSI command is executing on the control node. The control node-side of the protocol then constructs a BUF message with a standard message header, the sequence number for this command, and a list of regions of virtual memory which are used for the temporary memory buffers on the control node. The processor-side logic 720 then sends the data over to the control node memory with a sequence of one or more RDMA WRITE operations. It then constructs a TRAN message with a standard message header and the sequence number for this command. After the control node-side logic has sent the SCSI command to the SAN 130 for processing and has received the command completion, it constructs a STAT message with a standard message header, the sequence number for this command, the status of the completed command, and optionally the REQUEST SENSE data if the command completed with a CHECK CONDITION status.
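  • The WRITE exchange adds the BUF/TRAN round trip because the fabric cannot pull data with an RDMA READ. The sketch below models the five message steps as plain function calls with invented names; it is not the driver implementation.

      def write_flow(payload, rdma_write_to_ctrl, issue_to_san):
          """Toy model of the WRITE sequence: CMD, BUF, RDMA WRITE, TRAN, STAT."""
          seq = 43

          # 1. Processor side sends CMD; it cannot push the data yet.
          cmd = {"seq": seq, "cdb": "WRITE(10)", "length": len(payload)}

          # 2. Control-node side allocates temporary buffers and answers with BUF.
          buf = {"seq": seq, "regions": [("ctrl-buf-0", cmd["length"])]}

          # 3. Processor side RDMA WRITEs the data into those buffers, then
          # 4. sends TRAN to indicate that the transfer is complete.
          for region, size in buf["regions"]:
              rdma_write_to_ctrl(region, payload[:size])
              payload = payload[size:]
          tran = {"seq": seq}

          # 5. Control-node side issues the command to the SAN and returns STAT.
          status = issue_to_san(cmd["cdb"])
          return {"seq": tran["seq"], "status": status}

      stat = write_flow(b"x" * 4096,
                        rdma_write_to_ctrl=lambda region, buf: None,
                        issue_to_san=lambda cdb: "GOOD")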
  • Under some embodiments, the CMD message contains a list of regions of virtual memory from where the data for the command is stored. The BUF and TRAN messages also contain an index field, which allows the control node-side of the protocol to send a separate BUF message for each entry in the region list in the CMD message. The processor-side of the protocol would respond to such a message by performing RDMA WRITE operations for the amount of data described in the BUF message, followed by a TRAN message to indicate the completion of a single segment of data transfer.
  • The protocol between the processor-side logic 720 and the control node-side logic 715 allows for scatter-gather I/O operations. This functionality allows the data involved in an I/O request to be read from or written to several distinct regions of virtual and/or physical memory. This allows multiple, non-contiguous buffers to be used for the request on the control node.
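  • Scatter-gather simply means that the region list may name several non-contiguous buffers; a gather step on one side of the protocol could be sketched as follows (the names are illustrative).

      def gather(memory, regions):
          """Collect data from several non-contiguous (offset, size) regions."""
          return b"".join(memory[off:off + size] for off, size in regions)

      memory = bytes(range(256))
      regions = [(0, 16), (64, 8), (200, 4)]     # three separate buffers
      assert len(gather(memory, regions)) == 28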
  • As stated above, the configuration logic 605 is responsible for discovering the SAN storage allocated to the platform and for interacting with the interface logic 610 so that an administrator may suballocate the storage to specific PANs. As part of this allocation, the configuration component 605 creates and maintains a storage data structure 915 that includes information identifying the correspondence between processor addresses and actual SAN addresses. FIG. 9 shows such a structure. The correspondence, as described above, may be between the processing node and the identification of the simulated SCSI disks, e.g., by name of the simulated controller, cable, unit, or logical unit number (LUN).
  • Management Logic
  • Management logic 135 is used to interface with the control node software to provision the PANs. Among other things, the logic 135 allows an administrator to establish the virtual network topology of a PAN, its visibility to the external network (e.g., as a service cluster), and the types of devices on the PAN, e.g., bridges and routers.
  • The logic 135 also interfaces with the storage management interface logic 610 so that an administrator may define the storage for a PAN during initial allocation or subsequently. The configuration definition includes the storage correspondence (SCSI to SAN) discussed above and the access control permissions.
  • As described above, each of the PANs and each of the processors will have a personality defined by its virtual networking (including a virtual MAC address) and virtual storage. The structures that record such personality may be accessed by management logic, as described below, to implement processor clustering. In addition, they may be accessed by an administrator as described above, or by an administrative agent. An agent, for example, may be used to re-configure a PAN in response to certain events, such as time of day or year, or in response to certain loads on the system.
  • The operating system software at a processor includes serial console driver code to route console I/O traffic for the node over the Giganet switch 115 to management software running on a control node. From there, the management software can make any node's console I/O stream accessible via the control node's management ports (its low-speed Ethernet port and its Emergency Management Port) or via the high-speed external network 125, as permitted by an administrator. Console traffic can be logged for audit and history purposes.
  • Cluster Management Logic
  • FIG. 9 illustrates the cluster management logic of certain embodiments. The cluster management logic 905 accesses the data structures 910 that record the networking information described above, such as the network topologies of PANs, the MAC address assignments within a PAN and so on. In addition, the cluster management logic 905 accesses the data structures 915 that record the storage correspondence of the various processors 106. Moreover, the cluster management logic 905 accesses a data structure 920 that records free resources such as unallocated processors within the platform 100.
  • In response to processor error events or administrator commands, the cluster management logic 905 can change the data structures to cause the storage and networking personalities of a given processor to “migrate” to a new processor. In this fashion, the new processor “inherits” the personality of the former processor. The cluster management logic 905 may be caused to do this to swap a new processor into a PAN to replace a failing one.
  • The new processor will inherit the MAC address of a former processor and act like the former. The control node will communicate the connectivity information when the new processor boots, and will update the connectivity information for the non-failing processors as needed. For example, in certain embodiments, the RVI connections for the other processors are updated transparently; that is, the software on the other processors does not need to be involved in establishing connectivity to the newly swapped in processor. Moreover, the new processor will inherit the storage correspondence of the former and consequently inherit the persisted state of the former processor.
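  • In outline, the “migration” amounts to re-pointing the personality records at a spare node. The dictionaries and function below are an illustrative model of the role of data structures 910, 915 and 920, not their actual layout.

      def migrate_personality(failed, network_db, storage_db, free_pool):
          """Move a failed processor's MAC and storage identity to a spare node."""
          if not free_pool:
              raise RuntimeError("no spare processors available in the platform pool")
          spare = free_pool.pop()

          # The spare inherits the virtual MAC address and PAN membership ...
          network_db[spare] = network_db.pop(failed)
          # ... and the storage correspondence, and with it the persisted SAN state.
          storage_db[spare] = storage_db.pop(failed)
          return spare

      network_db = {"cpu-7": {"mac": "02:00:00:00:07:01", "pan": "pan-a"}}
      storage_db = {"cpu-7": {(8, 1): ("wwn-0001", 4)}}
      spare = migrate_personality("cpu-7", network_db, storage_db, free_pool=["cpu-12"])
      print(spare, network_db[spare]["mac"])     # cpu-12 inherits 02:00:00:00:07:01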
  • Among other advantages this allows a free pool of resources, including processors, to be shared across the entire platform rather than across given PANs. In this way, the free resources (which may be kept as such to improve reliability and fault tolerance of the system) may be used more efficiently.
  • When a new processor is “swapped in,” it will need to re-ARP to learn IP-address-to-MAC-address associations.
  • Alternatives
  • As each Giganet port of the switch fabric 115 can support 1024 simultaneous Virtual Interface connections over it and keep them separate from each other with hardware protection, the operating system can safely share a node's Giganet ports with application programs. This would allow direct connection between application programs without the need to run through the full stack of driver code. To do this, an operating system call would establish a Virtual Interface channel and memory-map its buffers and queues into application address space. In addition, a library to encapsulate the low-level details of interfacing to the channel would facilitate use of such Virtual Interface connections. The library could also automatically establish redundant Virtual Interface channel pairs and manage sharing or failing over between them, without requiring any effort or awareness from the calling application.
  • The embodiments described above emulate Ethernet internally over an ATM-like fabric. The design may be changed to use an internal Ethernet fabric, which would simplify much of the architecture, e.g., by obviating the need for emulation features. If the external network communicates according to ATM, another variation would use ATM internally without emulation of Ethernet, and the ATM traffic could be communicated externally to the external network when so addressed. Yet another variation would use ATM internally to the platform (i.e., without emulation of Ethernet) and transform only external communications to Ethernet. This would streamline internal communications but require emulation logic at the controller.
  • Certain embodiments deploy PANs based on software configuration commands. It will be appreciated that deployment may also be based on programmatic control. For example, more processors may be deployed under software control during peak hours of operation for a given PAN, or more or less storage space may be allocated to a PAN under software algorithmic control.
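  • A toy policy of that sort might look like the following; the thresholds and the deploy/release helpers are purely illustrative assumptions.

      def desired_processor_count(hour, load, base=2, peak_extra=2, load_limit=0.8):
          """Pick a processor count for a PAN from time of day and observed load."""
          count = base
          if 9 <= hour < 18:            # business hours: provision extra capacity
              count += peak_extra
          if load > load_limit:         # react to sustained load regardless of time
              count += 1
          return count

      def scale_pan(pan, hour, load, deploy, release, current):
          """Deploy or release processors until the PAN matches the desired count."""
          target = desired_processor_count(hour, load)
          while current < target:
              deploy(pan); current += 1
          while current > target:
              release(pan); current -= 1
          return current

      scale_pan("pan-a", hour=10, load=0.5,
                deploy=lambda p: None, release=lambda p: None, current=2)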
  • It will be appreciated that the scope of the present invention is not limited to the above described embodiments, but rather is defined by the appended claims; and that these claims will encompass modifications of and improvements to what has been described.

Claims (3)

1. A platform for automatically deploying virtual processing area networks (PANs) in response to software commands, wherein each virtual PAN may communicate with an external internet protocol (IP) communication network and wherein each virtual PAN may communicate with an external storage network having an external storage address space, said platform comprising:
a pre-wired switching fabric independent of and isolated from said external IP communication network, and independent of and isolated from said external storage network;
a plurality of computer processor nodes each pre-wired and in fixed connection to said switching fabric, each computer processor node having at least one computer processor and memory storage, and wherein each computer processor node includes a programmable virtual MAC address;
configuration logic for receiving software commands specifying virtual PANs, each virtual PAN specification having (i) a specified number of computer processors, (ii) a defined communication interconnectivity and switching functionality among the computer processors of the virtual PAN, said interconnectivity including a layer 2 data link interconnectivity having MAC address assignments for the computer processors of the virtual PAN, and (iii) a virtual storage space for the virtual PAN;
wherein said configuration logic includes logic to program a corresponding set of computer processors and said switching fabric to completely establish without human intervention the defined communication interconnectivity and switching functionality for a virtual PAN specification, including programming said computer processor nodes with the MAC address assignments for the virtual PAN;
wherein said configuration logic includes logic to migrate MAC address assignments from a first set of computer processors to a second set of computer processors to re-deploy the virtual PAN on the platform, said re-deployment being automatic and without human intervention for re-wiring the switch fabric to establish the specified data link connectivity of the virtual PAN;
wherein all network I/O requests and storage requests of any computer processor of a virtual PAN are transmitted via said switching fabric;
wherein network I/O requests of any computer processor of a virtual PAN, addressed to an entity on said external IP communication network, are transmitted to said external IP communication network;
wherein all storage requests of any computer processor of a virtual PAN are transmitted to said external storage network; and
wherein the platform supports simultaneous, independent operation of multiple virtual PANs using said switching fabric.
2. The platform of claim 1 wherein the MAC address assignments for a virtual PAN are recorded in computer-readable data structures.
3. The platform of claim 1 wherein computer processor nodes have MAC address assignments programmed during a booting operation.
US11/759,078 2001-04-20 2007-06-06 Reconfigurable, virtual processing system, cluster, network and method Abandoned US20070233825A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/759,078 US20070233825A1 (en) 2001-04-20 2007-06-06 Reconfigurable, virtual processing system, cluster, network and method

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US28529601P 2001-04-20 2001-04-20
US10/038,353 US7231430B2 (en) 2001-04-20 2002-01-04 Reconfigurable, virtual processing system, cluster, network and method
US11/759,078 US20070233825A1 (en) 2001-04-20 2007-06-06 Reconfigurable, virtual processing system, cluster, network and method

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US10/038,353 Continuation US7231430B2 (en) 2001-04-20 2002-01-04 Reconfigurable, virtual processing system, cluster, network and method

Publications (1)

Publication Number Publication Date
US20070233825A1 true US20070233825A1 (en) 2007-10-04

Family

ID=21899447

Family Applications (4)

Application Number Title Priority Date Filing Date
US10/038,353 Active 2024-05-29 US7231430B2 (en) 2001-04-20 2002-01-04 Reconfigurable, virtual processing system, cluster, network and method
US11/759,078 Abandoned US20070233825A1 (en) 2001-04-20 2007-06-06 Reconfigurable, virtual processing system, cluster, network and method
US11/759,076 Abandoned US20070233809A1 (en) 2001-04-20 2007-06-06 Reconfigurable, virtual processing system, cluster, network and method
US11/759,077 Abandoned US20070233810A1 (en) 2001-04-20 2007-06-06 Reconfigurable, virtual processing system, cluster, network and method

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US10/038,353 Active 2024-05-29 US7231430B2 (en) 2001-04-20 2002-01-04 Reconfigurable, virtual processing system, cluster, network and method

Family Applications After (2)

Application Number Title Priority Date Filing Date
US11/759,076 Abandoned US20070233809A1 (en) 2001-04-20 2007-06-06 Reconfigurable, virtual processing system, cluster, network and method
US11/759,077 Abandoned US20070233810A1 (en) 2001-04-20 2007-06-06 Reconfigurable, virtual processing system, cluster, network and method

Country Status (1)

Country Link
US (4) US7231430B2 (en)

Citations (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5197148A (en) * 1987-11-30 1993-03-23 International Business Machines Corporation Method for maintaining data availability after component failure included denying access to others while completing by one of the microprocessor systems an atomic transaction changing a portion of the multiple copies of data
US5208811A (en) * 1989-11-06 1993-05-04 Hitachi, Ltd. Interconnection system and method for heterogeneous networks
US5473599A (en) * 1994-04-22 1995-12-05 Cisco Systems, Incorporated Standby router protocol
US5535338A (en) * 1993-07-28 1996-07-09 3Com Corporation Multifunction network station with network addresses for functional units
US5546535A (en) * 1992-03-13 1996-08-13 Emc Corporation Multiple controller sharing in a redundant storage array
US5818842A (en) * 1994-01-21 1998-10-06 Newbridge Networks Corporation Transparent interconnector of LANs by an ATM network
US5825772A (en) * 1995-11-15 1998-10-20 Cabletron Systems, Inc. Distributed connection-oriented services for switched communications networks
US5835725A (en) * 1996-10-21 1998-11-10 Cisco Technology, Inc. Dynamic address assignment and resolution technique
US5970066A (en) * 1996-12-12 1999-10-19 Paradyne Corporation Virtual ethernet interface
US6003137A (en) * 1996-09-11 1999-12-14 Nec Corporation Virtual group information managing method in bridge for network connection
US6091732A (en) * 1997-11-20 2000-07-18 Cisco Systems, Inc. Method for configuring distributed internet protocol gateways with lan emulation
US6148414A (en) * 1998-09-24 2000-11-14 Seek Systems, Inc. Methods and systems for implementing shared disk array management functions
US6178171B1 (en) * 1997-11-24 2001-01-23 International Business Machines Corporation Route switching mechanisms for source-routed ATM networks
US6195705B1 (en) * 1998-06-30 2001-02-27 Cisco Technology, Inc. Mobile IP mobility agent standby protocol
US6253230B1 (en) * 1998-09-22 2001-06-26 International Business Machines Corporation Distributed scalable device for selecting a server from a server cluster and a switched path to the selected server
US6411625B1 (en) * 1997-02-28 2002-06-25 Nec Corporation ATM-LAN network having a bridge that establishes communication with or without LAN emulation protocol depending on destination address
US20020138649A1 (en) * 2000-10-04 2002-09-26 Brian Cartmell Providing services and information based on a request that includes a unique identifier
US6480901B1 (en) * 1999-07-09 2002-11-12 Lsi Logic Corporation System for monitoring and managing devices on a network from a management station via a proxy server that provides protocol converter
US6597956B1 (en) * 1999-08-23 2003-07-22 Terraspring, Inc. Method and apparatus for controlling an extensible computing system
US6640278B1 (en) * 1999-03-25 2003-10-28 Dell Products L.P. Method for configuration and management of storage resources in a storage network
US6662221B1 (en) * 1999-04-12 2003-12-09 Lucent Technologies Inc. Integrated network and service management with automated flow through configuration and provisioning of virtual private networks
US6675268B1 (en) * 2000-12-11 2004-01-06 Lsi Logic Corporation Method and apparatus for handling transfers of data volumes between controllers in a storage environment having multiple paths to the data volumes
US6684343B1 (en) * 2000-04-29 2004-01-27 Hewlett-Packard Development Company, L.P. Managing operations of a computer system having a plurality of partitions
US6714980B1 (en) * 2000-02-11 2004-03-30 Terraspring, Inc. Backup and restore of data associated with a host in a dynamically changing virtual server farm without involvement of a server that uses an associated storage device
US6728780B1 (en) * 2000-06-02 2004-04-27 Sun Microsystems, Inc. High availability networking with warm standby interface failover
US6757753B1 (en) * 2001-06-06 2004-06-29 Lsi Logic Corporation Uniform routing of storage access requests through redundant array controllers
US6763479B1 (en) * 2000-06-02 2004-07-13 Sun Microsystems, Inc. High availability networking with alternate pathing failover
US6779016B1 (en) * 1999-08-23 2004-08-17 Terraspring, Inc. Extensible computing system
US6799202B1 (en) * 1999-12-16 2004-09-28 Hachiro Kawaii Federated operating system for a server
US6820171B1 (en) * 2000-06-30 2004-11-16 Lsi Logic Corporation Methods and structures for an extensible RAID storage architecture
US6883065B1 (en) * 2001-11-15 2005-04-19 Xiotech Corporation System and method for a redundant communication channel via storage area network back-end
US6950871B1 (en) * 2000-06-29 2005-09-27 Hitachi, Ltd. Computer system having a storage area network and method of handling data in the computer system
US6971044B2 (en) * 2001-04-20 2005-11-29 Egenera, Inc. Service clusters and method in a processing system with failover capability

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3617770B2 (en) * 1998-05-29 2005-02-09 株式会社日立製作所 Network management system and network management method
US6701358B1 (en) * 1999-04-02 2004-03-02 Nortel Networks Limited Bulk configuring a virtual private network
JP2005506726A (en) * 2001-04-20 2005-03-03 イジェネラ,インク. Virtual network system and method in processing system
US7174390B2 (en) * 2001-04-20 2007-02-06 Egenera, Inc. Address resolution protocol system and method in a virtual network
US7188062B1 (en) * 2002-12-27 2007-03-06 Unisys Corporation Configuration management for an emulator operating system

Patent Citations (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5197148A (en) * 1987-11-30 1993-03-23 International Business Machines Corporation Method for maintaining data availability after component failure included denying access to others while completing by one of the microprocessor systems an atomic transaction changing a portion of the multiple copies of data
US5208811A (en) * 1989-11-06 1993-05-04 Hitachi, Ltd. Interconnection system and method for heterogeneous networks
US5546535A (en) * 1992-03-13 1996-08-13 Emc Corporation Multiple controller sharing in a redundant storage array
US5535338A (en) * 1993-07-28 1996-07-09 3Com Corporation Multifunction network station with network addresses for functional units
US5590285A (en) * 1993-07-28 1996-12-31 3Com Corporation Network station with multiple network addresses
US5818842A (en) * 1994-01-21 1998-10-06 Newbridge Networks Corporation Transparent interconnector of LANs by an ATM network
US5473599A (en) * 1994-04-22 1995-12-05 Cisco Systems, Incorporated Standby router protocol
US5825772A (en) * 1995-11-15 1998-10-20 Cabletron Systems, Inc. Distributed connection-oriented services for switched communications networks
US6003137A (en) * 1996-09-11 1999-12-14 Nec Corporation Virtual group information managing method in bridge for network connection
US5835725A (en) * 1996-10-21 1998-11-10 Cisco Technology, Inc. Dynamic address assignment and resolution technique
US5970066A (en) * 1996-12-12 1999-10-19 Paradyne Corporation Virtual ethernet interface
US6411625B1 (en) * 1997-02-28 2002-06-25 Nec Corporation ATM-LAN network having a bridge that establishes communication with or without LAN emulation protocol depending on destination address
US6091732A (en) * 1997-11-20 2000-07-18 Cisco Systems, Inc. Method for configuring distributed internet protocol gateways with lan emulation
US6178171B1 (en) * 1997-11-24 2001-01-23 International Business Machines Corporation Route switching mechanisms for source-routed ATM networks
US6195705B1 (en) * 1998-06-30 2001-02-27 Cisco Technology, Inc. Mobile IP mobility agent standby protocol
US6253230B1 (en) * 1998-09-22 2001-06-26 International Business Machines Corporation Distributed scalable device for selecting a server from a server cluster and a switched path to the selected server
US6148414A (en) * 1998-09-24 2000-11-14 Seek Systems, Inc. Methods and systems for implementing shared disk array management functions
US6640278B1 (en) * 1999-03-25 2003-10-28 Dell Products L.P. Method for configuration and management of storage resources in a storage network
US6662221B1 (en) * 1999-04-12 2003-12-09 Lucent Technologies Inc. Integrated network and service management with automated flow through configuration and provisioning of virtual private networks
US6480901B1 (en) * 1999-07-09 2002-11-12 Lsi Logic Corporation System for monitoring and managing devices on a network from a management station via a proxy server that provides protocol converter
US6597956B1 (en) * 1999-08-23 2003-07-22 Terraspring, Inc. Method and apparatus for controlling an extensible computing system
US6779016B1 (en) * 1999-08-23 2004-08-17 Terraspring, Inc. Extensible computing system
US6799202B1 (en) * 1999-12-16 2004-09-28 Hachiro Kawaii Federated operating system for a server
US6714980B1 (en) * 2000-02-11 2004-03-30 Terraspring, Inc. Backup and restore of data associated with a host in a dynamically changing virtual server farm without involvement of a server that uses an associated storage device
US6684343B1 (en) * 2000-04-29 2004-01-27 Hewlett-Packard Development Company, L.P. Managing operations of a computer system having a plurality of partitions
US6728780B1 (en) * 2000-06-02 2004-04-27 Sun Microsystems, Inc. High availability networking with warm standby interface failover
US6763479B1 (en) * 2000-06-02 2004-07-13 Sun Microsystems, Inc. High availability networking with alternate pathing failover
US6950871B1 (en) * 2000-06-29 2005-09-27 Hitachi, Ltd. Computer system having a storage area network and method of handling data in the computer system
US6820171B1 (en) * 2000-06-30 2004-11-16 Lsi Logic Corporation Methods and structures for an extensible RAID storage architecture
US20020138649A1 (en) * 2000-10-04 2002-09-26 Brian Cartmell Providing services and information based on a request that includes a unique identifier
US6675268B1 (en) * 2000-12-11 2004-01-06 Lsi Logic Corporation Method and apparatus for handling transfers of data volumes between controllers in a storage environment having multiple paths to the data volumes
US6971044B2 (en) * 2001-04-20 2005-11-29 Egenera, Inc. Service clusters and method in a processing system with failover capability
US6757753B1 (en) * 2001-06-06 2004-06-29 Lsi Logic Corporation Uniform routing of storage access requests through redundant array controllers
US6883065B1 (en) * 2001-11-15 2005-04-19 Xiotech Corporation System and method for a redundant communication channel via storage area network back-end

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050169309A1 (en) * 2003-04-23 2005-08-04 Sunay Tripathi System and method for vertical perimeter protection
US8539089B2 (en) * 2003-04-23 2013-09-17 Oracle America, Inc. System and method for vertical perimeter protection
US7715409B2 (en) * 2005-03-25 2010-05-11 Cisco Technology, Inc. Method and system for data link layer address classification
US20060215655A1 (en) * 2005-03-25 2006-09-28 Siu Wai-Tak Method and system for data link layer address classification
US20070244994A1 (en) * 2006-04-14 2007-10-18 International Business Machines Corporation Methods and Arrangements for Activating IP Configurations
US7886027B2 (en) * 2006-04-14 2011-02-08 International Business Machines Corporation Methods and arrangements for activating IP configurations
US8375111B2 (en) 2008-04-07 2013-02-12 Hitachi, Ltd. Method and apparatus for HBA migration
US7991860B2 (en) * 2008-04-07 2011-08-02 Hitachi, Ltd. Method and apparatus for HBA migration
US20090254640A1 (en) * 2008-04-07 2009-10-08 Hitachi, Ltd. Method and apparatus for HBA migration
US10637803B2 (en) 2008-05-23 2020-04-28 Vmware, Inc. Distributed virtual switch for virtualized computer systems
US20090292858A1 (en) * 2008-05-23 2009-11-26 Vmware, Inc. Distributed Virtual Switch for Virtualized Computer Systems
US9838339B2 (en) 2008-05-23 2017-12-05 Vmware, Inc. Distributed virtual switch for virtualized computer systems
US9160612B2 (en) 2008-05-23 2015-10-13 Vmware, Inc. Management of distributed virtual switch and distributed virtual ports
US8195774B2 (en) * 2008-05-23 2012-06-05 Vmware, Inc. Distributed virtual switch for virtualized computer systems
US20100034117A1 (en) * 2008-08-08 2010-02-11 Dell Products L.P. Parallel vlan and non-vlan device configuration
US8064469B2 (en) * 2008-08-08 2011-11-22 Dell Products L.P. Parallel VLAN and non-VLAN device configuration
US8200473B1 (en) * 2008-08-25 2012-06-12 Qlogic, Corporation Emulation of multiple MDIO manageable devices
US8125996B2 (en) * 2008-10-22 2012-02-28 Fortinet, Inc. Mechanism for enabling layer two host addresses to be shielded from the switches in a network
US9325526B2 (en) 2008-10-22 2016-04-26 Fortinet, Inc. Mechanism for enabling layer two host addresses to be shielded from the switches in a network
US20110078331A1 (en) * 2008-10-22 2011-03-31 Fortinet, Inc. Mechanism for enabling layer two host addresses to be shielded from the switches in a network
US9948576B2 (en) 2008-10-22 2018-04-17 Fortinet, Inc. Mechanism for enabling layer two host addresses to be shielded from the switches in a network
US8498293B2 (en) 2008-10-22 2013-07-30 Fortinet, Inc. Mechanism for enabling layer two host addresses to be shielded from the switches in a network
US20110235639A1 (en) * 2008-10-22 2011-09-29 Fortinet, Inc. Mechanism for enabling layer two host addresses to be shielded from the switches in a network
US8832222B2 (en) * 2009-10-05 2014-09-09 Vss Monitoring, Inc. Method, apparatus and system for inserting a VLAN tag into a captured data packet
US20110082910A1 (en) * 2009-10-05 2011-04-07 Vss Monitoring, Inc. Method, apparatus and system for inserting a vlan tag into a captured data packet
US9973379B1 (en) 2010-03-31 2018-05-15 Amazon Technologies, Inc. Managing integration of external nodes into provided computer networks
US8396946B1 (en) * 2010-03-31 2013-03-12 Amazon Technologies, Inc. Managing integration of external nodes into provided computer networks
US9330211B2 (en) * 2010-09-10 2016-05-03 Stmicroeletronics S.R.L. Simulation system for implementing computing device models in a multi-simulation environment
US20120116746A1 (en) * 2010-09-10 2012-05-10 Stmicroelectronics S.R.L. Simulation system for implementing computing device models in a multi-simulation environment
US20120131157A1 (en) * 2010-11-19 2012-05-24 Andy Gospodarek Naming network interface cards
US8948029B2 (en) * 2010-11-19 2015-02-03 Red Hat, Inc. Naming network interface cards
US9231826B2 (en) * 2011-08-18 2016-01-05 Hangzhou H3C Technologies Co., Ltd. Zero configuration of a virtual distributed device
US20130046865A1 (en) * 2011-08-18 2013-02-21 Lifang Liu Zero configuration of a virtual distributed device
US10050824B2 (en) * 2012-01-20 2018-08-14 Arris Enterprises Llc Managing a cluster of switches using multiple controllers
US20130188521A1 (en) * 2012-01-20 2013-07-25 Brocade Communications Systems, Inc. Managing a large network using a single point of configuration
US20130188514A1 (en) * 2012-01-20 2013-07-25 Brocade Communications Systems, Inc. Managing a cluster of switches using multiple controllers
US9935781B2 (en) * 2012-01-20 2018-04-03 Arris Enterprises Llc Managing a large network using a single point of configuration
US20130208728A1 (en) * 2012-02-14 2013-08-15 International Business Machines Corporation Packet routing for embedded applications sharing a single network interface over multiple virtual networks
US20130208726A1 (en) * 2012-02-14 2013-08-15 International Business Machines Corporation Packet routing for embedded applications sharing a single network interface over multiple virtual networks
US20130208722A1 (en) * 2012-02-14 2013-08-15 International Business Machines Corporation Packet routing with analysis assist for embedded applications sharing a single network interface over multiple virtual networks
US9077659B2 (en) * 2012-02-14 2015-07-07 International Business Machines Corporation Packet routing for embedded applications sharing a single network interface over multiple virtual networks
US20130208721A1 (en) * 2012-02-14 2013-08-15 International Business Machines Corporation Packet routing with analysis assist for embedded applications sharing a single network interface over multiple virtual networks
US9148368B2 (en) * 2012-02-14 2015-09-29 International Business Machines Corporation Packet routing with analysis assist for embedded applications sharing a single network interface over multiple virtual networks
US9148369B2 (en) * 2012-02-14 2015-09-29 International Business Machines Corporation Packet routing with analysis assist for embedded applications sharing a single network interface over multiple virtual networks
US9083644B2 (en) * 2012-02-14 2015-07-14 International Business Machines Corporation Packet routing for embedded applications sharing a single network interface over multiple virtual networks
US20150178235A1 (en) * 2013-12-23 2015-06-25 Ineda Systems Pvt. Ltd Network interface sharing
US9772968B2 (en) * 2013-12-23 2017-09-26 Ineda Systems Inc. Network interface sharing
US20190109783A1 (en) * 2017-10-10 2019-04-11 Vmware, Inc. Methods and apparatus to perform network fabric migration in virtualized server systems
US10693769B2 (en) * 2017-10-10 2020-06-23 Vmware, Inc. Methods and apparatus to perform network fabric migration in virtualized server systems
US11824735B1 (en) * 2022-03-30 2023-11-21 Amazon Technologies, Inc. Homogeneous pre-built isolated computing stacks for data center builds

Also Published As

Publication number Publication date
US20070233809A1 (en) 2007-10-04
US20030130833A1 (en) 2003-07-10
US20070233810A1 (en) 2007-10-04
US7231430B2 (en) 2007-06-12

Similar Documents

Publication Publication Date Title
US7231430B2 (en) Reconfigurable, virtual processing system, cluster, network and method
US6971044B2 (en) Service clusters and method in a processing system with failover capability
US7174390B2 (en) Address resolution protocol system and method in a virtual network
US20030130832A1 (en) Virtual networking system and method in a processing system
WO2002086712A1 (en) Virtual networking system and method in a processing system
US11184185B2 (en) System and method to provide multicast group membership defined relative to partition membership in a high performance computing environment
US7783788B1 (en) Virtual input/output server
US8086755B2 (en) Distributed multicast system and method in a network
US7983275B2 (en) LAN emulation over infiniband fabric apparatus, systems, and methods
US7178059B2 (en) Disaster recovery for processing resources using configurable deployment platform
TWI423038B (en) Network communications for operating system partitions
US8625595B2 (en) Fiber channel identifier mobility for fiber channel and fiber channel over ethernet networks
EP0889624B1 (en) Trunking ethernet-compatible networks
US5999974A (en) Internet protocol assists for high performance LAN connections
US20030005039A1 (en) End node partitioning using local identifiers
EP1482711A2 (en) Virtual networking system and method in a processing system
US7526527B1 (en) Storage area network interconnect server
US6023734A (en) Establishing direct communications between two hosts without using a high performance LAN connection
US7451208B1 (en) Systems and methods for network address failover
US5974049A (en) Internet protocol assists for high performance LAN connections
US7751398B1 (en) Techniques for prioritization of messaging traffic
EP1429249A2 (en) Virtual networking system and method in a processing system
US7433952B1 (en) System and method for interconnecting a storage area network

Legal Events

Date Code Title Description
AS Assignment

Owner name: SILICON VALLEY BANK, CALIFORNIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:EGENERA, INC.;REEL/FRAME:022102/0963

Effective date: 20081229

AS Assignment

Owner name: PHAROS CAPITAL PARTNERS II-A, L.P., AS COLLATERAL AGENT

Free format text: SECURITY AGREEMENT;ASSIGNOR:EGENERA, INC.;REEL/FRAME:023792/0527

Effective date: 20090924

Owner name: PHAROS CAPITAL PARTNERS II-A, L.P., AS COLLATERAL AGENT

Free format text: SECURITY AGREEMENT;ASSIGNOR:EGENERA, INC.;REEL/FRAME:023792/0538

Effective date: 20100115

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: EGENERA, INC., MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:SILICON VALLEY BANK;REEL/FRAME:033026/0393

Effective date: 20140523