US20020087614A1 - Programmable tuning for flow control and support for CPU hot plug - Google Patents

Info

Publication number
US20020087614A1
Authority
US
United States
Prior art keywords
port
transactions
cache
control register
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US09/944,776
Inventor
Andrej Kocev
Samuel Duncan
Steven Ho
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hewlett Packard Development Co LP
Original Assignee
Compaq Information Technologies Group LP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Compaq Information Technologies Group LP
Priority to US09/944,776
Assigned to COMPAQ INFORMATION TECHNOLOGIES GROUP, L.P. Assignment of assignors interest (see document for details). Assignors: DUNCAN, SAMUEL H., HO, STEVEN, KOCEV, ANDREJ
Assigned to COMPAQ INFORMATION TECHNOLOGIES GROUP, L.P. Corrective assignment to remove 3rd assignor's name that was inadvertently included on previously recorded assignment, reel: 012137, frame 0993. Assignors: DUNCAN, SAMUEL H., KOCEV, ANDREJ
Publication of US20020087614A1
Assigned to HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P. Change of name (see document for details). Assignors: COMPAQ INFORMATION TECHNOLOGIES GROUP, L.P.
Abandoned

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03K PULSE TECHNIQUE
    • H03K5/00 Manipulating of pulses not covered by one of the other main groups of this subclass
    • H03K5/19 Monitoring patterns of pulse trains
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/24 Handling requests for interconnection or transfer for access to input/output bus using interrupt
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4063 Device-to-bus coupling
    • G06F13/4068 Electrical coupling
    • G06F13/4081 Live connection to bus, e.g. hot-plugging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00 Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/24 Interrupt
    • G06F2213/2402 Avoidance of interrupt starvation

Definitions

  • the present invention relates to symmetrical distributed multiprocessor computer system architectures, and more particularly to adapting the system and its resources to improve system performance.
  • SMP symmetric multiprocessor
  • Conventional SMP systems include a plurality of processors coupled together by a bus.
  • memory space is typically shared among all of the processors. That is, each processor accesses programs and data in the shared memory, and processors communicate with each other via that memory (e.g., through messages and status information left in shared address spaces).
  • the processors may also be able to exchange signals directly.
  • One or more operating systems are typically stored in the shared memory. These operating systems control the distribution of processes or threads among the various processors.
  • the operating system kernels may execute on any processor, and may even execute in parallel. By allowing many different processors to execute different processes or threads simultaneously, the execution speed of a given application may be greatly increased.
  • FIG. 1 is a block diagram of a conventional SMP system 100 .
  • System 100 includes a plurality of processors 102 a - e, each connected to a system bus 104 .
  • a memory 106 and an input/output (I/O) bridge 108 are also connected to the system bus 104 .
  • the I/O bridge 108 is also coupled to one or more I/O busses 110 a - c.
  • the I/O bridge 108 basically provides a “bridging” function between the system bus 104 and the I/O busses 110 a - c.
  • I/O devices 112 such as disk drives, data collection devices, keyboards, CD-ROM drives, etc., may be attached to the I/O busses 110 a - c.
  • Each processor 102 a - e can access memory 106 and/or various input/output devices 112 via the system bus 104 .
  • Each processor 102 a - e has at least one level of cache memory 114 a - e that is private to the respective processor 102 a - e.
  • the present inventive method and system where the resources may be applied depending on the number and/or type of I/O devices or device controllers attached to the various I/O ports and/or the number and type of transactions anticipated at the various I/O ports.
  • Such arrangements may increase the processing speed and/or reduce the latency of response or be more efficient in use of main, cache or other memories.
  • the present invention provides the ability to “hot swap” or “hot plug” some electronic sub-assemblies, that is replace the sub-assemblies without removing power from or shutting down the remaining system.
  • the present invention provides a method for programmably allocating system resources to accommodate I/O transactions at I/O ports of a multiprocessor computer system.
  • the inventive method determines the number and type of transactions anticipated at a port, the number and type of devices being serviced via the port, a criteria for the transactions at the port with respect to the number and type of transactions and device, and assigns the system resources to the port with respect to the criteria.
  • the criteria may include, among other parameters, increasing system operating speeds, reducing latency, or ensuring that some devices, even low priority devices, are serviced, slowing the system to allow debugging and other such servicing, ensuring proper communications credits, and supporting hot swapping of processor and module assemblies.
  • control registers are provided for each port.
  • the control registers may include a plurality of programmable fields. Where additional control registers are used, the many fields are distributed among the control registers.
  • control registers may be configured, in a preferred embodiment, to contain the number of direct memory access engines available at a port to support a transaction, the number of cache lines available for requested data, and a number representing priorities among the anticipated transactions.
  • the local cache and local memory contents are stored so as not to be affected by the hot swapping; the cache data is flushed or invalidated; a flush indicator is set in the cache status and control register; no new transactions are allowed; any transactions started or pending are completed; and the translation look-aside buffers are invalidated.
  • FIG. 1, previously discussed, is a schematic block diagram of a conventional symmetrical multiprocessor computer system;
  • FIG. 2 is a schematic block diagram of a symmetrical multiprocessor computer system in accordance with the present invention.
  • FIG. 3 is a schematic block diagram of a dual processor module of the computer system of FIG. 2;
  • FIG. 4 is a schematic block diagram of an I/O bridge in accordance with the present invention.
  • FIG. 5 is a schematic block diagram of an I/O subsystem of the computer system of FIG. 2;
  • FIG. 6 is a schematic diagram of an illustrative embodiment of four (4) 8P drawers of the SMP system mounted within a standard 19-inch rack;
  • FIG. 7 is a functional block diagram of an I/O bridge;
  • FIGS. 8 - 9 are schematic block diagrams of various ports of the I/O bridge of FIG. 7;
  • FIGS. 10 A-C, 11 , and 12 are detailed block diagrams of preferred register formats utilized at the I/O bridge of the present invention.
  • FIG. 2 is a schematic block diagram of a data processing system that may advantageously include the present invention.
  • the data processing system is preferably a symmetrical multiprocessor (SMP) system 200 comprising a plurality of processor modules 300 interconnected to form a two dimensional (2D) torus configuration.
  • Each processor module 300 comprises two central processing units (CPUs) or processors 202 and has connections for two input/output (I/O) ports (one for each processor 202 ) and six inter-processor (IP) network ports.
  • the IP network ports are preferably referred to as North (N), South (S), East (E) and West (W) compass points and connect to two unidirectional links.
  • the North-South (NS) and East-West (EW) compass point connections create a (manhattan) grid, while the outside ends wrap-around and connect to each other, thereby forming the 2D torus.
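The wrap-around that turns the grid into a 2D torus can be illustrated with modular arithmetic. The following is a minimal sketch, assuming processors are addressed by hypothetical (row, column) coordinates; the function name and the coordinate scheme are illustrative and do not come from the patent.

```python
def torus_neighbors(row, col, rows, cols):
    """Return the N, S, E and W neighbors of (row, col) in a rows x cols 2D torus.

    The (row, col) addressing is an assumption made for this sketch.
    """
    return {
        "N": ((row - 1) % rows, col),  # wraps from the top edge to the bottom edge
        "S": ((row + 1) % rows, col),  # wraps from the bottom edge to the top edge
        "E": (row, (col + 1) % cols),  # wraps from the right edge to the left edge
        "W": (row, (col - 1) % cols),  # wraps from the left edge to the right edge
    }


# A corner processor of a 4 x 4 torus still has four neighbors.
print(torus_neighbors(0, 0, 4, 4))
```

Because every edge wraps, no processor is a special case: each one always has exactly four compass-point neighbors.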
  • the SMP system 200 further comprises a plurality of I/O subsystems 500 .
  • I/O traffic enters the processor modules 300 of the 2D torus via the I/O ports.
  • Although only one I/O subsystem 500 is shown connected to each processor module 300 , because each processor module 300 has two I/O ports, any given processor module 300 may be connected to two I/O subsystems 500 (i.e., each processor 202 may be connected to its own I/O subsystem 500 ).
  • FIG. 3 is a schematic block diagram of the dual CPU (2P) module 300 .
  • the 2P module 300 comprises two CPUs 202 each having connections 310 for the IP (“compass”) network ports and an I/O port 320 .
  • the 2P module 300 also includes one or more power regulators 330 , server management logic 350 and two memory subsystems 370 each coupled to a respective memory port (one for each CPU 202 ).
  • the server management logic 350 cooperates with a server management system to control functions of the SMP system 200 .
  • Each of the N, S, E and W compass points, along with the I/O and memory ports, moreover uses clock-forwarding, i.e., forwarding clock signals with the data signals, to increase data transfer rates and reduce skew between the clock and data.
  • clock-forwarding i.e., forwarding clock signals with the data signals
  • Each CPU 202 of a 2P module 300 is preferably an “EV7” processor that includes part of an “EV6” processor as its core together with “wrapper” circuitry comprising two memory controllers, an I/O interface and four network ports.
  • the EV7 address space is 44 physical address bits and supports up to 256 processors 202 and 256 I/O subsystems 500 .
  • the EV6 core preferably incorporates a traditional reduced instruction set computer (RISC) load/store architecture.
  • RISC reduced instruction set computer
  • the EV6 core is an Alpha® 21264 processor chip manufactured by Compaq Computer Corporation of Houston, Tex., with the addition of a 1.75 megabyte (MB) 7-way associative internal cache and “CBOX,” the latter providing integrated cache controller functions to the EV7 processor.
  • the EV7 processor also includes an “RBOX” that provides integrated routing/networking control functions with respect to the compass points, and a “ZBOX” that provides integrated memory controller functions for controlling the memory subsystem 370 .
  • Each memory subsystem 370 may be and/or may include one or more conventional or commercially available dynamic random access memory (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR-SDRAM) or Rambus DRAM (RDRAM) memory devices.
  • DRAM dynamic random access memory
  • SDRAM synchronous DRAM
  • DDR-SDRAM double data rate SDRAM
  • RDRAM Rambus DRAM
  • Each EV7 processor 202 , moreover, can operate with 0, 1 or 2 memory controllers.
  • FIG. 4 is a schematic block diagram of an I/O bridge or “IO 7 ” 400 , that provides a fundamental building block for the SMP I/O subsystem.
  • the IO 7 is preferably implemented as an application specific integrated circuit (ASIC).
  • ASIC application specific integrated circuit
  • Each EV7 processor supports one I/O ASIC connection; however, there is no requirement that each processor have an I/O connection.
  • the I/O subsystem 500 includes a PCI and/or PCI-X I/O expansion box with hot-swap PCI-X and AGP support.
  • the PCI-X expansion box includes an IO 7 plug-in card that spawns four I/O buses.
  • the IO 7 400 comprises a North circuit region 410 that interfaces to the EV7 processor and a South circuit region 450 that includes a plurality of I/O ports 460 (P 0 -P 3 ) that interface to standard I/O buses.
  • An EV7 port 420 of the North region 410 couples to the EV7 processor via 2 unidirectional, clock forwarded links 430 .
  • Each link 430 has a 32-bit data path that operates at 400 Mbps per data line, for a total bandwidth of 1.6 GB/s in each direction.
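The stated link bandwidth can be checked with a short calculation. This sketch assumes the 400 Mbps figure applies to each of the 32 data lines, which is the reading under which the 1.6 GB figure works out; the constant and function names are illustrative.

```python
DATA_PATH_BITS = 32               # width of one unidirectional link
RATE_PER_LINE_BPS = 400_000_000   # assumption: 400 Mbps on each data line


def link_bandwidth_bytes_per_s(width_bits=DATA_PATH_BITS, rate_bps=RATE_PER_LINE_BPS):
    """Aggregate bandwidth of one link direction, in bytes per second."""
    return width_bits * rate_bps // 8


# 32 lines x 400 Mbps = 12.8 Gbps, i.e. 1.6 GB/s in each direction.
print(link_bandwidth_bytes_per_s())
```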
  • three of the four I/O ports 460 interface to the PCI and/or PCI-X bus, while the fourth port interfaces to an AGP bus.
  • a cache coherent domain of the SMP system 200 extends into the IO 7 400 and, in particular, to I/O caches located within each I/O port 460 of the IO 7 400 .
  • the cache coherent domain extends to a write cache (WC) 462 , a read cache (RC) 464 and a translation look-aside buffer (TLB) 466 located within each I/O port 460 .
  • the caches 462 , 464 , 466 function as coherent buffers.
  • the 2D-torus configuration of the SMP system 200 may comprise sixteen EV7 processors 202 interconnected within two 8P drawer enclosures 600 .
  • four 8P drawers 600 may be mounted within a standard 19-inch rack (2 meters in height) as shown in FIG. 6. Mounting four 8P drawers 600 in a single rack creates a substantial cabling problem when interconnecting the 32 processors within the 2D-torus configuration and when coupling the processors to the I/O subsystems 500 via the IO 7 devices 400 associated with the EV7 processors 202 .
  • FIG. 5 is a schematic block diagram of an I/O subsystem or drawer 500 of the SMP system 200 .
  • Each I/O subsystem 500 includes a first I/O riser card 510 containing an IO 7 400 , a connector 520 coupling the IO 7 400 to its EV7 processor 202 and a plurality of I/O buses.
  • the speed of the I/O buses contained within the I/O subsystem 500 is a function of the length and the number of loads of each I/O bus.
  • the I/O subsystem 500 is divided into two parts: a hot-plug region 530 and an embedded region 550 .
  • Each I/O subsystem 500 also includes power supplies, fans and storage/load devices (not shown).
  • the I/O standard module card 580 contains a Small Computer System Interface (SCSI) controller for storage/load devices and a Universal Serial Bus (USB) that enables keyboard, mouse, CD and similar input/output functions.
  • SCSI Small Computer System Interface
  • USB Universal Serial Bus
  • the embedded region 550 of the I/O subsystem 500 is typically pre-configured and does not support hot-swap operations.
  • the hot-plug region 530 includes a plurality of slots adapted to support hot-swap. Specifically, there are two ports 532 , 534 of the hot plug region 530 dedicated to I/O port one (P 1 of FIG.
  • the dedicated AGP Pro slot 560 comprises port three (P 3 ) and the three embedded PCI slots 572 - 576 comprise port zero (P 0 ).
  • the I/O buses in the hot-plug region 530 are configured to support PCI and/or PCI-X standards operating at 33 MHz, 50 MHz, 66 MHz, 100 MHz and/or 133 MHz. Not all slots are capable of supporting all of these operating speeds.
  • PCI backplane manager PBM 502 .
  • the PBM 502 is part of a platform management infrastructure.
  • the PBM 502 is coupled to a local area network (LAN), e.g., 100 base T LAN, by way of another I/O riser board 590 within the I/O drawer 500 .
  • the LAN provides an interconnect for the server management platform that includes, in addition to the PBM 502 , a CPU Management Module (CMM) located on each 2 P module 300 (FIG. 3) and an MBM (Marvel Backplane Manager).
  • CMM CPU Management Module
  • FIG. 7 is a schematic block diagram of an IO 7 400 in greater detail.
  • the South region 450 of IO 7 400 includes four data south ports, which may be numbered SP 0 -SP 3 .
  • ports SP 0 -SP 3 further include an up hose ordering engine (UPE), a down hose ordering engine (DNE), a down hose forward initiator (DFI), a down hose address buffer (DNA), and a control and status register (CSR) block.
  • UPE up hose ordering engine
  • DNE down hose ordering engine
  • DFI down hose forward initiator
  • DNA down hose address buffer
  • CSR control and status register
  • South ports SP 0 -SP 2 which may be configured to run the PCI and/or PCI-X bus standards, further include hot plug interface gates (HPIG).
  • HPIG hot plug interface gates
  • FIG. 8 is a representative block diagram of one of south ports SP 0 -SP 2 , each of which preferably includes a PCI or PCI-X controller for controlling the PCI or PCI-X bus to which the port SP 0 -SP 2 is coupled.
  • the UPE of south ports SP 0 -SP 2 has a plurality (preferably twelve) of DMA controllers, which are numerically identified “00” to “11.”
  • FIG. 9 is a representative block diagram of South port SP 3 which, as described above, is configured to support AGP.
  • FIG. 10 is a representative block diagram of the interrupt port P7 of South region 450
  • FIG. 11 is a block diagram of the MUX of South region 450 .
  • each port P 0 -P 3 of the IO 7 has a plurality of DMA engines/state machines (preferably twelve) that are used to process transactions for the I/O devices coupled to that port.
  • a method is disclosed for tuning the availability of resources needed to process I/O transactions.
  • FIGS. 10 A-C are a detailed preferred format of the POx_CTRL control register 1500 .
  • a POx_CTRL control register 1500 is preferably disposed at each port P 0 -P 3 of the IO 7 and is read during initialization of the IO 7 . It may also be read during operation of the IO 7 .
  • the POx_CTRL control register format 1500 is organized into a plurality of fields.
  • An RM_TYPE field 1502 (FIG.
  • the RM_TYPE field 1502 controls the maximum number of DMA engines that may be assigned to process a given transaction.
  • the RM_TYPE field 1502 may be 2-bits long.
  • If the RM_TYPE field 1502 is set to “00”, the port may assign at most two DMA engines to any given transaction. If the RM_TYPE field 1502 is set to “01”, then up to six DMA engines may be assigned to a single transaction. If the RM_TYPE field 1502 is set to “10”, up to eight DMA engines may be assigned to a single transaction. If it is set to “11”, up to eleven DMA engines may be assigned to a single transaction.
  • each port P 0 -P 3 of an IO 7 may be individually tuned for the type and number of transactions that are anticipated on that port. For example, if many I/O devices are coupled to a given port, then the RM_TYPE field 1502 may be set to “00” to ensure that no one I/O device ends up consuming all of the DMA engines for its own transaction(s). In contrast, if only a single I/O device is coupled to a given port, the RM_TYPE field 1502 may be set to “11” so that the one I/O device can take advantage of the DMA engine resources at that port.
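The RM_TYPE encoding described above can be sketched as a lookup table. The mapping for “01”, “10” and “11” follows the text; taking “00” as the two-engine setting is inferred from the tuning example, and the `choose_rm_type` policy is a hypothetical illustration rather than anything the patent prescribes.

```python
# Maximum DMA engines per transaction for each 2-bit RM_TYPE value.
RM_TYPE_MAX_ENGINES = {0b00: 2, 0b01: 6, 0b10: 8, 0b11: 11}


def max_engines_per_transaction(rm_type):
    """Decode the 2-bit RM_TYPE field into a per-transaction engine limit."""
    if rm_type not in RM_TYPE_MAX_ENGINES:
        raise ValueError("RM_TYPE is a 2-bit field (0..3)")
    return RM_TYPE_MAX_ENGINES[rm_type]


def choose_rm_type(num_devices):
    """Hypothetical policy: open up the engines for a lone device,
    otherwise keep any one device from taking them all."""
    return 0b11 if num_devices <= 1 else 0b00


print(max_engines_per_transaction(choose_rm_type(1)))
print(max_engines_per_transaction(choose_rm_type(6)))
```

A real tuning policy would likely also weigh the type of transactions anticipated at the port, which this two-way choice ignores.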
  • the IO 7 has a certain number of miss addressed file (MAF) values that may be used to request data (in terms of cache lines) from the SMP system 200 in response to a cache line miss at the IO 7 .
  • the number of MAF values at a given IO 7 is user programmable. More specifically, each IO 7 preferably includes an IO 7 _MAF register.
  • the IO 7 _MAF register may be 5 bits in length. By setting the IO 7 _MAF register, a user may adjust the number of MAF values available to the IO 7 . For example, if the IO 7 _MAF register is set to “0000”, then no MAF values are available to the IO 7 .
  • the IO 7 _MAF register basically provides a mechanism by which a user can “throttle” the performance of a selected IO 7 . This may be desirable to perform debugging and other service related tasks, as well as to control the operation of the IO 7 s. Preferably, at least two MAF values are provided to the IO 7 .
  • the IO 7 400 utilizes a credit-based flow control system to communicate with its respective EV7 processor 202 .
  • the MUX has a corresponding credit buffer. If the MUX has a packet to be transmitted on the Request channel, it first checks to see if there is at least one credit in the Request channel credit buffer against which to “charge” this packet. If a credit exists, then the IO 7 assumes that the EV7 processor 202 has sufficient buffer space to store the packet. Accordingly, the MUX decrements the Request channel credit buffer by “1” (i.e., the number of messages to be sent) and sends the message. If there are no Request channel credits, the MUX must wait until at least one Request credit is received from the EV7 processor 202 before sending the message.
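The credit check described above can be sketched as a small model of one virtual channel. This is a toy illustration, not the IO7 implementation: the class name, the pending queue and the `sent` list are assumptions made to keep the example self-contained.

```python
from collections import deque


class CreditChannel:
    """Toy model of one credit-controlled virtual channel (e.g. Request)."""

    def __init__(self, starting_credits):
        self.credits = starting_credits  # credit buffer for this channel
        self.pending = deque()           # packets waiting for a credit
        self.sent = []                   # packets actually transmitted

    def send(self, packet):
        """Queue a packet and transmit it if a credit is available."""
        self.pending.append(packet)
        self._drain()

    def receive_credit(self, count=1):
        """The receiver freed buffer space and returned credits."""
        self.credits += count
        self._drain()

    def _drain(self):
        # Each transmitted packet is charged against exactly one credit.
        while self.pending and self.credits > 0:
            self.credits -= 1
            self.sent.append(self.pending.popleft())


channel = CreditChannel(starting_credits=1)
channel.send("req-A")      # goes out, using the only credit
channel.send("req-B")      # must wait until a credit is returned
channel.receive_credit()   # req-B is now transmitted
print(channel.sent)
```

The sender never has to ask whether the receiver has room; holding a credit is the proof that buffer space exists.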
  • the starting number of credits for the Read I/O, Write I/O, Request, Block Response and No Block Response virtual channels is preferably programmable by writing to a corresponding register at the IO 7 .
  • the IO 7 preferably includes an IO 7 _UPH control register.
  • FIG. 11 is a detailed format of a preferred IO 7 _UPH control register 1600 .
  • the IO 7 _UPH control register 1600 includes a plurality of fields, including a plurality of up hose credit fields 1602 .
  • the IO 7 _UPH control register 1600 has an up hose Request channel credit field 1602 a, an up hose Read I/O channel credit field 1602 b, an up hose Write I/O channel credit field 1602 c, an up hose Block Response channel credit field 1602 d, and an up hose No Block Response credit field 1602 e.
  • Each of the credit fields 1602 is programmable so that a user may independently set the number of credits available for each channel.
  • Each credit field 1602 may be 5 bits in length. Accordingly, each channel may be programmed with “0” to “31” credits.
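Packing five 5-bit credit fields into one register value might look like the sketch below. The bit positions are an assumption (fields laid out contiguously from bit 0 in the order 1602 a to 1602 e); the text does not give the actual layout, and the channel keys are illustrative names.

```python
CREDIT_BITS = 5  # each credit field holds 0..31
# Assumed field order, lowest bits first; the real bit layout is not stated.
CHANNELS = ("request", "read_io", "write_io", "block_response", "no_block_response")


def pack_uph_credits(credits):
    """Pack per-channel starting credits into a single register value."""
    value = 0
    for index, name in enumerate(CHANNELS):
        count = credits[name]
        if not 0 <= count <= 31:
            raise ValueError(f"{name}: credits must fit in 5 bits (0..31)")
        value |= count << (index * CREDIT_BITS)
    return value


def unpack_uph_credits(value):
    """Recover the per-channel credit counts from a register value."""
    mask = (1 << CREDIT_BITS) - 1
    return {
        name: (value >> (index * CREDIT_BITS)) & mask
        for index, name in enumerate(CHANNELS)
    }


settings = {"request": 8, "read_io": 4, "write_io": 4,
            "block_response": 16, "no_block_response": 31}
register_value = pack_uph_credits(settings)
print(hex(register_value), unpack_uph_credits(register_value))
```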
  • an EV7 processor 202 and/or the corresponding 2P module 300 can be hot swapped. That is, a selected EV7 processor 202 and/or 2P module 300 can be removed and replaced with a new processor or module without bringing the entire SMP system 200 (or the partition in which the processor or module is located) down.
  • Prior to removing an EV7 processor 202 several steps must be performed. Specifically, all data in the EV7 processor's local cache and in the memory subsystem 370 directly controlled by the EV7 processor 202 must be “flushed” to secondary (or alternative) storage. The information contained in the respective directory (e.g., cache line ownership status and location) must also be flushed. Additionally, new transactions that target the EV7 processor 202 or that transit through it must be stopped and/or re-routed.
  • the transactions initiated by a given IO 7 are received, at least initially, by the EV7 processor 202 to which the IO 7 400 is coupled.
  • an efficient system for stopping the transactions initiated by a given IO 7 400 in order to facilitate hot swap of the respective EV7 processor 202 is described.
  • each IO 7 400 includes a POx_CTRL control register 1500 (FIGS. 10 A-C) that contains information utilized by the IO 7 400 when it is initialized.
  • the POx_CTRL register 1500 preferably includes a UPE_ENG_EN field 1504 (FIG. 10C).
  • the UPE_ENG_EN field 1504 preferably includes at least one bit for each DMA engine at the IO 7 400 .
  • each IO 7 400 has twelve DMA engines. Accordingly, the UPE_ENG_EN field 1504 has twelve DMA engine enable bits.
  • When the IO 7 400 is initialized, it reads the POx_CTRL control register 1500 , including the UPE_ENG_EN field 1504 and, among other things, starts and runs a DMA engine for each DMA engine enable bit that is asserted. In this way a user can program the number of DMA engines (up to some maximum, e.g., twelve) that are started and run at a given IO 7 400 . If an EV7 processor 202 is to be hot swapped, a user, operating through system software or firmware, preferably de-asserts all twelve DMA engine enable bits of the IO 7 400 coupled to the EV7 202 that is to be removed.
  • the user sets all bits of the UPE_ENG_EN field 1504 of the respective POx_CTRL control register 1500 to “0”.
  • the IO 7 400 stops allocating DMA engines for new transactions, thereby stopping the IO 7 400 from commencing new transactions.
  • If a DMA engine that was in use is subsequently disabled, it nonetheless completes the pending or existing transaction(s) that were assigned to it.
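The effect of the enable bits can be sketched with a toy port model: clearing the bits blocks new allocations, while an engine that is already busy finishes its work. The class, its method names and the lowest-free-engine allocation order are assumptions for illustration only.

```python
NUM_ENGINES = 12
ALL_ENABLED = (1 << NUM_ENGINES) - 1  # twelve enable bits asserted


class DmaPort:
    """Toy model of the DMA engine enable bits at one port."""

    def __init__(self, enable_mask=ALL_ENABLED):
        self.enable_mask = enable_mask & ALL_ENABLED  # models UPE_ENG_EN
        self.busy = set()                             # engines with work in flight

    def allocate(self):
        """Assign the lowest-numbered enabled, idle engine to a new transaction."""
        for engine in range(NUM_ENGINES):
            enabled = (self.enable_mask >> engine) & 1
            if enabled and engine not in self.busy:
                self.busy.add(engine)
                return engine
        return None  # no engine may start a new transaction

    def complete(self, engine):
        """An in-flight transaction finishes, even if its engine was disabled."""
        self.busy.discard(engine)


port = DmaPort()
engine = port.allocate()   # a transaction starts on some engine
port.enable_mask = 0       # de-assert all twelve enable bits
print(port.allocate())     # no new transaction can be started
port.complete(engine)      # the pending transaction still completes
```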
  • In addition to stopping the IO 7 400 from initiating any new transactions by disabling its DMA engines, the user also causes any data stored in the IO 7 's WCs 462 , RCs 464 and TLBs 466 to be invalidated, whether that data is coherent or not.
  • the IO 7 400 further includes a POx_CACHE_CTL register at each port 460 , which governs the operation of the WC 462 and RC 464 at that port 460 .
  • FIG. 12 is a schematic block diagram of a preferred format of a POx_CACHE_CTL register 1700 .
  • the POx_CACHE_CTL register 1700 includes a UPE_FLUSH_CACHE field 1702 , which may be 1-bit. If asserted, the UPE_FLUSH_CACHE field 1702 causes the IO 7 400 to flush all coherent and noncoherent data stored in the WC 462 and RC 464 for that port 460 . Accordingly, as part of the hot swapping of an EV7 processor 202 , the user also asserts the flush bit of each port's cache status and control register.
  • the IO 7 400 invalidates the contents of all of its WCs 462 and RCs 464 ; and for each entry of the WCs 462 and RCs 464 , the IO 7 400 returns Victim or VictimClean messages to the directory depending, among other things, on the ownership status of the invalidated data.
  • For each of its TLBs 466 , the IO 7 400 preferably includes a translation buffer invalidate all (TBIA) register. If any value, including all zeros, is written to the TBIA register, the IO 7 400 preferably responds by flushing the contents of the respective TLB 466 . Accordingly, as part of the hot swapping of an EV7 processor 202 , the user also writes any value to each TBIA register of the affected IO 7 400 . In response, the IO 7 400 invalidates the contents of all of its TLBs and sends VictimClean messages to the directory 380 .
  • TBIA translation invalidate all
  • the POx_CACHE_CTL register 1700 at each port 460 of the IO 7 400 also includes a UPE_CACHE_INVAL field 1704 which indicates whether one or more blocks of the WC 462 , RC 464 or TLB 466 of that port 460 contain valid data. After invalidating the contents of the port's WC 462 , RC 464 and TLB 466 , as described above, the IO 7 400 preferably de-asserts the UPE_CACHE_INVAL field 1704 . Before actually removing the EV7 processor 202 that is to be hot-swapped, the user preferably confirms that the UPE_CACHE_INVAL field 1704 is de-asserted. It should be understood that the state of the EV7 processor 202 to be removed (and the state of its associated memory subsystem 370 ) is copied to a new location before the EV7 processor 202 is removed.
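The IO7 side of the hot-swap preparation can be summarized as a short sequence: disable the DMA engines, flush the write and read caches, invalidate the TLB, then confirm that no valid blocks remain. The sketch below runs that sequence against a toy software model; `Io7PortModel` and its methods are hypothetical stand-ins, not a real register interface, and the model completes each step instantly where hardware would need polling.

```python
class Io7PortModel:
    """Toy stand-in for one IO7 port; not the real hardware interface."""

    def __init__(self):
        self.upe_eng_en = 0xFFF    # twelve DMA engine enable bits
        self.wc_rc_valid = True    # WC/RC hold at least one valid block
        self.tlb_valid = True      # TLB holds at least one valid entry

    def assert_flush_cache(self):
        # Models asserting UPE_FLUSH_CACHE: WC and RC contents are invalidated.
        self.wc_rc_valid = False

    def write_tbia(self, value):
        # Any written value, including zero, invalidates the TLB.
        self.tlb_valid = False

    @property
    def upe_cache_inval(self):
        # Asserted while any WC, RC or TLB block still holds valid data.
        return self.wc_rc_valid or self.tlb_valid


def quiesce_for_hot_swap(ports):
    """Run the preparation steps on every port; True when removal looks safe."""
    for port in ports:
        port.upe_eng_en = 0        # stop allocating engines to new transactions
        port.assert_flush_cache()  # flush coherent and noncoherent cache data
        port.write_tbia(0)         # invalidate the translation look-aside buffer
    return all(not port.upe_cache_inval for port in ports)


ports = [Io7PortModel() for _ in range(4)]  # ports P0-P3
print(quiesce_for_hot_swap(ports))
```

The sketch covers only the IO7 steps; copying the state of the processor and its memory subsystem to a new location is a separate step that it does not model.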

Abstract

A method and system for controlling the operations of a multi-processor system in a programmable fashion that allows tuning of the operational flow including support for hot swapping. A system control register or registers with a plurality of fields are defined to allocate system resources available at I/O ports to anticipated transactions at those ports. The control register(s) fields may include, for each port, the number of direct memory access engines available to support transactions, the number of cache lines available for requested data, the priorities of the anticipated transactions, etc. One field supports hot swapping wherein the registers, memory and cache contents and status are flushed and stored and the system directory is updated. Also, the status of data with respect to the swapped assembly is updated to inform the system.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • The present application claims priority from U.S. Provisional Patent Application Serial No. 60/229,830, which was filed on Aug. 31, 2000, by the present inventors among others for a Symmetrical Multiprocessor Computer System and which is hereby incorporated herein by reference. [0001]
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention [0002]
  • The present invention relates to symmetrical distributed multiprocessor computer system architectures, and more particularly to adapting the system and its resources to improve system performance. [0003]
  • 2. Background Information [0004]
  • Distributed shared memory computer systems, such as symmetric multiprocessor (SMP) systems, support high performance application processing. Conventional SMP systems include a plurality of processors coupled together by a bus. One characteristic of SMP systems is that memory space is typically shared among all of the processors. That is, each processor accesses programs and data in the shared memory, and processors communicate with each other via that memory (e.g., through messages and status information left in shared address spaces). In some SMP systems, the processors may also be able to exchange signals directly. One or more operating systems are typically stored in the shared memory. These operating systems control the distribution of processes or threads among the various processors. The operating system kernels may execute on any processor, and may even execute in parallel. By allowing many different processors to execute different processes or threads simultaneously, the execution speed of a given application may be greatly increased. [0005]
  • FIG. 1 is a block diagram of a conventional SMP system 100. System 100 includes a plurality of processors 102 a-e, each connected to a system bus 104. A memory 106 and an input/output (I/O) bridge 108 are also connected to the system bus 104. The I/O bridge 108 is also coupled to one or more I/O busses 110 a-c. The I/O bridge 108 basically provides a “bridging” function between the system bus 104 and the I/O busses 110 a-c. Various I/O devices 112, such as disk drives, data collection devices, keyboards, CD-ROM drives, etc., may be attached to the I/O busses 110 a-c. Each processor 102 a-e can access memory 106 and/or various input/output devices 112 via the system bus 104. Each processor 102 a-e has at least one level of cache memory 114 a-e that is private to the respective processor 102 a-e. [0006]
  • In large multiprocessor computer systems, the manner in which system resources are allocated can significantly affect performance. High flexibility is also desired so as to increase the number of uses to which the system may be applied. Although the addition of redundant subsystems can often improve both performance and system flexibility, the overall cost of the system cannot be ignored. [0007]
  • Accordingly, a need exists to improve the flexibility of large multiprocessor computer systems. [0008]
  • SUMMARY OF THE INVENTION
  • The above needs are met, and other advantages are provided, by the present inventive method and system for multiprocessor systems whose many resources may be arranged to support different tasks. The resources may be applied depending on the number and/or type of I/O devices or device controllers attached to the various I/O ports and/or the number and type of transactions anticipated at the various I/O ports. Such arrangements may increase processing speed, reduce response latency, or make more efficient use of main, cache or other memories. In one arrangement the present invention provides the ability to “hot swap” or “hot plug” some electronic sub-assemblies, that is, to replace the sub-assemblies without removing power from or shutting down the remaining system. [0009]
  • The present invention provides a method for programmably allocating system resources to accommodate I/O transactions at I/O ports of a multiprocessor computer system. The inventive method determines the number and type of transactions anticipated at a port and the number and type of devices being serviced via the port, sets criteria for the transactions at the port with respect to the number and type of transactions and devices, and assigns the system resources to the port with respect to the criteria. In preferred embodiments the criteria may include, among other parameters, increasing system operating speeds, reducing latency, ensuring that some devices, even low priority devices, are serviced, slowing the system to allow debugging and other such servicing, ensuring proper communications credits, and supporting hot swapping of processor and module assemblies. [0010]
  • In one embodiment one or more control registers are provided for each port. The control registers may include a plurality of programmable fields. Where additional control registers are used, the fields are distributed among the control registers. [0011]
  • The control registers may be configured, in a preferred embodiment, to contain the number of direct memory access engines available at a port to support a transaction, the number of cache lines available for requested data, and a number representing priorities among the anticipated transactions. [0012]
  • With respect to hot swapping assemblies, the state of the assembly being replaced, including its associated memory systems and their status and control registers, and the contents of its cache and memory systems are preserved. The remaining system is informed of such swapping with respect to any cached data, and the system directory is updated. [0013]
  • In particular, the local cache and local memory contents are stored so as not to be affected by the hot swapping, the cache data is flushed or invalidated, a flush indicator is set in the cache status and control register, no new transactions are allowed, any transactions started or pending are completed, and the translation look-aside buffers are invalidated. [0014]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The above and further advantages of the invention may be better understood by referring to the following description in conjunction with the accompanying drawings, in which like reference numbers indicate identical or functionally similar elements: [0015]
  • FIG. 1, previously discussed, is a schematic block diagram of a conventional symmetrical multiprocessor computer system; [0016]
  • FIG. 2 is a schematic block diagram of a symmetrical multiprocessor computer system in accordance with the present invention; [0017]
  • FIG. 3 is a schematic block diagram of a dual processor module of the computer system of FIG. 2; [0018]
  • FIG. 4 is a schematic block diagram of an I/O bridge in accordance with the present invention; [0019]
  • FIG. 5 is a schematic block diagram of an I/O subsystem of the computer system of FIG. 2; [0020]
  • FIG. 6 is a schematic diagram of an illustrative embodiment of four (4) 8P drawers of the SMP system mounted within a standard 19-inch rack; [0021]
  • FIG. 7 is a functional block diagram of an I/O bridge; [0022]
  • FIGS. 8-9 are schematic block diagrams of various ports of the I/O bridge of FIG. 7; and [0023]
  • FIGS. 10A-C, 11, and 12 are detailed block diagrams of preferred register formats utilized at the I/O bridge of the present invention. [0024]
  • DETAILED DESCRIPTION OF AN ILLUSTRATIVE EMBODIMENT
  • FIG. 2 is a schematic block diagram of a data processing system that may advantageously include the present invention. In the illustrative embodiment, the data processing system is preferably a symmetrical multiprocessor (SMP) system 200 comprising a plurality of processor modules 300 interconnected to form a two dimensional (2D) torus configuration. Each processor module 300 comprises two central processing units (CPUs) or processors 202 and has connections for two input/output (I/O) ports (one for each processor 202) and six inter-processor (IP) network ports. The IP network ports are preferably referred to as North (N), South (S), East (E) and West (W) compass points and connect to two unidirectional links. The North-South (NS) and East-West (EW) compass point connections create a (Manhattan) grid, while the outside ends wrap around and connect to each other, thereby forming the 2D torus. The SMP system 200 further comprises a plurality of I/O subsystems 500. I/O traffic enters the processor modules 300 of the 2D torus via the I/O ports. Although only one I/O subsystem 500 is shown connected to each processor module 300, because each processor module 300 has two I/O ports, any given processor module 300 may be connected to two I/O subsystems 500 (i.e., each processor 202 may be connected to its own I/O subsystem 500). [0025]
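  • The wrap-around connectivity of the 2D torus can be illustrated with a short sketch. The grid dimensions, the (row, column) indexing, and the choice of which direction counts as North are assumptions made only for illustration; the description does not specify how processors are indexed.

```python
def torus_neighbors(row, col, rows, cols):
    """Return the N, S, E and W neighbors of (row, col) in a rows x cols 2D torus.

    The modulo operations model the outside ends of the grid wrapping
    around and connecting to each other.
    """
    return {
        "N": ((row - 1) % rows, col),
        "S": ((row + 1) % rows, col),
        "E": (row, (col + 1) % cols),
        "W": (row, (col - 1) % cols),
    }


# Hypothetical 4 x 4 arrangement of sixteen processors.
corner = torus_neighbors(0, 0, 4, 4)
print(corner)
```

  • In this sketch a processor on the edge of the grid still has four neighbors, because its North and West links wrap to the opposite edge.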
  • FIG. 3 is a schematic block diagram of the dual CPU (2P) module 300. As noted, the 2P module 300 comprises two CPUs 202 each having connections 310 for the IP (“compass”) network ports and an I/O port 320. The 2P module 300 also includes one or more power regulators 330, server management logic 350 and two memory subsystems 370 each coupled to a respective memory port (one for each CPU 202). The server management logic 350 cooperates with a server management system to control functions of the SMP system 200. Each of the N, S, E and W compass points, along with the I/O and memory ports, moreover, uses clock-forwarding, i.e., forwarding clock signals with the data signals, to increase data transfer rates and reduce skew between the clock and data. [0026]
  • Each CPU 202 of a 2P module 300 is preferably an “EV7” processor that includes part of an “EV6” processor as its core together with “wrapper” circuitry comprising two memory controllers, an I/O interface and four network ports. In the illustrative embodiment, the EV7 address space is 44 physical address bits and supports up to 256 processors 202 and 256 I/O subsystems 500. The EV6 core preferably incorporates a traditional reduced instruction set computer (RISC) load/store architecture. In the illustrative embodiment described herein, the EV6 core is an Alpha® 21264 processor chip manufactured by Compaq Computer Corporation of Houston, Tex., with the addition of a 1.75 megabyte (MB) 7-way associative internal cache and “CBOX,” the latter providing integrated cache controller functions to the EV7 processor. However, it will be apparent to those skilled in the art that other types of processor chips may be advantageously used. The EV7 processor also includes an “RBOX” that provides integrated routing/networking control functions with respect to the compass points, and a “ZBOX” that provides integrated memory controller functions for controlling the memory subsystem 370. [0027]
  • Each memory subsystem 370 may be and/or may include one or more conventional or commercially available dynamic random access memory (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR-SDRAM) or Rambus DRAM (RDRAM) memory devices. Each EV7 processor 202, moreover, can operate with 0, 1 or 2 memory controllers. [0028]
  • FIG. 4 is a schematic block diagram of an I/O bridge or “IO7” 400 that provides a fundamental building block for the SMP I/O subsystem. The IO7 is preferably implemented as an application specific integrated circuit (ASIC). Each EV7 processor supports one I/O ASIC connection; however, there is no requirement that each processor have an I/O connection. In the illustrative embodiment, the I/O subsystem 500 includes a PCI and/or PCI-X I/O expansion box with hot-swap PCI-X and AGP support. The PCI-X expansion box includes an IO7 plug-in card that spawns four I/O buses. [0029]
  • The IO7 400 comprises a North circuit region 410 that interfaces to the EV7 processor and a South circuit region 450 that includes a plurality of I/O ports 460 (P0-P3) that interface to standard I/O buses. An EV7 port 420 of the North region 410 couples to the EV7 processor via two unidirectional, clock-forwarded links 430. Each link 430 has a 32-bit data path that operates at 400 Mbps for a total bandwidth of 1.6 GB/s in each direction. In the illustrative embodiment, three of the four I/O ports 460 interface to the PCI and/or PCI-X bus, while the fourth port interfaces to an AGP bus. [0030]
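  • The link bandwidth figure can be checked with simple arithmetic. The sketch below assumes that the 400 Mbps rate applies to each of the 32 data lines of a link and that decimal units are intended (1 GB = 1000 MB); neither assumption is stated explicitly in the description.

```python
DATA_PATH_BITS = 32        # width of each unidirectional link 430 (from the text)
RATE_MBPS_PER_LINE = 400   # assumed to be the rate of each individual data line


def link_bandwidth_gb_per_s(width_bits, rate_mbps_per_line):
    """Aggregate one-direction bandwidth of a link in gigabytes per second."""
    total_megabits = width_bits * rate_mbps_per_line  # megabits per second
    total_megabytes = total_megabits / 8              # megabytes per second
    return total_megabytes / 1000                     # gigabytes per second


print(link_bandwidth_gb_per_s(DATA_PATH_BITS, RATE_MBPS_PER_LINE))
```

  • Under these assumptions, 32 lines at 400 Mbps give 12,800 Mbps, or 1.6 gigabytes per second, in each direction.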
  • In accordance with an aspect of the present invention, a cache coherent domain of the SMP system 200 extends into the IO7 400 and, in particular, to I/O caches located within each I/O port 460 of the IO7 400. Specifically, the cache coherent domain extends to a write cache (WC) 462, a read cache (RC) 464 and a translation look-aside buffer (TLB) 466 located within each I/O port 460. As described further herein, the caches 462, 464, 466 function as coherent buffers. [0031]
  • Referring again to the embodiment of FIG. 2, the 2D-torus configuration of the SMP system 200 may comprise sixteen EV7 processors 202 interconnected within two 8P drawer enclosures 600. Specifically, there are four 2P modules 300 interconnected by a backplane within each enclosure 600. In the illustrative embodiment, four 8P drawers 600 may be mounted within a standard 19-inch rack (2 meters in height) as shown in FIG. 6. Mounting four 8P drawers 600 in a single rack creates a substantial cabling problem when interconnecting the 32 processors within the 2D-torus configuration and when coupling the processors to the I/O subsystems 500 via the IO7 devices 400 associated with the EV7 processors 202. Note that multiple racks may be cabled together to net a system of 64, 128, 256 or a larger number of processors. In accordance with a related invention, an efficient means for interconnecting cables among the 8P drawers 600 of a fully-configured, 19-inch rack is provided. [0032]
  • FIG. 5 is a schematic block diagram of an I/O subsystem or drawer 500 of the SMP system 200. Each I/O subsystem 500 includes a first I/O riser card 510 containing an IO7 400, a connector 520 coupling the IO7 400 to its EV7 processor 202 and a plurality of I/O buses. The speed of the I/O buses contained within the I/O subsystem 500 is a function of the length and the number of loads of each I/O bus. The I/O subsystem 500 is divided into two parts: a hot-plug region 530 and an embedded region 550. In the illustrative embodiment, there is a dedicated slot 560 adjacent to the I/O riser card 510 within the embedded region 550 that is dedicated to a 4x AGP Pro graphics card. Additional slots (e.g., for power and an additional data path) may be provided to support the AGP Pro card. Also included within the embedded region 550 are three standard, 64-bit PCI card slots 572-576, which are available for embedded I/O card options. For example, an I/O standard module card 580 may be inserted within one of the PCI card slots 572-576. [0033]
  • Each I/O subsystem 500 also includes power supplies, fans and storage/load devices (not shown). The I/O standard module card 580 contains a Small Computer System Interface (SCSI) controller for storage/load devices and a Universal Serial Bus (USB) that enables keyboard, mouse, CD and similar input/output functions. The embedded region 550 of the I/O subsystem 500 is typically pre-configured and does not support hot-swap operations. In contrast, the hot-plug region 530 includes a plurality of slots adapted to support hot-swap. Specifically, there are two ports 532, 534 of the hot-plug region 530 dedicated to I/O port one (P1 of FIG. 4) and six slots 538-548 dedicated to I/O port two (P2). Likewise, the dedicated AGP Pro slot 560 comprises port three (P3) and the three embedded PCI slots 572-576 comprise port zero (P0). The I/O buses in the hot-plug region 530 are configured to support PCI and/or PCI-X standards operating at 33 MHz, 50 MHz, 66 MHz, 100 MHz and/or 133 MHz. Not all slots are capable of supporting all of these operating speeds. [0034]
  • Also included within the I/O subsystem 500 and coupled adjacent to the IO7 400 is a PCI backplane manager (PBM) 502. The PBM 502 is part of a platform management infrastructure. The PBM 502 is coupled to a local area network (LAN), e.g., 100Base-T LAN, by way of another I/O riser board 590 within the I/O drawer 500. The LAN provides an interconnect for the server management platform that includes, in addition to the PBM 502, a CPU Management Module (CMM) located on each 2P module 300 (FIG. 3) and an MBM (Marvel Backplane Manager). [0035]
  • FIG. 7 is a schematic block diagram of an IO7 400 in greater detail. As shown, the South region 450 of IO7 400 includes four data south ports, which may be numbered SP0-SP3. In addition to the read buffer, write buffer and TLB, as described above, ports SP0-SP3 further include an up hose ordering engine (UPE), a down hose ordering engine (DNE), a down hose forward initiator (DFI), a down hose address buffer (DNA), and a control and status register (CSR) block. South ports SP0-SP2, which may be configured to run the PCI and/or PCI-X bus standards, further include hot plug interface gates (HPIG). [0036]
  • FIG. 8 is a representative block diagram of one of south ports SP0-SP2, each of which preferably includes a PCI or PCI-X controller for controlling the PCI or PCI-X bus to which the port SP0-SP2 is coupled. As shown, the UPE of south ports SP0-SP2 has a plurality (preferably twelve) of DMA controllers, which are numerically identified “00” to “11.” [0037]
  • FIG. 9 is a representative block diagram of South port SP3 which, as described above, is configured to support AGP. FIG. 10 is a representative block diagram of the interrupt port P7 of South region 450, and FIG. 11 is a block diagram of the MUX of South region 450. [0038]
  • As described above, each port P0-P3 of the IO7 has a plurality of DMA engines/state machines (preferably twelve) that are used to process transactions for the I/O devices coupled to that port. In accordance with another aspect of the present invention, a method is disclosed for tuning the availability of resources needed to process I/O transactions. [0039]
  • In particular, for each port P0-P3 of the IO7 there is a POx_CTRL control register 800 of FIG. 7 having a plurality of fields. FIGS. 10A-C are a detailed preferred format of the POx_CTRL control register 1500. A POx_CTRL control register 800 is preferably disposed at each port P0-P3 of the IO7 and is read during initialization of the IO7. It may also be read during operation of the IO7. As shown, the POx_CTRL control register format 1500 is organized into a plurality of fields. An RM_TYPE field 1502 (FIG. 10C) is preferably used, at least in part, to control a novel pre-fetch algorithm, which is disclosed in co-pending patent application Ser. No. ______, entitled Adaptive Data Prefetch Prediction Algorithm, filed ______, which application is hereby incorporated herein by reference. In particular, the RM_TYPE field 1502 controls the maximum number of DMA engines that may be assigned to process a given transaction. The RM_TYPE field 1502 may be 2 bits long. [0040]
  • For example, if the RM_TYPE field 1502 is set to “00”, then the port may assign at most two DMA engines to any given transaction. If the RM_TYPE field 1502 is set to “01”, then up to six DMA engines may be assigned to a single transaction. If the RM_TYPE field 1502 is set to “10”, up to eight DMA engines may be assigned to a single transaction. If it is set to “11”, up to eleven DMA engines may be assigned to a single transaction. [0041]
  • It should be understood that other values may also be assigned and/or additional bits used to provide finer granularity. [0042]
  • Those skilled in the art will recognize that, by adjusting the RM_TYPE field 1502, each port P0-P3 of an IO7 may be individually tuned for the type and number of transactions that are anticipated on that port. For example, if many I/O devices are coupled to a given port, then the RM_TYPE field 1502 may be set to “00” to ensure that no one I/O device ends up consuming all of the DMA engines for its own transaction(s). In contrast, if only a single I/O device is coupled to a given port, the RM_TYPE field 1502 may be set to “11” so that the one I/O device can take advantage of the DMA engine resources at that port. [0043]
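  • The RM_TYPE encoding described above can be summarized as a small lookup. The table values come from the description; the function names and the two-way tuning policy in choose_rm_type are hypothetical and only mirror the two examples given (many devices versus a single device).

```python
# Maximum DMA engines per transaction for each 2-bit RM_TYPE value,
# as listed in the description.
RM_TYPE_MAX_ENGINES = {0b00: 2, 0b01: 6, 0b10: 8, 0b11: 11}


def max_engines_per_transaction(rm_type):
    """Decode a 2-bit RM_TYPE value into the per-transaction engine limit."""
    if rm_type not in RM_TYPE_MAX_ENGINES:
        raise ValueError("RM_TYPE must be a 2-bit value (0 to 3)")
    return RM_TYPE_MAX_ENGINES[rm_type]


def choose_rm_type(device_count):
    """Hypothetical policy: one device gets the largest limit, several
    devices get the smallest so that no single device monopolizes engines."""
    return 0b11 if device_count <= 1 else 0b00


for devices in (1, 6):
    rm_type = choose_rm_type(devices)
    print(devices, format(rm_type, "02b"), max_engines_per_transaction(rm_type))
```

  • A real tuning policy would likely also use the intermediate settings “01” and “10”; they are omitted here because the description gives no guidance on when to select them.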
  • The IO7 has a certain number of miss address file (MAF) values that may be used to request data (in terms of cache lines) from the SMP system 200 in response to a cache line miss at the IO7. In accordance with another aspect of the present invention, the number of MAF values at a given IO7 is user programmable. More specifically, each IO7 preferably includes an IO7_MAF register. The IO7_MAF register may be 5 bits in length. By setting the IO7_MAF register, a user may adjust the number of MAF values available to the IO7. For example, if the IO7_MAF register is set to “0000”, then no MAF values are available to the IO7. If the IO7_MAF register is set to “0001”, then one MAF value is available. If the IO7_MAF register is set to “0010”, then two MAF values are available, and so on up to a maximum of 31 MAF values. The IO7_MAF register thus provides a mechanism by which a user can “throttle” the performance of a selected IO7. This may be desirable to perform debugging and other service related tasks, as well as to control the operation of the IO7s. Preferably, at least two MAF values are provided to the IO7. [0044]
  • The IO7 400 utilizes a credit-based flow control system to communicate with its respective EV7 processor 202. In particular, for each of the Read I/O, Write I/O, Request, Block Response and No Block Response virtual channels, the MUX has a corresponding credit buffer. If the MUX has a packet to be transmitted on the Request channel, it first checks to see if there is at least one credit in the Request channel credit buffer against which to “charge” this packet. If a credit exists, then the IO7 assumes that the EV7 processor 202 has sufficient buffer space to store the packet. Accordingly, the MUX decrements the Request channel credit buffer by “1” (i.e., the number of messages to be sent) and sends the message. If there are no Request channel credits, the MUX must wait until at least one Request credit is received from the EV7 processor 202 before sending the message. [0045]
  • In accordance with another aspect of the present invention, the starting number of credits for the Read I/O, Write I/O, Request, Block Response and No Block Response virtual channels is preferably programmable by writing to a corresponding register at the IO7. In particular, the IO7 preferably includes an IO7_UPH control register. [0046]
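  • The credit check performed by the MUX can be sketched as a small software model. This is an illustration of credit-based flow control in general, assuming one credit per packet; the class and method names are hypothetical and do not correspond to any interface named in the description.

```python
from collections import deque


class CreditChannel:
    """Toy model of one virtual channel with a credit buffer."""

    def __init__(self, name, starting_credits):
        self.name = name
        self.credits = starting_credits
        self.waiting = deque()  # packets held until a credit arrives
        self.sent = []          # packets handed to the receiver, in order

    def send(self, packet):
        """Charge one credit and send, or hold the packet if none remain."""
        if self.credits > 0:
            self.credits -= 1
            self.sent.append(packet)
            return True
        self.waiting.append(packet)
        return False

    def credit_returned(self, count=1):
        """The receiver freed buffer space; release waiting packets."""
        self.credits += count
        while self.credits > 0 and self.waiting:
            self.credits -= 1
            self.sent.append(self.waiting.popleft())


request = CreditChannel("Request", starting_credits=1)
print(request.send("pkt-A"))  # a credit is available, so the packet goes out
print(request.send("pkt-B"))  # no credit left, so the packet waits
request.credit_returned()
print(request.sent)
```

  • Because the sender never transmits without a credit, the receiver's buffers cannot overflow, which is the property the description relies on.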
  • FIG. 11 is a detailed format of a preferred IO7_UPH control register 1600. The IO7_UPH control register 1600 includes a plurality of fields, including a plurality of up hose credit fields 1602. In particular, the IO7_UPH control register 1600 has an up hose Request channel credit field 1602a, an up hose Read I/O channel credit field 1602b, an up hose Write I/O channel credit field 1602c, an up hose Block Response channel credit field 1602d, and an up hose No Block Response credit field 1602e. Each of the credit fields 1602 is programmable so that a user may independently set the number of credits available for each channel. Each credit field 1602 may be 5 bits in length. Accordingly, each channel may be programmed with “0” to “31” credits. [0047]
  • By adjusting the maximum number of credits available for each of these virtual channels, a user can finely tune the operation of each IO7 within the SMP system 200. [0048]
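  • Five 5-bit credit fields can be packed into a single register value as sketched below. The field order and bit positions are assumptions made for illustration; the description states only that each credit field may be 5 bits long, and the actual layout is defined by the register format figure.

```python
FIELD_WIDTH = 5                   # each credit field may be 5 bits (from the text)
FIELD_MAX = (1 << FIELD_WIDTH) - 1

# Assumed field order, lowest bits first; the real layout may differ.
CHANNELS = ("request", "read_io", "write_io", "block_response", "no_block_response")


def pack_uph_credits(**credits):
    """Pack per-channel starting credits into one integer register value."""
    unknown = set(credits) - set(CHANNELS)
    if unknown:
        raise ValueError("unknown channel(s): " + ", ".join(sorted(unknown)))
    value = 0
    for index, name in enumerate(CHANNELS):
        count = credits.get(name, 0)
        if not 0 <= count <= FIELD_MAX:
            raise ValueError(name + " credits must be between 0 and 31")
        value |= count << (index * FIELD_WIDTH)
    return value


def unpack_uph_credits(value):
    """Recover the per-channel credit counts from a packed register value."""
    return {
        name: (value >> (index * FIELD_WIDTH)) & FIELD_MAX
        for index, name in enumerate(CHANNELS)
    }


register = pack_uph_credits(request=4, read_io=31, no_block_response=2)
print(hex(register), unpack_uph_credits(register))
```

  • The range check reflects the stated limit of 31 credits per channel; a value outside that range cannot be represented in a 5-bit field.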
  • CPU Hot Plug [0049]
  • As described above, an EV7 processor 202 and/or the corresponding 2P module 300 can be hot swapped. That is, a selected EV7 processor 202 and/or 2P module 300 can be removed and replaced with a new processor or module without bringing the entire SMP system 200 (or the partition in which the processor or module is located) down. Prior to removing an EV7 processor 202, several steps must be performed. Specifically, all data in the EV7 processor's local cache and in the memory subsystem 370 directly controlled by the EV7 processor 202 must be “flushed” to secondary (or alternative) storage. The information contained in the respective directory (e.g., cache line ownership status and location) must also be flushed. Additionally, new transactions that target the EV7 processor 202 or that transit through it must be stopped and/or re-routed. [0050]
  • As described above, the transactions initiated by a given IO7 are received, at least initially, by the EV7 processor 202 to which the IO7 400 is coupled. In accordance with an aspect of the present invention, an efficient system for stopping the transactions initiated by a given IO7 400 in order to facilitate hot swap of the respective EV7 processor 202 is described. [0051]
  • Specifically, as described above, each IO7 400 includes a POx_CTRL control register 1500 (FIGS. 10A-C) that contains information utilized by the IO7 400 when it is initialized. The POx_CTRL register 1500 preferably includes a UPE_ENG_EN field 1504 (FIG. 10C). The UPE_ENG_EN field 1504 preferably includes at least one bit for each DMA engine at the IO7 400. In the preferred embodiment, each IO7 400 has twelve DMA engines. Accordingly, the UPE_ENG_EN field 1504 has twelve DMA engine enable bits. When the IO7 400 is initialized, it reads the POx_CTRL control register 1500, including the UPE_ENG_EN field 1504 and, among other things, starts and runs a DMA engine for each DMA engine enable bit that is asserted. In this way a user can program the number of DMA engines (up to some maximum, e.g., twelve) that are started and run at a given IO7 400. If an EV7 processor 202 is to be hot swapped, a user, operating through system software or firmware, preferably de-asserts all twelve DMA engine enable bits of the IO7 400 coupled to the EV7 202 that is to be removed. That is, the user sets all bits of the UPE_ENG_EN field 1504 of the respective POx_CTRL control register 1500 to “0”. In response, the IO7 400 stops allocating DMA engines for new transactions, thereby stopping the IO7 400 from commencing new transactions. When a DMA engine that was in use is subsequently disabled, it nonetheless completes the pending or existing transaction(s) that were assigned to it. [0052]
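  • The twelve enable bits of the UPE_ENG_EN field can be handled as a bit mask. The sketch below assumes that bit i of the field enables DMA engine i and that the field is treated as a stand-alone 12-bit value; the position of the field within the POx_CTRL register is not modeled, and the function names are hypothetical.

```python
NUM_ENGINES = 12                           # DMA engines per the preferred embodiment
ALL_ENGINES_MASK = (1 << NUM_ENGINES) - 1  # 0xFFF, every enable bit asserted


def enable_first_engines(count):
    """Return a UPE_ENG_EN value with the lowest `count` enable bits asserted."""
    if not 0 <= count <= NUM_ENGINES:
        raise ValueError("count must be between 0 and 12")
    return (1 << count) - 1


def enabled_engine_ids(upe_eng_en):
    """List the engine numbers whose enable bit is asserted."""
    return [i for i in range(NUM_ENGINES) if (upe_eng_en >> i) & 1]


print(enabled_engine_ids(enable_first_engines(4)))  # engines allowed to start
print(enabled_engine_ids(0))                        # all bits de-asserted for hot swap
```

  • Writing zero corresponds to the hot-swap preparation step in which all twelve enable bits are de-asserted so that no engine is allocated to a new transaction.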
  • In addition to stopping the IO7 400 from initiating any new transactions by disabling its DMA engines, the user also causes any data stored in the IO7's WCs 462, RCs 464 and TLBs 466 to be invalidated, whether that data is coherent or not. To facilitate this operation, among other reasons, the IO7 400 further includes a POx_CACHE_CTL register at each port 460, which governs the operation of the WC 462 and RC 464 at that port 460. FIG. 12 is a schematic block diagram of a preferred format of a POx_CACHE_CTL register 1700. The POx_CACHE_CTL register 1700 includes a UPE_FLUSH_CACHE field 1702, which may be 1-bit. If asserted, the UPE_FLUSH_CACHE field 1702 causes the IO7 400 to flush all coherent and noncoherent data stored in the WC 462 and RC 464 for that port 460. Accordingly, as part of the hot swapping of an EV7 processor 202, the user also asserts the flush bit of each port's cache status and control register. In response, the IO7 400 invalidates the contents of all of its WCs 462 and RCs 464; and for each entry of the WCs 462 and RCs 464, the IO7 400 returns Victim or VictimClean messages to the directory depending, among other things, on the ownership status of the invalidated data. [0053]
  • For each of its TLBs 466, the IO7 400 preferably includes a translation invalidate all (TBIA) register. If any value, including all zeros, is written to the TBIA register, the IO7 400 preferably responds by flushing the contents of the respective TLB 466. Accordingly, as part of the hot swapping of an EV7 processor 202, the user also writes any value to each TBIA register of the affected IO7 400. In response, the IO7 400 invalidates the contents of all of its TLBs and sends VictimClean messages to the directory 380. [0054]
  • The POx_CACHE_CTL register 1700 at each port 460 of the IO7 400 also includes a UPE_CACHE_INVAL field 1704, which indicates whether one or more blocks of the WC 462, RC 464 or TLB 466 of that port 460 contain valid data. After invalidating the contents of the port's WC 462, RC 464 and TLB 466, as described above, the IO7 400 preferably de-asserts the UPE_CACHE_INVAL field 1704. Before actually removing the EV7 processor 202 that is to be hot-swapped, the user preferably confirms that the UPE_CACHE_INVAL field 1704 is de-asserted. It should be understood that the state of the EV7 processor 202 to be removed (and the state of its associated memory subsystem 370) is copied to a new location before the EV7 processor 202 is removed. [0055]
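  • The IO7 quiesce steps described above (disable the DMA engines, flush the write and read caches, write the TBIA register, then confirm that UPE_CACHE_INVAL is de-asserted) can be walked through with a toy software model. Everything in this sketch is hypothetical: the class, its methods and the cache contents stand in for register accesses that real system software or firmware would perform, and completion of in-flight transactions is not modeled.

```python
ALL_ENGINES = 0xFFF  # twelve DMA engine enable bits, per the description


class PortModel:
    """Toy stand-in for one IO7 port; not a real hardware interface."""

    def __init__(self):
        self.upe_eng_en = ALL_ENGINES
        self.cache_lines = ["line0", "line1"]  # made-up WC/RC contents
        self.tlb_entries = ["pte0"]            # made-up TLB contents

    @property
    def upe_cache_inval(self):
        # Asserted while any WC, RC or TLB block still holds valid data.
        return bool(self.cache_lines or self.tlb_entries)

    def write_upe_eng_en(self, value):
        self.upe_eng_en = value & ALL_ENGINES

    def assert_upe_flush_cache(self):
        self.cache_lines.clear()

    def write_tbia(self, value):
        # Any written value, including zero, triggers the TLB flush.
        self.tlb_entries.clear()


def quiesce_for_hot_swap(ports):
    """Apply the described steps to every port; True when removal may proceed."""
    for port in ports:
        port.write_upe_eng_en(0)       # stop allocating engines to new transactions
        port.assert_upe_flush_cache()  # invalidate the write and read caches
        port.write_tbia(0)             # invalidate the TLB
    return all(not port.upe_cache_inval for port in ports)


ports = [PortModel() for _ in range(4)]  # ports P0-P3
print(quiesce_for_hot_swap(ports))
```

  • The final check mirrors the confirmation step in the description: the processor is removed only after every port reports that no valid cached data remains.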

Claims (10)

What is claimed is:
1. A method for programmably allocating system resources to accommodate I/O transactions at I/O ports of a multiprocessor computer system comprising the steps of:
determining the number and type of transactions anticipated at a port,
determining the number and type of devices being serviced via the port,
setting criteria for transactions at the port with respect to the number and type of transactions and devices,
assigning the system resources to the port with respect to the criteria.
2. The method as defined in claim 1 further comprising the steps of:
providing at least one control register for each port, wherein the control register includes a plurality of programmable fields.
3. The method as defined in claim 2 further comprising the steps of configuring the control register fields to contain a number of direct memory access engines available at a port to support a transaction, a number of cache lines for requested data, and a number representing priorities among the anticipated transactions.
4. The method as defined in claim 1 further comprising the step of preparing for hot swapping an assembly, wherein the preparing for hot swapping comprises, with respect to the assembly being replaced, copying the assembly's state, the state of its associated memory systems, its status and control registers, and the contents of its cache and memory systems.
5. The method as defined in claim 4 wherein the copying comprises the steps of:
flushing the data in the local cache and local memory to storage not affected by the hot swapping,
invalidating data in cache,
setting a flush indicator in the port's cache status and control register,
flushing directory data to non-affected storage,
finding and stopping any new transactions,
completing any transactions already started or pending,
flushing the translation look-aside buffers,
invalidating the contents of the translation look-aside buffers, and
updating the system directory.
6. A system for allocating system resources to accommodate I/O transactions at I/O ports of a multiprocessor computer system comprising:
the number and type of transactions anticipated at a port,
number and type of devices being serviced via the port,
criteria for operations at the port with respect to the number and type of transactions and devices,
means for assigning the system resources to the port with respect to the criteria.
7. The system as defined in claim 6 further comprising:
at least one control register for each port, wherein the control register includes a plurality of programmable fields.
8. The system as defined in claim 7 wherein the control register fields include a number of direct memory access engines available at a port to support a transaction, a number of cache lines for requested data, and a number representing priorities among the anticipated transactions.
9. The method as defined in claim 6 further comprising:
means for hot swapping of an assembly, including means for copying the assembly's state, the state of its associated memory systems, its status and control registers, and the contents of its cache and memory systems.
10. The system as defined in claim 9 wherein the means for copying comprises:
means for flushing the data in the local cache and local memory to storage not affected by the hot swapping,
means for flushing, modifying and invalidating unmodified data in cache,
means for setting a flush indicator in the port's cache status and control register,
means for flushing directory data to non-affected storage,
means for finding and stopping any new transactions,
means for completing any transactions already started or pending,
means for flushing the translation look-aside buffers,
means for invalidating the contents of the translation look-aside buffers, and
means for updating the directory.
US09/944,776 2000-08-31 2001-08-31 Programmable tuning for flow control and support for CPU hot plug Abandoned US20020087614A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/944,776 US20020087614A1 (en) 2000-08-31 2001-08-31 Programmable tuning for flow control and support for CPU hot plug

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US22983000P 2000-08-31 2000-08-31
US09/944,776 US20020087614A1 (en) 2000-08-31 2001-08-31 Programmable tuning for flow control and support for CPU hot plug

Publications (1)

Publication Number Publication Date
US20020087614A1 true US20020087614A1 (en) 2002-07-04

Family

ID=26923650

Family Applications (4)

Application Number Title Priority Date Filing Date
US09/944,500 Expired - Fee Related US7065169B2 (en) 2000-08-31 2001-08-31 Detection of added or missing forwarding data clock signals
US09/944,516 Expired - Lifetime US6920516B2 (en) 2000-08-31 2001-08-31 Anti-starvation interrupt protocol
US09/944,515 Expired - Fee Related US7024509B2 (en) 2000-08-31 2001-08-31 Passive release avoidance technique
US09/944,776 Abandoned US20020087614A1 (en) 2000-08-31 2001-08-31 Programmable tuning for flow control and support for CPU hot plug

Country Status (1)

Country Link
US (4) US7065169B2 (en)

Families Citing this family (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7054975B2 (en) * 2001-08-10 2006-05-30 Koninklijke Philips Electronics N.V. Interrupt generation in a bus system
US7065596B2 (en) * 2002-09-19 2006-06-20 Intel Corporation Method and apparatus to resolve instruction starvation
US20040059879A1 (en) * 2002-09-23 2004-03-25 Rogers Paul L. Access priority protocol for computer system
KR100961947B1 (en) 2003-05-19 2010-06-10 삼성전자주식회사 Method of detecting input clock error
US8984199B2 (en) * 2003-07-31 2015-03-17 Intel Corporation Inter-processor interrupts
US7409483B2 (en) * 2003-12-19 2008-08-05 Intel Corporation Methods and apparatuses to provide message signaled interrupts to level-sensitive drivers
JP2005196459A (en) * 2004-01-07 2005-07-21 Fujitsu Ltd Interrupt control program, recording medium with the same, and interrupt control method
US20050283554A1 (en) * 2004-06-22 2005-12-22 General Electric Company Computer system and method for queuing interrupt messages in a device coupled to a parallel communication bus
US7362739B2 (en) * 2004-06-22 2008-04-22 Intel Corporation Methods and apparatuses for detecting clock failure and establishing an alternate clock lane
US9626194B2 (en) 2004-09-23 2017-04-18 Intel Corporation Thread livelock unit
US7748001B2 (en) * 2004-09-23 2010-06-29 Intel Corporation Multi-thread processing system for detecting and handling live-lock conditions by arbitrating livelock priority of logical processors based on a predertermined amount of time
CN100359457C (en) * 2004-12-16 2008-01-02 华为技术有限公司 Method for realizing normal working of communication interface based on sending interruption
US7386642B2 (en) * 2005-01-28 2008-06-10 Sony Computer Entertainment Inc. IO direct memory access system and method
JP2006216042A (en) * 2005-02-04 2006-08-17 Sony Computer Entertainment Inc System and method for interruption processing
US7680972B2 (en) * 2005-02-04 2010-03-16 Sony Computer Entertainment Inc. Micro interrupt handler
US7694055B2 (en) * 2005-10-15 2010-04-06 International Business Machines Corporation Directing interrupts to currently idle processors
US7519752B2 (en) * 2006-02-07 2009-04-14 International Business Machines Corporation Apparatus for using information and a count in reissuing commands requiring access to a bus and methods of using the same
US7689748B2 (en) * 2006-05-05 2010-03-30 Ati Technologies, Inc. Event handler for context-switchable and non-context-switchable processing tasks
US20090271548A1 (en) * 2006-06-23 2009-10-29 Freescale Semiconductor, Inc. Interrupt response control apparatus and method therefor
US7590784B2 (en) * 2006-08-31 2009-09-15 Intel Corporation Detecting and resolving locks in a memory unit
US20080091883A1 (en) * 2006-10-12 2008-04-17 International Business Machines Corporation Load starvation detector and buster
US7949813B2 (en) * 2007-02-06 2011-05-24 Broadcom Corporation Method and system for processing status blocks in a CPU based on index values and interrupt mapping
US7660927B2 (en) * 2007-05-21 2010-02-09 International Business Machines Corporation Apparatus and method to control access to stored information
US8006013B2 (en) * 2008-08-07 2011-08-23 International Business Machines Corporation Method and apparatus for preventing bus livelock due to excessive MMIO
US8135894B1 (en) * 2009-07-31 2012-03-13 Altera Corporation Methods and systems for reducing interrupt latency by using a dedicated bit
KR101262846B1 (en) * 2009-12-15 2013-05-10 한국전자통신연구원 Apparatus and method for measuring the performance of embedded devices
TW201123732A (en) * 2009-12-31 2011-07-01 Ind Tech Res Inst Processing devices
US8516577B2 (en) 2010-09-22 2013-08-20 Intel Corporation Regulating atomic memory operations to prevent denial of service attack
US9225321B2 (en) * 2010-12-28 2015-12-29 Stmicroelectronics International N.V. Signal synchronizing systems and methods
US8713235B2 (en) * 2011-05-02 2014-04-29 Fairchild Semiconductor Corporation Low latency interrupt collector
US9146776B1 (en) * 2011-08-16 2015-09-29 Marvell International Ltd. Systems and methods for controlling flow of message signaled interrupts
US8706936B2 (en) 2011-11-14 2014-04-22 Arm Limited Integrated circuit having a bus network, and method for the integrated circuit
US9128920B2 (en) 2011-11-30 2015-09-08 Marvell World Trade Ltd. Interrupt handling systems and methods for PCIE bridges with multiple buses
US9329880B2 (en) * 2013-02-13 2016-05-03 Red Hat Israel, Ltd. Counter for fast interrupt register access in hypervisors
US10331589B2 (en) 2013-02-13 2019-06-25 Red Hat Israel, Ltd. Storing interrupt location for fast interrupt register access in hypervisors
JP6123487B2 (en) * 2013-05-28 2017-05-10 富士通株式会社 Control device, control method, and control program
EP3106996A4 (en) * 2014-02-14 2017-09-06 Murakumo Corporation System, storage device, and method
CN105337607B (en) * 2014-06-30 2019-05-17 澜起科技股份有限公司 Device and method for clock signal loss detection
US10101795B2 (en) * 2015-11-10 2018-10-16 Wipro Limited System-on-chip (SoC) and method for dynamically optimizing power consumption in the SoC
US10437755B2 (en) 2015-11-16 2019-10-08 International Business Machines Corporation Techniques for handling interrupts in a processing unit using virtual processor thread groups

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5864679A (en) * 1993-09-06 1999-01-26 Kabushiki Kaisha Toshiba Transaction routing in a multiple processor system using an extracted transaction feature parameter and transaction historical data
US6085276A (en) * 1997-10-24 2000-07-04 Compaq Computers Corporation Multi-processor computer system having a data switch with simultaneous insertion buffers for eliminating arbitration interdependencies
US6085294A (en) * 1997-10-24 2000-07-04 Compaq Computer Corporation Distributed data dependency stall mechanism
US6119185A (en) * 1997-11-25 2000-09-12 Ncr Corporation Computer system configuration apparatus and method that performs pre-assignment conflict analysis
US6219734B1 (en) * 1997-05-13 2001-04-17 Micron Electronics, Inc. Method for the hot add of a mass storage adapter on a system including a statically loaded adapter driver
US6243778B1 (en) * 1998-10-13 2001-06-05 Stmicroelectronics, Inc. Transaction interface for a data communication system
US6718413B1 (en) * 1999-08-31 2004-04-06 Adaptec, Inc. Contention-based methods for generating reduced number of interrupts

Family Cites Families (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4929918A (en) * 1989-06-07 1990-05-29 International Business Machines Corporation Setting and dynamically adjusting VCO free-running frequency at system level
US5325536A (en) * 1989-12-07 1994-06-28 Motorola, Inc. Linking microprocessor interrupts arranged by processing requirements into separate queues into one interrupt processing routine for execution as one routine
US5452452A (en) * 1990-06-11 1995-09-19 Cray Research, Inc. System having integrated dispatcher for self scheduling processors to execute multiple types of processes
US5504894A (en) * 1992-04-30 1996-04-02 International Business Machines Corporation Workload manager for achieving transaction class response time goals in a multiprocessing system
US5379428A (en) * 1993-02-01 1995-01-03 Belobox Systems, Inc. Hardware process scheduler and processor interrupter for parallel processing computer systems
JP3159345B2 (en) * 1993-07-02 2001-04-23 日本電気株式会社 Pipeline arithmetic processing unit
US5588125A (en) * 1993-10-20 1996-12-24 Ast Research, Inc. Method and apparatus for increasing bus bandwidth on a system bus by inhibiting interrupts while posted I/O write operations are pending
US5568649A (en) * 1994-05-31 1996-10-22 Advanced Micro Devices Interrupt cascading and priority configuration for a symmetrical multiprocessing system
US5619706A (en) * 1995-03-02 1997-04-08 Intel Corporation Method and apparatus for switching between interrupt delivery mechanisms within a multi-processor system
US5754866A (en) * 1995-05-08 1998-05-19 Nvidia Corporation Delayed interrupts with a FIFO in an improved input/output architecture
JPH0916406A (en) * 1995-06-27 1997-01-17 Toshiba Corp Computer system
US5933613A (en) 1995-07-06 1999-08-03 Hitachi, Ltd. Computer system and inter-bus control circuit
US5822317A (en) * 1995-09-04 1998-10-13 Hitachi, Ltd. Packet multiplexing transmission apparatus
US5819112A (en) * 1995-09-08 1998-10-06 Microsoft Corporation Apparatus for controlling an I/O port by queuing requests and in response to a predefined condition, enabling the I/O port to receive the interrupt requests
DE19535546B4 (en) * 1995-09-25 2004-04-08 Siemens Ag Method for operating a real-time computer system controlled by a real-time operating system
US5671365A (en) * 1995-10-20 1997-09-23 Symbios Logic Inc. I/O system for reducing main processor overhead in initiating I/O requests and servicing I/O completion events
US5606703A (en) * 1995-12-06 1997-02-25 International Business Machines Corporation Interrupt protocol system and method using priority-arranged queues of interrupt status block control data structures
US5892956A (en) * 1995-12-19 1999-04-06 Advanced Micro Devices, Inc. Serial bus for transmitting interrupt information in a multiprocessing system
US5913045A (en) 1995-12-20 1999-06-15 Intel Corporation Programmable PCI interrupt routing mechanism
US5771387A (en) * 1996-03-21 1998-06-23 Intel Corporation Method and apparatus for interrupting a processor by a PCI peripheral across an hierarchy of PCI buses
JP2809187B2 (en) * 1996-04-15 1998-10-08 日本電気株式会社 Interrupt line sharing circuit and interrupt line sharing method
US6108741A (en) * 1996-06-05 2000-08-22 Maclaren; John M. Ordering transactions
US5881296A (en) 1996-10-02 1999-03-09 Intel Corporation Method for improved interrupt processing in a computer system
US6021456A (en) 1996-11-12 2000-02-01 Herdeg; Glenn Arthur Method for communicating interrupt data structure in a multi-processor computer system
KR100218675B1 (en) * 1996-12-04 1999-09-01 정선종 Method and apparatus of multiple interrupt control in intellectual priority determine mode
US6078997A (en) * 1996-12-09 2000-06-20 Intel Corporation Directory-based coherency system for maintaining coherency in a dual-ported memory system
US5848279A (en) * 1996-12-27 1998-12-08 Intel Corporation Mechanism for delivering interrupt messages
DE19720446A1 (en) * 1997-05-15 1998-11-19 Siemens Ag Snap detection circuit for a phase locked loop
JPH10322200A (en) * 1997-05-21 1998-12-04 Mitsubishi Electric Corp Phase lock detecting circuit
US6023743A (en) * 1997-06-10 2000-02-08 International Business Machines Corporation System and method for arbitrating interrupts on a daisy chained architected bus
US6035376A (en) * 1997-10-21 2000-03-07 Apple Computer, Inc. System and method for changing the states of directory-based caches and memories from read/write to read-only
JP3347036B2 (en) * 1997-10-29 2002-11-20 東芝情報システム株式会社 Analog PLL circuit, semiconductor device, and oscillation control method for voltage controlled oscillator
US6275888B1 (en) * 1997-11-19 2001-08-14 Micron Technology, Inc. Method for configuring peer-to-peer bus bridges in a computer system using shadow configuration registers
US6418496B2 (en) * 1997-12-10 2002-07-09 Intel Corporation System and apparatus including lowest priority logic to select a processor to receive an interrupt message
US5956516A (en) * 1997-12-23 1999-09-21 Intel Corporation Mechanisms for converting interrupt request signals on address and data lines to interrupt message signals
US6105085A (en) * 1997-12-26 2000-08-15 Emc Corporation Lock mechanism for shared resources having associated data structure stored in common memory include a lock portion and a reserve portion
US6163829A (en) * 1998-04-17 2000-12-19 Intelect Systems Corporation DSP interrupt control for handling multiple interrupts
US6173351B1 (en) * 1998-06-15 2001-01-09 Sun Microsystems, Inc. Multi-processor system bridge
US6065088A (en) * 1998-08-31 2000-05-16 International Business Machines Corporation System and method for interrupt command queuing and ordering
US6253275B1 (en) * 1998-11-25 2001-06-26 Advanced Micro Devices, Inc. Interrupt gating method for PCI bridges
US6480918B1 (en) * 1998-12-22 2002-11-12 International Business Machines Corporation Lingering locks with fairness control for multi-node computer systems
US6442631B1 (en) * 1999-05-07 2002-08-27 Compaq Information Technologies Group, L.P. Allocating system resources based upon priority
US6389526B1 (en) * 1999-08-24 2002-05-14 Advanced Micro Devices, Inc. Circuit and method for selectively stalling interrupt requests initiated by devices coupled to a multiprocessor system
US6604161B1 (en) * 1999-09-29 2003-08-05 Silicon Graphics, Inc. Translation of PCI level interrupts into packet based messages for edge event drive microprocessors
US6532501B1 (en) * 1999-09-30 2003-03-11 Silicon Graphics, Inc. System and method for distributing output queue space
US6629252B1 (en) * 1999-10-28 2003-09-30 International Business Machines Corporation Method for determining if a delay required before proceeding with the detected interrupt and exiting the interrupt without clearing the interrupt
US6629179B1 (en) * 2000-07-31 2003-09-30 Adaptec, Inc. Message signaled interrupt generating device and method

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1308869C (en) * 2003-04-28 2007-04-04 国际商业机器公司 Non-disruptive, dynamic hot-plug and hot-remove system and method
US20070283070A1 (en) * 2004-11-30 2007-12-06 Grasso Lawrence J Multiple Host Support For Remote Expansion Apparatus
US8984202B2 (en) 2004-11-30 2015-03-17 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Multiple host support for remote expansion apparatus
US8484398B2 (en) * 2004-11-30 2013-07-09 International Business Machines Corporation Multiple host support for remote expansion apparatus
US20060117124A1 (en) * 2004-11-30 2006-06-01 Grasso Lawrence J Multiple host support for remote expansion apparatus
US20080040551A1 (en) * 2005-12-07 2008-02-14 Microsoft Corporation Cache metadata identifiers for isolation and sharing
US8001538B2 (en) * 2005-12-07 2011-08-16 Microsoft Corporation Software accessible cache metadata
US8225297B2 (en) 2005-12-07 2012-07-17 Microsoft Corporation Cache metadata identifiers for isolation and sharing
US20070245309A1 (en) * 2005-12-07 2007-10-18 Microsoft Corporation Software accessible cache metadata
US8813052B2 (en) 2005-12-07 2014-08-19 Microsoft Corporation Cache metadata for implementing bounded transactional memory
US20070245099A1 (en) * 2005-12-07 2007-10-18 Microsoft Corporation Cache metadata for implementing bounded transactional memory
US8898652B2 (en) 2006-03-23 2014-11-25 Microsoft Corporation Cache metadata for accelerating software transactional memory
US20070245128A1 (en) * 2006-03-23 2007-10-18 Microsoft Corporation Cache metadata for accelerating software transactional memory
US20160179668A1 (en) * 2014-05-28 2016-06-23 Mediatek Inc. Computing system with reduced data exchange overhead and related data exchange method thereof
CN105874431A (en) * 2014-05-28 2016-08-17 联发科技股份有限公司 Computing system with reduced data exchange overhead and related data exchange method thereof
US20160140051A1 (en) * 2014-11-14 2016-05-19 Cavium, Inc. Translation lookaside buffer invalidation suppression
US20160140040A1 (en) * 2014-11-14 2016-05-19 Cavium, Inc. Filtering translation lookaside buffer invalidations
US9684606B2 (en) * 2014-11-14 2017-06-20 Cavium, Inc. Translation lookaside buffer invalidation suppression
US9697137B2 (en) * 2014-11-14 2017-07-04 Cavium, Inc. Filtering translation lookaside buffer invalidations
US10089150B2 (en) 2015-02-03 2018-10-02 Alibaba Group Holding Limited Apparatus, device and method for allocating CPU resources
US10268247B2 (en) 2015-12-11 2019-04-23 Samsung Electronics Co., Ltd. Thermal management of spatially dispersed operation processors
US10545562B2 (en) 2016-07-05 2020-01-28 Samsung Electronics Co., Ltd. Electronic device and method for operating the same
US11327918B2 (en) * 2018-06-29 2022-05-10 Intel Corporation CPU hot-swapping
US10977183B2 (en) * 2018-12-11 2021-04-13 International Business Machines Corporation Processing a sequence of translation entry invalidation requests with regard to draining a processor core

Also Published As

Publication number Publication date
US20020042856A1 (en) 2002-04-11
US20020091891A1 (en) 2002-07-11
US7024509B2 (en) 2006-04-04
US7065169B2 (en) 2006-06-20
US6920516B2 (en) 2005-07-19
US20020046384A1 (en) 2002-04-18

Similar Documents

Publication Publication Date Title
US20020087614A1 (en) Programmable tuning for flow control and support for CPU hot plug
US6633967B1 (en) Coherent translation look-aside buffer
US5524235A (en) System for arbitrating access to memory with dynamic priority assignment
US5878268A (en) Multiprocessing system configured to store coherency state within multiple subnodes of a processing node
US7047374B2 (en) Memory read/write reordering
US5805839A (en) Efficient technique for implementing broadcasts on a system of hierarchical buses
US5754877A (en) Extended symmetrical multiprocessor architecture
EP0817073B1 (en) A multiprocessing system configured to perform efficient write operations
JP3765586B2 (en) Multiprocessor computer system architecture.
US6622214B1 (en) System and method for maintaining memory coherency in a computer system having multiple system buses
US5859988A (en) Triple-port bus bridge
US6330630B1 (en) Computer system having improved data transfer across a bus bridge
US6353877B1 (en) Performance optimization and system bus duty cycle reduction by I/O bridge partial cache line write
US6553446B1 (en) Modular input/output controller capable of routing packets over busses operating at different speeds
US5398325A (en) Methods and apparatus for improving cache consistency using a single copy of a cache tag memory in multiple processor computer systems
EP1010090B1 (en) Multiprocessing computer system employing a cluster protection mechanism
US6826653B2 (en) Block data mover adapted to contain faults in a partitioned multiprocessor system
US20090037614A1 (en) Offloading input/output (I/O) virtualization operations to a processor
US5269005A (en) Method and apparatus for transferring data within a computer system
US6157980A (en) Cache directory addressing scheme for variable cache sizes
KR20050005553A (en) Memory hub with internal cache and/or memory access prediction
US6546465B1 (en) Chaining directory reads and writes to reduce DRAM bandwidth in a directory based CC-NUMA protocol
JP5265827B2 (en) Hybrid coherence protocol
US20090006777A1 (en) Apparatus for reducing cache latency while preserving cache bandwidth in a cache subsystem of a processor
EP0817095B1 (en) Extended symmetrical multiprocessor architecture

Legal Events

Date Code Title Description
AS Assignment
Owner name: COMPAQ INFORMATION TECHNOLOGIES GROUP, L.P., TEXAS
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KOCEV, ANDREJ;DUNCAN, SAMUEL H.;HO, STEVEN;REEL/FRAME:012137/0993
Effective date: 20010830

AS Assignment
Owner name: COMPAQ INFORMATION TECHNOLOGIES GROUP, L.P., TEXAS
Free format text: CORRECTIVE ASSIGNMENT TO REMOVE 3RD ASSIGNOR'S NAME THAT WAS INADVERTENTLY INCLUDED ON PREVIOUSLY RECORDED ASSIGNMENT, REEL;ASSIGNORS:KOCEV, ANDREJ;DUNCAN, SAMUEL H.;REEL/FRAME:012375/0317
Effective date: 20010830

AS Assignment
Owner name: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P., TEXAS
Free format text: CHANGE OF NAME;ASSIGNOR:COMPAQ INFORMATION TECHNOLOGIES GROUP, L.P.;REEL/FRAME:016764/0921
Effective date: 20021001

STCB Information on status: application discontinuation
Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION