WO1999066417A9 - Bus controller with cycle termination monitor

Bus controller with cycle termination monitor

Info

Publication number
WO1999066417A9
Authority
WO
WIPO (PCT)
Prior art keywords
bus
signal
component
response
fake
Application number
PCT/US1999/013190
Other languages
French (fr)
Other versions
WO1999066417A1 (en)
Inventor
Paul J Garnett
Original Assignee
Sun Microsystems Inc
Application filed by Sun Microsystems Inc filed Critical Sun Microsystems Inc
Priority to JP2000555174A priority Critical patent/JP2002518745A/en
Priority to DE69900990T priority patent/DE69900990T2/en
Priority to EP99928573A priority patent/EP1086431B1/en
Publication of WO1999066417A1 publication Critical patent/WO1999066417A1/en
Publication of WO1999066417A9 publication Critical patent/WO1999066417A9/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4004 Coupling between buses
    • G06F13/4027 Coupling between buses using bus bridges

Definitions

  • This invention relates to a bus controller for a computer system such as a multi-processor system, for example, in which first and second processing sets (each of which may comprise one or more processors) communicate with an I/O device bus via a bridge.
  • the application finds particular but not exclusive application to fault tolerant computer systems where two or more processor sets need to communicate in lockstep with an I/O device bus via a bridge.
  • an aim is not only to be able to identify faults, but also to provide a structure which is able to provide a high degree of system availability and system resilience to internal or external disturbances,
  • internal disturbances such as a processor failure or a bridge failure, for example.
  • Automatic access control provides significant technical challenges in that the system has not only to monitor the devices in question to detect errors, but also has to provide an environment where the system as a whole can continue to operate despite a failure of one or more of the system components.
  • the controller must also deal with any outstanding requests issued by components of the computer system, and this is typically problematic as bus protocols, such as PCI for example, typically do not support error termination at all stages of operation.
  • an aim of the present invention is to address these technical problems.
  • a bus control mechanism for a computer system that includes a bus, a first component and a second component, wherein the first and second components are interconnected via the bus for performing a data transfer operation, the data transfer operation being initiated by an exchange of request and response signals between the first and second components, and a component that issued a request signal is operable to effect data transfer on receipt of a response signal,
  • the bus control mechanism comprising: a switch selectively operable to disable the bus, a fake response generator selectively operable to generate a fake response signal, and a controller operable to monitor the request and response signals exchanged between the components and, in situations where a corresponding response signal is not issued within a predetermined time following a particular request signal, to cause the switch to disable the bus and the fake response generator to issue a fake response signal.
  • embodiments of the present invention can provide a bus controller which monitors request and response signals on the bus. Where a response signal is not provided within a predetermined period following a request signal, the controller causes a switch to disable the bus and causes a fake response signal to be supplied to the initiator of the request. As a result, locking up of the system may be prevented (a sketch of such a monitor follows this summary).
  • the switch isolates first and second parts of the bus, so that data signals cannot reach the target component
  • the component that issued the particular request signal can be operable, on receipt of the fake response signal, to transfer data to the bus, which data is thus discarded as a result of the disabling of the bus by the switch.
  • the switch may be a FET
  • the computer system can include a clock signal generator, and the terminator includes a counter for counting clock signals between detection of a particular request signal and detection of a corresponding response signal, the request being terminated if a response has not been detected within a predetermined number of clock cycles.
  • the first component may be a processing set comprising one or more processors, and the second component may be a bridge for directing signals from the processing set to one or more system resources and from the one or more resources to the processing set.
  • the bus may be a PCI bus
  • the terminator may be arranged to terminate a request from the functioning component only.
  • the terminator may be arranged to terminate requests from either component
  • the request signal may be an IRDY# signal and the response signal may be a TRDY# signal
  • a computer system comprising a bus, a first component and a second component interconnected via the bus for performing a data transfer operation, the data transfer operation being initiated by an exchange of request and response signals, wherein a component that initiates a request signal is operable to effect data transfer upon receipt of a response signal, and a bus control mechanism that comprises a switch selectively operable to disable the bus, a fake response generator selectively operable to generate a fake response signal, and a controller operable to monitor the request and response signals exchanged between the first and second components and, in situations where a corresponding response signal is not issued within a predetermined time following a particular request signal, to cause the switch to disable the bus and to cause the fake response generator to issue a fake response signal to the component that issued the particular request signal for terminating the data transfer operation.
  • a method of controlling a bus of a computer system including a first component and a second component interconnected via the bus for performing a data transfer operation, wherein the data transfer operation is initiated by an exchange of request and response signals, wherein a component which initiated a request signal is operable to effect data transfer upon receipt of a response signal, the method comprising monitoring the request signal on the bus, timing a period following the request signal and, in the absence of a corresponding response signal within the period, disabling the bus and issuing a fake response signal to the component which initiated the request to thereby terminate the data transfer operation.
  • the method may further comprise disabling the bus if a response to the request has not issued within the predetermined period of time.
  • the method may further comprise issuing a fake response to terminate the request.
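  • The claims above describe the cycle termination monitor in hardware terms; the following is a minimal C sketch of the same behaviour, assuming a per-bus-clock sampling model. The threshold value and the bus_switch_disable/issue_fake_response hooks are hypothetical stand-ins for the bridge's FET switch and fake response generator, which in practice would be implemented as hardware logic rather than software.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timeout threshold in bus clock cycles. */
#define RESPONSE_TIMEOUT_CYCLES 256

struct cycle_monitor {
    uint32_t counter; /* clocks elapsed since the request was observed */
    bool     waiting; /* a request is outstanding without a response   */
};

/* Stand-ins for the bridge hardware: a switch (e.g. a FET) that
 * isolates the two parts of the bus, and a fake response generator
 * that releases the stalled initiator. */
extern void bus_switch_disable(void);
extern void issue_fake_response(void);

/* Called once per bus clock with the sampled request (e.g. IRDY#)
 * and response (e.g. TRDY#) signals, treated as active-high here
 * for clarity. */
void monitor_tick(struct cycle_monitor *m, bool request, bool response)
{
    if (response) {
        /* Normal termination: the target answered in time. */
        m->waiting = false;
        m->counter = 0;
        return;
    }
    if (request && !m->waiting) {
        /* Start timing a newly observed request. */
        m->waiting = true;
        m->counter = 0;
    } else if (m->waiting && ++m->counter >= RESPONSE_TIMEOUT_CYCLES) {
        /* No response within the predetermined period: isolate the
         * bus so the data phase cannot reach the target, then fake
         * a response so that the initiator completes and its data
         * is harmlessly discarded. */
        bus_switch_disable();
        issue_fake_response();
        m->waiting = false;
        m->counter = 0;
    }
}
```

BRIEF DESCRIPTION OF THE DRAWINGS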
  • Figure 1 is a schematic overview of a fault tolerant computer system incorporating an embodiment of the invention
  • Figure 2 is a schematic overview of a specific implementation of a system based on that of Figure 1 ;
  • Figure 3 is a schematic representation of one implementation of a processing set
  • Figure 4 is a schematic representation of another example of a processing set
  • Figure 5 is a schematic representation of a further processing set
  • Figure 6 is a schematic block diagram of an embodiment of a bridge for the system of Figure 1 ;
  • Figure 7 is a schematic block diagram of storage for the bridge of Figure 6;
  • Figure 8 is a schematic block diagram of control logic of the bridge of Figure 6;
  • Figure 9 is a schematic representation of a routing matrix of the bridge of Figure 6;
  • Figure 10 is an example implementation of the bridge of Figure 6;
  • Figure 11 is a state diagram illustrating operational states of the bridge of Figure 6;
  • Figure 12 is a flow diagram illustrating stages in the operation of the bridge of Figure 6;
  • Figure 13 is a detail of a stage of operation from Figure 12;
  • Figure 14 illustrates the posting of I/O cycles in the system of Figure 1;
  • Figure 15 illustrates the data stored in a posted write buffer
  • Figure 16 is a schematic representation of a slot response register
  • Figure 17 illustrates a dissimilar data write stage
  • Figure 18 illustrates a modification to Figure 17
  • Figure 19 illustrates a dissimilar data read stage
  • Figure 20 illustrates an alternative dissimilar data read stage
  • Figure 21 is a flow diagram summarising the operation of a dissimilar data write mechanism
  • Figure 22 is a schematic block diagram explaining arbitration within the system of Figure 1 ;
  • Figure 23 is a state diagram illustrating the operation of a device bus arbiter
  • Figure 24 is a state diagram illustrating the operation of a bridge arbiter
  • Figure 25 is a timing diagram for PCI signals
  • Figure 26 is a schematic diagram illustrating the operation of the bridge of Figure 6 for direct memory access
  • Figure 27 is a flow diagram illustrating a direct memory access method in the bridge of Figure 6;
  • Figure 28 is a flow diagram of a re-integration process including the monitoring of a dirty RAM
  • Figure 29 is a schematic representation of a bus timeout mechanism
  • Figures 30a, 30b, 30c and 30d are schematic representations of PCI protocol signals;
  • Figure 31 is a schematic representation of the termination of a bus signal;
  • Figure 32 is a schematic representation of an alternative implementation of the mechanism of Figure 29;
  • Figure 33 is a schematic representation of another implementation of the mechanism of Figure 29.

DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • FIG 1 is a schematic overview of a fault tolerant computing system 10 comprising a plurality of CPUsets (processing sets) 14 and 16 and a bridge 12. As shown in Figure 1, there are two processing sets 14 and 16, although in other embodiments there may be three or more processing sets.
  • the bridge 12 forms an interface between the processing sets and I/O devices such as devices 28, 29, 30, 31 and 32.
  • the term "processing set" is used to denote a group of one or more processors, possibly including memory, which output and receive common outputs and inputs.
  • the term "CPUset" could be used instead, and these terms could be used interchangeably throughout this document.
  • the term "bridge" is used to denote any device, apparatus or arrangement suitable for interconnecting two or more buses of the same or different types.
  • the first processing set 14 is connected to the bridge 12 via a first processing set I/O bus (PA bus) 24, in the present instance a Peripheral Component Interconnect (PCI) bus.
  • the second processing set 16 is connected to the bridge 12 via a second processing set I/O bus (PB bus) 26 of the same type as the PA bus 24 (i.e., here a PCI bus).
  • the I/O devices are connected to the bridge 12 via a device I/O bus (D bus) 22, in the present instance also a PCI bus.
  • although buses 22, 24 and 26 are all PCI buses here, this is merely by way of example, and in other embodiments other bus protocols may be used and the D bus 22 may have a different protocol from that of the PA bus and the PB bus (P buses) 24 and 26.
  • the processing sets 14 and 16 and the bridge 12 are operable in synchronism under the control of a common clock 20, which is connected thereto by clock signal lines 21.
  • Some of the devices, including an Ethernet (E-NET) interface 28 and a Small Computer System Interface (SCSI) interface 29, are permanently connected to the device bus 22, but other I/O devices such as I/O devices 30, 31 and 32 can be hot-inserted into individual switched slots 33, 34 and 35. Dynamic field effect transistor (FET) switching can be provided for the slots 33, 34 and 35 to enable hot insertability of the devices such as devices 30, 31 and 32.
  • the provision of the FETs enables an increase in the length of the D bus 22, as only those devices which are active are switched on, reducing the effective total bus length. It will be appreciated that the number of I/O devices which may be connected to the D bus 22, and the number of slots provided for them, can be adjusted according to a particular implementation in accordance with specific design requirements (a register-level sketch of the slot switching follows).
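  • A minimal sketch of the slot switching follows, assuming a memory-mapped FET-enable register with one bit per switched slot; the register address and bit assignments are hypothetical.

```c
#include <stdint.h>

/* Hypothetical memory-mapped register with one FET-enable bit per
 * switched slot (slots 33, 34 and 35 of Figure 1 might map to bits
 * 0..2); the address is illustrative only. */
#define FET_ENABLE_REG (*(volatile uint32_t *)0x80000010u)

/* Switch a hot-inserted device onto the D bus 22. */
static inline void slot_fet_enable(unsigned slot_bit)
{
    FET_ENABLE_REG |= (1u << slot_bit);
}

/* Isolate a slot again, e.g. before the device is removed, keeping
 * the effective bus length down while the slot is inactive. */
static inline void slot_fet_disable(unsigned slot_bit)
{
    FET_ENABLE_REG &= ~(1u << slot_bit);
}
```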
  • FIG 2 is a schematic overview of a particular implementation of a fault tolerant computer employing a bridge structure of the type illustrated in Figure 1.
  • the fault tolerant computer system includes a plurality (here four) of bridges 12 on first and second I/O motherboards (MB 40 and MB 42) in order to increase the number of I/O devices which may be connected and also to improve reliability and redundancy.
  • two processing sets 14 and 16 are each provided on a respective processing set board 44 and 46, with the processing set boards 44 and 46 'bridging' the I/O motherboards MB 40 and MB 42.
  • a first, master clock source 20A is mounted on the first motherboard 40 and a second, slave clock source 20B is mounted on the second motherboard 42.
  • Clock signals are supplied to the processing set boards 44 and 46 via respective connections (not shown in Figure 2)
  • First and second bridges 12.1 and 12.2 are mounted on the first I/O motherboard 40.
  • the first bridge 12.1 is connected to the processing sets 14 and 16 by P buses 24.1 and 26.1, respectively.
  • the second b ⁇ dge 12.2 is connected to the processing sets 14 and 16 by P buses 24.2 and 26.2, respectively.
  • the bridge 12.1 is connected to an I/O databus (D bus) 22.1 and the bridge 12.2 is connected to an I/O databus (D bus) 22.2.
  • Third and fourth bridges 12.3 and 12.4 are mounted on the second I/O motherboard 42.
  • the bridge 12.3 is connected to the processing sets 14 and 16 by P buses 24.3 and 26.3, respectively.
  • the bridge 12.4 is connected to the processing sets 14 and 16 by P buses 24.4 and 26.4, respectively.
  • the bridge 12.3 is connected to an I/O databus (D bus) 22.3 and the bridge 12.4 is connected to an I/O databus (D bus) 22.4.
  • FIG. 3 is a schematic overview of one possible configuration of a processing set, such as the processing set 14 of Figure 1.
  • the processing set 16 could have the same configuration.
  • a plurality of processors (here four) 52 are connected by one or more buses 54 to a processing set bus controller 50.
  • one or more processing set output buses 24 are connected to the processing set bus controller 50, each processing set output bus 24 being connected to a respective bridge 12.
  • FIG 4 is an alternative configuration of a processing set, such as the processing set 14 of Figure 1
  • a plurality of processor/memory groups 61 are connected to a common internal bus 64.
  • Each processor/memory group 61 includes one or more processors 62 and associated memory 66 connected to an internal group bus 63.
  • An interface 65 connects the internal group bus 63 to the common internal bus 64. Accordingly, in the arrangement shown in Figure 4, the individual processing groups, each with its processors 62 and associated memory 66, are connected via the common internal bus 64 to a processing set bus controller 60.
  • the interfaces 65 enable a processor 62 of one processing group to operate not only on the data in its local memory 66, but also in the memory of another processing group 61 within the processing set 14.
  • the processing set bus controller 60 provides a common interface between the common internal bus 64 and the processing set I/O bus(es) (P bus(es)) 24 connected to the bridge(s) 12. It should be noted that although only two processing groups 61 are shown in Figure 4, it will be appreciated that such a structure is not limited to this number of processing groups.
  • FIG. 5 illustrates an alternative configuration of a processing set, such as the processing set 14 of Figure 1.
  • a simple processing set includes a single processor 72 and associated memory 76 connected via a common bus 74 to a processing set bus controller 70.
  • the processing set bus controller 70 provides an interface between the internal bus 74 and the processing set I/O bus(es) (P bus(es)) 24 for connection to the bridge(s) 12.
  • the processing set may have many different forms and that the particular choice of a particular processing set structure can be made on the basis of the processing requirement of a particular application and the degree of redundancy required.
  • in the following description it is assumed that the processing sets 14 and 16 referred to have a structure as shown in Figure 3, although it will be appreciated that another form of processing set could be provided.
  • the bridge(s) 12 are operable in a number of operating modes. These modes of operation will be described in more detail later. However, to assist in a general understanding of the structure of the bridge, the two operating modes will be briefly summarized here.
  • in a first, combined mode, a bridge 12 is operable to route addresses and data between the processing sets 14 and 16 (via the PA and PB buses 24 and 26, respectively) and the devices (via the D bus 22).
  • in this combined mode, I/O cycles generated by the processing sets 14 and 16 are compared to ensure that both processing sets are operating correctly. Comparison failures force the bridge 12 into an error limiting mode (EState) in which device I/O is prevented and diagnostic information is collected.
  • in a second, split mode, the bridge 12 routes and arbitrates addresses and data from one of the processing sets 14 and 16 onto the D bus 22 and/or onto the other one of the processing sets 16 and 14, respectively.
  • in this mode, the processing sets 14 and 16 are not synchronized and no I/O comparisons are made.
  • FIG. 6 is a schematic functional overview of the bridge 12 of Figure 1.
  • First and second processing set I/O bus interfaces, PA bus interface 84 and PB bus interface 86, are connected to the PA and PB buses 24 and 26, respectively.
  • a device I/O bus interface, D bus interface 82, is connected to the D bus 22.
  • the PA, PB and D bus interfaces need not be configured as separate elements but could be incorporated in other elements of the bridge.
  • this does not require the presence of a specific separate component, but rather the capability of the bridge to connect to the bus concerned, for example by means of physical or logical bridge connections for the lines of the buses concerned. Routing logic (hereinafter termed a routing matrix) 80 is connected via a first internal path 94 to the PA bus interface 84 and via a second internal path 96 to the PB bus interface 86.
  • each of the P buses (PA bus 24 and PB bus 26) operates under a PCI protocol.
  • the processing set bus controllers 50 (see Figure 3) also operate under the PCI protocol.
  • the PA and PB bus interfaces 84 and 86 each provide all the functionality required for a compatible interface providing both master and slave operation for data transferred to and from the D bus 22 or the internal memories and registers of the bridge in the storage subsystem 90.
  • the bus interfaces 84 and 86 can provide diagnostic information to internal bridge status registers in the storage subsystem 90 on transition of the bridge to an error state (EState) or on detection of an I/O error.
  • the device bus interface 82 performs all the functionality required for a PCI compliant master and slave interface for transferring data to and from one of the PA and PB buses 24 and 26.
  • the D bus interface 82 is operable during direct memory access (DMA) transfers to provide diagnostic information to internal status registers in the storage subsystem 90 of the bridge on transition to an EState or on detection of an I/O error.
  • Figure 7 illustrates in more detail the bridge registers 110 and the SRAM 126.
  • the storage control logic 90 is connected via a path (e.g. a bus) 112 to a number of register components 114, 116, 118, 120.
  • the storage control logic is also connected via a path (e.g. a bus) 128 to the SRAM 126, in which a posted write buffer component 122 and a dirty RAM component 124 are mapped.
  • although a particular configuration of the components 114, 116, 118, 120, 122 and 124 is shown in Figure 7, these components may be configured in other ways, with other components defined as regions of a common memory (e.g. a random access memory such as the SRAM 126, with the path 112/128 being formed by the internal addressing of the regions of memory).
  • the posted write buffer 122 and the dirty RAM 124 are mapped to different regions of the SRAM memory 126.
  • the registers 114, 116, 118 and 120 are configured as separate from the SRAM
  • Control and status registers (CSRs) 114 form internal registers which allow the control of various operating modes of the bridge, allow the capture of diagnostic information for an EState and for I/O errors, and control processing set access to PCI slots and devices connected to the D bus 22. These registers are set by signals from the routing matrix 80.
  • Dissimilar data registers (DDRs) 116 provide locations for containing dissimilar data for different processing sets to enable non-deterministic data events to be handled. These registers are set by signals from the PA and PB buses.
  • Bridge decode logic enables a common write to disable a data comparator and allow writes to the two DDRs 116, one for each processing set 14 and 16.
  • a selected one of the DDRs can then be read in-sync by the processing sets 14 and 16.
  • the DDRs thus provide a mechanism enabling a location to be reflected from one processing set (14/16) to another (16/14).
  • Slot response registers (SRRs) 118 determine ownership of the device slots on the D bus 22, and are described in more detail later.
  • Disconnect registers 120 are used for the storage of data phases of an I/O cycle which is aborted while data is in the bridge on the way to another bus.
  • the disconnect registers 120 receive all data queued in the bridge when a target device disconnects a transaction, or when the EState is detected.
  • These registers are connected to the routing matrix 80.
  • the routing matrix can queue up to three data words and byte enables. Provided the initial addresses are voted as being equal, address target controllers derive addresses which increment as data is exchanged between the bridge and the destination (or target).
  • the writer may be, for example, a processor performing an I/O write, or a DVMA (D bus to P bus access).
  • EState and error CSRs 114 are provided for the capture of a failing cycle on the P buses 24 and 26, with an indication of the failing datum. Following a move to an EState, all of the writes initiated to the P buses are logged in the posted write buffer 122. These may be other writes that have been posted in the processing set bus controllers 50, or which may be initiated by software before an EState interrupt causes the processors to stop carrying out writes to the P buses 24 and 26.
  • a dirty RAM 124 is used to indicate which pages of the main memory 56 of the processing sets 14 and 16 have been modified by direct memory access (DMA) transactions from one or more devices on the D bus 22.
  • Each page (e.g. each 8K page) is marked by a single bit in the dirty RAM 124 which is set when a DMA write occurs, and which can be cleared by a read and clear cycle initiated on the dirty RAM 124 by a processor 52 of a processing set 14 or 16.
  • FIG. 8 is a schematic functional overview of the bridge control logic 88 shown in Figure 6.
  • the address decoding performed by the address decode logic 136 and 138 essentially permits four basic access types, including an out-of-sync access (i.e. not in the combined mode) by one processing set (e.g. processing set 14 of Figure 1).
  • with geographic addressing, slot 0 on motherboard A has the same address when referred to by processing set 14 or by processing set 16.
  • Geographic addressing is used in combination with the PCI slot FET switching.
  • separate device select signals are provided for devices which are not FET isolated.
  • a single device select signal can be provided for the switched PCI slots as the FET signals can be used to enable a correct card.
  • Separate FET switch lines are provided to each slot for separately switching the FETs for the slots.
  • the SRRs 118, which could be incorporated in the CSR registers 114, are associated with the address decode functions.
  • the SRRs 118 serve in a number of different roles which will be described in more detail later. However, some of the roles are summarized here.
  • each slot may be disabled so that writes are simply acknowledged without any transaction occurring on the device bus 22, whereby the data is lost. Reads will return meaningless data, once again without causing a transaction on the device board.
  • each slot can be in one of three states.
  • the states are:
  • When a processing set 14 or 16 is powered off, all slots owned by it move to the un-owned state. A processing set 14 or 16 can only claim an un-owned slot; it cannot wrest ownership away from another processing set. This can only be done by powering off the other processing set, or by getting the other processing set to relinquish ownership.
  • the ownership bits are assessable and settable while in the combined mode, but have no effect until a split state is entered. This allows the configuration of a split system to be determined while still in the combined mode.
  • Each PCI device is allocated an area of the processing set address map
  • the top bits of the address are determined by the PCI slot.
  • the bridge is able to check that the device is using the correct address because a D bus arbiter informs the bridge which device is using the bus at a particular time. If a device access is to a processing set address which is not valid for it, then the device access will be ignored. It should be noted that an address presented by a device will be a virtual address which would be translated by an I/O memory management unit in the processing set bus controller 50 to an actual memory address.
  • the addresses output by the address decoders are passed via the initiator and target controllers 138 and 140.
  • An arbiter 134 is operable in various different modes to arbitrate for use of the bridge on a first-come-first-served basis using conventional PCI bus signals on the P and D buses.
  • the arbiter 134 is operable to arbitrate between the in-sync processing sets 14 and 16 and any initiators on the device bus 22 for use of the bridge 12. Possible scenarios are:
  • both processing sets 14 and 16 must arbitrate for use of the bridge, and thus for access to the device bus 22 and internal bridge registers (e.g. CSR registers 114).
  • the bridge 12 must also contend with initiators on the device bus 22 for use of that device bus 22.
  • Each slot on the device bus has an arbitration enable bit associated with it
  • These arbitration enable bits are cleared after reset and must be set to allow a slot to request a bus
  • the arbitration enable bit for that device is automatically reset by the bridge
  • a PCI bus interface in the processing set bus controller(s) 50 expects to be the master bus controller for the P bus concerned; that is, it contains the PCI bus arbiter for the PA or PB bus to which it is connected.
  • the bridge 12 cannot directly control access to the PA and PB buses 24 and 26.
  • the bridge 12 competes for access to the PA or PB bus with the processing set on the bus concerned under the control of the bus controller 50 on the bus concerned.
  • Also shown in Figure 8 are a comparator 130 and a bridge controller 132.
  • the comparator 130 is operable to compare I/O cycles from the processing sets 14 and 16 to determine any out-of-sync events. On determining an out-of-sync event, the comparator 130 notifies the bridge controller 132.
  • FIG 9 is a schematic functional overview of the routing matrix 80.
  • the routing matrix 80 comprises a multiplexer 143 which is responsive to initiator control signals 98 from the initiator controller 138 of Figure 8 to select one of the PA bus path 94, PB bus path 96, D bus path 92 or internal bus path 100 as the current input to the routing matrix.
  • Separate output buffers 144, 145, 146 and 147 are provided for output to each of the paths 94, 96, 92 and 100, with those buffers being selectively enabled by signals 99 from the target controller 140 of Figure 8.
  • Between the multiplexer and the buffers 144-147, signals are held in a buffer 149.
  • three cycles of data for an I/O cycle will be held in the pipeline represented by the multiplexer 143, the buffer 149 and the buffers 144-147.
  • FIG. 10 is a schematic representation of a physical configuration of the bridge in which the bridge control logic 88, the storage control logic 90 and the bridge registers 110 are implemented in a first field programmable gate array (FPGA) 89, the routing matrix 80 is implemented in further FPGAs 80.1 and 80.2, and the SRAM 126 is implemented as one or more separate SRAMs addressed by address and control lines 127.
  • the bus interfaces 82, 84 and 86 shown in Figure 6 are not separate elements, but are integrated in the FPGAs 80.1, 80.2 and 89.
  • Two FPGAs 80.1 and 80.2 are used for the upper 32 bits (32-63) of a 64 bit PCI bus and the lower 32 bits (0-31) of the 64 bit PCI bus. It will be appreciated that a single FPGA could be employed for the routing matrix 80 where the necessary logic can be accommodated in a single device.
  • the FPGAs 89, 80.1 and 80.2 and the SRAM 126 are connected via internal bus paths 85 and path control lines 87.
  • FIG 11 is a transition diagram illustrating in more detail the various operating modes of the bridge.
  • the bridge operation can be divided into three basic modes, namely an error state (EState) mode 150, a split state mode 156 and a combined state mode 158.
  • the EState mode 150 can be further divided into two states.
  • while the bridge is in the initial EState 152, all writes are stored in the posted write buffer 122, reads from the internal bridge registers (e.g., the CSR registers 114) are allowed, and all other reads are treated as errors (i.e. they are aborted).
  • the individual processing sets 14 and 16 perform evaluations for determining a restart time. Each processing set 14 and 16 will determine its own restart timer timing. The timer setting depends on a "blame" factor for the transition to the EState.
  • a processing set which determines that it is likely to have caused the error sets a long time for the timer; a processing set which thinks it unlikely to have caused the error sets a short time for the timer.
  • the bridge then moves (155) to the split state 156.
  • in the split state 156, access to the device bus 22 is controlled by the SRR registers 118, while access to the bridge storage is simply arbitrated.
  • the primary status of the processing sets 14 and 16 is ignored. Transition to a combined operation is achieved by means of a sync_reset (157).
  • the bridge is then operable in the combined state 158, whereby all read and write accesses on the D bus 22 and the PA and PB buses 24 and 26 are allowed. All such accesses on the PA and PB buses 24 and 26 are compared in the comparator 130. Detection of a mismatch between any read and write cycles (with an exception of specific dissimilar data I/O cycles) causes a transition 151 to the EState 150.
  • the various states described are controlled by the bridge controller 132 (a state-machine sketch follows this passage).
  • the role of the comparator 130 is to monitor and compare I/O operations on the PA and PB buses in the combined state 158 and, in response to a mismatch, to notify the bridge controller 132, whereby the bridge controller 132 causes the transition 151 to the error state 150.
  • the I/O operations can include all I/O operations initiated by the processing sets, as well as DMA transfers in respect of DMA initiated by a device on the device bus.
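  • The following is a minimal C sketch of the mode transitions of Figure 11, under the assumption that the second EState sub-state corresponds to a primary processing set having been elected; the enum and event names are hypothetical.

```c
/* Operating modes from Figure 11. */
enum bridge_mode {
    MODE_ESTATE_INITIAL, /* initial error state 152               */
    MODE_ESTATE_PRIMARY, /* error state once a primary set exists */
    MODE_SPLIT,          /* split state 156                       */
    MODE_COMBINED        /* combined state 158                    */
};

/* Events driving the transitions described in the text. */
enum bridge_event {
    EV_COMPARE_MISMATCH, /* comparator 130 reports a lockstep error   */
    EV_PRIMARY_ELECTED,  /* one processing set asserts itself primary */
    EV_LEAVE_ESTATE,     /* EState processing complete                */
    EV_SYNC_RESET        /* sync_reset 157                            */
};

enum bridge_mode bridge_step(enum bridge_mode m, enum bridge_event ev)
{
    switch (ev) {
    case EV_COMPARE_MISMATCH:
        return MODE_ESTATE_INITIAL;                         /* 151 */
    case EV_PRIMARY_ELECTED:
        return (m == MODE_ESTATE_INITIAL) ? MODE_ESTATE_PRIMARY : m;
    case EV_LEAVE_ESTATE:
        return (m == MODE_ESTATE_PRIMARY) ? MODE_SPLIT : m; /* 155 */
    case EV_SYNC_RESET:
        return (m == MODE_SPLIT) ? MODE_COMBINED : m;       /* 157 */
    }
    return m;
}
```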
  • Table 1 summarizes the various access operations which are allowed in each of the operational states.
  • neither processing set 14 nor 16 can access the D bus 22 or the P bus 26 or 24 of the other processing set 16 or 14.
  • the internal bridge registers 116 of the bridge are accessible, but are read only.
  • a system running in the combined mode 158 transitions to the EState 150 where there is a comparison failure detected in this bridge, or alternatively where a comparison failure is detected in another bridge in a multi-bridge system as shown, for example, in Figure 2.
  • transitions to an EState 150 can occur in other situations, for example in the case of a software controlled event forming part of a self test operation.
  • an interrupt is signaled to all or a subset of the processors of the processing sets via an interrupt line 95. Following this, all I/O cycles generated on a P bus 24 or 26 result in reads being returned with an exception and writes being recorded in the posted write buffer.
  • the operation of the comparator 130 will now be described in more detail.
  • the comparator is connected to paths 94, 95, 96 and 97 for comparing address, data and selected control signals from the PA and PB bus interfaces 84 and 86.
  • a failed comparison of in-sync accesses to device I/O bus 22 devices causes a move from the combined state 158 to the EState 150.
  • the address, command, address parity, byte enables and parity error parameters are compared. If the comparison fails during the address phase, the bridge asserts a retry to the processing set bus controllers 50, which prevents data leaving the I/O bus controllers 50. No activity occurs in this case on the device I/O bus 22. On the processor(s) retrying, no error is returned.
  • If the comparison fails during a data phase (only control signals and byte enables are checked), the bridge signals a target-abort to the processing set bus controllers 50. An error is returned to the processors.
  • If the comparison fails during the address phase, the bridge asserts a retry to the processing set bus controllers 50, which results in the processing set bus controllers 50 retrying the cycle again.
  • the posted write buffer 122 is then active. No activity occurs on the device I/O bus 22.
  • Clock control for the bridge is performed by the bridge controller 132 in response to the clock signals from the clock line 21. Individual control lines from the controller 132 to the various elements of the bridge are not shown in Figures 6 to 10.
  • Figure 12 is a flow diagram illustrating a possible sequence of operating stages where lockstep errors are detected during a combined mode of operation.
  • Stage S1 represents the combined mode of operation where lockstep error checking is performed by the comparator 130 shown in Figure 8.
  • In Stage S2, a lockstep error is assumed to have been detected by the comparator 130.
  • In Stage S3, the current state is saved in the CSR registers 114 and posted writes are saved in the posted write buffer 122 and/or in the disconnect registers 120.
  • FIG. 13 illustrates Stage S3 in more detail. Accordingly, in Stage S31, the bridge controller 132 detects whether the lockstep error notified by the comparator 130 has occurred during a data phase in which it is possible to pass data to the device bus 22. In this case, in Stage S32, the bus cycle is terminated. Then, in Stage S33, the data phases are stored in the disconnect registers 120, and control then passes to Stage S35 where an evaluation is made as to whether a further I/O cycle needs to be stored. Alternatively, if at Stage S31 it is determined that the lockstep error did not occur during a data phase, the address and data phases for any posted write I/O cycles are stored in the posted write buffer 122. At Stage S34, if there are any further posted write I/O operations pending, these are also stored in the posted write buffer 122.
  • Stage S3 is performed at the initiation of the initial error state 152 shown in Figure 11. In this state the first and second processing sets arbitrate for access to the bridge. Accordingly, in Stages S31-S35, the posted write address and data phases for each of the processing sets 14 and 16 are stored in separate portions of the posted write buffer 122, and/or in the single set of disconnect registers as described above.
  • Figure 14 illustrates the source of the posted write I/O cycles which need to be stored in the posted write buffer 122. During normal operation of the processing sets 14 and 16, output buffers 162 in the individual processors contain I/O cycles which have been posted for transfer via the processing set bus controllers 50 to the bridge 12 and eventually to the device bus 22. Similarly, buffers 160 in the processing set controllers 50 also contain posted I/O cycles for transfer over the buses 24 and 26 to the bridge 12 and eventually to the device bus 22.
  • I/O write cycles may already have been posted by the processors 52, either in their own buffers 162, or already transferred to the buffers 160 of the processing set bus controllers 50. It is the I/O write cycles in the buffers 162 and 160 which gradually propagate through and need to be stored in the posted write buffer 122.
  • a write cycle 164 posted to the posted write buffer 122 can comprise an address field 165, including an address and an address type, and between one and 16 data fields 166, each including a byte enable field and the data itself (see the struct sketch below).
  • the data is written into the posted write buffer 122 in the EState unless the initiating processing set has been designated as a primary CPU set. At that time, non-primary writes in an EState still go to the posted write buffer even after one of the CPU sets has become a primary processing set.
  • the value of the posted write buffer pointer can be cleared at reset, or by software using a write under the control of a primary processing set.
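  • A direct rendering of this entry format as a C struct, purely as a sketch (field widths are illustrative):

```c
#include <stdint.h>

#define PW_MAX_DATA_FIELDS 16 /* between one and 16 data phases */

/* One data field 166: the data word plus its PCI byte enables. */
struct pw_data_field {
    uint32_t data;
    uint8_t  byte_enables; /* one enable bit per byte lane */
};

/* A write cycle 164 as stored in the posted write buffer 122: an
 * address field 165 (address plus address type) followed by the
 * captured data fields 166. */
struct posted_write {
    uint32_t address;
    uint8_t  addr_type;  /* address type / PCI command */
    uint8_t  num_fields; /* 1..PW_MAX_DATA_FIELDS      */
    struct pw_data_field fields[PW_MAX_DATA_FIELDS];
};
```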
  • In Stage S4, the individual processing sets independently seek to evaluate the error state and to determine whether one of the processing sets is faulty. This determination is made by the individual processors in an error state in which they individually read status from the control state and EState registers 114. During this error mode, the arbiter 134 arbitrates for access to the bridge 12.
  • In Stage S5, one of the processing sets 14 and 16 establishes itself as the primary processing set. This is determined by each of the processing sets identifying a time factor based on the estimated degree of responsibility for the error, whereby the first processing set to time out becomes the primary processing set. In Stage S5, the status is recovered for that processing set and is copied to the other processing set. The primary processing set is able to access the posted write buffer 122 and the disconnect registers 120.
  • In Stage S6, the bridge is operable in a split mode. If it is possible to re-establish an equivalent status for the first and second processing sets, then a reset is issued at Stage S7 to put the processing sets in the combined mode at Stage S1. However, it may not be possible to re-establish an equivalent state until a faulty processing set is replaced. Accordingly, the system will stay in the split mode of Stage S6 in order to continue operation based on a single processing set. After replacing the faulty processing set the system could then establish an equivalent state and move via Stage S7 to Stage S1.
  • the comparator 130 is operable in the combined mode to compare the I/O operations output by the first and second processing sets 14 and 16. This is fine as long as all of the I/O operations of the first and second processing sets 14 and 16 are fully synchronized and deterministic. Any deviation from this will be interpreted by the comparator 130 as a loss of lockstep. This is in principle correct, as even a minor deviation from identical outputs, if not trapped by the comparator 130, could lead to the processing sets diverging further from each other as the individual processing sets act on the deviating outputs. However, a strict application of this puts significant constraints on the design of the individual processing sets.
  • a solution to this problem employs the dissimilar data registers (DDR) 116 mentioned earlier.
  • the solution is to write data from the processing sets into respective DDRs in the bridge while disabling the comparison of the data phases of the write operations, and then to read a selected one of the DDRs back to each processing set, whereby each of the processing sets is able to act on the same data.
  • Figure 17 is a schematic representation of details of the bridge of Figures 6 to 10. It will be noted that details of the bridge not shown in Figure 6 to 8 are shown in Figure 17, whereas other details of the bridge shown in Figures 6 to 8 are not shown in Figure 17, for reasons of clarity.
  • the DDRs 116 are provided in the bridge registers 110 of Figure 7, but could be provided elsewhere in the bridge in other embodiments.
  • One DDR 116 is provided for each processing set.
  • two DDRs 116A and 116B are provided, one for each of the first and second processing sets 14 and 16, respectively.
  • Figure 17 represents a dissimilar data write stage.
  • the addressing logic 136 is shown schematically to comprise two decoder sections, one decoder section 136A for the first processing set and one decoder section 136B for the second processing set 16.
  • each of the processing sets 14 and 16 outputs the same predetermined address DDR-W which is separately interpreted by the respective first and second decoding sections 136A and 136B as addressing the respective first and second DDRs 116A and 116B. As the same address is output by the first and second processing sets 14 and 16, this is not interpreted by the comparator 130 as a lockstep error.
  • the decoding section 136A, or the decoding section 136B, or both, are arranged to further output a disable signal 137 in response to the predetermined write address supplied by the first and second processing sets 14 and 16.
  • This disable signal is supplied to the comparator 130 and is operative during the data phase of the write operation to disable the comparator.
  • the data output by the first processing set can be stored in the first DDR 116A and the data output by the second processing set can be stored in the second DDR 116B without the comparator being operative to detect a difference, even if the data from the first and second processing sets is different.
  • the first decoding section is operable to cause the routing matrix to store the data from the first processing set 14 in the first DDR 116A and the second decoding section is operable to cause the routing matrix to store the data from the second processing set 16 in the second DDR 116B.
  • the comparator 130 is once again enabled to detect any differences between I/O address and/or data phases as indicative of a lockstep error.
  • the processing sets are then operable to read the data from a selected one of the DDRs 116A/116B.
  • Figure 18 illustrates an alternative arrangement where the disable signal 137 is negated and is used to control a gate 131 at the output of the comparator 130.
  • When the disable signal is active, the output of the comparator is disabled, whereas when the disable signal is inactive, the output of the comparator is enabled.
  • FIG 19 illustrates the reading of the first DDR 116A in a subsequent dissimilar data read stage
  • each of the processing sets 14 and 16 outputs the same predetermined address DDR-RA which is separately interpreted by the respective first and second decoding sections 136A and 136B as addressing the same DDR, namely the first DDR 116A.
  • the content of the first DDR 116A is read by both of the processing sets 14 and 16, thereby enabling those processing sets to receive the same data. This enables the two processing sets 14 and 16 to achieve deterministic behavior, even if the source of the data written into the DDRs 116 by the processing sets 14 and 16 was not deterministic.
  • the processing sets could each read the data from the second DDR 116B.
  • Figure 20 illustrates the reading of the second DDR 116B in a dissimilar data read stage following the dissimilar data write stage of Figure 17.
  • each of the processing sets 14 and 16 outputs the same predetermined address DDR-RB which is separately interpreted by the respective first and second decoding sections 136A and 136B as addressing the same DDR, namely the second DDR 116B.
  • the content of the second DDR 116B is read by both of the processing sets 14 and 16, thereby enabling those processing sets to receive the same data.
  • this enables the two processing sets 14 and 16 to achieve deterministic behavior, even if the source of the data written into the DDRs 116 by the processing sets 14 and 16 was not deterministic.
  • the selection of which of the first and second DDRs 116A and 116B to be read can be determined in any appropriate manner by the software operating on the processing modules. This could be done on the basis of a simple selection of one or the other DDRs, or on a statistical basis or randomly or in any other manner as long as the same choice of DDR is made by both or all of the processing sets.
  • Figure 21 is a flow diagram summarizing the various stages of operation of the DDR mechanism described above.
  • In stage S10, a DDR write address DDR-W is received and decoded by the address decoder sections 136A and 136B during the address phase of the DDR write operation.
  • In stage S11, the comparator 130 is disabled.
  • In stage S12, the data received from the processing sets 14 and 16 during the data phase of the DDR write operation is stored in the first and second DDRs 116A and 116B, respectively, as selected by the first and second decode sections 136A and 136B, respectively.
  • In stage S13, a DDR read address is received from the first and second processing sets and is decoded by the decode sections 136A and 136B, respectively. If the received address DDR-RA is for the first DDR 116A, then in stage S14 the content of that DDR 116A is read by both of the processing sets 14 and 16.
  • Alternatively, if the received address DDR-RB is for the second DDR 116B, then in stage S15 the content of that DDR 116B is read by both of the processing sets 14 and 16 (a sketch of this exchange from the software's perspective follows).
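  • The sketch below shows how lockstepped software might use the mechanism, assuming hypothetical mapped addresses for DDR-W, DDR-RA and DDR-RB; the point is that both processing sets execute identical code, may write differing values, and read back an identical value.

```c
#include <stdint.h>

/* Hypothetical mapped addresses: one write address decoded per-set
 * onto DDR 116A or 116B, and two read addresses selecting which DDR
 * both sets read back. */
#define DDR_W  (*(volatile uint32_t *)0x90000000u)
#define DDR_RA (*(volatile uint32_t *)0x90000004u) /* reads DDR 116A */
#define DDR_RB (*(volatile uint32_t *)0x90000008u) /* reads DDR 116B */

/* Executed in lockstep by both processing sets. 'local' may differ
 * between the sets (a non-deterministic value); the return value is
 * identical on both sets, so lockstep is preserved. */
uint32_t ddr_exchange(uint32_t local, int read_first_ddr)
{
    /* The data-phase comparison is disabled by the bridge for this
     * address, so differing values do not trip the comparator 130. */
    DDR_W = local;

    /* Both sets must make the same choice of which DDR to read. */
    return read_first_ddr ? DDR_RA : DDR_RB;
}
```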
  • Figure 22 is a schematic representation of the arbitration performed on the respective buses 22, 24 and 26, and of the arbitration for the bridge itself.
  • Each of the processing set bus controllers 50 in the respective processing sets 14 and 16 includes a conventional PCI master bus arbiter 180 for providing arbitration on the respective buses 24 and 26.
  • Each of the master arbiters 180 is responsive to request signals from the associated processing set bus controller 50 and the bridge 12 on respective request (REQ) lines 181 and 182.
  • the master arbiters 180 allocate access to the bus on a first-come-first-served basis, issuing a grant (GNT) signal to the winning party on an appropriate grant line 183 or 184.
  • a conventional PCI bus arbiter 185 provides arbitration on the D bus 22.
  • the D bus arbiter 185 can be configured as part of the D bus interface 82 of Figure 6, or could be separate therefrom.
  • the D bus arbiter is responsive to request signals from the contending devices, including the bridge and the devices 30, 31, etc. connected to the device bus 22.
  • Respective request lines 186, 187, 188, etc. for each of the entities competing for access to the D bus 22 are provided for the request signals (REQ).
  • the D bus arbiter 185 allocates access to the D bus on a first-come-first-served basis, issuing a grant (GNT) signal to the winning entity via respective grant lines 189, 190, 192, etc.
  • Figure 23 is a state diagram summarizing the operation of the D bus arbiter 185. In a particular embodiment, up to six request signals may be produced by respective D bus devices and one by the bridge itself. On a transition into the GRANT state, the FETs are enabled and the accessing device is known and forwarded to the D bus address decode logic for checking against a DMA address provided by the device.
  • If a device requesting the bus fails to perform a transaction within 16 cycles, it may lose GNT# via the BACKOFF state. BACKOFF is required as, under PCI rules, a device may access the bus one cycle after GNT# is removed. Devices may only be granted access to the D bus if the bridge is not in the EState. A new GNT# is produced at times when the bus is idle (a sketch of this arbiter follows).
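  • A compact C model of this arbiter behaviour follows; the state names, request encoding and tick interface are hypothetical, and first-come-first-served ordering is approximated by a fixed tie-break for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

enum dbus_arb_state { ARB_IDLE, ARB_GRANT, ARB_BACKOFF };

struct dbus_arbiter {
    enum dbus_arb_state state;
    uint32_t idle_cycles; /* cycles in GRANT without a transaction */
    int      granted_dev; /* device currently holding GNT#, or -1  */
};

/* One bus clock of the arbiter. 'req' is the pending request mask
 * (up to six devices plus the bridge), 'frame_active' indicates a
 * transaction in progress, and 'estate' blocks all new grants while
 * the bridge is in the error state. */
void dbus_arb_tick(struct dbus_arbiter *a, uint8_t req,
                   bool frame_active, bool estate)
{
    switch (a->state) {
    case ARB_IDLE:
        if (!estate && req) {
            a->granted_dev = __builtin_ctz(req); /* fixed tie-break */
            a->idle_cycles = 0;
            a->state = ARB_GRANT; /* FETs enabled, device identity
                                     forwarded for address checking */
        }
        break;
    case ARB_GRANT:
        if (frame_active) {
            a->idle_cycles = 0;
        } else if (++a->idle_cycles >= 16) {
            /* No transaction within 16 cycles: remove GNT# via
             * BACKOFF, since under PCI rules the device may still
             * use the bus one cycle after GNT# is removed. */
            a->state = ARB_BACKOFF;
        }
        break;
    case ARB_BACKOFF:
        a->granted_dev = -1;
        a->state = ARB_IDLE;
        break;
    }
}
```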
  • FIG. 24 is a state diagram summarizing the operation of the bridge arbiter 134.
  • a priority encoder can be provided to resolve access attempts which collide. In the case of a collision, the loser or losers are retried, which forces them to give up the bus. Under PCI rules, retried devices must try repeatedly to access the bridge, and this can be expected to happen.
  • retried interfaces are remembered and assigned a higher priority. These remembered retries are prioritised in the same way as address phases. However, as a precaution, this mechanism is timed out so as not to get stuck waiting for a faulty or dead device.
  • the algorithm employed prevents a device which has not yet been retried, but which would be a higher priority retry than a device currently being waited for, from being retried at the first attempt.
  • the bridge arbiter 134 is responsive to standard PCI signals provided on standard PCI control lines of the buses 22, 24 and 26 to control access to the bridge 12.
  • Figure 25 illustrates signals associated with an I/O operation cycle on the PCI bus.
  • a PCI frame signal (FRAME#) is initially asserted.
  • address (A) signals will be available on the DATA BUS and the appropriate command (write/read) signals (C) will be available on the command bus (CMD BUS).
  • the initiator ready signal (IRDY#) is then asserted by the initiator, the target claims the transaction with the device select signal (DEVSEL#), and the target ready signal (TRDY#) indicates that the target is ready, data being transferred on cycles where IRDY# and TRDY# are both asserted.
  • the bridge is operable to allocate access to the bridge resources and thereby to negotiate allocation of a target bus in response to the FRAME# being asserted low for the initiator bus concerned.
  • the bridge arbiter 134 is operable to allocate access to the bridge resources and/or to a target bus on a first-come-first-served basis in response to the FRAME# being asserted low.
  • the arbiters may be additionally provided with a mechanism for logging the arbitration requests, and can apply a conflict resolution based on the request and allocation history where two requests are received at an identical time. Alternatively, a simple priority can be allocated to the various requesters, whereby, in the case of identically timed requests, a particular requester always wins the allocation process.
  • Each of the slots on the device bus 22 has a slot response register (SRR) 118, as do other devices connected to the bus, such as the SCSI interface 29.
  • each SRR 118 contains bits defining the ownership of the slots, or of the devices connected to the slots, on the device bus.
  • each SRR 118 comprises a four bit register.
  • a larger register will be required to determine ownership between more than two processing sets. For example, if three processing sets are provided, then a five bit register will be required for each slot.
  • Figure 16 illustrates schematically one such four bit register 600.
  • a first bit 602 is identified as SRR[0]
  • a second bit 604 is identified as SRR[1]
  • a third bit 606 is identified as SRR[2]
  • a fourth bit 608 is identified as SRR[3].
  • Bit SRR[0] is a bit which is set when writes for valid transactions are to be suppressed.
  • Bit SRR[1] is set when the device slot is owned by the first processing set 14. This defines the access route between the first processing set 14 and the device slot. When this bit is set, the first processing set 14 can always be master of the device slot or bus 22, while the ability for the device slot to be master depends on whether bit SRR[3] is set.
  • Bit SRR[2] is set when the device slot is owned by the second processing set 16. This defines the access route between the second processing set 16 and the device slot. When this bit is set, the second processing set 16 can always be master of the device slot or bus 22, while the ability for the device slot to be master depends on whether bit SRR[3] is set. Bit SRR[3] is an arbitration bit which gives the device slot the ability to become master of the device bus.
  • a slot which is not owned by the processing set making the access (this includes un-owned slots) cannot be accessed, and an attempted access results in an abort.
  • a processing set can only claim an un-owned slot; it cannot wrest ownership away from another processing set. This can only be done by powering off the other processing set.
  • the owned bits can be altered when in the combined mode of operation, but they have no effect until the split mode is entered.
  • Table 2 summarizes the access rights as determined by an SRR 118; a bit-level sketch of these fields follows.
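  • The four SRR bits and the resulting access checks can be expressed as follows; this is a sketch of the semantics described above, not the register implementation, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slot response register bits of the 4-bit register 600. */
#define SRR_SUPPRESS_WRITES (1u << 0) /* SRR[0]: acknowledge and discard writes */
#define SRR_OWNED_BY_SET_A  (1u << 1) /* SRR[1]: owned by processing set 14     */
#define SRR_OWNED_BY_SET_B  (1u << 2) /* SRR[2]: owned by processing set 16     */
#define SRR_ARB_ENABLE      (1u << 3) /* SRR[3]: slot may master the D bus      */

/* May the given processing set access this slot in split mode?
 * Un-owned slots, or slots owned by the other set, cannot be
 * accessed; such an access results in an abort. */
bool slot_accessible(uint8_t srr, bool is_set_a)
{
    uint8_t owner_bit = is_set_a ? SRR_OWNED_BY_SET_A : SRR_OWNED_BY_SET_B;
    return (srr & owner_bit) != 0;
}

/* May the device in this slot become master of the D bus? It must
 * be owned and have its arbitration bit set. */
bool slot_may_master(uint8_t srr)
{
    return (srr & SRR_ARB_ENABLE) != 0
        && (srr & (SRR_OWNED_BY_SET_A | SRR_OWNED_BY_SET_B)) != 0;
}
```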
  • When the D bus arbiter 185 receives a direct memory access (DMA) request 193 from a device (e.g., device 30 in slot 33) on the device bus, the D bus arbiter determines whether to allocate the D bus to that device.
  • the address decode logic 142 holds, or has access to, a geographic address map 196, which identifies the relationship between the processor address space and the slots as a result of the geographic addressing employed.
  • This geographic address map 196 could be held as a table in the bridge memory 126, along with the posted write buffer 122 and the dirty RAM 124. Alternatively, it could be held as a table in a separate memory element, possibly forming part of the address decoder 142 itself.
  • the map 196 could be configured in a form other than a table.
  • the address decode logic 142 is configured to verify the correctness of the DMA addresses supplied by the device 30. In one embodiment of the invention, this is achieved by comparing four significant address bits of the address supplied by the device 30 with the corresponding four address bits of the address held in the geographic addressing map 196 for the slot identified by the D bus grant signal for the DMA request. In this example, four address bits are sufficient to determine whether the address supplied is within the correct address range. In this specific example, 32 bit PCI bus addresses are used, with bits 31 and 30 always being set to 1, bit 29 being allocated to identify which of two bridges on a motherboard is being addressed (see Figure 2) and bits 28 to 26 identifying a PCI device. Bits 25-0 define an offset from the base address for the address range for each slot. Accordingly, by comparing bits 29-26, it is possible to identify whether the address(es) supplied fall(s) within the appropriate address range for the slot concerned.
  • the address decode logic 142 could be arranged to use the bus grant signal 184 for the slot concerned to identify a table entry for the slot concerned and then to compare the address in that entry with the address(es) received with the DMA request, as described above. Alternatively, the address decode logic 142 could be arranged to use the address(es) received with the DMA address to address a relational geographic address map and to determine a slot number therefrom, which could be compared to the slot for which the bus grant signal 194 is intended, and thereby to determine whether the addresses fall within the address range appropriate for the slot concerned.
  • the address decode logic 142 is arranged to permit DMA to proceed if the DMA addresses fall within the expected address space for the slot concerned. Otherwise, the address decoder is arranged to ignore the slots and the physical addresses (a sketch of this check follows).
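  • The four-bit check described above reduces to a few shifts and masks; the sketch below assumes a hypothetical per-slot table holding the expected value of address bits 29-26 from the geographic address map 196.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical geographic address map: the expected value of address
 * bits 29-26 for each slot. */
extern const uint8_t geo_map_bits_29_26[];

/* Verify a 32-bit PCI DMA address against the slot identified by the
 * D bus grant. Bits 31 and 30 must both be 1; bit 29 selects one of
 * the two bridges on a motherboard and bits 28-26 identify the PCI
 * device; bits 25-0 are the offset within the slot's range. */
bool dma_address_valid(uint32_t addr, unsigned granted_slot)
{
    if ((addr >> 30) != 0x3u)   /* bits 31 and 30 not both set */
        return false;
    return ((addr >> 26) & 0xFu) == geo_map_bits_29_26[granted_slot];
}
```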
  • the address decode logic 142 is further operable to control the routmg of the DMA request to the appropnate processmg set(s) 14/16 If the bndge is m the combined mode, the DMA access will automatically be allocated to all of the ui-sync processmg sets 14/16 The address decode logic 142 will be aware that the bndge is m the combmed mode as it is under the control of the b ⁇ dge controller 132 (see Figure 8) However, where the bndge is m the split mode, a decision will need to be made as to which, if any, of the processmg sets the DMA request is to be sent
  • the access When the system is in split mode, the access will be directed to a processmg set 14 or 16 which owns the slot concerned If the slot is un-owned, then the bndge does not respond to the DMA request
  • the address decode logic 142 is operable to determme the ownership of the device o ⁇ ginaring the DMA request by accessmg the SRR 118 for the slot concerned
• the appropriate slot can be identified by the D bus grant signal
• the address decode logic 142 is operable to control the target controller 140 (see Figure 8) to pass the DMA request to the appropriate processing set(s) 14/16 based on the ownership bits SRR[1] and SRR[2]. If bit SRR[1] is set, the first processing set 14 is the owner and the DMA request is passed to the first processing set. If bit SRR[2] is set, the second processing set 16 is the owner and the DMA request is passed to the second processing set. If neither of the bits SRR[1] and SRR[2] is set, the slot is un-owned and the DMA request is not passed to either processing set
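The routing decision based on the SRR ownership bits might be sketched as follows (names and structure assumed for illustration only):

    # Minimal sketch of DMA routing from the SRR ownership bits.
    def route_dma(combined_mode, srr):
        # srr is assumed to be a dict-like view of the slot's SRR bits.
        if combined_mode:
            # In combined mode the request goes to all in-sync processing sets.
            return ["processing set 14", "processing set 16"]
        targets = []
        if srr["SRR1"]:
            targets.append("processing set 14")   # owned by processing set A
        if srr["SRR2"]:
            targets.append("processing set 16")   # owned by processing set B
        return targets  # empty list: un-owned slot, request not passed on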
• Figure 27 is a flow diagram summarizing the DMA verification process as illustrated with reference to Figure 26.
  • the D-bus arbiter 160 arbitrates for access to the D bus 22.
• in stage S21 the address decoder 142 verifies the DMA addresses supplied with the DMA request by accessing the geographic address map.
• in stage S22 the address decoder ignores the DMA access where the address falls outside the expected range for the slot concerned.
• the actions of the address decoder are dependent upon whether the bridge is in the combined or the split mode.
• if the bridge is in the combined mode, the address decoder controls the target controller 140 (see Figure 8) to cause the routing matrix 80 (see Figure 6) to pass the DMA request to both processing sets 14 and 16. If the bridge is in the split mode, the address decoder is operative to verify the ownership of the slot concerned by reference to the SRR 118 for that slot in stage S25.
• if the slot is allocated to the first processing set 14 (i.e. the SRR[1] bit is set), then in stage S26 the address decoder 142 controls the target controller 140 (see Figure 8) to cause the routing matrix 80 (see Figure 6) to pass the DMA request to the first processing set 14. If the slot is allocated to the second processing set 16 (i.e. the SRR[2] bit is set), then in stage S27 the address decoder 142 controls the target controller 140 (see Figure 8) to cause the routing matrix 80 (see Figure 6) to pass the DMA request to the second processing set 16.
• if the slot is un-owned, then in stage S28 the address decoder 142 ignores or discards the DMA request and the DMA request is not passed to the processing sets 14 and 16.
• a DMA, or direct virtual memory access (DVMA), request sent to one or more of the processing sets causes the necessary memory operations (read or write as appropriate) to be effected on the processing set memory.
  • the automatic recovery process includes reintegration of the state of the processing sets to a common status in order to attempt a restart in lockstep.
• the processing set which asserts itself as the primary processing set as described above copies its complete state to the other processing set. This involves ensuring that the content of the memory of both processing sets is the same before trying a restart in lockstep mode.
  • a dirty RAM 124 is provided in the bridge. As described earlier the dirty RAM 124 is configured as part of the bridge SRAM memory 126.
  • the dirty RAM 124 comprises a bit map having a dirty indicator, for example a dirty bit, for each block, or page, of memory.
  • the bit for a page of memory is set when a write access to the area of memory concerned is made. In an embodiment of the invention one bit is provided for every 8K page of main processing set memory.
  • the bit for a page of processing set memory is set automatically by the address decoder 142 when this decodes a DMA request for that page of memory for either of the processing sets 14 or 16 from a device connected to the D bus 22.
  • the dirty RAM can be reset, or cleared when it is read by a processing set, for example by means of read and clear instructions at the beginning of a copy pass, so that it can start to record pages which are dirtied since a given time.
• the dirty RAM 124 can be read word by word. If a large word size is chosen for reading the dirty RAM 124, this will optimise the reading and resetting of the dirty RAM 124. Accordingly, at the end of the copy pass the bits in the dirty RAM 124 will indicate those pages of processing set memory which have been changed (or dirtied) by DMA writes during the period of the copy. A further copy pass can then be performed for only those pages of memory which have been dirtied. This will take less time than a full copy of the memory. Accordingly, there are typically fewer pages marked as dirty at the end of the next copy pass and, as a result, the copy passes can become shorter and shorter. At some point it is necessary to inhibit DMA writes for a short period for a final, short, copy pass, at the end of which the memories of the two processing sets will be the same and the primary processing set can issue a reset operation to restart the combined mode.
  • the dirty RAM 124 is set and cleared in both the combined and split modes. This means that in split mode the dirty RAM 124 may be cleared by either processing set.
• the dirty RAM 124 address is decoded from bits 13 to 28 of the PCI address presented by the D bus device. Erroneous accesses which present illegal combinations of the address bits 29 to 31 are mapped into the dirty RAM 124 and a bit is dirtied on a write, even though the bridge will not pass these transactions to the processing sets.
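The address arithmetic implied here can be illustrated as follows (an assumed model: with 8K byte pages, address bits 13 to 28 select one of 2^16 dirty bits):

    # Sketch of the dirty-bit update implied by the item above.
    DIRTY_BITS = bytearray((1 << 16) // 8)   # one bit per 8K page (assumed size)

    def mark_dirty(pci_address):
        page = (pci_address >> 13) & 0xFFFF  # address bits 28..13
        DIRTY_BITS[page // 8] |= 1 << (page % 8)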
• when reading the dirty RAM 124, the bridge defines the whole area from 0x00008000 to 0x0000ffff as dirty RAM and will clear the contents of any location in this range on a read.
• another alternative would be to provide two dirty RAMs which are used in a toggle mode, with one being written to while the other is read.
  • Figure 28 is a flow diagram summarising the operation of the dirty RAM 124.
• in stage S41 the primary processing set reads the dirty RAM 124, which has the effect of resetting the dirty RAM 124.
• in stage S42 the primary processor (e.g. processing set 14) copies the whole of its memory 56 to the memory 56 of the other processing set (e.g. processing set 16).
• in stage S43 the primary processing set reads the dirty RAM 124, which has the effect of resetting the dirty RAM 124.
• in stage S44 the primary processor determines whether fewer than a predetermined number of bits have been written in the dirty RAM 124.
• if not, then in stage S45 the primary processor copies those pages of its memory 56 which have been dirtied, as indicated by the dirty bits read from the dirty RAM 124 in stage S43, to the memory 56 of the other processing set. Control then passes back to stage S43.
• once fewer than the predetermined number of bits remain, in stage S46 the primary processor causes the bridge to inhibit DMA requests from the devices connected to the D bus 22. This could, for example, be achieved by clearing the arbitration enable bit for each of the device slots, thereby denying the DMA devices access to the D bus 22.
  • the address decoder 142 could be configured to ignore DMA requests under instructions from the primary processor.
• the primary processor then makes a final copy pass from its memory to the memory 56 of the other processor for those memory pages corresponding to the bits set in the dirty RAM 124.
• in stage S47 the primary processor can issue a reset operation for initiating a combined mode.
• in stage S48 DMA accesses are once more permitted.
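The copy-pass sequence of stages S41 to S48 can be summarised in the following sketch; the data structures and the dma object are placeholders standing in for the hardware operations described above, not part of the patent disclosure:

    # Outline of the reintegration sequence (stages S41 to S48), modelled on
    # plain dictionaries; source/dest stand in for the two processing set
    # memories and dirty_ram for the dirty RAM 124.
    def read_and_clear(dirty_ram):
        pages = set(dirty_ram)
        dirty_ram.clear()                     # a read resets the dirty RAM
        return pages

    def reintegrate(source, dest, dirty_ram, dma, threshold):
        read_and_clear(dirty_ram)             # S41
        dest.update(source)                   # S42: full memory copy
        while True:
            dirty = read_and_clear(dirty_ram) # S43
            if len(dirty) < threshold:        # S44: few enough dirty pages?
                break
            for page in dirty:                # S45: copy only dirtied pages,
                dest[page] = source[page]     #      then back to S43
        dma.inhibit()                         # S46: e.g. clear arbitration enables
        for page in dirty:                    # final, short copy pass
            dest[page] = source[page]
        dma.reset_to_combined_mode()          # S47: restart in lockstep
        dma.permit()                          # S48: DMA once more permitted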
• the computer system 10 is provided with various mechanisms to enable the system to survive various different situations that would otherwise cause the system to crash. Some of those mechanisms are concerned with surviving errors associated with I/O device failures, for example, and others relate to errors that can arise when the multi-processors of the system deviate from a lock-step mode.
• the computer system comprises, in part, the bridge 12 provided on the bridge motherboard 42, the processing set 14, 16 (only one of which is shown, for illustrative purposes) and a PCI bus 24, 26 connecting the processing set 14, 16 to the bridge 12.
• the preferred processing set architecture comprises a processing set controller 50 to which a plurality of processors (not shown in Figure 29) are connected.
  • the PCI bus interconnecting the processing set controller 50 and the bridge 12 typically comprises a plurality of bus lines, but one line is shown here for clarity.
  • PCI protocol signals propagate in both directions between the processing set controller 50 and the bridge 12.
• crashes can be caused if either the processing set or the bridge should fail. These crashes typically result because the surviving bridge or processing set continues to attempt to distribute transactions along the PCI bus, and these distributed transactions tend to back up, whereupon either the bridge or the processing set can deadlock.
• the PCI protocol includes two signals, a first signal IRDY# indicating that an initiator is ready to supply data, and a second signal TRDY# indicating that a target is ready to receive data.
• the initiator refers to the component which is initiating the signals, and the target refers to the component which is to receive the signals.
  • the IRDY# and TRDY# signal exchange takes place once the initiator address phase has been completed.
  • the initiator addresses the device to which it intends to send data, and the addressed device (the target) responds with a DEVSEL# signal confirming to the initiator that it has been selected by the initiator.
• the address phase and the DEVSEL# exchange comprise a hand-shaking process between the initiator and target.
• Figures 30b and 30d schematically indicate possible IRDY# and TRDY# signals when the PCI bus is operating correctly.
  • an IRDY# signal has issued at time “a” indicating that the initiator is ready to supply data.
  • a TRDY# signal is outputted indicating that the target is ready.
• when both the IRDY# and TRDY# signals are asserted at the same clock edge (i.e. at time "b"), the initiator transfers a datum to the target.
• an initiator ready signal IRDY# is generated on the PCI bus 24 once the address phase of the PCI protocol has been completed, and if everything is working properly a TRDY# signal is subsequently generated on the PCI bus when the bridge has indicated that it is ready to receive data from the processing set. If a fault has occurred, for example as shown in Figure 30a, then no response is received within a predetermined period of time from the target and an error is assumed to have occurred.
• Figure 30a illustrates an example of PIO where an error is assumed to have occurred.
  • the processing set has initiated an IRDY# signal to indicate that the processing set as initiator is ready and this signal has not been answered by a TRDY# signal indicating that the bridge is ready within a predetermined period of time "t".
• the predetermined period of time "t" is set to be a relatively long time, of the order of 40 μs or thereabouts. In other words, the period of time "t" corresponds to several thousand clock cycles in a computer system operating at 25MHz, for example. For normal PCI bus exchanges, one would expect an IRDY# or TRDY# signal to be answered within 16 or 32 clock cycles.
  • the predetermined period of time "t” may be adjusted to take account of different clock speeds and operating requirements.
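For illustration, the relationship between the period "t" and the counted clock cycles is a simple conversion (the figures below are examples, not values taken from the patent):

    # Convert a timeout period into a clock-cycle count for the counter.
    def timeout_in_cycles(t_seconds, clock_hz):
        return round(t_seconds * clock_hz)

    # e.g. at 25 MHz a 40 microsecond timeout is 1000 cycles, and a few
    # hundred microseconds corresponds to several thousand cycles, far more
    # than the 16 or 32 cycles of a normal PCI exchange.
    assert timeout_in_cycles(40e-6, 25e6) == 1000
    assert timeout_in_cycles(200e-6, 25e6) == 5000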
• Figures 30c and 30d illustrate, in a similar manner to Figures 30a and 30b, a correctly operating PCI bus (Figure 30d) and a PCI bus which appears to have encountered problems (Figure 30c).
• Figures 30c and 30d differ from Figures 30a and 30b in that they depict DMA from the bridge to the processing set.
  • it is the clock signal CLOCK that determines the point in time at which the various signals are generated on the bus. For example, as CLOCK is driven high in Figure 30a (at the broken line in Figure 30a), the initiator is triggered to generate an IRDY# signal on the bus.
• the mechanism 800 comprises a controller 810, a tri-state buffer 820 and a switchable field effect transistor (FET) 830 connected to the bus 24.
• the controller 810 includes a counter which counts the number of clock signals between the issuance of an IRDY# signal and the issuance of a TRDY# signal, or between a TRDY# signal and an IRDY# signal where the order is reversed. If the number of counted clock cycles exceeds a predetermined limit, then the controller 810 turns off the FET 830 to turn off the PCI bus 24 between the processing set 14, 16 and the bridge 12, before it controls the buffer 820 to abort the requested data transfer and thereby to prevent further data transfer requests. Issuance of TRDY# and IRDY# signals is detected on a sense line 825 connected between the controller 810 and the PCI bus 24
• if the controller 810 detects the issuance of a TRDY# signal before the number of counted clock cycles has exceeded the predetermined limit, then the counter is reset and the PCI bus is left on
• the controller 810 controls the tri-state buffer 820 by means of a control line 840 to issue a fake response signal to the initiator, whereupon the initiator is satisfied and outputs data that is discarded
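The behaviour of the controller 810 described in the last three items might be modelled as below (a sketch only; class, method and flag names are assumed):

    # Sketch of the cycle-termination monitor in the controller 810.
    class CycleTerminationMonitor:
        def __init__(self, limit, fet, fake_buffer):
            self.limit = limit              # timeout, in clock cycles
            self.count = 0
            self.fet = fet                  # models FET 830 (bus isolation)
            self.fake_buffer = fake_buffer  # models tri-state buffer 820

        def on_clock(self, request_pending, response_seen):
            # Called once per clock; the two flags model the sense line 825.
            if not request_pending:
                return
            if response_seen:
                self.count = 0              # response in time: bus is left on
                return
            self.count += 1
            if self.count > self.limit:
                self.fet.turn_off()                     # isolate the PCI bus
                self.fake_buffer.drive_fake_response()  # satisfy the initiator
                self.count = 0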
• Figure 29 shows an arrangement wherein it is assumed that the bridge 12 is malfunctioning, and that the processing set 14, 16 is functioning correctly
• the malfunctioning bridge is isolated from the processing set, and cycles from the processing set 14, 16 are aborted in the address phase because the processing set does not see DEVSEL# asserted
• the processing set may need to continue to abort bus cycles
• Figure 31 is a schematic representation of the signals generated on the bus when a transaction is terminated by the mechanism 800
• Figure 31 shows a bus where an IRDY# signal has issued during PIO and has not been responded to within the predetermined time limit "t" shown in Figure 30a
• an ISOLATE# signal is generated by the controller 810 to turn off the FET 830, thereby isolating the processing set from the bus
• the controller 810 issues a FAKE_ENABLE# signal to cancel the DEVSEL# signal issued during the address phase (i.e. the handshaking phase) between the target and initiator. Issuance of the FAKE_ENABLE# signal also causes a STOP# signal to be generated on the bus
• the STOP# signal indicates to the system that the bus cycle has been stopped and serves to cause the FRAME# signal and the issued IRDY# signal to be negated
• DEVSEL# is driven high and STOP# low, which signals a target-abort to the initiator, which then negates FRAME# and IRDY# in response
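The termination sequence of Figure 31 can be restated as the following hedged pseudo-code; the bus helper methods are placeholders:

    # The termination sequence of Figure 31, step by step.
    def terminate_cycle(bus):
        bus.assert_signal("ISOLATE#")       # turn off FET 830: isolate the set
        bus.assert_signal("FAKE_ENABLE#")   # cancel DEVSEL# from the address phase
        bus.assert_signal("STOP#")          # indicate that the bus cycle is stopped
        bus.deassert_signal("DEVSEL#")      # DEVSEL# high with STOP# low signals a
                                            # target-abort to the initiator, which
                                            # then negates FRAME# and IRDY#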
• an alternative to the arrangement of Figure 29 is shown in Figure 32
• the arrangement of Figure 32 operates in the same manner as the arrangement of Figure 29, except that it is assumed that the bridge is functioning correctly and that it is the processing set 14, 16 which is malfunctioning. In this arrangement, therefore, a FET 850 is positioned between a processing set controller 50 and a controller 870. In this instance, signals from the malfunctioning processing set are ignored and signals from the bridge 12 are answered with a fake response issued by the controller 870 and a buffer 880
• a further implementation providing a higher degree of tolerance is shown in Figure 33.
• the PCI bus 24 will normally comprise a number of bus lines (as shown in Figure 3 for example). Accordingly, the arrangements described above with reference to Figures 29, 32 and 33 could enable the generation of fake signals for all relevant bus lines of the PCI bus. In such an arrangement, an appropriate mechanism could be provided for each bus line of the PCI bus between the processing set and the bridge
  • a bus control mechanism for a computer system that includes a bus, a first component and a second component, wherein the first and second components are interconnected via the bus for performing a data transfer operation, the data transfer operation being initiated by an exchange of request and response signals, and a component which initiated a request signal is operable to effect data transfer on receipt of a response signal
  • the bus control mechanism comprising: first means for selectively disabling the bus; second means for generating a fake response signal; and third means for monitoring the request and response signals exchanged between the components, and for controlling the first means to disable the bus and for controlling the second means to issue a fake response signal to the component that issued the request signal for terminating the data transfer operation in situations where the response signal is not issued within a predetermined period following the request signal.

Abstract

A bus controller for a computer system. The controller comprises a monitor for monitoring request signals and response signals between a first component and a second component each connected to a bus of the computer system; and a terminator controlled by the monitor to terminate a request from one of the first and second components if a response to the request has not issued within a predetermined period of time.

Description

BUS CONTROLLER WITH CYCLE TERMINATION MONITOR
BACKGROUND OF THE INVENTION
This invention relates to a bus controller for a computer system such as a multi-processor system, for example, in which first and second processing sets (each of which may comprise one or more processors) communicate with an I/O device bus via a bridge
The application finds particular but not exclusive application to fault tolerant computer systems where two or more processor sets need to communicate in lockstep with an I/O device bus via a bridge
In such a fault tolerant computer system, an aim is not only to be able to identify faults, but also to provide a structure which is able to provide a high degree of system availability and system resilience to internal or external disturbances. In order to provide high levels of system resilience to internal disturbances, such as a processor failure or a bridge failure for example, it would be desirable for such systems automatically to control access to and from a device that might appear to be causing problems
Automatic access control provides significant technical challenges in that the system has not only to monitor the devices in question to detect errors, but also has to provide an environment where the system as a whole can continue to operate despite a failure of one or more of the system components. In addition, the controller must also deal with any outstanding requests issued by components of the computer system; this is typically problematic as bus protocols, such as PCI for example, typically do not support error termination at all stages of operation
Accordingly, an aim of the present invention is to address these technical problems
SUMMARY OF THE INVENTION
Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Combinations of features from the dependent claims may be combined with features of the independent claims as appropriate and not merely as explicitly set out in the claims. In accordance with one aspect of the invention, there is provided a bus control mechanism for a computer system that includes a bus, a first component and a second component, wherein the first and second components are interconnected via the bus for performing a data transfer operation, the data transfer operation being initiated by an exchange of request and response signals between the first and second components, and a component that issued a request signal is operable to effect data transfer on receipt of a response signal, the bus control mechanism comprising: a switch selectively operable to disable the bus; a fake response generator selectively operable to generate a fake response signal; and a controller operable to monitor the request and response signals exchanged between the components and, in situations where a corresponding response signal is not issued within a predetermined time following a particular request signal, to cause the switch to disable the bus and to cause the fake response generator to issue a fake response signal to the component that issued the particular request signal for terminating the data transfer operation
Thus, embodiments of the present invention can provide a bus controller which monitors request and response signals on the bus. Where a response signal is not provided within a predetermined period following a request signal, the controller causes a switch to disable the bus and causes a fake response signal to be supplied to the initiator of the request. As a result, locking up of the system may be prevented. In one particular embodiment the switch isolates first and second parts of the bus, so that data signals cannot reach the target component
The component that issued the particular request signal can be operable, on receipt of the fake response signal, to transfer data to the bus, which data is thus discarded as a result of the disabling of the bus by the switch. The switch may be a FET
The computer system can include a clock signal generator, and the terminator includes a counter for counting clock signals between detection of a particular request signal and detection of a corresponding response signal, the request being terminated if a response has not been detected within a predetermined number of clock cycles. The first component may be a processing set comprising one or more processors, and the second component may be a bridge for directing signals from the processing set to one or more system resources and from the one or more resources to the processing set
The bus may be a PCI bus
In one embodiment, it may be assumed that one of the first and second components is a functioning component and the other of the components is a malfunctioning component, the terminator being arranged to terminate a request from the functioning component only
Alternatively, the terminator may be arranged to terminate requests from either component
The request signal may be an IRDY# signal and the response signal may be a TRDY# signal
In accordance with another aspect of the invention, there is provided a computer system comprising a bus, a first component and a second component interconnected via the bus for performing a data transfer operation, the data transfer operation being initiated by an exchange of request and response signals, wherein a component that initiates a request signal is operable to effect data transfer upon receipt of a response signal, and a bus control mechanism that comprises a switch selectively operable to disable the bus, a fake response generator selectively operable to generate a fake response signal, and a controller operable to monitor the request and response signals exchanged between the first and second components and, in situations where a corresponding response signal is not issued within a predetermined time following a particular request signal, to cause the switch to disable the bus and to cause the fake response generator to issue a fake response signal to the component that issued the particular request signal for terminating the data transfer operation
In accordance with a further aspect of the invention, there is provided a method of controlling a bus of a computer system including a first component and a second component interconnected via the bus for performing a data transfer operation, wherein the data transfer operation is initiated by an exchange of request and response signals, and wherein a component which initiated a request signal is operable to effect data transfer upon receipt of a response signal, the method comprising: monitoring the request signal on the bus; timing a period following the request signal; and, in the absence of a corresponding response signal within the period, disabling the bus and issuing a fake response signal to the component which initiated the request to thereby terminate the data transfer operation
The method may further comprise disabling the bus if a response to the request has not issued within the predetermined period of time. The method may further comprise issuing a fake response to terminate the request.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Exemplary embodiments of the present invention will be described hereinafter, by way of example only, with reference to the accompanying drawings in which like reference signs relate to like elements and in which:
Figure 1 is a schematic overview of a fault tolerant computer system incorporating an embodiment of the invention;
Figure 2 is a schematic overview of a specific implementation of a system based on that of Figure 1;
Figure 3 is a schematic representation of one implementation of a processing set;
Figure 4 is a schematic representation of another example of a processing set;
Figure 5 is a schematic representation of a further processing set;
Figure 6 is a schematic block diagram of an embodiment of a bridge for the system of Figure 1;
Figure 7 is a schematic block diagram of storage for the bridge of Figure 6;
Figure 8 is a schematic block diagram of control logic of the bridge of Figure 6;
Figure 9 is a schematic representation of a routing matrix of the bridge of Figure 6;
Figure 10 is an example implementation of the bridge of Figure 6;
Figure 11 is a state diagram illustrating operational states of the bridge of Figure 6;
Figure 12 is a flow diagram illustrating stages in the operation of the bridge of Figure 6;
Figure 13 is a detail of a stage of operation from Figure 12;
Figure 14 illustrates the posting of I/O cycles in the system of Figure 1;
Figure 15 illustrates the data stored in a posted write buffer;
Figure 16 is a schematic representation of a slot response register;
Figure 17 illustrates a dissimilar data write stage;
Figure 18 illustrates a modification to Figure 17;
Figure 19 illustrates a dissimilar data read stage;
Figure 20 illustrates an alternative dissimilar data read stage;
Figure 21 is a flow diagram summarising the operation of a dissimilar data write mechanism;
Figure 22 is a schematic block diagram explaining arbitration within the system of Figure 1 ;
Figure 23 is a state diagram illustrating the operation of a device bus arbiter;
Figure 24 is a state diagram illustrating the operation of a bridge arbiter;
Figure 25 is a timing diagram for PCI signals;
Figure 26 is a schematic diagram illustrating the operation of the bridge of Figure 6 for direct memory access;
Figure 27 is a flow diagram illustrating a direct memory access method in the bridge of Figure 6;
Figure 28 is a flow diagram of a re-integration process including the monitoring of a dirty RAM;
Figure 29 is a schematic representation of a bus timeout mechanism;
Figures 30a, 30b, 30c and 30d are schematic representations of PCI protocol signals;
Figure 31 is a schematic representation of the termination of a bus signal;
Figure 32 is an alternative implementation of the mechanism of Figure 29; and
Figure 33 is a schematic representation of another implementation of the mechanism of Figure 29.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Figure 1 is a schematic overview of a fault tolerant computing system 10 comprising a plurality of CPUsets (processing sets) 14 and 16 and a bridge 12. As shown in Figure 1, there are two processing sets 14 and 16, although in other embodiments there may be three or more processing sets. The bridge 12 forms an interface between the processing sets and I/O devices such as devices 28, 29, 30, 31 and 32. In this document, the term "processing set" is used to denote a group of one or more processors, possibly including memory, which output and receive common outputs and inputs. It should be noted that the alternative term mentioned above, "CPUset", could be used instead, and that these terms could be used interchangeably throughout this document. Also, it should be noted that the term "bridge" is used to denote any device, apparatus or arrangement suitable for interconnecting two or more buses of the same or different types
The first processing set 14 is connected to the bridge 12 via a first processing set I/O bus (PA bus) 24, in the present instance a Peripheral Component Interconnect (PCI) bus. The second processing set 16 is connected to the bridge 12 via a second processing set I/O bus (PB bus) 26 of the same type as the PA bus 24 (i.e. here a PCI bus). The I/O devices are connected to the bridge 12 via a device I/O bus (D bus) 22, in the present instance also a PCI bus
Although, in the particular example described, the buses 22, 24 and 26 are all PCI buses, this is merely by way of example, and in other embodiments other bus protocols may be used and the D-bus 22 may have a different protocol from that of the PA bus and the PB bus (P buses) 24 and 26
The processing sets 14 and 16 and the bridge 12 are operable in synchronism under the control of a common clock 20, which is connected thereto by clock signal lines 21
Some of the devices, including an Ethernet (E-NET) interface 28 and a Small Computer System Interface (SCSI) interface 29, are permanently connected to the device bus 22, but other I/O devices such as I/O devices 30, 31 and 32 can be hot insertable into individual switched slots 33, 34 and 35. Dynamic field effect transistor (FET) switching can be provided for the slots 33, 34 and 35 to enable hot insertability of the devices such as devices 30, 31 and 32. The provision of the FETs enables an increase in the length of the D bus 22, as only those devices which are active are switched on, reducing the effective total bus length. It will be appreciated that the number of I/O devices which may be connected to the D bus 22, and the number of slots provided for them, can be adjusted according to a particular implementation in accordance with specific design requirements
Figure 2 is a schematic overview of a particular implementation of a fault tolerant computer employing a bridge structure of the type illustrated in Figure 1. In Figure 2, the fault tolerant computer system includes a plurality (here four) of bridges 12 on first and second I/O motherboards (MB 40 and MB 42) in order to increase the number of I/O devices which may be connected and also to improve reliability and redundancy. Thus, in the embodiment shown in Figure 2, two processing sets 14 and 16 are each provided on a respective processing set board 44 and 46, with the processing set boards 44 and 46 'bridging' the I/O motherboards MB 40 and MB 42. A first, master clock source 20A is mounted on the first motherboard 40 and a second, slave clock source 20B is mounted on the second motherboard 42. Clock signals are supplied to the processing set boards 44 and 46 via respective connections (not shown in Figure 2)
First and second bridges 12.1 and 12.2 are mounted on the first I/O motherboard 40. The first bridge 12.1 is connected to the processing sets 14 and 16 by P buses 24.1 and 26.1, respectively. Similarly, the second bridge 12.2 is connected to the processing sets 14 and 16 by P buses 24.2 and 26.2, respectively. The bridge 12.1 is connected to an I/O databus (D bus) 22.1 and the bridge 12.2 is connected to an I/O databus (D bus) 22.2.
Third and fourth bridges 12.3 and 12.4 are mounted on the second I/O motherboard 42. The bridge 12.3 is connected to the processing sets 14 and 16 by P buses 24.3 and 26.3, respectively. Similarly, the bridge 12.4 is connected to the processing sets 14 and 16 by P buses 24.4 and 26.4, respectively. The bridge 12.3 is connected to an I/O databus (D bus) 22.3 and the bridge 12.4 is connected to an I/O databus (D bus) 22.4.
It can be seen that the arrangement shown in Figure 2 can enable a large number of I/O devices to be connected to the two processing sets 14 and 16 via the D buses 22.1, 22.2, 22.3 and 22.4, for either increasing the range of I/O devices available, or providing a higher degree of redundancy, or both. Figure 3 is a schematic overview of one possible configuration of a processing set, such as the processing set 14 of Figure 1. The processing set 16 could have the same configuration. In Figure 3, a plurality of processors (here four) 52 are connected by one or more buses 54 to a processing set bus controller 50. As shown in Figure 3, one or more processing set output buses 24 are connected to the processing set bus controller 50, each processing set output bus 24 being connected to a respective bridge 12. For example, in the arrangement of Figure 1, only one processing set I/O bus (P bus) 24 would be provided, whereas in the arrangement of Figure 2, four such processing set I/O buses (P buses) 24 would be provided. In the processing set 14 shown in Figure 3, individual processors operate using the common memory 56, and receive inputs and provide outputs on the common P bus(es) 24.
Figure 4 is an alternative configuration of a processing set, such as the processing set 14 of Figure 1. Here a plurality of processor/memory groups 61 are connected to a common internal bus 64. Each processor/memory group 61 includes one or more processors 62 and associated memory 66 connected to an internal group bus 63. An interface 65 connects the internal group bus 63 to the common internal bus 64. Accordingly, in the arrangement shown in Figure 4, individual processing groups, each with one or more processors 62 and associated memory 66, are connected via a common internal bus 64 to a processing set bus controller 60. The interfaces 65 enable a processor 62 of one processing group to operate not only on the data in its local memory 66, but also on the data in the memory of another processing group 61 within the processing set 14. The processing set bus controller 60 provides a common interface between the common internal bus 64 and the processing set I/O bus(es) (P bus(es)) 24 connected to the bridge(s) 12. It should be noted that although only two processing groups 61 are shown in Figure 4, such a structure is not limited to this number of processing groups.
Figure 5 illustrates an alternative configuration of a processing set, such as the processing set 14 of Figure 1. Here a simple processing set includes a single processor 72 and associated memory 76 connected via a common bus 74 to a processing set bus controller 70. The processing set bus controller 70 provides an interface between the internal bus 74 and the processing set I/O bus(es) (P bus(es)) 24 for connection to the bridge(s) 12.
Accordingly, it will be appreciated from Figures 3, 4 and 5 that the processing set may have many different forms and that the particular choice of a particular processing set structure can be made on the basis of the processing requirement of a particular application and the degree of redundancy required. In the following description, it is assumed that the processing sets 14 and 16 referred to have a structure as shown in Figure 3, although it will be appreciated that another form of processing set could be provided.
The bridge(s) 12 are operable in a number of operating modes. These modes of operation will be described in more detail later. However, to assist in a general understanding of the structure of the bridge, the two operating modes will be briefly summarized here. In a first, combined mode, a bridge 12 is operable to route addresses and data between the processing sets 14 and 16 (via the PA and PB buses 24 and 26, respectively) and the devices (via the D bus 22). In this combined mode, I/O cycles generated by the processing sets 14 and 16 are compared to ensure that both processing sets are operating correctly. Comparison failures force the bridge 12 into an error limiting mode (EState) in which device I/O is prevented and diagnostic information is collected. In the second, split mode, the bridge 12 routes and arbitrates addresses and data from one of the processing sets 14 and 16 onto the D bus 22 and/or onto the other one of the processing sets 16 and 14, respectively. In this mode of operation, the processing sets 14 and 16 are not synchronized and no I/O comparisons are made. DMA operations are also permitted in both modes. As mentioned above, the different modes of operation, including the combined and split modes, will be described in more detail later. However, there now follows a description of the basic structure of an example of the bridge 12.
Figure 6 is a schematic functional overview of the bridge 12 of Figure 1. First and second processing set I/O bus interfaces, PA bus interface 84 and PB bus interface 86, are connected to the PA and PB buses 24 and 26 respectively. A device I/O bus interface, D bus interface 82, is connected to the D bus 22. It should be noted that the PA, PB and D bus interfaces need not be configured as separate elements but could be incorporated in other elements of the bridge. Accordingly, within the context of this document, where a reference is made to a bus interface, this does not require the presence of a specific separate component, but rather the capability of the bridge to connect to the bus concerned, for example by means of physical or logical bridge connections for the lines of the buses concerned. Routing (hereinafter termed a routing matrix) 80 is connected via a first internal path 94 to the PA bus interface 84 and via a second internal path 96 to the PB bus interface 86. The routing matrix 80 is further connected via a third internal path 92 to the D bus interface 82. The routing matrix 80 is thereby able to provide I/O bus transaction routing in both directions between the PA and PB bus interfaces 84 and 86. It is also able to provide routing in both directions between one or both of the PA and PB bus interfaces and the D bus interface 82. The routing matrix 80 is connected via a further internal path 100 to storage control logic 90. The storage control logic 90 controls access to bridge registers 110 and to a random access memory (SRAM) 126. The routing matrix 80 is therefore also operable to provide routing in both directions between the PA, PB and D bus interfaces 84, 86 and 82 and the storage control logic 90. The routing matrix 80 is controlled by bridge control logic 88 over control paths 98 and 99. The bridge control logic 88 is responsive to control signals, data and addresses on internal paths 93, 95 and 97, and also to clock signals on the clock line(s) 21.
In the embodiment of the invention, each of the P buses (PA bus 24 and PB bus 26) operates under a PCI protocol. The processing set bus controllers 50 (see Figure 3) also operate under the PCI protocol. Accordingly, the PA and PB bus interfaces 84 and 86 each provide all the functionality required for a compatible interface providing both master and slave operation for data transferred to and from the D bus 22 or internal memories and registers of the bridge in the storage subsystem 90. The bus interfaces 84 and 86 can provide diagnostic information to internal bridge status registers in the storage subsystem 90 on transition of the bridge to an error state (EState) or on detection of an I/O error.
The device bus interface 82 performs all the functionality required for a PCI compliant master and slave interface for transferring data to and from one of the PA and PB buses 24 and 26. The D bus interface 82 is operable during direct memory access (DMA) transfers to provide diagnostic information to internal status registers in the storage subsystem 90 of the bridge on transition to an EState or on detection of an I/O error.
Figure 7 illustrates in more detail the bridge registers 110 and the SRAM 126. The storage control logic 90 is connected via a path (e.g. a bus) 112 to a number of register components 114, 116, 118, 120. The storage control logic is also connected via a path (e.g. a bus) 128 to the SRAM 126, in which a posted write buffer component 122 and a dirty RAM component 124 are mapped. Although a particular configuration of the components 114, 116, 118, 120, 122 and 124 is shown in Figure 7, these components may be configured in other ways, with other components defined as regions of a common memory (e.g. a random access memory such as the SRAM 126, with the path 112/128 being formed by the internal addressing of the regions of memory). As shown in Figure 7, the posted write buffer 122 and the dirty RAM 124 are mapped to different regions of the SRAM memory 126, whereas the registers 114, 116, 118 and 120 are configured as separate from the SRAM memory.
Control and status registers (CSRs) 114 form internal registers which allow the control of various operating modes of the bridge, allow the capture of diagnostic information for an EState and for I/O errors, and control processing set access to PCI slots and devices connected to the D bus 22. These registers are set by signals from the routing matrix 80.
Dissimilar data registers (DDRs) 116 provide locations for containing dissimilar data for different processing sets to enable non-deterministic data events to be handled. These registers are set by signals from the PA and PB buses.
Bridge decode logic enables a common write to disable a data comparator and allow writes to two DDRs 116, one for each processing set 14 and 16.
A selected one of the DDRs can then be read in-sync by the processing sets 14 and 16. The DDRs thus provide a mechanism enabling a location to be reflected from one processing set (14/16) to another (16/14).
Slot response registers (SRRs) 118 determine ownership of device slots on the D bus 22 and allow DMA to be routed to the appropriate processing set(s). These registers are linked to address decode logic. Disconnect registers 120 are used for the storage of data phases of an I/O cycle which is aborted while data is in the bridge on the way to another bus. The disconnect registers 120 receive all data queued in the bridge when a target device disconnects a transaction, or as the EState is detected. These registers are connected to the routing matrix 80. The routing matrix can queue up to three data words and byte enables. Provided the initial addresses are voted as being equal, address target controllers derive addresses which increment as data is exchanged between the bridge and the destination (or target). Where a writer (for example a processor I/O write, or a DVMA (D bus to P bus access)) is writing data to a target, this data can be caught in the bridge when an error occurs. Accordingly, this data is stored in the disconnect registers 120 when an error occurs. These disconnect registers can then be accessed on recovery from an EState to recover the data associated with the write or read cycle which was in progress when the EState was initiated. Although shown separately, the DDRs 116, the SRRs 118 and the disconnect registers may form an integral part of the CSRs 114.
EState and error CSRs 114 are provided for the capture of a failing cycle on the P buses 24 and 26, with an indication of the failing datum. Following a move to an EState, all of the writes initiated to the P buses are logged in the posted write buffer 122. These may be other writes that have been posted in the processing set bus controllers 50, or writes which may be initiated by software before an EState interrupt causes the processors to stop carrying out writes to the P buses 24 and 26.
A dirty RAM 124 is used to indicate which pages of the main memory 56 of the processing sets 14 and 16 have been modified by direct memory access (DMA) transactions from one or more devices on the D bus 22. Each page (e.g. each 8K page) is marked by a single bit in the dirty RAM 124 which is set when a DMA write occurs and can be cleared by a read and clear cycle initiated on the dirty RAM 124 by a processor 52 of a processing set 14 and 16.
The dirty RAM 124 and the posted write buffer 122 may both be mapped into the memory 126 in the bridge 12. This memory space can be accessed during normal read and write cycles for testing purposes. Figure 8 is a schematic functional overview of the bridge control logic 88 shown in Figure 6.
All of the devices connected to the D bus 22 are addressed geographically. Accordingly, the bridge carries out the decoding necessary to enable the isolating FETs for each slot before an access to those slots is initiated.
The address decoding performed by the address decode logic 136 and 138 essentially permits four basic access types:
- an out-of-sync access (i.e. not in the combined mode) by one processing set (e.g. processing set 14 of Figure 1) to the other processing set (e.g. processing set 16 of Figure 1), in which case the access is routed from the PA bus interface 84 to the PB bus interface 86;
- an access by one of the processing sets 14 and 16 in the split mode, or both processing sets 14 and 16 in the combined mode, to an I/O device on the D bus 22, in which case the access is routed via the D bus interface 82;
- a DMA access by a device on the D bus 22 to one or both of the processing sets 14 and 16, which would be directed to both processing sets 14 and 16 in the combined mode, or to the relevant processing set 14 or 16 if out-of-sync, and if in a split mode to a processing set 14 or 16 which owns the slot in which the device is located; and
- a PCI configuration access to devices in I/O slots.
As mentioned above, geographic addressing is employed. Thus, for example, slot 0 on motherboard A has the same address when refeπed to by processing set 14 or by processing set 16.
Geographic addressing is used in combination with the PCI slot FET switching. During a configuration access mentioned above, separate device select signals are provided for devices which are not FET isolated. A single device select signal can be provided for the switched PCI slots as the FET signals can be used to enable a correct card. Separate FET switch lines are provided to each slot for separately switching the FETs for the slots. The SRRs 118, which could be incorporated in the CSR registers 114, are associated with the address decode functions. The SRRs 118 serve in a number of different roles which will be described in more detail later. However, some of the roles are summarized here.
In a combined mode, each slot may be disabled so that writes are simply acknowledged without any transaction occurring on the device bus 22, whereby the data is lost. Reads will return meaningless data, once again without causing a transaction on the device board.
In the split mode, each slot can be in one of three states. The states are:
- Not owned;
- Owned by processing set A 14;
- Owned by processing set B 16.
A slot that is not owned by a processing set 14 or 16 making an access (this includes not owned or un-owned slots) cannot be accessed. Accordingly, such an access is aborted.
When a processing set 14 or 16 is powered off, all slots owned by it move to the un-owned state. A processing set 14 or 16 can only claim an un-owned slot; it cannot wrest ownership away from another processing set. This can only be done by powering off the other processing set, or by getting the other processing set to relinquish ownership.
The ownership bits are assessable and settable while in the combined mode, but have no effect until a split state is entered. This allows the configuration of a split system to be determined while still in the combined mode.
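The ownership rules just described amount to a small state machine per slot, sketched below (a minimal illustration with assumed structure, not the register encoding itself):

    # Sketch of the per-slot ownership rules.
    UNOWNED, OWNED_BY_A, OWNED_BY_B = range(3)

    class Slot:
        def __init__(self):
            self.state = UNOWNED

        def claim(self, owner):
            # A processing set can only claim an un-owned slot; it cannot
            # wrest ownership away from the other processing set.
            if self.state == UNOWNED:
                self.state = owner
                return True
            return self.state == owner

        def owner_powered_off(self):
            # All slots owned by a powered-off processing set become un-owned.
            self.state = UNOWNED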
Each PCI device is allocated an area of the processing set address map. The top bits of the address are determined by the PCI slot. Where a device carries out DMA, the bridge is able to check that the device is using the correct address, because a D bus arbiter informs the bridge which device is using the bus at a particular time. If a device access is to a processing set address which is not valid for it, then the device access will be ignored. It should be noted that an address presented by a device will be a virtual address which would be translated by an I/O memory management unit in the processing set bus controller 50 to an actual memory address. The addresses output by the address decoders are passed via the initiator and target controllers 138 and 140 to the routing matrix 80 via the lines 98 under control of a bridge controller 132 and an arbiter 134.
An arbiter 134 is operable in various different modes to arbitrate for use of the bridge on a first-come-first-served basis using conventional PCI bus signals on the P and D buses.
In a combined mode, the arbiter 134 is operable to arbitrate between the in-sync processing sets 14 and 16 and any initiators on the device bus 22 for use of the bridge 12. Possible scenarios are:
- processing set access to the device bus 22;
- processing set access to internal registers in the bridge 12;
- device access to the processing set memory 56.
In split mode, both processing sets 14 and 16 must arbitrate for the use of the bridge and thus access to the device bus 22 and internal bridge registers (e.g. CSR registers 114). The bridge 12 must also contend with initiators on the device bus 22 for use of that device bus 22.
Each slot on the device bus has an arbitration enable bit associated with it. These arbitration enable bits are cleared after reset and must be set to allow a slot to request a bus. When a device on the device bus 22 is suspected of providing an I/O error, the arbitration enable bit for that device is automatically reset by the bridge. A PCI bus interface in the processing set bus controller(s) 50 expects to be the master bus controller for the P bus concerned, that is, it contains the PCI bus arbiter for the PA or PB bus to which it is connected. The bridge 12 cannot directly control access to the PA and PB buses 24 and 26. The bridge 12 competes for access to the PA or PB bus with the processing set on the bus concerned under the control of the bus controller 50 on the bus concerned. Also shown in Figure 8 are a comparator 130 and a bridge controller 132. The comparator 130 is operable to compare I/O cycles from the processing sets 14 and 16 to determine any out-of-sync events. On determining an out-of-sync event, the comparator 130 is operable to cause the bridge controller 132 to activate an EState for analysis of the out-of-sync event and possible recovery therefrom.
Figure 9 is a schematic functional overview of the routing matrix 80. The routing matrix 80 comprises a multiplexer 143 which is responsive to initiator control signals 98 from the initiator controller 138 of Figure 8 to select one of the PA bus path 94, PB bus path 96, D bus path 92 or internal bus path 100 as the current input to the routing matrix. Separate output buffers 144, 145, 146 and 147 are provided for output to each of the paths 94, 96, 92 and 100, with those buffers being selectively enabled by signals 99 from the target controller 140 of Figure 8. Between the multiplexer and the buffers 144-147, signals are held in a buffer 149. In the present embodiment three cycles of data for an I/O cycle will be held in the pipeline represented by the multiplexer 143, the buffer 149 and the buffers 144-147.
In Figures 6 to 9 a functional description of elements of the bridge has been given. Figure 10 is a schematic representation of a physical configuration of the bridge in which the bridge control logic 88, the storage control logic 90 and the bridge registers 110 are implemented in a first field programmable gate array (FPGA) 89, the routing matrix 80 is implemented in further FPGAs 80.1 and 80.2, and the SRAM 126 is implemented as one or more separate SRAMs addressed by address control lines 127. The bus interfaces 82, 84 and 86 shown in Figure 6 are not separate elements, but are integrated in the FPGAs 80.1, 80.2 and 89. Two FPGAs 80.1 and 80.2 are used for the upper 32 bits 32-63 of a 64 bit PCI bus and the lower 32 bits 0-31 of the 64 bit PCI bus. It will be appreciated that a single FPGA could be employed for the routing matrix 80 where the necessary logic can be accommodated within the device. Indeed, where an FPGA of sufficient capacity is available, the bridge control logic, storage control logic and the bridge registers could be incorporated in the same FPGA as the routing matrix. Many other configurations may be envisaged, and indeed technology other than FPGAs, for example one or more Application Specific Integrated Circuits (ASICs), may be employed. As shown in Figure 10, the FPGAs 89, 80.1 and 80.2 and the SRAM 126 are connected via internal bus paths 85 and path control lines 87.
Figure 11 is a transition diagram illustrating in more detail the various operating modes of the bridge. The bridge operation can be divided into three basic modes, namely an error state (EState) mode 150, a split state mode 156 and a combined state mode 158. The EState mode 150 can be further divided into two states.
After initial resetting on powering up the bridge, or following an out-of-sync event, the bridge is in this initial EState 152. In this state, all writes are stored in the posted write buffer 122 and reads from the internal bridge registers (e.g., the CSR registers 114) are allowed, and all other reads are treated as errors (i.e. they are aborted). In this state, the individual processing sets 14 and 16 perform evaluations for determining a restart time. Each processing set 14 and 16 will determine its own restart timer timing. The timer setting depends on a "blame" factor for the transition to the EState. A processing set which determines that it is likely to have caused the error sets a long time for the timer. A processing set which thinks it unlikely to have caused the error sets a short time for the timer. The first processing set 14 or 16 which times out becomes a primary processing set. Accordingly, when this is determined, the bridge moves (153) to the primary EState 154.
When either processing set 14/16 has become the primary processing set, the bridge is then operating in the primary EState 154. This state allows the primary processing set to write to bridge registers (specifically the SRRs 118). Other writes are no longer stored in the posted write buffer, but are simply lost. Device bus reads are still aborted in the primary EState 154.
Once the EState condition is removed, the bridge then moves (155) to the split state 156. In the split state 156, access to the device bus 22 is controlled by the SRR registers 118 while access to the bridge storage is simply arbitrated. The primary status of the processing sets 14 and 16 is ignored. Transition to a combined operation is achieved by means of a sync_reset (157). After issue of the sync_reset operation, the bridge is then operable in the combined state 158, whereby all read and write accesses on the D bus 22 and the PA and PB buses 24 and 26 are allowed. All such accesses on the PA and PB buses 24 and 26 are compared in the comparator 130. Detection of a mismatch between any read and write cycles (with an exception of specific dissimilar data I/O cycles) causes a transition 151 to the EState 150. The various states described are controlled by the bridge controller 132.
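The transitions of Figure 11 can be tabulated as follows (a sketch only; the event names are descriptive labels assumed for this illustration, and the bracketed numbers are the transitions in the figure):

    # Sketch of the operating-mode transitions of Figure 11.
    TRANSITIONS = {
        ("initial EState 152", "primary determined"):  "primary EState 154",  # 153
        ("primary EState 154", "EState removed"):      "split 156",           # 155
        ("split 156",          "sync_reset"):          "combined 158",        # 157
        ("combined 158",       "comparison mismatch"): "initial EState 152",  # 151
    }

    def next_state(state, event):
        return TRANSITIONS.get((state, event), state)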
The role of the comparator 130 is to monitor and compare I/O operations on the PA and PB buses in the combined state 158 and, in response to a mismatched signal, to notify the bridge controller 132, whereby the bridge controller 132 causes the transition 151 to the error state 150. The I/O operations can include all I/O operations initiated by the processing sets, as well as DMA transfers in respect of DMA initiated by a device on the device bus.
Table 1 below summarizes the various access operations which are allowed in each of the operational states:

TABLE 1

State            D Bus - Read              D Bus - Write
EState           Master Abort              Stored in Posted Write Buffer
Primary EState   Master Abort              Lost
Split            Controlled by SRR bits    Controlled by SRR bits
                 and arbitrated            and arbitrated
Combined         Allowed and compared      Allowed and compared
As described above, after an initial reset, the system is in the initial EState 152. In this state, neither processing set 14 nor 16 can access the D bus 22 or the P bus 26 or 24 of the other processing set 16 or 14. The internal bridge registers 110 of the bridge are accessible, but are read only. A system running in the combined mode 158 transitions to the EState 150 where there is a comparison failure detected in this bridge, or alternatively a comparison failure is detected in another bridge in a multi-bridge system as shown, for example, in Figure 2. Also, transitions to an EState 150 can occur in other situations, for example in the case of a software controlled event forming part of a self test operation.
On moving to the EState 150, an interrupt is signaled to all or a subset of the processors of the processing sets via an interrupt line 95. Following this, all I/O cycles generated on a P bus 24 or 26 result in reads being returned with an exception and writes being recorded in the posted write buffer.
The operation of the comparator 130 will now be described m more detail. The comparator is connected to paths 94, 95, 96 and 97 for comparing address, data and selected control signals from the PA and PB bus mterfaces 84 and 86. A failed companson of ui-sync accesses to device I/O bus 22 devices causes a move from the combined state 158 to the EState 150.
For processing set I/O read cycles, the address, command, address parity, byte enables and panty eπor parameters are compared. If the comparison fails dunng the address phase, the bridge asserts a retry to the processmg set bus controllers 50, which prevents data leaving the I O bus controllers 50 No activity occurs in this case on the device I O bus 22 On the processor(s) retrying, no eπor is returned
If the comparison fails during a data phase (only control signals and byte enables are checked), the bridge signals a target-abort to the processing set bus controllers 50. An error is returned to the processors.
In the case of processing set I/O bus write cycles, the address, command, parity, byte enables and data parameters are compared.
If the comparison fails during the address phase, the bridge asserts a retry to the processing set bus controllers 50, which results in the processing set bus controllers 50 retrying the cycle. The posted write buffer 122 is then active. No activity occurs on the device I/O bus 22.
If the comparison fails during the data phase of a write operation, no data is passed to the D bus 22. The failing data and any other transfer attributes from both processing sets 14 and 16 are stored in the disconnect registers 120, and any subsequent posted write cycles are recorded in the posted write buffer 122.
In the case of direct virtual memory access (DVMA) reads, the data control and parity are checked for each datum. If the data does not match, the bridge 12 terminates the transfer on the P bus. In the case of DVMA writes, control and parity error signals are checked for correctness.
Other signals in addition to those specifically mentioned above can be compared to give an indication of divergence of the processing sets. Examples of these are bus grants and various specific signals during processing set transfers and during DMA transfers. Errors fall roughly into two types: those which are made visible to the software by the processing set bus controller 50, and those which are not made visible by the processing set bus controller 50 and hence need to be made visible by an interrupt from the bridge 12. Accordingly, the bridge is operable to capture errors reported in connection with processing set read and write cycles, and DMA reads and writes.
Clock control for the bridge is performed by the bridge controller 132 in response to the clock signals from the clock line 21. Individual control lines from the controller 132 to the various elements of the bridge are not shown in Figures 6 to 10.
Figure 12 is a flow diagram illustrating a possible sequence of operating stages where lockstep errors are detected during a combined mode of operation.
Stage S1 represents the combined mode of operation where lockstep error checking is performed by the comparator 130 shown in Figure 8.
In Stage S2, a lockstep error is assumed to have been detected by the comparator 130. In Stage S3, the current state is saved in the CSR registers 114 and posted writes are saved in the posted write buffer 122 and/or in the disconnect registers 120.
Figure 13 illustrates Stage S3 in more detail. In Stage S31, the bridge controller 132 detects whether the lockstep error notified by the comparator 130 has occurred during a data phase in which it is possible to pass data to the device bus 22. In this case, in Stage S32, the bus cycle is terminated. Then, in Stage S33, the data phases are stored in the disconnect registers 120 and control passes to Stage S35, where an evaluation is made as to whether a further I/O cycle needs to be stored. Alternatively, if at Stage S31 it is determined that the lockstep error did not occur during a data phase, the address and data phases for any posted write I/O cycles are stored in the posted write buffer 122. At Stage S34, if there are any further posted write I/O operations pending, these are also stored in the posted write buffer 122.
Stage S3 is performed at the initiation of the initial error state 152 shown in Figure 11. In this state the first and second processing sets arbitrate for access to the bridge. Accordingly, in Stages S31-S35, the posted write address and data phases for each of the processing sets 14 and 16 are stored in separate portions of the posted write buffer 122, and/or in the single set of disconnect registers as described above.
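A rough behavioural model of stages S31 to S35 follows; all types, buffer sizes and names are hypothetical (the real bridge performs this in hardware), but the decision between the disconnect registers and the posted write buffer mirrors the description above.

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct {
        unsigned addr;
        unsigned data;
        bool     in_data_phase;   /* caught mid data phase to the D bus? */
    } io_cycle_t;

    #define PWB_SLOTS 16          /* illustrative buffer depth */
    static io_cycle_t posted_write_buffer[PWB_SLOTS];
    static size_t     pwb_next;
    static io_cycle_t disconnect_regs;   /* single set of disconnect registers */

    /* Stage S3 of Figure 12, expanded as stages S31 to S35 of Figure 13. */
    void capture_on_lockstep_error(const io_cycle_t *pending, size_t n)
    {
        for (size_t i = 0; i < n; i++) {        /* S35: further cycles to store? */
            if (pending[i].in_data_phase) {     /* S31 */
                /* S32: the bus cycle is terminated, then S33: the failing
                 * data phase is latched in the disconnect registers. */
                disconnect_regs = pending[i];
            } else if (pwb_next < PWB_SLOTS) {
                /* S34: posted write address and data phases are buffered. */
                posted_write_buffer[pwb_next++] = pending[i];
            }
        }
    }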
Figure 14 illustrates the source of the posted write I/O cycles which need to be stored in the posted write buffer 122. During normal operation of the processing sets 14 and 16, output buffers 162 in the individual processors contain I/O cycles which have been posted for transfer via the processing set bus controllers 50 to the bridge 12 and eventually to the device bus 22. Similarly, buffers 160 in the processing set controllers 50 also contain posted I/O cycles for transfer over the buses 24 and 26 to the bridge 12 and eventually to the device bus 22.
Accordingly, it can be seen that when an error state occurs, I/O write cycles may already have been posted by the processors 52, either in their own buffers 162, or already transferred to the buffers 160 of the processing set bus controllers 50. It is the I/O write cycles in the buffers 162 and 160 which gradually propagate through and need to be stored in the posted write buffer 122.
As shown in Figure 15, a write cycle 164 posted to the posted write buffer 122 can comprise an address field 165 including an address and an address type, and between one and 16 data fields 166, each including a byte enable field and the data itself.
The data is written into the posted write buffer 122 in the EState unless the initiating processing set has been designated as a primary CPU set; non-primary writes in an EState still go to the posted write buffer even after one of the CPU sets has become a primary processing set. An address pointer in the CSR registers 114 points to the next available posted write buffer address, and also provides an overflow bit which is set when the bridge attempts to write past the top of the posted write buffer for any one of the processing sets 14 and 16. Indeed, in the present implementation, only the first 16K of data is recorded in each buffer. Attempts to write beyond the top of the posted write buffer are ignored. The value of the posted write buffer pointer can be cleared at reset, or by software using a write under the control of a primary processing set.
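As a sketch only, the entry format of Figure 15 and the pointer/overflow behaviour might be modelled as below; the field widths and the fixed-size entry are assumptions, since the description does not fix the layout to this level of detail.

    #include <stdint.h>
    #include <stdbool.h>

    /* One posted write cycle (Figure 15): an address field 165 carrying the
     * address and address type, and 1 to 16 data fields 166, each carrying
     * byte enables and the datum itself. */
    typedef struct {
        uint32_t addr;
        uint8_t  addr_type;
        uint8_t  n_data;                              /* 1..16 */
        struct { uint8_t byte_en; uint32_t data; } d[16];
    } posted_write_t;

    #define PWB_BYTES   (16 * 1024)   /* only the first 16K recorded per set */
    #define PWB_ENTRIES (PWB_BYTES / sizeof(posted_write_t))

    typedef struct {
        posted_write_t buf[PWB_ENTRIES];
        unsigned next;       /* address pointer held in the CSR registers 114 */
        bool     overflow;   /* set on an attempt to write past the top */
    } pwb_t;

    /* Writes beyond the top of the buffer set the overflow bit, then are ignored. */
    static void pwb_append(pwb_t *b, const posted_write_t *w)
    {
        if (b->next >= PWB_ENTRIES) { b->overflow = true; return; }
        b->buf[b->next++] = *w;
    }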
Returning to Figure 12, after saving the status and posted writes, at Stage S4 the individual processing sets independently seek to evaluate the error state and to determine whether one of the processing sets is faulty. This determination is made by the individual processors in an error state in which they individually read status from the control state and EState registers 114. During this error mode, the arbiter 134 arbitrates for access to the bridge 12.
In Stage S5, one of the processing sets 14 and 16 establishes itself as the primary processing set. This is determined by each of the processing sets identifying a time factor based on its estimated degree of responsibility for the error, whereby the first processing set to time out becomes the primary processing set. In Stage S5, the status is recovered for that processing set and is copied to the other processing set. The primary processing set is able to access the posted write buffer 122 and the disconnect registers 120.
In Stage S6, the bridge is operable in a split mode. If it is possible to re-establish an equivalent status for the first and second processing sets, then a reset is issued at Stage S7 to put the processing sets in the combined mode at Stage S1. However, it may not be possible to re-establish an equivalent state until a faulty processing set is replaced. Accordingly, the system will stay in the split mode of Stage S6 in order to continue operation based on a single processing set. After replacing the faulty processing set, the system could then establish an equivalent state and move via Stage S7 to Stage S1.
As described above, the comparator 130 is operable in the combined mode to compare the I/O operations output by the first and second processing sets 14 and 16. This is fine as long as all of the I/O operations of the first and second processing sets 14 and 16 are fully synchronized and deterministic. Any deviation from this will be interpreted by the comparator 130 as a loss of lockstep. This is in principle correct, as even a minor deviation from identical outputs, if not trapped by the comparator 130, could lead to the processing sets diverging further from each other as the individual processing sets act on the deviating outputs. However, a strict application of this puts significant constraints on the design of the individual processing sets. An example of this is that it would not be possible to have independent time of day clocks in the individual processing sets operating under their own clocks. This is because it is impossible to obtain two crystals which are 100% identical in operation. Even small differences in the phase of the clocks could be critical as to whether the same sample is taken at any one time, for example either side of a clock transition for the respective processing sets.
Accordingly, a solution to this problem employs the dissimilar data registers (DDRs) 116 mentioned earlier. The solution is to write data from the processing sets into respective DDRs in the bridge while disabling the comparison of the data phases of the write operations, and then to read a selected one of the DDRs back to each processing set, whereby each of the processing sets is able to act on the same data.
Figure 17 is a schematic representation of details of the bridge of Figures 6 to 10. It will be noted that details of the bridge not shown in Figures 6 to 8 are shown in Figure 17, whereas other details of the bridge shown in Figures 6 to 8 are not shown in Figure 17, for reasons of clarity.
The DDRs 116 are provided in the bridge registers 110 of Figure 7, but could be provided elsewhere in the bridge in other embodiments. One DDR 116 is provided for each processing set. In the example of the multi-processor system of Figure 1 where two processing sets 14 and 16 are provided, two DDRs 116A and 116B are provided, one for each of the first and second processing sets 14 and 16, respectively. Figure 17 represents a dissimilar data write stage. The addressing logic 136 is shown schematically to comprise two decoder sections, one decoder section 136A for the first processing set 14 and one decoder section 136B for the second processing set 16. During an address phase of a dissimilar data I/O write operation, each of the processing sets 14 and 16 outputs the same predetermined address DDR-W, which is separately interpreted by the respective first and second decoding sections 136A and 136B as addressing the respective first and second DDRs 116A and 116B. As the same address is output by the first and second processing sets 14 and 16, this is not interpreted by the comparator 130 as a lockstep error.
The decoding section 136A, or the decoding section 136B, or both, are arranged to further output a disable signal 137 in response to the predetermined write address supplied by the first and second processing sets 14 and 16. This disable signal is supplied to the comparator 130 and is operative during the data phase of the write operation to disable the comparator. As a result, the data output by the first processing set can be stored in the first DDR 116A and the data output by the second processing set can be stored in the second DDR 116B without the comparator being operative to detect a difference, even if the data from the first and second processing sets is different. The first decoding section is operable to cause the routing matrix to store the data from the first processing set 14 in the first DDR 116A, and the second decoding section is operable to cause the routing matrix to store the data from the second processing set 16 in the second DDR 116B. At the end of the data phase, the comparator 130 is once again enabled to detect any differences between I/O address and/or data phases as indicative of a lockstep error.
Following the writing of the dissimilar data to the first and second DDRs 116A and 116B, the processing sets are then operable to read the data from a selected one of the DDRs 116A/116B.
Figure 18 illustrates an alternative arrangement where the disable signal 137 is negated and is used to control a gate 131 at the output of the comparator 130. When the disable signal is active the output of the comparator is disabled, whereas when the disable signal is inactive the output of the comparator is enabled.
Figure 19 illustrates the reading of the first DDR 116A in a subsequent dissimilar data read stage. As illustrated in Figure 19, each of the processing sets 14 and 16 outputs the same predetermined address DDR-RA, which is separately interpreted by the respective first and second decoding sections 136A and 136B as addressing the same DDR, namely the first DDR 116A. As a result, the content of the first DDR 116A is read by both of the processing sets 14 and 16, thereby enabling those processing sets to receive the same data. This enables the two processing sets 14 and 16 to achieve deterministic behavior, even if the source of the data written into the DDRs 116 by the processing sets 14 and 16 was not deterministic.
As an alternative, the processing sets could each read the data from the second DDR 116B. Figure 20 illustrates the reading of the second DDR 116B in a dissimilar data read stage following the dissimilar data write stage of Figure 17. As illustrated in Figure 20, each of the processing sets 14 and 16 outputs the same predetermined address DDR-RB, which is separately interpreted by the respective first and second decoding sections 136A and 136B as addressing the same DDR, namely the second DDR 116B. As a result, the content of the second DDR 116B is read by both of the processing sets 14 and 16, thereby enabling those processing sets to receive the same data. As with the dissimilar data read stage of Figure 19, this enables the two processing sets 14 and 16 to achieve deterministic behavior, even if the source of the data written into the DDRs 116 by the processing sets 14 and 16 was not deterministic. The selection of which of the first and second DDRs 116A and 116B is to be read can be determined in any appropriate manner by the software operating on the processing modules. This could be done on the basis of a simple selection of one or the other DDR, or on a statistical basis, or randomly, or in any other manner, as long as the same choice of DDR is made by both or all of the processing sets.
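The DDR write/read sequence can be summarized in the following C sketch. It is purely illustrative: the register addresses and all names are hypothetical, and the comparator is modelled as a simple flag rather than hardware.

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical register addresses for the DDR operations. */
    enum { DDR_W = 0x100, DDR_RA = 0x104, DDR_RB = 0x108 };

    static uint32_t ddr[2];                 /* DDR 116A and DDR 116B */
    static bool comparator_enabled = true;  /* comparator 130, as a flag */

    /* Dissimilar data write: decoding DDR-W raises the disable signal 137,
     * so data_a and data_b may legitimately differ. */
    void ddr_write(uint32_t data_a, uint32_t data_b)
    {
        comparator_enabled = false;         /* disabled for the data phase only */
        ddr[0] = data_a;                    /* routed to DDR 116A */
        ddr[1] = data_b;                    /* routed to DDR 116B */
        comparator_enabled = true;          /* re-enabled after the data phase */
    }

    /* Dissimilar data read: both processing sets present the same address,
     * so both receive the content of the same selected DDR. */
    uint32_t ddr_read(uint32_t addr)
    {
        return (addr == DDR_RA) ? ddr[0] : ddr[1];
    }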
Figure 21 is a flow diagram summarizing the various stages of operation of the DDR mechanism described above.
In stage S10, a DDR write address DDR-W is received and decoded by the address decoder sections 136A and 136B during the address phase of the DDR write operation. In stage S11, the comparator 130 is disabled.
In stage S12, the data received from the processing sets 14 and 16 during the data phase of the DDR write operation is stored in the first and second DDRs 116A and 116B, respectively, as selected by the first and second decode sections 136A and 136B, respectively.
In stage S13, a DDR read address is received from the first and second processing sets and is decoded by the decode sections 136A and 136B, respectively. If the received address DDR-RA is for the first DDR 116A, then in stage S14 the content of that DDR 116A is read by both of the processing sets 14 and 16.
Alternatively, if the received address DDR-RB is for the second DDR 116B, then in stage S15 the content of that DDR 116B is read by both of the processing sets 14 and 16.

Figure 22 is a schematic representation of the arbitration performed on the respective buses 22, 24 and 26, and the arbitration for the bridge itself.
Each of the processing set bus controllers 50 in the respective processing sets 14 and 16 includes a conventional PCI master bus arbiter 180 for providing arbitration to the respective buses 24 and 26. Each of the master arbiters 180 is responsive to request signals from the associated processing set bus controller 50 and the bridge 12 on respective request (REQ) lines 181 and 182. The master arbiters 180 allocate access to the bus on a first-come-first-served basis, issuing a grant (GNT) signal to the winning party on an appropriate grant line 183 or 184.
A conventional PCI bus arbiter 185 provides arbitration on the D bus 22. The D bus arbiter 185 can be configured as part of the D bus interface 82 of Figure 6 or could be separate therefrom. As with the P bus master arbiters 180, the D bus arbiter is responsive to request signals from the contending devices, including the bridge and the devices 30, 31, etc. connected to the device bus 22. Respective request lines 186, 187, 188, etc. for each of the entities competing for access to the D bus 22 are provided for the request signals (REQ). The D bus arbiter 185 allocates access to the D bus on a first-come-first-served basis, issuing a grant (GNT) signal to the winning entity via respective grant lines 189, 190, 192, etc.

Figure 23 is a state diagram summarising the operation of the D bus arbiter 185. In a particular embodiment, up to six request signals may be produced by respective D bus devices and one by the bridge itself. On a transition into the GRANT state, these are sorted by a priority encoder and the request signal (REQ#) with the highest priority is registered as the winner and gets a grant (GNT#) signal. Each winner which is selected modifies the priorities in the priority encoder so that, given the same REQ# signals, a different device has the highest priority on the next move to GRANT; hence each device has a "fair" chance of accessing the D bus. The bridge REQ# has a higher weighting than the D bus devices and will, under very busy conditions, get the bus for every second device.
If a device requesting the bus fails to perform a transaction within 16 cycles, it may lose GNT# via the BACKOFF state. BACKOFF is required as, under PCI rules, a device may access the bus one cycle after GNT# is removed. Devices may only be granted access to the D bus if the bridge is not in the EState. A new GNT# is produced at the times when the bus is idle.
In the GRANT and BUSY states, the FETs are enabled, and the identity of the accessing device is known and forwarded to the D bus address decode logic for checking against a DMA address provided by the device.
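The rotating-priority behaviour can be approximated by the following C sketch. It is illustrative only (the function and variable names are invented), and it omits the higher weighting given to the bridge REQ# described above.

    #define NREQ 7   /* up to six D bus devices plus the bridge */

    static int last_winner = NREQ - 1;

    /* req[i] != 0 means requester i asserts REQ#.  Returns the winner to be
     * given GNT#, or -1 if no request is pending. */
    int dbus_grant(const int req[NREQ])
    {
        for (int off = 1; off <= NREQ; off++) {
            int i = (last_winner + off) % NREQ;   /* start past the last winner */
            if (req[i]) {
                last_winner = i;                  /* rotates the priorities */
                return i;
            }
        }
        return -1;
    }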
Turning now to the bridge arbiter 134, this allows access to the bridge for the first device which asserts the PCI FRAME# signal indicating an address phase. Figure 24 is a state diagram summarising the operation of the bridge arbiter 134.
As with the D bus arbiter, a priority encoder can be provided to resolve access attempts which collide. In the case of a collision, the loser or losers are retried, which forces them to give up the bus. Under PCI rules, retried devices must try repeatedly to access the bridge, and this can be expected to happen. To prevent devices which are very quick with their retry attempt from hogging the bridge, retried interfaces are remembered and assigned a higher priority. These remembered retries are prioritised in the same way as address phases. However, as a precaution, this mechanism is timed out so as not to get stuck waiting for a faulty or dead device. The algorithm employed prevents a device which has not yet been retried, but which would rank as a higher-priority retry than a device currently waiting, from winning access at its first attempt.
In combined operation, a PA or PB bus input selects which P bus interface will win a bridge access, and both are informed that they won. This selection enables latent fault checking during normal operation. The EState prevents the D bus from winning.
The bridge arbiter 134 is responsive to standard PCI signals provided on standard PCI control lines of the buses 22, 24 and 26 to control access to the bridge 12.
Figure 25 illustrates signals associated with an I/O operation cycle on the PCI bus. A PCI frame signal (FRAME#) is initially asserted. At the same time, address (A) signals will be available on the data bus (DATA BUS) and the appropriate command (write/read) signals (C) will be available on the command bus (CMD BUS). Shortly after the frame signal is asserted low, the initiator ready signal (IRDY#) will also be asserted low. When the device responds, a device selected signal (DEVSEL#) will be asserted low. When a target ready signal (TRDY#) is asserted low, data transfer (D) can occur on the data bus.
The bridge is operable to allocate access to the bridge resources, and thereby to negotiate allocation of a target bus, in response to FRAME# being asserted low for the initiator bus concerned. Accordingly, the bridge arbiter 134 is operable to allocate access to the bridge resources and/or to a target bus on a first-come-first-served basis in response to FRAME# being asserted low. As well as the simple first-come-first-served basis, the arbiters may additionally be provided with a mechanism for logging the arbitration requests, and can apply a conflict resolution based on the request and allocation history where two requests are received at an identical time. Alternatively, a simple priority can be allocated to the various requesters, whereby, in the case of identically timed requests, a particular requester always wins the allocation process.

Each of the slots on the device bus 22, as well as other devices connected to the bus such as a SCSI interface, has a slot response register (SRR) 118. Each of the SRRs 118 contains bits defining the ownership of the slots, or of the devices connected to the slots, on the direct memory access bus. In this embodiment, and for reasons to be elaborated below, each SRR 118 comprises a four bit register. However, it will be appreciated that a larger register will be required to determine ownership between more than two processing sets. For example, if three processing sets are provided, then a five bit register will be required for each slot.
Figure 16 illustrates schematically one such four bit register 600. As shown in Figure 16, a first bit 602 is identified as SRR[0], a second bit 604 is identified as SRR[1], a third bit 606 is identified as SRR[2] and a fourth bit 608 is identified as SRR[3].
Bit SRR[0] is a bit which is set when writes for valid transactions are to be suppressed. Bit SRR[1] is set when the device slot is owned by the first processing set 14. This defines the access route between the first processing set 14 and the device slot. When this bit is set, the first processing set 14 can always be master of the device slot or bus 22, while the ability for the device slot to be master depends on whether bit SRR[3] is set. Bit SRR[2] is set when the device slot is owned by the second processing set 16. This defines the access route between the second processing set 16 and the device slot. When this bit is set, the second processing set 16 can always be master of the device slot or bus 22, while the ability for the device slot to be master depends on whether bit SRR[3] is set. Bit SRR[3] is an arbitration bit which gives the device slot the ability to become master of the device bus 22, but only if it is owned by one of the processing sets 14 and 16, that is, if one of the SRR[1] and SRR[2] bits is set.
When the fake bit (SRR[0]) of an SRR 118 is set, writes to the device for that slot are ignored and do not appear on the device bus 22. Reads return indeterminate data without causing a transaction on the device bus 22. In the event of an I/O error, the fake bit SRR[0] of the SRR 118 corresponding to the device which caused the error is set by the hardware configuration of the bridge to disable further access to the device slot concerned. An interrupt may also be generated by the bridge to inform the software which originated the access leading to the I/O error that the error has occurred. The fake bit has an effect whether the system is in the split or the combined mode of operation. The ownership bits only have effect, however, in the split system mode of operation. In this mode, each slot can be in three states: not-owned; owned by processing set 14; and owned by processing set 16. This is determined by the two SRR bits SRR[1] and SRR[2], with SRR[1] being set when the slot is owned by processing set 14 and SRR[2] being set when the slot is owned by processing set 16. If the slot is unowned, then neither bit is set (both bits set is an illegal condition and is prevented by the hardware).
A slot which is not owned by the processing set making the access (this includes un-owned slots) cannot be accessed, and an attempted access results in an abort. A processing set can only claim an un-owned slot; it cannot wrest ownership away from another processing set. This can only be done by powering off the other processing set. When a processing set is powered off, all slots owned by it move to the un-owned state. Whilst it is not possible for a processing set to wrest ownership from another processing set, it is possible for a processing set to give ownership to another processing set.
The ownership bits can be altered when in the combined mode of operation, but they have no effect until the split mode is entered.
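Expressed as C bit masks, the ownership and arbitration tests reduce to the checks below. Only the bit positions come from Figure 16; the macro and function names are invented for illustration.

    #include <stdbool.h>
    #include <stdint.h>

    /* Bit assignments of the four-bit SRR of Figure 16. */
    #define SRR_FAKE    (1u << 0)  /* SRR[0]: discard writes, fake reads */
    #define SRR_OWNER_A (1u << 1)  /* SRR[1]: owned by processing set 14 */
    #define SRR_OWNER_B (1u << 2)  /* SRR[2]: owned by processing set 16 */
    #define SRR_ARB_EN  (1u << 3)  /* SRR[3]: slot may master the D bus */

    /* A slot may become bus master only if arbitration is enabled and the
     * slot is owned by one of the processing sets. */
    bool slot_may_master(uint8_t srr)
    {
        return (srr & SRR_ARB_EN) && (srr & (SRR_OWNER_A | SRR_OWNER_B));
    }

    /* In split mode an access from a processing set is aborted unless that
     * set owns the slot; owner_bit is SRR_OWNER_A or SRR_OWNER_B. */
    bool set_may_access_slot(uint8_t srr, uint8_t owner_bit)
    {
        return (srr & owner_bit) != 0;
    }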
Table 2 below summarizes the access rights as determined by an SRR 118.
From Table 2, it can be seen that when the 4-bit SRR for a given device is set to 1100, for example, then the slot is owned by processing set B (i.e. SRR[2] is logic high) and processing set A may not read from or write to the device (i.e. SRR[1] is logic low), although it may read from or write to the bridge. "FAKE AT" is set logic low (i.e. SRR[0] is logic low), indicating that access to the device bus is allowed as there are no faults on the bus. As "ARB_EN" is set logic high (i.e. SRR[3] is logic high), the device with which the register is associated can become master of the D bus. This example demonstrates the operation of the register when the bus and associated devices are operating correctly.

TABLE 2
SRR            PA BUS                   PB BUS                   Device Interface
[3][2][1][0]

0000 (x00x)    Read/Write bridge SRR    Read/Write bridge SRR    Access denied

0010           Read/Write bridge,       Read/Write bridge,       Access denied because
               Owned D slot             No access to D slot      arbitration bit is off

0100           Read/Write bridge,       Read/Write bridge,       Access denied because
               No access to D slot      Access to D slot         arbitration bit is off

1010           Read/Write bridge,       Read/Write bridge,       Access to CPU B denied;
               Owned D slot             No access to D slot      Access to CPU A OK

1100           Read/Write bridge,       Read/Write bridge,       Access to CPU A denied;
               No access to D slot      Access to D slot         Access to CPU B OK

0011           Read/Write bridge,       Read/Write bridge,       Access denied because
               Bridge discards writes   No access to D slot      arbitration bit is off

0101           Read/Write bridge,       Read/Write bridge,       Access denied because
               No access to D slot      Bridge discards writes   arbitration bit is off

1011           Read/Write bridge,       Read/Write bridge,       Access to CPU B denied;
               Bridge discards writes   No access to D slot      Access to CPU A OK

1101           Read/Write bridge,       Read/Write bridge,       Access to CPU A denied;
               No access to D slot      Bridge discards writes   Access to CPU B OK
In an alternative example, where the SRR for the device is set to 0101, the setting of SRR[2] logic high indicates that the device is owned by processing set B. However, as the device is malfunctioning, SRR[3] is set logic low and the device is not allowed access to the processing set. SRR[0] is set high so that any writes to the device are ignored and reads therefrom return indeterminate data. In this way, the malfunctioning device is effectively isolated from the processing set, and indeterminate data is provided to satisfy any device drivers, for example, that might be looking for a response from the device.

Figure 26 illustrates the operation of the bridge 12 for direct memory access by a device such as one of the devices 28, 29, 30, 31 and 32 to the memory 56 of the processing sets 14 and 16. When the D bus arbiter 185 receives a direct memory access (DMA) request 193 from a device (e.g. device 30 in slot 33) on the device bus, the D bus arbiter determines whether to allocate the bus to that slot. As a result of this granting procedure, the D bus arbiter knows the slot which has made the DMA request 193. The DMA request is supplied to the address decoder 142 in the bridge, where the addresses associated with the request are decoded. The address decoder is responsive to the D bus grant signal 194 for the slot concerned to identify the slot which has been granted access to the D bus for the DMA request.
The address decode logic 142 holds, or has access to, a geographic address map 196, which identifies the relationship between the processor address space and the slots as a result of the geographic addressing employed. This geographic address map 196 could be held as a table in the bridge memory 126, along with the posted write buffer 122 and the dirty RAM 124. Alternatively, it could be held as a table in a separate memory element, possibly forming part of the address decoder 142 itself. The map 196 could also be configured in a form other than a table.
The address decode logic 142 is configured to verify the correctness of the DMA addresses supplied by the device 30. In one embodiment of the invention, this is achieved by comparing four significant address bits of the address supplied by the device 30 with the corresponding four address bits of the address held in the geographic address map 196 for the slot identified by the D bus grant signal for the DMA request. In this example, four address bits are sufficient to determine whether the address supplied is within the correct address range. In this specific example, 32 bit PCI bus addresses are used, with bits 31 and 30 always being set to 1, bit 29 being allocated to identify which of two bridges on a motherboard is being addressed (see Figure 2) and bits 28 to 26 identifying a PCI device. Bits 25-0 define an offset from the base address of the address range for each slot. Accordingly, by comparing bits 29-26, it is possible to identify whether the address(es) supplied fall(s) within the appropriate address range for the slot concerned. It will be appreciated that in other embodiments a different number of bits may need to be compared to make this determination, depending upon the allocation of the addresses.
The address decode logic 142 could be arranged to use the D bus grant signal 194 for the slot concerned to identify a table entry for the slot and then to compare the address in that entry with the address(es) received with the DMA request, as described above. Alternatively, the address decode logic 142 could be arranged to use the address(es) received with the DMA request to address a relational geographic address map and to determine a slot number therefrom, which could be compared to the slot for which the D bus grant signal 194 is intended, thereby determining whether the addresses fall within the address range appropriate for the slot concerned.
Either way, the address decode logic 142 is arranged to permit the DMA to proceed if the DMA addresses fall within the expected address space for the slot concerned. Otherwise, the address decoder is arranged to ignore the DMA request.
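For the specific 32-bit layout just described, the verification amounts to a mask-and-compare; a sketch follows, with a hypothetical helper name (the bit positions are those given above).

    #include <stdint.h>
    #include <stdbool.h>

    /* Bits 31-30 are always 1, bit 29 selects one of two bridges and bits
     * 28-26 identify the PCI device, so comparing bits 29-26 against the
     * geographic map entry for the granted slot is sufficient. */
    bool dma_address_ok(uint32_t dma_addr, uint32_t slot_base_addr)
    {
        const uint32_t mask = 0xFu << 26;      /* bits 29..26 */
        if ((dma_addr >> 30) != 0x3u)          /* bits 31 and 30 must be set */
            return false;
        return (dma_addr & mask) == (slot_base_addr & mask);
    }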
The address decode logic 142 is further operable to control the routing of the DMA request to the appropriate processing set(s) 14/16. If the bridge is in the combined mode, the DMA access will automatically be allocated to all of the in-sync processing sets 14/16. The address decode logic 142 will be aware that the bridge is in the combined mode as it is under the control of the bridge controller 132 (see Figure 8). However, where the bridge is in the split mode, a decision will need to be made as to which, if any, of the processing sets the DMA request is to be sent.
When the system is in split mode, the access will be directed to a processing set 14 or 16 which owns the slot concerned. If the slot is un-owned, then the bridge does not respond to the DMA request. In the split mode, the address decode logic 142 is operable to determine the ownership of the device originating the DMA request by accessing the SRR 118 for the slot concerned. The appropriate slot can be identified by the D bus grant signal. The address decode logic 142 is operable to control the target controller 140 (see Figure 8) to pass the DMA request to the appropriate processing set(s) 14/16 based on the ownership bits SRR[1] and SRR[2]. If bit SRR[1] is set, the first processing set 14 is the owner and the DMA request is passed to the first processing set. If bit SRR[2] is set, the second processing set 16 is the owner and the DMA request is passed to the second processing set. If neither of the bits SRR[1] and SRR[2] is set, then the DMA request is ignored by the address decoder and is not passed to either of the processing sets 14 and 16.
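The routing decision just described can be summarized as follows; this is a sketch, and the enum and function names are invented for illustration.

    #include <stdint.h>

    #define SRR_OWNER_A (1u << 1)   /* SRR[1] */
    #define SRR_OWNER_B (1u << 2)   /* SRR[2] */

    typedef enum { ROUTE_NONE, ROUTE_SET_A, ROUTE_SET_B, ROUTE_BOTH } dma_route_t;

    /* Routing of a verified DMA request: to both in-sync processing sets in
     * combined mode; by the ownership bits in split mode, with un-owned
     * slots ignored. */
    dma_route_t route_dma(int combined_mode, uint8_t srr)
    {
        if (combined_mode)     return ROUTE_BOTH;
        if (srr & SRR_OWNER_A) return ROUTE_SET_A;
        if (srr & SRR_OWNER_B) return ROUTE_SET_B;
        return ROUTE_NONE;     /* request not passed to either set */
    }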
Figure 27 is a flow diagram summarizing the DMA verification process as illustrated with reference to Figure 26. In stage S20, the D bus arbiter 185 arbitrates for access to the D bus 22.
In stage S21, the address decoder 142 verifies the DMA addresses supplied with the DMA request by accessing the geographic address map.
In stage S22, the address decoder ignores the DMA access where the address falls outside the expected range for the slot concerned. Otherwise, as represented by stage S23, the actions of the address decoder are dependent upon whether the bridge is in the combined or the split mode.
If the bridge is in the combined mode, then in stage S24 the address decoder controls the target controller 140 (see Figure 8) to cause the routing matrix 80 (see Figure 6) to pass the DMA request to both processing sets 14 and 16. If the bridge is in the split mode, the address decoder is operative to verify the ownership of the slot concerned by reference to the SRR 118 for that slot in stage S25.
If the slot is allocated to the first processing set 14 (i.e. the SRR[1] bit is set), then in stage S26 the address decoder 142 controls the target controller 140 (see Figure 8) to cause the routing matrix 80 (see Figure 6) to pass the DMA request to the first processing set 14. If the slot is allocated to the second processing set 16 (i.e. the SRR[2] bit is set), then in stage S27 the address decoder 142 controls the target controller 140 (see Figure 8) to cause the routing matrix 80 (see Figure 6) to pass the DMA request to the second processing set 16.
If the slot is unallocated (i.e. neither the SRR[1] bit nor the SRR[2] bit is set), then in stage S28 the address decoder 142 ignores or discards the DMA request and the DMA request is not passed to the processing sets 14 and 16.
A DMA, or direct virtual memory access (DVMA), request sent to one or more of the processing sets causes the necessary memory operations (read or write as appropriate) to be effected on the processing set memory.

There now follows a description of an example of a mechanism for enabling automatic recovery from an EState (see Figure 11). The automatic recovery process includes reintegration of the state of the processing sets to a common status in order to attempt a restart in lockstep. To achieve this, the processing set which asserts itself as the primary processing set as described above copies its complete state to the other processing set. This involves ensuring that the content of the memory of both processors is the same before trying a restart in lockstep mode.
However, a problem with the copying of the content of the memory from one processing set to the other is that during this copying process a device connected to the D bus 22 might attempt to make a direct memory access (DMA) request for access to the memory of the primary processing set. If DMA is enabled, then a write made to an area of memory which has already been copied would result in the memory state of the two processors at the end of the copy not being the same. In principle, it would be possible to inhibit DMA for the whole of the copy process. However, this would be undesirable, bearing in mind that it is desirable to minimise the time that the system or the resources of the system are unavailable. As an alternative, it would be possible to retry the whole copy operation when a DMA operation has occurred during the period of the copy. However, it is likely that further DMA operations would be performed during the copy retry, and accordingly this is not a good option either. Accordingly, in the present system, a dirty RAM 124 is provided in the bridge. As described earlier the dirty RAM 124 is configured as part of the bridge SRAM memory 126.
The dirty RAM 124 comprises a bit map having a dirty indicator, for example a dirty bit, for each block, or page, of memory. The bit for a page of memory is set when a write access to the area of memory concerned is made. In an embodiment of the invention one bit is provided for every 8K page of main processing set memory. The bit for a page of processing set memory is set automatically by the address decoder 142 when this decodes a DMA request for that page of memory for either of the processing sets 14 or 16 from a device connected to the D bus 22. The dirty RAM can be reset, or cleared when it is read by a processing set, for example by means of read and clear instructions at the beginning of a copy pass, so that it can start to record pages which are dirtied since a given time.
The dirty RAM 124 can be read word by word. If a large word size is chosen for reading the dirty RAM 124, this will optimise the reading and resetting of the dirty RAM 124. Accordingly, at the end of the copy pass, the bits in the dirty RAM 124 will indicate those pages of processing set memory which have been changed (or dirtied) by DMA writes during the period of the copy. A further copy pass can then be performed for only those pages of memory which have been dirtied. This will take less time than a full copy of the memory. Accordingly, there are typically fewer pages marked as dirty at the end of the next copy pass and, as a result, the copy passes can become shorter and shorter. At some time it is necessary to decide to inhibit DMA writes for a short period for a final, short, copy pass, at the end of which the memories of the two processing sets will be the same and the primary processing set can issue a reset operation to restart the combined mode.
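A behavioural sketch of the dirty RAM follows, assuming an illustrative memory size and invented names (the real dirty RAM is a region of the bridge SRAM, read and cleared by the primary processing set at each copy pass).

    #include <stdint.h>

    #define PAGE_SHIFT 13                   /* one dirty bit per 8K page */
    #define MEM_PAGES  (1u << 16)           /* illustrative memory size */
    static uint32_t dirty_ram[MEM_PAGES / 32];

    /* Set by the address decoder on a DMA write to a page; illegal high
     * address bits still map into the dirty RAM, as described below. */
    void mark_dirty(uint32_t addr)
    {
        uint32_t page = (addr >> PAGE_SHIFT) & (MEM_PAGES - 1);
        dirty_ram[page / 32] |= 1u << (page % 32);
    }

    /* Read-and-clear, word by word, as at the start of each copy pass. */
    uint32_t dirty_read_and_clear(unsigned word)
    {
        uint32_t v = dirty_ram[word];
        dirty_ram[word] = 0;
        return v;
    }

    /* Pages dirtied since the last read-and-clear: used to decide when a
     * final pass with DMA inhibited can be taken. */
    unsigned count_dirty(void)
    {
        unsigned n = 0;
        for (unsigned w = 0; w < MEM_PAGES / 32; w++)
            for (uint32_t v = dirty_ram[w]; v; v &= v - 1)
                n++;
        return n;
    }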
The dirty RAM 124 is set and cleared in both the combined and split modes. This means that in split mode the dirty RAM 124 may be cleared by either processing set.
The dirty RAM 124 address is decoded from bits 13 to 28 of the PCI address presented by the D bus device. Erroneous accesses which present illegal combinations of the address bits 29 to 31 are mapped into the dirty RAM 124 and a bit is dirtied on a write, even though the bridge will not pass these transactions to the processing sets.
When reading the dirty RAM 124, the bridge defines the whole area from 0x00008000 to 0x0000ffff as dirty RAM and will clear the contents of any location in this range on a read. As an alternative to providing a single dirty RAM 124 which is cleared on being read, two dirty RAMs could be provided and used in a toggle mode, with one being written to while the other is read.
Figure 28 is a flow diagram summarising the operation of the dirty RAM 124.
In stage S41, the primary processing set reads the dirty RAM 124 which has the effect of resetting the dirty RAM 124.
In stage S42, the primary processor (e.g. processing set 14) copies the whole of its memory 56 to the memory 56 of the other processing set (e.g. processing set 16).
In stage S43, the primary processing set reads the dirty RAM 124 which has the effect of resetting the dirty RAM 124. In stage S44, the primary processor determines whether less than a predetermined number of bits have been written in the dirty RAM 124.
If more than the predetermined number of bits have been set, then the processor in stage S45 copies those pages of its memory 56 which have been dirtied, as indicated by the dirty bits read from the dirty RAM 124 in stage S43, to the memory 56 of the other processing set. Control then passes back to stage S43.
If, in stage S44, it is determined that less than the predetermined number of bits have been written in the dirty RAM 124, then in stage S46 the primary processor causes the bridge to inhibit DMA requests from the devices connected to the D bus 22. This could, for example, be achieved by clearing the arbitration enable bit for each of the device slots, thereby denying access of the DMA devices to the D bus 22. Alternatively, the address decoder 142 could be configured to ignore DMA requests under instructions from the primary processor. During the period in which DMA accesses are prevented, the primary processor then makes a final copy pass from its memory to the memory 56 of the other processor for those memory pages corresponding to the bits set in the dirty RAM 124.
In stage S47 the primary processor can issue a reset operation for initiating a combined mode.
In stage S48, DMA accesses are once more permitted.

As has been described above in detail, the computer system 10 is provided with various mechanisms to enable the system to survive various different situations that would otherwise cause the system to crash. Some of those mechanisms are concerned with surviving errors associated with I/O device failures, for example, and others relate to errors that can arise when the multi-processors of the system deviate from a lock-step mode.
One final, and hitherto undescribed, problem that could occur in a computer system such as that described herein is concerned with a failure of either the processing sets 14, 16 or the bridge 12. The following description illustrates a mechanism for limiting the effect of any such failures.
As shown in Figure 29, the computer system comprises, in part, the bridge 12 provided on the bridge motherboard 42, the processing set 14, 16 (only one of which is shown for illustrative purposes) and a PCI bus 24, 26 connecting the processing set 14, 16 to the bridge 12. As described above with reference to Figure 3, the preferred processing set architecture comprises a processing set controller 50 to which a plurality of processors (not shown in Figure 29) are connected. The PCI bus interconnecting the processing set controller 50 and the bridge 12 typically comprises a plurality of bus lines, but one line is shown here for clarity.
In normal operation, PCI protocol signals propagate in both directions between the processing set controller 50 and the bridge 12. A problem exists in that the PCI bus protocol provides only a rudimentary fault recovery mechanism. Thus, it has been noted that crashes can be caused if either the processing set or the bridge should fail. These crashes typically result because the surviving bridge or processing set continues to attempt to distribute transactions along the PCI bus, and these distributed transactions tend to back up, whereupon either the bridge or the processing set can deadlock.
As shown in Figure 30a, the PCI protocol includes two signals: a first signal, IRDY#, indicating that an initiator is ready to supply data, and a second signal, TRDY#, indicating that a target is ready to receive data. It will be understood, of course, that the initiator refers to the component which is initiating the signals and that the target refers to the component which is to receive the signals.
The IRDY# and TRDY# signal exchange takes place once the initiator address phase has been completed. In the address phase, the initiator addresses the device to which it intends to send data, and the addressed device (the target) responds with a DEVSEL# signal confirming to the initiator that it has been selected. In effect, the address phase and the DEVSEL# exchange comprise a handshaking process between the initiator and target.
Once the handshaking process has been successfully completed, it is then necessary to determine that the initiator is ready to supply data, and that the target is ready to receive that supplied data. The IRDY# and TRDY# signals are used for this purpose.
Figures 30b and 30d schematically indicate possible IRDY# and TRDY# signals when the PCI bus is operating correctly. As shown in Figure 30b, an IRDY# signal has issued at time "a", indicating that the initiator is ready to supply data. A period of time later, a TRDY# signal is output indicating that the target is ready. When both the IRDY# and TRDY# signals are asserted at the same clock edge (i.e. at time "b"), the initiator transfers a datum to the target.
For example, for Programmed Input/Output (PIO) from the processing set 14, 16 via the bridge 12, an initiator ready signal IRDY# is generated on the PCI bus 24 once the address phase of the PCI protocol has been completed, and, if everything is working properly, a TRDY# signal is subsequently generated on the PCI bus when the bridge has indicated that it is ready to receive data from the processing set. If a fault has occurred, for example as shown in Figure 30a, then no response is received from the target within a predetermined period of time and an error is assumed to have occurred. Figure 30a illustrates an example of PIO where an error is assumed to have occurred. In this example, the processing set has initiated an IRDY# signal to indicate that the processing set as initiator is ready, and this signal has not been answered within a predetermined period of time "t" by a TRDY# signal indicating that the bridge is ready. The predetermined period of time "t" is set to be a relatively long time, of the order of 40 μs or thereabouts. In other words, the period of time "t" corresponds to several thousand clock cycles in a computer system operating at 25MHz, for example. For normal PCI bus exchanges, one would expect an IRDY# or TRDY# signal to be answered within 16 or 32 clock cycles. The predetermined period of time "t" may be adjusted to take account of different clock speeds and operating requirements.
In the state shown in Figure 30a, if nothing were done to remedy the situation, an error might occur as data exchanges queued behind the depicted data exchange would back up until a point is reached at which the system becomes deadlocked and the processing set fails.
Figures 30c and 30d illustrate, in a similar manner to Figures 30a and 30b, a correctly operating PCI bus (Figure 30d) and a PCI bus which appears to have encountered problems (Figure 30c). Figures 30c and 30d, however, differ from Figures 30a and 30b in that they depict DMA from the bridge to the processing set. For all of Figures 30a to 30d, it is the clock signal CLOCK that determines the point in time at which the various signals are generated on the bus. For example, as CLOCK is driven high in Figure 30a (at the broken line in Figure 30a), the initiator is triggered to generate an IRDY# signal on the bus. As the clock signal goes high a predetermined period of time "t" later, so the mechanism is again triggered to determine that an error has occurred. A faulty bridge has been described above, but it will be appreciated that a similar situation can occur where a processing set is faulty.
Returning to Figure 29, there is shown a mechanism 800 by means of which the problematic situation shown in Figures 30a and 30c may be alleviated. As shown in Figure 29, the mechanism 800 comprises a controller 810, a tri-state buffer 820 and a switchable field effect transistor (FET) 830 connected to the bus 24. The controller 810 includes a counter which counts the number of clock signals between the issuance of an IRDY# signal and the issuance of a TRDY# signal, or between a TRDY# signal and an IRDY# signal where the order is reversed. If the number of counted clock cycles exceeds a predetermined limit, then the controller 810 turns off the FET 830 to disconnect the PCI bus 24 between the processing set 14, 16 and the bridge 12, before it controls the buffer 820 to abort the requested data transfer and thereby prevent further data transfer requests. Issuance of TRDY# and IRDY# signals is detected on a sense line 825 connected between the controller 810 and the PCI bus 24.
If the controller 810 detects the issuance of a TRDY# signal before the number of counted clock cycles has exceeded the predetermined limit, then the counter is reset and the PCI bus is left on.
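The counter behaviour of the controller 810 can be modelled per clock edge as in the following C sketch; the limit value and all names are illustrative, and the booleans represent the asserted (active-low) state of the signals as sampled on the sense line.

    #include <stdbool.h>

    #define TIMEOUT_CYCLES 1000   /* illustrative; tuned to the clock speed */

    typedef struct { unsigned count; bool waiting; } cycle_monitor_t;

    /* Called once per clock edge with the asserted state of IRDY#/TRDY#.
     * Returns true when the controller should turn off the FET 830 and
     * drive the fake response via the buffer 820. */
    bool monitor_clock_edge(cycle_monitor_t *m, bool irdy, bool trdy)
    {
        if (irdy && trdy) {                /* transfer completes: reset */
            m->waiting = false;
            m->count = 0;
        } else if (irdy || trdy) {         /* one side waiting for the other */
            if (!m->waiting) {
                m->waiting = true;         /* start the counter */
                m->count = 0;
            } else if (++m->count > TIMEOUT_CYCLES) {
                return true;               /* timeout: isolate and fake */
            }
        } else {                           /* bus idle */
            m->waiting = false;
            m->count = 0;
        }
        return false;
    }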
Aborting the requested data transfer is required because the initiator will expect to receive a TRDY# signal from the target, and thus could crash as data bus transactions back up behind the request which has not been answered by the selected target. To abort the transaction from the initiator's point of view, the controller 810 controls the tri-state buffer 820 by means of a control line 840 to issue a fake response signal to the initiator, whereupon the initiator is satisfied and outputs data that is discarded.
Figure 29 shows an arrangement wherein it is assumed that the bridge 12 is malfunctioning, and that the processing set 14, 16 is functioning correctly. In the implementation of Figure 29, the malfunctioning bridge is isolated from the processing set and cycles from the processing set 14, 16 are aborted in the address phase because the processing set does not see DEVSEL# asserted. However, it will be appreciated that for other bus protocols the processing set may need to continue to abort bus cycles.
Figure 31 is a schematic representation of the signals generated on the bus when a transaction is terminated by the mechanism 800. Figure 31 shows a bus where an IRDY# signal has issued during PIO and has not been responded to within the predetermined time limit "t" shown in Figure 30a. At some point in time after it has been determined that an error has occurred (due to the lack of a response to the issued IRDY# signal), an ISOLATE# signal is generated by the controller 810 to turn off the FET 830, thereby isolating the processing set from the bus. The controller 810 then issues a FAKE_ENABLE# signal to cancel the DEVSEL# signal issued during the address phase (i.e. the handshaking phase) between the target and initiator. Issuance of the FAKE_ENABLE# signal also causes a STOP# signal to be generated on the bus. The STOP# signal indicates to the system that the bus cycle has been stopped and serves to cause a FRAME# signal and the issued IRDY# signal to be driven high. The FRAME# signal is one of the basic components of the illustrative PCI protocol described herein and is used to indicate when an access (either PIO or DMA) is occurring. DEVSEL# is driven high and STOP# low, which signals a target-abort to the initiator, which then negates FRAME# and IRDY# in response.
It should be noted that the arrangement of Figure 31 is purely illustrative and that the FAKE_ENABLE# and ISOLATE# signals may be separated by a greater or lesser number of clock cycles than the number shown. It may also be necessary for a bus hold circuit to be provided to hold DEVSEL# low until the FAKE_ENABLE# signal is asserted.

An alternative to the arrangement of Figure 29 is shown in Figure 32. The arrangement of Figure 32 operates in the same manner as the arrangement of Figure 29, except that it is assumed that the bridge is functioning correctly and that it is the processing set 14, 16 which is malfunctioning. In this arrangement, therefore, a FET 850 is positioned between a processing set controller 50 and a controller 870. In this instance, signals from the malfunctioning processing set are ignored and signals from the bridge 12 are answered with a fake response issued by the controller 870 and a buffer 880.

A further implementation providing a higher degree of tolerance is shown in Figure 33. As shown in Figure 33, a FET 890 is provided between two controllers 900 and 910, and the controllers are operable, in the same way as the arrangements of Figures 29 and 32, to answer signals from the bridge and processing set with a fake response.
As mentioned above, it will be appreciated that the PCI bus 24 will normally comprise a number of bus lines (as shown in Figure 3, for example). Accordingly, the arrangement described above with reference to Figures 29, 32 and 33 could enable the generation of fake signals for all relevant bus lines of the PCI bus. In an alternative arrangement, an appropriate mechanism could be provided for each bus line of the PCI bus between the processing set 14, 16 and the bridge 12.
Thus there has been described a bus control mechanism for a computer system that includes a bus, a first component and a second component, wherein the first and second components are interconnected via the bus for performing a data transfer operation, the data transfer operation being initiated by an exchange of request and response signals, and a component which initiated a request signal is operable to effect data transfer on receipt of a response signal, the bus control mechanism comprising: first means for selectively disabling the bus; second means for generating a fake response signal; and third means for monitoring the request and response signals exchanged between the components, and for controlling the first means to disable the bus and for controlling the second means to issue a fake response signal to the component that issued the request signal for terminating the data transfer operation in situations where the response signal is not issued within a predetermined period following the request signal.
It will be appreciated that although particular embodiments of the invention have been described, many modifications/additions and/or substitutions may be made within the spirit and scope of the present invention as defined in the appended claims. For example, although in the specific description two processing sets are provided, it will be appreciated that the specifically described features may be modified to provide for three or more processing sets. Also, it will be appreciated that bus isolation mechanisms such as those described herein can be used for bus protocols other than the PCI bus protocol used in the particularly described embodiment.

Claims

WHAT IS CLAIMED IS:
1. A bus control mechanism for a computer system that includes a bus, a first component and a second component, wherein the first and second components are interconnected via the bus for performing a data transfer operation, the data transfer operation being initiated by an exchange of request and response signals between the first and second components, and a component that issued a request signal is operable to effect data transfer on receipt of a response signal, the bus control mechanism comprising: a switch selectively operable to disable the bus; a fake response generator selectively operable to generate a fake response signal; and a controller operable to monitor the request and response signals exchanged between the components and, in situations where a corresponding response signal is not issued within a predetermined time following a particular request signal, to cause the switch to disable the bus and to cause the fake response generator to issue a fake response signal to the component that issued the particular request signal for terminating the data transfer operation.
2. The bus control mechanism of claim 1, wherein the component that issued the particular request signal is operable, on receipt of the fake response signal, to transfer data to the bus, which data is thus discarded as a result of the disabling of the bus by the switch.
3. The bus control mechanism of claim 1, wherein the computer system includes a clock signal generator and the controller comprises a counter for counting clock signals between detection of the particular request signal and detection of the corresponding response signal, the controller being operable, in the absence of detecting the corresponding response signal within a predetermined number of clock cycles, to cause the switch to disable the bus and to cause the fake response generator to issue a fake response signal to the component that issued the particular request signal for terminating the data transfer operation.
4. The bus control mechanism of claim 1, wherein the switch is selectively operable to disable the bus by isolating a first part of the bus connected to the first component from a second part of the bus connected to the second component.
5. The bus control mechanism of claim 4, wherein: the fake response generator is connected to the first part of the bus; and the controller is connected to the switch and to the fake response generator and is operable, in response to detection of the particular request signal from the first component and the absence of the corresponding response signal from the second component within the predetermined time, to cause the switch to disable the bus by isolating the first part of the bus from the second part of the bus and to cause the fake response generator to assert a fake response signal on the first part of the bus for causing the first component to terminate the data transfer operation.
6. The bus control mechanism of claim 5, wherein: a second fake response generator is connected to the second part of the bus; and a second controller is connected to the switch and to the second fake response generator and is operable, in response to detection of a given request signal from the second component and the absence of a resulting response signal from the first component within a predetermined period, to cause the switch to disable the bus by isolating the first part of the bus from the second part of the bus and to cause the second fake response generator to assert a fake response signal on the second part of the bus for causing the second component to terminate the data transfer operation.
7. The bus control mechanism of claim 4, wherein the switch comprises an FET.
8. The bus control mechanism of claim 1, wherein the request signal is an initiate transfer signal, the response signal is a target ready signal, and the fake response signal is a fake target ready signal.
9. The bus control mechanism of claim 1, wherein the bus is a PCI bus.
10. The bus control mechanism of claim 9, wherein the request signal is an IRDY# signal and the response signal is a TRDY# signal.
11. A bus control mechanism for a computer system that includes a bus, a first component and a second component, wherein the first and second components are interconnected via the bus for performing a data transfer operation, the data transfer operation being initiated by an exchange of request and response signals, and a component which initiated a request signal is operable to effect data transfer on receipt of a response signal, the bus control mechanism comprising: first means for selectively disabling the bus; second means for generating a fake response signal; and third means for monitoring the request and response signals exchanged between the components, and for controlling the first means to disable the bus and for controlling the second means to issue a fake response signal to the component that issued the request signal for terminating the data transfer operation in situations where the response signal is not issued within a predetermined period following the request signal.
12. The bus control mechanism of claim 11, wherein: the second means comprises a first buffer selectively operable to assert a fake response signal on a first part of the bus and a second buffer selectively operable to assert a fake response signal on a second part of the bus; and the third means comprises a first controller connected to the first part of the bus and operable to control the first means and the first buffer in situations where the first component issues the request signal, and a second controller connected to the second part of the bus and operable to control the first means and the second buffer in situations where the second component issues the request signal.
13. The bus control mechanism of claim 12, wherein the first means is a switch selectively operable to disable the bus by isolating the first part of the bus connected to the first component from the second part of the bus connected to the second component.
14. A bus control mechanism for a computer system that includes a bus, a first component and a second component, wherein the first and second components are interconnected via the bus for performing a data transfer operation, and at least one of the components is operable as an initiator to assert an initiate transfer signal on the bus, and at least the other of the components is operable as a target to assert a target ready signal, the bus control mechanism comprising: a switch selectively operable to isolate a second part of the bus connected to the second component from a first part of the bus connected to the first component; a fake response generator selectively operable to assert a fake response signal; and a controller connected to the bus for sensing the initiate transfer signal and the target ready signal, the controller being operable to determine a timed period after sensing the initiate transfer signal and being connected to the switch and to the fake response generator to cause the switch to isolate a part of the bus and to cause the fake response generator to assert the fake response signal to the initiator for causing termination of the data transfer operation in the absence of sensing the target ready signal within the timed period.
15. A computer system comprising: a bus, a first component and a second component interconnected via the bus for performing a data transfer operation, the data transfer operation being initiated by an exchange of request and response signals, wherein a component that initiates a request signal is operable to effect data transfer upon receipt of a response signal; and a bus control mechanism that comprises: a switch selectively operable to disable the bus; a fake response generator selectively operable to generate a fake response signal; and a controller operable to monitor the request and response signals exchanged between the first and second components and, in situations where a corresponding response signal is not issued within a predetermined time following a particular request signal, to cause the switch to disable the bus and to cause the fake response generator to issue a fake response signal to the component that issued the particular request signal for terminating the data transfer operation.
16. A computer system according to claim 15, wherein the computer system is a fault tolerant computer system, and wherein the first component is a processing set comprising at least one processor, and the second component is a bus bridge.
17. A method of controlling a bus of a computer system including a first component and a second component interconnected via the bus for performing a data transfer operation, wherein the data transfer operation is initiated by an exchange of request and response signals, wherein a component which initiated a request signal is operable to effect data transfer upon receipt of a response signal, the method comprising: monitoring the request signal on the bus; timing a period following the request signal; and, in the absence of a corresponding response signal within the period, disabling the bus and issuing a fake response signal to the component which initiated the request to thereby terminate the data transfer operation.
18. The method of claim 17, wherein the step of disabling the bus comprises causing a switch to isolate a part of the bus connected to a corresponding component that should have issued the response signal within the period.
19. The method of claim 17, wherein the bus is a PCI bus, the request signal is an IRDY# signal and the response signal is a TRDY# signal.
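The timeout mechanism recited in claims 1-3 can be modelled in software. Below is a minimal C sketch of a per-clock monitor, assuming an invented monitor_t state record, a hypothetical TIMEOUT_CYCLES constant standing in for the claimed predetermined number of clock cycles, and callback hooks standing in for the switch and the fake response generator; none of these identifiers come from the patent.

```c
/* Hypothetical software model of the cycle-termination monitor of
 * claims 1-3. All names are invented for illustration; the patent
 * describes this behaviour in hardware. */
#include <stdbool.h>

#define TIMEOUT_CYCLES 256u   /* assumed "predetermined number of clock cycles" */

typedef struct {
    bool     request_pending; /* a request signal has been sensed */
    unsigned counter;         /* clock cycles elapsed since the request */
} monitor_t;

/* Callbacks standing in for the switch and the fake response generator. */
typedef struct {
    void (*disable_bus)(void);          /* open the isolating switch */
    void (*assert_fake_response)(void); /* drive the fake response signal */
} bus_hooks_t;

/* Call once per bus clock with the sensed signal levels. For PCI
 * (claims 9-10), 'request' would be the sensed IRDY# assertion and
 * 'response' the sensed TRDY# assertion (both active-low pins). */
void monitor_clock(monitor_t *m, const bus_hooks_t *h,
                   bool request, bool response)
{
    if (!m->request_pending) {
        if (request) {                 /* start the counter on the request */
            m->request_pending = true;
            m->counter = 0;
        }
        return;
    }
    if (response) {                    /* normal termination by the target */
        m->request_pending = false;
        return;
    }
    if (++m->counter >= TIMEOUT_CYCLES) {
        h->disable_bus();              /* isolate the unresponsive side */
        h->assert_fake_response();     /* fake response ends the cycle */
        m->request_pending = false;
    }
}
```

The switch is opened before the fake response is asserted, so any data the requester then drives lands on a disabled bus and is discarded, consistent with claim 2.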
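Claims 4-6 and 12-14 duplicate this machinery on both halves of the bus, one instance on either side of a single isolating switch (which, per claim 7, may be an FET). A hypothetical per-clock service routine for that split arrangement, reusing monitor_t, bus_hooks_t and monitor_clock() from the sketch above:

```c
/* Hypothetical split-bus arrangement (claims 4-6): each bus half has
 * its own monitor and fake response generator; both sets of hooks can
 * open the same switch, but each fakes a response only on its own half. */
typedef struct {
    monitor_t   side_a;   /* times requests issued by the first component */
    monitor_t   side_b;   /* times requests issued by the second component */
    bus_hooks_t hooks_a;  /* opens the switch, fakes a response on half A */
    bus_hooks_t hooks_b;  /* opens the switch, fakes a response on half B */
} split_bus_monitor_t;

/* Per-clock service: a request from either side is timed against a
 * response expected from the other side; whichever side times out has
 * a fake response asserted on its own half only. */
void split_bus_clock(split_bus_monitor_t *sb,
                     bool req_a, bool resp_from_b,
                     bool req_b, bool resp_from_a)
{
    monitor_clock(&sb->side_a, &sb->hooks_a, req_a, resp_from_b);
    monitor_clock(&sb->side_b, &sb->hooks_b, req_b, resp_from_a);
}
```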
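The method of claim 17 can also be read procedurally rather than as a per-clock state machine. A sketch under that reading, with hypothetical sense_request() and sense_response() accessors standing in for sampling the bus pins (for a PCI bus, per claim 19, these would sample the active-low IRDY# and TRDY# lines):

```c
/* Procedural restatement of the method of claim 17; all externs are
 * hypothetical stand-ins for hardware signal access. */
#include <stdbool.h>

extern bool sense_request(void);
extern bool sense_response(void);
extern void disable_bus(void);
extern void issue_fake_response(void);

#define TIMEOUT_POLLS 256u    /* assumed length of the timed period */

void control_bus_cycle(void)
{
    while (!sense_request())                      /* monitor the request */
        ;
    for (unsigned t = 0; t < TIMEOUT_POLLS; ++t)  /* time a period */
        if (sense_response())
            return;                               /* response arrived in time */
    disable_bus();                                /* no response: disable bus */
    issue_fake_response();                        /* terminate the transfer */
}
```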
PCT/US1999/013190 1998-06-15 1999-06-10 Bus controller with cycle termination monitor WO1999066417A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2000555174A JP2002518745A (en) 1998-06-15 1999-06-10 Bus controller with cycle end monitor
DE69900990T DE69900990T2 (en) 1998-06-15 1999-06-10 A BUS CONTROL WITH CYCLE END MONITORING CIRCUIT
EP99928573A EP1086431B1 (en) 1998-06-15 1999-06-10 Bus controller with cycle termination monitor

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/097,498 1998-06-15
US09/097,498 US5991900A (en) 1998-06-15 1998-06-15 Bus controller

Publications (2)

Publication Number Publication Date
WO1999066417A1 (en) 1999-12-23
WO1999066417A9 2000-07-06

Family

ID=22263684

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/013190 WO1999066417A1 (en) 1998-06-15 1999-06-10 Bus controller with cycle termination monitor

Country Status (5)

Country Link
US (1) US5991900A (en)
EP (1) EP1086431B1 (en)
JP (1) JP2002518745A (en)
DE (1) DE69900990T2 (en)
WO (1) WO1999066417A1 (en)

Families Citing this family (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5941949A (en) * 1997-05-14 1999-08-24 Citrix Systems, Inc. System and method for transmitting data from a server application to more than one client node
US6138247A (en) * 1998-05-14 2000-10-24 Motorola, Inc. Method for switching between multiple system processors
US6173351B1 (en) * 1998-06-15 2001-01-09 Sun Microsystems, Inc. Multi-processor system bridge
US6587961B1 (en) * 1998-06-15 2003-07-01 Sun Microsystems, Inc. Multi-processor system bridge with controlled access
US6223230B1 (en) * 1998-06-15 2001-04-24 Sun Microsystems, Inc. Direct memory access in a bridge for a multi-processor system
US6701398B1 (en) 1999-04-07 2004-03-02 Cradle Technologies, Inc. Global bus synchronous transaction acknowledge with nonresponse detection
JP4412852B2 (en) * 1999-04-07 2010-02-10 クレイドル・テクノロジーズ Global bus synchronous transaction acknowledgment with no response detection
JP2003505753A (en) 1999-06-10 2003-02-12 ペーアーツェーテー インフォルマツィオーンステヒノロギー ゲゼルシャフト ミット ベシュレンクテル ハフツング Sequence division method in cell structure
FI19991735A (en) * 1999-08-16 2001-02-17 Nokia Networks Oy A method and apparatus for improving the reliability of a computer system
US6691257B1 (en) 2000-04-13 2004-02-10 Stratus Technologies Bermuda Ltd. Fault-tolerant maintenance bus protocol and method for using the same
US6820213B1 (en) 2000-04-13 2004-11-16 Stratus Technologies Bermuda, Ltd. Fault-tolerant computer system with voter delay buffer
US6633996B1 (en) 2000-04-13 2003-10-14 Stratus Technologies Bermuda Ltd. Fault-tolerant maintenance bus architecture
US6687851B1 (en) 2000-04-13 2004-02-03 Stratus Technologies Bermuda Ltd. Method and system for upgrading fault-tolerant systems
US6708283B1 (en) 2000-04-13 2004-03-16 Stratus Technologies, Bermuda Ltd. System and method for operating a system with redundant peripheral bus controllers
US6735715B1 (en) 2000-04-13 2004-05-11 Stratus Technologies Bermuda Ltd. System and method for operating a SCSI bus with redundant SCSI adaptors
GB2366012B (en) * 2000-08-14 2002-08-14 Sun Microsystems Inc A computer system
EP1182564A3 (en) * 2000-08-21 2004-07-28 Texas Instruments France Local memory with indicator bits to support concurrent DMA and CPU access
US6892263B1 (en) * 2000-10-05 2005-05-10 Sun Microsystems, Inc. System and method for hot swapping daughtercards in high availability computer systems
GB2369691B (en) * 2000-11-29 2003-06-04 Sun Microsystems Inc Control logic for memory modification tracking
GB2369694B (en) * 2000-11-29 2002-10-16 Sun Microsystems Inc Efficient memory modification tracking
GB2369690B (en) * 2000-11-29 2002-10-16 Sun Microsystems Inc Enhanced protection for memory modification tracking
GB2369692B (en) * 2000-11-29 2002-10-16 Sun Microsystems Inc Processor state reintegration
US6769078B2 (en) * 2001-02-08 2004-07-27 International Business Machines Corporation Method for isolating an I2C bus fault using self bus switching device
US7017073B2 (en) * 2001-02-28 2006-03-21 International Business Machines Corporation Method and apparatus for fault-tolerance via dual thread crosschecking
US6766479B2 (en) 2001-02-28 2004-07-20 Stratus Technologies Bermuda, Ltd. Apparatus and methods for identifying bus protocol violations
US9411532B2 (en) * 2001-09-07 2016-08-09 Pact Xpp Technologies Ag Methods and systems for transferring data between a processing device and external devices
US9552047B2 (en) 2001-03-05 2017-01-24 Pact Xpp Technologies Ag Multiprocessor having runtime adjustable clock and clock dependent power supply
US9436631B2 (en) 2001-03-05 2016-09-06 Pact Xpp Technologies Ag Chip including memory element storing higher level memory data on a page by page basis
US6950893B2 (en) * 2001-03-22 2005-09-27 I-Bus Corporation Hybrid switching architecture
US6874108B1 (en) * 2001-08-27 2005-03-29 Agere Systems Inc. Fault tolerant operation of reconfigurable devices utilizing an adjustable system clock
US9170812B2 (en) 2002-03-21 2015-10-27 Pact Xpp Technologies Ag Data processing system having integrated pipelined array data processor
US7155721B2 (en) * 2002-06-28 2006-12-26 Hewlett-Packard Development Company, L.P. Method and apparatus for communicating information between lock stepped processors
US20040059862A1 (en) * 2002-09-24 2004-03-25 I-Bus Corporation Method and apparatus for providing redundant bus control
US7134052B2 (en) 2003-05-15 2006-11-07 International Business Machines Corporation Autonomic recovery from hardware errors in an input/output fabric
US7194655B2 (en) * 2003-06-12 2007-03-20 International Business Machines Corporation Method and system for autonomously rebuilding a failed server and a computer system utilizing the same
US7194663B2 (en) * 2003-11-18 2007-03-20 Honeywell International, Inc. Protective bus interface and method
US20050193246A1 (en) * 2004-02-19 2005-09-01 Marconi Communications, Inc. Method, apparatus and software for preventing switch failures in the presence of faults
US7426656B2 (en) * 2004-03-30 2008-09-16 Hewlett-Packard Development Company, L.P. Method and system executing user programs on non-deterministic processors
US20050240806A1 (en) * 2004-03-30 2005-10-27 Hewlett-Packard Development Company, L.P. Diagnostic memory dump method in a redundant processor
US20060020852A1 (en) * 2004-03-30 2006-01-26 Bernick David L Method and system of servicing asynchronous interrupts in multiple processors executing a user program
JP4451712B2 (en) * 2004-05-18 2010-04-14 富士通マイクロエレクトロニクス株式会社 Data transfer apparatus and transfer abnormal state detection method.
US7191275B2 (en) * 2004-09-28 2007-03-13 Hewlett-Packard Development Company, L.P. System and method for the management of hardware triggered hotplug operations of input/output cards
US7516359B2 (en) * 2004-10-25 2009-04-07 Hewlett-Packard Development Company, L.P. System and method for using information relating to a detected loss of lockstep for determining a responsive action
US7624302B2 (en) * 2004-10-25 2009-11-24 Hewlett-Packard Development Company, L.P. System and method for switching the role of boot processor to a spare processor responsive to detection of loss of lockstep in a boot processor
US20060107116A1 (en) * 2004-10-25 2006-05-18 Michaelis Scott L System and method for reestablishing lockstep for a processor module for which loss of lockstep is detected
US7502958B2 (en) * 2004-10-25 2009-03-10 Hewlett-Packard Development Company, L.P. System and method for providing firmware recoverable lockstep protection
US7818614B2 (en) * 2004-10-25 2010-10-19 Hewlett-Packard Development Company, L.P. System and method for reintroducing a processor module to an operating system after lockstep recovery
US7627781B2 (en) * 2004-10-25 2009-12-01 Hewlett-Packard Development Company, L.P. System and method for establishing a spare processor for recovering from loss of lockstep in a boot processor
US7437608B2 (en) * 2004-11-15 2008-10-14 International Business Machines Corporation Reassigning storage volumes from a failed processing system to a surviving processing system
US20060136606A1 (en) * 2004-11-19 2006-06-22 Guzy D J Logic device comprising reconfigurable core logic for use in conjunction with microprocessor-based computer systems
JP4165499B2 (en) * 2004-12-13 2008-10-15 日本電気株式会社 Computer system, fault tolerant system using the same, and operation control method thereof
US7095217B1 (en) * 2005-03-31 2006-08-22 O2Micro International Limited Method circuitry and electronic device for controlling a variable output dc power source
US7669073B2 (en) * 2005-08-19 2010-02-23 Stratus Technologies Bermuda Ltd. Systems and methods for split mode operation of fault-tolerant computer systems
JP5507830B2 (en) * 2008-11-04 2014-05-28 ルネサスエレクトロニクス株式会社 Microcontroller and automobile control device
CN102628921B (en) * 2012-03-01 2014-12-03 华为技术有限公司 Integrated circuit and method for monitoring bus state in integrated circuit
US9858441B2 (en) * 2013-04-03 2018-01-02 Hewlett Packard Enterprise Development Lp Disabling counterfeit cartridges
US10114658B2 (en) * 2016-05-23 2018-10-30 Baida USA LLC Concurrent testing of PCI express devices on a server platform
JP6853162B2 (en) * 2017-11-20 2021-03-31 ルネサスエレクトロニクス株式会社 Semiconductor device

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2131581B (en) * 1982-11-20 1986-11-19 Int Computers Ltd Dual processor arrangement
LU86790A1 (en) * 1986-09-17 1987-07-24 Siemens Ag BROADBAND SIGNAL DEVICE
JPH06187257A (en) * 1992-12-17 1994-07-08 Fujitsu Ltd System bus control system
US5533204A (en) * 1994-04-18 1996-07-02 Compaq Computer Corporation Split transaction protocol for the peripheral component interconnect bus
US5630056A (en) * 1994-09-20 1997-05-13 Stratus Computer, Inc. Digital data processing methods and apparatus for fault detection and fault tolerance
US5530946A (en) * 1994-10-28 1996-06-25 Dell Usa, L.P. Processor failure detection and recovery circuit in a dual processor computer system and method of operation thereof
US5664124A (en) * 1994-11-30 1997-09-02 International Business Machines Corporation Bridge between two buses of a computer system that latches signals from the bus for use on the bridge and responds according to the bus protocols
US5712967A (en) * 1996-04-22 1998-01-27 Advanced Micro Devices, Inc. Method and system for graceful recovery from a fault in peripheral devices using a variety of bus structures
US5875310A (en) * 1996-05-24 1999-02-23 International Business Machines Corporation Secondary I/O bus with expanded slot capacity and hot plugging capability
US5870571A (en) * 1996-08-02 1999-02-09 Hewlett-Packard Company Automatic control of data transfer rates over a computer bus
US5867676A (en) * 1996-08-20 1999-02-02 Compaq Computer Corp. Reset circuit for a peripheral component interconnect bus
US5870573A (en) * 1996-10-18 1999-02-09 Hewlett-Packard Company Transistor switch used to isolate bus devices and/or translate bus voltage levels
US5864653A (en) * 1996-12-31 1999-01-26 Compaq Computer Corporation PCI hot spare capability for failed components
US5862353A (en) * 1997-03-25 1999-01-19 International Business Machines Corporation Systems and methods for dynamically controlling a bus

Also Published As

Publication number Publication date
EP1086431B1 (en) 2002-03-06
EP1086431A1 (en) 2001-03-28
JP2002518745A (en) 2002-06-25
DE69900990T2 (en) 2002-09-12
US5991900A (en) 1999-11-23
WO1999066417A1 (en) 1999-12-23
DE69900990D1 (en) 2002-04-11

Similar Documents

Publication Publication Date Title
EP1086431B1 (en) Bus controller with cycle termination monitor
US6138198A (en) Processor bridge with dissimilar data registers which is operable to disregard data differences for dissimilar data write accesses
US6148348A (en) Bridge interfacing two processing sets operating in a lockstep mode and having a posted write buffer storing write operations upon detection of a lockstep error
EP1086425B1 (en) Direct memory access in a bridge for a multi-processor system
EP1090350B1 (en) Multi-processor system bridge with controlled access
EP1088272B1 (en) Multi-processor system bridge
US5787095A (en) Multiprocessor computer backplane bus
US6141718A (en) Processor bridge with dissimilar data registers which is operable to disregard data differences for dissimilar data direct memory accesses
US6167477A (en) Computer system bridge employing a resource control mechanism with programmable registers to control resource allocation
GB2369690A (en) A dirty memory using redundant entries indicating that blocks of memory associated with the entries have been written to

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: C2

Designated state(s): JP KR

AL Designated countries for regional patents

Kind code of ref document: C2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

COP Corrected version of pamphlet

Free format text: PAGES 1/28-28/28, DRAWINGS, REPLACED BY NEW PAGES 1/28-28/28; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

ENP Entry into the national phase

Ref country code: JP

Ref document number: 2000 555174

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1999928573

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1999928573

Country of ref document: EP

WWG Wipo information: grant in national office

Ref document number: 1999928573

Country of ref document: EP