US20030217119A1 - Replication of remote copy data for internet protocol (IP) transmission - Google Patents

Replication of remote copy data for internet protocol (IP) transmission

Info

Publication number
US20030217119A1
Authority
US
United States
Prior art keywords
data
primary
file system
network
volume
Prior art date
Legal status
Granted
Application number
US10/147,751
Other versions
US7546364B2
Inventor
Suchitra Raman
Philippe Armangau
Milena Bergant
Raymond Angelone
Jean-Pierre Bono
Uresh Vahalia
Uday Gupta
Current Assignee
EMC Corp
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Priority to US10/147,751
Assigned to EMC CORPORATION. Assignors: RAMAN, SUCHITRA; ANGELONE, RAYMOND A.; ARMANGAU, PHILIPPE; BERGANT, MILENA; BONO, JEAN-PIERRE; GUPTA, UDAY K.; VAHALIA, URESH
Publication of US20030217119A1
Application granted
Publication of US7546364B2
Active
Adjusted expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/30 Definitions, standards or architectural aspects of layered protocol stacks
    • H04L 69/32 Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L 69/322 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L 69/329 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]

Definitions

  • the present invention relates generally to data storage systems, and more particularly to network file servers.
  • the present invention specifically relates to a network file server distributing remote copy data over a network using the Internet Protocol (IP).
  • Remote copy systems have been used for automatically providing data backup at a remote site in order to insure continued data availability after a disaster at a primary site.
  • a remote copy facility is described in Ofek, U.S. Pat. No. 5,901,327 issued May 4, 1999, entitled “Bundling of Write Data from Channel Commands in a Command Chain for Transmission over a Data Link Between Data Storage Systems For Remote Data Mirroring,” incorporated herein by reference.
  • This remote copy facility uses a dedicated network link and a link-layer protocol for 1:1 replication between a primary storage system and a secondary storage system.
  • the invention relates to a method used in a data processing system having a plurality of host computers linked by an Internet Protocol (IP) network to a plurality of data storage systems.
  • Each of the data storage systems has data storage and at least one data mover computer for moving data between the data storage and the IP network.
  • the method distributes remote copy data over the IP network from a primary data mover computer to a plurality of secondary data mover computers.
  • the method includes the primary data mover computer sending the remote copy data over the IP network to at least one forwarder data mover computer, and the forwarder data mover computer routing the remote copy data over the IP network to the plurality of secondary data mover computers.
  • the invention provides a data processing system.
  • the data processing system includes a plurality of data storage systems linked by an Internet Protocol (IP) network for access by a plurality of host computers.
  • Each of the storage systems has data storage and at least one data mover computer for moving data between the data storage and the IP network.
  • the data mover computers include means for distributing remote copy data over the IP network from a primary data mover computer to a plurality of secondary data mover computers by the primary data mover computer sending the remote copy data over the IP network to at least one forwarder data mover computer, and the forwarder data mover computer routing the remote copy data over the IP network to the plurality of secondary data mover computers.
  • the invention provides a server for an Internet Protocol (IP) network.
  • the server is programmed with a routing table, a TCP/IP layer, and a replication control protocol (RCP) session layer over the TCP/IP layer.
  • the routing table identifies destinations in the network for remote copy data.
  • the replication control protocol session layer is programmed to produce an inbound session in response to the file server receiving remote copy data from a source in the IP network, and at least one outbound session for transmitting the remote copy data to a plurality of destinations identified in the routing table as destinations for the remote copy data from the source.
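As a rough illustration of the session behavior just described, the following Python sketch models an RCP layer that, upon receiving remote copy data from a source, consults its routing table and opens one outbound session per destination in the group. All names (ROUTING_TABLE, rcp_inbound, rcp_outbound) are hypothetical; the patent does not disclose code.

```python
# Hypothetical sketch of the RCP session fan-out described above; the
# real RCP layer sits over TCP/IP, which is elided here.

ROUTING_TABLE = {
    # source -> group of destinations for its remote copy data
    "primary-site": ["secondary-1", "secondary-2", "secondary-3"],
}

def rcp_inbound(source: str, blocks: list[bytes]) -> None:
    """Inbound session: receive blocks of remote copy data from a source,
    then replicate them on one outbound session per routed destination."""
    for destination in ROUTING_TABLE.get(source, []):
        rcp_outbound(destination, blocks)

def rcp_outbound(destination: str, blocks: list[bytes]) -> None:
    """Outbound session: a real server would write to a TCP connection."""
    print(f"forwarding {len(blocks)} block(s) to {destination}")

rcp_inbound("primary-site", [b"block-0", b"block-1"])
```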
  • the invention provides a primary data storage system for distributing remote copy data over an Internet Protocol (IP) network to at least one secondary data storage system in the IP network.
  • the primary data storage system includes data storage and a data mover computer for moving data between the IP network and the data storage.
  • the data storage includes a primary volume including a primary copy of the remote copy data, and a save volume used as a buffer between the primary volume and the IP network.
  • the data mover computer is programmed with a TCP/IP layer, a replication control protocol (RCP) layer over the TCP/IP layer for transmitting blocks of data from the save volume over the IP network, and a replication module for writing modified blocks of the primary volume to the save volume.
  • the invention provides a secondary data storage system for receiving remote copy data distributed over an Internet Protocol (IP) network from a primary data storage system.
  • the remote copy data includes modified blocks of a primary volume in the primary data storage system.
  • the secondary data storage system includes data storage and a data mover computer for moving data between the IP network and the data storage, wherein the data storage includes a secondary volume including a secondary copy of the primary volume, and a save volume used as a buffer between the IP network and the secondary volume for buffering the modified blocks in the remote copy data.
  • the data mover computer is programmed with a TCP/IP layer, a replication control protocol (RCP) layer over the TCP/IP layer for transmitting the modified blocks of remote copy data from the IP network to the save volume, and a playback module for writing the modified blocks of the remote copy data from the save volume to the secondary volume.
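The two save volumes thus act as transmit and receive buffers on either side of the IP network. A minimal Python model of that pipeline, with all names assumed for illustration and the save volumes reduced to FIFO queues of (block number, data) pairs:

```python
# Illustrative model of save-volume buffering: replication module on the
# primary side, playback module on the secondary side. Not the patent's code.
from collections import deque

primary_volume = {0: b"old0", 1: b"old1"}
secondary_volume = dict(primary_volume)   # secondary starts as a copy
primary_save = deque()                    # buffer: primary volume -> IP network
secondary_save = deque()                  # buffer: IP network -> secondary volume

def replicate_write(block_no: int, data: bytes) -> None:
    """Replication module: apply the write locally, queue it for transmission."""
    primary_volume[block_no] = data
    primary_save.append((block_no, data))

def transmit() -> None:
    """Stand-in for the RCP/TCP transfer of buffered blocks to the secondary."""
    while primary_save:
        secondary_save.append(primary_save.popleft())

def playback() -> None:
    """Playback module: drain the secondary save volume into the secondary volume."""
    while secondary_save:
        block_no, data = secondary_save.popleft()
        secondary_volume[block_no] = data

replicate_write(1, b"new1")
transmit()
playback()
assert secondary_volume[1] == b"new1"
```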
  • the invention provides a network file server for use in an Internet Protocol (IP) network.
  • the network file server has data storage including a file system volume for storing a file system, and a TCP port for connection to the IP network to permit access from the IP network to the file system.
  • the network file server is programmed with a series of protocol layers including a TCP/IP layer, a replication control protocol (RCP) layer, and a volume multicast layer.
  • the TCP/IP layer provides access to the IP network through the TCP port in accordance with the standard Transmission Control Protocol.
  • the replication control protocol (RCP) session layer is over the TCP/IP layer for transmission, forwarding, and reception of blocks of remote copy data in accordance with a replication control protocol in which the blocks of remote copy data are transmitted and forwarded to specified groups of destinations in the IP network.
  • the network file server also has a routing table configured with the groups of destinations, and the RCP layer accesses the routing table to determine the destinations in the specified groups for transmission or forwarding.
  • the volume multicast layer is over the RCP layer for transmission or reception of specified volume extents of blocks between the file system volume and the IP network.
  • FIG. 1 is a block diagram of a data processing system in which a primary data storage system servicing a primary host processor is linked to a secondary storage system servicing a secondary host processor to provide the secondary host processor uninterrupted read-only access to a consistent dataset concurrent with read-write access by the primary host processor;
  • FIG. 2 is a block diagram showing data flow through the data processing system of FIG. 1;
  • FIG. 3 is a block diagram showing control flow through the secondary data storage system of FIG. 1;
  • FIG. 4 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to respond to a write command received from the primary data storage system;
  • FIG. 5 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to respond to a read command received from the secondary host processor;
  • FIG. 6 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to respond to a transaction commit command from the primary data storage system;
  • FIG. 7 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to perform a background task of integrating revisions into secondary dataset storage in the secondary data storage system;
  • FIG. 8 is a block diagram of a preferred construction for the data processing system of FIG. 1, in which a pair of “delta volumes” are mirrored between a primary data storage system and a secondary data storage system in order to buffer transmission of write commands from the primary data storage system to the secondary data storage system;
  • FIG. 9 is a block diagram showing data flow in the data processing system of FIG. 8;
  • FIG. 10 is a block diagram of a delta volume in the data processing system of FIG. 8;
  • FIG. 11 is a block diagram of data structures in the secondary storage of the secondary data storage system in FIG. 8;
  • FIG. 12 is a flowchart of programming in a delta volume facility of the primary data storage system of FIG. 8 for remote transmission of write commands to the secondary data storage system;
  • FIG. 13 is a block diagram of an alternative embodiment of the invention, in which the data storage systems are file servers, and the write commands include all file system access commands that modify the organization or content of a file system;
  • FIG. 14 is a block diagram of a directory of file system revisions and storage of file system revisions for the system of FIG. 13;
  • FIG. 15 is a block diagram of an IP network including multiple hosts and multiple data mover computers;
  • FIG. 16 is a block diagram showing a primary data mover distributing remote copy data to multiple secondary data movers in the IP network by establishing a Transmission Control Protocol (TCP) connection with each of the secondary data movers;
  • FIG. 17 is a block diagram showing a primary data mover distributing remote copy data to multiple data movers through forwarder data movers;
  • FIG. 18 is a block diagram showing a shared save volume used to buffer local copy data transmitted from a primary data mover to a secondary data mover;
  • FIG. 19 is a block diagram showing a primary save volume and a secondary save volume;
  • FIG. 20 is a flowchart showing local replication in the system of FIG. 18;
  • FIG. 21 is a flowchart showing remote replication in the system of FIG. 19;
  • FIG. 22 is a block diagram of a primary site, including layered programming in a primary data mover;
  • FIG. 23 is a block diagram of a secondary site, including layered programming in a secondary data mover;
  • FIG. 24 is a flowchart of a process of replication at the primary site of FIG. 22;
  • FIG. 25 is a flowchart of a procedure for producing a new remote copy of a primary file system concurrent with ongoing replication and multicasting of modifications to the primary file system;
  • FIG. 26 is a flowchart of an IP-replication send-thread introduced in FIG. 22;
  • FIG. 27 is a block diagram of a volume multicast level in the data mover programming of FIG. 22 and FIG. 23;
  • FIG. 28 is a block diagram of the RCP level in the primary data mover programming of FIG. 22;
  • FIG. 29 is a block diagram of the RCP level in the secondary data mover programming of FIG. 23;
  • FIG. 30 is a block diagram of an RCP forwarder at the RCP level in a forwarder data mover;
  • FIG. 31 is a flowchart of an inbound RCP session in the secondary data mover;
  • FIG. 32 is a block diagram showing a forwarder data mover performing local replication;
  • FIG. 33 is a block diagram showing the sharing of a data mover's single TCP port for RCP connections with Hypertext Transfer Protocol (HTTP) connections.
  • the present invention relates to replication of remote copy data for Internet Protocol (IP) transmission.
  • One application of the present invention is wide-area distribution of read-only data.
  • it is desired to provide uninterrupted read-only access to remote copies of a consistent file system concurrent with read-write updating of the file system.
  • the preferred method of providing such uninterrupted read-only access is to use a “delta set” mechanism described in Srinivasan et al., U.S. patent application Ser. No. 09/669,939, filed Sep. 26, 2000, which is commonly owned by the assignee of the present application.
  • FIGS. 1 to 14 and the corresponding written description in the present application have been reproduced from Ser. No. 09/669,939.
  • with reference to FIG. 1, there is shown a data processing system in which a primary data storage system 20 servicing a primary host processor 21 is connected via a transmission link 22 to a secondary storage system 23 servicing a secondary host processor 24 .
  • the primary data storage system 20 includes a storage controller 25 controlling access to primary storage 26
  • the secondary data storage system 23 has a storage controller 27 controlling access to secondary storage 28 .
  • the storage controller 25 is programmed, via a program storage device such as a floppy disk 29 , with a remote mirroring facility 30 , which transmits write commands from the primary host processor 21 over the link 22 to the storage controller 27 in the secondary storage system.
  • the storage controller 27 receives the write commands and executes them to maintain, in the secondary storage 28 , a copy of data that appears in the primary storage 26 of the primary data storage system. Further details of a suitable remote mirroring facility are disclosed in Ofek et al., U.S. Pat. No. 5,901,327 issued May 4, 1999, incorporated herein by reference.
  • the storage controller 27 in the secondary data storage system is programmed with a concurrent access facility for providing the secondary host processor 24 uninterrupted read-only access to a consistent dataset in the secondary storage 28 concurrent with read-write access by the primary host processor.
  • the concurrent access facility 31 is loaded into the storage controller 27 from a program storage device such as a floppy disk 32 .
  • the concurrent access facility 31 is responsive to the write commands from the primary data storage system, and read-only access commands from the secondary processor 24 .
  • the concurrent access facility 31 is also responsive to transaction commit commands, which specify when the preceding write commands will create a consistent dataset in the secondary storage 28 .
  • the transaction commit commands originate from the primary host processor 21 , and the storage controller 25 forwards at least some of these transaction commit commands over the link 22 to the storage controller 27 .
  • FIG. 2 is a block diagram showing data flow through the data processing system of FIG. 1.
  • the primary data storage system 20 stores a dataset 41 in primary storage
  • the secondary data storage system 23 maintains a copy of the dataset 42 in secondary storage.
  • the dataset, for example, could be a set of volumes, a single volume, a file system, a set of files, or a single file. Initially, each of the datasets 41 and 42 is empty, or they are identical because they are loaded from the same external source, or the dataset 42 is copied from the dataset 41 before any write operations are permitted upon the dataset 41 .
  • write operations by the primary host processor 21 cause write data to be written to the dataset 41 in primary storage
  • read operations by the primary host processor 21 cause read data to be read from the dataset 41 in primary storage
  • the primary data storage system forwards the write data from the primary host processor 21 over the link 22 to the secondary data storage system 23 .
  • a first switch 45 directs write data from the link 22 alternately to either a first storage “A” of dataset revisions 43 , or a second storage “B” of dataset revisions 44 .
  • a second switch 46 alternately directs write data to the dataset secondary storage 42 from either the first storage “A” of dataset revisions 43 , or the second storage “B” of dataset revisions.
  • the switches 45 and 46 are linked so that when the first switch 45 selects the first storage “A” of dataset revisions for receiving write data from the link 22 , the second switch 46 selects the second storage “B” of dataset revisions for transmitting write data to the dataset secondary storage 42 . Conversely, when the first switch 45 selects the second storage “B” of dataset revisions for receiving write data from the link 22 , the second switch 46 selects the first storage “A” of dataset revisions for transmitting write data to the dataset secondary storage 42 .
  • the switches 45 and 46 are toggled in response to receipt of a transaction commit command received over the link 22 from the primary data storage system. Moreover, the switches 45 and 46 are not toggled unless all of the revisions in the read-selected storage “A” or “B” of dataset revisions have been transferred to the dataset secondary storage 42 , and unless all of the updates since the last transaction commit command have actually been written from the link 22 into the write-selected storage “A” or “B” of dataset revisions. (For the switch positions in FIG. 2, the storage “A” of dataset revisions 43 is write-selected, and the storage “B” of dataset revisions is read-selected.) Therefore, the combination of the dataset revisions in the read-selected storage “A” or “B” of dataset revisions with the dataset in the dataset secondary storage represents a consistent dataset.
  • the secondary data storage system begins a background process of reading dataset revisions from the read-selected storage “A” or “B” of dataset revisions, and writing the updates into the dataset secondary storage.
  • the secondary host processor 24 may read any dataset revisions from the read-selected storage “A” or “B” of dataset revisions. If a dataset revision is not found in the read-selected storage “A” or “B” of dataset revisions for satisfying a read command from the secondary host processor 24 , then read data is fetched from the dataset secondary storage 42 .
  • the concurrent access facility 31 can provide the secondary host processor with substantially uninterrupted and concurrent read-only access to a consistent dataset regardless of the rate at which the dataset secondary storage 42 is updated to a consistent state by the completion of integration of a set of revisions into the dataset secondary storage. Therefore, the dataset in the dataset secondary storage 42 can be updated at a relatively low rate, and the storage controller 25 of the primary data storage system 20 can send transaction commit commands to the storage controller 27 of the secondary data storage system 23 at a much lower rate than the rate at which the storage controller 25 receives transaction commit commands from the primary host processor 21 . Moreover, the transaction commit commands can be encoded in the write commands sent over the link.
  • the write commands can write alternate sets of revisions to alternate dataset revision storage, as will be described below with respect to FIG. 9.
  • the storage controller 27 in the secondary data storage system 23 can regenerate the transaction commit commands by detecting that the addresses of the write commands have switched from one area of dataset revision storage to the other.
  • each write command can be tagged with a corresponding sequence number so that the storage controller 27 in the secondary data storage system 23 can verify that a complete set of write commands has been received prior to the switch of the write command addresses from one area of the dataset revision storage to the other.
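To make the encoding concrete, here is a small Python sketch (hypothetical, not from the patent) of a secondary that regenerates transaction commits from the write stream: it verifies the per-command sequence numbers and treats a switch of the write addresses between revision areas “A” and “B” as an implied commit:

```python
def process_writes(commands):
    """commands: iterable of (area, seqno) pairs, with area in {'A', 'B'}."""
    current_area = None
    expected_seqno = 0
    for area, seqno in commands:
        if seqno != expected_seqno:   # verify a complete set was received
            raise IOError(f"missing write: expected {expected_seqno}, got {seqno}")
        expected_seqno += 1
        if current_area is not None and area != current_area:
            print(f"write area switched {current_area} -> {area}: "
                  "regenerate a transaction commit here")
        current_area = area

process_writes([("A", 0), ("A", 1), ("B", 2), ("B", 3)])
```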
  • FIG. 3 is a block diagram showing control flow through the secondary data storage system of FIG. 1.
  • upon receipt of a write command (from the link 22 in FIGS. 1 and 2), the secondary data storage system accesses a directory 47 or 48 for the write-selected storage “A” or “B” of dataset revisions.
  • the directory is accessed to determine whether or not the write command is accessing the same data item or data storage location as an update existing in the write-selected storage “A” or “B” of dataset revisions. If so, then the directory provides the location of the update in the write-selected storage “A” or “B” of dataset revisions, and the write command is executed upon that pre-existing update.
  • upon receipt of a read-only access command from the secondary host processor, the secondary data storage system accesses the directory 47 or 48 for the read-selected storage “A” or “B” of dataset revisions. The directory is accessed to determine whether or not the read-only access command is accessing the same data item or data storage location as an update existing in the read-selected storage “A” or “B” of dataset revisions. If so, then the directory provides the location of the update in the read-selected storage “A” or “B” of dataset revisions, and the read-only access command is executed upon that pre-existing update. If not, then the secondary data storage system accesses a dataset directory 49 for the dataset secondary storage 42 , in order to locate the requested data in the dataset secondary storage 42 .
  • FIG. 4 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to respond to a write command received from the primary data storage system.
  • the write command specifies an address of a data item or storage location, and data to be written to the data item or storage location.
  • the storage controller accesses the write-selected directory “A” or “B” of dataset revisions ( 47 or 48 ) for the address specified by the write command.
  • in step 64 , the storage controller writes the data to the allocated storage.
  • in step 65 , the storage controller creates a new directory entry (in the write-selected directory “A” or “B” of dataset revisions 47 or 48 ) associating the write address with the allocated storage.
  • in step 66 , the storage controller returns an acknowledgement over the link to the primary storage system, and the task is finished.
  • in step 62 , if the write address is in the directory, then execution branches to step 67 .
  • in step 67 , the storage controller writes the data of the write command to the associated address in the write-selected storage “A” or “B” of dataset revisions ( 43 or 44 ). Execution continues from step 67 to step 66 to return an acknowledgement to the primary storage system, and the task is finished.
  • FIG. 5 is a flowchart showing how the storage controller of the secondary data storage system in FIG. 1 is programmed to respond to a read command received from the secondary host processor.
  • the read command specifies an address of a data item or storage location.
  • in step 71 , the storage controller accesses the read-selected directory “A” or “B” of dataset revisions ( 47 or 48 ).
  • in step 72 , execution branches depending on whether the address in the read command is found in the directory. If so, then execution branches from step 72 to step 73 .
  • in step 73 , the storage controller reads data from the read-selected storage “A” or “B” of dataset revisions. Execution continues from step 73 to step 74 , to return the data to the secondary host processor, and then the task is finished.
  • if in step 72 the read address is not in the directory accessed in step 71 , then execution continues from step 72 to step 75 .
  • in step 75 , the storage controller accesses the dataset directory ( 49 in FIG. 3). Then in step 76 , execution branches depending on whether the address of the read command is in the dataset directory. If not, execution continues to step 77 , to return an error code to the secondary host processor, and then the task is finished. Otherwise, if the address of the read command is found in the dataset directory, execution branches from step 76 to step 78 .
  • in step 78 , the storage controller reads data from the dataset secondary storage ( 42 in FIG. 3). Execution continues from step 78 to step 74 , to return the data to the secondary host processor, and the task is finished.
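The read path of FIG. 5 reduces to a two-level lookup. A minimal Python rendering (data structures assumed for illustration):

```python
# Revision directory for the read-selected "A" or "B" storage, and the
# dataset directory for the dataset secondary storage, modeled as dicts.
revision_directory = {100: b"revised block"}
dataset_directory = {100: b"old block", 101: b"other block"}

def read_block(address: int) -> bytes:
    if address in revision_directory:      # steps 71-73: revision wins
        return revision_directory[address]
    if address in dataset_directory:       # steps 75-78: fall back to dataset
        return dataset_directory[address]
    raise KeyError(f"address {address} not found")   # step 77: error code

assert read_block(100) == b"revised block"
assert read_block(101) == b"other block"
```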
  • FIG. 6 is a flowchart showing how the storage controller of the secondary data storage system in FIG. 1 is programmed to respond to a transaction commit command from the primary data storage system.
  • in step 81 , the storage controller checks whether or not the background task of FIG. 7 is done with integration of the dataset into the dataset secondary storage. For example, this background task is done when the read-selected directory “A” or “B” of dataset revisions is empty. If not, then in step 82 , the storage controller returns a flow control signal to the primary data storage system, because subsequent write commands from the link should not be placed in the storage “A” or “B” of dataset revisions until completion of the integration of the dataset revisions into secondary storage.
  • in step 83 , the task of FIG. 6 is suspended for a time to permit the background task to continue with integration of the dataset into secondary storage, and then the task is resumed. After step 83 , execution loops back to step 81 . Once the dataset has been integrated into secondary storage, execution continues from step 81 to step 84 .
  • in step 84 , the switches ( 45 and 46 in FIGS. 2 and 3) are toggled. This is done by complementing a logical variable or flag, which indicates which storage of dataset revisions is selected for read and write operations. For example, when the flag has a logical value of 0, the storage “A” of dataset revisions 43 is read-selected, and the storage “B” of dataset revisions 44 is write-selected. When the flag has a logical value of 1, the storage “A” of dataset revisions 43 is write-selected, and the storage “B” of dataset revisions is read-selected.
  • in step 85 , the storage controller initiates the background task of integrating dataset revisions from the read-selected storage “A” or “B” of dataset revisions into the dataset secondary storage. Then, in step 86 , the storage controller returns an acknowledgement of the transaction commit command to the primary data storage system, and the task of FIG. 6 is done.
  • FIG. 7 is a flowchart showing how the storage controller of the secondary data storage system in FIG. 1 is programmed to perform a background task of integrating revisions into the dataset secondary storage.
  • in a first step 91 , the first dataset revision is obtained from the read-selected “A” or “B” dataset revision storage ( 43 or 44 in FIG. 3).
  • in step 92 , the storage controller searches the dataset directory ( 49 in FIG. 3) for the write address of the dataset revision.
  • in step 93 , execution branches depending on whether the write address is found in the directory. If not, execution continues from step 93 to step 94 .
  • in step 94 , the storage controller stores the revision in the dataset secondary storage ( 42 in FIG. 3), and the storage controller updates the dataset directory ( 49 in FIG. 3). Execution continues from step 94 to step 96 .
  • in step 93 , if the address of the dataset revision is found in the dataset directory, then execution branches to step 95 to replace the obsolete data in the dataset secondary storage with the dataset revision, and the dataset directory is updated if appropriate.
  • the dataset directory is updated, for example, if the information in the directory for the obsolete data is no longer applicable to the revision.
  • after step 95 , execution continues in step 96 .
  • in step 96 , the storage controller de-allocates storage of the dataset revision from the read-selected “A” or “B” dataset revision storage ( 43 or 44 in FIG. 3). Execution continues from step 96 to step 97 . In step 97 , the task is finished if the dataset revision storage is found to be empty. Otherwise, execution continues from step 97 to step 98 . In step 98 , the task is suspended to permit any higher priority tasks to begin, and once the higher priority tasks are completed, the background task is resumed. Execution then continues to step 99 . In step 99 , the storage controller obtains the next dataset revision from the read-selected “A” or “B” dataset revision storage. Execution loops back to step 92 from step 99 , in order to integrate all of the revisions from the read-selected “A” or “B” dataset revision storage into the dataset secondary storage.
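In outline, the background task is a drain loop. A compact Python sketch under the same assumptions as the previous examples (the suspension for higher-priority tasks in step 98 is omitted):

```python
def integrate_revisions(revisions: dict, dataset: dict) -> None:
    """Drain the read-selected revision storage into dataset secondary storage."""
    while revisions:                                   # step 97: until empty
        address, data = next(iter(revisions.items()))  # steps 91/99: next revision
        dataset[address] = data   # steps 94/95: store new or replace obsolete data
        del revisions[address]    # step 96: de-allocate the revision storage

revisions = {5: b"rev5", 9: b"rev9"}
dataset = {5: b"old5"}
integrate_revisions(revisions, dataset)
assert dataset == {5: b"rev5", 9: b"rev9"} and not revisions
```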
  • FIG. 8 is a block diagram of one preferred construction for a data processing system in which the write commands for the dataset revisions access direct mapped, numerically addressed storage.
  • the data processing system includes a primary data storage system 110 , a data mover computer 111 , a primary host processor 112 , a secondary data storage system 113 , a data mover computer 114 , and a secondary host processor 115 .
  • the data mover computer 111 includes a file system 116 that translates file system read and write commands from the primary host processor 112 to logical block read and write commands to the primary data storage system. Therefore, the combination of the data mover computer 111 and the primary data storage system 110 functions as a file server. Further details regarding the programming of the data mover computer 111 and the file system 116 are disclosed in Vahalia et al., U.S. Pat. No. 5,893,140, issued Apr. 6, 1999, and entitled “File Server Having A File System Cache And Protocol For Truly Safe Asynchronous Writes,” incorporated herein by reference. In a similar fashion, the combination of the secondary data storage system 113 and the data mover computer 114 also functions as a file server.
  • the primary data storage system has primary storage 118 , and a storage controller 119 .
  • the storage controller includes a semiconductor random access cache memory 120 , a host adapter 121 interfacing the data mover computer 111 to the cache memory, disk adapters 122 , 123 interfacing the cache memory to the primary storage 118 , and a remote mirroring facility 124 for interfacing the cache memory 120 to dual redundant data transmission links 125 , 126 interconnecting the primary data storage system 110 to the secondary data storage system 113 .
  • the remote mirroring facility is constructed and operates as described in the above-cited Ofek et al., U.S. Pat. No. 5,901,327 issued May 4, 1999. This remote mirroring facility mirrors file system storage 141 in the primary storage 118 .
  • the file system storage 141 is mirrored by mirroring delta volume storage 143 that is used to buffer the updates to file system storage 141 of the primary storage 118 .
  • the host adapter 121 is programmed with a “delta volume facility” 127 that loads the updates into the delta volume storage 143 of the primary storage 118 .
  • the remote mirroring facility transmits the updates over the dual redundant links 125 , 126 to mirrored delta volume storage 144 in secondary storage 128 in the secondary data storage system 113 , as further shown and described below with reference to FIGS. 9 to 12 .
  • the delta volume facility 127 is located at a volume manager level in the data processing system of FIG. 8.
  • the volume manager level lies between the level of the file system 116 and the level of the primary data storage system 110 .
  • the file system 116 addresses logical blocks in logical volumes. In other words, each logical volume appears as an array of blocks having contiguous logical block numbers.
  • the volume manager maps the logical block number into an appropriate basic storage volume and physical offset within the basic volume.
  • the volume manager permits a number of the basic storage volumes to be combined in various ways to construct a single metavolume that can be used to build a file system.
  • the file system views the metavolume as a single, contiguous array of blocks that is accessible by specifying a logical block number within this array.
  • the secondary data storage system 113 also includes a storage controller 129 .
  • the storage controller 129 includes a semiconductor cache memory 130 , a host adapter 131 , disk adapters 132 and 133 , and a remote mirroring facility 134 .
  • the host adapter 131 is programmed with a concurrent access facility 135 that is similar to the concurrent access facility ( 31 in FIG. 1) described above with respect to FIGS. 1 to 7 , except that the concurrent access facility 135 obtains updates from the mirrored delta volume storage 144 in the secondary storage 128 (as further described below with reference to FIGS. 9 to 11 ) instead of directly from the primary data storage system.
  • FIG. 9 is a block diagram showing data flow in the data processing system of FIG. 8.
  • the file system 116 performs read and write operations upon the file system primary storage 141 .
  • Write data for sets of sequential transactions are alternately written to an “A” delta volume 145 and a “B” delta volume 146 in the delta volume storage 143 of the primary data storage system 110 .
  • the remote mirroring facility transfers the write data to a mirrored “A′” delta volume 147 and a mirrored “B′” delta volume 148 in the delta volume storage 144 of the secondary data storage system 113 .
  • when the secondary host processor requests read-only file system access from the data mover computer 114 , the data mover computer reads file system data from a read-selected one of the “A′” delta volume 147 or the “B′” delta volume 148 in the delta volume storage 144 of the secondary data storage system 113 , and if the required file system data are not found in the read-selected one of the delta volumes, then the data mover computer reads the file system data from the file system secondary storage 142 .
  • FIG. 10 is a block diagram of a delta volume in the data processing system of FIG. 8.
  • Each delta volume is logically divided into delta chunks of a fixed size.
  • the fixed size is preselected depending on various factors such as the serving capacity of the primary site and the write activity at the primary site.
  • the fixed size is large enough to contain all of the updates for any single transaction.
  • Each delta set is a set of changes to the file system blocks that, when viewed as a whole, leave the file system in a consistent state.
  • the delta sets are identified by a sequence number (SEQNO) and written to the delta volume (and thus propagated to the replica sites).
  • a new delta set begins at the start of a delta chunk and the size of a delta set cannot exceed the size of a delta chunk.
  • the sequence number (SEQNO) and also the delta set size (DSS) can be written to a header or trailer 149 of the delta chunk.
  • the delta volume therefore functions as a transaction log for updates to the file system, and also as a buffer for transmitting the updates between the primary data storage system and the secondary data storage system.
  • each delta set can also have a fixed size, to facilitate asynchronous transmission of the updates over the data link between the primary and secondary data storage systems.
  • each block update can have its own sequence number. If a transmission error is detected, such as a failure of the secondary data storage system to receive a block update in sequence, the block update can be retransmitted, and written into its delta set in proper sequence when received.
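An illustrative in-memory layout of such a delta chunk, with field names and the chunk size chosen only for the sketch:

```python
from dataclasses import dataclass, field

CHUNK_BLOCKS = 1024   # hypothetical fixed delta chunk size, in blocks

@dataclass
class DeltaChunk:
    seqno: int = 0     # sequence number; writing it validates the delta set
    dss: int = 0       # delta set size, in blocks
    updates: list = field(default_factory=list)   # (block_no, data) pairs

    def append(self, block_no: int, data: bytes) -> None:
        """Add one block update; a delta set cannot outgrow its chunk."""
        if self.dss + 1 > CHUNK_BLOCKS:
            raise OverflowError("delta set exceeds the delta chunk size")
        self.updates.append((block_no, data))
        self.dss += 1

chunk = DeltaChunk(seqno=1)
chunk.append(42, b"updated block")
```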
  • the format of the delta volume has been selected to minimize computational overhead for accessing the delta volume rather than to minimize storage requirements.
  • a conventional transaction log has a format selected to minimize storage requirements rather than to minimize computational overhead for accessing the log.
  • the delta volume could use a conventional transaction log data structure.
  • the delta volume could also include a delta set directory overlaid upon the conventional transaction log data structure.
  • a single delta volume, rather than two delta volumes, could be used for buffering the transmission of file system updates between the primary data storage system and the secondary data storage system. If a single delta volume were used, then alternate delta chunks in the delta volume could be read-selected and write-selected. It should also be apparent that more than two delta volumes could be used for buffering file system updates.
  • the primary data storage system could store data for multiple file systems, and each file system to be accessed from the secondary data storage system could have its updates buffered in one, two, or more delta volumes used for buffering the updates of only one file system.
  • FIG. 11 is a block diagram of data structures in the secondary storage ( 128 in FIG. 8) of the secondary data storage system ( 113 in FIG. 8).
  • the concurrent access facility ( 135 in FIG. 8) in the secondary data storage system uses a volume manager utility that inserts the read-selected delta set 151 as an overlay on top of the file system metavolume 152 .
  • a delta set map 153 is created of the block entries in the delta set 151 . This map is then used to route a block read request to either the delta set or the file system metavolume depending on whether there is a block entry in the delta set for the requested block or not.
  • the read-selected delta set 151 corresponds to the read-selected storage “A” or “B” of dataset revisions 43 or 44 in FIG. 2 and FIG. 3, and the delta set map 153 corresponds to the directory 47 or 48 in FIG. 3 for the dataset revisions.
  • the time of insertion of the read-selected delta set and the creation of the delta set map corresponds to the time between steps 84 and 85 of FIG. 6.
  • the integration of the file system revisions involves copying the updates into the corresponding blocks of the file system metavolume 152 .
  • the routing of a block read request to either the delta set or the file system metavolume corresponds to steps 71 and 72 in FIG. 5.
  • FIG. 12 is a flowchart of programming in a delta volume facility of the primary data storage system of FIG. 8 for remote transmission of write commands to the secondary data storage system.
  • in a first step 161 , the storage controller of the primary data storage system clears the sequence number (SEQNO).
  • the sequence number is used to map the current delta chunk into either the “A” delta volume or the “B” delta volume. For example, if the sequence number is even, then the current delta chunk is in the “A” delta volume, and if the sequence number is odd, then the current delta chunk is in the “B” delta volume.
  • the position of the delta chunk in the corresponding delta volume is computed by an integer division by two (i.e., a right shift by one bit position), and then masking off the two least significant bits (i.e., the remainder of an integer division by four).
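Worked out in Python, this mapping cycles through four delta chunks in each of the two delta volumes:

```python
def locate_delta_chunk(seqno: int) -> tuple[str, int]:
    """Map a delta set sequence number to (delta volume, chunk position)."""
    volume = "A" if seqno % 2 == 0 else "B"   # even -> "A", odd -> "B"
    chunk_index = (seqno >> 1) & 3            # (seqno // 2) mod 4
    return volume, chunk_index

for seqno in range(8):
    print(seqno, locate_delta_chunk(seqno))
# 0 ('A', 0), 1 ('B', 0), 2 ('A', 1), 3 ('B', 1), ... 7 ('B', 3)
```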
  • in step 162 , the storage controller clears a variable indicating the delta set size (DSS). Then in step 163 , the storage controller clears a timer.
  • the timer is a variable that is periodically incremented. The timer is used to limit the frequency at which transaction commit commands are forwarded from the primary data storage system to the secondary storage system unless the transaction commit commands need to be transmitted at a higher rate to prevent the size of the delta sets from exceeding the size of the delta chunk.
  • in step 164 , execution continues to step 165 if the storage controller receives a write command from the primary host processor.
  • in step 165 , the storage controller places the write command in the current delta chunk. This involves writing a number of data blocks to the delta volume selected by the sequence number (SEQNO), beginning at an offset computed from the sequence number and the current delta set size (DSS). Then in step 166 , the storage controller increments the delta set size (DSS) by the number of blocks written to the delta chunk.
  • in step 167 , the storage controller compares the delta set size to a maximum size (DSM) to check whether a delta chunk overflow error has occurred. If so, then execution branches to an error handler 168 . Otherwise, execution continues to step 169 .
  • in step 169 , execution continues to step 170 unless a transaction commit command is received from the primary host processor. In step 170 , the delta volume facility task of FIG. 12 is temporarily suspended, and then resumed. Otherwise, if a transaction commit command is received, execution continues to step 171 .
  • at this point, the data mover computer ( 111 in FIG. 8) has already flushed any and all file system updates preceding the transaction commit command from any of its local buffer storage to the primary data storage system. Write operations by the primary host processor subsequent to the transaction commit command are temporarily suspended until this flushing is finished. Therefore, once step 171 has been reached, the updates in the delta set of the current delta chunk represent a change of the file system from one consistent state to another consistent state.
  • in step 171 , the storage controller compares the delta set size (DSS) to a threshold size (THS) that is a predetermined fraction (such as one-half) of the maximum size (DSM) for a delta chunk. If the delta set size (DSS) is not greater than this threshold, then execution continues to step 172 .
  • in step 172 , the timer is compared to a predetermined threshold (THT) representing the minimum update interval for the file system secondary storage, unless a smaller update interval is needed to prevent the size of the delta set from exceeding the size of the delta chunk.
  • the minimum update interval (THT) should depend on the particular application. A value of 5 minutes for THT would be acceptable for many applications.
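The decision in steps 171 and 172 can be summarized as a simple predicate; the constants below are illustrative assumptions consistent with the text (THS as one-half of DSM, THT as 5 minutes):

```python
DSM = 1024          # maximum delta set size, in blocks (assumed)
THS = DSM // 2      # threshold size: a predetermined fraction of DSM
THT = 5 * 60        # minimum update interval, in seconds

def should_forward_commit(dss: int, timer_seconds: float) -> bool:
    """Forward the transaction commit to the secondary when the delta set is
    getting large, or enough time has elapsed since the last forwarded commit."""
    return dss > THS or timer_seconds >= THT

assert should_forward_commit(dss=600, timer_seconds=10)     # size-triggered
assert should_forward_commit(dss=10, timer_seconds=400)     # time-triggered
assert not should_forward_commit(dss=10, timer_seconds=10)  # keep accumulating
```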
  • step 173 Execution also branches from step 171 to step 173 if the delta set size (DSS) is greater than the threshold size (THS).
  • in step 173 , the storage controller writes the delta set size (DSS) and the sequence number (SEQNO) into an attributes block of the delta chunk (e.g., the trailer 149 in FIG. 10).
  • the updating of the sequence number in the delta chunk validates the delta set in the delta chunk.
  • in step 174 , the storage controller flushes the current delta volume to ensure that all updates in the delta set of the current delta chunk will be transmitted to the secondary data storage system, and then sends a transaction commit command to the secondary data storage system.
  • the secondary data storage system should have received all of the updates in the delta set of the current delta chunk before receipt of the transaction commit command.
  • the remote data mirroring facility can be operated in an asynchronous or semi-synchronous mode for the current delta volume until step 174 , and switched in step 174 to a synchronous mode to synchronize the current delta volume in the primary data storage system with its mirrored volume in the secondary data storage system, and then the transaction commit command can be sent once the remote mirroring facility indicates that synchronization has been achieved for the current delta volume.
  • in step 175 , the storage controller increments the sequence number (SEQNO).
  • in step 176 , the storage controller temporarily suspends the delta volume facility task of FIG. 12, and later resumes the task. Execution then loops back from step 176 to step 162 .
  • because the delta volumes serve as buffers for transmission of updates from the primary data storage system to the secondary data storage system, there is no need for the delta volume facility to wait for receipt of an acknowledgement of the transaction commit command sent in step 174 , before continuing to step 175 . Instead, flow control of the updates can be based upon the sequence numbers and the use of sufficiently large delta volumes.
  • the delta sets are numbered in an increasing sequence.
  • the delta sets are loaded and unloaded in the order of this sequence. If any delta sets are corrupted during transmission between the sites, they can be retransmitted and then reordered in terms of their sequence numbers.
  • the primary data storage system will start by producing set number 1 , followed by set number 2 and so on.
  • each secondary data storage system will integrate the file system secondary storage with the delta sets by unloading and integrating set number 1 , followed by set number 2 and so on.
  • the primary does not wait for an immediate acknowledgement from the secondaries when moving from one delta set to the next. This is made feasible by having a sufficiently large delta volume so that there is enough buffer space to allow the secondaries to be a few delta sets behind the primary without having an overflow.
  • An overflow happens when a primary reuses a delta chunk to write a new delta set before one or more of the secondaries have completely processed the old delta set residing on that delta chunk.
  • the primary data storage system can prevent such an overflow condition by suspending the processing of delta sets if there is a failure to receive an acknowledgement of a transaction commit command over the production of a certain number of the delta sets, such as seven delta sets for the example of four delta chunks per delta volume and two delta volumes per file system.
  • the overflow condition can also be detected at a secondary data storage system by inspecting the delta set sequence numbers. An overflow is detected if the SEQNO in the delta chunk exceeds the next number in sequence when the secondary data storage system read-selects the next delta chunk for integration of the updates into the file system secondary storage.
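A sketch of that check on the secondary side (hypothetical function, consistent with the description above):

```python
def check_for_overflow(chunk_seqno: int, expected_seqno: int) -> None:
    """Read-selecting the next delta chunk: its SEQNO must be exactly the next
    number in sequence; a higher value means the primary has overwritten an
    unprocessed delta set, i.e., the delta volume has overflowed."""
    if chunk_seqno > expected_seqno:
        # would trigger the flow control error recovery procedure
        raise RuntimeError(f"delta volume overflow: expected delta set "
                           f"{expected_seqno}, found {chunk_seqno}")

check_for_overflow(chunk_seqno=4, expected_seqno=4)   # in sequence: no overflow
```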
  • when an overflow is detected, a flow control error recovery procedure is activated. This error recovery procedure, for example, involves suspending write operations by the primary host processor, re-synchronizing the file system secondary storage with the file system primary storage, and restarting the delta volume facility.
  • it is desired to operate the conventional remote mirroring facility ( 124 in FIG. 8) in such a way as to ensure that all of the updates to the delta set in the current delta chunk will be flushed and received by the secondary data storage system, prior to sending the transaction commit command to the secondary data storage system, with a minimal impact on continued host processing in the primary data storage system.
  • the remote mirroring facility operates to transmit updates from the primary data storage system to the secondary data storage system concurrently with writing to the volume. During the flushing and synchronization of a remotely mirrored volume, however, the writing to a volume is temporarily suspended.
  • one of the delta volumes can be flushed in step 174 while the subsequent processing in FIG. 12 (steps 175 , 176 et seq.) continues concurrently for the next delta set, which is mapped to the other of the two delta volumes.
  • a transmit message volume can be allocated in the primary data storage system and mirrored to a similar receive message volume in the secondary data storage system. Also, a transmit message volume can be allocated in the secondary data storage system and mirrored to a similar receive message volume in the primary data storage system.
  • the remote mirroring facility will then automatically copy messages deposited in the transmit volume of one data storage system to a receive volume of another data storage system.
  • the volumes can be partitioned into a number of identical message regions (analogous to the delta chunks of FIG. 10), each having an associated sequence number, so that the message volumes also function as a message queue.
  • Each data storage system could inspect the sequence numbers in the receive message volume to identify a new message that has been deposited in the message volume.
  • Each block in each message region could be allocated to a predefined message type.
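Under these assumptions, the mirrored message volumes behave as a polled queue. A small Python model (region count and structure are illustrative only):

```python
NUM_REGIONS = 4   # assumed number of fixed-size message regions per volume
receive_volume = [{"seqno": 0, "payload": None} for _ in range(NUM_REGIONS)]
last_consumed = 0

def poll_messages():
    """Yield newly deposited messages in sequence-number order, as detected
    by inspecting the sequence numbers stamped on each message region."""
    global last_consumed
    fresh = sorted((r for r in receive_volume if r["seqno"] > last_consumed),
                   key=lambda r: r["seqno"])
    for region in fresh:
        last_consumed = region["seqno"]
        yield region["payload"]

# The remote mirroring facility would copy this deposit over from the
# sender's transmit volume; here it is written directly for the example.
receive_volume[1 % NUM_REGIONS] = {"seqno": 1, "payload": b"hello"}
print(list(poll_messages()))   # [b'hello']
```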
  • FIG. 13 is a block diagram of an alternative embodiment of the invention, in which the data storage systems are file servers, and the write commands include all file system access commands that modify the organization or content of a file system.
  • the data processing system in FIG. 13 includes a conventional primary file server 181 coupled to a primary host processor 182 for read-write access of the primary host processor to a file system stored in the file server.
  • the conventional file server includes a storage controller 183 and primary storage 184 .
  • the storage controller 183 includes facilities for file access protocols 191 , a virtual file system 192 , a physical file system 193 , a buffer cache 194 , and logical-to-physical mapping 195 . Further details regarding such a conventional file server are found in the above-cited Vahalia et al., U.S. Pat. No. 5,893,140, issued Apr. 6, 1999.
  • the data processing system in FIG. 13 also includes a secondary file server 185 coupled to the primary host processor 182 to receive copies of at least the write access commands sent from primary host processor to the primary file server.
  • the secondary file server has a storage controller 187 and secondary storage 188 .
  • the storage controller 187 includes facilities for file access protocols 201 , a virtual file system 202 , a physical file system 203 , a buffer cache 204 , and logical-to-physical mapping 205 . To this extent the secondary file server is similar to the primary file server.
  • the primary host processor 182 has a remote mirroring facility 186 for ensuring that all such write access commands are copied to the secondary file server 185 .
  • This remote mirroring facility 186 could be located in the primary file server 181 instead of in the primary host processor.
  • the remote mirroring facility 186 also ensures that the primary host processor will receive acknowledgement, from both the primary file server 181 and the secondary file server 185 , of completion of all preceding write commands from an application 199 before the primary host processor returns to the application an acknowledgement of completion of a transaction commit command from the application 199 .
  • the secondary file server 185 therefore stores a copy of the file system that is stored in the primary file server 181 .
  • a secondary host processor 189 is coupled to the secondary file server 185 for read-only access of the secondary host processor to the copy of the file system that is stored in the secondary storage.
  • the secondary file server 185 has a concurrent access facility 200 that is an interface between the virtual file system 202 and the physical file system 203 .
  • the physical file system layer 203 is a UNIX-based file system having a hierarchical file system structure including directories and files, and each directory and file has an “inode” containing metadata of the directory or file.
  • Popular UNIX-based file systems are the UNIX file system (ufs), which is a version of Berkeley Fast File System (FFS) integrated with a vnode/vfs structure, and the System V file system (s5fs).
  • the concurrent access facility 200 in FIG. 13 can be constructed as shown in FIGS. 2 to 7 above.
  • In this case, the dataset is the file system.
  • the directories of dataset revisions ( 47 , 48 in FIG. 3) and the storage of data revisions ( 43 , 44 ) preferably have a hierarchical inode structure to facilitate integration with the hierarchical inode structure of the UNIX-based file system directory corresponding to the dataset directory 49 in FIG. 3.
  • FIG. 14 shows a hierarchical structure of a directory of dataset revisions and storage of dataset revisions for a write to a file D:/SUB1/FILE-X followed by a file rename operation RENAME D:/SUB1/FILE-X to D:/SUB1/FILE-Y.
  • the first update would be processed in steps 61 to 66 of FIG. 4 by creating a root directory 210 named “D:” in the write-selected directory of dataset revisions, as shown in FIG. 14, creating a subdirectory 211 named “SUB1”, creating a file entry for “FILE-X” in the subdirectory, and linking the new metadata 216 and new data 217 to the “FILE-X” entry.
  • the task would process the second update by searching the root directory 210 and subdirectory 211 to find the “FILE-X” entry, creating a new file entry 214 named “FILE-Y” in the subdirectory, linking an alias attribute pointing to the “FILE-X” entry in the subdirectory, creating a command list linked to the “FILE-X” entry and including the command “RENAME [FILE-X to] FILE-Y”, and then unlinking the new metadata 216 and new data 217 from the “FILE-X” entry and linking the new metadata 216 and new data 217 to the “FILE-Y” entry.
  • the resulting data structure would then facilitate subsequent read-only access and integration of the new data of “FILE-Y” with any non-obsolete write data for “FILE-X” in the dataset secondary storage ( 42 in FIG. 3) for the file system “D:/”.
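  • For illustration only, the rename handling described above might be sketched as a simplified in-memory model of the directory of dataset revisions (the class shape and method names are not the patent's):

      def process_rename(revision_root, subdir_name, old_name, new_name):
          # Locate the subdirectory (e.g. "SUB1") under the root (e.g. "D:").
          subdir = revision_root.subdirs[subdir_name]
          old_entry = subdir.entries[old_name]          # "FILE-X" entry
          new_entry = subdir.create_entry(new_name)     # new "FILE-Y" entry
          new_entry.alias = old_entry                   # alias attribute link
          old_entry.commands.append(("RENAME", old_name, new_name))
          # Move the new metadata and new data to the "FILE-Y" entry.
          new_entry.metadata, old_entry.metadata = old_entry.metadata, None
          new_entry.data, old_entry.data = old_entry.data, None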
  • the remote mirroring aspect of the present invention could be implemented at an intermediate level in the file server below the file access command level (as in the system of FIG. 13) and above the logical block level (as in the system of FIG. 8).
  • the remote mirroring could operate at the physical file system inode level.
  • the storage of dataset revisions could be implemented as a sequential transactional log of the file system modifications on the primary side, with sufficient information stored in the log, such as inode numbers and old values and new values, to allow the secondary concurrent access facility to “replay” the transactions into the “live” file system in the file system secondary storage.
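  • Under that alternative, a log record and its replay might look like the following sketch, where the record fields follow the description above and everything else is an assumption:

      from dataclasses import dataclass

      @dataclass
      class LogRecord:
          inode_no: int      # inode modified on the primary side
          offset: int        # position of the modification
          old_value: bytes   # value before the modification
          new_value: bytes   # value after the modification

      def replay(log, live_fs):
          # "Replay" the logged transactions into the live secondary file system.
          for rec in log:
              inode = live_fs.get_inode(rec.inode_no)   # hypothetical accessor
              inode.write(rec.offset, rec.new_value)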
  • FIG. 15 shows an IP network 220 including multiple network file servers 221 , 222 , and multiple hosts 223 , 224 , 225 .
  • the hosts and network file servers can be distributed worldwide and linked via the Internet.
  • Each of the network file servers 221 , 222 has multiple data movers 226 , 227 , 228 , 232 , 233 , 234 , for moving data between the IP network 220 and the cached disk array 229 , 235 , and a control station 230 , 236 connected via a dedicated dual-redundant data link 231 , 237 among the data movers for configuring the data movers and the cached disk array 229 , 235 . Further details regarding the network file servers 221 , 222 are found in Vahalia et al., U.S. Pat. No. 5,893,140, incorporated herein by reference.
  • it is desired for each of the network file servers 221 , 222 to provide read-only access to a copy of the same file system.
  • each of the network file servers could be programmed to respond to user requests to access the same Internet site.
  • the IP network 220 routes user requests to the network file servers 221 , 222 in the same continent or geographic region as the user. In this fashion, the user load is shared among the network file servers.
  • as shown in FIG. 16, a primary data mover 241 establishes a connection 242 , 243 , 244 in accordance with the industry-standard Transmission Control Protocol (TCP) over the IP network 220 to each secondary data mover 245 , 246 , 247 , and then concurrently sends the updates to each secondary data mover over the TCP connection.
  • when the updates need to be distributed to a large number of secondary data movers, however, the amount of time for distributing the updates may become excessive due to limited resources (CPU execution cycles, connection state, or bandwidth) of the primary data mover 241 .
  • One way of extending these limited resources would be to use existing IP routers and switches to implement “fan out” from the primary data mover 241 to the secondary data movers 245 , 246 , 247 . Still, a mechanism for reliability should be layered over the Internet Protocol.
  • FIG. 17 shows that the time for distributing updates from a primary data mover 251 to a large number of secondary data movers 254 , 255 , 256 , 257 can be reduced by using intermediate data movers 252 , 253 as forwarders.
  • the primary data mover 251 sends the updates to the forwarder data movers 252 , 253 , and each of the forwarder data movers sends the updates to a respective number of secondary data movers.
  • the forwarder data movers 252 , 253 may themselves be secondary data movers; in other words, each may apply the updates to its own copy of the replicated read-only file system.
  • the distribution from the primary data mover 251 to the forwarder data movers 252 , 253 can be done in a fashion suitable for wide-area distribution (such as over TCP connections).
  • the forwarding method of replication of FIG. 17 also has the advantage that the distribution from each forwarder data mover to its respective data movers can be done in a different way most suitable for a local area or region of the network. For example, some of the forwarder data movers could use TCP connections, and others could use a combination of TCP connections for control and UDP for data transmission, and still other forwarders could be connected to their secondary data movers by a dedicated local area network.
  • the save volume is a buffer between the data producer (i.e., the host or application updating the primary file system), the replication process, and the data consumer (the secondary data movers).
  • the save volume stores the progress of the replication over the Internet Protocol so as to maintain the consistency of the replication process upon panic, reboot, and recovery.
  • the transport process need not depend on any “in memory” replication information other than the information in the save volume, so as to permit the replication process to be started or terminated easily on any data mover for load shifting or load balancing.
  • when a save volume is used, it can be shared between a primary data mover and a secondary data mover in the case of local file system replication, or a primary copy of the shared volume can be kept at the primary site and a secondary copy of the shared volume can be kept at the secondary site in the case of remote file system replication.
  • FIG. 18 shows a primary site including a primary data mover 260 managing access to a primary file system 261 , and a secondary data mover 262 managing access to a secondary file system 263 maintained as a read-only copy of the primary file system 261 .
  • a save volume 264 is shared between the primary data mover 260 and the secondary data mover 262 . This sharing is practical when the secondary site is relatively close to the primary site.
  • a redo log 265 records a log of modifications to the primary file system 261 during the replication process for additional protection from an interruption that would require a reboot and recovery.
  • Local replication can be used to replicate files within the same network file server.
  • For example, the primary data mover could be the data mover 226 , the secondary data mover could be the data mover 227 , the save volume could be stored in the cached disk array 229 , and replication control messages could be transmitted between the data movers over the data link 231 .
  • FIG. 19 shows a primary site including a primary data mover 270 managing access to a primary file system 271 , and a secondary data mover 272 managing access to a secondary file system 273 maintained as a read-only copy of the primary file system 271 .
  • the primary site includes a primary save volume 274
  • the remote site includes a secondary save volume 275 .
  • a redo log 276 records a log of modifications to the primary file system 271 during the replication process for additional protection from an interruption that would require a reboot and recovery.
  • FIG. 20 shows a method of operating the system of FIG. 18 for local replication.
  • In a first step 281, the primary data mover migrates a copy of the primary file system to create a secondary file system at the secondary site in such a way as to permit concurrent write access to the primary file system.
  • the migration may use the method shown in FIG. 17 of the above-cited Ofek U.S. Pat. No. 5,901,327, in which a bit map indicates remote write pending blocks.
  • the migration may use a snapshot copy mechanism, for example, as described in Kedem, U.S. Pat. No. 6,076,148, in which a bit map indicates the blocks that have changed since the time of snap-shotting of the primary file system.
  • the snapshot method is preferred, because it is most compatible with the delta set technique for remote copy of subsequent modifications.
  • a snapshot manager creates a snapshot copy of the primary file system, as will be further described below with reference to FIGS. 22 to 25 .
  • it is desired for the secondary file system to become a copy of the state of the primary file system existing at some point in time, with any subsequent modifications of the primary file system being transferred through the shared save volume.
  • In step 282, the primary data mover writes subsequent modifications of the primary file system to the shared save volume.
  • In step 283, the secondary data mover reads the subsequent modifications from the shared save volume and writes them to the secondary file system.
  • In step 284, the secondary data mover provides user read-only access to consistent views of the secondary file system. This can be done by integrating the subsequent revisions into the secondary file system and providing concurrent read-only access to the secondary file system in the fashion described above with reference to FIGS. 2 to 7 . Execution loops from step 284 back to step 282 . In this fashion, the secondary file system is updated from the primary site concurrently with read-only access at the secondary site.
  • FIG. 21 shows a method of operating the system of FIG. 19 for remote replication.
  • In a first step 291, the primary data mover migrates a copy of the primary file system to create a secondary file system at the secondary site, in a fashion similar to step 281 in FIG. 20.
  • In step 292, the primary data mover writes subsequent modifications of the primary file system to the primary save volume, in a fashion similar to step 282 in FIG. 20.
  • In step 293, the modifications are copied from the primary save volume to the secondary save volume, for example, by using a delta volume facility for transmitting delta chunks as described above with reference to FIGS. 9 to 12 .
  • In step 294, the secondary data mover reads the modifications from the secondary save volume and writes them to the secondary file system.
  • In step 295, the secondary data mover provides user read-only access to consistent views of the secondary file system, in a fashion similar to step 284 of FIG. 20. Execution loops from step 295 back to step 292 . In this fashion, the secondary file system is remotely updated from the primary site concurrently with read-only access at the secondary site.
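  • Steps 292 to 295 amount to a pipeline of producer and consumer loops over the pair of save volumes, roughly as in the following sketch (all method names are assumed):

      def primary_loop(primary_fs, primary_save):
          while True:                                    # step 292
              primary_save.write_chunk(primary_fs.collect_modifications())

      def transport_loop(primary_save, secondary_save):
          while True:                                    # step 293
              secondary_save.write_chunk(primary_save.next_chunk())

      def secondary_loop(secondary_save, secondary_fs):
          while True:
              chunk = secondary_save.next_chunk()        # step 294
              secondary_fs.integrate(chunk)              # step 295: read-only
                                                         # views stay consistent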
  • FIG. 22 shows layered programming 300 for a primary data mover. It is desired to use layered programming in accordance with the International Standard Organization's Open Systems Interconnection (ISO/OSI) model for networking protocols and distributed applications. As is well known in the art, this OSI model defines seven network layers, namely, the physical layer, the data link layer, the network layer, the transport layer, the session layer, the presentation layer, and the application layer.
  • the layered programming 300 includes a conventional TCP/IP transport layer 301 .
  • the layers above the TCP/IP transport layer 301 include a replication control protocol (RCP) session layer 302 , a volume multicast presentation layer 303 , and an IP-FS (file system) copy send-thread 304 and an IP-replication send-thread 305 at the program layer level.
  • the layered programming 300 also includes a management and configuration command interpreter (MAC_CMD) 306 .
  • the RCP layer 302 provides an application program interface (API) for multicasting data over TCP/IP.
  • RCP provides callback, acknowledgement (ACK), and resumption of aborted transfers.
  • RCP provides the capability for a remote site to replicate and rebroadcast remote copy data.
  • the remote site functions as a router when it rebroadcasts the remote copy data.
  • RCP can also be used to replicate data locally within a group of data movers that share a data storage system.
  • the command interpreter 306 initiates execution of a replication module 310 if the replication module is not presently in an active mode. Then, the command interpreter 306 invokes a snapshot manager 308 to create a snapshot copy 309 of a primary file system volume 307 . When the snapshot copy is created, the snapshot manager 308 obtains a current delta set number from the replication module 310 and inserts the current delta set number into the metadata of the snapshot. The current delta set number for the snapshot is all that the secondary needs to identify modifications that are made subsequent to the creation of the snapshot.
  • any number of new remote copies can be created at various times during operation of the replication module, with the snapshot process operating concurrently and virtually independent of the replication module. For example, whenever synchronization of a remote copy is lost, for example due to a prolonged disruption of network traffic from the primary site to the remote site, a new remote copy can be created to replace the unsynchronized remote copy.
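  • A condensed sketch of this interaction follows (the interfaces are assumptions); the delta set number written into the snapshot metadata is the only coordination state the secondary needs:

      def create_new_remote_copy(snapshot_manager, replication_module, primary_volume):
          if not replication_module.active:
              replication_module.start()        # ensure delta sets are produced
          snapshot = snapshot_manager.create(primary_volume)
          # Record which delta set the snapshot precedes; after receiving the
          # snapshot, the secondary replays delta sets from this number onward.
          snapshot.metadata["delta_set_number"] = replication_module.current_delta_set
          return snapshot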
  • the command interpreter 306 initiates execution of an instance of the IP-FS copy send-thread 304 .
  • the instance of the IP-FS copy send-thread 304 reads data from the snapshot copy 309 and calls upon the volume multicast layer 303 to multicast the remote copy data to all of the secondary data movers where the remote copies are to be created.
  • This can be a copy by extent, so there is no copying of invalid or unused data blocks.
  • the volume multicast layer 303 is given a copy command (@vol., length) specifying a volume and an extent to be copied, and may also specify a group of destinations (an RCP group).
  • the snapshot copy 309 of the primary file system identifies the next valid block to be copied, and the number of valid contiguous blocks following the next block. These blocks are copied at the logical level, so it does not matter what physical structure is used for storing the secondary file system at the secondary site. The copying is done locally, or by remote copy, for example by transporting the data block over IP.
  • the volume multicast layer 303 invokes the RCP layer 302 to transport each data block.
  • whenever a block of the primary file system volume 307 is modified, the replication module 310 logs an indication of the modified block in a log 314 and later assembles the modification into a delta set chunk written to a primary save volume 311 .
  • the replication module 310 logs the indications in the log 314 on a priority or foreground basis as data is written to the primary file system volume 307 , and also logs boundaries between delta sets.
  • the replication module 310 later reads the log 314 , reads the indicated modifications from the primary file system volume 307 , assembles the indicated modifications into delta set chunks on a background basis, and stores the delta set chunks in a save volume chunk area of the save volume 311 .
  • the log is in the form of a queue of two bit-map tables, a new one of the tables being written to coincident with write operations upon the primary file system volume 307 , and an old one of the tables being read to determine blocks to copy from the primary file system to create a new delta set in the save volume 311 .
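  • The queue of two bit-map tables might be modeled as in the following sketch (block-granular boolean maps; the class is illustrative rather than the actual data structure):

      class ModificationLog:
          def __init__(self, nblocks):
              self.new = [False] * nblocks    # written coincident with writes
              self.old = [False] * nblocks    # read to build the next delta set

          def note_write(self, block_no):
              self.new[block_no] = True       # foreground/priority logging

          def start_delta_set(self):
              # At a delta set boundary, the "new" table becomes the "old"
              # table to be drained into the save volume.
              self.old, self.new = self.new, [False] * len(self.new)

          def blocks_to_copy(self):
              return [i for i, bit in enumerate(self.old) if bit]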
  • the replication module 310 updates the save volume mailbox area 312 by storing each delta set chunk definition (@vol., length).
  • the IP-replication send-thread instance 305 polls the save volume mailbox area 312 to see if any delta set chunks have been stored in the save volume chunk area 313 . If so, then the thread instance calls upon the volume multicast layer 303 to multicast the delta set chunks to the data movers that manage the storage of the respective remote file system copies. For example, for each delta set chunk, the IP-replication send-thread instance 305 issues a volume multicast command to the volume multicast layer 303 . When the chunk multicast is completed, the IP-replication send-thread instance 305 updates its context on the save volume 311 in the mailbox area 312 .
  • At reboot after an interruption of multicast of a chunk, the IP-replication send-thread instance is able to restart the multicast of the chunk.
  • the IP-replication send-thread instance also is responsible for retrying transmission of the chunk whenever the connection with the secondary is interrupted.
  • FIG. 23 shows the layered programming 320 for a secondary data mover.
  • the programming includes a TCP/IP layer 321 , an RCP layer 322 , a volume multicast layer 323 , and a management and configuration command interpreter (MAC_CMD) 324 .
  • the volume multicast layer 323 writes remote copy data from the primary data mover to the secondary file system volume 325 , and concurrently writes modifications (delta set chunks) from the primary data mover to a save volume chunk area 326 of a secondary save volume 327 .
  • the volume multicast layer performs the steps in FIG. 4 described above to write the modifications to the save volume chunk area 326 .
  • a header for the changes in a next version of the delta set is sent last, because there is no guarantee of the order of receipt of the IP packets.
  • the header of the delta set includes a generation count, the number of delta blocks for the next version of the delta set, a checksum for the header, and a checksum for the data of all the delta blocks.
  • the receiver checks whether all of the changes indicated in the header have been received.
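  • The receiver's completeness check might be sketched as follows, using CRC-32 as a stand-in checksum; the header layout beyond the fields named above is an assumption:

      import zlib

      def delta_set_complete(header, received_blocks):
          # Validate the header itself (checksum computed over the header
          # bytes with the checksum field excluded).
          if zlib.crc32(header.raw_bytes) != header.header_checksum:
              return False
          # Some IP packets may still be outstanding.
          if len(received_blocks) != header.num_delta_blocks:
              return False
          # Validate the data of all the delta blocks together.
          return zlib.crc32(b"".join(received_blocks)) == header.data_checksum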
  • a playback module 328 is then activated to read the modifications from the save volume chunk area 326 and integrate them into the secondary file system volume 325 .
  • the playback module 328 for example, performs the steps in FIGS. 6 to 7 as described above. From each delta-set chunk in the save volume area 326 , the playback module 328 gets the block address and number of contiguous blocks to be written to the secondary file system volume.
  • An access module 329 provides read-only access to a consistent view of the secondary file system in the secondary file system volume 325 .
  • the access module 329 for example, performs the steps shown in FIG. 5 as described above.
  • FIG. 24 shows a procedure executed by the primary site of FIG. 22 to perform replication of the primary file system.
  • In a first step 341, the primary file system is paused to make it consistent. Migration of the primary file system to the secondaries can then be started using a remote copy facility or snapshot manager.
  • In step 342, concurrent write access to the primary file system is resumed, and all modifications made on the primary file system are logged at the volume level on a priority or foreground basis when each modification is made.
  • a background process of delta-set creation is initiated.
  • Two configurable triggers specify the rate of delta set creation: a timeout parameter and a high water mark parameter.
  • whenever delta set creation is initiated, the current time, as indicated by a real-time clock, is added to a configurable timeout interval to produce the timeout parameter.
  • the high water mark specifies an amount of modified data, in megabytes.
  • whichever trigger occurs first initiates the creation of a new delta set.
  • the replication module creates the delta set by pausing the primary file system, copying the modified blocks from the primary file system to the delta set volume, and then resuming the primary file system. By logging indications of the modified blocks and later copying the modified blocks, multiple modifications to the same block are represented and transported once during a single delta set.
  • In step 343, the background process of delta set creation is temporarily suspended, for example, by placing the process on a task queue that is periodically serviced.
  • In step 344, execution of the delta set creation process is resumed.
  • In step 345, the modification size is compared to the high water mark. If the high water mark is not exceeded, then execution continues to step 346.
  • In step 346, the present value of the real-time clock is compared to the timeout parameter. If the timeout parameter has not been exceeded, then execution loops back to step 343. Otherwise, execution continues to step 347. Execution also branches to step 347 from step 345 if the modification size is greater than the high water mark.
  • In step 347, the primary file system is paused.
  • In step 348, a new delta set is created by starting the copying of modified blocks from the primary file system volume to the new delta set.
  • In step 349, the logging of new modifications into a new table is started.
  • In step 350, the timeout and high water mark triggers are re-armed. In other words, a new value for the timeout parameter is computed as the current real time plus the configurable timeout interval, and the modification size is reset to indicate the size of the new modifications.
  • In step 351, the primary file system is resumed. Execution loops from step 351 back to step 343 to suspend the background process of delta set creation.
  • In one implementation, the primary file system could remain paused and not resumed in step 351 until the copy process begun in step 348 is completed.
  • Preferably, however, the copy process begun in step 348 is a snapshot copy process, so that write access to the primary file system may resume in step 351 before the copy process has been completed.
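  • Steps 343 to 351 condense into the following sketch of the background task (the file system, log, and save volume handles are assumptions):

      import time

      def delta_set_creation_task(fs, log, save_volume, timeout_interval, high_water_mark):
          timeout = time.time() + timeout_interval
          while True:
              fs.task_queue_wait()                         # steps 343-344
              if (log.modification_size < high_water_mark  # step 345
                      and time.time() < timeout):          # step 346
                  continue
              fs.pause()                                   # step 347
              log.start_delta_set()                        # step 349: swap tables
              save_volume.begin_snapshot_copy(log.blocks_to_copy())  # step 348
              timeout = time.time() + timeout_interval     # step 350: re-arm
              log.modification_size = 0
              fs.resume()                                  # step 351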
  • with the modification log being a queue of two bit-map tables, when a write access to a block in the primary file system is requested, the old bit map is accessed on a priority basis.
  • If the corresponding bit in the old bit map indicates a modified block in the primary file system volume not yet copied to the save volume, then the block is copied on a priority basis to the save volume before the new write data is written to the primary file system volume. As soon as a modified block has been copied from the primary file system volume to the save volume, the corresponding bit in the old bit map is cleared. In this fashion, at the completion of the copy process, the entire old table will be in a reset state, ready to be used as the next new table.
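  • The priority copy-on-first-write path against the old bit map might be sketched as follows, continuing the two-table model sketched above (names assumed):

      def write_block(fs_volume, save_volume, log, block_no, data):
          if log.old[block_no]:
              # Modified block not yet copied: copy it to the save volume on
              # a priority basis before overwriting it, then clear the bit.
              save_volume.copy_block(block_no, fs_volume.read_block(block_no))
              log.old[block_no] = False
          log.note_write(block_no)             # log the new modification
          fs_volume.write_block(block_no, data)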
  • the replication module sets the save volume mailbox area to show that a new delta set is ready for transmission.
  • the IP-replication send-thread finds that the new delta set is ready for transmission, and invokes the volume multicast layer to transmit the delta set to the secondary sites.
  • execution loops back to step 343 .
  • FIG. 25 shows a flow chart of the overall procedure of creating a new remote copy, either for the first time at a secondary site or as a replacement for a remote copy that needs to be resynchronized with the primary file system.
  • In a first step 352, the snapshot manager creates a snapshot copy of the primary file system at the end of any pending transaction upon the primary file system (e.g., when the primary file system becomes consistent after it is paused in step 341 of FIG. 24 or in step 347 of FIG. 24).
  • the replication module independently writes any subsequent modifications into a current delta set for the next transaction.
  • In step 353, the snapshot manager obtains the current delta set number from the replication module and inserts it into the metadata of the snapshot copy.
  • In step 354, the IP-FS copy send-thread is started in order to send volume extents of the snapshot copy to the secondary data mover, by invoking the volume multicast layer for each extent.
  • In step 355, when the IP-FS copy send-thread is finished, the primary data mover sends a “start playback” signal to the secondary data mover.
  • In step 356, the secondary data mover receives the “start playback” signal from the primary data mover, and starts the playback module.
  • In step 357, the playback module begins playback from the delta set indicated by the delta set number in the snapshot metadata.
  • the playback module ( 328 in FIG. 23) at the secondary site integrates the delta set modifications into secondary file system.
  • the modifications can be integrated into the secondary file system, for example, by pausing the secondary file system, copying the modifications from the secondary save volume into the secondary file system, and resuming the secondary file system.
  • a timeout interval and a high water mark value can be configured for the secondary site, so that the modifications may be integrated into the secondary file system at a rate less frequent than the rate at which the new delta sets appear in the secondary save volume.
  • the modifications from the secondary save volume would not be integrated into the secondary file system until the timeout time is reached unless the amount of modifications in the save volume reaches the high water mark.
  • the integration of the modifications can be performed concurrently with read-only access to a consistent view of the secondary file system as shown in FIGS. 3, 6, and 7 , as described above.
  • FIG. 26 shows a flowchart of the IP-replication send-thread ( 305 in FIG. 22).
  • In a first step 361, the thread polls the primary save volume mailbox area. If the mailbox area indicates that there is not a new delta set chunk in the primary save volume area, then the thread is finished for the present task invocation interval. Execution of the thread is suspended in step 363 , and resumed in step 364 at the next task invocation interval.
  • In step 365, the IP-replication send-thread issues a volume multicast command to broadcast or forward the delta set chunk to specified destination data movers.
  • In step 366, if the multicast has been successful, then execution branches to step 367.
  • In step 367, the IP-replication send-thread updates the primary save volume mailbox to indicate completion of the multicast, and execution continues to step 363 to suspend execution of the thread until the next task invocation interval.
  • In step 366, if the multicast is not successful, then execution continues to step 368 to test whether more than a certain number (N) of retries have been attempted. If not, then execution loops back to step 365 to retry the multicast. If more than N retries have been attempted, then execution continues from step 368 to step 369. In step 369, the IP-replication send-thread logs the error, and then in step 370, passes execution to an error handler.
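  • One task invocation of this thread reduces to the following sketch, in which the value of N, the mailbox, and the multicast interface are assumptions:

      import logging

      MAX_RETRIES = 3                                # the certain number N

      def send_thread_invocation(mailbox, volume_multicast, error_handler):
          chunk = mailbox.poll_new_chunk()           # steps 361-362
          if chunk is None:
              return                                 # steps 363-364: wait
          for _ in range(1 + MAX_RETRIES):
              if volume_multicast.multicast(chunk):  # step 365
                  mailbox.mark_completed(chunk)      # step 367
                  return
          logging.error("multicast failed after %d retries", MAX_RETRIES)  # step 369
          error_handler(chunk)                       # step 370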
  • FIG. 27 shows various objects defined by the volume multicast layer.
  • the volume multicast layer provides multicast service to instances of a VolMCast object 370 representing a volume multicast relationship between a respective primary file system volume specified by a volume name (volumeName) and a respective group of secondary data movers specified by an RCP group name (rcpgpeName).
  • one or more RCP groups are defined in response to configuration commands such as:
  • This configuration command adds the IP address (IP) of a specified destination data mover (server_name) to an RCP group.
  • a specified data mover can be defined to be a primary data mover with respect to the RCP group (a relationship called a MultiCastNode) in response to a configuration command such as:
  • In this command, server_name is the name for the primary data mover, groupname is the name of a configured RCP group, and IP is the IP address of the primary data mover.
  • the VolMCast object can then be built on top of a MultiCastNode object.
  • the additional information required for the VolMCast object is, on the sender side, the primary or source file system volume and on each receiver side, the secondary or destination file system volume. For flexibility, it is permitted to specify a different volume name on each secondary data mover. By specifying the destination volume names during creation of the VolMCast object, it is not necessary to specify the destination volume names at each copy time.
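  • For illustration, the information carried by a VolMCast object might be modeled as in the following sketch; the field names mirror the description, while the volume and server names are purely hypothetical:

      from dataclasses import dataclass, field

      @dataclass
      class VolMCast:
          volume_name: str          # primary (source) file system volume
          rcp_group_name: str       # configured RCP group of secondaries
          # Destination volume name per secondary data mover, fixed when the
          # object is created so it need not be given at each copy time.
          destination_volumes: dict = field(default_factory=dict)

      example = VolMCast(volume_name="fsvol1", rcp_group_name="g1",
                         destination_volumes={"cel2-ser2": "fsvol1-copy",
                                              "cel9-ser1": "fsvol1-copy"})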
  • the VolMCast object is defined by configuration commands to the primary data mover such as:
  • In this command, <server_name> is the name of the MultiCast Node.
  • an IP-replication service can be configured for the object upon the primary data mover. Then the primary data mover will respond to commands for starting the replication service and stopping the replication service upon the VolMCast object.
  • when the replication service is stopped, the secondary file system is left in a consistent state. In other words, if a replay was in progress, the stop will complete when the replay is finished.
  • the primary data mover may respond to additional commands for creating a new delta set on demand, updating the replication policy (high water mark and timeout interval parameters) on the primary file system or secondary file systems, and defining persistency of the replication process upon remount or reboot of the primary file system or any one of the secondary file systems. For example, at reboot the replication service is re-started on the primary file system and the secondary file system in the state it was in at unmount or shutdown. A recovery of the replication context happens at reboot or on remount. The replica recovery is executed before the primary and secondary file systems are made available for user access. This allows all modifications during the recovery of the primary file system to be logged by the replication service.
  • the volume multicast layer is responsive to a number of commands 371 from higher layers in the protocol stack.
  • an existing VolMCast object can be opened for either a sender mode or a receiver mode.
  • An opened VolMCast object can be closed.
  • A control block (CB) can also be sent over an opened VolMCast object.
  • Control blocks may specify various operations upon the secondary volumes of the VolMCast object, such as cluster file system commands for performing operations such as invalidations, deletions, renaming, or other changes in the configuration of the objects of the file system upon all copies (local or remote) of the file system.
  • RCP is used for the broadcast or forwarding of the cluster file system commands to all the data movers that are to operate upon the local or remote copies of the file system, and for returning acknowledgement of completion of the operations upon all of the copies of the file system.
  • the volume multicast layer defines a VolMCastSender object 372 instantiated when a VolMCast instance is opened in the sending mode, and a VolMCastReceiver object 373 instantiated when a VolMCast instance is opened in a receiving mode.
  • the VolMCastSender object class and the VolMCastReceiver object class inherit properties of the VolMCast object class.
  • an instance of a VolMCastCopy thread accesses the delta sets from a primary save volume 375 to produce a write stream 376 of blocks sent down to the RCP layer.
  • an instance of a VolMCastReceiver thread 377 is instantiated and executed to receive a read stream 378 of blocks and write the copied delta sets into a secondary save volume 379 .
  • An instance of an acknowledgement thread 380 returns an acknowledgement 381 of completion of copying of a delta-set for an extent to the secondary file system. The acknowledgement is sent down to the RCP layer of the secondary data mover.
  • the RCP layer sends the acknowledgement 382 to an instance of an acknowledgement thread 383 .
  • RCP is a session-layer protocol, for replication from one primary to multiple secondary sites. Control is initiated by the primary, except when recovering from aborted transfers. RCP uses TCP between the primary and secondary for control and data. Network distribution is by an application-level multicast (ALM) using the RCP as a forwarder. Port sharing with HTTP is used for crossing firewalls.
  • RCP may support other replication applications in addition to 1-to-N IP-based replication for wide-area distribution of read-only data. These other applications include 1-to-N volume mirroring, cluster file system commands, remote file system replication, and distribution and replication of other commands that may be recognized by the data movers.
  • the 1-to-N volume mirroring is a simplification of 1-to-N IP-based replication for wide-area distribution of read-only data, because the volume mirroring need not synchronize a remote volume with any consistent version of the primary volume until the remote volume needs to be accessed for recovery purposes.
  • Remote file system replication also uses RCP for broadcast or forwarding an application command to a remote data mover to initiate a replication of a file system managed by the remote data mover.
  • RCP may broadcast or forward other commands recognized by data movers, such as iSCSI or remote-control type commands for archival storage.
  • RCP could broadcast or forward remote control commands of the kind described in Dunham, U.S. Pat. No. 6,353,878 issued Mar. 5, 2002 entitled “Remote Control of Backup Media in a Secondary Storage Subsystem Through Access to a Primary Storage Subsystem,” incorporated herein by reference.
  • the RCP forwarder is composed of two RCP sessions: an outbound session at the primary, and an inbound session at the secondary.
  • the inbound RCP session receives a group name and looks up the group in a routing table. If routes for the group exist in the routing table, then an RCP forwarder is created at the secondary, including a data path by pointer passing from an “in” session to an “out” session.
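  • A sketch of that data path follows, with queue-like stand-ins for the sessions; pointer passing is modeled by handing over buffer references rather than copying the payload:

      def on_inbound_session(rcp_layer, group_name, routing_table):
          # Look up the group; if no routes exist, nothing is forwarded here.
          routes = routing_table.get(group_name)
          if not routes:
              return None
          out_session = rcp_layer.create_outbound_session(routes)

          def forward(buffer_ref):
              # Pointer passing: the "out" session sends the same buffer the
              # "in" session received, so the data itself is never copied.
              out_session.send(buffer_ref)

          return forward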
  • An RCP group may be configured to include application-level multicast (ALM) topology.
  • ALM route configuration commands begin with an identifier number for the network file server (“cel”) that contains the forwarder data mover, and an identifier number (“ser”) for the forwarder data mover in the network server.
  • the configuration commands end with a “nexthop” specification of an immediate destination data mover:
  • the forwarder data mover adds the “nexthop” specification to an entry for the RCP group in the routing table in the forwarder data mover. This entire entry can be displayed by the following configuration command:
  • cel2-ser2 rcproute display
  • the entry is displayed, for example, as a list of the “nexthop” destination data movers.
  • the entry can be deleted by the following configuration command:
  • cel2-ser2 rcproute delete
  • Each immediate destination data mover may itself be configured as a forwarder in the RCP group.
  • RCP commands and data will be forwarded more than once, through a chain of forwarders.
  • the set of possible RCP routes from a primary or forwarder in effect becomes a tree or hierarchy of destinations.
  • the ALM commands may also include commands for creating sessions and sending control blocks or data.
  • the following ALM command creates a session and sends application data to all destinations in group “g1” from cel1-ser2 from a file (named “filename”) using a test application (named “rcpfiletest”).
  • FIG. 28 shows the RCP collector service 390 at a primary site.
  • the programming for the RCP collector service includes an RCP session manager 391 , collector and worker threads 392 , and a single-thread RCP daemon 393 .
  • the RCP session manager 391 responds to requests from higher levels in the protocol stack, such as a request from an application 394 to open an RCP pipe 395 between the application 394 and the RCP collector service 390 .
  • the application 394 may then send to the session manager 391 requests to setup sessions with RCP groups.
  • a session queue 396 stores the state of each session, and a control block queue 397 keeps track of control blocks sent via TCP/IP to the secondary data movers in the RCP groups.
  • An RCP routing table 398 identifies the immediate destinations of each RCP group to which the TCP/IP messages from the RCP collection service are to be sent, as well as any other destinations to which the messages will be forwarded.
  • TCP port: 80 is opened in both directions (i.e., for input and output).
  • the single thread RCP daemon 393 is used for interfacing with this TCP port: 80 .
  • FIG. 29 shows the RCP collector service 400 at a secondary site.
  • the RCP collector service at the secondary site is similar to the RCP collector service at the primary site, in that it includes an RCP session manager 401 , collector and worker threads 402 , a single thread RCP daemon 403 for access to/from TCP port: 80 , an RCP session state queue 406 , an RCP control block queue 407 , and an RCP routing table 408 .
  • the primary difference between the RCP collector service at the secondary site and the RCP collector service at the primary site is in the collector and worker threads 402 .
  • the RCP commands and data are received from the TCP port: 80 instead of from the application 404 .
  • the application 404 is the consumer of the RCP data, instead of a source for RCP data.
  • the RCP collector service 400 at the secondary site may also serve as a forwarder for RCP commands, and therefore the RCP collector service and worker threads 402 at the secondary site include a forwarder thread that does not have a similar or complementary thread in the RCP collector service at the primary site.
  • an application 404 can initialize the RCP collector service so that the RCP collector service will call back the application upon receipt of certain RCP commands from TCP port: 80 . For example, if a new connection command is received from TCP port: 80 , then the RCP daemon 403 forwards the new connection command to the RCP session manager. The RCP session manager 401 recognizes that this connection command is associated with an application 404 at the secondary site, opens an RCP pipe 405 to this application, and calls the application 404 indicating that the RCP pipe 405 has been opened for the RCP session. (The volume multicast receiver thread 377 of FIG. 27 is an example of such an application.) The application 404 returns an acknowledgement.
  • the session manager creates a new RCP session, and places state information for the new session on the RCP session queue 406 .
  • RCP control blocks and data may be received for the session from the TCP port: 80 .
  • the data may be forwarded to the application, or to a file specified by the application.
  • RCP control blocks to be executed by the RCP collector service 400 may be temporarily placed on the control block queue 407 .
  • RCP control blocks or data intended for other secondary sites may be forwarded to the intended secondary sites.
  • FIG. 30 shows further details of the forwarding of RCP commands and data by a data mover 430 identified as Cel2-Ser1.
  • the data mover 430 is programmed with a TCP/IP layer 431 for communication with the IP network 220 , and an RCP layer 432 over the TCP/IP layer.
  • the RCP layer 432 creates an inbound session 433 and an outbound session 434 .
  • the inbound session 433 receives RCP commands from the TCP/IP layer 431 .
  • the TCP/IP data stream is retained in a data buffer 435 .
  • the inbound session 433 performs a lookup for the group in a routing table 436 .
  • the routing table 436 includes a copy of all of the routing information for each group of which the data mover 430 is a member.
  • In this example, the primary data mover sends RCP commands to at least the data movers CEL2-SER1 (i.e., the data mover 430 ) and CEL9-SER1.
  • the inbound session 433 creates an outbound session 434 and creates a TCP/IP data path from the inbound session 433 to the outbound session 434 by passing pointers to the data in the data buffer.
  • the outbound session 434 invokes the TCP/IP layer 431 to multicast the TCP data stream in the data buffer 435 over the IP network 220 to the data movers CEL3-SER1 and CEL7-SER1.
  • the data mover CEL3-SER1 in succession forwards the RCP commands to data movers CEL4-SER1 and CEL5-SER1.
  • the data mover CEL2-SER1 ( 430 ) does not need to know that the data mover CEL3-SER1 forwards the RCP commands to data movers CEL4-SER1 and CEL5-SER1, but if the data mover CEL2-SER1 ( 430 ) fails to receive an acknowledgement from CEL3-SER1, then it can minimize the impact of a failure of CEL3-SER1 by forwarding the RCP commands directly to CEL4-SER1 and CEL5-SER1 until the failure of CEL3-SER1 is corrected.
  • FIG. 31 shows a flowchart of how the RCP collector service at the secondary site processes an inbound RCP session command.
  • In a first step 411, the RCP collector service receives a session command.
  • In step 412, if this session command is not a command to be forwarded to other secondary sites, then execution branches to step 413 to execute the action of the command, and the processing of the session command is finished.
  • If the session command is a command to be forwarded to other secondary sites, then execution continues from step 412 to step 414.
  • In step 414, the RCP collector service gets the RCP group name from the session command. Then, in step 415, the RCP collector service looks up the group name in the RCP routing table ( 408 in FIG. 29). If the group name is not found, then execution branches from step 416 to step 417. In step 417, the RCP collector service returns an error message to the sender of the session command.
  • If the group name is found in the RCP routing table, then execution continues from step 416 to step 418.
  • In step 418, the RCP collector service forwards the action of the session command to each secondary in the group that is an immediate destination of the forwarder (i.e., the data mover that is the secondary presently processing the RCP session command). This is done by instantiating local replication threads or creating outbound sessions for forwarding the action of the session command to each such secondary.
  • After step 418, processing of the RCP session command is finished.
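  • Steps 411 to 418 reduce to a dispatch routine along the lines of the following sketch (the routing table is keyed by group name, and the method names are assumptions):

      def process_session_command(service, command):
          if not command.is_forwarded():                 # step 412
              return command.execute()                   # step 413
          group = command.rcp_group_name                 # step 414
          routes = service.routing_table.get(group)      # step 415
          if routes is None:                             # steps 416-417
              return service.send_error(command.sender,
                                        "unknown RCP group: " + group)
          for destination in routes:                     # step 418
              if destination.is_local:
                  service.start_local_replication(command, destination)
              else:
                  service.open_outbound_session(destination).forward(command)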
  • FIG. 32 shows an example of forwarding and local replication.
  • the IP network 220 connects a primary data mover 421 to a network file server 422 and a secondary data mover 423 .
  • the network file server 422 includes three data movers 424 , 425 , and 426 , and storage 427 .
  • the primary data mover manages network access to a primary file system 428 .
  • the data mover 424 functions as a forwarder data mover.
  • the data mover 425 functions as a secondary data mover managing access from the network to a secondary file system (copy A) 429 .
  • the data mover 426 functions as a secondary data mover managing access from the network to a secondary file system (copy B) 430 .
  • the data mover 423 manages network access to a secondary file system (copy C) 431 .
  • when the primary data mover 421 updates the primary file system 428 , it multicasts the modified logical blocks of the file system volume over the IP network 220 to the forwarder data mover 424 and to the secondary data mover 423 .
  • the forwarder data mover 424 receives the modified blocks, and performs a local replication of the blocks to cause the secondary data mover 425 to update the secondary file system (copy A) 429 and to cause the secondary data mover 426 to update the secondary file system (copy B) 430 .
  • the forwarder data mover 424 has its volume multicast layer ( 323 in FIG. 23) save the modified blocks in a save volume 432 in the storage 427 , and then the forwarder data mover 424 sends replication commands to the local secondary data movers 425 and 426 .
  • Each local secondary data mover 425 , 426 has its playback module ( 328 in FIG. 23) replay the modifications from the save volume 432 into its respective secondary file system copy 429 , 430 .
  • FIG. 33 shows the sharing of the data mover's network TCP port: 80 ( 440 ) between HTTP and RCP.
  • This configuration is used in all data movers having the RCP collector service; i.e., primary, secondary, or forwarder.
  • the TCP data channel from TCP port: 80 ( 440 ) provides an in-order byte stream interface.
  • IP packets 444 for HTTP connections and IP packets 445 for RCP connections from the network 220 are directed to the data mover's TCP port: 80 ( 440 ).
  • the TCP port: 80 ( 440 ) is opened in both directions (i.e., input and output).
  • the data mover uses a level 5 (L5) filter 441 for demultiplexing the IP packets for the HTTP connections from the IP packets for the RCP connections based on an initial segment of each TCP connection.
  • the L5 filter hands the TCP connection off to either an HTTP collector service 442 or an RCP collector service 443 .
  • the RCP collector service 443 is the collector service 390 in the RCP primary of FIG. 28 or the RCP collector service 400 in an RCP secondary of FIG. 29.
  • if the initial segment of the TCP connection contains an HTTP request, then the L5 filter 441 directs the IP packets for the connection to the HTTP collector service 442 .
  • if the initial segment of the TCP connection contains “RCP/1.0”, then the IP packets for the TCP connection are directed to the RCP collector service 443 .
  • the connection could be split as is done in a conventional standalone IP switch.
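  • The demultiplexing decision of the L5 filter reduces to a peek at the first bytes of each accepted connection, as in this simplified socket-level sketch (the collector interfaces are assumptions):

      import socket

      def demux_connection(conn, http_collector, rcp_collector):
          # Peek at the initial segment without consuming it from the stream.
          initial = conn.recv(16, socket.MSG_PEEK)
          if initial.startswith(b"RCP/1.0"):
              rcp_collector.take_over(conn)    # RCP connection
          else:
              http_collector.take_over(conn)   # ordinary HTTP connection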
  • a replication control protocol (RCP) is layered over TCP/IP providing the capability for a remote site to replicate and rebroadcast blocks of the remote copy data to specified groups of destinations, as configured in a routing table.
  • a volume multicast layer over RCP provides for multicasting to specified volume extents of the blocks. The blocks are copied at the logical level, so that it does not matter what physical structure is used for storing the remote copies.
  • Save volumes buffer the remote copy data transmitted between the primary or secondary file system volume and the IP network, in order to ensure independence between the replication process, the IP transport method, and the primary file system being replicated.
  • the save volumes store the progress of the replication over the IP network so as to maintain the consistency of the replication process upon panic, reboot, and recovery.

Abstract

Consistent updates are made automatically over a wide-area IP network, concurrently with read-only access to the remote copies. A replication control protocol (RCP) is layered over TCP/IP providing the capability for a remote site to replicate and rebroadcast blocks of the remote copy data to specified groups of destinations, as configured in a routing table. A volume multicast layer over RCP provides for multicasting to specified volume extents of the blocks. The blocks are copied at the logical level, so that it does not matter what physical structure is used for storing the remote copies. Save volumes buffer the remote copy data transmitted between the primary or secondary file system volume and the IP network, in order to ensure independence between the replication process, the IP transport method, and the primary file system being replicated.

Description

    BACKGROUND OF THE INVENTION
  • 1. Limited Copyright Waiver [0001]
  • A portion of the disclosure of this patent document contains computer code listings to which the claim of copyright protection is made. The copyright owner has no objection to the facsimile reproduction by any person of the patent document or the patent disclosure, as it appears in the U.S. Patent and Trademark Office patent file or records, but reserves all other rights whatsoever. [0002]
  • 2. Field of the Invention [0003]
  • The present invention relates generally to data storage systems, and more particularly to network file servers. The present invention specifically relates to a network file server distributing remote copy data over a network using the Internet Protocol (IP). [0004]
  • 3. Description of the Related Art [0005]
  • Remote copy systems have been used for automatically providing data backup at a remote site in order to insure continued data availability after a disaster at a primary site. Such a remote copy facility is described in Ofek, U.S. Pat. No. 5,901,327 issued May 4, 1999, entitled “Bundling of Write Data from Channel Commands in a Command Chain for Transmission over a Data Link Between Data Storage Systems For Remote Data Mirroring,” incorporated herein by reference. This remote copy facility uses a dedicated network link and a link-layer protocol for 1:1 replication between a primary storage system and a secondary storage system. [0006]
  • More recently there has arisen a need for wide-area distribution of read-only data. This need typically arises when wide-area distribution of the read-only data would prevent remote users from overloading a local server, and would reduce signal transmission delay because the remote users could access remote copies nearer to them. The wide-area distribution of the read-only data is complicated by the need for consistent updates to the remote copies. It is desired for these updates to be made automatically over the wide-area network, and concurrently with read-only access to the remote copies. [0007]
  • SUMMARY OF THE INVENTION
  • In accordance with a first aspect, the invention relates to a method used in a data processing system having a plurality of host computers linked by an Internet Protocol (IP) network to a plurality of data storage systems. Each of the data storage systems has data storage and at least one data mover computer for moving data between the data storage and the IP network. The method distributes remote copy data over the IP network from a primary data mover computer to a plurality of secondary data mover computers. The method includes the primary data mover computer sending the remote copy data over the IP network to at least one forwarder data mover computer, and the forwarder data mover computer routing the remote copy data over the IP network to the plurality of secondary data mover computers. [0008]
  • In accordance with another aspect, the invention provides a data processing system. The data processing system includes a plurality of data storage systems linked by an Internet Protocol (IP) network for access by a plurality of host computers. Each of the storage systems has data storage and at least one data mover computer for moving data between the data storage and the IP network. Moreover, the data mover computers include means for distributing remote copy data over the IP network from a primary data mover computer to a plurality of secondary data mover computers by the primary data mover computer sending the remote copy data over the IP network to at least one forwarder data mover computer, and the forwarder data mover computer routing the remote copy data over the IP network to the plurality of secondary data mover computers. [0009]
  • In accordance with yet another aspect, the invention provides a server for an Internet Protocol (IP) network. The server is programmed with a routing table, a TCP/IP layer, and a replication control protocol (RCP) session layer over the TCP/IP layer. The routing table identifies destinations in the network for remote copy data. The replication control protocol session layer is programmed to produce an inbound session in response to the file server receiving remote copy data from a source in the IP network, and at least one outbound session for transmitting the remote copy data to a plurality of destinations identified in the routing table as destinations for the remote copy data from the source. [0010]
  • In accordance with still another aspect, the invention provides a primary data storage system for distributing remote copy data over an Internet Protocol (IP) network to at least one secondary data storage system in the IP network. The primary data storage system includes data storage and a data mover computer for moving data between the IP network and the data storage. The data storage includes a primary volume including a primary copy of the remote copy data, and a save volume used as a buffer between the primary volume and the IP network. The data mover computer is programmed with a TCP/IP layer, a replication control protocol (RCP) layer over the TCP/IP layer for transmitting blocks of data from the save volume over the IP network, and a replication module for writing modified blocks of the primary volume to the save volume. [0011]
  • In accordance with yet still another aspect, the invention provides a secondary data storage system for receiving remote copy data distributed over an Internet Protocol (IP) network from a primary data storage system. The remote copy data includes modified blocks of a primary volume in the primary data storage system. The secondary data storage system includes data storage and a data mover computer for moving data between the IP network and the data storage, wherein the data storage includes a secondary volume including a secondary copy of the primary volume, and a save volume used as a buffer between the IP network and the secondary volume for buffering the modified blocks in the remote copy data. The data mover computer is programmed with a TCP/IP layer, a replication control protocol (RCP) layer over the TCP/IP layer for transmitting the modified blocks of remote copy data from the IP network to the save volume, and a playback module for writing the modified blocks of the remote copy data from the save volume to the secondary volume. [0012]
  • In accordance with a final aspect, the invention provides a network file server for use in an Internet Protocol (IP) network. The network file server has data storage including a file system volume for storing a file system, and a TCP port for connection to the IP network to permit access from the IP network to the file system. The network file server is programmed with a series of protocol layers including a TCP/IP layer, a replication control protocol (RCP) layer, and a volume multicast layer. The TCP/IP layer provides access to the IP network through the TCP port in accordance with the standard Transmission Control Protocol. The replication control protocol (RCP) session layer is over the TCP/IP layer for transmission, forwarding, and reception of blocks of remote copy data in accordance with a replication control protocol in which the blocks of remote copy data are transmitted and forwarded to specified groups of destinations in the IP network. The network file server also has a routing table configured with the groups of destinations, and the RCP layer accesses the routing table to determine the destinations in the specified groups for transmission or forwarding. The volume multicast layer is over the RCP layer for transmission or reception of specified volume extents of blocks between the file system volume and the IP network.[0013]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Other objects and advantages of the invention will become apparent upon reading the following detailed description with reference to the accompanying drawings wherein: [0014]
  • FIG. 1 is a block diagram of a data processing system in which a primary data storage system servicing a primary host processor is linked to a secondary storage system servicing a secondary host processor to provide the secondary host processor uninterrupted read-only access to a consistent dataset concurrent with read-write access by the primary host processor; [0015]
  • FIG. 2 is a block diagram showing data flow through the data processing system of FIG. 1; [0016]
  • FIG. 3 is a block diagram showing control flow through the secondary data storage system of FIG. 1; [0017]
  • FIG. 4 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to respond to a write command received from the primary data storage system; [0018]
  • FIG. 5 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to respond to a read command received from the secondary host processor; [0019]
  • FIG. 6 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to respond to a transaction commit command from the primary data storage system; [0020]
  • FIG. 7 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to perform a background task of integrating revisions into secondary dataset storage in the secondary data storage system; [0021]
  • FIG. 8 is a block diagram of a preferred construction for the data processing system of FIG. 1, in which a pair of “delta volumes” are mirrored between a primary data storage system and a secondary data storage system in order to buffer transmission of write commands from the primary data storage system to the secondary data storage system; [0022]
  • FIG. 9 is a block diagram showing data flow in the data processing system of FIG. 8; [0023]
  • FIG. 10 is a block diagram of a delta volume in the data processing system of FIG. 8; [0024]
  • FIG. 11 is a block diagram of data structures in the secondary storage of the secondary data storage system in FIG. 8; [0025]
  • FIG. 12 is a flowchart of programming in a delta volume facility of the primary data storage system of FIG. 8 for remote transmission of write commands to the secondary data storage system; [0026]
  • FIG. 13 is a block diagram of an alternative embodiment of the invention, in which the data storage systems are file servers, and the write commands include all file system access commands that modify the organization or content of a file system; [0027]
  • FIG. 14 is a block diagram of a directory of file system revisions and storage of file system revisions for the system of FIG. 13; [0028]
  • FIG. 15 is a block diagram of an IP network including multiple hosts and multiple data mover computers; [0029]
  • FIG. 16 is a block diagram showing a primary data mover distributing remote copy data to multiple secondary data movers in the IP network by establishing a Transmission Control Protocol (TCP) connection with each of the secondary data movers; [0030]
  • FIG. 17 is a block diagram showing a primary data mover distributing remote copy data to multiple data movers through forwarder data movers; [0031]
  • FIG. 18 is a block diagram showing a shared save volume used to buffer local copy data transmitted from a primary data mover to a secondary data mover; [0032]
  • FIG. 19 is a block diagram showing a primary save volume and a secondary save volume; [0033]
  • FIG. 20 is a flowchart showing local replication in the system of FIG. 18; [0034]
  • FIG. 21 is a flowchart showing remote replication in the system of FIG. 19; [0035]
  • FIG. 22 is a block diagram of a primary site, including layered programming in a primary data mover; [0036]
  • FIG. 23 is a block diagram of a secondary site, including layered programming in a secondary data mover; [0037]
  • FIG. 24 is a flowchart of a process of replication at the primary site of FIG. 22; [0038]
  • FIG. 25 is a flowchart of a procedure for producing a new remote copy of a primary file system concurrent with ongoing replication and multicasting of modifications to the primary file system; [0039]
  • FIG. 26 is a flowchart of an IP-replication send-thread introduced in FIG. 22; [0040]
  • FIG. 27 is a block diagram of a volume multicast level in the data mover programming of FIG. 22 and FIG. 23; [0041]
  • FIG. 28 is a block diagram of the RCP level in the primary data mover programming of FIG. 22; [0042]
  • FIG. 29 is a block diagram of the RCP level in the secondary data mover programming of FIG. 23; [0043]
  • FIG. 30 is a block diagram of an RCP forwarder at the RCP level in a forwarder data mover; [0044]
  • FIG. 31 is a flowchart of an inbound RCP session in the secondary data mover; [0045]
  • FIG. 32 is a block diagram showing a forwarder data mover performing local replication; and [0046]
  • FIG. 33 is a block diagram showing the sharing of a data mover's single TCP port for RCP connections with Hypertext Transfer Protocol (HTTP) connections. [0047]
  • While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that it is not intended to limit the invention to the particular forms shown, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the invention as defined by the appended claims. [0048]
  • DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
  • The present invention relates to replication of remote copy data for Internet Protocol (IP) transmission. One application of the present invention is wide-area distribution of read-only data. For this application, it is desired to provide uninterrupted read-only access to remote copies of a consistent file system concurrent with read-write updating of the file system. The preferred method of providing such uninterrupted read-only access is to use a “delta set” mechanism described in Srinivasan et al., U.S. patent application Ser. No. 09/669,939, filed Sep. 26, 2000, which is commonly owned by the assignee of the present application. FIGS. 1 to 14 and the corresponding written description in the present application have been reproduced from Ser. No. 09/669,939. [0049]
  • Uninterrupted Read-Only Access to a Remote Copy of a Consistent File System Concurrent with Read-Write Updating of a Primary Copy of the File System. [0050]
  • With reference to FIG. 1, there is shown a data processing system in which a primary data storage system 20 servicing a primary host processor 21 is connected via a transmission link 22 to a secondary storage system 23 servicing a secondary host processor 24. The primary data storage system 20 includes a storage controller 25 controlling access to primary storage 26, and the secondary data storage system 23 has a storage controller 27 controlling access to secondary storage 28. The storage controller 25 is programmed, via a program storage device such as a floppy disk 29, with a remote mirroring facility 30, which transmits write commands from the primary host processor 21 over the link 22 to the storage controller 27 in the secondary storage system. The storage controller 27 receives the write commands and executes them to maintain, in the secondary storage 28, a copy of data that appears in the primary storage 26 of the primary data storage system. Further details of a suitable remote mirroring facility are disclosed in Ofek et al., U.S. Pat. No. 5,901,327 issued May 4, 1999, incorporated herein by reference. [0051]
  • In accordance with an aspect of the present invention, the storage controller 27 in the secondary data storage system is programmed with a concurrent access facility for providing the secondary host processor 24 uninterrupted read-only access to a consistent dataset in the secondary storage 28 concurrent with read-write access by the primary host processor. For example, the concurrent access facility 31 is loaded into the storage controller 27 from a program storage device such as a floppy disk 32. The concurrent access facility 31 is responsive to the write commands from the primary data storage system, and read-only access commands from the secondary host processor 24. The concurrent access facility 31 is also responsive to transaction commit commands, which specify when the preceding write commands will create a consistent dataset in the secondary storage 28. The transaction commit commands originate from the primary host processor 21, and the storage controller 25 forwards at least some of these transaction commit commands over the link 22 to the storage controller 27. [0052]
  • FIG. 2 is a block diagram showing data flow through the data processing system of FIG. 1. The primary data storage system 20 stores a dataset 41 in primary storage, and the secondary data storage system 23 maintains a copy of the dataset 42 in secondary storage. The dataset, for example, could be a set of volumes, a single volume, a file system, a set of files, or a single file. Initially, each of the datasets 41 and 42 is empty, or they are identical because they are loaded from the same external source, or the dataset 42 is copied from the dataset 41 before any write operations are permitted upon the dataset 41. Subsequently, write operations by the primary host processor 21 cause write data to be written to the dataset 41 in primary storage, and read operations by the primary host processor 21 cause read data to be read from the dataset 41 in primary storage. In addition, the primary data storage system forwards the write data from the primary host processor 21 over the link 22 to the secondary data storage system 23. A first switch 45 directs write data from the link 22 alternately to either a first storage “A” of dataset revisions 43, or a second storage “B” of dataset revisions 44. A second switch 46 alternately directs write data to the dataset secondary storage 42 from either the first storage “A” of dataset revisions 43, or the second storage “B” of dataset revisions. The switches 45 and 46 are linked so that when the first switch 45 selects the first storage “A” of dataset revisions for receiving write data from the link 22, the second switch 46 selects the second storage “B” of dataset revisions for transmitting write data to the dataset secondary storage 42. Conversely, when the first switch 45 selects the second storage “B” of dataset revisions for receiving write data from the link 22, the second switch 46 selects the first storage “A” of dataset revisions for transmitting write data to the dataset secondary storage 42. [0053]
  • To provide the secondary host processor with uninterrupted read-only access to a consistent dataset, the switches 45 and 46 are toggled in response to receipt of a transaction commit command received over the link 22 from the primary data storage system. Moreover, the switches 45 and 46 are not toggled unless all of the revisions in the read-selected storage “A” or “B” of dataset revisions have been transferred to the dataset secondary storage 42, and unless all of the updates since the last transaction commit command have actually been written from the link 22 into the write-selected storage “A” or “B” of dataset revisions. (For the switch positions in FIG. 2, the storage “A” of dataset revisions 43 is write-selected, and the storage “B” of dataset revisions is read-selected.) Therefore, the combination of the dataset revisions in the read-selected storage “A” or “B” of dataset revisions with the dataset in the dataset secondary storage represents a consistent dataset. Just after the switches 45 and 46 are toggled, the secondary data storage system begins a background process of reading dataset revisions from the read-selected storage “A” or “B” of dataset revisions, and writing the updates into the dataset secondary storage. Moreover, at any time the secondary host processor 24 may read any dataset revisions from the read-selected storage “A” or “B” of dataset revisions. If a dataset revision is not found in the read-selected storage “A” or “B” of dataset revisions for satisfying a read command from the secondary host processor 24, then read data is fetched from the dataset secondary storage 42. [0054]
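  • By way of illustration only, the mechanism of the linked switches 45 and 46 can be modeled by the following sketch, written here in Python. The class name and the use of an in-memory dictionary to stand in for each storage area and its directory are our own assumptions, not part of the described embodiment:

    class RevisionBuffer:
        def __init__(self):
            self.revisions = [{}, {}]  # storage "A" 43 and storage "B" 44
            self.base = {}             # dataset secondary storage 42
            self.flag = 0              # 0: "A" read-selected, "B" write-selected

        @property
        def write_selected(self):      # position of the first switch 45
            return self.revisions[self.flag ^ 1]

        @property
        def read_selected(self):       # position of the second switch 46
            return self.revisions[self.flag]

        def write(self, address, data):
            # FIG. 4, steps 61 to 67: update in place if the address is already
            # in the write-selected directory, else allocate a new entry.
            self.write_selected[address] = data

        def read(self, address):
            # FIG. 5, steps 71 to 78: try the read-selected revisions first,
            # then fall back to the dataset secondary storage.
            if address in self.read_selected:
                return self.read_selected[address]
            return self.base[address]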
  • One advantage of the present invention is that the concurrent access facility 31 can provide the secondary host processor with substantially uninterrupted and concurrent read-only access to a consistent dataset regardless of the rate at which the dataset secondary storage 42 is updated to a consistent state by the completion of integration of a set of revisions into the dataset secondary storage. Therefore, the dataset in the dataset secondary storage 42 can be updated at a relatively low rate, and the storage controller 25 of the primary data storage system 20 can send transaction commit commands to the storage controller 27 of the secondary data storage system 23 at a much lower rate than the rate at which the storage controller 25 receives transaction commit commands from the primary host processor 21. Moreover, the transaction commit commands can be encoded in the write commands sent over the link. For example, the write commands can write alternate sets of revisions to alternate dataset revision storage, as will be described below with respect to FIG. 9. In such a case, the storage controller 27 in the secondary data storage system 23 can regenerate the transaction commit commands by detecting that the addresses of the write commands have switched from one area of dataset revision storage to the other. Moreover, each write command can be tagged with a corresponding sequence number so that the storage controller 27 in the secondary data storage system 23 can verify that a complete set of write commands has been received prior to the switch of the write command addresses from one area of the dataset revision storage to the other. [0055]
  • FIG. 3 is a block diagram showing control flow through the secondary data storage system of FIG. 1. Upon receipt of a write command (from the link 22 in FIGS. 1 and 2), the secondary data storage system accesses a directory 47 or 48 for the write-selected storage “A” or “B” of dataset revisions. The directory is accessed to determine whether or not the write command is accessing the same data item or data storage location as an update existing in the write-selected storage “A” or “B” of dataset revisions. If so, then the directory provides the location of the update in the write-selected storage “A” or “B” of dataset revisions, and the write command is executed upon that pre-existing update. If not, then storage is allocated in the write-selected storage “A” or “B” of dataset revisions for the update of the write command, the update of the write command is written into the allocated storage, and the directory 47 or 48 of the write-selected storage “A” or “B” of dataset revisions is updated to associate the allocated storage with the storage location or data item accessed by the write command. [0056]
  • Upon receipt of a read-only access command from the secondary host processor, the secondary data storage system accesses the directory 47 or 48 for the read-selected storage “A” or “B” of dataset revisions. The directory is accessed to determine whether or not the read-only access command is accessing the same data item or data storage location as an update existing in the read-selected storage “A” or “B” of dataset revisions. If so, then the directory provides the location of the update in the read-selected storage “A” or “B” of dataset revisions, and the read-only access command is executed upon that pre-existing update. If not, then the secondary data storage system accesses a dataset directory 49 for the dataset secondary storage 42, in order to locate the requested data in the dataset secondary storage 42. [0057]
  • FIG. 4 is a flowchart showing how the secondary data storage system in FIG. 1 is programmed to respond to a write command received from the primary data storage system. The write command specifies an address of a data item or storage location, and data to be written to the data item or storage location. In the first step 61, the storage controller accesses the write-selected directory “A” or “B” of dataset revisions (47 or 48) for the address specified by the write command. Next, in step 62, execution branches depending on whether or not the address is in the directory. If not, then in step 63, the storage controller allocates storage for the write data in the write-selected storage “A” or “B” of dataset revisions (43 or 44). Then in step 64, the storage controller writes the data to the allocated storage. Then in step 65, the storage controller creates a new directory entry (in the write-selected directory “A” or “B” of dataset revisions 47 or 48) associating the write address with the allocated storage. Then in step 66, the storage controller returns an acknowledgement over the link to the primary storage system, and the task is finished. [0058]
  • In step 62, if the write address is in the directory, then execution branches to step 67. In step 67, the storage controller writes the data of the write command to the associated address in the write-selected storage “A” or “B” of dataset revisions (43 or 44). Execution continues from step 67 to step 66 to return an acknowledgement to the primary storage system, and the task is finished. [0059]
  • FIG. 5 is a flowchart showing how the storage controller of the secondary data storage system in FIG. 1 is programmed to respond to a read command received from the secondary host processor. The read command specifies an address of a data item or storage location. In a first step 71, the storage controller accesses the read-selected directory “A” or “B” of dataset revisions (47 or 48). Then in step 72, execution branches depending on whether the address in the read command is found in the directory. If so, then execution branches from step 72 to step 73. In step 73, the storage controller reads data from the read-selected storage “A” or “B” of dataset revisions. Execution continues from step 73 to step 74, to return the data to the secondary host processor, and then the task is finished. [0060]
  • If in step 72 the read address is not in the directory accessed in step 71, then execution continues from step 72 to step 75. In step 75, the storage controller accesses the dataset directory (49 in FIG. 3). Then in step 76, execution branches depending on whether the address of the read command is in the dataset directory. If not, execution continues to step 77, to return an error code to the secondary host processor, and then the task is finished. Otherwise, if the address of the read command is found in the dataset directory, execution branches from step 76 to step 78. In step 78, the storage controller reads data from the dataset secondary storage (42 in FIG. 3). Execution continues from step 78 to step 74, to return the data to the secondary host processor, and the task is finished. [0061]
  • FIG. 6 is a flowchart showing how the storage controller of the secondary data storage system in FIG. 1 is programmed to respond to a transaction commit command from the primary data storage system. In a first step 81, the storage controller checks whether or not the background task of FIG. 7 is done with integration of the dataset into the dataset secondary storage. For example, this background task is done when the read-selected directory “A” or “B” of dataset revisions is empty. If not, then in step 82, the storage controller returns a flow control signal to the primary data storage system, because subsequent write commands from the link should not be placed in the storage “A” or “B” of dataset revisions until completion of the integration of the dataset revisions into secondary storage. Any such subsequent write commands could be placed in a temporary buffer until completion of the integration of the dataset revisions into the secondary storage, and a preferred buffering technique will be described below with reference to FIGS. 8 to 11. Execution continues from step 82 to step 83. In step 83, the task of FIG. 6 is suspended for a time to permit the background task to continue with integration of the dataset into secondary storage, and then the task is resumed. After step 83, execution loops back to step 81. Once the dataset has been integrated into secondary storage, execution continues from step 81 to step 84. [0062]
  • In step 84, the switches (45 and 46 in FIGS. 2 and 3) are toggled. This is done by complementing a logical variable or flag, which indicates which storage of dataset revisions is selected for read and write operations. For example, when the flag has a logical value of 0, the storage “A” of dataset revisions 43 is read-selected, and the storage “B” of dataset revisions 44 is write-selected. When the flag has a logical value of 1, the storage “A” of dataset revisions 43 is write-selected, and the storage “B” of dataset revisions is read-selected. Next, in step 85, the storage controller initiates the background task of integrating dataset revisions from the read-selected storage “A” or “B” of dataset revisions into the dataset secondary storage. Then, in step 86, the storage controller returns an acknowledgement of the transaction commit command to the primary data storage system, and the task of FIG. 6 is done. [0063]
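  • Continuing the illustrative sketch above, the commit handling of FIG. 6 reduces to waiting for the background integration to drain the read-selected storage, toggling the flag, and restarting the integration. The integrate function is sketched below with FIG. 7, and all names remain hypothetical:

    import threading
    import time

    def handle_commit(buf):
        # FIG. 6, steps 81 to 86, for a RevisionBuffer as sketched above.
        while buf.read_selected:      # step 81: prior integration not yet done
            time.sleep(0.01)          # steps 82 and 83: flow control, suspend, resume
        buf.flag ^= 1                 # step 84: toggle the switches 45 and 46
        threading.Thread(target=integrate, args=(buf,)).start()   # step 85
        return "ACK"                  # step 86: acknowledge the transaction commit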
  • FIG. 7 is a flowchart showing how the storage controller of the secondary data storage system in FIG. 1 is programmed to perform a background task of integrating revisions into the dataset secondary storage. In a first step 91, the first dataset revision is obtained from the read-selected “A” or “B” dataset revision storage (43 or 44 in FIG. 3). Next, in step 92, the storage controller searches the dataset directory (49 in FIG. 3) for the write address of the dataset revision. Then, in step 93, execution branches depending on whether the write address is found in the directory. If not, execution continues from step 93 to step 94. In step 94, the storage controller stores the revision in the dataset secondary storage (42 in FIG. 3), and the storage controller updates the dataset directory (49 in FIG. 3). Execution continues from step 94 to step 96. [0064]
  • In step 93, if the address of the dataset revision is found in the dataset directory, then execution branches to step 95 to replace the obsolete data in the dataset secondary storage with the dataset revision, and the dataset directory is updated if appropriate. The dataset directory is updated, for example, if the information in the directory for the obsolete data is no longer applicable to the revision. After step 95, execution continues to step 96. [0065]
  • In step 96, the storage controller de-allocates storage of the dataset revision from the read-selected “A” or “B” dataset revision storage (43 or 44 in FIG. 3). Execution continues from step 96 to step 97. In step 97, the task is finished if the dataset revision storage is found to be empty. Otherwise, execution continues from step 97 to step 98. In step 98, the task is suspended to permit any higher priority tasks to begin, and once the higher priority tasks are completed, the background task is resumed. Execution then continues to step 99. In step 99, the storage controller obtains the next dataset revision from the read-selected “A” or “B” dataset revision storage. Execution loops back to step 92 from step 99, in order to integrate all of the revisions from the read-selected “A” or “B” dataset revision storage into the dataset secondary storage. [0066]
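  • The background task of FIG. 7 completes the illustrative sketch; one revision is folded into the dataset secondary storage per iteration, yielding between iterations as in step 98:

    import time

    def integrate(buf):
        # FIG. 7, steps 91 to 99, draining the read-selected revisions.
        while buf.read_selected:                   # step 97: finished when empty
            address = next(iter(buf.read_selected))          # steps 91 and 99
            buf.base[address] = buf.read_selected[address]   # steps 92 to 95
            del buf.read_selected[address]                   # step 96: de-allocate
            time.sleep(0)                          # step 98: yield to higher-priority tasks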
  • The above description with respect to FIGS. 1 to 5 has not been limited to any particular form of dataset structure or directory structure. For example, the dataset revisions could operate upon direct mapped, numerically addressed storage, or they could operate upon dynamically allocated, symbolically addressed storage. For example, FIG. 8 is a block diagram of one preferred construction for a data processing system in which the write commands for the dataset revisions access direct mapped, numerically addressed storage. The data processing system includes a primary data storage system 110, a data mover computer 111, a primary host processor 112, a secondary data storage system 113, a data mover computer 114, and a secondary host processor 115. The data mover computer 111 includes a file system 116 that translates file system read and write commands from the primary host processor 112 to logical block read and write commands to the primary data storage system. Therefore, the combination of the data mover computer 111 and the primary data storage system 110 functions as a file server. Further details regarding the programming of the data mover computer 111 and the file system 116 are disclosed in Vahalia et al., U.S. Pat. No. 5,893,140, issued Apr. 6, 1999, and entitled “File Server Having A File System Cache And Protocol For Truly Safe Asynchronous Writes,” incorporated herein by reference. In a similar fashion, the combination of the secondary data storage system 113 and the data mover computer 114 also functions as a file server. [0067]
  • The primary data storage system has primary storage 118, and a storage controller 119. The storage controller includes a semiconductor random access cache memory 120, a host adapter 121 interfacing the data mover computer 111 to the cache memory, disk adapters 122, 123 interfacing the cache memory to the primary storage 118, and a remote mirroring facility 124 for interfacing the cache memory 120 to dual redundant data transmission links 125, 126 interconnecting the primary data storage system 110 to the secondary data storage system 113. The remote mirroring facility is constructed and operates as described in the above-cited Ofek et al., U.S. Pat. No. 5,901,327 issued May 4, 1999. This remote mirroring facility mirrors the file system storage 141 in the primary storage 118. The file system storage 141 is mirrored, however, indirectly, by mirroring delta volume storage 143 that is used to buffer the updates to the file system storage 141 of the primary storage 118. The host adapter 121 is programmed with a “delta volume facility” 127 that loads the updates into the delta volume storage 143 of the primary storage 118. The remote mirroring facility transmits the updates over the dual redundant links 125, 126 to mirrored delta volume storage 144 in secondary storage 128 in the secondary data storage system 113, as further shown and described below with reference to FIGS. 9 to 12. [0068]
  • The delta volume facility 127 is located at a volume manager level in the data processing system of FIG. 8. The volume manager level lies between the level of the file system 116 and the level of the primary data storage system 110. The file system 116 addresses logical blocks in logical volumes. In other words, each logical volume appears as an array of blocks having contiguous logical block numbers. The volume manager maps the logical block number into an appropriate basic storage volume and physical offset within the basic volume. In addition, the volume manager permits a number of the basic storage volumes to be combined in various ways to construct a single metavolume that can be used to build a file system. The file system views the metavolume as a single, contiguous array of blocks that is accessible by specifying a logical block number within this array. [0069]
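  • By way of illustration, the volume manager mapping just described can be sketched as follows; the construction of a metavolume from a list of named basic volumes is our own simplification:

    from bisect import bisect_right

    class Metavolume:
        # A metavolume built by concatenating basic storage volumes, viewed by
        # the file system as one contiguous array of logical blocks.
        def __init__(self, basic_volumes):
            self.names, self.starts, total = [], [], 0
            for name, size_in_blocks in basic_volumes:
                self.names.append(name)
                self.starts.append(total)
                total += size_in_blocks
            self.size = total

        def map(self, logical_block):
            # Map a logical block number to (basic volume, physical offset).
            if not 0 <= logical_block < self.size:
                raise ValueError("logical block out of range")
            i = bisect_right(self.starts, logical_block) - 1
            return self.names[i], logical_block - self.starts[i]

    # Example: block 1200 of a 1500-block metavolume falls in the second volume.
    meta = Metavolume([("vol0", 1000), ("vol1", 500)])
    assert meta.map(1200) == ("vol1", 200)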
  • The secondary data storage system 113 also includes a storage controller 129. The storage controller 129 includes a semiconductor cache memory 130, a host adapter 131, disk adapters 132 and 133, and a remote mirroring facility 134. The host adapter 131 is programmed with a concurrent access facility 135 that is similar to the concurrent access facility (31 in FIG. 1) described above with respect to FIGS. 1 to 7, except that the concurrent access facility 135 obtains updates from the mirrored delta volume storage 144 in the secondary storage 128 (as further described below with reference to FIGS. 9 to 11) instead of directly from the primary data storage system. [0070]
  • FIG. 9 is a block diagram showing data flow in the data processing system of FIG. 8. When the primary host processor 112 requests file system access from the data mover computer 111, the file system 116 performs read and write operations upon the file system primary storage 141. Write data for sets of sequential transactions are alternately written to an “A” delta volume 145 and a “B” delta volume 146 in the delta volume storage 143 of the primary data storage system 110. The remote mirroring facility transfers the write data to a mirrored “A′” delta volume 147 and a mirrored “B′” delta volume 148 in the delta volume storage 144 of the secondary data storage system 113. When the secondary host processor requests read-only file system access from the data mover computer 114, the data mover computer reads file system data from a read-selected one of the “A′” delta volume 147 or the “B′” delta volume 148 in the delta volume storage 144 of the secondary data storage system 113, and if the required file system data are not found in the read-selected one of the delta volumes, then the data mover computer reads the file system data from the file system secondary storage 142. [0071]
  • FIG. 10 is a block diagram of a delta volume in the data processing system of FIG. 8. Each delta volume is logically divided into delta chunks of a fixed size. The fixed size is preselected depending on various factors such as the serving capacity of the primary site and the write activity at the primary site. The fixed size is large enough to contain all of the updates for any single transaction. During initialization of the data processing system, file system access by the primary host processor is temporarily suspended and the file system primary storage 141 is copied to the file system secondary storage 142. Thereafter, file system access by the primary host processor is enabled, and the primary captures changes to the file system in delta sets. Each delta set is a set of changes to the file system blocks that, when viewed as a whole, leave the file system in a consistent state. The delta sets are identified by a sequence number (SEQNO) and written to the delta volume (and thus propagated to the replica sites). A new delta set begins at the start of a delta chunk and the size of a delta set cannot exceed the size of a delta chunk. The sequence number (SEQNO) and also the delta set size (DSS) can be written to a header or trailer 149 of the delta chunk. The delta volume therefore functions as a transaction log for updates to the file system, and also as a buffer for transmitting the updates between the primary data storage system and the secondary data storage system. In case of a system crash, the sequence numbers can be inspected to find the last valid delta set. The block updates in each delta set can also have a fixed size, to facilitate asynchronous transmission of the updates over the data link between the primary and secondary data storage systems. In this case, each block update can have its own sequence number. If a transmission error is detected, such as a failure of the secondary data storage system to receive a block update in sequence, the block update can be retransmitted, and written into its delta set in proper sequence when received. [0072]
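  • A delta chunk with its trailer fields might be modeled as follows (an illustrative layout only; the field and method names are ours, and a dictionary stands in for the fixed-size block storage):

    from dataclasses import dataclass, field

    @dataclass
    class DeltaChunk:
        capacity_blocks: int           # fixed delta chunk size, set at configuration
        seqno: int = -1                # SEQNO; writing it validates the delta set
        blocks: dict = field(default_factory=dict)  # block number -> block image

        @property
        def dss(self):                 # delta set size (DSS), in blocks
            return len(self.blocks)

        def append(self, block_number, data):
            # The size of a delta set cannot exceed the size of a delta chunk.
            if block_number not in self.blocks and self.dss >= self.capacity_blocks:
                raise OverflowError("delta set exceeds delta chunk size")
            self.blocks[block_number] = data

        def seal(self, seqno):
            # Write SEQNO (and DSS) to the trailer 149, validating the delta set.
            self.seqno = seqno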
  • The specific format shown for the delta volume has been selected to minimize computational overhead for accessing the delta volume rather than to minimize storage requirements. In contrast, a conventional transaction log has a format selected to minimize storage requirements rather than to minimize computational overhead for accessing the log. Depending on the availability of computational resources in the primary data storage system and the secondary data storage system, the delta volume could use a conventional transaction log data structure. To reduce the computational overhead for accessing such a conventional transaction log, the delta volume could also include a delta set directory overlaid upon the conventional transaction log data structure. [0073]
  • It should also be apparent that a single delta volume, rather than two delta volumes, could be used for buffering the transmission of file system updates between the primary data storage system and the secondary data storage system. If a single delta volume were used, then alternate delta chunks in the delta volume could be read-selected and write-selected. It should also be apparent that more than two delta volumes could be used for buffering file system updates. For example, the primary data storage system could store data for multiple file systems, and each file system to be accessed from the secondary data storage system could have its updates buffered in one, two, or more delta volumes used for buffering the updates of only one file system. [0074]
  • FIG. 11 is a block diagram of data structures in the file system secondary storage (128 in FIG. 8) of the secondary data storage system (113 in FIG. 8). The concurrent access facility (135 in FIG. 8) in the secondary data storage system uses a volume manager utility that inserts the read-selected delta set 151 as an overlay on top of the file system metavolume 152. At the time of insertion, a delta set map 153 is created of the block entries in the delta set 151. This map is then used to route a block read request to either the delta set or the file system metavolume depending on whether there is a block entry in the delta set for the requested block or not. Therefore, the read-selected delta set 151 corresponds to the read-selected storage “A” or “B” of dataset revisions 43 or 44 in FIG. 2 and FIG. 3, and the delta set map 153 corresponds to the directory 47 or 48 in FIG. 3 for the dataset revisions. The time of insertion of the read-selected delta set and the creation of the delta set map corresponds to the time between steps 84 and 85 of FIG. 6; in other words, after the read selection of the delta set and before initiation of the background task of integrating file system revisions from the read-selected delta set into the file system metavolume 152 in the secondary storage. The integration of the file system revisions involves copying the updates into the corresponding blocks of the file system metavolume 152. The routing of a block read request to either the delta set or the file system metavolume corresponds to steps 71 and 72 in FIG. 5. [0075]
  • FIG. 12 is a flowchart of programming in a delta volume facility of the primary data storage system of FIG. 8 for remote transmission of write commands to the secondary data storage system. In a first step 161, the storage controller of the primary data storage system clears the sequence number (SEQNO). The sequence number is used to map the current delta chunk into either the “A” delta volume or the “B” delta volume. For example, if the sequence number is even, then the current delta chunk is in the “A” delta volume, and if the sequence number is odd, then the current delta chunk is in the “B” delta volume. For the case of four delta chunks per delta volume, for example, the position of the delta chunk in the corresponding delta volume is computed by an integer division by two (i.e., a right shift by one bit position), and then masking off all but the two least significant bits (i.e., taking the remainder of an integer division by four). [0076]
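  • The mapping of sequence numbers to delta volumes and delta chunk positions in this example reduces to two lines of arithmetic, sketched here for two delta volumes of four chunks each:

    def locate_delta_chunk(seqno):
        volume = "A" if seqno % 2 == 0 else "B"   # even SEQNO -> "A", odd -> "B"
        position = (seqno >> 1) & 3               # divide by two, then modulo four
        return volume, position

    assert locate_delta_chunk(0) == ("A", 0)
    assert locate_delta_chunk(1) == ("B", 0)
    assert locate_delta_chunk(2) == ("A", 1)
    assert locate_delta_chunk(8) == ("A", 0)      # wraps after eight delta sets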
  • Next, in step 162, the storage controller clears a variable indicating the delta set size (DSS). Then in step 163, the storage controller clears a timer. The timer is a variable that is periodically incremented. The timer is used to limit the frequency at which transaction commit commands are forwarded from the primary data storage system to the secondary storage system, unless the transaction commit commands need to be transmitted at a higher rate to prevent the size of the delta sets from exceeding the size of the delta chunk. [0077]
  • In step 164, execution continues to step 165 if the storage controller receives a write command from the primary host processor. In step 165, the storage controller places the write command in the current delta chunk. This involves writing a number of data blocks to the delta volume selected by the sequence number (SEQNO), beginning at an offset computed from the sequence number and the current delta set size (DSS). Then in step 166, the storage controller increments the delta set size (DSS) by the number of blocks written to the delta chunk. In step 167, the storage controller compares the delta set size to a maximum size (DSM) to check whether a delta chunk overflow error has occurred. If so, then execution branches to an error handler 168. Otherwise, execution continues to step 169. In step 169, execution branches depending on whether a transaction commit command has been received from the primary host processor. If not, execution continues to step 170, to temporarily suspend, and then resume, the delta volume facility task of FIG. 12. Otherwise, if a transaction commit command is received, execution continues to step 171. It should be noted that once step 171 has been reached, the data mover computer (111 in FIG. 8) has already flushed any and all file system updates preceding the transaction commit command from any of its local buffer storage to the primary data storage system. Write operations by the primary host processor subsequent to the transaction commit command are temporarily suspended until this flushing is finished. Therefore, once step 171 has been reached, the updates in the delta set of the current delta chunk represent a change of the file system from one consistent state to another consistent state. [0078]
  • In step 171, the storage controller compares the delta set size (DSS) to a threshold size (THS) that is a predetermined fraction (such as one-half) of the maximum size (DSM) for a delta chunk. If the delta set size (DSS) is not greater than this threshold, then execution continues to step 172. In step 172, the timer is compared to a predetermined threshold (THT) representing the minimum update interval for the file system secondary storage, unless a smaller update interval is needed to prevent the size of the delta set from exceeding the size of the delta chunk. The minimum update interval (THT) should depend on the particular application. A value of 5 minutes for THT would be acceptable for many applications. If the timer is not greater than the threshold (THT), then execution loops back to step 170. Otherwise, execution continues to step 173. Execution also branches from step 171 to step 173 if the delta set size (DSS) is greater than the threshold size (THS). [0079]
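  • The decision of steps 171 and 172 can be summarized in a single predicate (an illustrative sketch; the names follow the abbreviations DSS, DSM, THS, and THT used above):

    def should_commit_now(dss, timer, dsm, ths_fraction=0.5, tht_seconds=300):
        # Forward a transaction commit if the delta set has grown past THS
        # (a predetermined fraction of the chunk size DSM), or if the minimum
        # update interval THT (e.g., 5 minutes) has elapsed.
        ths = ths_fraction * dsm
        return dss > ths or timer > tht_seconds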
  • In step 173, the storage controller writes the delta set size (DSS) and the sequence number (SEQNO) into an attributes block of the delta chunk (e.g., the trailer 149 in FIG. 10). The updating of the sequence number in the delta chunk validates the delta set in the delta chunk. Then, in step 174, the storage controller flushes the current delta volume to ensure that all updates in the delta set of the current delta chunk will be transmitted to the secondary data storage system, and then sends a transaction commit command to the secondary data storage system. The secondary data storage system should have received all of the updates in the delta set of the current delta chunk before receipt of the transaction commit command. For example, the remote data mirroring facility can be operated in an asynchronous or semi-synchronous mode for the current delta volume until step 174, and switched in step 174 to a synchronous mode to synchronize the current delta volume in the primary data storage system with its mirrored volume in the secondary data storage system, and then the transaction commit command can be sent once the remote mirroring facility indicates that synchronization has been achieved for the current delta volume. In step 175, the storage controller increments the sequence number (SEQNO). In step 176, the storage controller temporarily suspends the delta volume facility task of FIG. 12, and later resumes the task. Execution then loops back from step 176 to step 162. [0080]
  • By using delta volumes as buffers for transmission of updates from the primary data storage system to the secondary data storage system, there is no need for the delta volume facility to wait for receipt of an acknowledgement of the transaction commit command sent in step 174, before continuing to step 175. Instead, flow control of the updates can be based upon the sequence numbers and the use of sufficiently large delta volumes. Starting at initialization, the delta sets are numbered in an increasing sequence. At each site (primary and one or more secondaries), the delta sets are loaded and unloaded in the order of this sequence. If any delta sets are corrupted during transmission between the sites, they can be retransmitted and then reordered in terms of their sequence numbers. Thus, the primary data storage system will start by producing set number 1, followed by set number 2 and so on. Similarly, each secondary data storage system will integrate the file system secondary storage with the delta sets by unloading and integrating set number 1, followed by set number 2 and so on. The primary does not wait for an immediate acknowledgement from the secondaries when moving from one delta set to the next. This is made feasible by having a sufficiently large delta volume so that there is enough buffer space to allow the secondaries to be a few delta sets behind the primary without having an overflow. An overflow happens when a primary reuses a delta chunk to write a new delta set before one or more of the secondaries have completely processed the old delta set residing on that delta chunk. The primary data storage system can prevent such an overflow condition by suspending the processing of delta sets if there is a failure to receive an acknowledgement of a transaction commit command over the production of a certain number of the delta sets, such as seven delta sets for the example of four delta chunks per delta volume and two delta volumes per file system. The overflow condition can also be detected at a secondary data storage system by inspecting the delta set sequence numbers. An overflow is detected if the SEQNO in the delta chunk exceeds the next number in sequence when the secondary data storage system read-selects the next delta chunk for integration of the updates into the file system secondary storage. When the secondary data storage system detects such an overflow condition, a flow control error recovery procedure is activated. This error recovery procedure, for example, involves suspending write operations by the primary host processor, re-synchronizing the file system secondary storage with the file system primary storage, and restarting the delta volume facility. [0081]
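  • The overflow check at a secondary site amounts to a comparison of sequence numbers, as in the following sketch (the recovery action stands for the error recovery procedure described above):

    def check_delta_set(expected_seqno, chunk_seqno):
        # An overflow means the primary has reused a delta chunk before this
        # secondary finished processing the old delta set residing on it.
        if chunk_seqno > expected_seqno:
            # Recovery: suspend primary writes, re-synchronize the file system
            # secondary storage, and restart the delta volume facility.
            raise RuntimeError("delta volume overflow: expected set %d, found %d"
                               % (expected_seqno, chunk_seqno))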
  • By using two delta volumes per file system instead of one, it is easy to use the conventional remote mirroring facility (124 in FIG. 8) in such a way as to ensure that all of the updates to the delta set in the current delta chunk will be flushed and received by the secondary data storage system, prior to sending the transaction commit to secondary storage, with a minimal impact on continued host processing in the primary data storage system. Normally, the remote mirroring facility operates to transmit updates from the primary data storage system to the secondary data storage system concurrently with writing to the volume. During the flushing and synchronization of a remotely mirrored volume, however, the writing to a volume is temporarily suspended. By using two delta volumes, one of the delta volumes can be flushed in step 174 while the subsequent processing in FIG. 12 (steps 175, 176 et seq.) continues concurrently for the next delta set, which is mapped to the other of the two delta volumes. [0082]
  • An approach similar to the mirroring of delta volumes can be used for signaling between the primary data storage system and the secondary data storage system. A transmit message volume can be allocated in the primary and mirrored to a similar receive message volume in the secondary data storage system. Also a transmit message volume can be allocated in the secondary data storage system and mirrored to a similar receive message volume in the primary data storage system. The remote mirroring facility will then automatically copy messages deposited in the transmit volume of one data storage system to a receive volume of another data storage system. The volumes can be partitioned into a number of identical message regions (analogous to the delta chunks of FIG. 10), each having an associated sequence number, so that the message volumes also function as a message queue. Each data storage system could inspect the sequence numbers in the receive message volume to identify a new message that has been deposited in the message volume. Each block in each message region could be allocated to a predefined message type. [0083]
  • FIG. 13 is a block diagram of an alternative embodiment of the invention, in which the data storage systems are file servers, and the write commands include all file system access commands that modify the organization or content of a file system. The data processing system in FIG. 13 includes a conventional primary file server 181 coupled to a primary host processor 182 for read-write access of the primary host processor to a file system stored in the file server. The conventional file server includes a storage controller 183 and primary storage 184. The storage controller 183 includes facilities for file access protocols 191, a virtual file system 192, a physical file system 193, a buffer cache 194, and logical-to-physical mapping 195. Further details regarding such a conventional file server are found in the above-cited Vahalia et al., U.S. Pat. No. 5,893,140, issued Apr. 6, 1999. [0084]
  • The data processing system in FIG. 13 also includes a secondary file server 185 coupled to the primary host processor 182 to receive copies of at least the write access commands sent from the primary host processor to the primary file server. The secondary file server has a storage controller 187 and secondary storage 188. The storage controller 187 includes facilities for file access protocols 201, a virtual file system 202, a physical file system 203, a buffer cache 204, and logical-to-physical mapping 205. To this extent the secondary file server is similar to the primary file server. [0085]
  • In the data processing system of FIG. 13, the primary host processor 182 has a remote mirroring facility 186 for ensuring that all such write access commands are copied to the secondary file server 185. (This remote mirroring facility 186 could be located in the primary file server 181 instead of in the primary host processor.) The remote mirroring facility 186 also ensures that the primary host processor will receive acknowledgement of completion of all preceding write commands from an application 199 from both the primary file server 181 and the secondary file server 185 before the primary host processor will return to the application an acknowledgement of completion of a transaction commit command from the application 199. (This is known in the remote mirroring art as a synchronous mode of operation, and alternatively the remote mirroring facility 186 could operate in an asynchronous mode or a semi-synchronous mode.) The secondary file server 185 therefore stores a copy of the file system that is stored in the primary file server 181. Moreover, a secondary host processor 189 is coupled to the secondary file server 185 for read-only access of the secondary host processor to the copy of the file system that is stored in the secondary storage. [0086]
  • To provide the secondary host processor 189 with uninterrupted read-only access to a consistent version of the file system concurrent with read-write access by the primary host processor, the secondary file server 185 has a concurrent access facility 200 that is an interface between the virtual file system 202 and the physical file system 203. The physical file system layer 203, for example, is a UNIX-based file system having a hierarchical file system structure including directories and files, and each directory and file has an “inode” containing metadata of the directory or file. Popular UNIX-based file systems are the UNIX file system (ufs), which is a version of Berkeley Fast File System (FFS) integrated with a vnode/vfs structure, and the System V file system (s5fs). The implementation of the ufs and s5fs file systems is described in Chapter 9, pp. 261-289, of Uresh Vahalia, Unix Internals: The New Frontiers, 1996, Prentice Hall, Inc., Simon & Schuster, Upper Saddle River, N.J. 07458. [0087]
  • The concurrent access facility 200 in FIG. 13 can be constructed as shown in FIGS. 2 to 7 above. In this case, the dataset is the file system. In addition, it is preferable for the directories of dataset revisions (47, 48 in FIG. 3) and the storage of dataset revisions (43, 44) to have a hierarchical inode structure to facilitate integration with the hierarchical inode structure of the UNIX-based file system directory corresponding to the dataset directory 49 in FIG. 3. In order to provide uninterrupted read-only access to all possible file system revisions, however, the hierarchical structure of the directories of dataset revisions, and the integration of the revisions with the UNIX-based file system directory, must consider some special types of file system modifications, such as file or directory deletions, and file or directory name changes. [0088]
  • FIG. 14 shows a hierarchical structure of a directory of dataset revisions and storage of dataset revisions for a write to a file D:/SUB1/FILE-X followed by a file rename operation RENAME D:/SUB1/FILE-X to D:/SUB1/FILE-Y. Assuming that these are the first two updates received by the secondary file server (185 in FIG. 13), the first update would be processed in steps 61 to 66 of FIG. 4 by creating a root directory 210 named “D:” in the write-selected directory of dataset revisions, as shown in FIG. 14, and then creating a subdirectory 211 named “SUB1”, and then creating a file entry named “FILE-X” in the subdirectory, and then creating a new metadata entry 216 linked to the file entry and including a directory of the blocks of the new file data 217. The second update would be recognized and processed by the task of FIG. 4 as a special case. The task would process the second update by searching the root directory 210 and subdirectory 211 to find the “FILE-X” entry, and creating a new file entry 214 named “FILE-Y” in the subdirectory, and then linking an alias attribute pointing to the “FILE-X” entry in the subdirectory, and then creating a command list linked to the “FILE-X” entry and including the command “RENAME [FILE-X to] FILE-Y”, and then unlinking the new metadata 216 and new data 217 from the “FILE-X” entry and linking the new metadata 216 and new data 217 to the “FILE-Y” entry. The resulting data structure would then facilitate subsequent read-only access and integration of the new data of “FILE-Y” with any non-obsolete write data for “FILE-X” in the dataset secondary storage (42 in FIG. 3) for the file system “D:/”. [0089]
  • It should be apparent that the remote mirroring aspect of the present invention could be implemented at an intermediate level in the file server below the file access command level (as in the system of FIG. 13) and above the logical block level (as in the system of FIG. 8). For example, the remote mirroring could operate at the physical file system inode level. In this case, the storage of dataset revisions could be implemented as a sequential transactional log of the file system modifications on the primary side, with sufficient information stored in the log, such as inode numbers and old values and new values, to allow the secondary concurrent access facility to “replay” the transactions into the “live” file system in the file system secondary storage. [0090]
  • Replication of Remote Copy Data for Internet Protocol (IP) Transmission. [0091]
  • More recently there has arisen a need for wide-area distribution of read-only data. Shown in FIG. 15, for example, is an IP network 220 including multiple network file servers 221, 222, and multiple hosts 223, 224, 225. The hosts and network file servers, for example, can be distributed worldwide and linked via the Internet. Each of the network file servers 221, 222, for example, has multiple data movers 226, 227, 228, 232, 233, 234, for moving data between the IP network 220 and the cached disk array 229, 235, and a control station 230, 236 connected via a dedicated dual-redundant data link 231, 237 among the data movers for configuring the data movers and the cached disk array 229, 235. Further details regarding the network file servers 221, 222 are found in Vahalia et al., U.S. Pat. No. 5,893,140, incorporated herein by reference. [0092]
  • In operation, it is desired for each of the network file servers 221, 222 to provide read-only access to a copy of the same file system. For example, each of the network file servers could be programmed to respond to user requests to access the same Internet site. The IP network 220 routes user requests to the network file servers 221, 222 in the same continent or geographic region as the user. In this fashion, the user load is shared among the network file servers. [0093]
  • In the wide-area network of FIG. 15, it is desired to perform read-write updating of the respective file system copies in the network file servers 221, 222 while permitting concurrent read-only access by the hosts. It is also desired to distribute the updates over the IP network. [0094]
  • There are a number of ways that updates could be distributed over the IP network from a primary data mover to multiple secondary data movers. As shown in FIG. 16, for example, a primary data mover 241 establishes a connection 242, 243, 244 in accordance with the industry-standard Transmission Control Protocol (TCP) over the IP network 220 to each secondary data mover 245, 246, 247, and then concurrently sends the updates to each secondary data mover over the TCP connection. When the updates need to be distributed to a large number of secondary data movers, however, the amount of time for distributing the updates may become excessive due to limited resources (CPU execution cycles, connection state, or bandwidth) of the primary data mover 241. One way of extending these limited resources would be to use existing IP routers and switches to implement “fan out” from the primary data mover 241 to the secondary data movers 245, 246, 247. Still, a mechanism for reliability should be layered over the Internet Protocol. [0095]
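  • A minimal sketch of the fan-out of FIG. 16 follows, using only the Python standard library; the addresses and the framing of the update data are our own assumptions:

    import socket
    from concurrent.futures import ThreadPoolExecutor

    def distribute(update, secondaries):
        # Send the same remote copy data to every secondary data mover over a
        # separate TCP connection, concurrently.
        def send_one(address):
            with socket.create_connection(address) as sock:
                sock.sendall(update)
        with ThreadPoolExecutor(max_workers=max(1, len(secondaries))) as pool:
            list(pool.map(send_one, secondaries))

    # Example: distribute(b"delta set 1", [("10.0.0.2", 8888), ("10.0.0.3", 8888)])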
  • FIG. 17 shows that the time for distributing updates from a primary data mover 251 to a large number of secondary data movers 254, 255, 256, 257 can be reduced by using intermediate data movers 252, 253 as forwarders. The primary data mover 251 sends the updates to the forwarder data movers 252, 253, and each of the forwarder data movers sends the updates to a respective number of secondary data movers. The forwarder data movers 252, 253 may themselves be secondary data movers; in other words, each may apply the updates to its own copy of the replicated read-only file system. The distribution from the primary data mover 251 to the forwarder data movers 252, 253 can be done in a fashion suitable for wide-area distribution (such as over TCP connections). The forwarding method of replication of FIG. 17 also has the advantage that the distribution from each forwarder data mover to its respective data movers can be done in a different way most suitable for a local area or region of the network. For example, some of the forwarder data movers could use TCP connections, and others could use a combination of TCP connections for control and UDP for data transmission, and still other forwarders could be connected to their secondary data movers by a dedicated local area network. [0096]
  • For implementing the replication method of FIG. 17 over the Internet Protocol, there are a number of desired attributes. It is desired to maintain independence between the primary data mover and each of the secondary data movers. For example, a new secondary data mover can be added at any time to replicate an additional remote copy. The primary data mover should continue to function even if a secondary data mover becomes inoperative. It is also desired to maintain independence between the replication method and the IP transport method. Replication should continue to run even if the IP transport is temporarily inactive. It is desired to recover in a consistent fashion from a panic or shutdown and reboot. A record or log of the progress of the replication can be stored for recovery after an interruption. It is desired to build re-usable program blocks for the replication function, so that the program blocks for the replication function can be used independent of the location of the primary file system or its replicas. [0097]
  • In a preferred implementation, independence between the replication process, the IP transport method, and the primary file system being replicated, is ensured by use of a save volume. The save volume is a buffer between the data producer (i.e., the host or application updating the primary file system), the replication process, and the data consumer (the secondary data movers). The save volume stores the progress of the replication over the Internet Protocol so as to maintain the consistency of the replication process upon panic, reboot, and recovery. The transport process need not depend on any “in memory” replication information other than the information in the save volume, so as to permit the replication process to be started or terminated easily on any data mover for load shifting or load balancing. [0098]
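  • (A minimal sketch of the save-volume idea, assuming a simple length-prefixed file format that the patent does not specify; all names are hypothetical. The producer appends delta chunks, the consumer drains them, and the consumer's progress is itself kept on disk rather than in memory, so the transfer can resume after a panic or reboot, or be restarted on another data mover.)

      import json, os

      class SaveVolume:
          """File-backed buffer between the data producer and consumer.
          Progress is persisted beside the data, not held in memory, so
          replication can resume after a panic, reboot, or load shift."""

          def __init__(self, path):
              self.path = path
              self.progress = path + ".progress"

          def append(self, chunk: bytes):
              # producer side: length-prefixed append of one delta chunk
              with open(self.path, "ab") as f:
                  f.write(len(chunk).to_bytes(4, "big") + chunk)

          def consume(self):
              # consumer side: resume from the last persisted offset
              offset = 0
              if os.path.exists(self.progress):
                  with open(self.progress) as f:
                      offset = json.load(f)["offset"]
              with open(self.path, "rb") as f:
                  f.seek(offset)
                  while header := f.read(4):
                      yield f.read(int.from_bytes(header, "big"))
                      with open(self.progress, "w") as p:
                          json.dump({"offset": f.tell()}, p)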
  • When a save volume is used, it can be shared between a primary data mover and a secondary data mover in the case of local file system replication; in the case of remote file system replication, a primary copy of the save volume is kept at the primary site, and a secondary copy of the save volume is kept at the secondary site. [0099]
  • For the case of local file system replication, FIG. 18 shows a primary site including a primary data mover 260 managing access to a primary file system 261, and a secondary data mover 262 managing access to a secondary file system 263 maintained as a read-only copy of the primary file system 261. A save volume 264 is shared between the primary data mover 260 and the secondary data mover 262. This sharing is practical when the secondary site is relatively close to the primary site. A redo log 265 records a log of modifications to the primary file system 261 during the replication process for additional protection from an interruption that would require a reboot and recovery. [0100]
  • Local replication can be used to replicate files within the same network file server. For example, in the network file server 221 in FIG. 15, the primary data mover could be the data mover 226, the secondary data mover could be the data mover 227, the save volume could be stored in the cached disk array 229, and replication control messages could be transmitted between the data movers over the data link 231. [0101]
  • For the case of remote file system replication, FIG. 19 shows a primary site including a primary data mover 270 managing access to a primary file system 271, and a secondary data mover 272 managing access to a secondary file system 273 maintained as a read-only copy of the primary file system 271. The primary site includes a primary save volume 274, and the remote site includes a secondary save volume 275. A redo log 276 records a log of modifications to the primary file system 271 during the replication process for additional protection from an interruption that would require a reboot and recovery. [0102]
  • FIG. 20 shows a method of operating the system of FIG. 18 for local replication. In a first step 281, the primary data mover migrates a copy of the primary file system to create a secondary file system at the secondary site in such a way as to permit concurrent write access to the primary file system. The migration, for example, may use the method shown in FIG. 17 of the above-cited Ofek U.S. Pat. No. 5,901,327, in which a bit map indicates remote write pending blocks. Alternatively, the migration may use a snapshot copy mechanism, for example, as described in Kedem, U.S. Pat. No. 6,076,148, in which a bit map indicates the blocks that have changed since the time of snap-shotting of the primary file system. The snapshot method is preferred, because it is most compatible with the delta set technique for remote copy of subsequent modifications. For example, a snapshot manager creates a snapshot copy of the primary file system, as will be further described below with reference to FIGS. 22 to 25. In any event, it is desired for the secondary file system to become a copy of the state of the primary file system existing at some point in time, with any subsequent modifications of the primary file system being transferred through the shared save volume. [0103]
  • In step 282, the primary data mover writes subsequent modifications of the primary file system to the shared save volume. In step 283, the secondary data mover reads the subsequent modifications from the shared save volume and writes them to the secondary file system. In step 284, the secondary data mover provides user read-only access to consistent views of the secondary file system. This can be done by integrating the subsequent revisions into the secondary file system and providing concurrent read-only access to the secondary file system in the fashion described above with reference to FIGS. 2 to 7. Execution loops from step 284 back to step 282. In this fashion, the secondary file system is updated from the primary site concurrently with read-only access at the secondary site. [0104]
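  • (In outline, the FIG. 20 loop might be sketched as follows, reusing the hypothetical SaveVolume above; drain_modifications and integrate are assumed helper methods, not part of the patented embodiment.)

      def primary_loop(primary_fs, save_volume):
          # step 282: subsequent modifications go to the shared save volume
          for modification in primary_fs.drain_modifications():
              save_volume.append(modification)

      def secondary_loop(save_volume, secondary_fs):
          # step 283: read the modifications back and apply them
          for modification in save_volume.consume():
              secondary_fs.integrate(modification)
          # step 284: read-only clients keep seeing the last consistent
          # view while integration proceeds; then the loop repeats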
  • FIG. 21 shows a method of operating the system of FIG. 19 for remote replication. In a first step 291, the primary data mover migrates a copy of the primary file system to create a secondary file system at the secondary site, in a fashion similar to step 281 in FIG. 20. In step 292, the primary data mover writes subsequent modifications of the primary file system to the primary save volume, in a fashion similar to step 282 in FIG. 20. In step 293, the modifications are copied from the primary save volume to the secondary save volume, for example, by using a delta volume facility for transmitting delta chunks as described above with reference to FIGS. 9 to 12. In step 294, the secondary data mover reads the modifications from the secondary save volume and writes them to the secondary file system. In step 295, the secondary data mover provides user read-only access to consistent views of the secondary file system, in a fashion similar to step 284 of FIG. 20. Execution loops from step 295 back to step 292. In this fashion, the secondary file system is remotely updated from the primary site concurrently with read-only access at the secondary site. [0105]
  • FIG. 22 shows layered programming 300 for a primary data mover. It is desired to use layered programming in accordance with the International Organization for Standardization's Open Systems Interconnection (ISO/OSI) model for networking protocols and distributed applications. As is well known in the art, this OSI model defines seven network layers, namely, the physical layer, the data link layer, the network layer, the transport layer, the session layer, the presentation layer, and the application layer. [0106]
  • As shown in FIG. 22, the layered programming 300 includes a conventional TCP/IP transport layer 301. The layers above the TCP/IP transport layer 301 include a replication control protocol (RCP) session layer 302, a volume multicast presentation layer 303, and an IP-FS (file system) copy send-thread 304 and an IP-replication send-thread 305 at the program layer level. Over these program layers is a management and configuration command interpreter (MAC_CMD) 306 for system operator set-up, initiation, and supervisory control of the replication process. [0107]
  • In operation, the RCP layer 302 provides an application program interface (API) for multicasting data over TCP/IP. RCP provides callback, acknowledgement (ACK), and resumption of aborted transfers. [0108]
  • RCP provides the capability for a remote site to replicate and rebroadcast remote copy data. The remote site functions as a router when it rebroadcasts the remote copy data. RCP can also be used to replicate data locally within a group of data movers that share a data storage system. [0109]
  • To create a new remote copy in response to a supervisory command, the command interpreter 306 initiates execution of a replication module 310 if the replication module is not presently in an active mode. Then, the command interpreter 306 invokes a snapshot manager 308 to create a snapshot copy 309 of a primary file system volume 307. When the snapshot copy is created, the snapshot manager 308 obtains a current delta set number from the replication module 310 and inserts the current delta set number into the metadata of the snapshot. The current delta set number for the snapshot is all that the secondary needs to identify modifications that are made subsequent to the creation of the snapshot. In this fashion, any number of new remote copies can be created at various times during operation of the replication module, with the snapshot process operating concurrently and virtually independently of the replication module. For example, whenever synchronization of a remote copy is lost, for example due to a prolonged disruption of network traffic from the primary site to the remote site, a new remote copy can be created to replace the unsynchronized remote copy. [0110]
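  • (A sketch of the delta set numbering idea; all names are hypothetical. The single integer stored in the snapshot metadata is what lets a new secondary know where replication playback must begin.)

      class ReplicationModule:
          def __init__(self):
              self.current_delta_set = 0   # advances at each delta-set boundary

      def create_remote_copy(snapshot_manager, replication, primary_volume):
          snap = snapshot_manager.snapshot(primary_volume)
          # Everything modified after this instant lands in delta sets
          # numbered >= current_delta_set, so a new secondary knows exactly
          # where playback must begin once the snapshot has been migrated.
          snap.metadata["delta_set"] = replication.current_delta_set
          return snap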
  • Once the snapshot copy 309 is accessible, the command interpreter 306 initiates execution of an instance of the IP-FS copy send-thread 304. The instance of the IP-FS copy send-thread 304 reads data from the snapshot copy 309 and calls upon the volume multicast layer 303 to multicast the remote copy data to all of the secondary data movers where the remote copies are to be created. This can be a copy by extent, so there is no copying of invalid or unused data blocks. For example, the volume multicast layer 303 is given a copy command (@vol., length) specifying a volume and an extent to be copied, and may also specify a group of destinations (an RCP group). The snapshot copy 309 of the primary file system identifies the next valid block to be copied, and the number of valid contiguous blocks following the next block. These blocks are copied at the logical level, so it does not matter what physical structure is used for storing the secondary file system at the secondary site. The copying is done locally, or by remote copy, for example by transporting the data block over IP. The volume multicast layer 303 invokes the RCP layer 302 to transport each data block. [0111]
  • During the remote copy process, whenever a modification is made to a block of the primary file system volume 307, the replication module 310 logs an indication of the modified block in a log 314 and later assembles the modification into a delta set chunk written to a primary save volume 311. The replication module 310 logs the indications in the log 314 on a priority or foreground basis as data is written to the primary file system volume 307, and also logs boundaries between delta sets. The replication module 310 later reads the log 314 to read the indicated modifications from the primary file system volume 307, assemble the indicated modifications into delta set chunks on a background basis, and store the delta set chunks in a save volume chunk area of the save volume 311. For example, the log is in the form of a queue of two bit-map tables, a new one of the tables being written coincident with write operations upon the primary file system volume 307, and an old one of the tables being read to determine blocks to copy from the primary file system to create a new delta set in the save volume 311. When the delta set chunks become available for distribution from the save volume 311, the replication module 310 updates the save volume mailbox area 312 by storing each delta set chunk definition (@vol., length). [0112]
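  • (A compact sketch of the two-table log; the block-bitmap representation follows the example above, and the names are hypothetical. Writes mark bits in the new table in the foreground, while a background pass drains the old table to assemble the previous delta set, so a block written many times is shipped only once per delta set.)

      class ModificationLog:
          def __init__(self, nblocks):
              self.new = [False] * nblocks   # marked in the foreground
              self.old = [False] * nblocks   # drained in the background

          def note_write(self, block):
              self.new[block] = True         # foreground, per write operation

          def start_delta_set(self):
              # delta-set boundary: swap the roles of the two tables
              self.old, self.new = self.new, [False] * len(self.new)

          def modified_blocks(self):
              # background: the blocks to copy into the save volume chunk
              return [b for b, dirty in enumerate(self.old) if dirty]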
  • The IP-replication send-thread instance 305 polls the save volume mailbox area 312 to see if any delta set chunks have been stored in the save volume chunk area 313. If so, then the thread instance calls upon the volume multicast layer 303 to multicast the delta set chunks to the data movers that manage the storage of the respective remote file system copies. For example, for each delta set chunk, the IP-replication send-thread instance 305 issues a volume multicast command to the volume multicast layer 303. When the chunk multicast is completed, the IP-replication send-thread instance 305 updates its context on the save volume 311 in the mailbox area 312. At reboot after an interruption of multicast of a chunk, the IP-replication send-thread instance is able to restart the multicast of the chunk. The IP-replication send-thread instance is also responsible for retrying transmission of the chunk whenever the connection with the secondary is interrupted. [0113]
  • FIG. 23 shows the layered programming 320 for a secondary data mover. The programming includes a TCP/IP layer 321, an RCP layer 322, a volume multicast layer 323, and a management and configuration command interpreter (MAC_CMD) 324. During creation of a new remote copy in a secondary file system volume 325, the volume multicast layer 323 writes remote copy data from the primary data mover to the secondary file system volume 325, and concurrently writes modifications (delta set chunks) from the primary data mover to a save volume chunk area 326 of a secondary save volume 327. For example, the volume multicast layer performs the steps in FIG. 4 described above to write the modifications to the save volume chunk area 326. [0114]
  • A header for the changes in a next version of the delta set is sent last, because there is no guarantee of the order of receipt of the IP packets. The header of the delta set includes a generation count, the number of delta blocks for the next version of the delta set, a checksum for the header, and a checksum for the data of all the delta blocks. The receiver checks whether all of the changes indicated in the header have been received. [0115]
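  • (The completeness check might look like the following sketch; the field names and the use of CRC-32 are assumptions, since the text specifies only a generation count, a delta block count, a header checksum, and a data checksum, with the header sent last.)

      import zlib
      from dataclasses import dataclass

      @dataclass
      class DeltaSetHeader:
          generation: int
          n_blocks: int            # delta blocks expected in this set
          header_checksum: int
          data_checksum: int       # over the data of all the delta blocks

      def delta_set_complete(header, received_blocks):
          if len(received_blocks) != header.n_blocks:
              return False         # some IP packets are still outstanding
          digest = 0
          for block in received_blocks:
              digest = zlib.crc32(block, digest)
          return digest == header.data_checksum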
  • Once a complete remote snapshot copy has been reconstructed in the secondary file system volume 325, a playback module 328 is activated to read the modifications from the save volume chunk area 326 and integrate them into the secondary file system volume 325. The playback module 328, for example, performs the steps in FIGS. 6 to 7 as described above. From each delta-set chunk in the save volume area 326, the playback module 328 gets the block address and number of contiguous blocks to be written to the secondary file system volume. [0116]
  • An access module 329 provides read-only access to a consistent view of the secondary file system in the secondary file system volume 325. The access module 329, for example, performs the steps shown in FIG. 5 as described above. [0117]
  • FIG. 24 shows a procedure executed by the primary site of FIG. 22 to perform replication of the primary file system. When replication is started in a first step 341, the primary file system is paused to make it consistent. Migration of the primary file system to the secondaries can then be started using a remote copy facility or snapshot manager. Then, in step 342, concurrent write access to the primary file system is resumed, and all modifications made on the primary file system are logged at the volume level on a priority or foreground basis when each modification is made. In addition, a background process of delta-set creation is initiated. [0118]
  • Two configurable triggers specify the rate of delta set creation: a timeout parameter and a high water mark parameter. Whenever delta set creation is initiated, the current time, as indicated by a real-time clock, is added to a configurable timeout interval to produce the timeout parameter. The high water mark specifies an amount of modified data, in megabytes. Whichever trigger occurs first initiates the creation of a delta set. The replication module creates the delta set by pausing the primary file system, copying the modified blocks from the primary file system to the delta set volume, and then resuming the primary file system. Because indications of the modified blocks are logged first and the modified blocks are copied later, multiple modifications to the same block are represented and transported only once per delta set. [0119]
  • In step 343, the background process of delta set creation is temporarily suspended, for example, by placing the process on a task queue that is periodically serviced. In step 344, execution of the delta set creation process is resumed. In step 345, the modification size is compared to the high water mark. If the high water mark is not exceeded, then execution continues to step 346. In step 346, the present value of the real-time clock is compared to the timeout parameter. If the timeout parameter has not been exceeded, then execution loops back to step 343. Otherwise, execution continues to step 347. Execution also branches to step 347 from step 345 if the modification size is greater than the high water mark. [0120]
  • In step 347, the primary file system is paused. In step 348, a new delta set is created by starting the copying of modified blocks from the primary file system volume to the new delta set. In step 349, the logging of new modifications into a new table is started. In step 350, the timeout and high water mark are re-armed. In other words, a new value for the timeout parameter is computed as the current real time plus the configurable timeout interval, and the modification size is reset to indicate the size of the new modifications. In step 351, the primary file system is resumed. Execution loops from step 351 back to step 343 to suspend the background process of delta set creation. [0121]
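  • (The two triggers of steps 345, 346, and 350 reduce to a simple predicate, sketched below with assumed units and example values; the timeout interval and high water mark are configurable parameters.)

      import time

      TIMEOUT_INTERVAL = 60.0        # seconds, configurable (assumed value)
      HIGH_WATER_MARK = 64 * 2**20   # bytes of modified data, configurable

      def rearm(now):
          """Step 350: compute the next timeout and reset the size."""
          return now + TIMEOUT_INTERVAL, 0

      def should_create_delta_set(modification_size, timeout_at):
          # whichever trigger fires first wins (steps 345 and 346)
          return (modification_size >= HIGH_WATER_MARK
                  or time.time() >= timeout_at)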
  • To maintain the consistency of the delta set created in the primary save volume, the primary file system could remain paused and not resumed in step 351 until the copy process begun in step 348 is completed. Preferably, however, the copy process begun in step 348 is a snapshot copy process, so that write access to the primary file system may resume in step 351 before the copy process has been completed. For the example of the modification log being a queue of two bit-map tables, when a write access to a block in the primary file system is requested, the old bit map is accessed on a priority basis. If the corresponding bit in the old bit map indicates a modified block in the primary file system volume not yet copied to the save volume, then it is copied on a priority basis to the save volume before the new write data is written to the primary file system volume. As soon as a modified block has been copied from the primary file system volume to the save volume, the corresponding bit in the old bit map is cleared. In this fashion, at the completion of the copy process, the entire old table will be in a reset state, ready to be used as the next new table. [0122]
  • When the copy process started in step 348 is completed, the replication module sets the save volume mailbox area to show that a new delta set is ready for transmission. Upon polling the mailbox area, the IP-replication send-thread finds that the new delta set is ready for transmission, and invokes the volume multicast layer to transmit the delta set to the secondary sites. After step 351, execution loops back to step 343. [0123]
  • FIG. 25 shows a flow chart of the overall procedure of creating a new remote copy, either for the first time at a secondary site or as a replacement for a remote copy that needs to be resynchronized with the primary file system. In a first step 352, the snapshot manager creates a snapshot copy of the primary file system at the end of any pending transaction upon the primary file system (e.g., when the primary file system becomes consistent after it is paused in step 341 or step 347 of FIG. 24). The replication module independently writes any subsequent modifications into a current delta set for the next transaction. [0124]
  • In step 353, the snapshot manager obtains the current delta set number from the replication module and inserts it into the metadata of the snapshot copy. In step 354, the IP-FS copy send-thread is started in order to send volume extents of the snapshot copy to the secondary data mover, by invoking the volume multicast layer for each extent. [0125]
  • In step 355, when the IP-FS copy send-thread is finished, the primary data mover sends a “start playback” signal to the secondary data mover. In step 356, the secondary data mover receives the “start playback” signal from the primary data mover, and starts the playback module. In step 357, the playback module begins playback from the delta set indicated by the delta set number in the snapshot metadata. [0126]
  • The playback module (328 in FIG. 23) at the secondary site integrates the delta set modifications into the secondary file system. Each time that a new delta set appears in the secondary save volume, the modifications can be integrated into the secondary file system, for example, by pausing the secondary file system, copying the modifications from the secondary save volume into the secondary file system, and resuming the secondary file system. Alternatively, a timeout interval and a high water mark value can be configured for the secondary site, so that the modifications may be integrated into the secondary file system at a rate less frequent than the rate at which the new delta sets appear in the secondary save volume. In this case, the modifications from the secondary save volume would not be integrated into the secondary file system until the timeout time is reached, unless the amount of modifications in the save volume reaches the high water mark. The integration of the modifications can be performed concurrently with read-only access to a consistent view of the secondary file system as shown in FIGS. 3, 6, and 7, as described above. [0127]
  • FIG. 26 shows a flowchart of the IP-replication send-thread (305 in FIG. 22). In a first step 361, the thread polls the primary save volume mailbox area. If the mailbox area indicates that there is not a new delta set chunk in the primary save volume area, then the thread is finished for the present task invocation interval. Execution of the thread is suspended in step 363, and resumed in step 364 at the next task invocation interval. [0128]
  • If the mailbox area indicates that there is a new delta set chunk in the primary save volume, then execution continues from step 362 to step 365. In step 365, the IP-replication send-thread issues a volume multicast command to broadcast or forward the delta set chunk to specified destination data movers. In step 366, if the multicast has been successful, then execution branches to step 367. In step 367, the IP-replication send-thread updates the primary save volume mailbox to indicate completion of the multicast, and execution continues to step 363 to suspend execution of the thread until the next task invocation interval. [0129]
  • In step 366, if the multicast is not successful, then execution continues to step 368 to test whether more than a certain number (N) of retries have been attempted. If not, then execution loops back to step 365 to retry the multicast. If more than N retries have been attempted, then execution continues from step 368 to step 369. In step 369, the IP-replication send-thread logs the error, and then in step 370, passes execution to an error handler. [0130]
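  • (In outline, the send-thread of FIG. 26 is a polling loop with bounded retries; the sketch below uses hypothetical mailbox and multicaster interfaces, and N is the configurable retry limit.)

      MAX_RETRIES = 3  # "N" in the text, assumed value

      def send_thread_tick(mailbox, multicaster, log):
          chunk = mailbox.next_chunk()          # steps 361 and 362
          if chunk is None:
              return                            # nothing new this interval
          for _ in range(MAX_RETRIES + 1):
              if multicaster.multicast(chunk):  # step 365
                  mailbox.mark_done(chunk)      # step 367
                  return
          log.error("multicast failed after retries")   # step 369
          raise RuntimeError("delta set chunk could not be multicast")  # step 370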
  • FIG. 27 shows various objects defined by the volume multicast layer. The volume multicast layer provides multicast service to instances of a VolMCast object 370 representing a volume multicast relationship between a respective primary file system volume specified by a volume name (volumeName) and a respective group of secondary data movers specified by an RCP group name (rcpgpeName). For example, at configuration time, one or more RCP groups are defined in response to configuration commands such as: [0131]
  • .RCP_config <server_name> add <IP> [0132]
  • This configuration command adds the IP address (IP) of a specified destination data mover (server_name) to an RCP group. [0133]
  • Also at configuration time, a specified data mover can be defined to be a primary data mover with respect to the RCP group (a relationship called a MultiCastNode) in response to a configuration command such as: [0134]
  • .server_config <server_name> rep <groupname> add <IP> [0135]
  • where “server_name” is the name for the primary data mover, “groupname” is the name of a configured RCP group, and “IP” is the IP address of the primary data mover. When configuration of the MultiCastNode object is finished, the MultiCastNode object will have its own name, a name for the primary data mover, an RCP group name, and a list of IP addresses to which the primary server should broadcast in order to transmit IP packets to all the secondary data movers in the RCP group. [0136]
  • The VolMCast object can then be built on top of a MultiCastNode object. The additional information required for the VolMCast object is, on the sender side, the primary or source file system volume and on each receiver side, the secondary or destination file system volume. For flexibility, it is permitted to specify a different volume name on each secondary data mover. By specifying the destination volume names during creation of the VolMCast object, it is not necessary to specify the destination volume names at each copy time. For example, the VolMCast object is defined by configuration commands to the primary data mover such as: [0137]
  • .server_config <server_name> "volmcast <MultiCastNodeName> [-src|-dest] volume" [0138]
  • where <server_name> is the name of the MultiCast Node. [0139]
  • Once the VolMCast object has been defined, an IP-replication service can be configured for the object upon the primary data mover. Then the primary data mover will respond to commands for starting the replication service and stopping the replication service upon the VolMCast object. When replication is stopped on a secondary, the secondary file system is left in a consistent state. In other words, if a replay was in progress, the stop will complete when the replay is finished. [0140]
  • The primary data mover may respond to additional commands for creating a new delta set on demand, updating the replication policy (high water mark and timeout interval parameters) on the primary file system or secondary file systems, and defining persistency of the replication process upon remount or reboot of the primary file system or any one of the secondary file systems. For example, at reboot the replication service is re-started on the primary file system and the secondary file system in the state it was in at unmount or shutdown. A recovery of the replication context happens at reboot or on remount. The replica recovery is executed before the primary and secondary file systems are made available for user access. This allows all modifications during the recovery of the primary file system to be logged by the replication service. [0141]
  • As shown in FIG. 27, the volume multicast layer is responsive to a number of commands 371 from higher layers in the protocol stack. In addition to the configuration commands for defining a new VolMCast object relating a specified primary file system volume to a specified RCP group, an existing VolMCast object can be opened for either a sender mode or a receiver mode. An opened VolMCast object can be closed. Once a VolMCast object has been opened in a sender mode, it can be called upon to broadcast a control block (CB) to the secondary volumes of the VolMCast object, such as a control block specifying a remote copy of a specified extent of the primary volume. [0142]
  • Control blocks may specify various operations upon the secondary volumes of the VolMCast object, such as cluster file system commands for performing operations such as invalidations, deletions, renaming, or other changes in the configuration of the objects of the file system upon all copies (local or remote) of the file system. In this case, RCP is used for the broadcast or forwarding of the cluster file system commands to all the data movers that are to operate upon the local or remote copies of the file system, and for returning acknowledgement of completion of the operations upon all of the copies of the file system. [0143]
  • With reference to FIG. 27, the volume multicast layer defines a VolMCastSender object 372 instantiated when a VolMCast instance is opened in the sending mode, and a VolMCastReceiver object 373 instantiated when a VolMCast instance is opened in a receiving mode. The VolMCastSender object class and the VolMCastReceiver object class inherit properties of the VolMCast object class. When the volume multicast layer is called upon in a primary data mover to maintain remote copies of a specified extent of a VolMCastSender instance, an instance of a VolMCastCopy thread 374 is created and executed. The VolMCastCopy thread instance accesses the delta sets from a primary save volume 375 to produce a write stream 376 of blocks sent down to the RCP layer. At the secondary data mover, an instance of a VolMCastReceiver thread 377 is instantiated and executed to receive a read stream 378 of blocks and write the copied delta sets into a secondary save volume 379. An instance of an acknowledgement thread 380 returns an acknowledgement 381 of completion of copying of a delta set for an extent to the secondary file system. The acknowledgement is sent down to the RCP layer of the secondary data mover. At the primary, the RCP layer sends the acknowledgement 382 to an instance of an acknowledgement thread 383. [0144]
  • RCP is a session-layer protocol for replication from one primary to multiple secondary sites. Control is initiated by the primary, except when recovering from aborted transfers. RCP uses TCP between the primary and secondary sites for control and data. Network distribution is by an application-level multicast (ALM) using RCP as a forwarder. Port sharing with HTTP is used for crossing firewalls. [0145]
  • RCP may support other replication applications in addition to 1-to-N IP-based replication for wide-area distribution of read-only data. These other applications include 1-to-N volume mirroring, cluster file system commands, remote file system replication, and distribution and replication of other commands that may be recognized by the data movers. [0146]
  • The 1-to-N volume mirroring is a simplification of 1-to-N IP-based replication for wide-area distribution of read-only data, because the volume mirroring need not synchronize a remote volume with any consistent version of the primary volume until the remote volume needs to be accessed for recovery purposes. [0147]
  • Remote file system replication also uses RCP for broadcasting or forwarding an application command to a remote data mover to initiate replication of a file system managed by the remote data mover. In a similar fashion, RCP may broadcast or forward other commands recognized by data movers, such as iSCSI or remote-control type commands for archival storage. For example, RCP could broadcast or forward remote control commands of the kind described in Dunham, U.S. Pat. No. 6,353,878 issued Mar. 5, 2002 entitled “Remote Control of Backup Media in a Secondary Storage Subsystem Through Access to a Primary Storage Subsystem,” incorporated herein by reference. [0148]
  • The RCP forwarder is composed of two RCP sessions: an outbound session at the primary, and an inbound session at the secondary. The inbound RCP session receives a group name and looks up the group in a routing table. If routes for the group exist in the routing table, then an RCP forwarder is created at the secondary, including a data path established by pointer passing from an “in” session to an “out” session. [0149]
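  • (The forwarder pairing can be sketched as follows; the interfaces are hypothetical. The essential point is that the data path hands the outbound session a reference to the buffered stream rather than a copy of it.)

      class InboundSession:
          def __init__(self, routing_table):
              self.routing_table = routing_table

          def on_data(self, group, buffer):
              routes = self.routing_table.get(group)
              if routes:
                  # create the outbound half of the forwarder and hand it
                  # the same buffer by reference ("pointer passing")
                  OutboundSession(routes).send(buffer)

      class OutboundSession:
          def __init__(self, nexthops):
              self.nexthops = nexthops

          def send(self, buffer):
              for hop in self.nexthops:
                  hop.transmit(buffer)   # retransmit over TCP/IP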
  • An RCP group may be configured to include application-level multicast (ALM) topology. For example, ALM route configuration commands begin with an identifier number for the network file server (“cel”) that contains the forwarder data mover, and an identifier number (“ser”) for the forwarder data mover in the network server. The configuration commands end with a “nexthop” specification of an immediate destination data mover: [0150]
  • cel1-ser2: rcproute add group=g1 nexthop=cel2-ser2 [0151]
  • cel2-ser2: rcproute add group=g1 nexthop=cel2-ser3 [0152]
  • cel2-ser2: rcproute add group=g1 nexthop=cel2-ser4 [0153]
  • In effect, the forwarder data mover adds the “nexthop” specification to an entry for the RCP group in the routing table in the forwarder data mover. This entire entry can be displayed by the following configuration command: [0154]
  • cel2-ser2: rcproute display [0155]
  • The entry is displayed, for example, as a list of the “nexthop” destination data movers. The entry can be deleted by the following configuration command: [0156]
  • cel2-ser2: rcproute delete [0157]
  • Each immediate destination data mover may itself be configured as a forwarder in the RCP group. In this case, RCP commands and data will be forwarded more than once, through a chain of forwarders. The set of possible RCP routes from a primary or forwarder in effect becomes a tree or hierarchy of destinations. [0158]
  • The ALM commands may also include commands for creating sessions and sending control blocks or data. For example, the following ALM command creates a session and sends application data from a file (named “filename”) to all destinations in group “g1” from cel1-ser2, using a test application (named “rcpfiletest”). [0159]
  • cel1-ser2: rcpfiletest data=filename group=g1 [0160]
  • FIG. 28 shows the RCP collector service 390 at a primary site. The programming for the RCP collector service includes an RCP session manager 391, collector and worker threads 392, and a single-thread RCP daemon 393. The RCP session manager 391 responds to requests from higher levels in the protocol stack, such as a request from an application 394 to open an RCP pipe 395 between the application 394 and the RCP collector service 390. The application 394 may then send to the session manager 391 requests to set up sessions with RCP groups. A session queue 396 stores the state of each session, and a control block queue 397 keeps track of control blocks sent via TCP/IP to the secondary data movers in the RCP groups. An RCP routing table 398 identifies the immediate destinations of each RCP group to which the TCP/IP messages from the RCP collector service are to be sent, as well as any other destinations to which the messages will be forwarded. For communication of the TCP/IP messages between the RCP collector service and the network, TCP port:80 is opened in both directions (i.e., for input and output). The single-thread RCP daemon 393 is used for interfacing with this TCP port:80. [0161]
  • FIG. 29 shows the RCP collector service 400 at a secondary site. The RCP collector service at the secondary site is similar to the RCP collector service at the primary site, in that it includes an RCP session manager 401, collector and worker threads 402, a single-thread RCP daemon 403 for access to and from TCP port:80, an RCP session state queue 406, an RCP control block queue 407, and an RCP routing table 408. The primary difference between the RCP collector service at the secondary site and the RCP collector service at the primary site is in the collector and worker threads 402. At the RCP secondary, the RCP commands and data are received from the TCP port:80 instead of from the application 404. The application 404 is the consumer of the RCP data, instead of a source for RCP data. The RCP collector service 400 at the secondary site may also serve as a forwarder for RCP commands, and therefore the collector and worker threads 402 at the secondary site include a forwarder thread that has no similar or complementary thread in the RCP collector service at the primary site. [0162]
  • In operation, an application 404 can initialize the RCP collector service so that the RCP collector service will call back the application upon receipt of certain RCP commands from TCP port:80. For example, if a new connection command is received from TCP port:80, then the RCP daemon 403 forwards the new connection command to the RCP session manager. The RCP session manager 401 recognizes that this connection command is associated with an application 404 at the secondary site, opens an RCP pipe 405 to this application, and calls the application 404 indicating that the RCP pipe 405 has been opened for the RCP session. (The volume multicast receiver thread 377 of FIG. 27 is an example of such an application.) The application 404 returns an acknowledgement. If the new connection is for a new RCP session, then the session manager creates a new RCP session, and places state information for the new session on the RCP session queue 406. RCP control blocks and data may be received for the session from the TCP port:80. The data may be forwarded to the application, or to a file specified by the application. RCP control blocks to be executed by the RCP collector service 400 may be temporarily placed on the control block queue 407. RCP control blocks or data intended for other secondary sites may be forwarded to the intended secondary sites. [0163]
  • FIG. 30 shows further details of the forwarding of RCP commands and data by a data mover 430 identified as Cel2-Ser1. The data mover 430 is programmed with a TCP/IP layer 431 for communication with the IP network 220, and an RCP layer 432 over the TCP/IP layer. For forwarding the RCP commands and data, the RCP layer 432 creates an inbound session 433 and an outbound session 434. The inbound session 433 receives RCP commands from the TCP/IP layer 431. The TCP/IP data stream is retained in a data buffer 435. When an RCP command calls for the forwarding of RCP commands or data to another data mover in a specified RCP group, the inbound session 433 performs a lookup for the group in a routing table 436. [0164]
  • In the example of FIG. 30, the routing table 436 includes a copy of all of the routing information for each group of which the data mover 430 is a member. In this case, for GROUP1, the primary data mover sends RCP commands to at least data movers CEL2-SER1 and CEL9-SER1. CEL2-SER1 (i.e., the data mover 430) forwards the RCP commands and RCP data to data movers CEL3-SER1 and CEL7-SER1. In particular, the inbound session 433 creates an outbound session 434 and creates a TCP/IP data path from the inbound session 433 to the outbound session 434 by passing pointers to the data in the data buffer. The outbound session 434 invokes the TCP/IP layer 431 to multicast the TCP data stream in the data buffer 435 over the IP network 220 to the data movers CEL3-SER1 and CEL7-SER1. [0165]
  • The data mover CEL3-SER1 in succession forwards the RCP commands to data movers CEL4-SER1 and CEL5-SER1. Normally, the data mover CEL2-SER1 (430) does not need to know that the data mover CEL3-SER1 forwards the RCP commands to data movers CEL4-SER1 and CEL5-SER1, but if the data mover CEL2-SER1 (430) were to fail to receive an acknowledgement from CEL3-SER1, then the data mover CEL2-SER1 could minimize the impact of a failure of CEL3-SER1 by forwarding the RCP commands to CEL4-SER1 and CEL5-SER1 until the failure of CEL3-SER1 could be corrected. [0166]
  • FIG. 31 shows a flowchart of how the RCP collector service at the secondary site processes an inbound RCP session command. In a first step 411, the RCP collector service receives a session command. In step 412, if this session command is not a command to be forwarded to other secondary sites, then execution branches to step 413 to execute the action of the command, and the processing of the session command is finished. [0167]
  • In step 412, if the session command is a command to be forwarded to other secondary sites, then execution continues from step 412 to step 414. In step 414, the RCP collector service gets the RCP group name from the session command. Then, in step 415, the RCP collector service looks up the group name in the RCP routing table (408 in FIG. 29). If the group name is not found, then execution branches from step 416 to step 417. In step 417, the RCP collector service returns an error message to the sender of the session command. [0168]
  • In step 416, if the group name is found in the RCP routing table, then execution continues from step 416 to step 418. In step 418, the RCP collector service forwards the action of the session command to each secondary in the group that is an immediate destination of the forwarder (i.e., the data mover that is the secondary presently processing the RCP session command). This is done by instantiating local replication threads or creating outbound sessions for forwarding the action of the session command to each such immediate destination. After step 418, processing of the RCP session command is finished. [0169]
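  • (The FIG. 31 logic condenses to a short handler; the command attributes and the forward and execute callbacks below are assumed interfaces, not part of the patented embodiment.)

      def handle_session_command(command, routing_table, forward, execute):
          if not command.needs_forwarding:           # step 412
              execute(command)                       # step 413
              return
          routes = routing_table.get(command.group)  # steps 414 and 415
          if routes is None:                         # step 416
              raise LookupError(f"unknown RCP group {command.group}")  # step 417
          for destination in routes:                 # step 418
              forward(command, destination)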
  • FIG. 32 shows an example of forwarding and local replication. In this example, the IP network 220 connects a primary data mover 421 to a network file server 422 and a secondary data mover 423. The network file server 422 includes three data movers 424, 425, and 426, and storage 427. The primary data mover manages network access to a primary file system 428. The data mover 424 functions as a forwarder data mover. The data mover 425 functions as a secondary data mover managing access from the network to a secondary file system (copy A) 429. The data mover 426 functions as a secondary data mover managing access from the network to a secondary file system (copy B) 430. The data mover 423 manages network access to a secondary file system (copy C) 431. [0170]
  • In operation, when the primary data mover 421 updates the primary file system 428, it multicasts the modified logical blocks of the file system volume over the IP network 220 to the forwarder data mover 424 and to the secondary data mover 423. The forwarder data mover 424 receives the modified blocks, and performs a local replication of the blocks to cause the secondary data mover 425 to update the secondary file system (copy A) 429 and to cause the secondary data mover 426 to update the secondary file system (copy B) 430. [0171]
  • To perform the local replication, the forwarder data mover 424 has its volume multicast layer (323 in FIG. 23) save the modified blocks in a save volume 432 in the storage 427, and then the forwarder data mover 424 sends replication commands to the local secondary data movers 425 and 426. Each local secondary data mover 425, 426 has its playback module (328 in FIG. 23) replay the modifications from the save volume 432 into its respective secondary file system copy 429, 430. [0172]
  • FIG. 33 shows the sharing of the data mover's network TCP port:80 (440) between HTTP and RCP. This configuration is used in all data movers having the RCP collector service; i.e., primary, secondary, or forwarder. The TCP data channel from TCP port:80 (440) provides an in-order byte stream interface. IP packets 444 for HTTP connections and IP packets 445 for RCP connections from the network 220 are directed to the data mover's TCP port:80 (440). The TCP port:80 (440) is opened in both directions (i.e., input and output). In the input direction, the data mover uses a level 5 (L5) filter 441 for demultiplexing the IP packets for the HTTP connections from the IP packets for the RCP connections based on an initial segment of each TCP connection. The L5 filter hands the TCP connection off to either an HTTP collector service 442 or an RCP collector service 443. (The RCP collector service 443 is the collector service 390 in the RCP primary of FIG. 28 or the RCP collector service 400 in an RCP secondary of FIG. 29.) For example, if the initial segment of a TCP connection contains “HTTP/1.X”, then the L5 filter 441 directs the IP packets for the connection to the HTTP collector service 442. If the initial segment of the TCP connection contains “RCP/1.0”, then the IP packets for the TCP connection are directed to the RCP collector service 443. (In an alternative arrangement, the connection could be split as is done in a conventional standalone IP switch.) [0173]
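  • (The level 5 demultiplexing reduces to inspecting the first bytes of each new TCP connection, as in the sketch below; the literal prefixes follow the text above, and everything else is hypothetical.)

      def l5_demultiplex(initial_segment: bytes) -> str:
          """Route a new TCP connection on port 80 by its first segment."""
          if b"RCP/1.0" in initial_segment:
              return "rcp_collector"    # RCP traffic (service 443)
          if b"HTTP/1." in initial_segment:
              return "http_collector"   # ordinary web traffic (service 442)
          return "reject"               # neither protocol recognized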
  • In view of the above, there has been provided a method and system for wide-area distribution of read-only data over an IP network. Consistent updates are made automatically over the wide-area network, and concurrently with read-only access to the remote copies. A replication control protocol (RCP) is layered over TCP/IP providing the capability for a remote site to replicate and rebroadcast blocks of the remote copy data to specified groups of destinations, as configured in a routing table. A volume multicast layer over RCP provides for multicasting to specified volume extents of the blocks. The blocks are copied at the logical level, so that it does not matter what physical structure is used for storing the remote copies. Save volumes buffer the remote copy data transmitted between the primary or secondary file system volume and the IP network, in order to ensure independence between the replication process, the IP transport method, and the primary file system being replicated. The save volumes store the progress of the replication over the IP network so as to maintain the consistency of the replication process upon panic, reboot, and recovery. [0174]

Claims (62)

What is claimed is:
1. In a data processing system having a plurality of host computers linked by an Internet Protocol (IP) network to a plurality of data storage systems, each of the data storage systems having data storage and at least one data mover computer for moving data between the data storage and the IP network, a method of distributing remote copy data over the IP network from a primary one of the data mover computers to a plurality of secondary ones of the data mover computers, wherein the method comprises:
the primary data mover computer sending the remote copy data over the IP network to at least one forwarder data mover computer, and the forwarder data mover computer routing the remote copy data over the IP network to the plurality of secondary data mover computers.
2. The method as claimed in claim 1, wherein the forwarder data mover computer is programmed with a TCP/IP layer and a replication control protocol layer over the TCP/IP layer, and the replication control protocol layer produces an inbound session and an outbound session for referencing a routing table identifying destinations in the IP network for the secondary data mover computers and for retransmitting the remote copy data over the IP network from the forwarder data mover computer to the secondary data mover computers.
3. The method as claimed in claim 2, which further includes the forwarder data mover computer temporarily storing the remote copy data from the TCP/IP layer in a buffer, and the inbound session passing pointers to the remote copy data in the buffer to the outbound session for the retransmission of the remote copy data in the buffer over the IP network from the forwarder data mover computer to the secondary data mover computers.
4. The method as claimed in claim 1, wherein the primary data mover computer manages a primary file system in the data storage of the storage system including the primary data mover computer, each secondary data mover computer manages a secondary file system in the data storage of the data storage system including said each secondary data mover computer, and each secondary file system is maintained as a remote copy of the primary file system.
5. The method as claimed in claim 4, wherein the secondary data mover computers provide read-only access of at least some of the host computers to consistent views of the secondary file systems when the primary data mover computer provides concurrent write access to the primary file system.
6. The method as claimed in claim 4, wherein at least one save volume is used to buffer the remote copy data transmitted from the primary data mover computer to the forwarder data mover computer.
7. The method as claimed in claim 6, wherein the save volume stores a record of progress of the transmission of the remote copy data over the IP network from the primary data mover computer to the forwarder data mover computer in order to maintain consistency of the replication process upon panic, reboot, and recovery.
8. The method as claimed in claim 6, wherein the process of transporting the remote copy data over the IP network from the primary data mover computer to the forwarder data mover computer does not depend on any replication information in memory other than the information in the save volume, so as to permit the replication process to be started or terminated on any data mover computer for load shifting or load balancing.
9. The method as claimed in claim 6, wherein a timeout parameter and a high water mark parameter are used as triggers for sending sets of the remote copy data from the save volume over the IP network to the forwarder data mover computer.
10. The method as claimed in claim 4, which includes operating a replication service that transmits modifications of the primary file system to the secondary file systems, and then creating a new secondary file system by copying the primary file system to the new secondary file system concurrent with the operation of the replication service, and after the primary file system has been copied to the new secondary file system, updating the new secondary file system with modifications transmitted by the replication service from the primary file system.
11. The method as claimed in claim 10, wherein the replication service transmits modifications from the primary file system to a save volume during the copying of the primary file system to the new secondary file system, and upon completion of the copying of the primary file system to the new secondary file system, the modifications are copied from the save volume to the new secondary file system.
12. The method as claimed in claim 11, wherein the new secondary file system is at a remote site and the save volume is at the remote site, and the method includes transmitting at least a portion of the modifications of the primary file system to the save volume concurrently with the copying of the primary file system to the new secondary file system.
13. The method as claimed in claim 11, wherein the transmission of modifications of the primary file system to the new secondary file system includes transmitting delta sets of the modifications of the primary file system, and
the copying of the primary file system to the new secondary file system includes creating a snapshot copy of the primary file system when a new delta set begins, and creating the secondary file system from the snapshot copy of the primary file system, and
upon completion of the copying of the primary file system to the new secondary file system, the modifications are copied from the save volume to the new secondary file system beginning with the new delta set.
14. The method as claimed in claim 4, wherein the remote copy is a copy by extent, so that there is not a remote copy of data blocks that are not used in the primary file system.
15. The method as claimed in claim 14, wherein the data blocks are remote copied at a logical level, so that it does not matter what physical structure is used for storing the data blocks in the secondary file systems.
16. The method as claimed in claim 1, wherein only one TCP port to the IP network is used in the forwarder data mover computer for receiving and transmitting the remote copy data to and from the network, and the one TCP port is shared with HTTP connections.
17. The method as claimed in claim 1, wherein a TCP port to the IP network is used in the forwarder data mover computer for receiving the remote copy data from the network, the TCP port is shared with HTTP connections, a level 5 filter in the forwarder data mover computer passes IP packets of HTTP connections from the TCP port to an HTTP collector service in the forwarder data mover computer, and the level 5 filter passes IP packets of the remote copy data from the TCP port to a replication collector service in the forwarder data mover computer.
18. A data processing system comprising:
a plurality of data storage systems linked by an Internet Protocol (IP) network for access by a plurality of host computers, each of the storage systems having data storage and at least one data mover computer for moving data between the data storage and the IP network, the data mover computers including means for distributing remote copy data over the IP network from a primary one of the data mover computers to a plurality of secondary ones of the data mover computers by the primary data mover computer sending the remote copy data over the IP network to at least one forwarder data mover computer, and the forwarder data mover computer routing the remote copy data over the IP network to the plurality of secondary data mover computers.
19. The data processing system as claimed in claim 18, wherein the forwarder data mover computer is programmed with a TCP/IP layer and a replication control protocol layer over the TCP/IP layer, and the replication control protocol produces an inbound session and an outbound session for referencing a routing table identifying destinations in the IP network for the secondary data mover computers and for retransmitting the remote copy data over the IP network from the forwarder data mover computer to the secondary data mover computers.
20. The data processing system as claimed in claim 19, which further includes a buffer for temporarily storing the remote copy data from the TCP/IP layer, and the inbound session passes pointers to the remote copy data in the buffer to the outbound session for the retransmission of the remote copy data in the buffer over the IP network from the forwarder data mover computer to the secondary data mover computers.
21. The data processing system as claimed in claim 18, wherein the primary data mover computer includes means for managing a primary file system in the data storage of the storage system including the primary data mover computer, and each secondary data mover computer includes means for managing a secondary file system in the data storage of the data storage system including said each secondary data mover computer by maintaining each secondary file system as a remote copy of the primary file system.
22. The data processing system as claimed in claim 21, wherein the secondary data mover computers include means for providing read-only access of at least some of the host computers to consistent views of the secondary file systems when the primary data mover computer provides concurrent write access to the primary file system.
23. The data processing system as claimed in claim 21, which includes at least one save volume for buffering the remote copy data transmitted from the primary data mover computer to the forwarder data mover computer.
24. The data processing system as claimed in claim 23, which includes means for recording, in the save volume, a record of progress of the transmission of the remote copy data over the IP network from the primary data mover computer to the forwarder data mover computer in order to maintain consistency of the replication process upon panic, reboot, and recovery.
25. The data processing system as claimed in claim 23, which includes means for transporting the remote copy data over the IP network from the primary data mover computer to the forwarder data mover computer without dependence on any replication information in memory other than the information in the save volume, so as to permit the replication process to be started or terminated on any data mover computer for load shifting or load balancing.
26. The data processing system as claimed in claim 23, which includes means for using a timeout parameter and a high water mark parameter as triggers for sending sets of the remote copy data from the save volume over the IP network to the forwarder data mover computer.
27. The data processing system as claimed in claim 21, which includes a replication service for transmitting modifications of the primary file system to the secondary file systems, and means for creating a new secondary file system by copying the primary file system to the new secondary file system concurrent with the operation of the replication service, and after the primary file system has been copied to the new secondary file system, updating the new secondary file system with modifications transmitted by the replication service from the primary file system.
28. The data processing system as claimed in claim 27, wherein the replication service transmits modifications from the primary file system to a save volume during the copying of the primary file system to the new secondary file system, and upon completion of the copying of the primary file system to the new secondary file system, the modifications are copied from the save volume to the new secondary file system.
29. The data processing system as claimed in claim 28, wherein the new secondary file system is at a remote site and the save volume is at the remote site, and the replication service transmits at least a portion of the modifications of the primary file system to the save volume concurrently with the copying of the primary file system to the new secondary file system.
30. The data processing system as claimed in claim 28, wherein the transmission of modifications of the primary file system to the new secondary file system includes transmitting delta sets of the modifications of the primary file system, and
the copying of the primary file system to the new secondary file system includes creating a snapshot copy of the primary file system when a new delta set begins, and creating the new secondary file system from the snapshot copy of the primary file system, and
upon completion of the copying of the primary file system to the new secondary file system, the modifications are copied from the save volume to the new secondary file system beginning with the new delta set.
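Claims 28 through 30 order the steps for adding a secondary file system without pausing replication: freeze a snapshot of the primary at a delta-set boundary, copy the snapshot to the new secondary, let the delta sets that accumulate meanwhile collect in the save volume, and then play them back starting with the delta set that opened when the snapshot was taken. A compressed sketch of that sequence, with invented helper names (snapshot, full_copy, deltas_from, apply), follows.

    def create_secondary(primary, secondary, save_volume, replication):
        # 1. At the start of a new delta set, freeze a point-in-time view.
        first_delta = replication.current_delta_number() + 1
        snap = primary.snapshot()
        # 2. Copy the snapshot to the new secondary while the replication
        #    service keeps appending delta sets to the save volume.
        secondary.full_copy(snap)
        # 3. Catch up: apply every delta set recorded since the snapshot,
        #    beginning with the one that began when the snapshot was taken.
        for delta in save_volume.deltas_from(first_delta):
            secondary.apply(delta)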
31. The data processing system as claimed in claim 21, wherein the remote copy is a copy by extent, so that no remote copy is made of data blocks that are not used in the primary file system.
32. The data processing system as claimed in claim 31, wherein the data blocks are remote copied at a logical level, so that it does not matter what physical structure is used for storing the data blocks in the secondary file systems.
33. The data processing system as claimed in claim 18, wherein the forwarder data mover computer includes only one TCP port to the IP network for receiving and transmitting the remote copy data to and from the network, the one TCP port also servicing HTTP connections.
34. The data processing system as claimed in claim 18, wherein the forwarder data mover computer further includes an HTTP collector service, a replication collector service, a TCP port to the IP network for receiving IP packets of the remote copy data from the network and for receiving IP packets of HTTP connections, and a level 5 filter for passing the IP packets of the HTTP connections from the TCP port to the HTTP collector service and passing the IP packets of the remote copy data from the TCP port to the replication collector service.
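Claims 33 and 34 put HTTP and replication traffic on one TCP port, with a session-level ("level 5") filter steering each connection to the proper collector service. One plausible way to classify a connection is to inspect its opening bytes; the sketch below keys on HTTP method tokens, which is an assumption, since the claims do not specify what the filter examines.

    HTTP_TOKENS = (b"GET ", b"PUT ", b"POST", b"HEAD", b"DELE", b"OPTI")

    def level5_filter(first_bytes, http_collector, replication_collector):
        """Steer a new connection on the shared TCP port by its opening bytes."""
        if first_bytes[:4] in HTTP_TOKENS or first_bytes.startswith(b"HTTP"):
            http_collector(first_bytes)         # IP packets of an HTTP connection
        else:
            replication_collector(first_bytes)  # IP packets of remote copy data

    level5_filter(b"GET /index.html HTTP/1.1", print, print)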
35. A server for an Internet Protocol (IP) network, the server being programmed with a routing table, a TCP/IP layer, and a replication control protocol (RCP) session layer over the TCP/IP layer, the routing table identifying destinations in the network for remote copy data, and the replication control protocol session layer being programmed to produce an inbound session in response to the server receiving remote copy data from a source in the IP network, and an outbound session for transmitting the remote copy data to a plurality of destinations identified in the routing table as destinations for the remote copy data from the source.
36. The server as claimed in claim 35, which includes data storage for storing a file of the remote copy data, the server being programmed as a network file server for responding to file access requests from host computers in the IP network.
37. The server as claimed in claim 35, which further includes a buffer for temporarily storing the remote copy data from the TCP/IP layer, the inbound session passing pointers to the remote copy data in the buffer to the outbound session for the retransmission of the remote copy data in the buffer over the IP network from the server to the destinations.
38. The server as claimed in claim 35, which further includes only one TCP port to the IP network for receiving and transmitting the remote copy data to and from the network, the one TCP port also servicing HTTP connections.
39. The server as claimed in claim 35, which further includes an HTTP collector service, a TCP port to the IP network for receiving IP packets of the remote copy data from the network and for receiving IP packets of HTTP connections, and a level 5 filter for passing the IP packets of the HTTP connections from the TCP port to the HTTP collector service and passing the IP packets of the remote copy data from the TCP port to the RCP layer.
40. A primary data storage system for distributing remote copy data over an Internet Protocol (IP) network to at least one secondary data storage system in the IP network, the primary data storage system including data storage and a data mover computer for moving data between the IP network and the data storage, wherein the data storage includes a primary volume including a primary copy of the remote copy data, and a save volume used as a buffer between the primary volume and the IP network, and the data mover computer is programmed with a TCP/IP layer, a replication control protocol (RCP) layer over the TCP/IP layer for transmitting blocks of data from the save volume over the IP network, and a replication module for writing modified blocks of the primary volume to the save volume.
41. The primary data storage system as claimed in claim 40, wherein the data mover computer is programmed with a volume multicast layer over the RCP layer for transmitting volume extents of the data blocks from the save volume to the RCP layer.
42. The primary data storage system as claimed in claim 40, wherein the data mover computer is programmed with a copy send-thread for copying blocks from the primary volume to the RCP layer.
43. The primary data storage system as claimed in claim 42, wherein the data mover computer is programmed with a volume multicast layer over the RCP layer for transmitting volume extents of the data blocks from the copy send-thread to the RCP layer.
44. The primary data storage system as claimed in claim 42, wherein the copy send-thread maintains a bit map of copied blocks, and the replication module maintains a log of copied blocks that have been modified since they were copied by the copy send-thread.
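Claim 44 pairs a bit map kept by the copy send-thread (recording which blocks have already been sent) with a log kept by the replication module (recording already-copied blocks that were modified afterwards and so must be sent again). A minimal sketch follows; one byte per block stands in for a bit, and the names (CopyTracker, remod_log) are invented.

    class CopyTracker:
        def __init__(self, nblocks):
            self.copied = bytearray(nblocks)   # 1 = block already copied
            self.remod_log = []                # copied blocks modified afterwards

        def mark_copied(self, blk):
            """Copy send-thread: record that a block has been transmitted."""
            self.copied[blk] = 1

        def on_write(self, blk):
            """Replication module: log a write only if the block was already
            copied; a block not yet copied is picked up by the ongoing copy."""
            if self.copied[blk]:
                self.remod_log.append(blk)

    t = CopyTracker(8)
    t.mark_copied(3)
    t.on_write(3)            # modified after copy, so it must be resent
    t.on_write(5)            # not yet copied; the copy thread sends it anyway
    print(t.remod_log)       # [3]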
45. The primary data storage system as claimed in claim 40, wherein the replication module maintains a log indicating modified blocks of the primary volume, and repetitively accesses the log to copy the modified blocks from the primary volume to the save volume.
46. The primary data storage system as claimed in claim 45, wherein the replication module copies the modified blocks from the primary volume to the save volume when a timeout interval expires and when the number of modified blocks not yet copied from the primary volume to the save volume exceeds a threshold.
47. The primary data storage system as claimed in claim 40, wherein the RCP layer includes:
a session manager for managing a session queue;
collector and worker threads; and
a single-thread daemon for access to and from a TCP port.
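By way of illustration, the division of labor in claim 47 can be read as: one daemon thread alone touches the TCP port, a collector thread frames what the daemon receives into session messages, and worker threads execute the queued session work. The thread-and-queue sketch below reflects that reading; the wiring and all names are assumptions.

    import queue
    import threading

    def start_rcp_layer(recv, handle, n_workers=4):
        """Start the threads of the RCP layer (names and wiring invented)."""
        port_q = queue.Queue()      # raw data from the TCP port
        session_q = queue.Queue()   # session queue consumed by the workers

        def port_daemon():          # single thread with exclusive TCP-port access
            while True:
                port_q.put(recv())

        def collector():            # frames port data into session messages
            while True:
                session_q.put(port_q.get())

        def worker():               # executes queued session work
            while True:
                handle(session_q.get())

        for fn in [port_daemon, collector] + [worker] * n_workers:
            threading.Thread(target=fn, daemon=True).start()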
48. The primary data storage system as claimed in claim 40, wherein the RCP layer accesses only one TCP port to the IP network for transmitting the remote copy data to the network, the one TCP port also servicing HTTP connections.
49. The primary data storage system as claimed in claim 48, which further includes an HTTP collector service, a TCP port to the IP network for receiving IP packets of replication connections and for receiving IP packets of HTTP connections, and a level 5 filter for passing the IP packets of the HTTP connections from the TCP port to the HTTP collector service and passing the IP packets of the replication connections from the TCP port to the RCP layer.
50. A secondary data storage system for receiving remote copy data distributed over an Internet Protocol (IP) network from a primary data storage system, the remote copy data including modified blocks of a primary volume in the primary data storage system, the secondary data storage system including data storage and a data mover computer for moving data between the IP network and the data storage, wherein the data storage includes a secondary volume including a secondary copy of the primary volume, and a save volume used as a buffer between the IP network and the secondary volume for buffering the modified blocks in the remote copy data, and wherein the data mover computer is programmed with a TCP/IP layer, a replication control protocol (RCP) layer over the TCP/IP layer for transmitting the modified blocks of remote copy data from the IP network to the save volume, and a playback module for writing the modified blocks of the remote copy data from the save volume to the secondary volume.
51. The secondary data storage system as claimed in claim 50, wherein the data mover computer is programmed with a volume multicast layer over the RCP layer for transmitting volume extents of the modified blocks in the remote copy data from the RCP layer to the save volume.
52. The secondary data storage system as claimed in claim 50, wherein the data mover computer is programmed with a volume multicast layer over the RCP layer for transmitting volume extents of the data blocks from the RCP layer to the secondary volume during a remote copy from the primary volume to the secondary volume before transmission of the modified blocks of remote copy data from the primary storage system to the secondary storage system.
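On the secondary side (claims 50 to 52), the save volume decouples network reception from playback: the RCP layer appends incoming modified blocks to the save volume, and the playback module later applies them to the secondary volume in arrival order. A minimal sketch, assuming a (block number, data) record format:

    def receive_into_save_volume(rcp_stream, save_volume):
        """RCP layer: buffer each (block number, data) record as it arrives."""
        for blk, data in rcp_stream:
            save_volume.append((blk, data))

    def playback(save_volume, secondary_volume):
        """Playback module: apply buffered modifications to the secondary copy."""
        while save_volume:
            blk, data = save_volume.pop(0)     # oldest modification first
            secondary_volume[blk] = data

    save, secondary = [], {}
    receive_into_save_volume([(7, b"new"), (2, b"data")], save)
    playback(save, secondary)
    print(secondary)                           # {7: b'new', 2: b'data'}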
53. The secondary data storage system as claimed in claim 50, wherein the RCP layer includes:
a session manager for managing a session queue;
collector and worker threads; and
a single-thread daemon for access to and from a TCP port.
54. The secondary data storage system as claimed in claim 50, wherein the RCP layer accesses only one TCP port to the IP network for receiving the remote copy data from the network, the one TCP port also servicing HTTP connections.
55. The secondary data storage system as claimed in claim 50, which further includes an HTTP collector service, a TCP port to the IP network for receiving IP packets of replication connections and for receiving IP packets of HTTP connections, and a level 5 filter for passing the IP packets of the HTTP connections from the TCP port to the HTTP collector service and passing the IP packets of the replication connections from the TCP port to the RCP layer.
56. A network file server for use in an Internet Protocol (IP) network, the network file server having data storage including a file system volume for storing a file system, and a TCP port for connection to the IP network to permit access from the IP network to the file system, the network file server being programmed with a series of protocol layers including:
a TCP/IP layer for access to the IP network through the TCP port in accordance with the standard Transmission Control Protocol;
a replication control protocol (RCP) session layer over the TCP/IP layer for transmission, forwarding, and reception of blocks of remote copy data in accordance with a replication control protocol in which the blocks of remote copy data are transmitted and forwarded to specified groups of destinations in the IP network, the network file server having a routing table configured with the groups of destinations, and the RCP layer accessing the routing table to determine the destinations in the specified groups for transmission or forwarding; and
a volume multicast layer over the RCP layer for transmission or reception of specified volume extents of blocks between the file system volume and the IP network.
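Claim 56 stacks three layers: TCP/IP at the bottom, an RCP session layer that resolves a destination group through the routing table, and a volume multicast layer that deals in volume extents (a starting block plus a count). The sketch below shows how an extent transmission might thread down through the layers; the function names, the 8192-byte block size, and the flat byte-string volume are illustrative assumptions.

    def rcp_send(routing_table, group, payload, tcp_send):
        """RCP layer: fan one block of remote copy data out to a destination group."""
        for dest in routing_table[group]:
            tcp_send(dest, payload)

    def volume_multicast(volume, first_block, count, group, routing_table,
                         tcp_send, block_size=8192):
        """Volume multicast layer: read an extent and hand each block to RCP."""
        for blk in range(first_block, first_block + count):
            data = volume[blk * block_size:(blk + 1) * block_size]
            rcp_send(routing_table, group, data, tcp_send)

    table = {"fs-group": ["secondary-a", "secondary-b"]}
    volume_multicast(bytes(4 * 8192), 0, 2, "fs-group", table,
                     lambda dest, data: print(dest, len(data)))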
57. The network file server as claimed in claim 56, wherein the network file server is further programmed to perform a remote volume copy operation by copying volume extents of the file system between the file system volume and the volume multicast layer.
58. The network file server as claimed in claim 56, wherein the network file server further includes a save volume, and the network file server is further programmed to operate in a remote replication mode by transferring modifications of the file system between the file system volume and the volume multicast layer, wherein the modifications are temporarily stored in the save volume to buffer the modifications as the modifications are transferred between the file system volume and the volume multicast layer.
59. The network file server as claimed in claim 56, wherein the network file server includes more than one data mover computer for moving data from the data storage to the IP network, each data mover computer manages access to a respective remote copy of a file system in the data storage, and the network file server is programmed to perform local replication of file system modifications received from the IP network by updating each remote copy of the file system with the file system modifications.
60. The network file server as claimed in claim 56, wherein the RCP layer includes:
a session manager for managing a session queue;
collector and worker threads; and
a single-thread daemon for access to and from a TCP port.
61. The network file server as claimed in claim 56, wherein the RCP layer accesses only one TCP port to the IP network for receiving the remote copy data from the network, the one TCP port also servicing HTTP connections.
62. The network file server as claimed in claim 56, which further includes an HTTP collector service, a TCP port to the IP network for receiving IP packets of replication connections and for receiving IP packets of HTTP connections, and a level 5 filter for passing the IP packets of the HTTP connections from the TCP port to the HTTP collector service and passing the IP packets of the replication connections from the TCP port to the RCP layer.
US10/147,751 2002-05-16 2002-05-16 Replication of remote copy data for internet protocol (IP) transmission Active 2026-05-04 US7546364B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/147,751 US7546364B2 (en) 2002-05-16 2002-05-16 Replication of remote copy data for internet protocol (IP) transmission

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/147,751 US7546364B2 (en) 2002-05-16 2002-05-16 Replication of remote copy data for internet protocol (IP) transmission

Publications (2)

Publication Number Publication Date
US20030217119A1 true US20030217119A1 (en) 2003-11-20
US7546364B2 US7546364B2 (en) 2009-06-09

Family

ID=29419098

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/147,751 Active 2026-05-04 US7546364B2 (en) 2002-05-16 2002-05-16 Replication of remote copy data for internet protocol (IP) transmission

Country Status (1)

Country Link
US (1) US7546364B2 (en)

Cited By (124)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030041220A1 (en) * 2001-08-23 2003-02-27 Pavel Peleska System and method for establishing consistent memory contents in redundant systems
US20030204597A1 (en) * 2002-04-26 2003-10-30 Hitachi, Inc. Storage system having virtualized resource
US20030225760A1 (en) * 2002-05-30 2003-12-04 Jarmo Ruuth Method and system for processing replicated transactions parallel in secondary server
US20040006587A1 (en) * 2002-07-02 2004-01-08 Dell Products L.P. Information handling system and method for clustering with internal cross coupled storage
US20040098637A1 (en) * 2002-11-15 2004-05-20 Duncan Kurt A. Apparatus and method for enhancing data availability by leveraging primary/backup data storage volumes
US20040153717A1 (en) * 2002-11-07 2004-08-05 Duncan Kurt A. Apparatus and method for enhancing data availability by implementing inter-storage-unit communication
US6779093B1 (en) * 2002-02-15 2004-08-17 Veritas Operating Corporation Control facility for processing in-band control messages during data replication
US20040181640A1 (en) * 2003-03-11 2004-09-16 International Business Machines Corporation Method, system, and program for improved throughput in remote mirroring systems
US20040236915A1 (en) * 2001-11-20 2004-11-25 Hitachi, Ltd. Multiple data management method, computer and storage device therefor
US20040243651A1 (en) * 2003-05-27 2004-12-02 Legato Systems, Inc. System and method for transfering data from a source machine to a target machine
US20040267836A1 (en) * 2003-06-25 2004-12-30 Philippe Armangau Replication of snapshot using a file system copy differential
US20050021690A1 (en) * 2003-02-27 2005-01-27 Prasad Peddada System and method for communications between servers in a cluster
US20050076091A1 (en) * 2003-09-11 2005-04-07 Duncan Missimer Data mirroring
US20050076070A1 (en) * 2003-10-02 2005-04-07 Shougo Mikami Method, apparatus, and computer readable medium for managing replication of back-up object
US20050114740A1 (en) * 2003-11-20 2005-05-26 International Business Machines (Ibm) Corporation Concurrent PPRC/FCP and host access to secondary PPRC/FCP device through independent error management
US20050157756A1 (en) * 2004-01-21 2005-07-21 John Ormond Database block network attached storage packet joining
US20050166018A1 (en) * 2004-01-28 2005-07-28 Kenichi Miki Shared/exclusive control scheme among sites including storage device system shared by plural high-rank apparatuses, and computer system equipped with the same control scheme
US20050193245A1 (en) * 2004-02-04 2005-09-01 Hayden John M. Internet protocol based disaster recovery of a server
US20050204106A1 (en) * 2004-02-27 2005-09-15 Richard Testardi Distributed asynchronous ordered replication
US20050216502A1 (en) * 2004-03-26 2005-09-29 Oracle International Corporation Method of providing shared objects and node-specific objects in a cluster file system
US20050240813A1 (en) * 2004-04-08 2005-10-27 Wataru Okada Restore method for backup
US20050246576A1 (en) * 2004-03-30 2005-11-03 Masaaki Takayama Redundant system utilizing remote disk mirroring technique, and initialization method for remote disk mirroring for in the system
US20050267928A1 (en) * 2004-05-11 2005-12-01 Anderson Todd J Systems, apparatus and methods for managing networking devices
US20050289197A1 (en) * 2004-06-25 2005-12-29 Nec Corporation Replication system having the capability to accept commands at a standby-system site before completion of updating thereof
US20060015499A1 (en) * 2004-07-13 2006-01-19 International Business Machines Corporation Method, data processing system, and computer program product for sectional access privileges of plain text files
US20060064536A1 (en) * 2004-07-21 2006-03-23 Tinker Jeffrey L Distributed storage architecture based on block map caching and VFS stackable file system modules
US20060080362A1 (en) * 2004-10-12 2006-04-13 Lefthand Networks, Inc. Data Synchronization Over a Computer Network
US7035881B2 (en) 2003-09-23 2006-04-25 Emc Corporation Organization of read-write snapshot copies in a data storage system
US7039661B1 (en) * 2003-12-29 2006-05-02 Veritas Operating Corporation Coordinated dirty block tracking
US20060143412A1 (en) * 2004-12-28 2006-06-29 Philippe Armangau Snapshot copy facility maintaining read performance and write performance
US7076690B1 (en) 2002-04-15 2006-07-11 Emc Corporation Method and apparatus for managing access to volumes of storage
US7111138B2 (en) 2003-09-16 2006-09-19 Hitachi, Ltd. Storage system and storage control device
US20060230148A1 (en) * 2005-04-06 2006-10-12 John Forecast TCP forwarding of client requests of high-level file and storage access protocols in a network file server system
US20060271815A1 (en) * 2005-05-31 2006-11-30 Kazuhiko Mizuno System and method for disaster recovery of data
US20070024919A1 (en) * 2005-06-29 2007-02-01 Wong Chi M Parallel filesystem traversal for transparent mirroring of directories and files
US20070027896A1 (en) * 2005-07-28 2007-02-01 International Business Machines Corporation Session replication
US20070038697A1 (en) * 2005-08-03 2007-02-15 Eyal Zimran Multi-protocol namespace server
US20070055703A1 (en) * 2005-09-07 2007-03-08 Eyal Zimran Namespace server using referral protocols
US20070067664A1 (en) * 2005-09-20 2007-03-22 International Business Machines Corporation Failure transparency for update applications under single-master configuration
US20070088930A1 (en) * 2005-10-18 2007-04-19 Jun Matsuda Storage control system and storage control method
US20070088702A1 (en) * 2005-10-03 2007-04-19 Fridella Stephen A Intelligent network client for multi-protocol namespace redirection
US20070118698A1 (en) * 2005-11-18 2007-05-24 Lafrese Lee C Priority scheme for transmitting blocks of data
US20070136389A1 (en) * 2005-11-29 2007-06-14 Milena Bergant Replication of a consistency group of data storage objects from servers in a data network
US7249227B1 (en) * 2003-12-29 2007-07-24 Network Appliance, Inc. System and method for zero copy block protocol write operations
US20070186037A1 (en) * 2003-12-30 2007-08-09 Wibu-Systems Ag Method for controlling a data processing device
US20070195750A1 (en) * 2006-02-21 2007-08-23 Lehman Brothers Inc. System and method for network traffic splitting
US7263590B1 (en) 2003-04-23 2007-08-28 Emc Corporation Method and apparatus for migrating data in a computer system
US7275177B2 (en) 2003-06-25 2007-09-25 Emc Corporation Data recovery with internet protocol replication with or without full resync
US7370025B1 (en) * 2002-12-17 2008-05-06 Symantec Operating Corporation System and method for providing access to replicated data
US20080155316A1 (en) * 2006-10-04 2008-06-26 Sitaram Pawar Automatic Media Error Correction In A File Server
US7415591B1 (en) 2003-04-23 2008-08-19 Emc Corporation Method and apparatus for migrating data and automatically provisioning a target for the migration
US20080208923A1 (en) * 2007-01-10 2008-08-28 Satoru Watanabe Method for verifying data consistency of backup system, program and storage medium
US7546482B2 (en) 2002-10-28 2009-06-09 Emc Corporation Method and apparatus for monitoring the storage of data in a computer system
US20090217104A1 (en) * 2008-02-26 2009-08-27 International Business Machines Corpration Method and apparatus for diagnostic recording using transactional memory
US7653699B1 (en) * 2003-06-12 2010-01-26 Symantec Operating Corporation System and method for partitioning a file system for enhanced availability and scalability
US7707151B1 (en) 2002-08-02 2010-04-27 Emc Corporation Method and apparatus for migrating data
US7769722B1 (en) 2006-12-08 2010-08-03 Emc Corporation Replication and restoration of multiple data storage object types in a data network
US7805583B1 (en) * 2003-04-23 2010-09-28 Emc Corporation Method and apparatus for migrating data in a clustered computer system environment
US7882061B1 (en) * 2006-12-21 2011-02-01 Emc Corporation Multi-thread replication across a network
US7953819B2 (en) 2003-08-22 2011-05-31 Emc Corporation Multi-protocol sharable virtual storage objects
US7987157B1 (en) * 2003-07-18 2011-07-26 Symantec Operating Corporation Low-impact refresh mechanism for production databases
US20110214130A1 (en) * 2010-02-26 2011-09-01 Yoshihiko Nishihata Data processing system, data processing method, and data processing program
US8099572B1 (en) 2008-09-30 2012-01-17 Emc Corporation Efficient backup and restore of storage objects in a version set
US8433864B1 (en) * 2008-06-30 2013-04-30 Symantec Corporation Method and apparatus for providing point-in-time backup images
US8438247B1 (en) * 2010-12-21 2013-05-07 Amazon Technologies, Inc. Techniques for capturing data sets
US20130117744A1 (en) * 2011-11-03 2013-05-09 Ocz Technology Group, Inc. Methods and apparatus for providing hypervisor-level acceleration and virtualization services
US20130159645A1 (en) * 2011-12-15 2013-06-20 International Business Machines Corporation Data selection for movement from a source to a target
US8706833B1 (en) 2006-12-08 2014-04-22 Emc Corporation Data storage server having common replication architecture for multiple storage object types
US20140149350A1 (en) * 2012-11-27 2014-05-29 International Business Machines Corporation Remote Replication in a Storage System
US20140317059A1 (en) * 2005-06-24 2014-10-23 Catalogic Software, Inc. Instant data center recovery
US20140331018A1 (en) * 2013-05-02 2014-11-06 Bull Sas Method and device for saving data in an it infrastructure offering activity resumption functions
US20140344267A1 (en) * 2013-05-17 2014-11-20 Go Daddy Operating Company, LLC Storing, Accessing and Restoring Website Content via a Website Repository
US20140365824A1 (en) * 2012-01-20 2014-12-11 Tencent Technology (Shenzhen) Company Limited Method for recovering hard disk data, server and distributed storage system
US8931107B1 (en) 2011-08-30 2015-01-06 Amazon Technologies, Inc. Techniques for generating block level data captures
US20150269183A1 (en) * 2014-03-19 2015-09-24 Red Hat, Inc. File replication using file content location identifiers
US20160018995A1 (en) * 2014-07-17 2016-01-21 Lsi Corporation Raid system for processing i/o requests utilizing xor commands
US20160092536A1 (en) * 2014-09-30 2016-03-31 International Business Machines Corporation Hybrid data replication
WO2016107013A1 (en) * 2014-12-30 2016-07-07 中兴通讯股份有限公司 Transmission processing and remote processing method and apparatus and computer storage medium
US20160266830A1 (en) * 2013-12-13 2016-09-15 Netapp Inc. Techniques for importation of information to a storage system
US9678981B1 (en) * 2010-05-03 2017-06-13 Panzura, Inc. Customizing data management for a distributed filesystem
US9779105B1 (en) * 2014-03-31 2017-10-03 EMC IP Holding Company LLC Transaction logging using file-system-specific log files
US9805054B2 (en) 2011-11-14 2017-10-31 Panzura, Inc. Managing a global namespace for a distributed filesystem
US9804928B2 (en) 2011-11-14 2017-10-31 Panzura, Inc. Restoring an archived file in a distributed filesystem
US9811532B2 (en) 2010-05-03 2017-11-07 Panzura, Inc. Executing a cloud command for a distributed filesystem
US9817703B1 (en) * 2013-12-04 2017-11-14 Amazon Technologies, Inc. Distributed lock management using conditional updates to a distributed key value data store
US9852150B2 (en) 2010-05-03 2017-12-26 Panzura, Inc. Avoiding client timeouts in a distributed filesystem
US20180024762A1 (en) * 2016-07-22 2018-01-25 International Business Machines Corporation Data access management in distributed computer storage environments
US9965505B2 (en) 2014-03-19 2018-05-08 Red Hat, Inc. Identifying files in change logs using file content location identifiers
US10025808B2 (en) 2014-03-19 2018-07-17 Red Hat, Inc. Compacting change logs using file content location identifiers
US10133767B1 (en) 2015-09-28 2018-11-20 Amazon Technologies, Inc. Materialization strategies in journal-based databases
US10198346B1 (en) * 2015-09-28 2019-02-05 Amazon Technologies, Inc. Test framework for applications using journal-based databases
US20190079836A1 (en) * 2015-12-21 2019-03-14 Intel Corporation Predictive memory maintenance
US10235091B1 (en) * 2016-09-23 2019-03-19 EMC IP Holding Company LLC Full sweep disk synchronization in a storage system
US10250579B2 (en) * 2013-08-13 2019-04-02 Alcatel Lucent Secure file transfers within network-based storage
US10255291B1 (en) * 2009-06-29 2019-04-09 EMC IP Holding Company LLC Replication of volumes using partial volume split
US10296422B1 (en) * 2015-01-31 2019-05-21 Veritas Technologies Llc Low cost, heterogeneous method of transforming replicated data for consumption in the cloud
US10331657B1 (en) 2015-09-28 2019-06-25 Amazon Technologies, Inc. Contention analysis for journal-based databases
US10346260B1 (en) * 2015-09-30 2019-07-09 EMC IP Holding Company LLC Replication based security
US10509707B1 (en) * 2016-12-15 2019-12-17 EMC IP Holding Company LLC Selective data mirroring
US20200145359A1 (en) * 2007-05-22 2020-05-07 International Business Machines Corporation Handling large messages via pointer and log
US10706072B2 (en) * 2013-12-12 2020-07-07 Huawei Technologies Co., Ltd. Data replication method and storage system
US10747632B2 (en) * 2017-08-11 2020-08-18 T-Mobile Usa, Inc. Data redundancy and allocation system
US11016941B2 (en) 2014-02-28 2021-05-25 Red Hat, Inc. Delayed asynchronous file replication in a distributed file system
US11146626B2 (en) * 2018-11-01 2021-10-12 EMC IP Holding Company LLC Cloud computing environment with replication system configured to reduce latency of data read access
US20220103490A1 (en) * 2020-09-28 2022-03-31 Vmware, Inc. Accessing multiple external storages to present an emulated local storage through a nic
US11349917B2 (en) 2020-07-23 2022-05-31 Pure Storage, Inc. Replication handling among distinct networks
US11442652B1 (en) 2020-07-23 2022-09-13 Pure Storage, Inc. Replication handling during storage system transportation
US20230050536A1 (en) * 2021-02-25 2023-02-16 Pure Storage, Inc. Synchronous Workload Optimization
US11593278B2 (en) 2020-09-28 2023-02-28 Vmware, Inc. Using machine executing on a NIC to access a third party storage not supported by a NIC or host
US11606310B2 (en) 2020-09-28 2023-03-14 Vmware, Inc. Flow processing offload using virtual port identifiers
US11614923B2 (en) 2020-04-30 2023-03-28 Splunk Inc. Dual textual/graphical programming interfaces for streaming data processing pipelines
US11615084B1 (en) 2018-10-31 2023-03-28 Splunk Inc. Unified data processing across streaming and indexed data sets
US11636053B2 (en) 2020-09-28 2023-04-25 Vmware, Inc. Emulating a local storage by accessing an external storage through a shared port of a NIC
US11636116B2 (en) 2021-01-29 2023-04-25 Splunk Inc. User interface for customizing data streams
US11645286B2 (en) 2018-01-31 2023-05-09 Splunk Inc. Dynamic data processor for streaming and batch queries
US11663219B1 (en) 2021-04-23 2023-05-30 Splunk Inc. Determining a set of parameter values for a processing pipeline
US11687487B1 (en) * 2021-03-11 2023-06-27 Splunk Inc. Text files updates to an active processing pipeline
US11727039B2 (en) 2017-09-25 2023-08-15 Splunk Inc. Low-latency streaming analytics
US20230259285A1 (en) * 2022-02-16 2023-08-17 T-Mobile Usa, Inc. Preventing data loss in a filesystem by creating duplicates of data in parallel, such as charging data in a wireless telecommunications network
US11829793B2 (en) 2020-09-28 2023-11-28 Vmware, Inc. Unified management of virtual machines and bare metal computers
US11863376B2 (en) 2021-12-22 2024-01-02 Vmware, Inc. Smart NIC leader election
US11886440B1 (en) 2019-07-16 2024-01-30 Splunk Inc. Guided creation interface for streaming data processing pipelines
US11899594B2 (en) 2022-06-21 2024-02-13 VMware LLC Maintenance of data message classification cache on smart NIC
US11922026B2 (en) * 2022-02-16 2024-03-05 T-Mobile Usa, Inc. Preventing data loss in a filesystem by creating duplicates of data in parallel, such as charging data in a wireless telecommunications network

Families Citing this family (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8266406B2 (en) 2004-04-30 2012-09-11 Commvault Systems, Inc. System and method for allocation of organizational resources
CA2564967C (en) 2004-04-30 2014-09-30 Commvault Systems, Inc. Hierarchical systems and methods for providing a unified view of storage information
US8055724B2 (en) * 2005-03-21 2011-11-08 Emc Corporation Selection of migration methods including partial read restore in distributed storage management
US7958322B2 (en) * 2005-10-25 2011-06-07 Waratek Pty Ltd Multiple machine architecture with overhead reduction
US8271548B2 (en) 2005-11-28 2012-09-18 Commvault Systems, Inc. Systems and methods for using metadata to enhance storage operations
US20110010518A1 (en) 2005-12-19 2011-01-13 Srinivas Kavuri Systems and Methods for Migrating Components in a Hierarchical Storage Network
US8930496B2 (en) 2005-12-19 2015-01-06 Commvault Systems, Inc. Systems and methods of unified reconstruction in storage systems
US20200257596A1 (en) 2005-12-19 2020-08-13 Commvault Systems, Inc. Systems and methods of unified reconstruction in storage systems
US8572330B2 (en) 2005-12-19 2013-10-29 Commvault Systems, Inc. Systems and methods for granular resource management in a storage network
US7651593B2 (en) 2005-12-19 2010-01-26 Commvault Systems, Inc. Systems and methods for performing data replication
CA2632935C (en) 2005-12-19 2014-02-04 Commvault Systems, Inc. Systems and methods for performing data replication
US7606844B2 (en) 2005-12-19 2009-10-20 Commvault Systems, Inc. System and method for performing replication copy storage operations
US8661216B2 (en) 2005-12-19 2014-02-25 Commvault Systems, Inc. Systems and methods for migrating components in a hierarchical storage network
US8726242B2 (en) 2006-07-27 2014-05-13 Commvault Systems, Inc. Systems and methods for continuous data replication
US9465823B2 (en) * 2006-10-19 2016-10-11 Oracle International Corporation System and method for data de-duplication
US7920700B2 (en) * 2006-10-19 2011-04-05 Oracle International Corporation System and method for data encryption
US8635194B2 (en) * 2006-10-19 2014-01-21 Oracle International Corporation System and method for data compression
US20080147878A1 (en) * 2006-12-15 2008-06-19 Rajiv Kottomtharayil System and methods for granular resource management in a storage network
US7844787B2 (en) * 2006-12-18 2010-11-30 Novell, Inc. Techniques for data replication with snapshot capabilities
US8677091B2 (en) 2006-12-18 2014-03-18 Commvault Systems, Inc. Writing data and storage system specific metadata to network attached storage device
US7805403B2 (en) 2007-01-07 2010-09-28 Apple Inc. Synchronization methods and systems
JP4525726B2 (en) * 2007-10-23 2010-08-18 富士ゼロックス株式会社 Decoding device, decoding program, and image processing device
US20090150533A1 (en) * 2007-12-07 2009-06-11 Brocade Communications Systems, Inc. Detecting need to access metadata during directory operations
US9178842B2 (en) * 2008-11-05 2015-11-03 Commvault Systems, Inc. Systems and methods for monitoring messaging applications for compliance with a policy
US9495382B2 (en) 2008-12-10 2016-11-15 Commvault Systems, Inc. Systems and methods for performing discrete data replication
US8204859B2 (en) 2008-12-10 2012-06-19 Commvault Systems, Inc. Systems and methods for managing replicated database data
US8510409B1 (en) 2009-12-23 2013-08-13 Emc Corporation Application-specific outbound source routing from a host in a data network
US9141919B2 (en) * 2010-02-26 2015-09-22 International Business Machines Corporation System and method for object migration using waves
US8504517B2 (en) 2010-03-29 2013-08-06 Commvault Systems, Inc. Systems and methods for selective data replication
US8725698B2 (en) 2010-03-30 2014-05-13 Commvault Systems, Inc. Stub file prioritization in a data replication system
US8504515B2 (en) 2010-03-30 2013-08-06 Commvault Systems, Inc. Stubbing systems and methods in a data replication environment
US8037345B1 (en) * 2010-03-31 2011-10-11 Emc Corporation Deterministic recovery of a file system built on a thinly provisioned logical volume having redundant metadata
US8589347B2 (en) 2010-05-28 2013-11-19 Commvault Systems, Inc. Systems and methods for performing data replication
US8566640B2 (en) 2010-07-19 2013-10-22 Veeam Software Ag Systems, methods, and computer program products for instant recovery of image level backups
ZA201106261B (en) * 2010-09-15 2012-10-31 Tata Consultancy Services Ltd System and method for replicating block of transactions from primary site to secondary site
US8805951B1 (en) 2011-02-08 2014-08-12 Emc Corporation Virtual machines and cloud storage caching for cloud computing applications
US8352435B1 (en) 2011-03-17 2013-01-08 Emc Corporation Continuous data reduction for highly available synchronous mirrors
US8775381B1 (en) * 2011-05-14 2014-07-08 Pivotal Software, Inc. Parallel database mirroring
US20130138615A1 (en) 2011-11-29 2013-05-30 International Business Machines Corporation Synchronizing updates across cluster filesystems
US8892523B2 (en) 2012-06-08 2014-11-18 Commvault Systems, Inc. Auto summarization of content
US10379988B2 (en) 2012-12-21 2019-08-13 Commvault Systems, Inc. Systems and methods for performance monitoring
US9286007B1 (en) * 2013-03-14 2016-03-15 Emc Corporation Unified datapath architecture
US9280555B1 (en) 2013-03-29 2016-03-08 Emc Corporation Unified data protection for block and file objects
US9218407B1 (en) 2014-06-25 2015-12-22 Pure Storage, Inc. Replication and intermediate read-write state for mediums
US10275320B2 (en) 2015-06-26 2019-04-30 Commvault Systems, Inc. Incrementally accumulating in-process performance data and hierarchical reporting thereof for a data stream in a secondary copy operation
US10248494B2 (en) 2015-10-29 2019-04-02 Commvault Systems, Inc. Monitoring, diagnosing, and repairing a management database in a data storage management system
US10540516B2 (en) 2016-10-13 2020-01-21 Commvault Systems, Inc. Data protection within an unsecured storage environment
CN108984105B (en) 2017-06-02 2021-09-10 伊姆西Ip控股有限责任公司 Method and device for distributing replication tasks in network storage device
US10831591B2 (en) 2018-01-11 2020-11-10 Commvault Systems, Inc. Remedial action based on maintaining process awareness in data storage management
US10642886B2 (en) 2018-02-14 2020-05-05 Commvault Systems, Inc. Targeted search of backup data using facial recognition
US20200192572A1 (en) 2018-12-14 2020-06-18 Commvault Systems, Inc. Disk usage growth prediction system
US11042318B2 (en) 2019-07-29 2021-06-22 Commvault Systems, Inc. Block-level data replication
US11809285B2 (en) 2022-02-09 2023-11-07 Commvault Systems, Inc. Protecting a management database of a data storage management system to meet a recovery point objective (RPO)

Citations (67)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4686620A (en) * 1984-07-26 1987-08-11 American Telephone And Telegraph Company, At&T Bell Laboratories Database backup method
US4755928A (en) * 1984-03-05 1988-07-05 Storage Technology Corporation Outboard back-up and recovery system with transfer of randomly accessible data sets between cache and host and cache and tape simultaneously
US5155845A (en) * 1990-06-15 1992-10-13 Storage Technology Corporation Data storage system for providing redundant copies of data on different disk drives
US5175852A (en) * 1987-02-13 1992-12-29 International Business Machines Corporation Distributed file access structure lock
US5175837A (en) * 1989-02-03 1992-12-29 Digital Equipment Corporation Synchronizing and processing of memory access operations in multiprocessor systems using a directory of lock bits
US5218695A (en) * 1990-02-05 1993-06-08 Epoch Systems, Inc. File server system having high-speed write execution
US5255270A (en) * 1990-11-07 1993-10-19 Emc Corporation Method of assuring data write integrity on a data storage device
US5276867A (en) * 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data storage system with improved data migration
US5276860A (en) * 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data processor with improved backup storage
US5276871A (en) * 1991-03-18 1994-01-04 Bull Hn Information Systems Inc. Method of file shadowing among peer systems
US5301286A (en) * 1991-01-02 1994-04-05 At&T Bell Laboratories Memory archiving indexing arrangement
US5341493A (en) * 1990-09-21 1994-08-23 Emc Corporation Disk storage system with write preservation during power failure
US5367698A (en) * 1991-10-31 1994-11-22 Epoch Systems, Inc. Network file migration system
US5379418A (en) * 1990-02-28 1995-01-03 Hitachi, Ltd. Highly reliable online system
US5434994A (en) * 1994-05-23 1995-07-18 International Business Machines Corporation System and method for maintaining replicated data coherency in a data processing system
US5487160A (en) * 1992-12-04 1996-01-23 At&T Global Information Solutions Company Concurrent image backup for disk storage system
US5535381A (en) * 1993-07-22 1996-07-09 Data General Corporation Apparatus and method for copying and restoring disk files
US5590320A (en) * 1994-09-14 1996-12-31 Smart Storage, Inc. Computer file directory system
US5594863A (en) * 1995-06-26 1997-01-14 Novell, Inc. Method and apparatus for network file recovery
US5611069A (en) * 1993-11-05 1997-03-11 Fujitsu Limited Disk array apparatus which predicts errors using mirror disks that can be accessed in parallel
US5615329A (en) * 1994-02-22 1997-03-25 International Business Machines Corporation Remote data duplexing
US5673382A (en) * 1996-05-30 1997-09-30 International Business Machines Corporation Automated management of off-site storage volumes for disaster recovery
US5680640A (en) * 1995-09-01 1997-10-21 Emc Corporation System for migrating data by selecting a first or second transfer means based on the status of a data element map initialized to a predetermined state
US5701516A (en) * 1992-03-09 1997-12-23 Auspex Systems, Inc. High-performance non-volatile RAM protected write cache accelerator system employing DMA and data transferring scheme
US5742792A (en) * 1993-04-23 1998-04-21 Emc Corporation Remote data mirroring
US5758149A (en) * 1995-03-17 1998-05-26 Unisys Corporation System for optimally processing a transaction and a query to the same database concurrently
US5819292A (en) * 1993-06-03 1998-10-06 Network Appliance, Inc. Method for maintaining consistent states of a file system and for creating user-accessible read-only copies of a file system
US5829047A (en) * 1996-08-29 1998-10-27 Lucent Technologies Inc. Backup memory for reliable operation
US5835954A (en) * 1996-09-12 1998-11-10 International Business Machines Corporation Target DASD controlled data migration move
US5835953A (en) * 1994-10-13 1998-11-10 Vinca Corporation Backup system that takes a snapshot of the locations in a mass storage device that has been identified for updating prior to updating
US5852715A (en) * 1996-03-19 1998-12-22 Emc Corporation System for currently updating database by one host and reading the database by different host for the purpose of implementing decision support functions
US5857208A (en) * 1996-05-31 1999-01-05 Emc Corporation Method and apparatus for performing point in time backup operation in a computer system
US5870746A (en) * 1995-10-12 1999-02-09 Ncr Corporation System and method for segmenting a database based upon data attributes
US5873116A (en) * 1996-11-22 1999-02-16 International Business Machines Corp. Method and apparatus for controlling access to data structures without the use of locks
US5875478A (en) * 1996-12-03 1999-02-23 Emc Corporation Computer backup using a file system, network, disk, tape and remote archiving repository media system
US5893140A (en) * 1996-08-14 1999-04-06 Emc Corporation File server having a file system cache and protocol for truly safe asynchronous writes
US5901327A (en) * 1996-05-28 1999-05-04 Emc Corporation Bundling of write data from channel commands in a command chain for transmission over a data link between data storage systems for remote data mirroring
US5909483A (en) * 1994-07-22 1999-06-01 Comverse Network Systems, Inc. Remote subscriber migration
US5923878A (en) * 1996-11-13 1999-07-13 Sun Microsystems, Inc. System, method and apparatus of directly executing an architecture-independent binary program
US5974563A (en) * 1995-10-16 1999-10-26 Network Specialists, Inc. Real time backup system
US5978951A (en) * 1997-09-11 1999-11-02 3Com Corporation High speed cache management unit for use in a bridge/router
US6016501A (en) * 1998-03-18 2000-01-18 Bmc Software Enterprise data movement system and method which performs data load and changed data propagation operations
US6029175A (en) * 1995-10-26 2000-02-22 Teknowledge Corporation Automatic retrieval of changed files by a network software agent
US6052797A (en) * 1996-05-28 2000-04-18 Emc Corporation Remotely mirrored data storage system with a count indicative of data consistency
US6076148A (en) * 1997-12-26 2000-06-13 Emc Corporation Mass storage subsystem and backup arrangement for digital data processing system which permits information to be backed up while host computer(s) continue(s) operating in connection with information stored on mass storage subsystem
US6078929A (en) * 1996-06-07 2000-06-20 At&T Internet file system
US6081875A (en) * 1997-05-19 2000-06-27 Emc Corporation Apparatus and method for backup of a disk storage system
US6088796A (en) * 1998-08-06 2000-07-11 Cianfrocca; Francis Secure middleware and server control system for querying through a network firewall
US6092066A (en) * 1996-05-31 2000-07-18 Emc Corporation Method and apparatus for independent operation of a remote data facility
US6101497A (en) * 1996-05-31 2000-08-08 Emc Corporation Method and apparatus for independent and simultaneous access to a common data set
US6192408B1 (en) * 1997-09-26 2001-02-20 Emc Corporation Network file server sharing local caches of file access information in data processors assigned to respective file systems
US6324654B1 (en) * 1998-03-30 2001-11-27 Legato Systems, Inc. Computer network remote data mirroring system
US6353878B1 (en) * 1998-08-13 2002-03-05 Emc Corporation Remote control of backup media in a secondary storage subsystem through access to a primary storage subsystem
US6401239B1 (en) * 1999-03-22 2002-06-04 B.I.S. Advanced Software Systems Ltd. System and method for quick downloading of electronic files
US6434681B1 (en) * 1999-12-02 2002-08-13 Emc Corporation Snapshot copy facility for a data storage system permitting continued host read/write access
US20020163910A1 (en) * 2001-05-01 2002-11-07 Wisner Steven P. System and method for providing access to resources using a fabric switch
US6496908B1 (en) * 2001-05-18 2002-12-17 Emc Corporation Remote mirroring
US6564229B1 (en) * 2000-06-08 2003-05-13 International Business Machines Corporation System and method for pausing and resuming move/copy operations
US6578120B1 (en) * 1997-06-24 2003-06-10 International Business Machines Corporation Synchronization and resynchronization of loosely-coupled copy operations between a primary and a remote secondary DASD volume under concurrent updating
US20040030951A1 (en) * 2002-08-06 2004-02-12 Philippe Armangau Instantaneous restoration of a production copy from a snapshot copy in a data storage system
US6694447B1 (en) * 2000-09-29 2004-02-17 Sun Microsystems, Inc. Apparatus and method for increasing application availability during a disaster fail-back
US6732124B1 (en) * 1999-03-30 2004-05-04 Fujitsu Limited Data processing system with mechanism for restoring file systems based on transaction logs
US6823336B1 (en) * 2000-09-26 2004-11-23 Emc Corporation Data storage system and method for uninterrupted read-only access to a consistent dataset by one host processor concurrent with read-write access by another host processor
US6938039B1 (en) * 2000-06-30 2005-08-30 Emc Corporation Concurrent file across at a target file server during migration of file systems between file servers using a network file system access protocol
US7010553B2 (en) * 2002-03-19 2006-03-07 Network Appliance, Inc. System and method for redirecting access to a remote mirrored snapshot
US20060242182A1 (en) * 2005-04-25 2006-10-26 Murali Palaniappan System and method for stranded file opens during disk compression utility requests
US20070033236A1 (en) * 2005-08-04 2007-02-08 Fujitsu Limited Database restructuring apparatus, and computer-readable recording medium recording database restructuring program

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR0128271B1 (en) 1994-02-22 1998-04-15 윌리암 티. 엘리스 Remote data duplexing
JP2894676B2 (en) 1994-03-21 1999-05-24 インターナショナル・ビジネス・マシーンズ・コーポレイション Asynchronous remote copy system and asynchronous remote copy method

Patent Citations (71)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4755928A (en) * 1984-03-05 1988-07-05 Storage Technology Corporation Outboard back-up and recovery system with transfer of randomly accessible data sets between cache and host and cache and tape simultaneously
US4686620A (en) * 1984-07-26 1987-08-11 American Telephone And Telegraph Company, At&T Bell Laboratories Database backup method
US5175852A (en) * 1987-02-13 1992-12-29 International Business Machines Corporation Distributed file access structure lock
US5175837A (en) * 1989-02-03 1992-12-29 Digital Equipment Corporation Synchronizing and processing of memory access operations in multiprocessor systems using a directory of lock bits
US5276860A (en) * 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data processor with improved backup storage
US5276867A (en) * 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data storage system with improved data migration
US5218695A (en) * 1990-02-05 1993-06-08 Epoch Systems, Inc. File server system having high-speed write execution
US5379418A (en) * 1990-02-28 1995-01-03 Hitachi, Ltd. Highly reliable online system
US5596706A (en) * 1990-02-28 1997-01-21 Hitachi, Ltd. Highly reliable online system
US5155845A (en) * 1990-06-15 1992-10-13 Storage Technology Corporation Data storage system for providing redundant copies of data on different disk drives
US5341493A (en) * 1990-09-21 1994-08-23 Emc Corporation Disk storage system with write preservation during power failure
US5255270A (en) * 1990-11-07 1993-10-19 Emc Corporation Method of assuring data write integrity on a data storage device
US5301286A (en) * 1991-01-02 1994-04-05 At&T Bell Laboratories Memory archiving indexing arrangement
US5276871A (en) * 1991-03-18 1994-01-04 Bull Hn Information Systems Inc. Method of file shadowing among peer systems
US5367698A (en) * 1991-10-31 1994-11-22 Epoch Systems, Inc. Network file migration system
US5701516A (en) * 1992-03-09 1997-12-23 Auspex Systems, Inc. High-performance non-volatile RAM protected write cache accelerator system employing DMA and data transferring scheme
US5487160A (en) * 1992-12-04 1996-01-23 At&T Global Information Solutions Company Concurrent image backup for disk storage system
US6502205B1 (en) * 1993-04-23 2002-12-31 Emc Corporation Asynchronous remote data mirroring system
US5742792A (en) * 1993-04-23 1998-04-21 Emc Corporation Remote data mirroring
US5819292A (en) * 1993-06-03 1998-10-06 Network Appliance, Inc. Method for maintaining consistent states of a file system and for creating user-accessible read-only copies of a file system
US5535381A (en) * 1993-07-22 1996-07-09 Data General Corporation Apparatus and method for copying and restoring disk files
US5611069A (en) * 1993-11-05 1997-03-11 Fujitsu Limited Disk array apparatus which predicts errors using mirror disks that can be accessed in parallel
US5615329A (en) * 1994-02-22 1997-03-25 International Business Machines Corporation Remote data duplexing
US5434994A (en) * 1994-05-23 1995-07-18 International Business Machines Corporation System and method for maintaining replicated data coherency in a data processing system
US5909483A (en) * 1994-07-22 1999-06-01 Comverse Network Systems, Inc. Remote subscriber migration
US5590320A (en) * 1994-09-14 1996-12-31 Smart Storage, Inc. Computer file directory system
US5835953A (en) * 1994-10-13 1998-11-10 Vinca Corporation Backup system that takes a snapshot of the locations in a mass storage device that has been identified for updating prior to updating
US5758149A (en) * 1995-03-17 1998-05-26 Unisys Corporation System for optimally processing a transaction and a query to the same database concurrently
US5594863A (en) * 1995-06-26 1997-01-14 Novell, Inc. Method and apparatus for network file recovery
US5680640A (en) * 1995-09-01 1997-10-21 Emc Corporation System for migrating data by selecting a first or second transfer means based on the status of a data element map initialized to a predetermined state
US6108748A (en) * 1995-09-01 2000-08-22 Emc Corporation System and method for on-line, real time, data migration
US5870746A (en) * 1995-10-12 1999-02-09 Ncr Corporation System and method for segmenting a database based upon data attributes
US5974563A (en) * 1995-10-16 1999-10-26 Network Specialists, Inc. Real time backup system
US6029175A (en) * 1995-10-26 2000-02-22 Teknowledge Corporation Automatic retrieval of changed files by a network software agent
US5852715A (en) * 1996-03-19 1998-12-22 Emc Corporation System for currently updating database by one host and reading the database by different host for the purpose of implementing decision support functions
US6035412A (en) * 1996-03-19 2000-03-07 Emc Corporation RDF-based and MMF-based backups
US6052797A (en) * 1996-05-28 2000-04-18 Emc Corporation Remotely mirrored data storage system with a count indicative of data consistency
US5901327A (en) * 1996-05-28 1999-05-04 Emc Corporation Bundling of write data from channel commands in a command chain for transmission over a data link between data storage systems for remote data mirroring
US5673382A (en) * 1996-05-30 1997-09-30 International Business Machines Corporation Automated management of off-site storage volumes for disaster recovery
US6092066A (en) * 1996-05-31 2000-07-18 Emc Corporation Method and apparatus for independent operation of a remote data facility
US5857208A (en) * 1996-05-31 1999-01-05 Emc Corporation Method and apparatus for performing point in time backup operation in a computer system
US6101497A (en) * 1996-05-31 2000-08-08 Emc Corporation Method and apparatus for independent and simultaneous access to a common data set
US6078929A (en) * 1996-06-07 2000-06-20 At&T Internet file system
US5893140A (en) * 1996-08-14 1999-04-06 Emc Corporation File server having a file system cache and protocol for truly safe asynchronous writes
US5829047A (en) * 1996-08-29 1998-10-27 Lucent Technologies Inc. Backup memory for reliable operation
US5835954A (en) * 1996-09-12 1998-11-10 International Business Machines Corporation Target DASD controlled data migration move
US5923878A (en) * 1996-11-13 1999-07-13 Sun Microsystems, Inc. System, method and apparatus of directly executing an architecture-independent binary program
US5873116A (en) * 1996-11-22 1999-02-16 International Business Machines Corp. Method and apparatus for controlling access to data structures without the use of locks
US5875478A (en) * 1996-12-03 1999-02-23 Emc Corporation Computer backup using a file system, network, disk, tape and remote archiving repository media system
US6081875A (en) * 1997-05-19 2000-06-27 Emc Corporation Apparatus and method for backup of a disk storage system
US6578120B1 (en) * 1997-06-24 2003-06-10 International Business Machines Corporation Synchronization and resynchronization of loosely-coupled copy operations between a primary and a remote secondary DASD volume under concurrent updating
US5978951A (en) * 1997-09-11 1999-11-02 3Com Corporation High speed cache management unit for use in a bridge/router
US6192408B1 (en) * 1997-09-26 2001-02-20 Emc Corporation Network file server sharing local caches of file access information in data processors assigned to respective file systems
US6076148A (en) * 1997-12-26 2000-06-13 Emc Corporation Mass storage subsystem and backup arrangement for digital data processing system which permits information to be backed up while host computer(s) continue(s) operating in connection with information stored on mass storage subsystem
US6016501A (en) * 1998-03-18 2000-01-18 Bmc Software Enterprise data movement system and method which performs data load and changed data propagation operations
US6324654B1 (en) * 1998-03-30 2001-11-27 Legato Systems, Inc. Computer network remote data mirroring system
US6088796A (en) * 1998-08-06 2000-07-11 Cianfrocca; Francis Secure middleware and server control system for querying through a network firewall
US6353878B1 (en) * 1998-08-13 2002-03-05 Emc Corporation Remote control of backup media in a secondary storage subsystem through access to a primary storage subsystem
US6401239B1 (en) * 1999-03-22 2002-06-04 B.I.S. Advanced Software Systems Ltd. System and method for quick downloading of electronic files
US6732124B1 (en) * 1999-03-30 2004-05-04 Fujitsu Limited Data processing system with mechanism for restoring file systems based on transaction logs
US6434681B1 (en) * 1999-12-02 2002-08-13 Emc Corporation Snapshot copy facility for a data storage system permitting continued host read/write access
US6564229B1 (en) * 2000-06-08 2003-05-13 International Business Machines Corporation System and method for pausing and resuming move/copy operations
US6938039B1 (en) * 2000-06-30 2005-08-30 Emc Corporation Concurrent file across at a target file server during migration of file systems between file servers using a network file system access protocol
US6823336B1 (en) * 2000-09-26 2004-11-23 Emc Corporation Data storage system and method for uninterrupted read-only access to a consistent dataset by one host processor concurrent with read-write access by another host processor
US6694447B1 (en) * 2000-09-29 2004-02-17 Sun Microsystems, Inc. Apparatus and method for increasing application availability during a disaster fail-back
US20020163910A1 (en) * 2001-05-01 2002-11-07 Wisner Steven P. System and method for providing access to resources using a fabric switch
US6496908B1 (en) * 2001-05-18 2002-12-17 Emc Corporation Remote mirroring
US7010553B2 (en) * 2002-03-19 2006-03-07 Network Appliance, Inc. System and method for redirecting access to a remote mirrored snapshot
US20040030951A1 (en) * 2002-08-06 2004-02-12 Philippe Armangau Instantaneous restoration of a production copy from a snapshot copy in a data storage system
US20060242182A1 (en) * 2005-04-25 2006-10-26 Murali Palaniappan System and method for stranded file opens during disk compression utility requests
US20070033236A1 (en) * 2005-08-04 2007-02-08 Fujitsu Limited Database restructuring apparatus, and computer-readable recording medium recording database restructuring program

Cited By (207)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030041220A1 (en) * 2001-08-23 2003-02-27 Pavel Peleska System and method for establishing consistent memory contents in redundant systems
US6959400B2 (en) * 2001-08-23 2005-10-25 Siemens Aktiengesellschaft System and method for establishing consistent memory contents in redundant systems
US20040236915A1 (en) * 2001-11-20 2004-11-25 Hitachi, Ltd. Multiple data management method, computer and storage device therefor
US7010650B2 (en) 2001-11-20 2006-03-07 Hitachi, Ltd. Multiple data management method, computer and storage device therefor
US6779093B1 (en) * 2002-02-15 2004-08-17 Veritas Operating Corporation Control facility for processing in-band control messages during data replication
US7373468B1 (en) * 2002-02-15 2008-05-13 Symantec Operating Corporation Control facility for processing in-band control messages during data replication
US7502960B1 (en) 2002-04-15 2009-03-10 Emc Corporation Method and apparatus for managing access to volumes of storage
US7076690B1 (en) 2002-04-15 2006-07-11 Emc Corporation Method and apparatus for managing access to volumes of storage
US7469289B2 (en) 2002-04-26 2008-12-23 Hitachi, Ltd. Storage system having virtualized resource
US20030204597A1 (en) * 2002-04-26 2003-10-30 Hitachi, Inc. Storage system having virtualized resource
US20060253549A1 (en) * 2002-04-26 2006-11-09 Hitachi, Ltd. Storage system having virtualized resource
US7222172B2 (en) 2002-04-26 2007-05-22 Hitachi, Ltd. Storage system having virtualized resource
US20030225760A1 (en) * 2002-05-30 2003-12-04 Jarmo Ruuth Method and system for processing replicated transactions in parallel in a secondary server
US6978396B2 (en) * 2002-05-30 2005-12-20 Solid Information Technology Oy Method and system for processing replicated transactions in parallel in a secondary server
US20040006587A1 (en) * 2002-07-02 2004-01-08 Dell Products L.P. Information handling system and method for clustering with internal cross coupled storage
US7707151B1 (en) 2002-08-02 2010-04-27 Emc Corporation Method and apparatus for migrating data
US7546482B2 (en) 2002-10-28 2009-06-09 Emc Corporation Method and apparatus for monitoring the storage of data in a computer system
US20040153717A1 (en) * 2002-11-07 2004-08-05 Duncan Kurt A. Apparatus and method for enhancing data availability by implementing inter-storage-unit communication
US7039829B2 (en) * 2002-11-07 2006-05-02 Lsi Logic Corporation Apparatus and method for enhancing data availability by implementing inter-storage-unit communication
US7107483B2 (en) * 2002-11-15 2006-09-12 Lsi Logic Corporation Apparatus and method for enhancing data availability by leveraging primary/backup data storage volumes
US20040098637A1 (en) * 2002-11-15 2004-05-20 Duncan Kurt A. Apparatus and method for enhancing data availability by leveraging primary/backup data storage volumes
US7370025B1 (en) * 2002-12-17 2008-05-06 Symantec Operating Corporation System and method for providing access to replicated data
US8103625B1 (en) * 2002-12-17 2012-01-24 Symantec Operating Corporation System and method for providing access to replicated data
US20050021690A1 (en) * 2003-02-27 2005-01-27 Prasad Peddada System and method for communications between servers in a cluster
US20080126546A1 (en) * 2003-02-27 2008-05-29 Bea Systems, Inc. System and method for communication between servers in a cluster
US7571255B2 (en) 2003-02-27 2009-08-04 Bea Systems, Inc. System and method for communication between servers in a cluster
US7376754B2 (en) * 2003-02-27 2008-05-20 Bea Systems, Inc. System and method for communications between servers in a cluster
US7581063B2 (en) * 2003-03-11 2009-08-25 International Business Machines Corporation Method, system, and program for improved throughput in remote mirroring systems
US20050228954A1 (en) * 2003-03-11 2005-10-13 International Business Machines Corporation Method, system, and program for improved throughput in remote mirroring systems
US6996688B2 (en) * 2003-03-11 2006-02-07 International Business Machines Corporation Method, system, and program for improved throughput in remote mirroring systems
US20040181640A1 (en) * 2003-03-11 2004-09-16 International Business Machines Corporation Method, system, and program for improved throughput in remote mirroring systems
US7263590B1 (en) 2003-04-23 2007-08-28 Emc Corporation Method and apparatus for migrating data in a computer system
US7415591B1 (en) 2003-04-23 2008-08-19 Emc Corporation Method and apparatus for migrating data and automatically provisioning a target for the migration
US7805583B1 (en) * 2003-04-23 2010-09-28 Emc Corporation Method and apparatus for migrating data in a clustered computer system environment
US7318071B2 (en) 2003-05-27 2008-01-08 Emc Corporation System and method for transferring data from a source machine to a target machine
US20040243651A1 (en) * 2003-05-27 2004-12-02 Legato Systems, Inc. System and method for transferring data from a source machine to a target machine
WO2004107637A3 (en) * 2003-05-27 2005-10-20 Emc Corp A system and method for transferring data from a source machine to a target machine
US7653699B1 (en) * 2003-06-12 2010-01-26 Symantec Operating Corporation System and method for partitioning a file system for enhanced availability and scalability
US7275177B2 (en) 2003-06-25 2007-09-25 Emc Corporation Data recovery with internet protocol replication with or without full resync
US7567991B2 (en) 2003-06-25 2009-07-28 Emc Corporation Replication of snapshot using a file system copy differential
US20040267836A1 (en) * 2003-06-25 2004-12-30 Philippe Armangau Replication of snapshot using a file system copy differential
US7987157B1 (en) * 2003-07-18 2011-07-26 Symantec Operating Corporation Low-impact refresh mechanism for production databases
US7953819B2 (en) 2003-08-22 2011-05-31 Emc Corporation Multi-protocol sharable virtual storage objects
US20050076091A1 (en) * 2003-09-11 2005-04-07 Duncan Missimer Data mirroring
US7111138B2 (en) 2003-09-16 2006-09-19 Hitachi, Ltd. Storage system and storage control device
US7035881B2 (en) 2003-09-23 2006-04-25 Emc Corporation Organization of read-write snapshot copies in a data storage system
US20050076070A1 (en) * 2003-10-02 2005-04-07 Shougo Mikami Method, apparatus, and computer readable medium for managing replication of back-up object
US20070055716A1 (en) * 2003-10-02 2007-03-08 Shougo Mikami Method, apparatus, and computer readable medium for managing replication of back-up object
US7152080B2 (en) * 2003-10-02 2006-12-19 Hitachi, Ltd. Method, apparatus, and computer readable medium for managing replication of back-up object
US7197663B2 (en) * 2003-11-20 2007-03-27 International Business Machines Corporation Concurrent PPRC/FCP and host access to secondary PPRC/FCP device through independent error management
US20050114740A1 (en) * 2003-11-20 2005-05-26 International Business Machines (Ibm) Corporation Concurrent PPRC/FCP and host access to secondary PPRC/FCP device through independent error management
US7606841B1 (en) 2003-12-29 2009-10-20 Symantec Operating Corporation Coordinated dirty block tracking
US7039661B1 (en) * 2003-12-29 2006-05-02 Veritas Operating Corporation Coordinated dirty block tracking
US7849274B2 (en) 2003-12-29 2010-12-07 Netapp, Inc. System and method for zero copy block protocol write operations
US7249227B1 (en) * 2003-12-29 2007-07-24 Network Appliance, Inc. System and method for zero copy block protocol write operations
US20070208821A1 (en) * 2003-12-29 2007-09-06 Pittman Joseph C System and method for zero copy block protocol write operations
US7779033B2 (en) * 2003-12-30 2010-08-17 Wibu-Systems Ag Method for controlling a data processing device
US20070186037A1 (en) * 2003-12-30 2007-08-09 Wibu-Systems Ag Method for controlling a data processing device
US7639713B2 (en) 2004-01-21 2009-12-29 Emc Corporation Database block network attached storage packet joining
US20050157756A1 (en) * 2004-01-21 2005-07-21 John Ormond Database block network attached storage packet joining
US7363437B2 (en) 2004-01-28 2008-04-22 Hitachi, Ltd. Shared/exclusive control scheme among sites including storage device system shared by plural high-rank apparatuses, and computer system equipped with the same control scheme
US20070186059A1 (en) * 2004-01-28 2007-08-09 Kenichi Miki Shared/exclusive control scheme among sites including storage device system shared by plural high-rank apparatuses, and computer system equipped with the same control scheme
US7783844B2 (en) 2004-01-28 2010-08-24 Hitachi, Ltd. Shared/exclusive control scheme among sites including storage device system shared by plural high-rank apparatuses, and computer system equipped with the same control scheme
US20050166018A1 (en) * 2004-01-28 2005-07-28 Kenichi Miki Shared/exclusive control scheme among sites including storage device system shared by plural high-rank apparatuses, and computer system equipped with the same control scheme
US7383463B2 (en) 2004-02-04 2008-06-03 Emc Corporation Internet protocol based disaster recovery of a server
US20050193245A1 (en) * 2004-02-04 2005-09-01 Hayden John M. Internet protocol based disaster recovery of a server
US20050204106A1 (en) * 2004-02-27 2005-09-15 Richard Testardi Distributed asynchronous ordered replication
US7624109B2 (en) * 2004-02-27 2009-11-24 Texas Memory Systems, Inc. Distributed asynchronous ordered replication
US20050216502A1 (en) * 2004-03-26 2005-09-29 Oracle International Corporation Method of providing shared objects and node-specific objects in a cluster file system
US7657529B2 (en) * 2004-03-26 2010-02-02 Oracle International Corporation Method of providing shared objects and node-specific objects in a cluster file system
US20050246576A1 (en) * 2004-03-30 2005-11-03 Masaaki Takayama Redundant system utilizing remote disk mirroring technique, and initialization method for remote disk mirroring in the system
US7107486B2 (en) * 2004-04-08 2006-09-12 Hitachi, Ltd. Restore method for backup
US20050240813A1 (en) * 2004-04-08 2005-10-27 Wataru Okada Restore method for backup
US7966391B2 (en) * 2004-05-11 2011-06-21 Todd J. Anderson Systems, apparatus and methods for managing networking devices
US20050267928A1 (en) * 2004-05-11 2005-12-01 Anderson Todd J Systems, apparatus and methods for managing networking devices
US20100235324A1 (en) * 2004-06-25 2010-09-16 Nec Corporation Replication System Having the Capability to Accept Commands at a Standby-system Site before Completion of Updating thereof
US20050289197A1 (en) * 2004-06-25 2005-12-29 Nec Corporation Replication system having the capability to accept commands at a standby-system site before completion of updating thereof
US8161138B2 (en) * 2004-06-25 2012-04-17 Nec Corporation Replication system having the capability to accept commands at a standby-system site before completion of updating thereof
US20060015499A1 (en) * 2004-07-13 2006-01-19 International Business Machines Corporation Method, data processing system, and computer program product for sectional access privileges of plain text files
US20060064536A1 (en) * 2004-07-21 2006-03-23 Tinker Jeffrey L Distributed storage architecture based on block map caching and VFS stackable file system modules
US7640274B2 (en) * 2004-07-21 2009-12-29 Tinker Jeffrey L Distributed storage architecture based on block map caching and VFS stackable file system modules
US20060080362A1 (en) * 2004-10-12 2006-04-13 Lefthand Networks, Inc. Data Synchronization Over a Computer Network
US20060143412A1 (en) * 2004-12-28 2006-06-29 Philippe Armangau Snapshot copy facility maintaining read performance and write performance
US20060230148A1 (en) * 2005-04-06 2006-10-12 John Forecast TCP forwarding of client requests of high-level file and storage access protocols in a network file server system
US20060271815A1 (en) * 2005-05-31 2006-11-30 Kazuhiko Mizuno System and method for disaster recovery of data
US7587627B2 (en) * 2005-05-31 2009-09-08 Hitachi, Ltd. System and method for disaster recovery of data
US9378099B2 (en) * 2005-06-24 2016-06-28 Catalogic Software, Inc. Instant data center recovery
US20140317059A1 (en) * 2005-06-24 2014-10-23 Catalogic Software, Inc. Instant data center recovery
US20070024919A1 (en) * 2005-06-29 2007-02-01 Wong Chi M Parallel filesystem traversal for transparent mirroring of directories and files
US8832697B2 (en) * 2005-06-29 2014-09-09 Cisco Technology, Inc. Parallel filesystem traversal for transparent mirroring of directories and files
US20070027896A1 (en) * 2005-07-28 2007-02-01 International Business Machines Corporation Session replication
US7668904B2 (en) * 2005-07-28 2010-02-23 International Business Machines Corporation Session replication
US20070038697A1 (en) * 2005-08-03 2007-02-15 Eyal Zimran Multi-protocol namespace server
US20070055703A1 (en) * 2005-09-07 2007-03-08 Eyal Zimran Namespace server using referral protocols
US20070067664A1 (en) * 2005-09-20 2007-03-22 International Business Machines Corporation Failure transparency for update applications under single-master configuration
US7600149B2 (en) * 2005-09-20 2009-10-06 International Business Machines Corporation Failure transparency for update applications under single-master configuration
US20070088702A1 (en) * 2005-10-03 2007-04-19 Fridella Stephen A Intelligent network client for multi-protocol namespace redirection
US7475213B2 (en) * 2005-10-18 2009-01-06 Hitachi, Ltd. Storage control system and storage control method
US20070088930A1 (en) * 2005-10-18 2007-04-19 Jun Matsuda Storage control system and storage control method
US7844794B2 (en) 2005-10-18 2010-11-30 Hitachi, Ltd. Storage system with cache threshold control
US20090182940A1 (en) * 2005-10-18 2009-07-16 Jun Matsuda Storage control system and control method
US7444478B2 (en) 2005-11-18 2008-10-28 International Business Machines Corporation Priority scheme for transmitting blocks of data
US20090006789A1 (en) * 2005-11-18 2009-01-01 International Business Machines Corporation Computer program product and a system for a priority scheme for transmitting blocks of data
US7769960B2 (en) 2005-11-18 2010-08-03 International Business Machines Corporation Computer program product and a system for a priority scheme for transmitting blocks of data
US20070118698A1 (en) * 2005-11-18 2007-05-24 Lafrese Lee C Priority scheme for transmitting blocks of data
US7765187B2 (en) 2005-11-29 2010-07-27 Emc Corporation Replication of a consistency group of data storage objects from servers in a data network
US20070136389A1 (en) * 2005-11-29 2007-06-14 Milena Bergant Replication of a consistency group of data storage objects from servers in a data network
US8531953B2 (en) * 2006-02-21 2013-09-10 Barclays Capital Inc. System and method for network traffic splitting
US20070195750A1 (en) * 2006-02-21 2007-08-23 Lehman Brothers Inc. System and method for network traffic splitting
US7890796B2 (en) 2006-10-04 2011-02-15 Emc Corporation Automatic media error correction in a file server
US20080155316A1 (en) * 2006-10-04 2008-06-26 Sitaram Pawar Automatic Media Error Correction In A File Server
US7769722B1 (en) 2006-12-08 2010-08-03 Emc Corporation Replication and restoration of multiple data storage object types in a data network
US8706833B1 (en) 2006-12-08 2014-04-22 Emc Corporation Data storage server having common replication architecture for multiple storage object types
US7882061B1 (en) * 2006-12-21 2011-02-01 Emc Corporation Multi-thread replication across a network
US20080208923A1 (en) * 2007-01-10 2008-08-28 Satoru Watanabe Method for verifying data consistency of backup system, program and storage medium
US7890467B2 (en) * 2007-01-10 2011-02-15 Hitachi, Ltd. Method for verifying data consistency of backup system, program and storage medium
US20200145359A1 (en) * 2007-05-22 2020-05-07 International Business Machines Corporation Handling large messages via pointer and log
US11556400B2 (en) * 2007-05-22 2023-01-17 International Business Machines Corporation Handling large messages via pointer and log
US20090217104A1 (en) * 2008-02-26 2009-08-27 International Business Machines Corporation Method and apparatus for diagnostic recording using transactional memory
US8972794B2 (en) * 2008-02-26 2015-03-03 International Business Machines Corporation Method and apparatus for diagnostic recording using transactional memory
US8433864B1 (en) * 2008-06-30 2013-04-30 Symantec Corporation Method and apparatus for providing point-in-time backup images
US8099572B1 (en) 2008-09-30 2012-01-17 Emc Corporation Efficient backup and restore of storage objects in a version set
US10255291B1 (en) * 2009-06-29 2019-04-09 EMC IP Holding Company LLC Replication of volumes using partial volume split
US8615769B2 (en) * 2010-02-26 2013-12-24 Nec Corporation Data processing system, data processing method, and data processing program
US20110214130A1 (en) * 2010-02-26 2011-09-01 Yoshihiko Nishihata Data processing system, data processing method, and data processing program
US9852150B2 (en) 2010-05-03 2017-12-26 Panzura, Inc. Avoiding client timeouts in a distributed filesystem
US9811532B2 (en) 2010-05-03 2017-11-07 Panzura, Inc. Executing a cloud command for a distributed filesystem
US9678981B1 (en) * 2010-05-03 2017-06-13 Panzura, Inc. Customizing data management for a distributed filesystem
US8438247B1 (en) * 2010-12-21 2013-05-07 Amazon Technologies, Inc. Techniques for capturing data sets
US8943127B2 (en) 2010-12-21 2015-01-27 Amazon Technologies, Inc. Techniques for capturing data sets
US8931107B1 (en) 2011-08-30 2015-01-06 Amazon Technologies, Inc. Techniques for generating block level data captures
US9189343B2 (en) 2011-08-30 2015-11-17 Amazon Technologies, Inc. Frequent data set captures for volume forensics
US20130117744A1 (en) * 2011-11-03 2013-05-09 Ocz Technology Group, Inc. Methods and apparatus for providing hypervisor-level acceleration and virtualization services
US9804928B2 (en) 2011-11-14 2017-10-31 Panzura, Inc. Restoring an archived file in a distributed filesystem
US10296494B2 (en) 2011-11-14 2019-05-21 Panzura, Inc. Managing a global namespace for a distributed filesystem
US9805054B2 (en) 2011-11-14 2017-10-31 Panzura, Inc. Managing a global namespace for a distributed filesystem
US20130159645A1 (en) * 2011-12-15 2013-06-20 International Business Machines Corporation Data selection for movement from a source to a target
US9087011B2 (en) 2011-12-15 2015-07-21 International Business Machines Corporation Data selection for movement from a source to a target
US9087010B2 (en) * 2011-12-15 2015-07-21 International Business Machines Corporation Data selection for movement from a source to a target
US20140365824A1 (en) * 2012-01-20 2014-12-11 Tencent Technology (Shenzhen) Company Limited Method for recovering hard disk data, server and distributed storage system
US20140149350A1 (en) * 2012-11-27 2014-05-29 International Business Machines Corporation Remote Replication in a Storage System
US20140331018A1 (en) * 2013-05-02 2014-11-06 Bull Sas Method and device for saving data in an IT infrastructure offering activity resumption functions
US10606505B2 (en) * 2013-05-02 2020-03-31 Bull Sas Method and device for saving data in an IT infrastructure offering activity resumption functions
US20140344267A1 (en) * 2013-05-17 2014-11-20 Go Daddy Operating Company, LLC Storing, Accessing and Restoring Website Content via a Website Repository
US10250579B2 (en) * 2013-08-13 2019-04-02 Alcatel Lucent Secure file transfers within network-based storage
US9817703B1 (en) * 2013-12-04 2017-11-14 Amazon Technologies, Inc. Distributed lock management using conditional updates to a distributed key value data store
US11734306B2 (en) 2013-12-12 2023-08-22 Huawei Technologies Co., Ltd. Data replication method and storage system
US10706072B2 (en) * 2013-12-12 2020-07-07 Huawei Technologies Co., Ltd. Data replication method and storage system
US10175895B2 (en) * 2013-12-13 2019-01-08 Netapp Inc. Techniques for importation of information to a storage system
US20160266830A1 (en) * 2013-12-13 2016-09-15 Netapp Inc. Techniques for importation of information to a storage system
US11016941B2 (en) 2014-02-28 2021-05-25 Red Hat, Inc. Delayed asynchronous file replication in a distributed file system
US10025808B2 (en) 2014-03-19 2018-07-17 Red Hat, Inc. Compacting change logs using file content location identifiers
US20150269183A1 (en) * 2014-03-19 2015-09-24 Red Hat, Inc. File replication using file content location identifiers
US11064025B2 (en) 2014-03-19 2021-07-13 Red Hat, Inc. File replication using file content location identifiers
US9986029B2 (en) * 2014-03-19 2018-05-29 Red Hat, Inc. File replication using file content location identifiers
US9965505B2 (en) 2014-03-19 2018-05-08 Red Hat, Inc. Identifying files in change logs using file content location identifiers
US9779105B1 (en) * 2014-03-31 2017-10-03 EMC IP Holding Company LLC Transaction logging using file-system-specific log files
US20160018995A1 (en) * 2014-07-17 2016-01-21 Lsi Corporation RAID system for processing I/O requests utilizing XOR commands
US10599675B2 (en) 2014-09-30 2020-03-24 International Business Machines Corporation Hybrid data replication
US9940379B2 (en) * 2014-09-30 2018-04-10 International Business Machines Corporation Hybrid data replication
US20160092536A1 (en) * 2014-09-30 2016-03-31 International Business Machines Corporation Hybrid data replication
WO2016107013A1 (en) * 2014-12-30 2016-07-07 中兴通讯股份有限公司 Transmission processing and remote processing method and apparatus and computer storage medium
US10942817B1 (en) 2015-01-31 2021-03-09 Veritas Technologies Llc Low cost, heterogeneous method of transforming replicated data for consumption in the cloud
US11366724B2 (en) 2015-01-31 2022-06-21 Veritas Technologies Llc Low cost, heterogeneous method of transforming replicated data for consumption in the cloud
US10296422B1 (en) * 2015-01-31 2019-05-21 Veritas Technologies Llc Low cost, heterogeneous method of transforming replicated data for consumption in the cloud
US10331657B1 (en) 2015-09-28 2019-06-25 Amazon Technologies, Inc. Contention analysis for journal-based databases
US10133767B1 (en) 2015-09-28 2018-11-20 Amazon Technologies, Inc. Materialization strategies in journal-based databases
US10198346B1 (en) * 2015-09-28 2019-02-05 Amazon Technologies, Inc. Test framework for applications using journal-based databases
US10346260B1 (en) * 2015-09-30 2019-07-09 EMC IP Holding Company LLC Replication based security
US20190079836A1 (en) * 2015-12-21 2019-03-14 Intel Corporation Predictive memory maintenance
US20180024762A1 (en) * 2016-07-22 2018-01-25 International Business Machines Corporation Data access management in distributed computer storage environments
US10235091B1 (en) * 2016-09-23 2019-03-19 EMC IP Holding Company LLC Full sweep disk synchronization in a storage system
US10509707B1 (en) * 2016-12-15 2019-12-17 EMC IP Holding Company LLC Selective data mirroring
US10747632B2 (en) * 2017-08-11 2020-08-18 T-Mobile Usa, Inc. Data redundancy and allocation system
US11727039B2 (en) 2017-09-25 2023-08-15 Splunk Inc. Low-latency streaming analytics
US11645286B2 (en) 2018-01-31 2023-05-09 Splunk Inc. Dynamic data processor for streaming and batch queries
US11615084B1 (en) 2018-10-31 2023-03-28 Splunk Inc. Unified data processing across streaming and indexed data sets
US11146626B2 (en) * 2018-11-01 2021-10-12 EMC IP Holding Company LLC Cloud computing environment with replication system configured to reduce latency of data read access
US11886440B1 (en) 2019-07-16 2024-01-30 Splunk Inc. Guided creation interface for streaming data processing pipelines
US11614923B2 (en) 2020-04-30 2023-03-28 Splunk Inc. Dual textual/graphical programming interfaces for streaming data processing pipelines
US11349917B2 (en) 2020-07-23 2022-05-31 Pure Storage, Inc. Replication handling among distinct networks
US11789638B2 (en) 2020-07-23 2023-10-17 Pure Storage, Inc. Continuing replication during storage system transportation
US11882179B2 (en) 2020-07-23 2024-01-23 Pure Storage, Inc. Supporting multiple replication schemes across distinct network layers
US11442652B1 (en) 2020-07-23 2022-09-13 Pure Storage, Inc. Replication handling during storage system transportation
US11606310B2 (en) 2020-09-28 2023-03-14 Vmware, Inc. Flow processing offload using virtual port identifiers
US11875172B2 (en) 2020-09-28 2024-01-16 VMware LLC Bare metal computer for booting copies of VM images on multiple computing devices using a smart NIC
US20220103490A1 (en) * 2020-09-28 2022-03-31 Vmware, Inc. Accessing multiple external storages to present an emulated local storage through a nic
US11829793B2 (en) 2020-09-28 2023-11-28 Vmware, Inc. Unified management of virtual machines and bare metal computers
US11716383B2 (en) * 2020-09-28 2023-08-01 Vmware, Inc. Accessing multiple external storages to present an emulated local storage through a NIC
US11824931B2 (en) 2020-09-28 2023-11-21 Vmware, Inc. Using physical and virtual functions associated with a NIC to access an external storage through network fabric driver
US11792134B2 (en) 2020-09-28 2023-10-17 Vmware, Inc. Configuring PNIC to perform flow processing offload using virtual port identifiers
US11636053B2 (en) 2020-09-28 2023-04-25 Vmware, Inc. Emulating a local storage by accessing an external storage through a shared port of a NIC
US11736565B2 (en) 2020-09-28 2023-08-22 Vmware, Inc. Accessing an external storage through a NIC
US11736566B2 (en) 2020-09-28 2023-08-22 Vmware, Inc. Using a NIC as a network accelerator to allow VM access to an external storage via a PF module, bus, and VF module
US11593278B2 (en) 2020-09-28 2023-02-28 Vmware, Inc. Using machine executing on a NIC to access a third party storage not supported by a NIC or host
US11636116B2 (en) 2021-01-29 2023-04-25 Splunk Inc. User interface for customizing data streams
US11650995B2 (en) 2021-01-29 2023-05-16 Splunk Inc. User defined data stream for routing data to a data destination based on a data route
US11782631B2 (en) * 2021-02-25 2023-10-10 Pure Storage, Inc. Synchronous workload optimization
US20230050536A1 (en) * 2021-02-25 2023-02-16 Pure Storage, Inc. Synchronous Workload Optimization
US11687487B1 (en) * 2021-03-11 2023-06-27 Splunk Inc. Text files updates to an active processing pipeline
US11663219B1 (en) 2021-04-23 2023-05-30 Splunk Inc. Determining a set of parameter values for a processing pipeline
US11863376B2 (en) 2021-12-22 2024-01-02 Vmware, Inc. Smart NIC leader election
US20230259285A1 (en) * 2022-02-16 2023-08-17 T-Mobile Usa, Inc. Preventing data loss in a filesystem by creating duplicates of data in parallel, such as charging data in a wireless telecommunications network
US11922026B2 (en) * 2022-02-16 2024-03-05 T-Mobile Usa, Inc. Preventing data loss in a filesystem by creating duplicates of data in parallel, such as charging data in a wireless telecommunications network
US11899594B2 (en) 2022-06-21 2024-02-13 VMware LLC Maintenance of data message classification cache on smart NIC
US11928367B2 (en) 2022-06-21 2024-03-12 VMware LLC Logical memory addressing for network devices
US11928062B2 (en) 2022-06-21 2024-03-12 VMware LLC Accelerating data message classification with smart NICs

Also Published As

Publication number Publication date
US7546364B2 (en) 2009-06-09

Similar Documents

Publication Publication Date Title
US7546364B2 (en) Replication of remote copy data for internet protocol (IP) transmission
US7275177B2 (en) Data recovery with internet protocol replication with or without full resync
US7567991B2 (en) Replication of snapshot using a file system copy differential
US7383463B2 (en) Internet protocol based disaster recovery of a server
US11349949B2 (en) Method of using path signatures to facilitate the recovery from network link failures
US8706833B1 (en) Data storage server having common replication architecture for multiple storage object types
US7769722B1 (en) Replication and restoration of multiple data storage object types in a data network
US6823336B1 (en) Data storage system and method for uninterrupted read-only access to a consistent dataset by one host processor concurrent with read-write access by another host processor
US8935211B2 (en) Metadata management for fixed content distributed data storage
US7657581B2 (en) Metadata management for fixed content distributed data storage
US7421435B2 (en) Data processing system and storage subsystem provided in data processing system
US9213719B2 (en) Peer-to-peer redundant file server system and methods
US7254740B2 (en) System and method for state preservation in a stretch cluster
US7475077B2 (en) System and method for emulating a virtual boundary of a file system for data management at a fileset granularity
US20030154305A1 (en) High availability lightweight directory access protocol service
JP2005250921A (en) Backup system and method
AU2011265370B2 (en) Metadata management for fixed content distributed data storage

Legal Events

Date Code Title Description
AS Assignment

Owner name: EMC CORPORATION, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:RAMAN, SUCHITRA;ARMANGAU, PHILIPPE;BERGANT, MILENA;AND OTHERS;REEL/FRAME:012921/0307;SIGNING DATES FROM 20020509 TO 20020514

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT, NORTH CAROLINA

Free format text: SECURITY AGREEMENT;ASSIGNORS:ASAP SOFTWARE EXPRESS, INC.;AVENTAIL LLC;CREDANT TECHNOLOGIES, INC.;AND OTHERS;REEL/FRAME:040134/0001

Effective date: 20160907

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT, TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:ASAP SOFTWARE EXPRESS, INC.;AVENTAIL LLC;CREDANT TECHNOLOGIES, INC.;AND OTHERS;REEL/FRAME:040136/0001

Effective date: 20160907

AS Assignment

Owner name: EMC IP HOLDING COMPANY LLC, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EMC CORPORATION;REEL/FRAME:040203/0001

Effective date: 20160906

FPAY Fee payment

Year of fee payment: 8

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:CREDANT TECHNOLOGIES, INC.;DELL INTERNATIONAL L.L.C.;DELL MARKETING L.P.;AND OTHERS;REEL/FRAME:049452/0223

Effective date: 20190320

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:CREDANT TECHNOLOGIES INC.;DELL INTERNATIONAL L.L.C.;DELL MARKETING L.P.;AND OTHERS;REEL/FRAME:053546/0001

Effective date: 20200409

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 12

AS Assignment

Owner name: WYSE TECHNOLOGY L.L.C., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: SCALEIO LLC, MASSACHUSETTS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: MOZY, INC., WASHINGTON

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: MAGINATICS LLC, CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: FORCE10 NETWORKS, INC., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: EMC CORPORATION, MASSACHUSETTS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL SYSTEMS CORPORATION, TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL SOFTWARE INC., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL MARKETING L.P., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL INTERNATIONAL, L.L.C., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL USA L.P., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: CREDANT TECHNOLOGIES, INC., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: AVENTAIL LLC, CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: ASAP SOFTWARE EXPRESS, INC., ILLINOIS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

AS Assignment

Owner name: SCALEIO LLC, MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC), MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL INTERNATIONAL L.L.C., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL USA L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

AS Assignment

Owner name: SCALEIO LLC, MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC), MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL INTERNATIONAL L.L.C., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL USA L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329