WO2001084314A3 - Method and system for providing cluster replicated checkpoint services - Google Patents

Method and system for providing cluster replicated checkpoint services

Info

Publication number
WO2001084314A3
Authority
WO
WIPO (PCT)
Prior art keywords
checkpoint
node
replica
checkpoint information
services
Prior art date
Application number
PCT/US2001/014250
Other languages
French (fr)
Other versions
WO2001084314A2 (en)
Inventor
Mark A Kampe
Frederic E Herrmann
Stephane Brossier
Original Assignee
Sun Microsystems Inc
Priority date
2000-05-02
Filing date
2001-05-02
Publication date
2002-04-25
Application filed by Sun Microsystems Inc
Priority to AU2001259403A
Publication of WO2001084314A2
Publication of WO2001084314A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023 Failover techniques
    • G06F11/203 Failover techniques using migration

Abstract

The present invention describes a method and system for providing cluster replicated checkpoint services. In particular, the method provides cluster replicated checkpoint services for replicas of a checkpoint in a cluster. The cluster includes a first node and a second node, which are connected to one another via a network. The replicas include a primary replica and a secondary replica. The method includes managing the checkpoint that contains checkpoint information, and creating the primary replica in a memory of the first node. The primary replica contains first checkpoint information. The method also includes updating the primary replica so that the first checkpoint information corresponds to the checkpoint information, creating the secondary replica that contains second checkpoint information in a memory of the second node, and updating the secondary replica so that the second checkpoint information corresponds to the checkpoint information.
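The sequence of steps in the abstract can be illustrated with a minimal in-process sketch. It is not the patented implementation: the names `Node`, `Checkpoint`, `create_replica`, `update_replica`, and `demo` are hypothetical, the two nodes are plain objects rather than networked machines, and a dictionary stands in for each node's memory.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A cluster node (hypothetical); `memory` stands in for its local memory."""
    name: str
    memory: dict = field(default_factory=dict)


class Checkpoint:
    """Hypothetical manager for one checkpoint and its replicas."""

    def __init__(self, checkpoint_id: str):
        self.checkpoint_id = checkpoint_id
        self.info: dict = {}  # the checkpoint information being managed

    def create_replica(self, node: Node) -> None:
        # Create an initially empty replica in the node's memory.
        node.memory[self.checkpoint_id] = {}

    def update_replica(self, node: Node) -> None:
        # Make the replica's information correspond to the checkpoint
        # information. A copy is stored so the replica shares no mutable state.
        node.memory[self.checkpoint_id] = dict(self.info)


def demo():
    first, second = Node("node-1"), Node("node-2")
    cp = Checkpoint("call-state")
    cp.info.update({"session": "active", "step": 3})

    cp.create_replica(first)    # primary replica in the first node's memory
    cp.update_replica(first)
    cp.create_replica(second)   # secondary replica in the second node's memory
    cp.update_replica(second)
    return first.memory["call-state"], second.memory["call-state"]
```

Calling `demo()` returns the contents of both replicas, which should match the checkpoint information after the two updates. A real cluster service would carry the update to the second node over the network; that transport is deliberately left out of this sketch.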
PCT/US2001/014250 2000-05-02 2001-05-02 Method and system for providing cluster replicated checkpoint services WO2001084314A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001259403A AU2001259403A1 (en) 2000-05-02 2001-05-02 Method and system for providing cluster replicated checkpoint services

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US20109900P 2000-05-02 2000-05-02
US20109200P 2000-05-02 2000-05-02
US60/201,099 2000-05-02
US60/201,092 2000-05-02

Publications (2)

Publication Number Publication Date
WO2001084314A2 (en) 2001-11-08
WO2001084314A3 (en) 2002-04-25

Family

ID=26896379

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/014250 WO2001084314A2 (en) 2000-05-02 2001-05-02 Method and system for providing cluster replicated checkpoint services

Country Status (3)

Country Link
US (1) US6823474B2 (en)
AU (1) AU2001259403A1 (en)
WO (1) WO2001084314A2 (en)

Families Citing this family (95)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7158926B2 (en) * 2000-05-05 2007-01-02 Sun Microsystems, Inc. Cluster availability model
US6820216B2 (en) * 2001-03-30 2004-11-16 Transmeta Corporation Method and apparatus for accelerating fault handling
US6829719B2 (en) 2001-03-30 2004-12-07 Transmeta Corporation Method and apparatus for handling nested faults
US8423674B2 (en) * 2001-06-02 2013-04-16 Ericsson Ab Method and apparatus for process sync restart
US7054910B1 (en) * 2001-12-20 2006-05-30 Emc Corporation Data replication facility for distributed computing environments
US7231554B2 (en) * 2002-03-25 2007-06-12 Availigent, Inc. Transparent consistent active replication of multithreaded application programs
US6892320B1 (en) * 2002-06-03 2005-05-10 Sun Microsystems, Inc. Method and apparatus for providing multiple-version support for highly available objects
WO2004012061A2 (en) * 2002-07-29 2004-02-05 Eternal Systems, Inc. Consistent message ordering for semi-active and passive replication
US7206964B2 (en) * 2002-08-30 2007-04-17 Availigent, Inc. Consistent asynchronous checkpointing of multithreaded application programs based on semi-active or passive replication
US7305582B1 (en) * 2002-08-30 2007-12-04 Availigent, Inc. Consistent asynchronous checkpointing of multithreaded application programs based on active replication
US6990541B2 (en) * 2002-11-22 2006-01-24 Sun Microsystems, Inc. Arbitration unit for prioritizing requests based on multiple request groups
US7739240B2 (en) * 2002-12-09 2010-06-15 Hewlett-Packard Development Company, L.P. Replication and replica management in a wide area file system
JPWO2004075135A1 (en) * 2003-02-19 2006-06-01 松下電器産業株式会社 Monitoring electronic device system, monitoring method, program, and recording medium
US7987157B1 (en) * 2003-07-18 2011-07-26 Symantec Operating Corporation Low-impact refresh mechanism for production databases
US7657781B1 (en) * 2003-07-25 2010-02-02 Cisco Technology, Inc. System and method for providing redundant data load sharing in a distributed network
CN1292346C (en) * 2003-09-12 2006-12-27 国际商业机器公司 System and method for performing task in distributing calculating system structure
US7743381B1 (en) 2003-09-16 2010-06-22 Symantec Operating Corporation Checkpoint service
US7165186B1 (en) * 2003-10-07 2007-01-16 Sun Microsystems, Inc. Selective checkpointing mechanism for application components
US9213609B2 (en) * 2003-12-16 2015-12-15 Hewlett-Packard Development Company, L.P. Persistent memory device for backup process checkpoint states
US20050216552A1 (en) * 2004-03-24 2005-09-29 Samuel Fineberg Communication-link-attached persistent memory system
US8181162B2 (en) * 2004-06-14 2012-05-15 Alcatel Lucent Manager component for checkpoint procedures
US20060026367A1 (en) * 2004-07-27 2006-02-02 Sanjoy Das Storage task coordination apparatus method and system
US7299376B2 (en) * 2004-08-25 2007-11-20 International Business Machines Corporation Apparatus, system, and method for verifying backup data
US8122280B2 (en) 2004-08-26 2012-02-21 Open Invention Network, Llc Method and system for providing high availability to computer applications
FR2882448B1 (en) * 2005-01-21 2007-05-04 Meiosys Soc Par Actions Simpli METHOD OF MANAGING, LOGGING OR REPLAYING THE PROGRESS OF AN APPLICATION PROCESS
US7478278B2 (en) * 2005-04-14 2009-01-13 International Business Machines Corporation Template based parallel checkpointing in a massively parallel computer system
US7389300B1 (en) 2005-05-27 2008-06-17 Symantec Operating Corporation System and method for multi-staged in-memory checkpoint replication with relaxed consistency
US8099627B1 (en) * 2005-06-28 2012-01-17 Symantec Operating Corporation Persistent images of distributed shared memory segments and in-memory checkpoints
US7779295B1 (en) * 2005-06-28 2010-08-17 Symantec Operating Corporation Method and apparatus for creating and using persistent images of distributed shared memory segments and in-memory checkpoints
US7669073B2 (en) * 2005-08-19 2010-02-23 Stratus Technologies Bermuda Ltd. Systems and methods for split mode operation of fault-tolerant computer systems
US8078910B1 (en) 2008-12-15 2011-12-13 Open Invention Network, Llc Method and system for providing coordinated checkpointing to a group of independent computer applications
US7681075B2 (en) 2006-05-02 2010-03-16 Open Invention Network Llc Method and system for providing high availability to distributed computer applications
US8082468B1 (en) 2008-12-15 2011-12-20 Open Invention Networks, Llc Method and system for providing coordinated checkpointing to a group of independent computer applications
US20070174484A1 (en) * 2006-01-23 2007-07-26 Stratus Technologies Bermuda Ltd. Apparatus and method for high performance checkpointing and rollback of network operations
US7769727B2 (en) * 2006-05-31 2010-08-03 Microsoft Corporation Resolving update-delete conflicts
JP5029685B2 (en) * 2007-02-28 2012-09-19 富士通株式会社 Backup device
US7987266B2 (en) * 2008-07-29 2011-07-26 International Business Machines Corporation Failover in proxy server networks
US8880473B1 (en) 2008-12-15 2014-11-04 Open Invention Network, Llc Method and system for providing storage checkpointing to a group of independent computer applications
US8341631B2 (en) 2009-04-10 2012-12-25 Open Invention Network Llc System and method for application isolation
US10019327B1 (en) 2008-12-15 2018-07-10 Open Invention Network Llc System and method for hybrid kernel- and user-space incremental and full checkpointing
US8281317B1 (en) 2008-12-15 2012-10-02 Open Invention Network Llc Method and computer readable medium for providing checkpointing to windows application groups
US8745442B1 (en) * 2011-04-28 2014-06-03 Open Invention Network, Llc System and method for hybrid kernel- and user-space checkpointing
US9256496B1 (en) * 2008-12-15 2016-02-09 Open Invention Network, Llc System and method for hybrid kernel—and user-space incremental and full checkpointing
US9354977B1 (en) * 2008-12-15 2016-05-31 Open Invention Network Llc System and method for hybrid kernel- and user-space incremental and full checkpointing
US8041994B2 (en) * 2009-01-09 2011-10-18 Alcatel Lucent Asynchronous checkpointing with audits in high availability networks
US20100185682A1 (en) * 2009-01-09 2010-07-22 Lucent Technologies Inc. Object identifier and common registry to support asynchronous checkpointing with audits
US8713060B2 (en) 2009-03-31 2014-04-29 Amazon Technologies, Inc. Control service for relational data management
US9705888B2 (en) 2009-03-31 2017-07-11 Amazon Technologies, Inc. Managing security groups for data instances
US9207984B2 (en) 2009-03-31 2015-12-08 Amazon Technologies, Inc. Monitoring and automatic scaling of data volumes
US11538078B1 (en) 2009-04-10 2022-12-27 International Business Machines Corporation System and method for usage billing of hosted applications
US9058599B1 (en) 2009-04-10 2015-06-16 Open Invention Network, Llc System and method for usage billing of hosted applications
US9135283B2 (en) 2009-10-07 2015-09-15 Amazon Technologies, Inc. Self-service configuration for data environment
US8074107B2 (en) 2009-10-26 2011-12-06 Amazon Technologies, Inc. Failover and recovery for replicated data instances
US20110246823A1 (en) * 2010-04-05 2011-10-06 Et International, Inc. Task-oriented node-centric checkpointing (toncc)
US8738961B2 (en) * 2010-08-17 2014-05-27 International Business Machines Corporation High-availability computer cluster with failover support based on a resource map
US11615115B2 (en) 2010-12-23 2023-03-28 Mongodb, Inc. Systems and methods for managing distributed database deployments
US10366100B2 (en) 2012-07-26 2019-07-30 Mongodb, Inc. Aggregation framework system architecture and method
US10262050B2 (en) 2015-09-25 2019-04-16 Mongodb, Inc. Distributed database systems and methods with pluggable storage engines
US10713280B2 (en) 2010-12-23 2020-07-14 Mongodb, Inc. Systems and methods for managing distributed database deployments
US9740762B2 (en) 2011-04-01 2017-08-22 Mongodb, Inc. System and method for optimizing data migration in a partitioned database
US10997211B2 (en) 2010-12-23 2021-05-04 Mongodb, Inc. Systems and methods for database zone sharding and API integration
US10740353B2 (en) 2010-12-23 2020-08-11 Mongodb, Inc. Systems and methods for managing distributed database deployments
US10977277B2 (en) 2010-12-23 2021-04-13 Mongodb, Inc. Systems and methods for database zone sharding and API integration
US9881034B2 (en) 2015-12-15 2018-01-30 Mongodb, Inc. Systems and methods for automating management of distributed databases
US8996463B2 (en) 2012-07-26 2015-03-31 Mongodb, Inc. Aggregation framework system architecture and method
US9805108B2 (en) * 2010-12-23 2017-10-31 Mongodb, Inc. Large distributed database clustering systems and methods
US8572031B2 (en) * 2010-12-23 2013-10-29 Mongodb, Inc. Method and apparatus for maintaining replica sets
US10346430B2 (en) 2010-12-23 2019-07-09 Mongodb, Inc. System and method for determining consensus within a distributed database
US10614098B2 (en) 2010-12-23 2020-04-07 Mongodb, Inc. System and method for determining consensus within a distributed database
US11544288B2 (en) 2010-12-23 2023-01-03 Mongodb, Inc. Systems and methods for managing distributed database deployments
US9495477B1 (en) 2011-04-20 2016-11-15 Google Inc. Data storage in a graph processing system
US11307941B1 (en) * 2011-04-28 2022-04-19 Open Invention Network Llc System and method for hybrid kernel- and user-space incremental and full checkpointing
US11625307B1 (en) 2011-04-28 2023-04-11 International Business Machines Corporation System and method for hybrid kernel- and user-space incremental and full checkpointing
US10872095B2 (en) 2012-07-26 2020-12-22 Mongodb, Inc. Aggregation framework system architecture and method
US11544284B2 (en) 2012-07-26 2023-01-03 Mongodb, Inc. Aggregation framework system architecture and method
US11403317B2 (en) 2012-07-26 2022-08-02 Mongodb, Inc. Aggregation framework system architecture and method
US9251002B2 (en) 2013-01-15 2016-02-02 Stratus Technologies Bermuda Ltd. System and method for writing checkpointing data
US9031910B2 (en) 2013-06-24 2015-05-12 Sap Se System and method for maintaining a cluster setup
JP2015095015A (en) * 2013-11-11 2015-05-18 富士通株式会社 Data arrangement method, data arrangement program, and information processing system
US9569517B1 (en) * 2013-11-27 2017-02-14 Google Inc. Fault tolerant distributed key-value storage
EP3090344B1 (en) 2013-12-30 2018-07-18 Stratus Technologies Bermuda Ltd. Dynamic checkpointing systems and methods
ES2652262T3 (en) 2013-12-30 2018-02-01 Stratus Technologies Bermuda Ltd. Method of delaying checkpoints by inspecting network packets
US9588844B2 (en) 2013-12-30 2017-03-07 Stratus Technologies Bermuda Ltd. Checkpointing systems and methods using data forwarding
US10606324B2 (en) * 2015-01-16 2020-03-31 Hewlett Packard Enterprise Development Lp Plenum to deliver cool air and route multiple cables
US10496669B2 (en) 2015-07-02 2019-12-03 Mongodb, Inc. System and method for augmenting consensus election in a distributed database
WO2017004547A1 (en) * 2015-07-02 2017-01-05 Google Inc. Distributed storage system with replica location selection
US10394822B2 (en) 2015-09-25 2019-08-27 Mongodb, Inc. Systems and methods for data conversion and comparison
US10846411B2 (en) 2015-09-25 2020-11-24 Mongodb, Inc. Distributed database systems and methods with encrypted storage engines
US10673623B2 (en) 2015-09-25 2020-06-02 Mongodb, Inc. Systems and methods for hierarchical key management in encrypted distributed databases
US10423626B2 (en) 2015-09-25 2019-09-24 Mongodb, Inc. Systems and methods for data conversion and comparison
US10671496B2 (en) 2016-05-31 2020-06-02 Mongodb, Inc. Method and apparatus for reading and writing committed data
US10621050B2 (en) 2016-06-27 2020-04-14 Mongodb, Inc. Method and apparatus for restoring data from snapshots
US10866868B2 (en) 2017-06-20 2020-12-15 Mongodb, Inc. Systems and methods for optimization of database operations
US10379966B2 (en) * 2017-11-15 2019-08-13 Zscaler, Inc. Systems and methods for service replication, validation, and recovery in cloud-based systems
CN111813786A (en) * 2019-04-12 2020-10-23 阿里巴巴集团控股有限公司 Defect detecting/processing method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5590277A (en) * 1994-06-22 1996-12-31 Lucent Technologies Inc. Progressive retry method and apparatus for software failure recovery in multi-process message-passing applications

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5440726A (en) * 1994-06-22 1995-08-08 At&T Corp. Progressive retry method and apparatus having reusable software modules for software failure recovery in multi-process message-passing applications
JPH08286989A (en) 1995-04-19 1996-11-01 Fuji Xerox Co Ltd Network management system
US5621885A (en) * 1995-06-07 1997-04-15 Tandem Computers, Incorporated System and method for providing a fault tolerant computer program runtime support environment
US5737514A (en) * 1995-11-29 1998-04-07 Texas Micro, Inc. Remote checkpoint memory system and protocol for fault-tolerant computer system
US5740348A (en) 1996-07-01 1998-04-14 Sun Microsystems, Inc. System and method for selecting the correct group of replicas in a replicated computer database system
US5832529A (en) * 1996-10-11 1998-11-03 Sun Microsystems, Inc. Methods, apparatus, and product for distributed garbage collection
US5845292A (en) * 1996-12-16 1998-12-01 Lucent Technologies Inc. System and method for restoring a distributed checkpointed database
US6292905B1 (en) 1997-05-13 2001-09-18 Micron Technology, Inc. Method for providing a fault tolerant network using distributed server processes to remap clustered network resources to other servers during server failure
US6360331B2 (en) * 1998-04-17 2002-03-19 Microsoft Corporation Method and system for transparently failing over application configuration information in a server cluster
US6145094A (en) 1998-05-12 2000-11-07 Sun Microsystems, Inc. Transaction locks for high availability
US6163856A (en) 1998-05-29 2000-12-19 Sun Microsystems, Inc. Method and apparatus for file system disaster recovery
US6308282B1 (en) 1998-11-10 2001-10-23 Honeywell International Inc. Apparatus and methods for providing fault tolerance of networks and network interface cards
US20020138704A1 (en) * 1998-12-15 2002-09-26 Stephen W. Hiser Method and apparatus fault tolerant shared memory
US6594779B1 (en) * 1999-03-30 2003-07-15 International Business Machines Corporation Method, system and program products for managing the checkpointing/restarting of resources of a computing environment
US6380331B1 (en) 2000-06-30 2002-04-30 Exxonmobil Chemical Patents Inc. Metallocene compositions
US6691245B1 (en) * 2000-10-10 2004-02-10 Lsi Logic Corporation Data storage with host-initiated synchronization and fail-over of remote mirror

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KERMARREC A-M ET AL: "DESIGN IMPLEMENTATION AND EVALUATION OF ICARE: AN EFFICIENT RECOVERABLE DSM", SOFTWARE PRACTICE & EXPERIENCE, JOHN WILEY & SONS LTD. CHICHESTER, GB, vol. 28, no. 9, 25 July 1998 (1998-07-25), pages 981 - 1010, XP000765512, ISSN: 0038-0644 *
MOSER L E ET AL: "Eternal: fault tolerance and live upgrades for distributed object systems", DARPA INFORMATION SURVIVABILITY CONFERENCE AND EXPOSITION, 2000. DISCEX '00. PROCEEDINGS HILTON HEAD, SC, USA 25-27 JAN. 2000, LOS ALAMITOS, CA, USA, IEEE COMPUT. SOC, US, 25 January 2000 (2000-01-25), pages 184 - 196, XP010371114, ISBN: 0-7695-0490-6 *

Also Published As

Publication number Publication date
US20020032883A1 (en) 2002-03-14
AU2001259403A1 (en) 2001-11-12
US6823474B2 (en) 2004-11-23
WO2001084314A2 (en) 2001-11-08

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: JP