US20130185716A1 - System and method for providing a virtualized replication and high availability environment - Google Patents

Info

Publication number
US20130185716A1
Authority
US
United States
Prior art keywords
virtual machines
server
virtualized
replication
production server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US13/349,709
Other versions
US8893147B2 (en)
Inventor
Jinxing YIN
Pengcheng Dun
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CA Inc
Original Assignee
Computer Associates Think Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Computer Associates Think Inc
Priority to US13/349,709 (patent US8893147B2)
Assigned to COMPUTER ASSOCIATES THINK, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DUN, PENGCHENG, YIN, JINXING
Publication of US20130185716A1
Assigned to CA, INC. MERGER (SEE DOCUMENT FOR DETAILS). Assignors: COMPUTER ASSOCIATES THINK, INC.
Priority to US14/538,485 (patent US9519656B2)
Application granted
Publication of US8893147B2
Priority to US15/375,617 (patent US10114834B2)
Active
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G06F16/1794 Details of file format conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2056 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
    • G06F11/2069 Management of state, configuration or failover
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/188 Virtual file systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/131 Protocols for games, networked simulations or virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45562 Creating, deleting, cloning virtual machine instances
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing

Definitions

  • the invention generally relates to a system and method for providing a virtualized replication and high availability environment, and in particular, to installing a replication and high availability engine in a parent partition on a virtualized production server (rather than virtual machines that run in child partitions on the virtualized production server), automatically discovering the virtual machines running in the child partitions on the virtualized production server, and automatically synchronizing all files associated with the virtual machines and continuously replicating subsequent changes to the files associated with the virtual machines to a virtualized replica server that can create on-demand virtual machines from the synchronized and replicated files to handle switchover, failover, switchback, and failback events associated with the virtualized production server or the virtual machines running therein.
  • virtualization may create lag times in responding to business needs, provisioning resources, and implementing effective security measures.
  • capacity planning and automation must be implemented to mitigate information technology inefficiencies, slow response times, and missed business opportunities.
  • virtualization tends to cross multiple silos, which requires coordinated management and integration and time-consuming manual processes that can hinder performance and elevate costs.
  • virtualized systems usually require installing appropriate engines on virtual machines managed therein in order to protect applications that may be running in the managed virtual machines.
  • manually installing the engines on the virtual machines tends to be difficult, time consuming, and resource intensive. For example, usage information associated with a particular virtual machine may require the engine installed thereon to create several different high availability scenarios, including some that may be unnecessary or not relevant to user needs.
  • manually installing engines on individual virtual machines requires users to know how to configure different applications running therein (e.g., SQL, Exchange, SharePoint, etc.), which tends to introduce substantial human resource and information technology resource costs.
  • because each virtualization platform contains specific management tools, organizations tend to quickly feel the pain associated with multiple management solutions, uncoordinated manual processes, weak security measures, and inadequate tracking and reporting practices. As such, without a coordinated management approach, organizations may be unable to attain the promise associated with virtualization technology, which may instead become a burden that threatens to consume information technology resources, budgets, and reputations: information technology becomes saddled with trying to effectively manage and scale resources, while business users become frustrated because the applications and services needed to dynamically respond to market opportunities may be unavailable or disrupted.
  • the system and method described herein may provide a virtualized replication and high availability environment.
  • the system and method described herein may provide Windows, Linux, and Unix systems with high availability and continuous or periodic data protection associated with related applications and data to maximize uptime and availability associated with physical and virtualized environments.
  • the system and method described herein may provide simple mechanisms to migrate or replicate data between different servers and locations, whether physical or virtual, and to consolidate data between remote offices and a backup or archive facility and protect onsite and offsite data.
  • the system and method described herein may include non-disruptive recovery testing and data rewind capabilities to restore systems, applications, and data to prior states, which may be useful to speeding recovery times and minimizing data loss.
  • the system and method described herein may further include real-time server and application monitoring, automated and push-button failover or switchover, and automated and push-button switchback (or failback) to restore replica systems or replica applications in response to a production (or master) server having been repaired or replaced.
  • the system and method described herein may perform monitoring at server, application, and hypervisor and virtual machine levels, which may enable the system and method to respond to issues at physical, application, and virtualization levels, and may further replicate operating systems, system states, and application data to an offline replica server, which may enable the system and method to improve protection speeds, reduce costs, and safely test and migrate from a physical to a virtual server or from one virtual server to another virtual server.
  • the system and method described herein may include a unified management console across all operating systems, virtualization platforms, and applications to easily visualize and manage the virtualized replication and high availability environment.
  • the system and method described herein may provide the virtualized replication and high availability environment using an architecture having a hypervisor that runs a guest operating system directly on underlying hardware and supports isolated partitions.
  • the architecture may be based on Microsoft Hyper-V server virtualization technology, which may be used to create and run separate virtual machines on one physical machine and thereby consolidate multiple server and application roles and better leverage server hardware investments.
  • the architecture may natively support x64 computing, which may be leveraged to efficiently run multiple different operating systems in parallel on one physical server and assign multiple processors or cores to one virtual machine to utilize the increased processing capacity associated with multi-core processors or multi-processor architectures.
  • the hypervisor may have a parent partition that runs a virtualization stack having direct access to the underlying hardware, wherein the parent partition may create child partitions that can host any suitable guest operating system.
  • the virtualization stack may include various components that run in a kernel mode or privileged processor ring, including a VMBus that provides a logical channel to redirect requests and responses between virtual devices in the child partitions and the parent partition to manage inter-partition communication between the parent and child partitions.
  • the virtualization stack may further include various device drivers associated with virtual machines running in the child partitions, a kernel to support the guest operating system instance running in the parent partition, and a virtualization service provider that connects to the VMBus and handles device access requests from the child partitions.
  • the child partitions may similarly include a kernel to support the guest operating system running therein, which may be the same or different from the guest operating system running in the parent partition, a VMBus to communicate with the parent partition, and a virtualization service consumer (or virtualization service client) that transparently communicates with the virtualization service provider in the parent partition to redirect and fulfill device access requests that originate in the child partitions.
  • the virtualization stack may include various components that run in a user mode or less privileged ring, including a virtual machine interface provider that guest operating systems or applications running in the child partitions can use to communicate with the hypervisor, a virtual machine management service that can manage states associated with the virtual machines in the child partitions and control state-related tasks associated therewith, and virtual machine worker processes that the virtual machine management service creates to start corresponding virtual machine instances in the child partitions and handle interactions between the parent partition and the virtual machines in the child partitions.
  • the system and method described herein may provide a physical to virtualized or a virtualized to virtualized replication and high availability environment to ensure that various applications or virtual machines running on a production (or active) server will have absolute operational continuity via a virtualized replica server.
  • the system and method described herein may validate consistency between the applications running on the production server (or virtual machines running the applications on the production server) and various virtual machine files hosted on the replica server that correspond to the applications or the virtual machines on the production server, which may enable recovering the applications (or the virtual machines) on the production server from the replica server.
  • the applications (or virtual machines) on the production server may become unavailable due to downtime, failure, or other loss or disruption associated with the production server, in which case the applications (or virtual machines) may be activated on the replica server to ensure continuity and thereby handle the downtime, failure, or disruption associated with the production server.
  • the procedure that relates to loading the applications (or virtual machines) on the replica server may be considered switchover if the downtime was planned or failover if the downtime was unplanned, while the procedure to subsequently recover the applications (or virtual machines) on the production server via the replica server may be considered switchback or failback.
  • the system and method described herein may use a replication and high availability engine on the production server and a similar replication and high availability engine on the replica server, which may both use asynchronous real-time replication and proactive validation to test whether the virtual machine files hosted on the replica server can reliably recover the applications (or virtual machines running the applications) on the production server to provide cost-effective disaster recovery.
  • data associated with various applications and files, databases, or other suitable data sources relating thereto may be synchronized and replicated between the production server and the replica server over local, wide, or other suitable networks having the replication and high availability engine installed therein and the appropriate network connections needed to communicate with one another.
  • the virtualized replication and high availability environment may provide data synchronization, asynchronous real-time data replication, and automated switchover, failover, and switchback to provide data continuity in various deployment scenarios, which may include full system protection (physical or virtual) using a hypervisor host and replication and high availability in physical to virtual guest, virtual guest to virtual guest, and hypervisor host to hypervisor host environments.
  • the system and method described herein may use the hypervisor host to provide the physical and virtual full system protection deployment scenarios, wherein either full system protection deployment scenario may provide application-independent synchronization to transfer a complete state associated with the production server to the virtualized replica server and subsequently replicate changes to the state associated with the production server to the virtualized replica server.
  • a physical production server may read data directly from volumes associated with various master applications running thereon to obtain any suitable files and data relating to the operating system, system state, and disk layout associated with the master applications.
  • the data may then be serialized and sent to the replica server, which may inject the serialized data into virtual hard disk files that represent the volumes associated with the master applications.
  • the replica server may include the hypervisor host within a virtualization stack having a substantially similar architecture to that described above, whereby the hypervisor host may run various different operating systems in one or more child partitions to support the operating systems that run the master applications on the physical production server.
  • the replica server may use the hypervisor host to inject the serialized data into the virtual hard disk files and thereby perform volume-level synchronization associated with the master applications.
  • the virtualized full system protection scenario may operate in a substantially similar manner to the physical full system protection deployment scenario, except that the virtualized full system protection scenario may include a virtualized production server having a hypervisor host that can read the volume data associated with the master applications directly from virtual hard disk files associated with virtual machines that run the applications on the virtualized production server.
  • any subsequent changes to the applications or the virtual machines that run the applications may then be replicated within the virtual hard disk files that correspond to the applications or virtual machines, wherein to handle switchover or failover in response to disruption associated with a master application or virtual machine, the disrupted master application or virtual machine may be disabled and the virtualized replica server may create an on-demand virtual machine from the virtual machine files corresponding thereto and make the on-demand virtual machine available to ensure continuity (e.g., the hypervisor host may configure the on-demand virtual machine with various values specified in a virtual machine configuration file, connect the on-demand virtual machine to a disk image mounted from the virtual hard disk file, and boot the on-demand virtual machine to make the on-demand virtual machine available to end users without disruption).
  • the full system protection deployment scenarios may support large sets of applications and environments and may be simple to deploy because automatically transferring the entire state associated with the production server to the virtualized replica server may obviate or substantially reduce a need to manually provision or otherwise synchronize the virtualized replica server prior to initiating replication operations.
  • the system and method described herein may have various master applications and a replication and high availability engine running on a physical production server.
  • the virtualized replica server may have a similar replication and high availability engine in addition to a virtualization stack to manage various virtual machine files that correspond to the master applications running on the physical production server.
  • the replication and high availability engine running on the physical production server may generally replicate data associated with the master applications or any other suitable data residing on the physical production server to the virtualized replica server, which may reside at the same location as the physical production server or at a remote data center to provide a data protection and disaster recovery site.
  • the physical to virtual guest replication and high availability deployment scenario may generally include synchronizing the physical production server and the virtualized replica server (e.g., via the full system protection techniques described above or any other suitable technique) and then continuously capturing and replicating byte-level changes to the data residing on the physical production server to the virtualized replica server.
  • the virtualization stack on the virtualized replica server may include an active hypervisor that has access to underlying hardware and runs a guest operating system to replicate the changes within various virtual machine files that correspond to the master applications and other data volumes residing on the physical production server and thereby deliver continuous onsite or offsite data protection.
  • the changes captured and replicated from the physical production server to the virtualized replica server may be recorded in a rewind log to preserve a context associated with the replicated data (e.g., to track the changes, undo the changes at the production server, locate a switch point in the virtual machine files on the replica server that can be used to suitably resume business operations in response to disaster or other failure associated with the production server, etc.).
  • the system and method described herein may therefore synchronize and replicate the physical production server to the virtualized replica server to support automated or manual switchover and failover to redirect workloads from the physical production server to the virtualized replica server.
  • the virtualized replica server may invoke one or more components in the virtualization stack to automatically start one or more on-demand virtual machines in response to disruption associated with the physical production server or one or more applications running thereon, wherein the virtualization stack may start the on-demand virtual machines from the virtual machine files that correspond to the applications experiencing disruption on the physical production server.
  • end users and workloads associated with the disrupted applications may then be automatically redirected to the on-demand virtual machines on the virtualized replica server to handle the switchover or failover and thereby minimize business downtime.
  • the procedure to start the on-demand virtual machines on the virtualized replica server and redirect the end users and workloads to the virtualized replica server may be initiated manually, which may enable information technology personnel to investigate the issues that caused the disruption prior to performing the switchover or failover (if necessary).
  • the system and method described herein may generally synchronize and replicate the production server to the virtualized replica server in a substantially similar manner to the physical to virtual guest scenario, and may further handle switchover and failover in a substantially similar manner to the physical to virtual guest scenario.
  • the production server may be virtualized, whereby the virtualized production server may run the applications within one or more virtual machines, while the virtualized replica server may run one or more corresponding virtual machines and maintain one or more virtual machine files that correspond to the virtual machines executing the applications on the virtualized production server (e.g., mirroring the virtual hard disk files, configuration files, and snapshot files associated with the virtual machines running on the virtualized production server).
  • the virtual guest to virtual guest scenario may have different replication and high availability engine instances installed and configured on the individual virtual machines running on the virtualized production server, and different replication and high availability engine instances may be similarly installed and configured on the individual virtual machines running on the virtualized replica server. Accordingly, the replication and high availability engine instances on the virtualized production server and the virtualized replica server may communicate with one another to synchronize, replicate, and manage switchover and failover associated with the individual virtual machines running the applications on the virtualized production server.
  • the system and method described herein may provide the hypervisor host to hypervisor host replication and high availability deployment scenario to obviate or substantially reduce a need to install and configure different instances associated with the replication and high availability engine on individual virtual machines, which may advantageously provide hypervisor-level replication, switchover and failover, and rewind and recovery capabilities associated with all (or certain selected) virtual machines running on the virtualized production server (e.g., if a third party provides the replication and high availability engine, the hypervisor-level replication and high availability scenario may limit the need to purchase the license to only one per virtual host).
  • the hypervisor-level replication and high availability scenario may substantially reduce deployment time and costs because the requisite software need only be installed on the parent partition within each virtual host and may further reduce processor and memory usage because each virtual machine would not require a local replication and high availability engine instance.
  • because the virtualized replica server may create the on-demand virtual machines in response to switchover or failover conditions, the hypervisor-level scenario may satisfy cold site definitions and thereby reduce costs associated with licensing the operating systems and applications running on the virtual hosts.
  • the system and method described herein may have the virtualized production server automatically discover all virtual machines running thereon and create various replication scenarios according to the virtual machines selected to be replicated to the virtualized replica server.
  • the replication and high availability engine installed on the parent partition in the virtualized production server may replicate all the files associated with the discovered (or selected) virtual machines to the virtualized replica server, which may store the replicated files within one or more virtual machine files that correspond to the discovered or selected virtual machines, and any subsequent changes to the files associated with the discovered or selected virtual machines may be continuously replicated to the corresponding virtual machine files on the virtualized replica server in a similar manner.
  • the replication and high availability engine may bring the virtualized replica server online, use the virtualization stack in the parent partition to create on-demand virtual machines from the virtual machine files corresponding to the virtual machines on the virtualized production server, and redirect end users and workloads to the replica server to maintain consistency and minimize downtime.
  • switchover or failover conditions associated with individual virtual machines may be handled similarly, wherein the virtualized replica server may start an appropriate on-demand virtual machine and redirect end users and workloads to the on-demand virtual machine to minimize downtime associated with the individual virtual machines experiencing disruption.
  • the hypervisor-level deployment scenario may include the system and method described herein initially installing the replication and high availability engine in the parent partition on the virtualized production server (rather than the individual virtual machines) and the parent partition on the virtualized replica server.
  • one or more components associated with the virtualization stack may be installed on the guest operating system associated with every virtual machine on the virtualized production server to determine host names associated with the virtual machines, whereby all the virtual machines on the virtualized production server may then be automatically discovered and a volume shadow copy service (VSS) writer associated with the virtualization stack may collect all the files relating to the discovered virtual machines (e.g., virtual hard disk files, configuration files, and snapshot files associated with each virtual machine).
  • the replication and high availability engine may then automatically create various replication scenarios associated with each virtual machine to define various replication properties associated with each virtual machine, wherein the replication and high availability engine may then run all scenarios associated with all virtual machines to replicate and protect all the virtual machines, or alternatively select certain virtual machines (or certain scenarios associated with a particular virtual machine) to customize the replication scenarios used to protect the virtualized production server.
  • any subsequent changes to the virtual machines may be replicated to the virtualized replica server.
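  • the discovery and scenario-creation flow described above can be sketched as follows. This is a minimal illustration only, not the engine's actual interface: the names `VirtualMachine`, `ReplicationScenario`, `discover_virtual_machines`, `create_scenarios`, and `run_scenarios`, as well as the in-memory `inventory` dictionary standing in for the virtualization stack, are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class VirtualMachine:
    host_name: str
    files: List[str]  # e.g. virtual hard disk, configuration, and snapshot files


@dataclass
class ReplicationScenario:
    vm: VirtualMachine
    replica_host: str
    properties: Dict[str, str] = field(default_factory=dict)


def discover_virtual_machines(inventory: Dict[str, List[str]]) -> List[VirtualMachine]:
    """Hypothetical stand-in for automatic discovery on the production host."""
    return [VirtualMachine(name, list(files)) for name, files in sorted(inventory.items())]


def create_scenarios(vms: List[VirtualMachine], replica_host: str,
                     selected: Optional[Set[str]] = None) -> List[ReplicationScenario]:
    """Create one scenario per VM, or only for the selected host names."""
    chosen = [vm for vm in vms if selected is None or vm.host_name in selected]
    return [ReplicationScenario(vm, replica_host, {"mode": "continuous"}) for vm in chosen]


def run_scenarios(scenarios: List[ReplicationScenario]) -> Dict[str, List[str]]:
    """Simulate replication: return the replica-side map of host name to files."""
    return {s.vm.host_name: list(s.vm.files) for s in scenarios}
```

Passing `selected=None` corresponds to protecting every discovered virtual machine, while passing a set of host names corresponds to customizing which virtual machines are replicated.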
  • the system and method described herein may handle switchover or failover conditions associated with one or more virtual machines on the virtualized production server, which may include the virtualized replica server creating and registering one or more on-demand virtual machines corresponding to the virtual machines associated with the switchover or failover condition (e.g., from the corresponding virtual machine files).
  • the switchover and failover procedure may generally exchange active and standby roles between the virtualized production server and the virtualized replica server, whereby the virtualized production server may change to a standby role in response to the switchover or failover assigning the active role to the virtualized replica server.
  • the relevant scenarios may further specify how to handle reverse replication operations (e.g., replicating changes to the on-demand virtual machines to protect or otherwise back up changes to the files associated therewith), whereby changes to the on-demand virtual machines may continue to be replicated in accordance with the reverse replication operations specified in the relevant scenarios.
  • the switchover or failover may be triggered manually or automatically.
  • the system and method described herein may perform switchback or failback to return the active role to the virtualized production server and the standby role to the virtualized replica server subsequent to the switchover or failover exchanging the active and standby roles between the virtualized production server and the virtualized replica server.
  • the system and method described herein may determine whether to overwrite the data that existed on the virtualized production server prior to the switchover or failover with the data existing on the virtualized replica server at the time that the switchback or failback will be performed.
  • the lost or corrupted data can be restored from the virtualized replica server via reverse synchronization to the virtualized production server, or the lost or corrupted data may be recovered from a certain event in the past or a prior point in time via the data rewind capabilities (e.g., via a suitable event-stamped or time-stamped checkpoint and/or bookmark that can be used to roll the virtualized production server back to the event or point in time prior to when the data was lost or corrupted).
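  • the exchange of active and standby roles during switchover or failover, and their return during switchback or failback, can be sketched as a small state holder. This is an illustrative sketch under stated assumptions: the `ServerPair` class, its method names, and the returned action strings are hypothetical and do not come from the described engine.

```python
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


class ServerPair:
    """Hypothetical model of a production/replica pair and their roles."""

    def __init__(self) -> None:
        self.roles = {"production": Role.ACTIVE, "replica": Role.STANDBY}
        self.reverse_replication = False

    def switch_to_replica(self, planned: bool, reverse_replication: bool = True) -> str:
        """Exchange roles; planned downtime is a switchover, unplanned a failover."""
        if self.roles["replica"] is Role.ACTIVE:
            raise RuntimeError("replica is already active")
        self.roles = {"production": Role.STANDBY, "replica": Role.ACTIVE}
        # While the replica is active, its changes may be replicated in reverse.
        self.reverse_replication = reverse_replication
        return "switchover" if planned else "failover"

    def switch_back(self, overwrite_production: bool) -> str:
        """Return the active role to production, optionally overwriting its data."""
        if self.roles["production"] is Role.ACTIVE:
            raise RuntimeError("production is already active")
        action = "reverse-synchronize" if overwrite_production else "keep-production-data"
        self.roles = {"production": Role.ACTIVE, "replica": Role.STANDBY}
        self.reverse_replication = False
        return action
```

The `overwrite_production` flag models the decision of whether data that existed on the production server before the switchover should be overwritten with the data on the replica at switchback time.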
  • FIG. 1 illustrates an exemplary architecture that may be used to provide a virtualized replication and high availability environment, according to one aspect of the invention.
  • FIG. 2A illustrates an exemplary system that may provide a physical to virtualized replication and high availability environment, according to one aspect of the invention.
  • FIG. 2B illustrates an exemplary system that may provide a virtualized to virtualized replication and high availability environment, according to one aspect of the invention.
  • FIG. 3 illustrates an exemplary method that may be used to balance loads and manage switchover or failover conditions in a virtualized replication and high availability environment, according to one aspect of the invention.
  • the system and method described herein may provide a virtualized replication and high availability environment.
  • the system and method described herein may provide Windows, Linux, and Unix systems with high availability and continuous or periodic data protection associated with related applications and data to maximize uptime and availability associated with physical and virtualized environments.
  • the system and method described herein may provide simple mechanisms to migrate or replicate data between different servers and locations, whether physical or virtual, and to consolidate data between remote offices and a backup or archive facility and protect onsite and offsite data.
  • the system and method described herein may include non-disruptive recovery testing and data rewind capabilities to restore systems, applications, and data to prior states, which may be useful for speeding recovery times and minimizing data loss.
  • the system and method described herein may further include real-time server and application monitoring, automated and push-button failover or switchover, and automated and push-button switchback (or failback) to restore replica systems or replica applications in response to a production (or master) server having been repaired or replaced.
  • the system and method described herein may perform monitoring at server, application, and hypervisor and virtual machine levels, which may enable the system and method to respond to issues at physical, application, and virtualization levels, and may further replicate operating systems, system states, and application data to an offline replica server, which may enable the system and method to improve protection speeds, reduce costs, and safely test and migrate from a physical to a virtual server or from one virtual server to another virtual server.
  • the system and method described herein may include a unified management console across all operating systems, virtualization platforms, and applications to easily visualize and manage the virtualized replication and high availability environment.
  • FIG. 1 illustrates an exemplary architecture 100 that may be used to provide the virtualized replication and high availability environment that will be described in further detail herein.
  • the architecture 100 illustrated in FIG. 1 may include a hypervisor 120 that runs directly on underlying hardware 110 , wherein the hypervisor 120 may run a guest operating system and support isolated partitions.
  • the architecture 100 may include Microsoft Hyper-V server virtualization technology integrated with Windows Server 2008, which may be used to create and run separate virtual machines on one physical machine and thereby consolidate multiple server and application roles and better leverage server hardware investments.
  • Hyper-V architecture 100 may natively support x64 computing, which may be leveraged to efficiently run multiple different operating systems (e.g., Windows®, Linux, etc.) in parallel on one physical server, and may allow assigning multiple processors or processor cores to one virtual machine, which may provide a future-proof virtualization technology that can utilize the increased processing capacity associated with multi-core processors.
  • the hypervisor 120 used in the architecture 100 may have a parent partition that runs an appropriate guest operating system (e.g., Windows Server 2008 if the architecture 100 implements Microsoft Hyper-V), wherein a virtualization stack may run in the parent partition and have direct access to the underlying hardware devices 110 .
  • the parent partition may then create one or more child partitions that can host any suitable guest operating system (e.g., Windows Server 2008, Windows NT 4.0, Linux distributions, etc.).
  • the virtualization stack running in the parent partition may include various components that run in a kernel mode or privileged processor ring (i.e., “Ring 0”), including a VMBus 130 a that provides a logical channel to redirect requests and responses between virtual devices in the child partitions and the parent partition that has access to the underlying hardware 110 and thereby manage inter-partition communication between the parent and child partitions.
  • the virtualization stack in the parent partition may further include various device drivers 135 associated with virtual machines running in the child partitions, a kernel 140 a to support the guest operating system instance (e.g., Windows Server 2008) running in the parent partition, and a virtualization service provider 150 that handles device access requests from the child partitions via the VMBus 130 a .
  • the child partitions may similarly include a kernel 140 b to support the guest operating system running therein, which may be the same or different from the guest operating system running in the parent partition, a VMBus 130 b to communicate with the parent partition, and a virtualization service consumer (or virtualization service client) 155 that transparently communicates with the virtualization service provider 150 in the parent partition (e.g., via the VMBus 130 a and VMBus 130 b ) to redirect and fulfill device access requests that originate in the child partitions.
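  • the request path between a child partition and the parent partition can be illustrated with a toy model. This is only a conceptual sketch: the classes below are hypothetical stand-ins for the virtualization service client, the VMBus channel, and the virtualization service provider, and they do not reflect any real Hyper-V programming interface.

```python
from typing import Callable, Dict, List, Tuple


class VirtualizationServiceProvider:
    """Parent-partition side: the only party with access to (simulated) devices."""

    def __init__(self, hardware: Dict[str, Callable[[str], str]]) -> None:
        self.hardware = hardware

    def handle(self, device: str, payload: str) -> str:
        if device not in self.hardware:
            raise KeyError(f"no such device: {device}")
        return self.hardware[device](payload)


class VMBus:
    """Logical channel that redirects requests and responses between partitions."""

    def __init__(self, provider: VirtualizationServiceProvider) -> None:
        self.provider = provider
        self.log: List[Tuple[str, str]] = []

    def send(self, partition_id: str, device: str, payload: str) -> str:
        self.log.append((partition_id, device))
        return self.provider.handle(device, payload)


class VirtualizationServiceClient:
    """Child-partition side: forwards device access requests over the bus."""

    def __init__(self, partition_id: str, bus: VMBus) -> None:
        self.partition_id = partition_id
        self.bus = bus

    def request(self, device: str, payload: str) -> str:
        return self.bus.send(self.partition_id, device, payload)
```

In this model the child partition never touches the `hardware` mapping directly; every request is redirected through the bus to the provider in the parent partition.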
  • the virtualization stack in the parent partition may further include various components that run in a user mode or less privileged ring (i.e., “Ring 3”), including a virtual machine interface provider 160 that guest operating systems or applications 190 running in the child partitions can use to communicate with the hypervisor 120 (via the VMBus 130 a - b ).
  • the components running in Ring 3 may include a virtual machine management service 170 that can manage states associated with the virtual machines running applications 190 in the child partitions and control tasks that relate to states associated therewith (e.g., capturing snapshots associated with the virtual machines).
  • the virtual machine management service 170 may create one or more virtual machine worker processes 180 to start corresponding virtual machine instances that run the applications 190 in the child partitions, wherein the virtual machine worker processes 180 may handle management level interactions between the parent partition and the virtual machines in the child partitions.
  • the virtual machine worker processes 180 may create, configure, run, pause, resume, save, restore, and snapshot the associated virtual machine instance in the child partitions, and may further handle interrupt requests, memory, and input/output port mapping associated with the virtual machine instances.
  • FIG. 2A illustrates an exemplary system 200 A that may provide a physical to virtualized replication and high availability environment.
  • FIG. 2B illustrates an exemplary system 200 B that may provide a virtualized to virtualized replication and high availability environment.
  • the system 200 A may generally include a physical production (or active) server 220 and a virtualized replica (or standby) server 260 to ensure that various applications 250 running on the production server 220 will maintain a certain degree of operational continuity over a given measurement period.
  • the system 200 B shown in FIG. 2B may include a virtualized production server 220 having various virtual machines 255 that run the applications 250 on the production server 220 , wherein the virtualized replica server 260 may similarly ensure that the applications 250 running in the virtual machines 255 on the virtualized production server 220 will maintain the same degree of operational continuity over the measurement period.
  • the systems 200 A-B may validate consistency between the virtual machine files 270 hosted on the replica server 260 and, respectively, the applications 250 running on the physical production server 220 or the virtual machines 255 running the applications 250 on the virtualized production server 220 , which may enable recovering the applications 250 (or the virtual machines 255 running the applications 250 ) on the production server 220 from the virtual machine files 270 on the replica server 260 .
  • the applications 250 running on the production server 220 may become unavailable due to downtime, failure, or other loss or disruption associated with the production server 220 , in which case the system 200 may load the applications 250 (or the virtual machines 255 running the applications 250 on the production server 220 ) on the replica server 260 to ensure continuity and thereby handle the downtime, failure, or other loss or disruption associated with the production server 220 .
  • the procedure in which the system 200 loads the applications 250 (or the virtual machines 255 running the applications 250 ) on the replica server 260 to ensure continuity may be considered switchover if the downtime was planned (e.g., to upgrade or maintain the production server 220 ) or failover if the downtime was unplanned (e.g., because the production server 220 failed due to a threat, overload condition, or other emergency that was not anticipated in advance).
  • the procedure to subsequently recover the applications 250 (or the virtual machines 255 running the applications 250 ) on the production server 220 from the replica server 260 may be referred to as switchback or failback.
  • the systems 200 A-B shown in FIGS. 2A-B may both include a replication and high availability engine 240 a on the production server 220 and a similar replication and high availability engine 240 b on the replica server 260 , wherein the replication and high availability engines 240 a - b may use asynchronous real-time replication and proactive validation to test whether the virtual machine files 270 hosted on the replica server 260 can reliably recover the applications 250 (or the virtual machines 255 running the applications 250 ) on the production server 220 to provide cost-effective disaster recovery.
  • data associated with various applications 250 and files, databases, or other suitable data sources relating thereto may be synchronized between the production server 220 and the replica server 260 and subsequent changes to the data may be asynchronously replicated between the production server 220 and the replica server 260 , wherein the data may be synchronized and replicated over local area networks, wide area networks, or other suitable networks that have the replication and high availability engine 240 installed therein and the appropriate TCP or other network connections needed to communicate with one another.
  • the virtualized replication and high availability environment provided in the systems 200 A and 200 B may provide data synchronization, asynchronous real-time data replication, and automated switchover, failover, and switchback to provide data continuity in various deployment scenarios.
  • the various deployment scenarios may include full system protection (physical or virtual) using a hypervisor host, and may further include replication and high availability in physical to virtual guest, virtual guest to virtual guest, and hypervisor host to hypervisor host environments.
  • the system 200 A shown in FIG. 2A may use the hypervisor host to provide the physical full system protection deployment scenario, and the system 200 B shown in FIG. 2B may use the hypervisor host to provide the virtual full system protection deployment scenario, wherein either full system protection deployment scenario may provide application-independent synchronization to transfer a complete state associated with the production server 220 to the virtualized replica server 260 and subsequently replicate changes to the state associated with the production server 220 to the virtualized replica server 260 .
  • the replication and high availability engine 240 a on the physical production server 220 may read data directly from volumes associated with various master applications 250 a - n running on the physical production server 220 to obtain any suitable files and data relating to the operating system, system state, and disk layout associated with the master applications 250 a - n .
  • the replication and high availability engine 240 a may then serialize and send the data read from the volumes associated with the master applications 250 a - n to the virtualized replica server 260 , which may inject the serialized data into virtual hard disk (*.vhd) files 270 a - n that represent the volumes associated with the master applications 250 a - n .
  • the virtualized replica server 260 may include the hypervisor host within a virtualization stack 230 b having a substantially similar architecture to that shown in FIG. 1 and described above, whereby the hypervisor host may run various different operating systems in one or more child partitions to support the operating systems that run the master applications 250 on the physical production server 220 .
  • a replication and high availability engine 240 b on the virtualized replica server 260 may use the hypervisor host in the virtualization stack 230 b to inject the serialized data into the *.vhd files 270 and thereby perform volume-level synchronization associated with the master applications 250 a - n running on the physical production server 220 .
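  • the volume-level synchronization described above, in which data read from a volume is serialized on the production side and injected into a disk image on the replica side, can be sketched as follows. The record layout (an 8-byte offset and a 4-byte length before each block) and the function names are assumptions made for illustration, and the image is treated as a flat byte stream rather than a real *.vhd file, which has its own header and footer structures.

```python
import io
import struct

# Assumed record header: big-endian 8-byte offset followed by 4-byte length.
HEADER = struct.Struct(">QI")


def serialize_blocks(volume: bytes, block_size: int) -> bytes:
    """Serialize a volume into a stream of (offset, length, data) records."""
    out = io.BytesIO()
    for offset in range(0, len(volume), block_size):
        chunk = volume[offset:offset + block_size]
        out.write(HEADER.pack(offset, len(chunk)))
        out.write(chunk)
    return out.getvalue()


def inject_into_image(stream: bytes, image: io.BytesIO) -> int:
    """Write each record's data at its offset in the image; return bytes written."""
    view = io.BytesIO(stream)
    written = 0
    while True:
        header = view.read(HEADER.size)
        if not header:
            break  # clean end of stream
        if len(header) < HEADER.size:
            raise ValueError("truncated record header")
        offset, length = HEADER.unpack(header)
        data = view.read(length)
        if len(data) != length:
            raise ValueError("truncated record data")
        image.seek(offset)
        image.write(data)
        written += length
    return written
```

Because each record carries its own offset, the same injection routine could also apply later changes to individual blocks without resending the whole volume.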
  • the system 200 B may operate in a substantially similar manner to synchronize and transfer the complete state associated with the virtualized production server 220 to the virtualized replica server 260 , except that the virtualized production server 220 may include a virtualization stack 230 a having a hypervisor host that can read the operating system, system state, disk layout, and other volume data associated with the master applications 250 directly from virtual hard disk files associated with virtual machines 255 that run the applications 250 on the virtualized production server 220 .
  • the replication and high availability engine 240 a on the production server 220 may replicate any changes to the applications 250 or the virtual machines 255 that run the applications 250 to the virtualized replica server 260 , which may replicate the changes within the *.vhd files 270 that correspond to the changed applications 250 or virtual machines 255 .
  • the replication may be performed at the file-level, including all files on the volumes associated with the master applications 250 in addition to any files in system folders that relate to the production server 220 .
  • the systems 200 A-B may both use reverse path lookups to maintain consistent mappings between the *.vhd files 270 hosted on the virtualized replica server 260 and the volumes or file systems associated with the master applications 250 or virtual machines 255 on the production server 220 (e.g., using techniques described in U.S. patent application Ser. No. 13/234,532, entitled “System and Method for Network File System Server Replication Using Reverse Path Lookup,” filed Sep. 16, 2011, the contents of which are hereby incorporated by reference in their entirety).
  • the virtualized replica server 260 may initially have an offline state to prevent network address, network name, or other network conflicts (i.e., because the virtualized replica server 260 represents an effective clone associated with the physical or virtualized production server 220 ). However, to handle switchover or failover in response to downtime, failure, or other loss or disruption associated with a particular master application 250 or virtual machine 255 running thereon, the disrupted master application 250 or virtual machine 255 may be disabled and the virtualized replica server 260 may create an on-demand virtual machine 280 from the virtual machine file 270 corresponding thereto.
  • the virtual machine files 270 on the virtualized replica server 260 may include a *.xml file that contains information to configure the operating system, disk size, network, and other aspects associated with the on-demand virtual machine 280 and a *.avhd file that contains a most recent snapshot associated with the master application 250 or virtual machine 255 , which may be created, validated, and otherwise managed using techniques described in U.S. patent application Ser. No. 13/043,201, entitled “System and Method for Providing Assured Recovery and Replication,” filed Mar. 8, 2011, and U.S. patent application Ser. No.
  • the virtualization stack 230 b may use the virtual machine files 270 to create the on-demand virtual machine 280 and make the on-demand virtual machine 280 available to ensure continuity associated with the application 250 or virtual machine 255 that were disrupted on the production server 220 .
  • the virtualization stack 230 b may use the hypervisor host to configure the on-demand virtual machine 280 with various values specified in the *.xml configuration file 270 , mount a disk image from the *.vhd file 270 and connect the on-demand virtual machine 280 to the mounted disk image, configure network connections associated with the on-demand virtual machine 280 with information specified in the *.xml configuration file 270 , and then boot the on-demand virtual machine 280 and install integration services to make the on-demand virtual machine 280 available to end users.
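  • the sequence used to bring up an on-demand virtual machine (configure from the *.xml file, mount and connect the disk image, configure the network, boot, and install integration services) can be sketched as follows. The XML element names, the `OnDemandVM` class, and the `create_on_demand_vm` function are hypothetical; the actual configuration file schema is not specified here, and the steps are only recorded rather than performed.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class OnDemandVM:
    name: str
    os_name: str
    disk_size_gb: int
    network: str
    disk_image: str = ""
    state: str = "created"
    steps: List[str] = field(default_factory=list)


def create_on_demand_vm(config_xml: str, vhd_path: str) -> OnDemandVM:
    """Walk through the bring-up steps using an assumed, simplified XML layout."""
    root = ET.fromstring(config_xml)
    vm = OnDemandVM(
        name=root.get("name", "unnamed"),
        os_name=root.findtext("os", default=""),
        disk_size_gb=int(root.findtext("disk_size_gb", default="0")),
        network=root.findtext("network", default=""),
    )
    vm.steps.append("configured")
    vm.disk_image = vhd_path  # stands in for mounting and connecting the disk image
    vm.steps.append("disk mounted")
    vm.steps.append("network connected")
    vm.state = "running"
    vm.steps.append("booted")
    vm.steps.append("integration services installed")
    return vm
```

Keeping the configuration values separate from the disk image mirrors the split between the *.xml configuration file and the *.vhd file described above.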
  • because the physical or virtual full system protection deployment scenarios described above may protect the entire state associated with the production server 220 , the full system protection deployment scenarios may support large sets of applications 250 and environments.
  • the full system protection deployment scenarios may be simple to deploy because transferring the entire state associated with the production server 220 to the virtualized replica server 260 in an automated manner may obviate or substantially reduce any need to manually provision the virtualized replica server 260 prior to initiating replication operations.
  • the system 200 A shown in FIG. 2A may be used to provide the physical to virtual guest replication and high availability deployment scenario.
  • the system 200 A may include a physical production server 220 having various master applications 250 a - n and a replication and high availability engine 240 a running thereon in addition to a virtualized replica server 260 having a similar replication and high availability engine 240 b and a virtualization stack 230 b to manage various virtual machine files 270 a - n that correspond to the master applications 250 a - n running on the physical production server 220 .
  • the replication and high availability engine 240 a may generally replicate data associated with the master applications 250 a - n residing on the physical production server 220 or any other suitable data residing on the physical production server 220 to the virtualized replica server 260 , wherein the physical production server 220 and the virtualized replica server 260 may reside at the same location, or the virtualized replica server 260 may be located at a remote data center or remote office that provides a data protection and disaster recovery site associated with the physical production server 220 .
  • the physical to virtual guest replication and high availability deployment scenario may continuously capture and replicate byte-level changes to the master applications 250 a - n and any databases or files on the physical production server 220 to the virtualized replica server 260 .
  • the byte-level changes may be captured and replicated using techniques described in U.S. patent application Ser. No. 10/188,512, entitled “Method and System for Updating an Archive of a Computer File,” filed Jul. 3, 2002, which issued as U.S. Pat. No. 7,730,031 on Jun.
  • the virtualization stack 230 b on the virtualized replica server 260 may include an active hypervisor that has access to underlying hardware and runs a guest operating system to replicate the changes within various virtual machine files 270 a - n that correspond to the master applications 250 a - n and other data volumes residing on the physical production server 220 and thereby deliver continuous onsite or offsite data protection.
  • the changes captured and replicated from the physical production server 220 to the virtualized replica server 260 may be recorded in a rewind log to preserve a context that can be used, for example, to track the changes, undo the changes at the production server 220 , or locate a switch point in the virtual machine files 270 on the replica server 260 that can be used to suitably resume business operations in response to a disaster or other failure associated with the production server 220 (e.g., using techniques described in U.S. patent application Ser. No. 10/981,837, entitled “Replicated Data Validation,” filed Nov. 5, 2004, which issued as U.S. Pat. No. 7,840,535 on Nov. 23, 2010, the contents of which are hereby incorporated by reference in their entirety).
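  • a rewind log of the kind described above can be sketched as a list of byte-level changes that records the overwritten bytes alongside the new bytes, so that changes can be undone back to a named bookmark. This is a simplified sketch under stated assumptions: the `RewindLog` class and its methods are hypothetical, only in-place overwrites within the existing data are supported, and the incorporated techniques referenced above are not reproduced here.

```python
from typing import Dict, List, Tuple


class RewindLog:
    """Hypothetical rewind log: each entry keeps the old and new bytes of a change."""

    def __init__(self) -> None:
        self.entries: List[Tuple[int, bytes, bytes]] = []  # (offset, old, new)
        self.bookmarks: Dict[str, int] = {}  # bookmark name -> entry count at that time

    def apply(self, data: bytearray, offset: int, new: bytes) -> None:
        """Apply an in-place change and record enough context to undo it."""
        end = offset + len(new)
        if offset < 0 or end > len(data):
            raise ValueError("change must fall within the existing data")
        old = bytes(data[offset:end])
        data[offset:end] = new
        self.entries.append((offset, old, new))

    def bookmark(self, name: str) -> None:
        """Mark the current position in the log as a named rewind point."""
        self.bookmarks[name] = len(self.entries)

    def rewind(self, data: bytearray, name: str) -> int:
        """Undo changes in reverse order back to the bookmark; return count undone."""
        target = self.bookmarks[name]
        undone = 0
        while len(self.entries) > target:
            offset, old, new = self.entries.pop()
            data[offset:offset + len(new)] = old
            undone += 1
        # Bookmarks taken after the rewind point no longer refer to valid entries.
        self.bookmarks = {k: v for k, v in self.bookmarks.items() if v <= target}
        return undone
```

Undoing entries in reverse order matters when changes overlap, since each entry's saved bytes reflect the data as it was immediately before that change.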
  • the physical to virtual guest replication and high availability deployment scenario may therefore synchronize and replicate the physical production server 220 to the virtualized replica server 260 to support automated or manual switchover and failover to redirect workloads from the physical production server 220 to the virtualized replica server 260 .
  • the virtualized replica server 260 may invoke one or more components in the virtualization stack 230 b to automatically start one or more on-demand virtual machines 280 in response to downtime, failure, outage, or other disruption associated with the physical production server 220 or one or more applications 250 running thereon.
  • the virtualization stack 230 b may start the one or more on-demand virtual machines 280 from the virtual machine files 270 that correspond to the applications 250 experiencing disruption on the physical production server 220 , wherein end users and workloads associated with the disrupted applications 250 may be automatically redirected to the on-demand virtual machines 280 started on the virtualized replica server 260 to handle the switchover or failover and thereby minimize business downtime.
  • the procedure to start the on-demand virtual machines 280 on the virtualized replica server 260 and redirect the end users and workloads associated with the disrupted applications 250 to the virtualized replica server 260 may be initiated manually, whereby information technology personnel may investigate the issues that caused the disruption prior to performing the switchover or failover (if necessary).
  • the system 200 B shown in FIG. 2B may be used to provide the virtual guest to virtual guest and hypervisor host to hypervisor host replication and high availability deployment scenarios.
  • the virtual guest to virtual guest replication and high availability deployment scenario may generally synchronize and replicate the production server 220 to the virtualized replica server 260 in a substantially similar manner to the physical to virtual guest replication and high availability deployment scenario, and may further handle switchover and failover in a substantially similar manner to the physical to virtual guest scenario.
  • the production server 220 may be virtualized, whereby the virtualized production server 220 may include one or more virtual machines 255 to execute the applications 250 on the virtualized production server 220 , while the virtualized replica server 260 runs one or more virtual machines 280 and maintains one or more virtual machine files 270 that correspond to the virtual machines 255 executing the applications 250 on the virtualized production server 220 (e.g., mirroring the *.vhd virtual hard disk files, the *.xml configuration files, and the *.avhd snapshot files associated with the virtual machines running on the virtualized production server).
  • the virtual guest to virtual guest deployment scenario may have different instances of the replication and high availability engine 240 a installed and configured on the individual virtual machines 255 running on the virtualized production server 220 , and may similarly have different instances of the replication and high availability engine 240 b installed and configured on the individual virtual machines 280 running on the virtualized replica server 260 . Accordingly, the replication and high availability engine instances 240 a on the virtualized production server 220 and the replication and high availability engine instances 240 b on the virtualized replica server 260 may communicate with one another to synchronize, replicate, and manage switchover and failover associated with the individual virtual machines 255 running the applications 250 on the virtualized production server 220 .
  • the hypervisor host to hypervisor host replication and high availability deployment scenario may obviate or substantially reduce a need to install and configure different instances associated with the replication and high availability engine 240 on individual virtual machines, which may advantageously provide hypervisor-level replication, switchover and failover, and rewind and recovery capabilities associated with all (or certain selected) virtual machines 255 running on the virtualized production server 220 .
  • where the replication and high availability engine 240 is licensed from a third party, the hypervisor-level replication, switchover and failover, and rewind and recovery capabilities may require purchasing only one license per virtual host (e.g., one license for the replication and high availability engine 240 a on the virtualized production server 220 and one license for the replication and high availability engine 240 b on the virtualized replica server 260 ).
  • the hypervisor-level capabilities may substantially reduce deployment time and costs because the requisite software need only be installed on the parent partition within the virtualized production server 220 and the virtualized replica server 260 , and may further reduce processor and memory usage because each virtual machine 255 would not require a locally installed replication and high availability engine instance 240 a .
  • because the virtualized replica server 260 only creates the on-demand virtual machines 280 in response to a switchover or failover condition, the hypervisor-level deployment scenario may satisfy cold site definitions and thereby reduce costs associated with licensing operating systems and licenses associated with the applications 250 running in the virtual machines 255 on the virtualized production server 220 .
  • the virtualized production server 220 may automatically discover all the virtual machines 255 running thereon and create various replication scenarios according to the virtual machines 255 that are selected to be replicated to the virtualized replica server 260 .
  • the replication and high availability engine 240 a installed on the parent partition in the virtualized production server 220 may then replicate all the files associated with the discovered virtual machines 255 (or the virtual machines 255 selected to be replicated) to the virtualized replica server 260 , which may use the replication and high availability engine 240 b to store the replicated files within one or more virtual machine files 270 that correspond to the discovered (or selected) virtual machines 255 .
  • the changes may be continuously replicated within the corresponding virtual machine files 270 stored on the virtualized replica server 260 (via the replication and high availability engine 240 b ).
  • the replication and high availability engine 240 b may bring the virtualized replica server 260 online, use the virtualization stack 230 b in the parent partition to create on-demand virtual machines 280 within one or more child partitions from the virtual machine files 270 corresponding to the discovered or selected virtual machines on the virtualized production server 220 , and redirect end users and workloads to the replica server 260 to maintain consistency and minimize downtime associated with the disruption to the virtualized production server 220 .
  • switchover or failover conditions associated with individual discovered or selected virtual machines 255 may be handled in a similar manner, whereby the virtualized replica server 260 may start an appropriate on-demand virtual machine 280 and redirect end users and workloads to the on-demand virtual machine 280 to minimize downtime associated with the individual discovered or selected virtual machines 255 experiencing disruption.
  • the hypervisor-level replication and high availability scenario may include initially installing the replication and high availability engine 240 a in the parent partition on the virtualized production server 220 (rather than the individual virtual machines 255 ) and similarly installing the replication and high availability engine 240 b in the parent partition on the virtualized replica server 260 .
  • one or more components associated with the virtualization stack may be installed on the guest operating system associated with every virtual machine 255 on the virtualized production server 220 to enable the replication and high availability engine 240 a to determine host names associated with the virtual machines 255 .
  • the replication and high availability engine 240 a may then automatically discover all the virtual machines 255 on the virtualized production server 220 and use a volume shadow copy service (VSS) writer associated with the virtualization stack 230 a to collect all the files 270 relating to the discovered virtual machines 255 , wherein the collected files 270 may include the *.vhd files that represent virtual hard disks associated with each virtual machine 255 , the *.xml configuration files that contain unique identifiers and various settings associated with each virtual machine 255 , and the *.avhd files that contain all snapshots associated with the individual virtual machines 255 .
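The file collection described above can be illustrated with a minimal Python sketch. It only sorts a list of paths into the three file types named in the text; it does not call the VSS writer or any Hyper-V interface, and the function name `classify_vm_files` and the category labels are hypothetical names chosen for illustration.

```python
from pathlib import PurePath

# File extensions named in the text, mapped to hypothetical category labels.
KINDS = {
    ".vhd": "virtual_hard_disk",
    ".xml": "configuration",
    ".avhd": "snapshot",
}


def classify_vm_files(paths):
    """Group paths by the virtual machine file type their extension indicates.

    Paths with any other extension are ignored.
    """
    grouped = {kind: [] for kind in KINDS.values()}
    for path in paths:
        kind = KINDS.get(PurePath(path).suffix.lower())
        if kind is not None:
            grouped[kind].append(path)
    return grouped
```

In a real deployment the list of paths would come from the virtualization stack rather than from a caller-supplied list; that step is outside the scope of this sketch.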
  • the replication and high availability engine 240 a may automatically create various replication scenarios associated with each virtual machine 255 .
  • the replication scenarios may generally define various replication properties associated with each virtual machine 255 , wherein the properties may enable or disable scheduled bookmarks on the production server 220 , set spool sizes and directory paths, replicate in online or scheduled modes, specify whether to synchronize at a file-level or block-level, specify whether to ignore certain files having the same size and type, specify whether to run a script, send an email, or log results to handle event notifications and reporting, and enable or disable delays or data rewind capabilities, among others.
  • each replication scenario associated with an individual virtual machine 255 may include all the files 270 relating thereto, including the *.vhd virtual hard disk file, the *.xml configuration file, and the *.avhd snapshot file associated with the individual virtual machine 255 .
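One way to picture a replication scenario is as a small record holding the properties listed above together with the files that belong to one virtual machine. The following Python sketch is illustrative only: the class name, field names, and default values are assumptions and do not come from the disclosure or from any product API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ReplicationMode(Enum):
    ONLINE = "online"
    SCHEDULED = "scheduled"


class SyncLevel(Enum):
    FILE = "file"
    BLOCK = "block"


@dataclass
class ReplicationScenario:
    """Hypothetical per-virtual-machine scenario record."""
    vm_name: str
    files: List[str] = field(default_factory=list)
    scheduled_bookmarks: bool = False        # enable or disable scheduled bookmarks
    spool_size_mb: int = 1024                # assumed default, for illustration
    spool_directory: str = "spool"           # assumed default, for illustration
    mode: ReplicationMode = ReplicationMode.ONLINE
    sync_level: SyncLevel = SyncLevel.BLOCK
    ignore_same_size_and_type: bool = True
    data_rewind: bool = False


def create_scenarios(vm_files: Dict[str, List[str]]) -> List[ReplicationScenario]:
    """Create one scenario per discovered virtual machine, in name order."""
    return [
        ReplicationScenario(vm_name=name, files=sorted(paths))
        for name, paths in sorted(vm_files.items())
    ]
```

Keeping one record per virtual machine mirrors the text, in which individual virtual machines or scenarios can be selected to customize what is protected.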
  • the replication and high availability engine 240 a may then run all the scenarios associated with all the virtual machines 255 in order to replicate and protect all the virtual machines 255 .
  • one or more virtual machines 255 (or certain scenarios associated with a particular virtual machine 255 ) may be selected to customize the replication scenarios used to protect the virtualized production server 220 .
  • the replication and high availability engine 240 a may then replicate any subsequent changes to the files 270 associated with such virtual machines 255 to the virtualized replica server 260 (via the replication and high availability engine 240 b installed thereon).
  • the virtualized replica server 260 may then create and register one or more on-demand virtual machines 280 corresponding to the one or more virtual machines 255 associated with the switchover or failover condition on the virtualized production server 220 , wherein the on-demand virtual machines 280 may be created from the virtual machine files 270 corresponding to the virtual machines 255 associated with the switchover or failover condition.
  • the switchover and failover procedure may generally exchange active and standby roles between the virtualized production server 220 and the virtualized replica server 260 , whereby the virtualized production server 220 may change to a standby role in response to the switchover or failover assigning the active role to the virtualized replica server 260 .
  • the relevant scenarios may further specify how to handle reverse replication operations (e.g., replicating changes to the on-demand virtual machines 280 to protect or otherwise back up changes to the files 270 associated therewith), whereby the replication and high availability engine 240 b on the virtualized replica server 260 may continue to replicate changes to the on-demand virtual machines 280 in accordance with the reverse replication operations specified in the relevant scenarios once the virtualized production server 220 becomes available (e.g., changes may be resynchronized from the virtualized replica server 260 to the virtualized production server 220 , which may include comparing data on the virtualized production server 220 to data on the virtualized replica server 260 to determine the changes to replicate back to the virtualized production server 220 ).
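The comparison step in reverse replication can be sketched as follows. This is a minimal illustration that assumes file contents are available as in-memory bytes keyed by path and that a SHA-256 digest is an acceptable way to detect differences; the disclosure does not specify how the data is compared, and the function name is hypothetical.

```python
import hashlib
from typing import Dict, List


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def changes_to_replicate_back(replica: Dict[str, bytes],
                              production: Dict[str, bytes]) -> List[str]:
    """Return paths whose replica content is missing from, or differs from,
    the content held on the production server."""
    return sorted(
        path
        for path, data in replica.items()
        if path not in production or _digest(production[path]) != _digest(data)
    )
```

A block-level comparison would follow the same pattern with block offsets as keys instead of file paths.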
  • the switchover or failover may be triggered manually (e.g., due to planned downtime, to balance loads among the virtualized production server 220 and the virtualized replica server 260 , in response to a notification that the virtualized production server 220 has become unavailable, etc.).
  • the switchover or failover may be triggered automatically (at a scheduled time or in response to detecting that the virtualized production server 220 has become unavailable), wherein the replication and high availability engine 240 b on the virtualized replica server 260 may periodically check the status associated with the virtualized production server 220 to determine whether to trigger the switchover or failover procedure.
  • the replication and high availability engine 240 b may periodically send ping requests to the virtual machines 255 running on the virtualized production server 220 and automatically bring up the corresponding on-demand virtual machine 280 on the virtualized replica server 260 if the virtualized production server 220 does not respond.
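The periodic status check can be sketched as a small monitor that counts consecutive missed responses. The probe is passed in as a function so the sketch stays independent of any network library. The class name and the `max_missed` threshold are assumptions added for illustration; the text only states that the on-demand virtual machine is brought up if the production server does not respond.

```python
from typing import Callable, Dict, Iterable, List


class HeartbeatMonitor:
    """Track consecutive missed responses per virtual machine (hypothetical)."""

    def __init__(self, probe: Callable[[str], bool], max_missed: int = 3):
        self.probe = probe              # returns True when the VM responds
        self.max_missed = max_missed    # assumed threshold, not from the source
        self.missed: Dict[str, int] = {}

    def check(self, vm_names: Iterable[str]) -> List[str]:
        """Run one round of probes and return the virtual machines whose
        on-demand counterpart should now be started on the replica."""
        to_start = []
        for name in vm_names:
            if self.probe(name):
                self.missed[name] = 0
            else:
                self.missed[name] = self.missed.get(name, 0) + 1
                # Report each virtual machine once, when the threshold is reached.
                if self.missed[name] == self.max_missed:
                    to_start.append(name)
        return to_start
```

Calling `check` on a timer gives the periodic behavior; requiring several consecutive misses is a common way to avoid failing over on a single dropped request.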
  • the virtualized replica server 260 may check the status associated with the virtualized production server 220 via custom requests to monitor specific applications 250 or virtual machines 255 or requests to databases or services running in the parent partition associated with the virtualized production server 220 to verify the status associated therewith.
  • the switchover may be manually triggered to test certain applications 250 or virtual machines 255 on the virtualized replica server 260 without disrupting or otherwise interfering with operations on the virtualized production server 220 .
  • switchback or failback may be performed to return the active role to the virtualized production server 220 and the standby role to the virtualized replica server 260 .
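The exchange of active and standby roles during switchover and switchback can be sketched as a simple swap between two server records. The `Role` and `Server` names are hypothetical, and the check that exactly one server is active is an assumption added for safety.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass
class Server:
    name: str
    role: Role


def exchange_roles(first: Server, second: Server) -> None:
    """Swap roles between two servers; used for both switchover and switchback."""
    if {first.role, second.role} != {Role.ACTIVE, Role.STANDBY}:
        raise ValueError("exactly one server must be active before the exchange")
    first.role, second.role = second.role, first.role
```

Calling `exchange_roles` once models a switchover or failover, and calling it a second time models the switchback or failback that returns the active role to the production server.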
  • performing the switchback or failback may include determining whether to overwrite the data that existed on the virtualized production server 220 prior to the switchover or failover with the data existing on the virtualized replica server 260 at the time that the switchback or failback has been initiated.
  • the lost data can be restored from the virtualized replica server 260 via reverse synchronization to the virtualized production server 220 , or the lost data may be recovered from a certain event or point in time via the data rewind capabilities, which may involve locating a suitable event-stamped or time-stamped checkpoint and/or bookmark to roll lost or corrupted data on the virtualized production server 220 back to the event or point in time prior to when the data was lost or corrupted.
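Locating a suitable rewind point can be sketched as a search for the latest bookmark strictly before the time of the loss or corruption. The sketch assumes bookmarks are available as a sorted list of comparable timestamps; the function name is hypothetical and the disclosure does not describe how checkpoints are stored.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def find_rewind_point(bookmarks: Sequence[float],
                      corrupted_at: float) -> Optional[float]:
    """Return the latest bookmark strictly earlier than corrupted_at.

    bookmarks must be sorted in ascending order. Returns None when no
    bookmark precedes the corruption time.
    """
    index = bisect_left(bookmarks, corrupted_at)
    if index == 0:
        return None
    return bookmarks[index - 1]
```

Using a strict comparison keeps a bookmark taken at the moment of corruption from being chosen as the rewind target.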
  • FIG. 3 illustrates an exemplary method 300 that may be used to balance loads and manage switchover or failover conditions in a virtualized replication and high availability environment.
  • the method 300 may include an initial operation 310 to install a replication and high availability engine in a parent partition on a virtualized production server (rather than individual virtual machines running in child partitions on the virtualized production server) and a similar replication and high availability engine in a parent partition on a virtualized replica server.
  • operation 310 may include installing various virtualization stack components in guest operating systems associated with the virtual machines running on the virtualized production server to enable the replication and high availability engine to determine host names associated therewith.
  • the replication and high availability engine may then automatically discover all the virtual machines on the virtualized production server in an operation 320 , which may further include a volume shadow copy service (VSS) writer associated with the virtualization stack collecting all the files relating to the discovered virtual machines (e.g., *.vhd files that represent virtual hard disks associated with each virtual machine, *.xml configuration files that contain unique identifiers and various settings associated with each virtual machine, and *.avhd files that contain all snapshots associated with the individual virtual machines).
  • the replication and high availability engine may automatically create various replication scenarios associated with each virtual machine in an operation 330 .
  • the replication scenarios may generally define various replication properties associated with each virtual machine (e.g., whether to enable or disable scheduled bookmarks, establishing spool sizes and directory paths, whether to replicate in online or scheduled modes, etc.).
  • each replication scenario associated with an individual virtual machine may include all the files relating thereto, including the *.vhd virtual hard disk file, the *.xml configuration file, and the *.avhd snapshot file associated with the individual virtual machine, which may be written to the virtualized replica server.
  • the replication and high availability engine may then run all the scenarios associated with all the virtual machines in an operation 340 to replicate and protect all the virtual machines.
  • operation 340 may include selecting certain virtual machines (or certain scenarios associated with a particular virtual machine) to customize the replication scenarios used to protect the virtualized production server.
  • in response to initially synchronizing the files associated with the virtual machines to the virtualized replica server in operation 330 , the replication and high availability engine may then replicate any subsequent changes to the files associated with such virtual machines to the virtualized replica server in operation 340 (i.e., via a replication and high availability engine installed thereon).
  • a load associated with the virtualized production server may then be analyzed in an operation 350 to determine whether or not to initiate a procedure to balance the load associated with the virtualized production server.
  • an operation 360 may determine whether the virtualized production server currently has an overloaded status or could otherwise benefit from offloading one or more workloads to a standby or other alternate server.
  • an operation 380 may register one or more on-demand virtual machines to offload and redirect certain workloads from the virtualized production server, as will be described in further detail below.
  • an operation 370 may determine whether or not a switchover or failover condition associated with the virtualized production server or the virtual machines running thereon has occurred.
  • operation 370 may trigger the switchover or failover manually due to planned downtime, in response to a notification that the virtualized production server has become unavailable, or in other appropriate circumstances, or operation 370 may alternatively trigger the switchover or failover automatically (e.g., at a scheduled time, in response to detecting unavailability associated with the virtualized production server or certain virtual machines running thereon, etc.).
  • a replication and high availability engine on the virtualized replica server may periodically check the status associated with the virtualized production server in operation 370 to determine whether to trigger the switchover or failover procedure (e.g., sending ping requests to the virtual machines running on the virtualized production server to determine whether the virtualized production server responds to indicate availability, sending custom requests to specific applications, virtual machines, databases, or services running in the parent partition on the virtualized production server to verify the status associated therewith, etc.).
  • operation 380 may include the virtualized replica server creating and registering one or more on-demand virtual machines corresponding to any virtual machines on the virtualized production server that are associated with the load balance, switchover, or failover condition.
  • operation 380 may create the on-demand virtual machines from the virtual machine files corresponding to the virtual machines associated with the load balance, switchover, or failover condition and exchange active and standby roles between the virtualized production server and the virtualized replica server.
  • registering the on-demand virtual machines to perform the load balance, switchover, or failover condition may change the virtualized production server to a standby role and assign an active role to the virtualized replica server.
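The decision sequence in operations 350 through 380 can be sketched as a single function that first checks for overload and then for a switchover or failover condition. The numeric load threshold, the parameter names, and the `Action` values are assumptions for illustration; the text does not define how load is measured or what happens when neither condition holds, so the sketch simply reports that no action is needed.

```python
from enum import Enum


class Action(Enum):
    BALANCE_LOAD = "balance_load"
    SWITCHOVER_OR_FAILOVER = "switchover_or_failover"
    NONE = "none"


def decide(load: float,
           overload_threshold: float,
           production_available: bool,
           switchover_requested: bool = False) -> Action:
    """Choose whether on-demand virtual machines should be registered.

    The order of checks follows operations 360 and 370 in the text.
    """
    # Operation 360: is the production server overloaded?
    if load > overload_threshold:
        return Action.BALANCE_LOAD
    # Operation 370: has a switchover or failover condition occurred?
    if switchover_requested or not production_available:
        return Action.SWITCHOVER_OR_FAILOVER
    return Action.NONE
```

Both non-`NONE` results lead to operation 380, in which the replica server creates and registers the on-demand virtual machines.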
  • the relevant replication scenarios may further specify how to handle reverse replication operations, which may be performed using a method having substantially similar characteristics to the method 300 shown in FIG. 3 and described in further detail herein (i.e., replicating changes to the on-demand virtual machines to protect or otherwise backup changes to the files associated therewith after the virtualized replica server became active).
  • the replication and high availability engine on the virtualized replica server may use the above-described techniques to continue replicating changes to the on-demand virtual machines in accordance with the reverse replication operations specified in the relevant scenarios once the virtualized production server becomes available (e.g., to resynchronize changes from the virtualized replica server to the virtualized production server, data on the virtualized production server may be compared to data on the virtualized replica server to determine the changes that need to be replicated back to the virtualized production server).
  • switchback or failback may be performed in a similar manner to return the active role to the virtualized production server and the standby role to the virtualized replica server.
  • performing the switchback or failback may include determining whether to overwrite the data that existed on the virtualized production server prior to the load balance, switchover, or failover with the data existing on the virtualized replica server at the time that the switchback or failback has been scheduled to occur.
  • the lost data can be restored from the virtualized replica server via reverse synchronization to the virtualized production server, or the lost data may be recovered from a certain event or point in time via data rewind capabilities, which may involve locating a suitable event-stamped or time-stamped checkpoint and/or bookmark to roll the virtualized production server back to the event or point in time prior to when the data loss or corruption occurred on the virtualized production server.
  • Implementations of the invention may be made in hardware, firmware, software, or any suitable combination thereof.
  • the invention may also be implemented as instructions stored on a machine-readable medium that can be read and executed on one or more processing devices.
  • the machine-readable medium may include various mechanisms that can store and transmit information that can be read on the processing devices or other machines (e.g., read only memory, random access memory, magnetic disk storage media, optical storage media, flash memory devices, or any other storage or non-transitory media that can suitably store and transmit machine-readable information).
  • although firmware, software, routines, or instructions may be described in the above disclosure with respect to certain exemplary aspects and implementations performing certain actions or operations, it will be apparent that such descriptions are merely for the sake of convenience and that such actions or operations in fact result from processing devices, computing devices, processors, controllers, or other hardware executing the firmware, software, routines, or instructions. Moreover, to the extent that the above disclosure describes executing or performing certain operations or actions in a particular order or sequence, such descriptions are exemplary only and such operations or actions may be performed or executed in any suitable order or sequence.

Abstract

The system and method described herein may provide a virtualized replication and high availability environment. In particular, a virtualized production server may run one or more virtual machines in one or more child partitions and have a replication and high availability engine installed in a parent partition. The replication and high availability engine may automatically discover the virtual machines running in the child partitions and automatically synchronize all files associated with the virtual machines to a virtualized replica server. Furthermore, the replication and high availability engine may continuously replicate subsequent changes to the files associated with the virtual machines running in the child partitions to the virtualized replica server, which may then create on-demand virtual machines from the synchronized and replicated files to handle switchover, failover, switchback, and failback events associated with the virtualized production server or the virtual machines running in the child partitions associated therewith.

Description

    FIELD OF THE INVENTION
  • The invention generally relates to a system and method for providing a virtualized replication and high availability environment, and in particular, to installing a replication and high availability engine in a parent partition on a virtualized production server (rather than virtual machines that run in child partitions on the virtualized production server), automatically discovering the virtual machines running in the child partitions on the virtualized production server, and automatically synchronizing all files associated with the virtual machines and continuously replicating subsequent changes to the files associated with the virtual machines to a virtualized replica server that can create on-demand virtual machines from the synchronized and replicated files to handle switchover, failover, switchback, and failback events associated with the virtualized production server or the virtual machines running therein.
    BACKGROUND OF THE INVENTION
  • Today, many (if not all) organizations tend to conduct substantial amounts of business electronically, and consequently, depend on having reliable, continuous access to information technology systems, applications, and resources in order to effectively manage business endeavors. At the same time, information technology threats ranging from viruses, malware, and data corruption to application failures and natural disasters are growing in number, type, and severity, while current trends in technology have presented information technology departments with a plethora of recurring challenges. For example, the need to do business at an increasingly faster pace with larger critical data volumes has amplified the pressure on information technology, which has led to efforts to consolidate, migrate, or virtualize servers and resources hosted thereon without disrupting operations or damaging resources. As such, even isolated failures have the potential to render information technology resources unavailable, which may cause organizations to lose substantial amounts of revenue or information that could impede or even cripple business. Although certain organizations have attempted to utilize backup solutions to protect the information that applications create, many backup solutions lack the restoration granularity required to quickly restore important data, while others demand full restoration to temporary disk space (even to recover one single file).
  • Consequently, many organizations have turned to complementary replication and high availability solutions to minimize downtime and protect critical applications and data, and moreover, efforts to implement server virtualization with Microsoft Hyper-V and other virtualization platforms have increased due to the potential that virtualization has to increase information technology flexibility, drive down costs, and accelerate time to market. Although mainstream virtualization adoption has the potential to enable simple, economical, and reliable disaster recovery strategies, many adopters tend to quickly discover that virtualization adds new complexity that can interfere with achieving data protection, system availability, and disaster recovery goals (e.g., because protecting virtual servers raises additional and/or different issues from protecting physical servers). In other words, even with all the benefits that virtualization can potentially offer, increasing diversity in virtualized computing environments introduces abstraction that requires a coordinated and cohesive management approach to realize the visibility, control, and automation essential to planning and deploying an organized, secure, and scalable virtualized infrastructure. For example, to the extent that some virtualization vendors provide data protection capabilities, these solutions typically only work on the particular platforms that the virtualization vendors deliver. As such, protecting applications and related data associated with virtual machines hosted on virtualized servers, whether implementing VMware, Microsoft Hyper-V, Citrix XenServer, or other virtualization technology, requires more than backup and restore solutions alone can provide because point solutions are not cost-effective and add complexity to managing a heterogeneous environment.
  • Accordingly, because disruptions to system and application availability and data loss typically translate to lost revenue, lower customer service and employee productivity, and even damage to reputation, organizations need more than point or platform-specific backup and restore solutions to achieve faster recovery times with continuous data protection, high availability to support demanding service level agreements and disaster recovery strategies, and business protection in modern fast-paced environments. Although virtualization has the potential to streamline information technology infrastructure and resource efficiency, reduce capital and operating costs, and improve business continuity, the risk that virtual deployments will proceed unmanaged and unsecured tends to increase with increased virtualization and abstraction. In particular, rather than achieving a consolidated and secure infrastructure, an uncontrolled virtual machine proliferation termed “virtual sprawl” may result instead. For example, without automated monitoring, alerting, and control, virtualization may create lag times in responding to business needs, provisioning resources, and implementing effective security measures. Furthermore, capacity planning and automation must be implemented to mitigate information technology inefficiencies, slow response times, and missed business opportunities. In a related sense, virtualization tends to cross multiple silos, which requires coordinated management and integration and time-consuming manual processes that can hinder performance and elevate costs.
  • In the replication, high availability, and data protection context, virtualized systems usually require installing appropriate engines on virtual machines managed therein in order to protect applications that may be running in the managed virtual machines. However, manually installing the engines on the virtual machines tends to be difficult, time consuming, and resource intensive. For example, usage information associated with a particular virtual machine may require the engine installed thereon to create several different high availability scenarios, including some that may be unnecessary or not relevant to user needs. Moreover, manually installing engines on individual virtual machines requires users to know how to configure different applications running therein (e.g., SQL, Exchange, SharePoint, etc.), which tends to introduce substantial human resource and information technology resource costs. Because each virtualization platform contains specific management tools, organizations tend to quickly feel the pain associated with multiple management solutions, uncoordinated manual processes, weak security measures, and inadequate tracking and reporting practices. As such, without a coordinated management approach, organizations may be unable to attain the promise associated with virtualization technology, which may instead become a burden that threatens to consume information technology resources, budgets, and reputations because information technology has become saddled with trying to effectively manage and scale resources while business users become frustrated because applications and services needed to dynamically respond to market opportunities may be unavailable or disrupted.
    SUMMARY OF THE INVENTION
  • According to one aspect of the invention, the system and method described herein may provide a virtualized replication and high availability environment. In particular, the system and method described herein may provide Windows, Linux, and Unix systems with high availability and continuous or periodic data protection associated with related applications and data to maximize uptime and availability associated with physical and virtualized environments. For example, the system and method described herein may provide simple mechanisms to migrate or replicate data between different servers and locations, whether physical or virtual, and to consolidate data between remote offices and a backup or archive facility and protect onsite and offsite data. Further, the system and method described herein may include non-disruptive recovery testing and data rewind capabilities to restore systems, applications, and data to prior states, which may be useful to speeding recovery times and minimizing data loss. Additionally, the system and method described herein may further include real-time server and application monitoring, automated and push-button failover or switchover, and automated and push-button switchback (or failback) to restore replica systems or replica applications in response to a production (or master) server having been repaired or replaced. In one implementation, the system and method described herein may perform monitoring at server, application, and hypervisor and virtual machine levels, which may enable the system and method to respond to issues at physical, application, and virtualization levels, and may further replicate operating systems, system states, and application data to an offline replica server, which may enable the system and method to improve protection speeds, reduce costs, and safely test and migrate from a physical to a virtual server or from one virtual server to another virtual server. 
Moreover, the system and method described herein may include a unified management console across all operating systems, virtualization platforms, and applications to easily visualize and manage the virtualized replication and high availability environment.
  • According to one aspect of the invention, the system and method described herein may provide the virtualized replication and high availability environment using an architecture having a hypervisor that runs a guest operating system directly on underlying hardware and supports isolated partitions. For example, in one implementation, the architecture may be based on Microsoft Hyper-V server virtualization technology, which may be used to create and run separate virtual machines on one physical machine and thereby consolidate multiple server and application roles and better leverage server hardware investments. Furthermore, the architecture may natively support x64 computing, which may be leveraged to efficiently run multiple different operating systems in parallel on one physical server and assign multiple processors or cores to one virtual machine to utilize the increased processing capacity associated with multi-core processors or multi-processor architectures.
  • According to one aspect of the invention, the hypervisor may have a parent partition that runs a virtualization stack having direct access to the underlying hardware, wherein the parent partition may create child partitions that can host any suitable guest operating system. The virtualization stack may include various components that run in a kernel mode or privileged processor ring, including a VMBus that provides a logical channel to redirect requests and responses between virtual devices in the child partitions and the parent partition to manage inter-partition communication between the parent and child partitions. The virtualization stack may further include various device drivers associated with virtual machines running in the child partitions, a kernel to support the guest operating system instance running in the parent partition, and a virtualization service provider that connects to the VMBus and handles device access requests from the child partitions. The child partitions may similarly include a kernel to support the guest operating system running therein, which may be the same or different from the guest operating system running in the parent partition, a VMBus to communicate with the parent partition, and a virtualization service consumer (or virtualization service client) that transparently communicates with the virtualization service provider in the parent partition to redirect and fulfill device access requests that originate in the child partitions. 
Further, the virtualization stack may include various components that run in a user mode or less privileged ring, including a virtual machine interface provider that guest operating systems or applications running in the child partitions can use to communicate with the hypervisor, a virtual machine management service that can manage states associated with the virtual machines in the child partitions and control state-related tasks associated therewith, and virtual machine worker processes that the virtual machine management service creates to start corresponding virtual machine instances in the child partitions and handle interactions between the parent partition and the virtual machines in the child partitions.
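  • As a minimal sketch of the request path described above, the following Python model shows a virtualization service consumer in a child partition forwarding a device access request over a VMBus to the virtualization service provider in the parent partition. The class names, the `request` and `handle` methods, and the response string format are assumptions made for illustration only and do not correspond to any actual hypervisor interface.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualizationServiceProvider:
    """Parent-partition side: fulfills device requests on behalf of children."""
    handled: list = field(default_factory=list)

    def handle(self, partition_id: int, device: str, operation: str) -> str:
        # Record the request and return a made-up success response.
        self.handled.append((partition_id, device, operation))
        return f"{device}:{operation}:ok"


@dataclass
class VMBus:
    """Logical channel carrying requests and responses between partitions."""
    provider: VirtualizationServiceProvider

    def send(self, partition_id: int, device: str, operation: str) -> str:
        return self.provider.handle(partition_id, device, operation)


@dataclass
class VirtualizationServiceConsumer:
    """Child-partition side: redirects virtual device requests to the parent."""
    partition_id: int
    bus: VMBus

    def request(self, device: str, operation: str) -> str:
        return self.bus.send(self.partition_id, device, operation)


provider = VirtualizationServiceProvider()
bus = VMBus(provider)
child = VirtualizationServiceConsumer(partition_id=1, bus=bus)
response = child.request("disk0", "read")
```

In this sketch the child partition never touches the device itself; every request is recorded and answered by the provider in the parent partition, which mirrors the redirection role the text assigns to the VMBus.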
  • According to one aspect of the invention, the system and method described herein may provide a physical to virtualized or a virtualized to virtualized replication and high availability environment to ensure that various applications or virtual machines running on a production (or active) server will have absolute operational continuity via a virtualized replica server. In particular, the system and method described herein may validate consistency between the applications running on the production server (or virtual machines running the applications on the production server) and various virtual machine files hosted on the replica server that correspond to the applications or the virtual machines on the production server, which may enable recovering the applications (or the virtual machines) on the production server from the replica server. For example, the applications (or virtual machines) on the production server may become unavailable due to downtime, failure, or other loss or disruption associated with the production server, in which case the applications (or virtual machines) may be activated on the replica server to ensure continuity and thereby handle the downtime, failure, or disruption associated with the production server. In one implementation, the procedure that relates to loading the applications (or virtual machines) on the replica server may be considered switchover if the downtime was planned or failover if the downtime was unplanned, while the procedure to subsequently recover the applications (or virtual machines) on the production server via the replica server may be considered switchback or failback.
  • According to one aspect of the invention, the system and method described herein may use a replication and high availability engine on the production server and a similar replication and high availability engine on the replica server, which may both use asynchronous real-time replication and proactive validation to test whether the virtual machine files hosted on the replica server can reliably recover the applications (or virtual machines running the applications) on the production server to provide cost-effective disaster recovery. In particular, data associated with various applications and files, databases, or other suitable data sources relating thereto may be synchronized and replicated between the production server and the replica server over local, wide, or other suitable networks having the replication and high availability engine installed therein and the appropriate network connections needed to communicate with one another. As such, the virtualized replication and high availability environment may provide data synchronization, asynchronous real-time data replication, and automated switchover, failover, and switchback to provide data continuity in various deployment scenarios, which may include full system protection (physical or virtual) using a hypervisor host and replication and high availability in physical to virtual guest, virtual guest to virtual guest, and hypervisor host to hypervisor host environments.
  • According to one aspect of the invention, the system and method described herein may use the hypervisor host to provide the physical and virtual full system protection deployment scenarios, wherein either full system protection deployment scenario may provide application-independent synchronization to transfer a complete state associated with the production server to the virtualized replica server and subsequently replicate changes to the state associated with production server to the virtualized replica server. For example, in the physical full protection deployment scenario, a physical production server may read data directly from volumes associated with various master applications running thereon to obtain any suitable files and data relating to the operating system, system state, and disk layout associated with the master applications. In one implementation, the data may then be serialized and sent to the replica server, which may inject the serialized data into virtual hard disk files that represent the volumes associated with the master applications. For example, the replica server may include the hypervisor host within a virtualization stack having a substantially similar architecture to that described above, whereby the hypervisor host may run various different operating systems in one or more child partitions to support the operating systems that run the master applications on the physical production server. As such, to synchronize the complete state associated with the physical production server, the replica server may use the hypervisor host to inject the serialized data into the virtual hard disk files and thereby perform volume-level synchronization associated with the master applications. 
The virtualized full system protection scenario may operate in a substantially similar manner to the physical full system protection deployment scenario, except that the virtualized full system protection scenario may include a virtualized production server having a hypervisor host that can read the volume data associated with the master applications directly from virtual hard disk files associated with virtual machines that run the applications on the virtualized production server.
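  • The volume-level synchronization described above may be illustrated with a small sketch, assuming a volume can be treated as a byte string and a virtual hard disk file as a growable byte buffer. The record layout (an 8-byte offset, a 4-byte length, then the payload), the block size, and the function names are hypothetical choices for this example and are not taken from the described system.

```python
import struct

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow
RECORD_HEADER = struct.Struct(">QI")  # assumed layout: offset, payload length


def serialize_volume(volume: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Pack a volume into a stream of (offset, length, payload) records."""
    stream = bytearray()
    for offset in range(0, len(volume), block_size):
        payload = volume[offset:offset + block_size]
        stream += RECORD_HEADER.pack(offset, len(payload)) + payload
    return bytes(stream)


def inject_into_vhd(stream: bytes, vhd: bytearray) -> None:
    """Write each record from the stream at its offset in the disk image."""
    pos = 0
    while pos < len(stream):
        offset, length = RECORD_HEADER.unpack_from(stream, pos)
        pos += RECORD_HEADER.size
        end = offset + length
        if end > len(vhd):
            vhd.extend(b"\x00" * (end - len(vhd)))  # grow the image as needed
        vhd[offset:end] = stream[pos:pos + length]
        pos += length


# Initial synchronization: transfer the complete volume state.
master_volume = b"boot|system-state|app-data"
replica_vhd = bytearray()
inject_into_vhd(serialize_volume(master_volume), replica_vhd)
synchronized = bytes(replica_vhd)

# Subsequent replication: send only a changed byte range.
change = RECORD_HEADER.pack(5, 6) + b"SYSTEM"
inject_into_vhd(change, replica_vhd)
```

The same injection routine serves both the initial synchronization and the later replication of changes, because a change is simply a record that covers a smaller byte range.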
  • According to one aspect of the invention, in either the physical full system protection deployment scenario or the virtual full system protection deployment scenario, any subsequent changes to the applications or the virtual machines that run the applications may then be replicated within the virtual hard disk files that correspond to the applications or virtual machines, wherein to handle switchover or failover in response to disruption associated with a master application or virtual machine, the disrupted master application or virtual machine may be disabled and the virtualized replica server may create an on-demand virtual machine from the virtual machine files corresponding thereto and make the on-demand virtual machine available to ensure continuity (e.g., the hypervisor host may configure the on-demand virtual machine with various values specified in a virtual machine configuration file, connect the on-demand virtual machine to a disk image mounted from the virtual hard disk file, and boot the on-demand virtual machine to make the on-demand virtual machine available to end users without disruption). Accordingly, because the physical and virtual full system protection deployment scenarios can protect the entire state associated with the production server, the full system protection deployment scenarios may support large sets of applications and environments and may be simple to deploy because automatically transferring the entire state associated with the production server to the virtualized replica server may obviate or substantially reduce a need to manually provision or otherwise synchronize the virtualized replica server prior to initiating replication operations.
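  • The three steps named in the example above (configure the on-demand virtual machine from a configuration file, connect it to a disk image, and boot it) can be sketched as follows. The configuration keys, the file name, and the state labels are assumptions for illustration; a real hypervisor host would perform these steps through its own management interfaces.

```python
from dataclasses import dataclass


@dataclass
class OnDemandVM:
    name: str
    cpu_count: int
    memory_mb: int
    disk_image: str = ""
    state: str = "defined"


def create_on_demand_vm(config: dict, vhd_path: str) -> OnDemandVM:
    """Configure, attach the replicated disk, and boot, in that order."""
    # Step 1: apply the values specified in the configuration file.
    vm = OnDemandVM(
        name=config["name"],
        cpu_count=config["cpu_count"],
        memory_mb=config["memory_mb"],
    )
    # Step 2: connect the disk image mounted from the virtual hard disk file.
    vm.disk_image = vhd_path
    # Step 3: boot the machine so it can serve end users.
    vm.state = "running"
    return vm


# Hypothetical configuration values and file name.
vm_config = {"name": "mail-server", "cpu_count": 2, "memory_mb": 4096}
vm = create_on_demand_vm(vm_config, "mail-server.vhd")
```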
  • According to one aspect of the invention, to provide the physical to virtual guest replication and high availability deployment scenario, the system and method described herein may have various master applications and a replication and high availability engine running on a physical production server. In one implementation, the virtualized replica server may have a similar replication and high availability engine in addition to a virtualization stack to manage various virtual machine files that correspond to the master applications running on the physical production server. As such, the replication and high availability engine running on the physical production server may generally replicate data associated with the master applications or any other suitable data residing on the physical production server to the virtualized replica server, which may reside at the same location as the physical production server or at a remote data center to provide a data protection and disaster recovery site. In one implementation, the physical to virtual guest replication and high availability deployment scenario may generally include synchronizing the physical production server and the virtualized replica server (e.g., via the full system protection techniques described above or any other suitable technique) and then continuously capturing and replicating byte-level changes to the data residing on the physical production server to the virtualized replica server. As such, the virtualization stack on the virtualized replica server may include an active hypervisor that has access to underlying hardware and runs a guest operating system to replicate the changes within various virtual machine files that correspond to the master applications and other data volumes residing on the physical production server and thereby deliver continuous onsite or offsite data protection. 
Moreover, in one implementation, the changes captured and replicated from the physical production server to the virtualized replica server may be recorded in a rewind log to preserve a context associated with the replicated data (e.g., to track the changes, undo the changes at the production server, locate a switch point in the virtual machine files on the replica server that can be used to suitably resume business operations in response to disaster or other failure associated with the production server, etc.).
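  • One way to picture the rewind log is as an ordered list of byte-level changes that also remembers the bytes each change overwrote, so the changes can be undone back to a chosen point. The sketch below assumes that model; the `RewindLog` class, its method names, and the bookmark labels are hypothetical and are not drawn from the described engine.

```python
from dataclasses import dataclass, field


@dataclass
class RewindLog:
    entries: list = field(default_factory=list)    # (offset, old, new)
    bookmarks: dict = field(default_factory=dict)  # label -> entry count

    def apply(self, data: bytearray, offset: int, new: bytes) -> None:
        """Apply a replicated change and record what it overwrote."""
        old = bytes(data[offset:offset + len(new)])
        self.entries.append((offset, old, new))
        data[offset:offset + len(new)] = new

    def bookmark(self, label: str) -> None:
        """Mark the current position in the log as a possible switch point."""
        self.bookmarks[label] = len(self.entries)

    def rewind(self, data: bytearray, label: str) -> None:
        """Undo changes in reverse order until the bookmark is reached."""
        target = self.bookmarks[label]
        while len(self.entries) > target:
            offset, old, new = self.entries.pop()
            data[offset:offset + len(new)] = old


replica_data = bytearray(b"balance=100")
log = RewindLog()
log.apply(replica_data, 8, b"250")       # legitimate change
log.bookmark("before-corruption")
log.apply(replica_data, 0, b"XXXXXXX")   # unwanted change
corrupted = bytes(replica_data)
log.rewind(replica_data, "before-corruption")
```

Storing the overwritten bytes alongside each change is what makes the undo possible without keeping a full copy of the data at every bookmark.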
  • According to one aspect of the invention, in the physical to virtual guest replication and high availability deployment scenario, the system and method described herein may therefore synchronize and replicate the physical production server to the virtualized replica server to support automated or manual switchover and failover to redirect workloads from the physical production server to the virtualized replica server. For example, the virtualized replica server may invoke one or more components in the virtualization stack to automatically start one or more on-demand virtual machines in response to disruption associated with the physical production server or one or more applications running thereon, wherein the virtualization stack may start the on-demand virtual machines from the virtual machine files that correspond to the applications experiencing disruption on the physical production server. As such, end users and workloads associated with the disrupted applications may then be automatically redirected to the on-demand virtual machines on the virtualized replica server to handle the switchover or failover and thereby minimize business downtime. Alternatively, the procedure to start the on-demand virtual machines on the virtualized replica server and redirect the end users and workloads to the virtualized replica server may be initiated manually, which may enable information technology personnel to investigate the issues that caused the disruption prior to performing the switchover or failover (if necessary).
  • According to one aspect of the invention, to provide the virtual guest to virtual guest replication and high availability deployment scenario, the system and method described herein may generally synchronize and replicate the production server to the virtualized replica server in a substantially similar manner to the physical to virtual guest scenario, and may further handle switchover and failover in a substantially similar manner to the physical to virtual guest scenario. However, in the virtual guest to virtual guest scenario, the production server may be virtualized, whereby the virtualized production server may run the applications within one or more virtual machines, while the virtualized replica server may run one or more corresponding virtual machines and maintain one or more virtual machine files that correspond to the virtual machines executing the applications on the virtualized production server (e.g., mirroring the virtual hard disk files, configuration files, and snapshot files associated with the virtual machines running on the virtualized production server). Further, the virtual guest to virtual guest scenario may have different replication and high availability engine instances installed and configured on the individual virtual machines running thereon, and different replication and high availability engine instances may be similarly installed and configured on the individual virtual machines running on the virtualized replica server. Accordingly, the replication and high availability engine instances on the virtualized production server and the virtualized replica server may communicate with one another to synchronize, replicate, and manage switchover and failover associated with the individual virtual machines running the applications on the virtualized production server.
  • According to one aspect of the invention, the system and method described herein may provide the hypervisor host to hypervisor host replication and high availability deployment scenario to obviate or substantially reduce a need to install and configure different instances associated with the replication and high availability engine on individual virtual machines, which may advantageously provide hypervisor-level replication, switchover and failover, and rewind and recovery capabilities associated with all (or certain selected) virtual machines running on the virtualized production server (e.g., if a third party provides the replication and high availability engine, the hypervisor-level replication and high availability scenario may limit the number of licenses that must be purchased to one per virtual host). Moreover, the hypervisor-level replication and high availability scenario may substantially reduce deployment time and costs because the requisite software need only be installed on the parent partition within each virtual host and may further reduce processor and memory usage because each virtual machine would not require a local replication and high availability engine instance. In addition, because the virtualized replica server may create the on-demand virtual machines in response to switchover or failover conditions, the hypervisor-level scenario may satisfy cold site definitions and thereby reduce the costs of licensing the operating systems and the applications running on the virtual hosts.
  • According to one aspect of the invention, to provide the hypervisor-level deployment scenario, the system and method described herein may have the virtualized production server automatically discover all virtual machines running thereon and create various replication scenarios according to the virtual machines selected to be replicated to the virtualized replica server. As such, the replication and high availability engine installed on the parent partition in the virtualized production server may replicate all the files associated with the discovered (or selected) virtual machines to the virtualized replica server, which may store the replicated files within one or more virtual machine files that correspond to the discovered or selected virtual machines, and any subsequent changes to the files associated with the discovered or selected virtual machines may be continuously replicated to the corresponding virtual machine files on the virtualized replica server in a similar manner. In response to a switchover or failover condition associated with the virtualized production server, the replication and high availability engine may bring the virtualized replica server online, use the virtualization stack in the parent partition to create on-demand virtual machines from the virtual machine files corresponding to the virtual machines on the virtualized production server, and redirect end users and workloads to the replica server to maintain consistency and minimize downtime. In one implementation, switchover or failover conditions associated with individual virtual machines may be handled similarly, wherein the virtualized replica server may start an appropriate on-demand virtual machine and redirect end users and workloads to the on-demand virtual machine to minimize downtime associated with the individual virtual machines experiencing disruption.
  • According to one aspect of the invention, the hypervisor-level deployment scenario may include the system and method described herein initially installing the replication and high availability engine in the parent partition on the virtualized production server (rather than the individual virtual machines) and the parent partition on the virtualized replica server. In addition, one or more components associated with the virtualization stack may be installed on the guest operating system associated with every virtual machine on the virtualized production server to determine host names associated with the virtual machines, whereby all the virtual machines on the virtualized production server may then be automatically discovered and a volume shadow copy service (VSS) writer associated with the virtualization stack may collect all the files relating to the discovered virtual machines (e.g., virtual hard disk files, configuration files, and snapshot files associated with each virtual machine). The replication and high availability engine may then automatically create various replication scenarios associated with each virtual machine to define various replication properties associated with each virtual machine, wherein the replication and high availability engine may then run all scenarios associated with all virtual machines to replicate and protect all the virtual machines, or alternatively select certain virtual machines (or certain scenarios associated with a particular virtual machine) to customize the replication scenarios used to protect the virtualized production server. In response to suitably synchronizing the files associated with the virtual machines on the virtualized production server to the virtualized replica server, any subsequent changes to the virtual machines may be replicated to the virtualized replica server.
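  • The discovery and scenario-creation steps described above may be sketched as a function that takes the files collected for each discovered virtual machine and produces one replication scenario per virtual machine, optionally restricted to a selected subset. The `ReplicationScenario` fields, the host name, and the file names are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationScenario:
    vm_name: str
    files: tuple        # virtual hard disk, configuration, snapshot files
    replica_host: str


def create_scenarios(discovered: dict, replica_host: str, selected=None) -> list:
    """Build one scenario per discovered VM, or only for the selected VMs."""
    names = sorted(discovered)
    if selected is not None:
        names = [name for name in names if name in selected]
    return [
        ReplicationScenario(name, tuple(discovered[name]), replica_host)
        for name in names
    ]


# Hypothetical discovery result: VM name -> files collected for that VM.
discovered_vms = {
    "web01": ["web01.vhd", "web01.xml"],
    "db01": ["db01.vhd", "db01.xml", "db01.avhd"],
}
all_scenarios = create_scenarios(discovered_vms, "replica-host")
db_only = create_scenarios(discovered_vms, "replica-host", selected={"db01"})
```

Passing no selection protects every discovered virtual machine, while passing a set of names corresponds to customizing which virtual machines are replicated.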
  • According to one aspect of the invention, the system and method described herein may handle switchover or failover conditions associated with one or more virtual machines on the virtualized production server, which may include the virtualized replica server creating and registering one or more on-demand virtual machines corresponding to the virtual machines associated with the switchover or failover condition (e.g., from the corresponding virtual machine files). In particular, the switchover and failover procedure may generally exchange active and standby roles between the virtualized production server and the virtualized replica server, whereby the virtualized production server may change to a standby role in response to the switchover or failover assigning the active role to the virtualized replica server. Furthermore, in response to the switchover or failover, the relevant scenarios may further specify how to handle reverse replication operations (e.g., replicating changes to the on-demand virtual machines to protect or otherwise back up changes to the files associated therewith), whereby changes to the on-demand virtual machines may continue to be replicated in accordance with the reverse replication operations specified in the relevant scenarios. In one implementation, the switchover or failover may be triggered manually or automatically.
  • According to one aspect of the invention, the system and method described herein may perform switchback or failback to return the active role to the virtualized production server and the standby role to the virtualized replica server subsequent to the switchover or failover exchanging the active and standby roles between the virtualized production server and the virtualized replica server. For example, to perform the switchback or failback, the system and method described herein may determine whether to overwrite the data that existed on the virtualized production server prior to the switchover or failover with the data existing on the virtualized replica server at the time that the switchback or failback will be performed. Furthermore, in response to data loss or data corruption on the virtualized production server, the lost or corrupted data can be restored from the virtualized replica server via reverse synchronization to the virtualized production server, or the lost or corrupted data may be recovered from a certain event in the past or a prior point in time via the data rewind capabilities (e.g., via a suitable event-stamped or time-stamped checkpoint and/or bookmark that can be used to roll the virtualized production server back to the event or point in time prior to when the data was lost or corrupted).
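  • The exchange of active and standby roles during switchover or failover, and their return during switchback or failback, can be sketched as a pair of guarded role swaps. The `Server` class, the role strings, and the function names below are hypothetical; the sketch omits the data overwrite decision and reverse synchronization that the text also describes.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    role: str  # "active" or "standby"


def switchover_or_failover(production: Server, replica: Server) -> None:
    """Give the active role to the replica; production becomes standby."""
    if production.role != "active" or replica.role != "standby":
        raise ValueError("production must be active and replica standby")
    production.role, replica.role = "standby", "active"


def switchback_or_failback(production: Server, replica: Server) -> None:
    """Return the active role to production; replica becomes standby again."""
    if production.role != "standby" or replica.role != "active":
        raise ValueError("roles have not been exchanged yet")
    production.role, replica.role = "active", "standby"


production = Server("production", "active")
replica = Server("replica", "standby")

switchover_or_failover(production, replica)
roles_after_failover = (production.role, replica.role)

switchback_or_failback(production, replica)
roles_after_failback = (production.role, replica.role)
```

The guards make the ordering explicit: a switchback or failback is only meaningful after a switchover or failover has already exchanged the roles.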
  • Other objects and advantages of the invention will be apparent to those skilled in the art based on the following drawings and detailed description.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates an exemplary architecture that may be used to provide a virtualized replication and high availability environment, according to one aspect of the invention.
  • FIG. 2A illustrates an exemplary system that may provide a physical to virtualized replication and high availability environment, while FIG. 2B illustrates an exemplary system that may provide a virtualized to virtualized replication and high availability environment, according to one aspect of the invention.
  • FIG. 3 illustrates an exemplary method that may be used to balance loads and manage switchover or failover conditions in a virtualized replication and high availability environment, according to one aspect of the invention.
  • DETAILED DESCRIPTION
  • According to one aspect of the invention, the system and method described herein may provide a virtualized replication and high availability environment. In particular, the system and method described herein may provide Windows, Linux, and Unix systems with high availability and continuous or periodic data protection associated with related applications and data to maximize uptime and availability associated with physical and virtualized environments. For example, the system and method described herein may provide simple mechanisms to migrate or replicate data between different servers and locations, whether physical or virtual, and to consolidate data between remote offices and a backup or archive facility and protect onsite and offsite data. Further, the system and method described herein may include non-disruptive recovery testing and data rewind capabilities to restore systems, applications, and data to prior states, which may be useful for speeding recovery times and minimizing data loss. Additionally, the system and method described herein may further include real-time server and application monitoring, automated and push-button failover or switchover, and automated and push-button switchback (or failback) to restore replica systems or replica applications in response to a production (or master) server having been repaired or replaced. In one implementation, the system and method described herein may perform monitoring at server, application, and hypervisor and virtual machine levels, which may enable the system and method to respond to issues at physical, application, and virtualization levels, and may further replicate operating systems, system states, and application data to an offline replica server, which may enable the system and method to improve protection speeds, reduce costs, and safely test and migrate from a physical to a virtual server or from one virtual server to another virtual server. 
Moreover, the system and method described herein may include a unified management console across all operating systems, virtualization platforms, and applications to easily visualize and manage the virtualized replication and high availability environment.
  • According to one aspect of the invention, FIG. 1 illustrates an exemplary architecture 100 that may be used to provide the virtualized replication and high availability environment that will be described in further detail herein. In particular, the architecture 100 illustrated in FIG. 1 may include a hypervisor 120 that runs directly on underlying hardware 110, wherein the hypervisor 120 may run a guest operating system and support isolated partitions. For example, in one implementation, the architecture 100 may include Microsoft Hyper-V server virtualization technology integrated with Windows Server 2008, which may be used to create and run separate virtual machines on one physical machine and thereby consolidate multiple server and application roles and better leverage server hardware investments. Furthermore, the Hyper-V architecture 100 may natively support x64 computing, which may be leveraged to efficiently run multiple different operating systems (e.g., Windows®, Linux, etc.) in parallel on one physical server, and may allow assigning multiple processors or processor cores to one virtual machine, which may provide a future-proof virtualization technology that can utilize the increased processing capacity associated with multi-core processors.
  • In one implementation, the hypervisor 120 used in the architecture 100 may have a parent partition that runs an appropriate guest operating system (e.g., Windows Server 2008 if the architecture 100 implements Microsoft Hyper-V), wherein a virtualization stack may run in the parent partition and have direct access to the underlying hardware devices 110. The parent partition may then create one or more child partitions that can host any suitable guest operating system (e.g., Windows Server 2008, Windows NT 4.0, Linux distributions, etc.). In one implementation, the virtualization stack running in the parent partition may include various components that run in a kernel mode or privileged processor ring (i.e., “Ring 0”), including a VMBus 130 a that provides a logical channel to redirect requests and responses between virtual devices in the child partitions and the parent partition that has access to the underlying hardware 110 and thereby manage inter-partition communication between the parent and child partitions. In one implementation, the virtualization stack in the parent partition may further include various device drivers 135 associated with virtual machines running in the child partitions, a kernel 140 a to support the guest operating system instance (e.g., Windows Server 2008) running in the parent partition, and a virtualization service provider 150 that handles device access requests from the child partitions via the VMBus 130 a. 
For example, the child partitions may similarly include a kernel 140 b to support the guest operating system running therein, which may be the same or different from the guest operating system running in the parent partition, a VMBus 130 b to communicate with the parent partition, and a virtualization service consumer (or virtualization service client) 155 that transparently communicates with the virtualization service provider 150 in the parent partition (e.g., via the VMBus 130 a and VMBus 130 b) to redirect and fulfill device access requests that originate in the child partitions.
  • In one implementation, the virtualization stack in the parent partition may further include various components that run in a user mode or less privileged ring (i.e., “Ring 3”), including a virtual machine interface provider 160 that guest operating systems or applications 190 running in the child partitions can use to communicate with the hypervisor 120 (via the VMBus 130 a-b). In addition, the components running in Ring 3 may include a virtual machine management service 170 that can manage states associated with the virtual machines running applications 190 in the child partitions and control tasks that relate to states associated therewith (e.g., capturing snapshots associated with the virtual machines). To that end, the virtual machine management service 170 may create one or more virtual machine worker processes 180 to start corresponding virtual machine instances that run the applications 190 in the child partitions, wherein the virtual machine worker processes 180 may handle management level interactions between the parent partition and the virtual machines in the child partitions. For example, in one implementation, the virtual machine worker processes 180 may create, configure, run, pause, resume, save, restore, and snapshot the associated virtual machine instance in the child partitions, and may further handle interrupt requests, memory, and input/output port mapping associated with the virtual machine instances. In one implementation, further detail relating to the Microsoft Hyper-V virtualization technology that may be used in the architecture 100 may be found in “Virtualization for Windows: A Technology Overview” and “Getting to Know Hyper-V: A Walkthrough from Initial Setup to Common Scenarios,” the contents of which are hereby incorporated by reference in their entirety.
  • According to one aspect of the invention, FIG. 2A illustrates an exemplary system 200A that may provide a physical to virtualized replication and high availability environment, while FIG. 2B illustrates an exemplary system 200B that may provide a virtualized to virtualized replication and high availability environment. In particular, the system 200A may generally include a physical production (or active) server 220 and a virtualized replica (or standby) server 260 to ensure that various applications 250 running on the production server 220 will have absolute operational continuity to a certain degree over a given measurement period. In a similar respect, the system 200B shown in FIG. 2B may include a virtualized production server 220 having various virtual machines 255 that run the applications 250 on the production server 220, wherein the virtualized replica server 260 may similarly ensure that the applications 250 running in the virtual machines 255 on the virtualized production server 220 will have absolute operational continuity over the measurement period.
  • As such, to provide the virtualized replication and high availability environment, the systems 200A-B may respectively validate consistency between the applications 250 running on the physical production server 220 and the virtual machines 255 running the applications 250 on the virtualized production server and various virtual machine files 270 hosted on the replica server 260 that respectively correspond to the applications 250 and the virtual machines 255 running on the production server 220, which may enable recovering the applications 250 (or the virtual machines 255 running the applications 250) on the production server 220 from the virtual machine files 270 on the replica server 260. For example, in one implementation, the applications 250 running on the production server 220 (or the virtual machines 255 running the applications 250 on the production server 220) may become unavailable due to downtime, failure, or other loss or disruption associated with the production server 220, in which case the system 200 may load the applications 250 (or the virtual machines 255 running the applications 250 on the production server 220) on the replica server 260 to ensure continuity and thereby handle the downtime, failure, or other loss or disruption associated with the production server 220. In one implementation, the procedure in which the system 200 loads the applications 250 (or the virtual machines 255 running the applications 250) on the replica server 260 to ensure continuity may be considered switchover if the downtime was planned (e.g., to upgrade or maintain the production server 220) or failover if the downtime was unplanned (e.g., because the production server 220 failed due to a threat, overload condition, or other emergency that was not anticipated in advance). 
Moreover, the procedure to subsequently recover the applications 250 (or the virtual machines 255 running the applications 250) on the production server 220 from the replica server 260 may be referred to as switchback or failback.
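The terminology above can be summarized in a minimal illustrative sketch. The names `Transition` and `classify_transition` are hypothetical, and the sketch assumes (the text does not say so explicitly) that switchback reverses a switchover and failback reverses a failover.

```python
from enum import Enum


class Transition(Enum):
    SWITCHOVER = "switchover"  # planned move of the workload to the replica
    FAILOVER = "failover"      # unplanned move of the workload to the replica
    SWITCHBACK = "switchback"  # recovery on production after a switchover
    FAILBACK = "failback"      # recovery on production after a failover


def classify_transition(downtime_planned: bool, returning_to_production: bool) -> Transition:
    """Name the procedure from whether the downtime was planned and its direction."""
    if returning_to_production:
        # Assumption: the return path is named after the transition it reverses.
        return Transition.SWITCHBACK if downtime_planned else Transition.FAILBACK
    return Transition.SWITCHOVER if downtime_planned else Transition.FAILOVER
```

For example, a maintenance window on the production server would map to `Transition.SWITCHOVER`, while an unanticipated crash would map to `Transition.FAILOVER`.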
  • In one implementation, the systems 200A-B shown in FIGS. 2A-B may both include a replication and high availability engine 240 a on the production server 220 and a similar replication and high availability engine 240 b on the replica server 260, wherein the replication and high availability engines 240 a-b may use asynchronous real-time replication and proactive validation to test whether the virtual machine files 270 hosted on the replica server 260 can reliably recover the applications 250 (or the virtual machines 255 running the applications 250) on the production server 220 to provide cost-effective disaster recovery. In particular, data associated with various applications 250 and files, databases, or other suitable data sources relating thereto may be synchronized between the production server 220 and the replica server 260 and subsequent changes to the data may be asynchronously replicated between the production server 220 and the replica server 260, wherein the data may be synchronized and replicated over local area networks, wide area networks, or other suitable networks that have the replication and high availability engine 240 installed therein and the appropriate TCP or other network connections needed to communicate with one another. Thus, in one implementation, the virtualized replication and high availability environment provided in the systems 200A and 200B may provide data synchronization, asynchronous real-time data replication, and automated switchover, failover, and switchback to provide data continuity in various deployment scenarios. For example, as will be described in further detail herein, the various deployment scenarios may include full system protection (physical or virtual) using a hypervisor host, and may further include replication and high availability in physical to virtual guest, virtual guest to virtual guest, and hypervisor host to hypervisor host environments.
  • In one implementation, the system 200A shown in FIG. 2A may use the hypervisor host to provide the physical full system protection deployment scenario, while the system 200B shown in FIG. 2B may use the hypervisor host to provide the virtual full system protection deployment scenario, wherein either full system protection deployment scenario may provide application-independent synchronization to transfer a complete state associated with the production server 220 to the virtualized replica server 260 and subsequently replicate changes to the state associated with production server 220 to the virtualized replica server 260. For example, in the system 200A, the replication and high availability engine 240 a on the physical production server 220 may read data directly from volumes associated with various master applications 250 a-n running on the physical production server 220 to obtain any suitable files and data relating to the operating system, system state, and disk layout associated with the master applications 250 a-n. In one implementation, the replication and high availability engine 240 a may then serialize and send the data read from the volumes associated with the master applications 250 a-n to the virtualized replica server 260, which may inject the serialized data into virtual hard disk (*.vhd) files 270 a-n that represent the volumes associated with the master applications 250 a-n. For example, in one implementation, the virtualized replica server 260 may include the hypervisor host within a virtualization stack 230 b having a substantially similar architecture to that shown in FIG. 1 and described above, whereby the hypervisor host may run various different operating systems in one or more child partitions to support the operating systems that run the master applications 250 on the physical production server 220.
  • As such, to synchronize and transfer the complete state associated with the physical production server 220 to the virtualized replica server 260, a replication and high availability engine 240 b on the virtualized replica server 260 may use the hypervisor host in the virtualization stack 230 b to inject the serialized data into the *.vhd files 270 and thereby perform volume-level synchronization associated with the master applications 250 a-n running on the physical production server 220. Furthermore, the system 200B may operate in a substantially similar manner to synchronize and transfer the complete state associated with the virtualized production server 220 to the virtualized replica server 260, except that the virtualized production server 220 may include a virtualization stack 230 a having a hypervisor host that can read the operating system, system state, disk layout, and other volume data associated with the master applications 250 directly from virtual hard disk files associated with virtual machines 255 that run the applications 250 on the virtualized production server 220. In either scenario, subsequent to suitably synchronizing the complete state associated with the applications 250 running on the production server 220 (or the virtual machines 255 running the applications 250), the replication and high availability engine 240 a on the production server 220 may replicate any changes to the applications 250 or the virtual machines 255 that run the applications 250 to the virtualized replica server 260, which may replicate the changes within the *.vhd files 270 that correspond to the changed applications 250 or virtual machines 255. In one implementation, the replication may be performed at the file-level, including all files on the volumes associated with the master applications 250 in addition to any files in system folders that relate to the production server 220. 
Moreover, the systems 200A-B may both use reverse path lookups to maintain consistent mappings between the *.vhd files 270 hosted on the virtualized replica server 260 and the volumes or file systems associated with the master applications 250 or virtual machines 255 on the production server 220 (e.g., using techniques described in U.S. patent application Ser. No. 13/234,532, entitled “System and Method for Network File System Server Replication Using Reverse Path Lookup,” filed Sep. 16, 2011, the contents of which are hereby incorporated by reference in their entirety).
  • In one implementation, the virtualized replica server 260 may initially have an offline state to prevent network address, network name, or other network conflicts (i.e., because the virtualized replica server 260 represents an effective clone associated with the physical or virtualized production server 220). However, to handle switchover or failover in response to downtime, failure, or other loss or disruption associated with a particular master application 250 or virtual machine 255 running thereon, the disrupted master application 250 or virtual machine 255 may be disabled and the virtualized replica server 260 may create an on-demand virtual machine 280 from the virtual machine file 270 corresponding thereto. For example, in addition to the *.vhd files 270 that represent the volumes or file systems associated with the master applications 250 a-n and virtual machines 255, the virtual machine files 270 on the virtualized replica server 260 may include a *.xml file that contains information to configure the operating system, disk size, network, and other aspects associated with the on-demand virtual machine 280 and a *.avhd file that contains a most recent snapshot associated with the master application 250 or virtual machine 255, which may be created, validated, and otherwise managed using techniques described in U.S. patent application Ser. No. 13/043,201, entitled “System and Method for Providing Assured Recovery and Replication,” filed Mar. 8, 2011, and U.S. patent application Ser. No. 13/234,532, the contents of which are hereby incorporated by reference in their entirety. As such, to handle the switchover or failover condition, the virtualization stack 230 b may use the virtual machine files 270 to create the on-demand virtual machine 280 and make the on-demand virtual machine 280 available to ensure continuity associated with the application 250 or virtual machine 255 that were disrupted on the production server 220. 
For example, the virtualization stack 230 b may use the hypervisor host to configure the on-demand virtual machine 280 with various values specified in the *.xml configuration file 270, mount a disk image from the *.vhd file 270 and connect the on-demand virtual machine 280 to the mounted disk image, configure network connections associated with the on-demand virtual machine 280 with information specified in the *.xml configuration file 270, and then boot the on-demand virtual machine 280 and install integration services to make the on-demand virtual machine 280 available to end users.
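The order of operations just described may be sketched as follows. This is an illustration only: `RecordingHypervisor` is a hypothetical stand-in that merely records calls and does not correspond to any actual Hyper-V interface, and the file names are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class VirtualMachineFiles:
    config_xml: str  # *.xml configuration file
    disk_vhd: str    # *.vhd virtual hard disk file


class RecordingHypervisor:
    """Hypothetical hypervisor host that only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, ...]] = []

    def configure(self, vm: str, config_xml: str) -> None:
        self.calls.append(("configure", vm, config_xml))

    def mount_disk(self, disk_vhd: str) -> str:
        self.calls.append(("mount_disk", disk_vhd))
        return disk_vhd + "#mounted"  # placeholder handle for the mounted image

    def attach_disk(self, vm: str, image: str) -> None:
        self.calls.append(("attach_disk", vm, image))

    def configure_network(self, vm: str, config_xml: str) -> None:
        self.calls.append(("configure_network", vm, config_xml))

    def boot(self, vm: str) -> None:
        self.calls.append(("boot", vm))

    def install_integration_services(self, vm: str) -> None:
        self.calls.append(("install_integration_services", vm))


def start_on_demand_vm(hv: RecordingHypervisor, vm: str, files: VirtualMachineFiles) -> None:
    """Run the steps in the order given in the description."""
    hv.configure(vm, files.config_xml)          # values from the *.xml file
    image = hv.mount_disk(files.disk_vhd)       # mount the *.vhd disk image
    hv.attach_disk(vm, image)                   # connect the VM to the image
    hv.configure_network(vm, files.config_xml)  # network settings from *.xml
    hv.boot(vm)
    hv.install_integration_services(vm)         # make the VM available to users
```

Keeping the hypervisor behind a small interface like this makes the sequencing testable without a real virtualization host.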
  • Accordingly, because the physical or virtual full system protection deployment scenarios described above may protect the entire state associated with the production server 220, the physical or virtual full system protection deployment scenarios may support large sets of applications 250 and environments. Moreover, the full system protection deployment scenarios may be simple to deploy because transferring the entire state associated with the production server 220 to the virtualized replica server 260 in an automated manner may obviate or substantially reduce any need to manually provision the virtualized replica server 260 prior to initiating replication operations.
  • In one implementation, the system 200A shown in FIG. 2A may be used to provide the physical to virtual guest replication and high availability deployment scenario. In particular, the system 200A may include a physical production server 220 having various master applications 250 a-n and a replication and high availability engine 240 a running thereon in addition to a virtualized replica server 260 having a similar replication and high availability engine 240 b and a virtualization stack 230 b to manage various virtual machine files 270 a-n that correspond to the master applications 250 a-n running on the physical production server 220. As such, the replication and high availability engine 240 a may generally replicate data associated with the master applications 250 a-n residing on the physical production server 220 or any other suitable data residing on the physical production server 220 to the virtualized replica server 260, wherein the physical production server 220 and the virtualized replica server 260 may reside at the same location, or the virtualized replica server 260 may be located at a remote data center or remote office that provides a data protection and disaster recovery site associated with the physical production server 220.
  • In one implementation, in response to suitably synchronizing the physical production server 220 and the virtualized replica server 260 (e.g., via the full system protection techniques described above or another suitable mechanism), the physical to virtual guest replication and high availability deployment scenario may continuously capture and replicate byte-level changes to the master applications 250 a-n and any databases or files on the physical production server 220 to the virtualized replica server 260. For example, in one implementation, the byte-level changes may be captured and replicated using techniques described in U.S. patent application Ser. No. 10/188,512, entitled “Method and System for Updating an Archive of a Computer File,” filed Jul. 3, 2002, which issued as U.S. Pat. No. 7,730,031 on Jun. 1, 2010, the contents of which are hereby incorporated by reference in their entirety. As such, the virtualization stack 230 b on the virtualized replica server 260 may include an active hypervisor that has access to underlying hardware and runs a guest operating system to replicate the changes within various virtual machine files 270 a-n that correspond to the master applications 250 a-n and other data volumes residing on the physical production server 220 and thereby deliver continuous onsite or offsite data protection. Moreover, in one implementation, the changes captured and replicated from the physical production server 220 to the virtualized replica server 260 may be recorded in a rewind log to preserve a context that can be used, for example, to track the changes, undo the changes at the production server 220, or locate a switch point in the virtual machine files 270 on the replica server 260 that can be used to suitably resume business operations in response to a disaster or other failure associated with the production server 220 (e.g., using techniques described in U.S. patent application Ser. No. 10/981,837, entitled “Replicated Data Validation,” filed Nov. 5, 2004, which issued as U.S. Pat. No. 7,840,535 on Nov. 23, 2010, the contents of which are hereby incorporated by reference in their entirety).
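One way to picture a rewind log is the following minimal sketch, which is not the technique of the incorporated references. It assumes in-place byte-level overwrites on an in-memory buffer, and the class name `RewindLog` and its methods are hypothetical.

```python
from typing import Dict, List, Tuple


class RewindLog:
    """Records byte-level overwrites so they can be undone back to a bookmark."""

    def __init__(self, data: bytearray) -> None:
        self.data = data
        self.entries: List[Tuple[int, bytes]] = []  # (offset, overwritten bytes)
        self.bookmarks: Dict[str, int] = {}         # name -> log length at bookmark

    def apply(self, offset: int, new_bytes: bytes) -> None:
        end = offset + len(new_bytes)
        if offset < 0 or end > len(self.data):
            raise ValueError("sketch supports in-place overwrites only")
        # Save the old bytes before overwriting so the change can be undone.
        self.entries.append((offset, bytes(self.data[offset:end])))
        self.data[offset:end] = new_bytes

    def bookmark(self, name: str) -> None:
        self.bookmarks[name] = len(self.entries)

    def rewind_to(self, name: str) -> None:
        target = self.bookmarks[name]
        # Undo changes newest-first until the log is as long as at the bookmark.
        while len(self.entries) > target:
            offset, old_bytes = self.entries.pop()
            self.data[offset:offset + len(old_bytes)] = old_bytes
```

Undoing entries in reverse order matters: if two changes touch the same bytes, restoring them newest-first returns the buffer to its exact state at the bookmark.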
  • Accordingly, the physical to virtual guest replication and high availability deployment scenario may therefore synchronize and replicate the physical production server 220 to the virtualized replica server 260 to support automated or manual switchover and failover to redirect workloads from the physical production server 220 to the virtualized replica server 260. For example, in one implementation, the virtualized replica server 260 may invoke one or more components in the virtualization stack 230 b to automatically start one or more on-demand virtual machines 280 in response to downtime, failure, outage, or other disruption associated with the physical production server 220 or one or more applications 250 running thereon. In particular, the virtualization stack 230 b may start the one or more on-demand virtual machines 280 from the virtual machine files 270 that correspond to the applications 250 experiencing disruption on the physical production server 220, wherein end users and workloads associated with the disrupted applications 250 may be automatically redirected to the on-demand virtual machines 280 started on the virtualized replica server 260 to handle the switchover or failover and thereby minimize business downtime. Alternatively, the procedure to start the on-demand virtual machines 280 on the virtualized replica server 260 and redirect the end users and workloads associated with the disrupted applications 250 to the virtualized replica server 260 may be initiated manually, whereby information technology personnel may investigate the issues that caused the disruption prior to performing the switchover or failover (if necessary).
  • In one implementation, the system 200B shown in FIG. 2B may be used to provide the virtual guest to virtual guest and hypervisor host to hypervisor host replication and high availability deployment scenarios. In particular, the virtual guest to virtual guest replication and high availability deployment scenario may generally synchronize and replicate the production server 220 to the virtualized replica server 260 in a substantially similar manner to the physical to virtual guest replication and high availability deployment scenario, and may further handle switchover and failover in a substantially similar manner to the physical to virtual guest scenario. However, in the virtual guest to virtual guest scenario, the production server 220 may be virtualized, whereby the virtualized production server 220 may include one or more virtual machines 255 to execute the applications 250 on the virtualized production server 220, while the virtualized replica server 260 runs one or more virtual machines 280 and maintains one or more virtual machine files 270 that correspond to the virtual machines 255 executing the applications 250 on the virtualized production server 220 (e.g., mirroring the *.vhd virtual hard disk files, the *.xml configuration files, and the *.avhd snapshot files associated with the virtual machines running on the virtualized production server). Furthermore, the virtual guest to virtual guest deployment scenario may have different instances of the replication and high availability engine 240 a installed and configured on the individual virtual machines 255 running on the virtualized production server 220, and may similarly have different instances of the replication and high availability engine 240 b installed and configured on the individual virtual machines 280 running on the virtualized replica server 260. 
Accordingly, the replication and high availability engine instances 240 a on the virtualized production server 220 and the replication and high availability engine instances 240 b on the virtualized replica server 260 may communicate with one another to synchronize, replicate, and manage switchover and failover associated with the individual virtual machines 255 running the applications 250 on the virtualized production server 220.
  • In one implementation, the hypervisor host to hypervisor host replication and high availability deployment scenario may obviate or substantially reduce a need to install and configure different instances associated with the replication and high availability engine 240 on individual virtual machines, which may advantageously provide hypervisor-level replication, switchover and failover, and rewind and recovery capabilities associated with all (or certain selected) virtual machines 255 running on the virtualized production server 220. In particular, if a third party provides the replication and high availability engine 240, the hypervisor-level replication, switchover and failover, and rewind and recovery capabilities may require only one license to purchase the replication and high availability engine 240 from the third party per virtual host (e.g., one license for the replication and high availability engine 240 a on the virtualized production server 220 and one license for the replication and high availability engine 240 b on the virtualized replica server 260). Moreover, the hypervisor-level capabilities may substantially reduce deployment time and costs because the requisite software need only be installed on the parent partition within the virtualized production server 220 and the virtualized replica server 260, and may further reduce processor and memory usage because each virtual machine 255 would not require a locally installed replication and high availability engine instance 240 a. In addition, because the virtualized replica server 260 only creates the on-demand virtual machines 280 in response to a switchover or failover condition, the hypervisor-level deployment scenario may satisfy cold site definitions and thereby reduce costs associated with licensing operating systems and licenses associated with the applications 250 running in the virtual machines 255 on the virtualized production server 220.
  • In one implementation, to provide the hypervisor-level deployment scenario shown in FIG. 2B, the virtualized production server 220 may automatically discover all the virtual machines 255 running thereon and create various replication scenarios according to the virtual machines 255 that are selected to be replicated to the virtualized replica server 260. As such, the replication and high availability engine 240 a installed on the parent partition in the virtualized production server 220 may then replicate all the files associated with the discovered virtual machines 255 (or the virtual machines 255 selected to be replicated) to the virtualized replica server 260, which may use the replication and high availability engine 240 b to store the replicated files within one or more virtual machine files 270 that correspond to the discovered (or selected) virtual machines 255. In a similar respect, in response to any subsequent changes to the files associated with the discovered (or selected) virtual machines 255, the changes may be continuously replicated within the corresponding virtual machine files 270 stored on the virtualized replica server 260 (via the replication and high availability engine 240 b). In response to any switchover (planned downtime) or failover (unplanned downtime) conditions associated with the virtualized production server 220, the replication and high availability engine 240 b may bring the virtualized replica server 260 online, use the virtualization stack 230 b in the parent partition to create on-demand virtual machines 280 within one or more child partitions from the virtual machine files 270 corresponding to the discovered or selected virtual machines on the virtualized production server 220, and redirect end users and workloads to the replica server 260 to maintain consistency and minimize downtime associated with the disruption to the virtualized production server 220. 
In one implementation, switchover or failover conditions associated with individual discovered or selected virtual machines 255 may be handled in a similar manner, whereby the virtualized replica server 260 may start an appropriate on-demand virtual machine 280 and redirect end users and workloads to the on-demand virtual machine 280 to minimize downtime associated with the individual discovered or selected virtual machines 255 experiencing disruption.
  • More particularly, the hypervisor-level replication and high availability scenario may include initially installing the replication and high availability engine 240 a in the parent partition on the virtualized production server 220 (rather than the individual virtual machines 255) and similarly installing the replication and high availability engine 240 b in the parent partition on the virtualized replica server 260. In addition, one or more components associated with the virtualization stack may be installed on the guest operating system associated with every virtual machine 255 on the virtualized production server 220 to enable the replication and high availability engine 240 a to determine host names associated with the virtual machines 255. The replication and high availability engine 240 a may then automatically discover all the virtual machines 255 on the virtualized production server 220 and use a volume shadow copy service (VSS) writer associated with the virtualization stack 230 a to collect all the files 270 relating to the discovered virtual machines 255, wherein the collected files 270 may include the *.vhd files that represent virtual hard disks associated with each virtual machine 255, the *.xml configuration files that contain unique identifiers and various settings associated with each virtual machine 255, and the *.avhd files that contain all snapshots associated with the individual virtual machines 255.
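As an illustration of the three file types collected for each virtual machine, the following sketch sorts a list of paths by extension. It does not use the VSS writer; the function name and the example paths are hypothetical.

```python
from pathlib import PureWindowsPath
from typing import Dict, Iterable, List

# File roles named in the description, keyed by file extension.
ROLE_BY_SUFFIX = {
    ".vhd": "virtual_hard_disk",
    ".xml": "configuration",
    ".avhd": "snapshot",
}


def classify_vm_files(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group paths by role; files with other extensions are ignored."""
    grouped: Dict[str, List[str]] = {role: [] for role in ROLE_BY_SUFFIX.values()}
    for path in paths:
        role = ROLE_BY_SUFFIX.get(PureWindowsPath(path).suffix.lower())
        if role is not None:
            grouped[role].append(path)
    return grouped
```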
  • In one implementation, in response to suitably collecting all the files 270 relating to the discovered virtual machines 255, the replication and high availability engine 240 a may automatically create various replication scenarios associated with each virtual machine 255. In particular, the replication scenarios may generally define various replication properties associated with each virtual machine 255, wherein the properties may enable or disable scheduled bookmarks on the production server 220, set spool sizes and directory paths, replicate in online or scheduled modes, specify whether to synchronize at a file-level or block-level, specify whether to ignore certain files having the same size and type, specify whether to run a script, send an email, or log results to handle event notifications and reporting, and enable or disable delays or data rewind capabilities, among others. In one implementation, each replication scenario associated with an individual virtual machine 255 may include all the files 270 relating thereto, including the *.vhd virtual hard disk file, the *.xml configuration file, and the *.avhd snapshot file associated with the individual virtual machine 255. In one implementation, the replication and high availability engine 240 a may then run all the scenarios associated with all the virtual machines 255 in order to replicate and protect all the virtual machines 255. Alternatively, one or more virtual machines 255 (or certain scenarios associated with a particular virtual machine 255) may be selected to customize the replication scenarios used to protect the virtualized production server 220. 
In one implementation, in response to suitably synchronizing the files 270 associated with the virtual machines 255 on the virtualized production server 220 that are to be replicated to the virtualized replica server 260, the replication and high availability engine 240 a may then replicate any subsequent changes to the files 270 associated with such virtual machines 255 to the virtualized replica server 260 (via the replication and high availability engine 240 b installed thereon).
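The replication properties listed above could be modeled along the following lines. This is a sketch under stated assumptions: the field names, default values, and the `create_scenarios` helper are hypothetical and are not taken from any actual product configuration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class ReplicationScenario:
    vm_name: str
    scheduled_bookmarks: bool = False        # enable or disable scheduled bookmarks
    spool_size_mb: int = 1024                # assumed default spool size
    spool_directory: str = "spool"           # assumed default spool path
    mode: str = "online"                     # "online" or "scheduled"
    sync_level: str = "block"                # "file" or "block"
    ignore_same_size_and_type: bool = False  # skip files with same size and type
    notification: str = "log"                # "script", "email", or "log"
    data_rewind: bool = False                # enable or disable data rewind

    def __post_init__(self) -> None:
        if self.mode not in ("online", "scheduled"):
            raise ValueError(f"unknown mode: {self.mode}")
        if self.sync_level not in ("file", "block"):
            raise ValueError(f"unknown sync level: {self.sync_level}")
        if self.notification not in ("script", "email", "log"):
            raise ValueError(f"unknown notification: {self.notification}")


def create_scenarios(
    discovered: Iterable[str], selected: Optional[Set[str]] = None
) -> Dict[str, ReplicationScenario]:
    """Create one scenario per discovered VM, or only for the selected ones."""
    return {
        name: ReplicationScenario(vm_name=name)
        for name in discovered
        if selected is None or name in selected
    }
```

Passing `selected=None` corresponds to protecting every discovered virtual machine, while passing a set of names corresponds to the customized selection described above.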
  • In one implementation, in response to a switchover or failover condition associated with one or more virtual machines 255 on the virtualized production server 220, the virtualized replica server 260 may then create and register one or more on-demand virtual machines 280 corresponding to the one or more virtual machines 255 associated with the switchover or failover condition on the virtualized production server 220, wherein the on-demand virtual machines 280 may be created from the virtual machine files 270 corresponding to the virtual machines 255 associated with the switchover or failover condition. In particular, the switchover and failover procedure may generally exchange active and standby roles between the virtualized production server 220 and the virtualized replica server 260, whereby the virtualized production server 220 may change to a standby role in response to the switchover or failover assigning the active role to the virtualized replica server 260. Furthermore, in response to performing the switchover or failover, the relevant scenarios may further specify how to handle reverse replication operations (e.g., replicating changes to the on-demand virtual machines 280 to protect or otherwise back up changes to the files 270 associated therewith), whereby the replication and high availability engine 240 b on the virtualized replica server 260 may continue to replicate changes to the on-demand virtual machines 280 in accordance with the reverse replication operations specified in the relevant scenarios once the virtualized production server 220 becomes available (e.g., changes may be resynchronized from the virtualized replica server 260 to the virtualized production server 220, which may include comparing data on the virtualized production server 220 to data on the virtualized replica server 260 to determine the changes to replicate back to the virtualized production server 220).
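The comparison step in reverse replication can be illustrated with a minimal sketch that compares content digests per file. Using SHA-256 digests over in-memory file contents is an assumption made for the example; the description does not specify how the data is compared, and the function names are hypothetical.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """Content fingerprint used to compare the two copies of a file."""
    return hashlib.sha256(data).hexdigest()


def files_to_replicate_back(
    production: Dict[str, bytes], replica: Dict[str, bytes]
) -> List[str]:
    """Return replica files that are missing or different on production."""
    changed = []
    for name, replica_data in replica.items():
        production_data = production.get(name)
        if production_data is None or digest(production_data) != digest(replica_data):
            changed.append(name)
    return sorted(changed)
```

Only the files returned by the comparison need to be sent back, which avoids a full resynchronization when most data is unchanged.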
  • In one implementation, the switchover or failover may be triggered manually (e.g., due to planned downtime, to balance loads among the virtualized production server 220 and the virtualized replica server 260, in response to a notification that the virtualized production server 220 has become unavailable, etc.). Alternatively, the switchover or failover may be triggered automatically (at a scheduled time or in response to detecting that the virtualized production server 220 has become unavailable), wherein the replication and high availability engine 240 b on the virtualized replica server 260 may periodically check the status associated with the virtualized production server 220 to determine whether to trigger the switchover or failover procedure. For example, in one implementation, the replication and high availability engine 240 b may periodically send ping requests to the virtual machines 255 running on the virtualized production server 220 and automatically bring up the corresponding on-demand virtual machine 280 on the virtualized replica server 260 if the virtualized production server 220 does not respond. Alternatively, the virtualized replica server 260 may check the status associated with the virtualized production server 220 via custom requests to monitor specific applications 250 or virtual machines 255 or requests to databases or services running in the parent partition associated with the virtualized production server 220 to verify the status associated therewith. In another alternative, the switchover may be manually triggered to test certain applications 250 or virtual machines 255 on the virtualized replica server 260 without disrupting or otherwise interfering with operations on the virtualized production server 220.
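The periodic status check may be sketched as follows. The probe is an injected callable so that no real network request is made, and requiring several consecutive missed responses before declaring the production server unavailable is an assumption added for the example (the description only says the server "does not respond").

```python
from typing import Callable


def production_unavailable(probe: Callable[[], bool], attempts: int = 3) -> bool:
    """Return True only if every one of `attempts` probes fails to get a response."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for _ in range(attempts):
        try:
            if probe():
                return False  # any response means production is still up
        except OSError:
            pass  # treat a network error as a missed response
    return True
```

A caller on the replica side could run this check on a timer and start the corresponding on-demand virtual machine when it returns `True`.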
  • In one implementation, subsequent to the switchover or failover exchanging the active and standby roles between the virtualized production server 220 and the virtualized replica server 260, switchback or failback may be performed to return the active role to the virtualized production server 220 and the standby role to the virtualized replica server 260. In one implementation, performing the switchback or failback may include determining whether to overwrite the data that existed on the virtualized production server 220 prior to the switchover or failover with the data existing on the virtualized replica server 260 at the time that the switchback or failback has been initiated. Furthermore, in response to an event that causes data loss on the virtualized production server 220, the lost data can be restored from the virtualized replica server 260 via reverse synchronization to the virtualized production server 220, or the lost data may be recovered from a certain event or point in time via the data rewind capabilities, which may involve locating a suitable event-stamped or time-stamped checkpoint and/or bookmark to roll lost or corrupted data on the virtualized production server 220 back to the event or point in time prior to when the data was lost or corrupted. In one implementation, further detail relating to techniques that may be used to handle replication, switchover or failover, switchback or failback, and the data rewind capabilities in the system 200B may be described in “CA ARCserve Replication and High Availability for Virtualized Server Environments Operating Guide for Windows r16—Protecting Hyper-V Environments,” the contents of which are hereby incorporated by reference in their entirety.
  • According to one aspect of the invention, FIG. 3 illustrates an exemplary method 300 that may be used to balance loads and manage switchover or failover conditions in a virtualized replication and high availability environment. In particular, the method 300 may include an initial operation 310 to install a replication and high availability engine in a parent partition on a virtualized production server (rather than individual virtual machines running in child partitions on the virtualized production server) and a similar replication and high availability engine in a parent partition on a virtualized replica server. In addition, operation 310 may include installing various virtualization stack components in guest operating systems associated with the virtual machines running on the virtualized production server to enable the replication and high availability engine to determine host names associated therewith. In one implementation, the replication and high availability engine may then automatically discover all the virtual machines on the virtualized production server in an operation 320, which may further include a volume shadow copy service (VSS) writer associated with the virtualization stack collecting all the files relating to the discovered virtual machines (e.g., *.vhd files that represent virtual hard disks associated with each virtual machine, *.xml configuration files that contain unique identifiers and various settings associated with each virtual machine, and *.avhd files that contain all snapshots associated with the individual virtual machines).
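As a non-limiting sketch, the files collected in operation 320 might be grouped by the three file types named above. The helper `group_vm_files` and the role labels are hypothetical, and the sketch merely classifies path strings; it does not invoke a VSS writer.

```python
from pathlib import PureWindowsPath
from typing import Dict, Iterable, List

# File extensions named in the description, mapped to illustrative role labels.
ROLE_BY_SUFFIX = {
    ".vhd": "virtual_hard_disk",
    ".xml": "configuration",
    ".avhd": "snapshot",
}


def group_vm_files(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group virtual machine file paths by role, ignoring unrelated files."""
    grouped: Dict[str, List[str]] = {role: [] for role in ROLE_BY_SUFFIX.values()}
    for path in paths:
        role = ROLE_BY_SUFFIX.get(PureWindowsPath(path).suffix.lower())
        if role is not None:
            grouped[role].append(path)
    return grouped
```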
  • In one implementation, in response to suitably collecting all the files relating to the discovered virtual machines, the replication and high availability engine may automatically create various replication scenarios associated with each virtual machine in an operation 330. In particular, the replication scenarios may generally define various replication properties associated with each virtual machine (e.g., whether to enable or disable scheduled bookmarks, which spool sizes and directory paths to use, whether to replicate in online or scheduled modes, etc.). In one implementation, each replication scenario associated with an individual virtual machine may include all the files relating thereto, including the *.vhd virtual hard disk file, the *.xml configuration file, and the *.avhd snapshot file associated with the individual virtual machine, which may be written to the virtualized replica server. In one implementation, the replication and high availability engine may then run all the scenarios associated with all the virtual machines in an operation 340 to replicate and protect all the virtual machines. Alternatively, operation 340 may include selecting certain virtual machines (or certain scenarios associated with a particular virtual machine) to customize the replication scenarios used to protect the virtualized production server. In one implementation, in response to initially synchronizing the files associated with the virtual machines to the virtualized replica server in operation 330, the replication and high availability engine may then replicate any subsequent changes to the files associated with such virtual machines to the virtualized replica server in operation 340 (i.e., via a replication and high availability engine installed thereon).
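For purposes of illustration, a replication scenario carrying the properties listed above might be represented as a simple record. The class `ReplicationScenario`, its field names, and the default values shown are assumptions made for this sketch and are not taken from any product.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReplicationScenario:
    vm_name: str
    files: List[str]                   # the *.vhd, *.xml, and *.avhd files of the VM
    scheduled_bookmarks: bool = False  # hypothetical default
    spool_size_mb: int = 1024          # hypothetical default
    spool_dir: str = "spool"           # hypothetical default
    mode: str = "online"               # "online" or "scheduled"

    def __post_init__(self) -> None:
        if self.mode not in ("online", "scheduled"):
            raise ValueError(f"unknown replication mode: {self.mode!r}")


def create_scenarios(vm_files: Dict[str, List[str]]) -> List[ReplicationScenario]:
    """Create one scenario per discovered virtual machine (cf. operation 330)."""
    return [
        ReplicationScenario(vm_name=name, files=list(files))
        for name, files in sorted(vm_files.items())
    ]
```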
  • In one implementation, a load associated with the virtualized production server may then be analyzed in an operation 350 to determine whether or not to initiate a procedure to balance the load associated with the virtualized production server. For example, an operation 360 may determine whether the virtualized production server currently has an overloaded status or could otherwise benefit from offloading one or more workloads to a standby or other alternate server. As such, in response to operation 360 triggering a load balance condition associated with the virtualized production server, an operation 380 may register one or more on-demand virtual machines to offload and redirect certain workloads from the virtualized production server, as will be described in further detail below. Otherwise, in response to operation 360 determining that the load associated with the virtualized production server does not reflect a need to balance the load, an operation 370 may determine whether or not a switchover or failover condition associated with the virtualized production server or the virtual machines running thereon has occurred. In one implementation, operation 370 may trigger the switchover or failover manually due to planned downtime, in response to a notification that the virtualized production server has become unavailable, or in other appropriate circumstances, or operation 370 may alternatively trigger the switchover or failover automatically (e.g., at a scheduled time, in response to detecting unavailability associated with the virtualized production server or certain virtual machines running thereon, etc.). For example, a replication and high availability engine on the virtualized replica server may periodically check the status associated with the virtualized production server in operation 370 to determine whether to trigger the switchover or failover procedure (e.g., sending ping requests to the virtual machines running on the virtualized production server to determine whether the virtualized production server responds to indicate availability, sending custom requests to specific applications, virtual machines, databases, or services running in the parent partition on the virtualized production server to verify the status associated therewith, etc.).
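The branching among operations 360, 370, and 380 described above may be summarized, purely as a sketch, by the following function. The name `next_operation` is hypothetical, and returning to operation 340 when neither condition holds is an assumption made for this example rather than something the description states.

```python
from typing import Tuple


def next_operation(overloaded: bool, failover_condition: bool) -> Tuple[int, str]:
    """Pick the next operation of method 300 from the two checks described above."""
    if overloaded:
        # Operation 360 triggered a load balance condition.
        return 380, "load balance"
    if failover_condition:
        # Operation 370 detected a switchover or failover condition.
        return 380, "switchover or failover"
    # Assumption: with neither condition present, replication simply continues.
    return 340, "continue replication"
```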
  • In one implementation, in response to operation 360 triggering a load balance associated with the virtualized production server or operation 370 detecting a switchover or failover condition, operation 380 may include the virtualized replica server creating and registering one or more on-demand virtual machines corresponding to any virtual machines on the virtualized production server that are associated with the load balance, switchover, or failover condition. In particular, operation 380 may create the on-demand virtual machines from the virtual machine files corresponding to the virtual machines associated with the load balance, switchover, or failover condition and exchange active and standby roles between the virtualized production server and the virtualized replica server. As such, registering the on-demand virtual machines to perform the load balance, switchover, or failover may change the virtualized production server to a standby role and assign an active role to the virtualized replica server. Furthermore, in response to performing the load balance, switchover, or failover, the relevant replication scenarios may further specify how to handle reverse replication operations, which may be performed using a method having substantially similar characteristics to the method 300 shown in FIG. 3 and described in further detail herein (i.e., replicating changes to the on-demand virtual machines to protect or otherwise back up changes to the files associated therewith after the virtualized replica server became active). Accordingly, the replication and high availability engine on the virtualized replica server may use the above-described techniques to continue replicating changes to the on-demand virtual machines in accordance with the reverse replication operations specified in the relevant scenarios once the virtualized production server becomes available (e.g., to resynchronize changes from the virtualized replica server to the virtualized production server, data on the virtualized production server may be compared to data on the virtualized replica server to determine the changes that need to be replicated back to the virtualized production server).
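As one hedged illustration of the comparison mentioned above, the data on the two servers might be compared by content digest to find what needs to be replicated back. The function `files_to_replicate_back` is hypothetical, the in-memory dictionaries stand in for the files on each server, and the use of SHA-256 digests is an assumption; the description does not specify how the comparison is made.

```python
import hashlib
from typing import Dict, List


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def files_to_replicate_back(
    production: Dict[str, bytes],
    replica: Dict[str, bytes],
) -> List[str]:
    """Return names of replica files that are missing or different on production."""
    return sorted(
        name
        for name, data in replica.items()
        if name not in production or _digest(production[name]) != _digest(data)
    )
```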
  • In one implementation, subsequent to the load balance, switchover, or failover performed in operation 380 to exchange the active and standby roles between the virtualized production server and the virtualized replica server, switchback or failback may be performed in a similar manner to return the active role to the virtualized production server and the standby role to the virtualized replica server. For example, in one implementation, performing the switchback or failback may include determining whether to overwrite the data that existed on the virtualized production server prior to the load balance, switchover, or failover with the data existing on the virtualized replica server at the time that the switchback or failback has been scheduled to occur. Furthermore, in response to an event that causes data loss on the virtualized production server, the lost data can be restored from the virtualized replica server via reverse synchronization to the virtualized production server, or the lost data may be recovered from a certain event or point in time via data rewind capabilities, which may involve locating a suitable event-stamped or time-stamped checkpoint and/or bookmark to roll the virtualized production server back to the event or point in time prior to when the data loss or corruption occurred on the virtualized production server.
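The step of locating a suitable time-stamped checkpoint or bookmark for data rewind might, for example, be sketched as a search for the latest bookmark preceding the loss. The function `latest_bookmark_before`, the `(timestamp, label)` tuple layout, and the requirement that bookmarks be sorted by ascending timestamp are assumptions of this sketch.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

Bookmark = Tuple[float, str]  # (timestamp, label); layout is illustrative only


def latest_bookmark_before(
    bookmarks: List[Bookmark], loss_time: float
) -> Optional[Bookmark]:
    """Return the newest bookmark strictly earlier than `loss_time`, if any.

    `bookmarks` must be sorted by ascending timestamp.
    """
    timestamps = [timestamp for timestamp, _ in bookmarks]
    index = bisect_left(timestamps, loss_time)  # first bookmark at or after the loss
    return bookmarks[index - 1] if index > 0 else None
```

A strictly earlier bookmark is chosen here so that the rewind target predates the loss or corruption; whether a bookmark taken at exactly the loss time would be acceptable depends on the implementation.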
  • Implementations of the invention may be made in hardware, firmware, software, or any suitable combination thereof. The invention may also be implemented as instructions stored on a machine-readable medium that can be read and executed on one or more processing devices. For example, the machine-readable medium may include various mechanisms that can store and transmit information that can be read on the processing devices or other machines (e.g., read only memory, random access memory, magnetic disk storage media, optical storage media, flash memory devices, or any other storage or non-transitory media that can suitably store and transmit machine-readable information). Furthermore, although firmware, software, routines, or instructions may be described in the above disclosure with respect to certain exemplary aspects and implementations performing certain actions or operations, it will be apparent that such descriptions are merely for the sake of convenience and that such actions or operations in fact result from processing devices, computing devices, processors, controllers, or other hardware executing the firmware, software, routines, or instructions. Moreover, to the extent that the above disclosure describes executing or performing certain operations or actions in a particular order or sequence, such descriptions are exemplary only and such operations or actions may be performed or executed in any suitable order or sequence.
  • Furthermore, aspects and implementations may be described in the above disclosure as including particular features, structures, or characteristics, but it will be apparent that every aspect or implementation may or may not necessarily include the particular features, structures, or characteristics. Further, where particular features, structures, or characteristics have been described in connection with a specific aspect or implementation, it will be understood that such features, structures, or characteristics may be included with other aspects or implementations, whether or not explicitly described. Thus, various changes and modifications may be made to the preceding disclosure without departing from the scope or spirit of the invention, and the specification and drawings should therefore be regarded as exemplary only, with the scope of the invention determined solely by the appended claims.

Claims (24)

What is claimed is:
1. A system for providing a virtualized replication and high availability environment, wherein the system comprises:
a production server having hardware to host a virtualization architecture, wherein the virtualization architecture includes a parent partition that contains a virtualization stack having access to the hardware associated with the production server and one or more child partitions configured to execute one or more virtual machines;
a replica server having hardware to host the virtualization architecture; and
a replication and high availability engine installed in the parent partition associated with the production server, wherein the replication and high availability engine is configured to:
synchronize virtual machine files associated with the one or more virtual machines executed in the one or more child partitions to the replica server; and
run one or more replication scenarios to replicate changes to the virtual machine files associated with the one or more virtual machines to the replica server.
2. The system recited in claim 1, wherein the replication and high availability engine is further configured to invoke a volume shadow copy service writer associated with the virtualization stack to capture the virtual machine files synchronized to the replica server.
3. The system recited in claim 2, wherein the virtual machine files associated with the one or more virtual machines include virtual hard disk files having a *.vhd file format, configuration files having a *.xml file format, and snapshot files having a *.avhd file format.
4. The system recited in claim 1, wherein the virtualization architecture further includes a hypervisor configured to isolate the parent partition from the one or more child partitions.
5. The system recited in claim 1, wherein the virtualization stack includes one or more components configured to manage inter-partition communication between the parent partition and the one or more child partitions.
6. The system recited in claim 1, wherein the virtualization architecture hosted on the replica server includes a parent partition configured to:
create one or more on-demand virtual machines from the synchronized virtual machine files associated with the one or more replication scenarios;
start the one or more on-demand virtual machines in one or more child partitions; and
redirect end users and workloads from the one or more virtual machines executed in the one or more child partitions on the production server to the one or more on-demand virtual machines started in the one or more child partitions on the replica server.
7. The system recited in claim 6, wherein the parent partition is configured to automatically redirect the end users and the workloads to the one or more on-demand virtual machines on the replica server in response to a switchover event that relates to planned downtime associated with the production server or the virtual machines executed thereon.
8. The system recited in claim 6, wherein the parent partition is configured to automatically redirect the end users and the workloads to the one or more on-demand virtual machines on the replica server in response to a failover event that relates to unplanned downtime associated with the production server or the virtual machines executed thereon.
9. The system recited in claim 6, wherein the parent partition is configured to automatically redirect the end users and the workloads to the one or more on-demand virtual machines on the replica server to balance a load associated with the production server.
10. The system recited in claim 6, wherein the parent partition is configured to redirect the end users and the workloads to the one or more on-demand virtual machines on the replica server to test the one or more on-demand virtual machines on the replica server without disrupting the one or more virtual machines executing on the production server.
11. The system recited in claim 6, wherein the replica server is further configured to run the one or more replication scenarios in reverse to synchronize virtual machine files associated with the one or more on-demand virtual machines or changes to the virtual machine files associated with the one or more on-demand virtual machines to the production server.
12. The system recited in claim 11, wherein the replication and high availability engine is further configured to redirect the end users and the workloads from the one or more on-demand virtual machines on the replica server to the one or more virtual machines in the one or more child partitions on the production server in response to the replica server having run the one or more replication scenarios in reverse to perform switchback or failback from the replica server to the production server.
13. A method for providing a virtualized replication and high availability environment, comprising:
hosting a virtualization architecture on a production server, wherein the virtualization architecture includes a parent partition that contains a virtualization stack having access to hardware associated with the production server and one or more child partitions configured to execute one or more virtual machines;
synchronizing, via a replication and high availability engine installed in the parent partition associated with the production server, virtual machine files associated with the one or more virtual machines executed in the one or more child partitions to a replica server; and
running, via the replication and high availability engine, one or more replication scenarios to replicate changes to the virtual machine files associated with the one or more virtual machines to the replica server.
14. The method recited in claim 13, further comprising invoking, via the replication and high availability engine, a volume shadow copy service writer associated with the virtualization stack to capture the virtual machine files synchronized to the replica server.
15. The method recited in claim 14, wherein the virtual machine files associated with the one or more virtual machines include virtual hard disk files having a *.vhd file format, configuration files having a *.xml file format, and snapshot files having a *.avhd file format.
16. The method recited in claim 13, wherein the virtualization architecture further includes a hypervisor configured to isolate the parent partition from the one or more child partitions.
17. The method recited in claim 13, wherein the virtualization stack includes one or more components configured to manage inter-partition communication between the parent partition and the one or more child partitions.
18. The method recited in claim 13, further comprising:
creating, via the virtualization architecture hosted in a parent partition on the replica server, one or more on-demand virtual machines from the synchronized virtual machine files associated with the one or more replication scenarios;
starting, via the virtualization architecture hosted in a parent partition on the replica server, the one or more on-demand virtual machines in one or more child partitions; and
redirecting end users and workloads from the one or more virtual machines executed in the one or more child partitions on the production server to the one or more on-demand virtual machines started in the one or more child partitions on the replica server.
19. The method recited in claim 18, wherein the replica server automatically redirects the end users and the workloads to the one or more on-demand virtual machines on the replica server in response to a switchover event that relates to planned downtime associated with the production server or the virtual machines executed thereon.
20. The method recited in claim 18, wherein the replica server automatically redirects the end users and the workloads to the one or more on-demand virtual machines on the replica server in response to a failover event that relates to unplanned downtime associated with the production server or the virtual machines executed thereon.
21. The method recited in claim 18, wherein the replica server automatically redirects the end users and the workloads to the one or more on-demand virtual machines on the replica server to balance a load associated with the production server.
22. The method recited in claim 18, wherein the replica server automatically redirects the end users and the workloads to the one or more on-demand virtual machines on the replica server to test the one or more on-demand virtual machines on the replica server without disrupting the one or more virtual machines executing on the production server.
23. The method recited in claim 18, further comprising running the one or more replication scenarios on the replica server in reverse to synchronize virtual machine files associated with the one or more on-demand virtual machines or changes to the virtual machine files associated with the one or more on-demand virtual machines to the production server.
24. The method recited in claim 23, further comprising redirecting the end users and the workloads from the one or more on-demand virtual machines on the replica server to the one or more virtual machines in the one or more child partitions on the production server in response to the replica server having run the one or more replication scenarios in reverse to perform switchback or failback from the replica server to the production server.
US13/349,709 2012-01-13 2012-01-13 Providing a virtualized replication and high availability environment including a replication and high availability engine Active 2032-07-02 US8893147B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US13/349,709 US8893147B2 (en) 2012-01-13 2012-01-13 Providing a virtualized replication and high availability environment including a replication and high availability engine
US14/538,485 US9519656B2 (en) 2012-01-13 2014-11-11 System and method for providing a virtualized replication and high availability environment
US15/375,617 US10114834B2 (en) 2012-01-13 2016-12-12 Exogenous virtual machine synchronization and replication

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/349,709 US8893147B2 (en) 2012-01-13 2012-01-13 Providing a virtualized replication and high availability environment including a replication and high availability engine

Related Child Applications (1)

Application Number Priority Date Filing Date Title
US14/538,485 Continuation US9519656B2 (en) 2012-01-13 2014-11-11 System and method for providing a virtualized replication and high availability environment

Publications (2)

Publication Number Publication Date
US20130185716A1 (en) 2013-07-18
US8893147B2 US8893147B2 (en) 2014-11-18

Family

ID=48780907

Family Applications (3)

Application Number Priority Date Filing Date Title
US13/349,709 Active 2032-07-02 US8893147B2 (en) 2012-01-13 2012-01-13 Providing a virtualized replication and high availability environment including a replication and high availability engine
US14/538,485 Active US9519656B2 (en) 2012-01-13 2014-11-11 System and method for providing a virtualized replication and high availability environment
US15/375,617 Active US10114834B2 (en) 2012-01-13 2016-12-12 Exogenous virtual machine synchronization and replication

Family Applications After (2)

Application Number Priority Date Filing Date Title
US14/538,485 Active US9519656B2 (en) 2012-01-13 2014-11-11 System and method for providing a virtualized replication and high availability environment
US15/375,617 Active US10114834B2 (en) 2012-01-13 2016-12-12 Exogenous virtual machine synchronization and replication

Country Status (1)

Country Link
US (3) US8893147B2 (en)

Cited By (89)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110035358A1 (en) * 2009-08-07 2011-02-10 Dilip Naik Optimized copy of virtual machine storage files
US20130185408A1 (en) * 2012-01-18 2013-07-18 Dh2I Company Systems and Methods for Server Cluster Application Virtualization
US20130185586A1 (en) * 2012-01-18 2013-07-18 LineRate Systems, Inc. Self-healing of network service modules
US20140033179A1 (en) * 2012-07-30 2014-01-30 Hewlett-Packard Development Company Lp Application testing
US20140082624A1 (en) * 2012-09-19 2014-03-20 Fujitsu Semiconductor Limited Execution control method and multi-processor system
US20140280457A1 (en) * 2013-03-15 2014-09-18 State Farm Mutual Automobile Insurance Company Implementation of a web-scale data fabric
US8856790B1 (en) 2008-09-25 2014-10-07 Dell Software Inc. Systems and methods for data management in a virtual computing environment
US20140344809A1 (en) * 2013-05-16 2014-11-20 Vmware, Inc. Policy-based data placement in a virtualized computing environment
US8898114B1 (en) 2010-08-27 2014-11-25 Dell Software Inc. Multitier deduplication systems and methods
US20140366021A1 (en) * 2013-06-07 2014-12-11 American Megatrends, Inc. Methods, Devices and Computer Readable Storage Devices for Emulating an Accelerometer in a Guest Operating System from a Host Operating System
US20140366022A1 (en) * 2013-06-07 2014-12-11 American Megatrends, Inc. Methods, Devices and Computer Readable Storage Devices for Emulating a Magnetometer in a Guest Operating System from a Host Operating System
US20140379656A1 (en) * 2013-06-24 2014-12-25 Sap Ag System and Method for Maintaining a Cluster Setup
US20150046511A1 (en) * 2013-08-06 2015-02-12 Wal-Mart Stores, Inc. System and method for storing and processing web service requests
US8966318B1 (en) * 2012-04-27 2015-02-24 Symantec Corporation Method to validate availability of applications within a backup image
US8996468B1 (en) 2009-04-17 2015-03-31 Dell Software Inc. Block status mapping system for reducing virtual machine backup storage
CN104615476A (en) * 2013-11-01 2015-05-13 国际商业机器公司 Selected virtual machine replication and virtual machine restart techniques
US9069782B2 (en) 2012-10-01 2015-06-30 The Research Foundation For The State University Of New York System and method for security and privacy aware virtual machine checkpointing
US20150199210A1 (en) * 2014-01-15 2015-07-16 American Megatrends, Inc. Methods, Devices and Computer Readable Storage Devices for Confluence of Multiple Operating Systems
US20150205280A1 (en) * 2014-01-20 2015-07-23 Yokogawa Electric Corporation Process controller and updating method thereof
US9218176B1 (en) 2014-06-13 2015-12-22 International Business Machines Corporation Software deployment in a distributed virtual machine environment
US20160034295A1 (en) * 2014-07-30 2016-02-04 Microsoft Technology Licensing, Llc Hypervisor-hosted virtual machine forensics
US20160034548A1 (en) * 2014-07-31 2016-02-04 Dell Products, Lp System and Method for Obtaining Automated Scaling of a Virtual Desktop Environment
US9286102B1 (en) * 2014-11-05 2016-03-15 Vmware, Inc. Desktop image management for hosted hypervisor environments
US9311375B1 (en) * 2012-02-07 2016-04-12 Dell Software Inc. Systems and methods for compacting a virtual machine file
US9311318B1 (en) 2008-07-14 2016-04-12 Dell Software Inc. Backup systems and methods for a virtual computing environment
US9350668B2 (en) 2014-06-03 2016-05-24 The Viki Group, Inc. Systems and methods for IP sharing across wide area networks
US9378038B2 (en) 2013-06-07 2016-06-28 American Megatrends, Inc. Methods, devices and computer readable storage devices for emulating a gyroscope in a guest operating system from a host operating system
US20160232025A1 (en) * 2013-09-23 2016-08-11 Gopc Pty Ltd Virtual computing systems and methods
US9430182B2 (en) 2014-03-06 2016-08-30 American Megatrends, Inc. Methods, systems and computer readable storage devices for presenting screen content
US9436489B2 (en) 2013-12-20 2016-09-06 Red Hat Israel, Ltd. Virtual machine data replication with shared resources
US9465855B2 (en) 2013-10-22 2016-10-11 International Business Machines Corporation Maintaining two-site configuration for workload availability between sites at unlimited distances for products and services
US9569446B1 (en) 2010-06-08 2017-02-14 Dell Software Inc. Cataloging system for image-based backup
US20170060707A1 (en) * 2015-08-25 2017-03-02 International Business Machines Corporation High availability dynamic restart priority calculator
US20170091009A1 (en) * 2015-09-30 2017-03-30 International Business Machines Corporation Optimized diagnostic data collection driven by a ticketing system
CN106708603A (en) * 2016-12-28 2017-05-24 平安科技(深圳)有限公司 Virtual machine quick recovery method and device
US9734022B1 (en) * 2015-03-31 2017-08-15 EMC IP Holding Company LLC Identifying virtual machines and errors for snapshots
US20170235654A1 (en) 2016-02-12 2017-08-17 Nutanix, Inc. Virtualized file server resilience
US20170242866A1 (en) * 2016-02-23 2017-08-24 Canon Kabushiki Kaisha Management system and control method
US9760448B1 (en) * 2013-08-23 2017-09-12 Acronis International Gmbh Hot recovery of virtual machines
US9767284B2 (en) 2012-09-14 2017-09-19 The Research Foundation For The State University Of New York Continuous run-time validation of program execution: a practical approach
US9767271B2 (en) 2010-07-15 2017-09-19 The Research Foundation For The State University Of New York System and method for validating program execution at run-time
US20170289248A1 (en) * 2016-03-29 2017-10-05 Lsis Co., Ltd. Energy management server, energy management system and the method for operating the same
US20170329623A1 (en) * 2014-11-24 2017-11-16 Intel Corporation Support for application transparent, high available gpu computing with vm checkpointing
US9858097B2 (en) 2013-06-07 2018-01-02 American Megatrends, Inc. Methods, devices and computer readable storage devices for emulating rotation events in a guest operating system from a host operating system
US9882980B2 (en) 2013-10-22 2018-01-30 International Business Machines Corporation Managing continuous priority workload availability and general workload availability between sites at unlimited distances for products and services
US9973570B2 (en) 2015-05-01 2018-05-15 Hartford Fire Insurance Company System for providing an isolated testing model for disaster recovery capabilites
US20180157560A1 (en) * 2016-12-02 2018-06-07 Vmware, Inc. Methods and apparatus for transparent database switching using master-replica high availability setup in relational databases
US20180157677A1 (en) * 2016-12-06 2018-06-07 Nutanix, Inc. Cloning virtualized file servers
US10025873B2 (en) 2014-04-18 2018-07-17 Walmart Apollo, Llc System and method for storing and processing database requests
US10055307B2 (en) * 2015-06-30 2018-08-21 Vmware, Inc. Workflows for series of snapshots
US10169174B2 (en) * 2016-02-29 2019-01-01 International Business Machines Corporation Disaster recovery as a service using virtualization technique
US10228958B1 (en) * 2014-12-05 2019-03-12 Quest Software Inc. Systems and methods for archiving time-series data during high-demand intervals
US10419572B2 (en) 2013-08-06 2019-09-17 Walmart Apollo, Llc Caching system and method
US10510121B2 (en) 2013-08-16 2019-12-17 United Stated Automobile Association (USAA) System and method for performing dwelling maintenance analytics on insured property
US10552911B1 (en) 2014-01-10 2020-02-04 United Services Automobile Association (Usaa) Determining status of building modifications using informatics sensor data
US10552267B2 (en) * 2016-09-15 2020-02-04 International Business Machines Corporation Microcheckpointing with service processor
CN110851239A (en) * 2019-11-15 2020-02-28 湖南智领通信科技有限公司 TYPE-I TYPE hard real-time high-reliability full virtualization method
US20200104050A1 (en) * 2018-10-01 2020-04-02 EMC IP Holding Company LLC Dynamic multiple proxy deployment
US10614525B1 (en) 2014-03-05 2020-04-07 United Services Automobile Association (Usaa) Utilizing credit and informatic data for insurance underwriting purposes
US10613886B2 (en) 2015-06-30 2020-04-07 Vmware, Inc. Protecting virtual computing instances
US10713726B1 (en) 2013-01-13 2020-07-14 United Services Automobile Association (Usaa) Determining insurance policy modifications using informatic sensor data
US20200225972A1 (en) * 2019-01-14 2020-07-16 Vmware, Inc. Autonomously reproducing and destructing virtual machines
US10728090B2 (en) 2016-12-02 2020-07-28 Nutanix, Inc. Configuring network segmentation for a virtualization environment
US10803015B2 (en) 2013-08-06 2020-10-13 Walmart Apollo, Llc Caching system and method
US10824455B2 (en) 2016-12-02 2020-11-03 Nutanix, Inc. Virtualized server systems and methods including load balancing for virtualized file servers
US10873501B2 (en) 2016-12-09 2020-12-22 Vmware, Inc. Methods, systems and apparatus to propagate node configuration changes to services in a distributed environment
CN112463132A (en) * 2020-11-13 2021-03-09 四川新网银行股份有限公司 Database switching tool and switching method
US10970179B1 (en) * 2014-09-30 2021-04-06 Acronis International Gmbh Automated disaster recovery and data redundancy management systems and methods
US11016861B2 (en) 2019-04-11 2021-05-25 International Business Machines Corporation Crash recoverability for graphics processing units (GPU) in a computing environment
CN113037569A (en) * 2021-04-19 2021-06-25 杭州和利时自动化有限公司 Redundant service method, device, equipment and medium based on double servers
US11086826B2 (en) 2018-04-30 2021-08-10 Nutanix, Inc. Virtualized server systems and methods including domain joining techniques
US11087404B1 (en) 2014-01-10 2021-08-10 United Services Automobile Association (Usaa) Electronic sensor management
US20210365336A1 (en) * 2020-05-19 2021-11-25 EMC IP Holding Company LLC Cost-optimized true zero recovery time objective for multiple applications based on interdependent applications
US11194680B2 (en) 2018-07-20 2021-12-07 Nutanix, Inc. Two node clusters recovery on a failure
US11218418B2 (en) 2016-05-20 2022-01-04 Nutanix, Inc. Scalable leadership election in a multi-processing computing environment
US11281484B2 (en) 2016-12-06 2022-03-22 Nutanix, Inc. Virtualized server systems and methods including scaling of file system virtual machines
US11294777B2 (en) 2016-12-05 2022-04-05 Nutanix, Inc. Disaster recovery for distributed file servers, including metadata fixers
US11310286B2 (en) 2014-05-09 2022-04-19 Nutanix, Inc. Mechanism for providing external access to a secured networked virtualization environment
US11416941B1 (en) 2014-01-10 2022-08-16 United Services Automobile Association (Usaa) Electronic sensor management
US20220292197A1 (en) * 2021-03-14 2022-09-15 Microsoft Technology Licensing, Llc Automatic update of vm sets
US11467923B2 (en) * 2019-05-15 2022-10-11 Kyndryl, Inc. Application recovery using pooled resources
US11562034B2 (en) 2016-12-02 2023-01-24 Nutanix, Inc. Transparent referrals for distributed file servers
US11568073B2 (en) 2016-12-02 2023-01-31 Nutanix, Inc. Handling permissions for virtualized file servers
US11770447B2 (en) 2018-10-31 2023-09-26 Nutanix, Inc. Managing high-availability file servers
US11768809B2 (en) 2020-05-08 2023-09-26 Nutanix, Inc. Managing incremental snapshots for fast leader node bring-up
US11836512B2 (en) 2020-05-19 2023-12-05 EMC IP Holding Company LLC Virtual machine replication strategy based on predicted application failures
US11847666B1 (en) 2014-02-24 2023-12-19 United Services Automobile Association (Usaa) Determining status of building modifications using informatics sensor data
US11899957B2 (en) 2020-05-19 2024-02-13 EMC IP Holding Company LLC Cost-optimized true zero recovery time objective for multiple applications
US11934283B2 (en) 2020-05-19 2024-03-19 EMC IP Holding Company LLC Cost-optimized true zero recovery time objective for multiple applications using failure domains

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10248453B2 (en) 2012-10-23 2019-04-02 Red Hat Israel, Ltd. Client live migration for a virtual machine
US9854035B2 (en) 2013-05-28 2017-12-26 International Business Machines Corporation Maintaining state synchronization of an application between computing devices as well as maintaining state synchronization of common information between different applications without requiring periodic synchronization
US10044799B2 (en) * 2013-05-28 2018-08-07 International Business Machines Corporation Implementing synchronization of state information between instances of an application as well as between different applications in an efficient, scalable manner
WO2015118377A1 (en) 2014-02-04 2015-08-13 Telefonaktiebolaget L M Ericsson (Publ) Managing service availability in a mega virtual machine
WO2015163084A1 (en) * 2014-04-22 2015-10-29 オリンパス株式会社 Data processing system and data processing method
US9836476B2 (en) * 2014-09-25 2017-12-05 Netapp, Inc. Synchronizing configuration of partner objects across distributed storage systems using transformations
US20180150331A1 (en) * 2016-11-30 2018-05-31 International Business Machines Corporation Computing resource estimation in response to restarting a set of logical partitions
US11232000B1 (en) * 2017-02-24 2022-01-25 Amazon Technologies, Inc. Moving database partitions from replica nodes
CN108733460B (en) 2017-04-17 2022-11-29 伊姆西Ip控股有限责任公司 Method and apparatus for maintaining sessions for network storage devices
US10521273B2 (en) * 2017-06-08 2019-12-31 Cisco Technology, Inc. Physical partitioning of computing resources for server virtualization
US10721125B2 (en) * 2017-07-20 2020-07-21 Vmware, Inc. Systems and methods for update propagation between nodes in a distributed system
CN107678831A (en) * 2017-09-25 2018-02-09 郑州云海信息技术有限公司 Method for implementing V2V migration between virtualization platforms
US11182187B2 (en) 2018-04-17 2021-11-23 Red Hat Israel, Ltd. Dynamic network connectivity verification in distributed virtual environments
US10754785B2 (en) 2018-06-28 2020-08-25 Intel Corporation Checkpointing for DRAM-less SSD
CN109189550A (en) * 2018-08-03 2019-01-11 广州竞德信息技术有限公司 Control method for a virtualized server
US10986089B2 (en) * 2019-04-11 2021-04-20 Kas Kasravi Virtual mobile device system and method thereof
US11119750B2 (en) * 2019-05-23 2021-09-14 International Business Machines Corporation Decentralized offline program updating
US11080083B1 (en) * 2019-08-28 2021-08-03 Juniper Networks, Inc. Providing physical host hardware state information to virtual machines deployed on the physical host
US11372702B2 (en) 2019-10-22 2022-06-28 International Business Machines Corporation Optimized high availability management using cluster-wide view
US20210247993A1 (en) * 2020-02-06 2021-08-12 EMC IP Holding Company LLC True zero rto eliminating os and app load times
US11734136B1 (en) 2022-02-11 2023-08-22 International Business Machines Corporation Quick disaster recovery in distributed computing environment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080189700A1 (en) * 2007-02-02 2008-08-07 Vmware, Inc. Admission Control for Virtual Machine Cluster
US7840963B2 (en) * 2004-10-15 2010-11-23 Microsoft Corporation Marking and utilizing portions of memory state information during a switch between virtual machines to minimize software service interruption
US8495316B2 (en) * 2008-08-25 2013-07-23 Symantec Operating Corporation Efficient management of archival images of virtual machines having incremental snapshots

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE312378T1 (en) 2000-03-01 2005-12-15 Computer Ass Think Inc METHOD AND SYSTEM FOR UPDATING AN ARCHIVE OF A FILE
GB0112781D0 (en) 2001-05-25 2001-07-18 Global Continuity Plc Method for rapid recovery from a network file server failure
US7093086B1 (en) 2002-03-28 2006-08-15 Veritas Operating Corporation Disaster recovery and backup using virtual machines
US7478211B2 (en) 2004-01-09 2009-01-13 International Business Machines Corporation Maintaining consistency for remote copy using virtualization
US7840535B2 (en) 2004-11-05 2010-11-23 Computer Associates Think, Inc. Replicated data validation
US7899788B2 (en) * 2005-04-01 2011-03-01 Microsoft Corporation Using a data protection server to backup and restore data on virtual servers
US7165158B1 (en) 2005-08-17 2007-01-16 Hitachi, Ltd. System and method for migrating a replication system
US7814364B2 (en) 2006-08-31 2010-10-12 Dell Products, Lp On-demand provisioning of computer resources in physical/virtual cluster environments
US9098347B2 (en) 2006-12-21 2015-08-04 Vmware Implementation of virtual machine operations using storage system functionality
US8307177B2 (en) 2008-09-05 2012-11-06 Commvault Systems, Inc. Systems and methods for management of virtualization data
US7877639B2 (en) 2008-11-06 2011-01-25 Dell Products L.P. Systems and methods to provide failover support for booting embedded hypervisor from an internal non-volatile memory card
US8055937B2 (en) 2008-12-22 2011-11-08 QuorumLabs, Inc. High availability and disaster recovery using virtualization
US8732339B2 (en) 2009-03-24 2014-05-20 Hewlett-Packard Development Company, L.P. NPIV at storage devices
US9037718B2 (en) 2009-03-25 2015-05-19 Ntt Docomo, Inc. Method and apparatus for live replication
US20100262797A1 (en) 2009-04-10 2010-10-14 PHD Virtual Technologies Virtual machine data backup
US8335943B2 (en) 2009-06-22 2012-12-18 Citrix Systems, Inc. Systems and methods for stateful session failover between multi-core appliances
US8327181B2 (en) 2009-06-22 2012-12-04 Citrix Systems, Inc. Systems and methods for failover between multi-core appliances
US8751844B2 (en) 2009-09-24 2014-06-10 Citrix Systems, Inc. Systems and methods for attributing an amount of power consumption to a workload
US8495019B2 (en) 2011-03-08 2013-07-23 Ca, Inc. System and method for providing assured recovery and replication
US8301597B1 (en) 2011-09-16 2012-10-30 Ca, Inc. System and method for network file system server replication using reverse path lookup

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Cully et al; Remus: High Availability via Asynchronous Virtual Machine Replication; NSDI '08; 2008 *
Zhu et al; Optimizing the Performance of Virtual Machine Synchronization for Fault Tolerance; IEEE 4 Nov. 2010 *

Cited By (168)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9311318B1 (en) 2008-07-14 2016-04-12 Dell Software Inc. Backup systems and methods for a virtual computing environment
US8856790B1 (en) 2008-09-25 2014-10-07 Dell Software Inc. Systems and methods for data management in a virtual computing environment
US8996468B1 (en) 2009-04-17 2015-03-31 Dell Software Inc. Block status mapping system for reducing virtual machine backup storage
US9778946B2 (en) 2009-08-07 2017-10-03 Dell Software Inc. Optimized copy of virtual machine storage files
US20110035358A1 (en) * 2009-08-07 2011-02-10 Dilip Naik Optimized copy of virtual machine storage files
US9569446B1 (en) 2010-06-08 2017-02-14 Dell Software Inc. Cataloging system for image-based backup
US9767271B2 (en) 2010-07-15 2017-09-19 The Research Foundation For The State University Of New York System and method for validating program execution at run-time
US8898114B1 (en) 2010-08-27 2014-11-25 Dell Software Inc. Multitier deduplication systems and methods
US9515869B2 (en) * 2012-01-18 2016-12-06 Dh2I Company Systems and methods for server cluster application virtualization
US20130185586A1 (en) * 2012-01-18 2013-07-18 LineRate Systems, Inc. Self-healing of network service modules
US20130185408A1 (en) * 2012-01-18 2013-07-18 Dh2I Company Systems and Methods for Server Cluster Application Virtualization
US9311375B1 (en) * 2012-02-07 2016-04-12 Dell Software Inc. Systems and methods for compacting a virtual machine file
US8966318B1 (en) * 2012-04-27 2015-02-24 Symantec Corporation Method to validate availability of applications within a backup image
US20140033179A1 (en) * 2012-07-30 2014-01-30 Hewlett-Packard Development Company Lp Application testing
US9767284B2 (en) 2012-09-14 2017-09-19 The Research Foundation For The State University Of New York Continuous run-time validation of program execution: a practical approach
US20140082624A1 (en) * 2012-09-19 2014-03-20 Fujitsu Semiconductor Limited Execution control method and multi-processor system
US9552495B2 (en) 2012-10-01 2017-01-24 The Research Foundation For The State University Of New York System and method for security and privacy aware virtual machine checkpointing
US10324795B2 (en) 2012-10-01 2019-06-18 The Research Foundation for the State University o System and method for security and privacy aware virtual machine checkpointing
US9069782B2 (en) 2012-10-01 2015-06-30 The Research Foundation For The State University Of New York System and method for security and privacy aware virtual machine checkpointing
US10713726B1 (en) 2013-01-13 2020-07-14 United Services Automobile Association (Usaa) Determining insurance policy modifications using informatic sensor data
US8930581B2 (en) * 2013-03-15 2015-01-06 State Farm Mutual Automobile Insurance Company Implementation of a web-scale data fabric
US9948715B1 (en) 2013-03-15 2018-04-17 State Farm Mutual Automobile Insurance Company Implementation of a web-scale data fabric
US9208240B1 (en) 2013-03-15 2015-12-08 State Farm Mutual Automobile Insurance Company Implementation of a web scale data fabric
US20140280457A1 (en) * 2013-03-15 2014-09-18 State Farm Mutual Automobile Insurance Company Implementation of a web-scale data fabric
US9363322B1 (en) 2013-03-15 2016-06-07 State Farm Mutual Automobile Insurance Company Implementation of a web scale data fabric
US10715598B1 (en) 2013-03-15 2020-07-14 State Farm Mutual Automobile Insurance Company Implementation of a web-scale data fabric
US9015238B1 (en) 2013-03-15 2015-04-21 State Farm Mutual Automobile Insurance Company Implementation of a web scale data fabric
US9672053B2 (en) 2013-05-16 2017-06-06 Vmware, Inc. Service request processing
US9582297B2 (en) * 2013-05-16 2017-02-28 Vmware, Inc. Policy-based data placement in a virtualized computing environment
US20140344809A1 (en) * 2013-05-16 2014-11-20 Vmware, Inc. Policy-based data placement in a virtualized computing environment
US9407714B2 (en) 2013-05-16 2016-08-02 Vmware, Inc. Data refreshing of applications
US20140366022A1 (en) * 2013-06-07 2014-12-11 American Megatrends, Inc. Methods, Devices and Computer Readable Storage Devices for Emulating a Magnetometer in a Guest Operating System from a Host Operating System
US9858097B2 (en) 2013-06-07 2018-01-02 American Megatrends, Inc. Methods, devices and computer readable storage devices for emulating rotation events in a guest operating system from a host operating system
US9378038B2 (en) 2013-06-07 2016-06-28 American Megatrends, Inc. Methods, devices and computer readable storage devices for emulating a gyroscope in a guest operating system from a host operating system
US20140366021A1 (en) * 2013-06-07 2014-12-11 American Megatrends, Inc. Methods, Devices and Computer Readable Storage Devices for Emulating an Accelerometer in a Guest Operating System from a Host Operating System
US9031910B2 (en) * 2013-06-24 2015-05-12 Sap Se System and method for maintaining a cluster setup
US20140379656A1 (en) * 2013-06-24 2014-12-25 Sap Ag System and Method for Maintaining a Cluster Setup
US10803015B2 (en) 2013-08-06 2020-10-13 Walmart Apollo, Llc Caching system and method
US10419572B2 (en) 2013-08-06 2019-09-17 Walmart Apollo, Llc Caching system and method
US10116762B2 (en) * 2013-08-06 2018-10-30 Walmart Apollo, Llc System and method for storing and processing web service requests
US20150046511A1 (en) * 2013-08-06 2015-02-12 Wal-Mart Stores, Inc. System and method for storing and processing web service requests
US10510121B2 (en) 2013-08-16 2019-12-17 United Services Automobile Association (USAA) System and method for performing dwelling maintenance analytics on insured property
US9760448B1 (en) * 2013-08-23 2017-09-12 Acronis International Gmbh Hot recovery of virtual machines
US11663025B2 (en) * 2013-09-23 2023-05-30 Bankvault Pty Ltd Maintenance of and caching of suspended virtual computers in a pool of suspended virtual computers
US20160232025A1 (en) * 2013-09-23 2016-08-11 Gopc Pty Ltd Virtual computing systems and methods
US10084858B2 (en) * 2013-10-22 2018-09-25 International Business Machines Corporation Managing continuous priority workload availability and general workload availability between sites at unlimited distances for products and services
US9720741B2 (en) 2013-10-22 2017-08-01 International Business Machines Corporation Maintaining two-site configuration for workload availability between sites at unlimited distances for products and services
US9882980B2 (en) 2013-10-22 2018-01-30 International Business Machines Corporation Managing continuous priority workload availability and general workload availability between sites at unlimited distances for products and services
US11249815B2 (en) 2013-10-22 2022-02-15 International Business Machines Corporation Maintaining two-site configuration for workload availability between sites at unlimited distances for products and services
US9465855B2 (en) 2013-10-22 2016-10-11 International Business Machines Corporation Maintaining two-site configuration for workload availability between sites at unlimited distances for products and services
US9389970B2 (en) * 2013-11-01 2016-07-12 International Business Machines Corporation Selected virtual machine replication and virtual machine restart techniques
CN104615476A (en) * 2013-11-01 2015-05-13 国际商业机器公司 Selected virtual machine replication and virtual machine restart techniques
US9436489B2 (en) 2013-12-20 2016-09-06 Red Hat Israel, Ltd. Virtual machine data replication with shared resources
US11164257B1 (en) 2014-01-10 2021-11-02 United Services Automobile Association (Usaa) Streamlined property insurance application and renewal process
US11532004B1 (en) 2014-01-10 2022-12-20 United Services Automobile Association (Usaa) Utilizing credit and informatic data for insurance underwriting purposes
US11151657B1 (en) 2014-01-10 2021-10-19 United Services Automobile Association (Usaa) Insurance policy modification based on secondary informatics
US10552911B1 (en) 2014-01-10 2020-02-04 United Services Automobile Association (Usaa) Determining status of building modifications using informatics sensor data
US11526948B1 (en) 2014-01-10 2022-12-13 United Services Automobile Association (Usaa) Identifying and recommending insurance policy products/services using informatic sensor data
US11461850B1 (en) 2014-01-10 2022-10-04 United Services Automobile Association (Usaa) Determining insurance policy modifications using informatic sensor data
US11423429B1 (en) 2014-01-10 2022-08-23 United Services Automobile Association (Usaa) Determining status of building modifications using informatics sensor data
US11416941B1 (en) 2014-01-10 2022-08-16 United Services Automobile Association (Usaa) Electronic sensor management
US10783588B1 (en) 2014-01-10 2020-09-22 United Services Automobile Association (Usaa) Identifying and recommending insurance policy products/services using informatic sensor data
US10679296B1 (en) 2014-01-10 2020-06-09 United Services Automobile Association (Usaa) Systems and methods for determining insurance coverage based on informatics
US10977736B1 (en) 2014-01-10 2021-04-13 United Services Automobile Association (Usaa) Determining risks related to activities on insured properties using informatic sensor data
US11227339B1 (en) 2014-01-10 2022-01-18 United Services Automobile Association (Usaa) Systems and methods for utilizing imaging informatics
US10699348B1 (en) 2014-01-10 2020-06-30 United Services Automobile Association (Usaa) Utilizing credit and informatic data for insurance underwriting purposes
US11532006B1 (en) 2014-01-10 2022-12-20 United Services Automobile Association (Usaa) Determining and initiating insurance claim events
US11526949B1 (en) 2014-01-10 2022-12-13 United Services Automobile Association (Usaa) Determining risks related to activities on insured properties using informatic sensor data
US11138672B1 (en) 2014-01-10 2021-10-05 United Services Automobile Association (Usaa) Determining and initiating insurance claim events
US11120506B1 (en) 2014-01-10 2021-09-14 United Services Automobile Association (Usaa) Streamlined property insurance application and renewal process
US11113765B1 (en) 2014-01-10 2021-09-07 United Services Automobile Association (Usaa) Determining appliance insurance coverage/products using informatic sensor data
US11087404B1 (en) 2014-01-10 2021-08-10 United Services Automobile Association (Usaa) Electronic sensor management
US10740847B1 (en) 2014-01-10 2020-08-11 United Services Automobile Association (Usaa) Method and system for making rapid insurance policy decisions
US11068992B1 (en) 2014-01-10 2021-07-20 United Services Automobile Association (Usaa) Insurance policy modifications using informatic sensor data
US20150199210A1 (en) * 2014-01-15 2015-07-16 American Megatrends, Inc. Methods, Devices and Computer Readable Storage Devices for Confluence of Multiple Operating Systems
US20150205280A1 (en) * 2014-01-20 2015-07-23 Yokogawa Electric Corporation Process controller and updating method thereof
US9869984B2 (en) * 2014-01-20 2018-01-16 Yokogawa Electric Corporation Process controller and updating method thereof
US11847666B1 (en) 2014-02-24 2023-12-19 United Services Automobile Association (Usaa) Determining status of building modifications using informatics sensor data
US10614525B1 (en) 2014-03-05 2020-04-07 United Services Automobile Association (Usaa) Utilizing credit and informatic data for insurance underwriting purposes
US9430182B2 (en) 2014-03-06 2016-08-30 American Megatrends, Inc. Methods, systems and computer readable storage devices for presenting screen content
US10025873B2 (en) 2014-04-18 2018-07-17 Walmart Apollo, Llc System and method for storing and processing database requests
US10671695B2 (en) 2014-04-18 2020-06-02 Walmart Apollo, Llc System and method for storing and processing database requests
US11310286B2 (en) 2014-05-09 2022-04-19 Nutanix, Inc. Mechanism for providing external access to a secured networked virtualization environment
US9350668B2 (en) 2014-06-03 2016-05-24 The Viki Group, Inc. Systems and methods for IP sharing across wide area networks
US9304752B2 (en) 2014-06-13 2016-04-05 International Business Machines Corporation Software deployment in a distributed virtual machine environment
US9218176B1 (en) 2014-06-13 2015-12-22 International Business Machines Corporation Software deployment in a distributed virtual machine environment
US9851998B2 (en) * 2014-07-30 2017-12-26 Microsoft Technology Licensing, Llc Hypervisor-hosted virtual machine forensics
US10169071B2 (en) 2014-07-30 2019-01-01 Microsoft Technology Licensing, Llc Hypervisor-hosted virtual machine forensics
US20160034295A1 (en) * 2014-07-30 2016-02-04 Microsoft Technology Licensing, Llc Hypervisor-hosted virtual machine forensics
US20160034548A1 (en) * 2014-07-31 2016-02-04 Dell Products, Lp System and Method for Obtaining Automated Scaling of a Virtual Desktop Environment
US10970179B1 (en) * 2014-09-30 2021-04-06 Acronis International Gmbh Automated disaster recovery and data redundancy management systems and methods
US9286102B1 (en) * 2014-11-05 2016-03-15 Vmware, Inc. Desktop image management for hosted hypervisor environments
US10996968B2 (en) * 2014-11-24 2021-05-04 Intel Corporation Support for application transparent, high available GPU computing with VM checkpointing
US20170329623A1 (en) * 2014-11-24 2017-11-16 Intel Corporation Support for application transparent, high available gpu computing with vm checkpointing
US10228958B1 (en) * 2014-12-05 2019-03-12 Quest Software Inc. Systems and methods for archiving time-series data during high-demand intervals
US9734022B1 (en) * 2015-03-31 2017-08-15 EMC IP Holding Company LLC Identifying virtual machines and errors for snapshots
US9973570B2 (en) 2015-05-01 2018-05-15 Hartford Fire Insurance Company System for providing an isolated testing model for disaster recovery capabilities
US10055307B2 (en) * 2015-06-30 2018-08-21 Vmware, Inc. Workflows for series of snapshots
US10613886B2 (en) 2015-06-30 2020-04-07 Vmware, Inc. Protecting virtual computing instances
US20170060707A1 (en) * 2015-08-25 2017-03-02 International Business Machines Corporation High availability dynamic restart priority calculator
US9852035B2 (en) * 2015-08-25 2017-12-26 International Business Machines Corporation High availability dynamic restart priority calculator
US20170091009A1 (en) * 2015-09-30 2017-03-30 International Business Machines Corporation Optimized diagnostic data collection driven by a ticketing system
US10255127B2 (en) * 2015-09-30 2019-04-09 International Business Machines Corporation Optimized diagnostic data collection driven by a ticketing system
US11550558B2 (en) 2016-02-12 2023-01-10 Nutanix, Inc. Virtualized file server deployment
US10540164B2 (en) 2016-02-12 2020-01-21 Nutanix, Inc. Virtualized file server upgrade
US10838708B2 (en) 2016-02-12 2020-11-17 Nutanix, Inc. Virtualized file server backup to cloud
US11544049B2 (en) 2016-02-12 2023-01-03 Nutanix, Inc. Virtualized file server disaster recovery
US10540165B2 (en) 2016-02-12 2020-01-21 Nutanix, Inc. Virtualized file server rolling upgrade
US11537384B2 (en) 2016-02-12 2022-12-27 Nutanix, Inc. Virtualized file server distribution across clusters
US10949192B2 (en) 2016-02-12 2021-03-16 Nutanix, Inc. Virtualized file server data sharing
US11550557B2 (en) 2016-02-12 2023-01-10 Nutanix, Inc. Virtualized file server
US10540166B2 (en) 2016-02-12 2020-01-21 Nutanix, Inc. Virtualized file server high availability
US10809998B2 (en) 2016-02-12 2020-10-20 Nutanix, Inc. Virtualized file server splitting and merging
US20170235591A1 (en) 2016-02-12 2017-08-17 Nutanix, Inc. Virtualized file server block awareness
US20170235654A1 (en) 2016-02-12 2017-08-17 Nutanix, Inc. Virtualized file server resilience
US11669320B2 (en) 2016-02-12 2023-06-06 Nutanix, Inc. Self-healing virtualized file server
US11579861B2 (en) 2016-02-12 2023-02-14 Nutanix, Inc. Virtualized file server smart data ingestion
US11645065B2 (en) 2016-02-12 2023-05-09 Nutanix, Inc. Virtualized file server user views
US10831465B2 (en) 2016-02-12 2020-11-10 Nutanix, Inc. Virtualized file server distribution across clusters
US11550559B2 (en) 2016-02-12 2023-01-10 Nutanix, Inc. Virtualized file server rolling upgrade
US11106447B2 (en) 2016-02-12 2021-08-31 Nutanix, Inc. Virtualized file server user views
US10719307B2 (en) 2016-02-12 2020-07-21 Nutanix, Inc. Virtualized file server block awareness
US10719305B2 (en) 2016-02-12 2020-07-21 Nutanix, Inc. Virtualized file server tiers
US10719306B2 (en) 2016-02-12 2020-07-21 Nutanix, Inc. Virtualized file server resilience
US11922157B2 (en) 2016-02-12 2024-03-05 Nutanix, Inc. Virtualized file server
US11526468B2 (en) * 2016-02-23 2022-12-13 Canon Kabushiki Kaisha Management system and control method
US20170242866A1 (en) * 2016-02-23 2017-08-24 Canon Kabushiki Kaisha Management system and control method
US10169174B2 (en) * 2016-02-29 2019-01-01 International Business Machines Corporation Disaster recovery as a service using virtualization technique
US20170289248A1 (en) * 2016-03-29 2017-10-05 Lsis Co., Ltd. Energy management server, energy management system and the method for operating the same
US10567501B2 (en) * 2016-03-29 2020-02-18 Lsis Co., Ltd. Energy management server, energy management system and the method for operating the same
US11218418B2 (en) 2016-05-20 2022-01-04 Nutanix, Inc. Scalable leadership election in a multi-processing computing environment
US11888599B2 (en) 2016-05-20 2024-01-30 Nutanix, Inc. Scalable leadership election in a multi-processing computing environment
US10552267B2 (en) * 2016-09-15 2020-02-04 International Business Machines Corporation Microcheckpointing with service processor
US11016857B2 (en) 2016-09-15 2021-05-25 International Business Machines Corporation Microcheckpointing with service processor
US11562034B2 (en) 2016-12-02 2023-01-24 Nutanix, Inc. Transparent referrals for distributed file servers
US10824455B2 (en) 2016-12-02 2020-11-03 Nutanix, Inc. Virtualized server systems and methods including load balancing for virtualized file servers
US11568073B2 (en) 2016-12-02 2023-01-31 Nutanix, Inc. Handling permissions for virtualized file servers
US20180157560A1 (en) * 2016-12-02 2018-06-07 Vmware, Inc. Methods and apparatus for transparent database switching using master-replica high availability setup in relational databases
US10728090B2 (en) 2016-12-02 2020-07-28 Nutanix, Inc. Configuring network segmentation for a virtualization environment
US10776385B2 (en) * 2016-12-02 2020-09-15 Vmware, Inc. Methods and apparatus for transparent database switching using master-replica high availability setup in relational databases
US11294777B2 (en) 2016-12-05 2022-04-05 Nutanix, Inc. Disaster recovery for distributed file servers, including metadata fixers
US11775397B2 (en) 2016-12-05 2023-10-03 Nutanix, Inc. Disaster recovery for distributed file servers, including metadata fixers
US11288239B2 (en) * 2016-12-06 2022-03-29 Nutanix, Inc. Cloning virtualized file servers
US11281484B2 (en) 2016-12-06 2022-03-22 Nutanix, Inc. Virtualized server systems and methods including scaling of file system virtual machines
US20180157677A1 (en) * 2016-12-06 2018-06-07 Nutanix, Inc. Cloning virtualized file servers
US11922203B2 (en) 2016-12-06 2024-03-05 Nutanix, Inc. Virtualized server systems and methods including scaling of file system virtual machines
US10873501B2 (en) 2016-12-09 2020-12-22 Vmware, Inc. Methods, systems and apparatus to propagate node configuration changes to services in a distributed environment
CN106708603A (en) * 2016-12-28 2017-05-24 平安科技(深圳)有限公司 Virtual machine quick recovery method and device
US11675746B2 (en) 2018-04-30 2023-06-13 Nutanix, Inc. Virtualized server systems and methods including domain joining techniques
US11086826B2 (en) 2018-04-30 2021-08-10 Nutanix, Inc. Virtualized server systems and methods including domain joining techniques
US11194680B2 (en) 2018-07-20 2021-12-07 Nutanix, Inc. Two node clusters recovery on a failure
US20200104050A1 (en) * 2018-10-01 2020-04-02 EMC IP Holding Company LLC Dynamic multiple proxy deployment
US10929048B2 (en) * 2018-10-01 2021-02-23 EMC IP Holding Company LLC Dynamic multiple proxy deployment
US11770447B2 (en) 2018-10-31 2023-09-26 Nutanix, Inc. Managing high-availability file servers
US11080079B2 (en) * 2019-01-14 2021-08-03 Vmware, Inc. Autonomously reproducing and destructing virtual machines
US20200225972A1 (en) * 2019-01-14 2020-07-16 Vmware, Inc. Autonomously reproducing and destructing virtual machines
US11016861B2 (en) 2019-04-11 2021-05-25 International Business Machines Corporation Crash recoverability for graphics processing units (GPU) in a computing environment
US11467923B2 (en) * 2019-05-15 2022-10-11 Kyndryl, Inc. Application recovery using pooled resources
CN110851239A (en) * 2019-11-15 2020-02-28 湖南智领通信科技有限公司 Type-I hard real-time high-reliability full virtualization method
US11768809B2 (en) 2020-05-08 2023-09-26 Nutanix, Inc. Managing incremental snapshots for fast leader node bring-up
US20210365336A1 (en) * 2020-05-19 2021-11-25 EMC IP Holding Company LLC Cost-optimized true zero recovery time objective for multiple applications based on interdependent applications
US11797400B2 (en) * 2020-05-19 2023-10-24 EMC IP Holding Company LLC Cost-optimized true zero recovery time objective for multiple applications based on interdependent applications
US11836512B2 (en) 2020-05-19 2023-12-05 EMC IP Holding Company LLC Virtual machine replication strategy based on predicted application failures
US11934283B2 (en) 2020-05-19 2024-03-19 EMC IP Holding Company LLC Cost-optimized true zero recovery time objective for multiple applications using failure domains
US11899957B2 (en) 2020-05-19 2024-02-13 EMC IP Holding Company LLC Cost-optimized true zero recovery time objective for multiple applications
CN112463132A (en) * 2020-11-13 2021-03-09 四川新网银行股份有限公司 Database switching tool and switching method
US20220292197A1 (en) * 2021-03-14 2022-09-15 Microsoft Technology Licensing, Llc Automatic update of vm sets
CN113037569A (en) * 2021-04-19 2021-06-25 杭州和利时自动化有限公司 Redundant service method, device, equipment and medium based on double servers

Also Published As

Publication number Publication date
US9519656B2 (en) 2016-12-13
US8893147B2 (en) 2014-11-18
US10114834B2 (en) 2018-10-30
US20170091221A1 (en) 2017-03-30
US20150066844A1 (en) 2015-03-05

Similar Documents

Publication Publication Date Title
US10114834B2 (en) Exogenous virtual machine synchronization and replication
US11797395B2 (en) Application migration between environments
US11579991B2 (en) Dynamic allocation of compute resources at a recovery site
US11074143B2 (en) Data backup and disaster recovery between environments
US9747179B2 (en) Data management agent for selective storage re-caching
US9727429B1 (en) Method and system for immediate recovery of replicated virtual machines
US11663085B2 (en) Application backup and management
US9201736B1 (en) Methods and apparatus for recovery of complex assets in distributed information processing systems
CN107111533B (en) Virtual machine cluster backup
US9552405B1 (en) Methods and apparatus for recovery of complex assets in distributed information processing systems
US10713183B2 (en) Virtual machine backup using snapshots and current configuration
US9977704B1 (en) Automated backup and replication of virtual machine data centers
US9037822B1 (en) Hierarchical volume tree
Yang et al. On improvement of cloud virtual machine availability with virtualization fault tolerance mechanism
US9423956B2 (en) Emulating a stretched storage device using a shared storage device
US20130091376A1 (en) Self-repairing database system
EP3750066B1 (en) Protection of infrastructure-as-a-service workloads in public cloud
US10055309B1 (en) Parallel restoration of a virtual machine's virtual machine disks
US9442811B2 (en) Emulating a stretched storage device using a shared replicated storage device
US9602341B1 (en) Secure multi-tenant virtual control server operation in a cloud environment using API provider
US11880282B2 (en) Container-based application data protection method and system
US10756953B1 (en) Method and system of seamlessly reconfiguring a data center after a failure
US20230267052A1 (en) Containerized data mover for data protection workloads
US20240028488A1 (en) Application data protection method and system

Legal Events

Date Code Title Description
AS Assignment
Owner name: COMPUTER ASSOCIATES THINK, INC., NEW YORK
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:YIN, JINXING;DUN, PENGCHENG;REEL/FRAME:027527/0863
Effective date: 20110929

AS Assignment
Owner name: CA, INC., NEW YORK
Free format text: MERGER;ASSIGNOR:COMPUTER ASSOCIATES THINK, INC.;REEL/FRAME:033974/0639
Effective date: 20120327

STCF Information on status: patent grant
Free format text: PATENTED CASE

FEPP Fee payment procedure
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

MAFP Maintenance fee payment
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551)
Year of fee payment: 4

MAFP Maintenance fee payment
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
Year of fee payment: 8