Patent Analysis of

Increment resynchronization in hash-based replication

Updated Time 12 June 2019

Patent Registration Data

Publication Number: US10152527
Application Number: US14/979890
Application Date: 28 December 2015
Publication Date: 11 December 2018
Current Assignee: EMC IP HOLDING COMPANY LLC
Original Assignee (Applicant): EMC CORPORATION
International Classification: G06F17/30
Cooperative Classification: G06F17/30575, G06F17/3033, G06F16/27, G06F16/2255
Inventor: MEIRI, DAVID; LEMPEL, IRIT


Abstract

In one aspect, a method includes selecting a C-module; sending a write from a host to the selected C-module; selecting a D-module to commit a page related to the write; selecting an R-module to transmit data from the write to a target; writing the data from the write to a target location; and writing the data to an address-to-hash table after acknowledgement that the data has been written to the target location and after the D-module acknowledges that the page is committed.


Claims

1. A method for storing data in a computer based data storage system, the computer based data storage system comprising a first storage subsystem and a second storage subsystem, the first storage subsystem being configured to store data in a consistency group of the first storage subsystem, and replicate the consistency group to the second storage subsystem, the method comprising:

receiving, at the first storage subsystem, a request to write a data payload; storing the data payload in the consistency group of the first storage subsystem, the storing being performed by a module in the first storage subsystem; initiating, by the module, a replication of the data payload to the second storage subsystem; stopping all replication of the consistency group to the second storage subsystem in response to detecting that the replication of the data payload is unsuccessful; synchronizing the first storage subsystem with the second storage subsystem; and updating, by the module, at least one address-to-hash (A2H) table in the first storage subsystem to identify an address associated with the data payload after the first storage subsystem and the second storage subsystem are synchronized, wherein the module is configured to update the A2H table after: (i) the data payload has been stored in the consistency group of the first storage subsystem, and (ii) the module has received an acknowledgment that the data payload has been successfully replicated to the second storage subsystem.

2. The method of claim 1, wherein stopping all replication of the consistency group includes:

instructing the module to stop acknowledging write requests; instructing the module to take a snapshot of the consistency group; and notifying the module to stop the replication of the consistency group to the second storage subsystem.

3. The method of claim 2, wherein synchronizing the first storage subsystem with the second storage subsystem includes:

comparing data that is currently stored in the consistency group with the snapshot; and sending any data that is currently stored in the consistency group and not identified in the snapshot to the second storage subsystem.

4. An apparatus for synchronous replication in a computer based data storage system, comprising:

one or more storage devices configured to implement a consistency group; and electronic hardware circuitry that is operatively coupled to the one or more storage devices, the electronic hardware circuitry being configured to: receive a request to write a data payload; store the data payload in the consistency group; initiate a replication of the data payload to a replication subsystem; stop all replication of the consistency group to the replication subsystem in response to detecting that the replication of the data payload is unsuccessful; synchronize the consistency group with a replica of the consistency group that is stored at the replication subsystem; and update at least one address-to-hash (A2H) table in the first storage subsystem to identify an address associated with the data payload after the consistency group is synchronized with the replica of the consistency group, wherein the A2H table is updated after: (i) the data payload has been stored in the consistency group, and (ii) the data payload has been successfully replicated to the replication subsystem.

5. The apparatus of claim 4, wherein the electronic hardware circuitry comprises at least one of a processor, a memory, a programmable logic device or a logic gate.

6. The apparatus of claim 4, further comprising taking a snapshot of the consistency group when the replication of the consistency group is stopped.

7. The apparatus of claim 6, wherein synchronizing the consistency group with the replica of the consistency group includes:

comparing data that is currently stored in the consistency group with the snapshot; and sending any data that is currently stored in the consistency group and not identified in the snapshot to the replication subsystem.

8. A non-transitory computer-readable medium storing one or more processor-executable instructions, which executed by one or more processors, cause the one or more processors to perform a method for storing data in a computer based data storage system, the computer based data storage system comprising a first storage subsystem and a second storage subsystem, the first storage subsystem being configured to store data in a consistency group of the first storage subsystem, and replicate the consistency group to the second storage subsystem, the method comprising:

receiving, at the first storage subsystem, a request to write a data payload; storing the data payload in the consistency group of the first storage subsystem, the storing being performed by a module in the first storage subsystem; initiating, by the module, a replication of the data payload to the second storage subsystem; stopping all replication of the consistency group to the second storage subsystem in response to detecting that the replication of the data payload is unsuccessful; synchronizing the first storage subsystem with the second storage subsystem; and updating, by the module, at least one address-to-hash (A2H) table in the first storage subsystem to identify an address associated with the data payload after the first storage subsystem and the second storage subsystem are synchronized, wherein the module is configured to update the A2H table after: (i) the data payload has been stored in the consistency group of the first storage subsystem, and (ii) the module has received an acknowledgment that the data payload has been successfully replicated to the second storage subsystem.

9. The non-transitory computer-readable medium of claim 8, wherein stopping all replication of the consistency group includes:

instructing the module to stop acknowledging write requests; instructing the module to take a snapshot of the consistency group; and notifying the module to stop the replication of the consistency group to the second storage subsystem.

10. The non-transitory computer-readable medium of claim 9, wherein synchronizing the first storage subsystem with the second storage subsystem includes:

comparing data that is currently stored in the consistency group with the snapshot; and sending any data that is currently stored in the consistency group and not identified in the snapshot to the second storage subsystem.


Description

BACKGROUND

Storage systems in general, and block based storage systems specifically, are a key element in modern data centers and computing infrastructure. These systems are designed to store and retrieve large amounts of data, by providing data block address and data block content—for storing a block of data—and by providing a data block address for retrieval of the data block content that is stored at the specified address.

Storage solutions are typically partitioned into categories based on a use case and application within a computing infrastructure, and a key distinction exists between primary storage solutions and archiving storage solutions. Primary storage is typically used as the main storage pool for computing applications during application run-time. As such, the performance of primary storage systems is very often a key challenge and a major potential bottleneck in overall application performance, since storage and retrieval of data consumes time and delays the completion of application processing. Storage systems designed for archiving applications are much less sensitive to performance constraints, as they are not part of the run-time application processing.

In general, computer systems grow over their lifetime, and the data under management tends to grow with them. Growth can be exponential, and in both primary and archiving storage systems, the exponential capacity growth typical of modern computing environments presents a major challenge, as it results in increased cost, space, and power consumption of the storage systems required to support ever-increasing amounts of information.

Existing storage solutions, and especially primary storage solutions, rely on address-based mapping of data, as well as address-based functionality of the storage system's internal algorithms. This is only natural since the computing applications always rely on address-based mapping and identification of data they store and retrieve. However, a completely different scheme in which data, internally within the storage system, is mapped and managed based on its content instead of its address has many substantial advantages. For example, it improves storage capacity efficiency since any duplicate block data will only occupy actual capacity of a single instance of that block. As another example, it improves performance since duplicate block writes do not need to be executed internally in the storage system. Existing storage systems, whether primary storage systems or archiving storage systems, are incapable of supporting the combination of content-based storage, with its numerous advantages, and ultra-high performance.

A number of issues arise with respect to such devices, and it is necessary to consider such issues as performance, lifetime and resilience to failure of individual devices, overall speed of response and the like.

Such devices may be used in highly demanding circumstances where failure to process data correctly can be extremely serious, or where large scales are involved, and where the system has to be able to cope with sudden surges in demand.

SUMMARY

In one aspect, a method includes selecting a C-module; sending a write from a host to the selected C-module; selecting a D-module to commit a page related to the write; selecting an R-module to transmit data from the write to a target; writing the data from the write to a target location; and writing the data to an address-to-hash table after acknowledgement that the data has been written to the target location and after the D-module acknowledges that the page is committed.

In another aspect, an apparatus includes electronic hardware circuitry configured to select a C-module; send a write from a host to the selected C-module; select a D-module to commit a page related to the write; select an R-module to transmit data from the write to a target; write the data from the write to a target location; and write the data to an address-to-hash table after acknowledgement that the data has been written to the target location and after the D-module acknowledges that the page is committed.

In a further aspect, an article includes a non-transitory computer-readable medium that stores computer-executable instructions. The instructions cause a machine to select a C-module; send a write from a host to the selected C-module; select a D-module to commit a page related to the write; select an R-module to transmit data from the write to a target; write the data from the write to a target location; and write the data to an address-to-hash table after acknowledgement that the data has been written to the target location and after the D-module acknowledges that the page is committed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a simplified block diagram of an example of a system using content addressable storage (CAS).

FIG. 2 is a simplified block diagram of an example of a configuration of modules of the system of FIG. 1.

FIG. 3 is a simplified block diagram of an example of a data protection system using CAS, according to an embodiment of the disclosure.

FIG. 4 is a flowchart of an example of a process to perform a write, according to an embodiment of the disclosure.

FIG. 5 is a flowchart of an example of a process to trip a consistency group, according to an embodiment of the disclosure.

FIG. 6 is a flowchart of an example of a process to perform increment synchronization, according to an embodiment of the disclosure.

FIG. 7 is a computer on which all or part of the processes of FIGS. 4 to 6 may be implemented, according to an embodiment of the disclosure.

DETAILED DESCRIPTION

In example embodiments of content-addressable storage (CAS) arrays, volume data is stored as a combination of an address-to-hash (A2H) metadata table and backend hash-indexed disk storage. In synchronous replication in example embodiments, a consistency group (cgroup) trip requires suspending replication and resuming it later with an increment resynchronization that copies only data written since the cgroup trip. In one example, a cgroup trip is a management event that, in response to link issues or complete link loss, suspends replication consistently for volumes in the entire group, leaving a consistent replica on the target.

Described herein are techniques that enable taking an instantaneous snapshot of the volume state (using the A2H table) and resuming replication later by copying the hash keys written since the cgroup trip (i.e., the ones not yet transmitted). In certain embodiments, these techniques account for writes that occurred after the cgroup trip, including inflight I/Os (e.g., I/Os in process after replication has stopped), aborted I/Os, and any other link events that may get in the way of a correct increment copy. In other embodiments, the techniques also ensure that, when the increment resynchronization process is complete, no piece of data (i.e., a hash signature) that was written to the source volume is missing from the target volume. Moreover, in further embodiments, these techniques minimize the excess data transmitted during the increment resynchronization, i.e., data that is already on the target and does not need to be resent.

In an example embodiment of a CAS array, data is stored in blocks, for example of 4 KB, where each block has a unique large hash signature, for example of 20 bytes, saved on Flash memory. The examples described herein include a networked memory system. In certain embodiments, the networked memory system includes multiple memory storage units arranged for content addressable storage of data. In some embodiments, the data may be transferred to and from the storage units using separate data and control planes. In other embodiments, hashing may be used for the content addressing, and the hashing may produce evenly distributed results over the allowed input range. In certain embodiments, the hashing defines the physical addresses so that data storage, for example, may make even use of the system resources.

An example embodiment of a CAS array can be used to ensure that data appearing twice is stored at a single location, with two pointers pointing at that single location. Hence unnecessary duplicate write operations can be identified and avoided in example embodiments. Such a feature may be included in certain embodiments of the present system as data deduplication. As well as making the system more efficient overall, it also increases the lifetime of those storage units that are limited by the number of write/erase operations. In certain embodiments, deduplication of data, meaning ensuring that the same data is not stored twice in different places, is an inherent effect of using Content-Based mapping of data to D-Modules and within D-Modules.
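The relationship between the A2H table, the hash-indexed backend, and deduplication can be illustrated with a minimal sketch. It is not the patented implementation: the class name, the in-memory dictionaries, the reference counting, and the choice of SHA-1 (picked only because it yields a 20-byte signature) are all assumptions made for illustration.

```python
import hashlib

PAGE_SIZE = 4 * 1024  # 4 KB blocks, matching the example block size in the text


class ContentAddressableVolume:
    """Toy model (hypothetical): an A2H table plus a hash-indexed backend."""

    def __init__(self):
        self.a2h = {}       # logical address -> 20-byte hash signature
        self.backend = {}   # hash signature -> block content (stored once)
        self.refcount = {}  # hash signature -> number of addresses using it

    def write(self, address, block):
        if len(block) != PAGE_SIZE:
            raise ValueError("block must be exactly one page")
        # SHA-1 is an assumption; the text only says "20 bytes".
        signature = hashlib.sha1(block).digest()
        old = self.a2h.get(address)
        if old == signature:
            return signature  # same content already mapped at this address
        if signature not in self.backend:
            self.backend[signature] = block  # first copy: store the content
        # Duplicate content only adds another pointer to the stored block.
        self.refcount[signature] = self.refcount.get(signature, 0) + 1
        if old is not None:
            self._release(old)
        self.a2h[address] = signature
        return signature

    def read(self, address):
        return self.backend[self.a2h[address]]

    def _release(self, signature):
        # Drop the block once no address points at it any more.
        self.refcount[signature] -= 1
        if self.refcount[signature] == 0:
            del self.refcount[signature]
            del self.backend[signature]
```

Writing the same 4 KB block to two addresses leaves one entry in `backend` and two entries in `a2h`, which is the "two pointers, one location" behavior described above.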

According to example embodiments, the separation of Control and Data may enable a substantially unlimited level of scalability, since control operations can be split over any number of processing elements, and data operations can be split over any number of data storage elements. In certain embodiments, this allows scalability in both capacity and performance, and may thus permit an operation to be effectively balanced between the different modules and nodes.

Nothing in the architecture limits the number of the different R-, C-, D-, and H-modules which are described further herein. Hence, in example embodiments of the present invention, any number of such modules can be assembled. The more modules added, the higher the performance of the system becomes and the larger the capacity it can handle. Hence scalability of performance and capacity is achieved.

Referring to FIG. 1, a system 10 is an example of a system to perform scalable block data storage and retrieval using content addressing. System 10 is architected around four main functional Modules designated R (for Router), C (for Control), D (for Data), and H (for Hash). Being modular and scalable, any specific system configuration includes at least one of R-, C-, D-, and H-modules, but may include a multiplicity of any or all of these Modules.

In particular, the system 10 includes data storage devices 12 on which the data blocks are stored. The storage devices 12 are networked to computing modules, there being several kinds of modules, including control modules 14 and data modules 16. The modules carry out content addressing for storage and retrieval, and the network defines separate paths or planes: control paths or a control plane which goes via the control modules 14 and data paths or a data plane which goes via the data modules 16.

The control modules 14 may control execution of read and write commands. The data modules 16 are connected to the storage devices and, under control of a respective control module, pass data to and/or from the storage devices. Both the C- and D-modules may retain extracts of the data stored in the storage device, and the extracts may be used for the content addressing. Typically the extracts may be computed by cryptographic hashing of the data, as will be discussed in greater detail below, and hash modules (FIG. 2) may specifically be provided for this purpose. That is to say the hash modules calculate hash values for data which is the subject of storage commands, and the hash values calculated may later be used for retrieval.

Routing modules 18 may terminate storage and retrieval operations and distribute command parts of any operations to control modules that are explicitly selected for the operation in such a way as to retain balanced usage within the system 10.

The routing modules 18 may use hash values, calculated from data associated with the operations, to select the control module 14 for the distribution. More particularly, selection of the control module 14 may use hash values, but typically relies on the user address and not on the content (hash). The hash value is, however, typically used for selecting the Data (D) module 16, and for setting the physical location for data storage within a D-module 16.
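The two selection rules in this paragraph (control module chosen from the user address, data module chosen from the content hash) can be sketched as follows. The function names, the `slice_pages` granularity, and the modulo-based placement are hypothetical choices for illustration; the text does not specify how the mapping is computed.

```python
import hashlib


def select_c_module(address, num_c_modules, slice_pages=256):
    """Pick a control module from the user address (not from the content).

    slice_pages is a hypothetical granularity: consecutive runs of that
    many pages map to the same C-module.
    """
    return (address // slice_pages) % num_c_modules


def select_d_module(signature, num_d_modules):
    """Pick a data module from the content hash.

    Identical content always hashes to the same signature, so it always
    lands on the same D-module regardless of the address it was written to.
    """
    return int.from_bytes(signature[:8], "big") % num_d_modules


# Usage: one page written at two different addresses.
page = b"x" * 4096
signature = hashlib.sha1(page).digest()  # SHA-1 is an assumption (20 bytes)
c_for_first = select_c_module(0, num_c_modules=4)
c_for_second = select_c_module(256, num_c_modules=4)
d_for_both = select_d_module(signature, num_d_modules=4)
```

In this sketch the two addresses are served by different C-modules, while the shared content resolves to a single D-module, which is what makes content-based deduplication within a D-module possible.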

The storage devices 12 may be solid state random access storage devices, as opposed to spinning disk devices; however disk devices may be used instead or in addition.

The routing modules 18 and/or data modules 16 may compare the extracts or hash values of write data with hash values of already stored data, and where a match is found, simply point to the matched data and avoid rewriting.

The modules 14, 16, 18 are combined into nodes 20 on the network, and the nodes 20 are connected over the network by a switch 22.

In example embodiments, the use of content addressing with multiple data modules selected on the basis of the content hashing, and a finely-grained mapping of user addresses to Control Modules allow for a scalable distributed architecture.

In some examples, the system 10 may employ more than a single type of memory technology, including a mix of more than one Flash technology (e.g., single level cell—SLC flash and multilevel cell—MLC flash), and a mix of Flash and DRAM technologies. In certain embodiments, the data mapping optimizes performance and life span by taking advantage of the different access speeds and different write/erase cycle limitations of the various memory technologies.

In some examples, blocks of data are mapped internally within the system based on Content Addressing, which may be, for example, implemented through a distributed Content Addressable Storage (CAS) algorithm. For example, this scheme may map blocks of data internally according to their content, resulting in mapping of identical blocks to the same unique internal location. In some examples, the distributed CAS algorithm may allow for scaling of the CAS domain as overall system capacity grows, effectively utilizing and balancing the available computational and storage elements in order to improve overall system performance at any scale and with any number of computational and storage elements.

The examples described herein implement block storage in a distributed and scalable architecture, efficiently aggregating performance from a large number of ultra-fast storage media elements (SSDs or other), while providing in-line, highly granular block-level deduplication with no or little performance degradation.

In one example, the system 10 may include one or more of the features of a system for scalable data storage and retrieval using content addressing described in U.S. Pat. No. 9,104,326, issued Aug. 11, 2015, entitled “SCALABLE BLOCK DATA STORAGE USING CONTENT ADDRESSING,” which is assigned to the same assignee as this patent application and is incorporated herein in its entirety. In other examples, the system 10 includes features used in EMC® XTREMIO®.

Referring to FIG. 2, an example of a functional block diagram of the system 10 is the diagram 200. In FIG. 2, an H-module 200 is connected to an R-module 202. The R-module is connected to both Control 204 and Data 206 modules. The data module is connected to any number of memory devices SSD 208.

A function of the R-module 202 is to terminate SAN Read/Write commands and route them to appropriate C- and D-modules 204, 206 for execution by these modules. By doing so, the R-module 202 can distribute workload over multiple C- and D-modules 204, 206, and at the same time create complete separation of the Control and Data planes, that is to say, provide separate control and data paths.

A function of the C-module 204 is to control the execution of a Read/Write command, as well as other storage functions implemented by the system. It may maintain and manage key metadata elements.

A function of the D-module 206 is to perform the actual Read/Write operation by accessing the storage devices 208 (designated SSDs) attached to it. The D-module 206 may maintain metadata related to the physical location of data blocks.

A function of the H-module 200 is to calculate the Hash function value for a given block of data.

Referring to FIG. 3, the system 10 can be a system 10′ used for production and system 10 can also be a system 10″ used for replication, according to an embodiment of the disclosure. An example of a replication system is a replication system 300. The replication system 300 includes a host 302 and the system 10′ at a production site, and a system 10″ at a replication site connected to the system 10′ by a network 304. In this configuration example, data is replicated from the system 10′ to the system 10″. The system 10′ includes a system management module (SYM) 352, a C-module 354, a D-module 356, an R-module 358a and an R-module 358b.

The C-module 354 includes a volume 357a with an address-to-hash (A2H) table 360a and a volume 357b with an A2H table 360b, which together form a consistency group 359. As will be further described herein, the C-module includes snapshots 364a, 364b corresponding to volumes 357a and 357b, respectively.

As will be further described herein, in the system 10″, a replica CG 367 includes a volume 387a, which is a replica of volume 357a, and a volume 387b, which is a replica of volume 357b.

Referring to FIG. 4, a process 400 is an example of a process to perform a write in the replication system 300 in the production site, according to an embodiment of the disclosure. As will be further described herein, process 400 contributes to a simplified resynchronization in the event of loss of synchronization between the production site and the replication site.

Process 400 receives a host write (402). For example, the host 302 writes to the system 10′ and the host write is received into a data page in memory by the R-module 358a.

Process 400 selects C-module (406) and sends write to selected C-module (412). For example, the R-module 358a selects the C-module 354 and sends the write command to the C-module 354.

Process 400 selects D-module to commit pages (416). For example, the C-module 354 selects the D-module 356 to commit the data page to disk.

Process 400 selects R-module to transmit (422). For example, the C-module 354 checks whether the write is to a synchronous replicated consistency group, and if so selects an R-module 358b to transmit the write to the target volume (e.g., either replica volumes 387a or replica volume 387b), sends a “transmit data” command to that R-module 358b, and waits for a response.

Process 400 reads data from R-module that received write from host (428). For example, the R-module 358b reads the data from the original R-module 358a. If the process 400 in processing block 422 is the R-module that received the host writes then this processing block is not needed.

Process 400 sends write to target (432). For example, the R-module 358b transmits the write to the target volume (e.g., either volume 387a or volume 387b) for execution.

Process 400 waits for acknowledgement from the D-module and the R-module that transmitted the write (436). For example, the C-module 354 waits for the D-module 356 to acknowledge that the page was committed to disk and for the R-module 358b to acknowledge that the write was written to the target volume (e.g., either volume 387a or volume 387b).

Process 400 writes to the address-to-hash table if the D-module and the R-module acknowledge (440). For example, the C-module 354 updates the A2H table (e.g., A2H table 360a or A2H table 360b) after the D-module 356 acknowledges that the page was committed to disk and the R-module 358b acknowledges that the write was written to the target volume (e.g., either volume 387a or volume 387b).

Waiting for acknowledgement from the R-module before writing to the A2H table is important because, if there is a link failure (e.g., between the production site and the replication site), there is certainty that the data committed to the A2H table has already been replicated to the target volume. This makes recovery easier, and the performance penalty of not updating the A2H table in parallel with the replication is small. Note that persisting the data to the backend is performed in parallel with transmitting data remotely; thus, the only additional delay is the A2H update, which is very small.
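The ordering described for processing blocks 416 to 440 can be sketched as follows. This is a minimal illustration only, not the implementation of the disclosure: the functions `commit_page`, `transmit_to_target` and `handle_write`, the `link_up` flag, and the use of a Python dictionary as the A2H table are all assumptions introduced for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def commit_page(page: bytes) -> bool:
    """Hypothetical stand-in for the D-module committing a page to disk."""
    return True


def transmit_to_target(address: int, page: bytes, link_up: bool = True) -> bool:
    """Hypothetical stand-in for the R-module sending the write to the replica."""
    return link_up


def handle_write(a2h: dict, address: int, page_hash: str, page: bytes,
                 link_up: bool = True) -> bool:
    """Update the A2H table only after both acknowledgements arrive."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The local commit and the remote transmit run in parallel.
        committed = pool.submit(commit_page, page)
        replicated = pool.submit(transmit_to_target, address, page, link_up)
        both_acked = committed.result() and replicated.result()
    if both_acked:
        # The A2H update is the only step that waits on both acknowledgements.
        a2h[address] = page_hash
    return both_acked


a2h_table = {}
ok = handle_write(a2h_table, 0x10, "hash-a", b"data")
failed = handle_write(a2h_table, 0x20, "hash-b", b"data", link_up=False)
```

In this sketch the write to address 0x20 never reaches the A2H table because the simulated link is down, so every entry in the table corresponds to data already acknowledged by the target.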

In example embodiments, a cgroup trip may be triggered either by a request from a C-module that is unable to transmit data to the target or by a SYM link monitoring component that decides to trip a cgroup because all the links are either down or too slow.

In certain embodiments, after a cgroup trip, when the local copy (e.g., CG 359) and the remote copy (e.g., CG 367) are out of sync and replication is being re-established, the re-sync operation should not involve a full copy of the cgroup because it takes too much time to copy entire volumes of data. For example, a full resynchronization of a 4 TB volume with a 4 KB page size may involve about 1 billion page copy operations and take hours to complete. Instead, in example embodiments, only the data that was written since the local and remote copies were last in sync should be copied, including data that was in flight (inflight I/Os) when the cgroup trip occurred.
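The page count in the example above can be checked with a short calculation. The sketch assumes binary units (1 TB taken as 2^40 bytes and 1 KB as 2^10 bytes); the variable names are chosen for illustration.

```python
TIB = 2 ** 40  # assumption: "TB" in the text is read as a binary terabyte
KIB = 2 ** 10  # assumption: "KB" in the text is read as a binary kilobyte

volume_bytes = 4 * TIB
page_bytes = 4 * KIB

# Number of pages a full resynchronization would have to copy.
page_count = volume_bytes // page_bytes
print(page_count)  # 1073741824, i.e. roughly 1 billion pages
```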

In some embodiments, once a C-module receives a transmit error from an R-module, it tries to send the request to the next R-module on a list of available transmit R-modules, i.e., a list of R-modules with links to the target storage array. In certain embodiments, if all the links fail, if the write operation fails on the replication side, or if the C-module runs out of time for retries, the C-module receiving the error is responsible for notifying the SYM module. In some embodiments, the SYM module in turn trips the cgroup and posts an alert that the cgroup is no longer equal to its replica. In certain embodiments, the replication pair (source-target) enters an asynchronous replication state, where it collects the data that will be needed for resynchronization once the problem is fixed.
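The retry behavior described above might look like the following sketch. The function `replicate_with_retry`, its `deadline_s` parameter, the use of `ConnectionError` to model a transmit error, and the callback `notify_sym` are hypothetical names introduced for illustration; the source does not specify the retry time budget or the error-reporting interface.

```python
import time


def replicate_with_retry(transmitters, notify_sym, deadline_s=1.0,
                         clock=time.monotonic):
    """Try each transmit R-module in turn; notify SYM if none succeeds."""
    start = clock()
    for transmit in transmitters:
        if clock() - start > deadline_s:
            break  # ran out of time for retries
        try:
            transmit()
            return True
        except ConnectionError:
            continue  # transmit error: try the next R-module on the list
    # All links failed or the time budget was exhausted.
    notify_sym("replication failed; trip the cgroup")
    return False


def broken_link():
    raise ConnectionError("link down")


def healthy_link():
    return None


alerts = []
first = replicate_with_retry([broken_link, healthy_link], alerts.append)
second = replicate_with_retry([broken_link, broken_link], alerts.append)
```

Here the first call succeeds on the second R-module, while the second call exhausts the list and reports the failure so that the SYM module can trip the cgroup.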

In one example, the I/O in the C-module is not completed until the cgroup trip occurs. Once the trip occurs and replication has been suspended, a good status is sent to the host for the I/O.

Referring to FIG. 5, a process 500 is an example of a process to trip a consistency group, according to an embodiment of the disclosure. In one example, the process 500 is performed by the SYM module 352. Process 500 instructs C-modules to stop acknowledging writes from the host (502). For example, the SYM module 352 instructs the C-module 354 to stop acknowledging writes from the host 302.

Process 500 instructs the C-modules to take a snapshot of the consistency group (508). For example, the SYM module 352 instructs the C-module 354 to take snapshots 364a, 364b of volumes 357a, 357b respectively in a consistency group 359. The snapshots 364a, 364b contain data that has been verified to exist on the replica volumes 387a, 387b respectively. Any data whose transmission was aborted or incomplete is not in the snapshot.

Process 500 notifies C-modules to stop replication (512). For example, the SYM module 352 instructs the C-module 354 to stop replicating the consistency group 359.

Process 500 makes indication that replication has stopped for the consistency group (522). For example, the SYM module 352 sends an alert indicating that replication has stopped for the consistency group and/or that the source and target are no longer synchronized.
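The sequence of processing blocks 502 to 522 can be sketched as below. The `CModule` class, its attributes, and the function `trip_consistency_group` are assumptions made for illustration; in particular, modeling a snapshot as a copy of each volume's A2H table is a simplification and not a statement of how the disclosed snapshots are built.

```python
class CModule:
    """Hypothetical C-module holding one A2H table per volume."""

    def __init__(self, volumes):
        self.volumes = volumes      # volume name -> {address: hash}
        self.snapshots = {}
        self.acknowledging = True
        self.replicating = True

    def snapshot(self):
        # Copy each A2H table; it holds only writes acknowledged by the target.
        self.snapshots = {name: dict(a2h) for name, a2h in self.volumes.items()}


def trip_consistency_group(c_modules, alerts):
    """Run the trip steps in the order given for process 500."""
    for c in c_modules:
        c.acknowledging = False     # block 502: stop acknowledging host writes
    for c in c_modules:
        c.snapshot()                # block 508: snapshot the consistency group
    for c in c_modules:
        c.replicating = False       # block 512: stop replication
    # Block 522: indicate that replication has stopped.
    alerts.append("replication stopped; source and target out of sync")


c_module = CModule({"357a": {0: "h0"}, "357b": {0: "g0"}})
trip_alerts = []
trip_consistency_group([c_module], trip_alerts)

# A write made after the trip changes the volume but not the snapshot.
c_module.volumes["357a"][1] = "h1"
```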

Referring to FIG. 6, a process 600 is an example of a process to perform increment resynchronization, according to an embodiment of the disclosure. Process 600 sends a command to C-modules to perform increment synchronization (602). For example, the SYM module 352 sends a command to the C-module 354 to perform synchronization of the volumes in the consistency group.

Process 600 compares volume with snapshot (606). For example, the C-modules on the production site compare each of the volumes of the consistency group to the snapshot of the consistency group taken in processing block 508 in process 500.

Process 600 sends corresponding data pages from source to target of any volume data different from the snapshot data (612). For example, any data in the volume at the source that differs from the snapshot is sent to the respective target volume by sending the corresponding data pages from the D-module to the target volume (e.g., one of replica volume 387a and replica volume 387b).
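One way to picture the comparison in processing blocks 606 and 612 is as a difference between two address-to-hash mappings. The sketch below assumes that both the volume and its snapshot can be represented as dictionaries from address to page hash; the function name `pages_to_resend` and that representation are illustrative assumptions, and the sketch does not cover addresses that exist only in the snapshot.

```python
def pages_to_resend(volume_a2h: dict, snapshot_a2h: dict) -> list:
    """Return addresses whose hash differs from the snapshot taken at the trip."""
    return sorted(
        address
        for address, page_hash in volume_a2h.items()
        if snapshot_a2h.get(address) != page_hash
    )


# Snapshot: state verified to exist on the replica when the trip occurred.
snapshot = {0: "h0", 1: "h1", 2: "h2"}
# Volume: address 1 was overwritten and address 3 was written after the trip.
volume = {0: "h0", 1: "h1-new", 2: "h2", 3: "h3"}

changed = pages_to_resend(volume, snapshot)
print(changed)  # [1, 3]
```

Only the pages at the returned addresses would need to be sent to the target, which is what keeps the resynchronization incremental rather than a full copy.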

Referring to FIG. 7, in one example, a computer 700 includes a processor 702, a volatile memory 704, a non-volatile memory 706 (e.g., hard disk, flash disks and so forth) and a user interface (UI) 708 (e.g., a graphical user interface, a mouse, a keyboard, a display, touch screen and so forth), according to an embodiment of the disclosure. The non-volatile memory 706 stores computer instructions 712, an operating system 716 and data 718. In one example, the computer instructions 712 are executed by the processor 702 out of volatile memory 704 to perform all or part of the processes described herein (e.g., processes 400, 500 and 600).

The processes described herein (e.g., processes 400, 500 and 600) are not limited to use with the hardware and software of FIG. 7; they may find applicability in any computing or processing environment and with any type of machine or set of machines that is capable of running a computer program. The processes described herein may be implemented in hardware, software, or a combination of the two. The processes described herein may be implemented in computer programs executed on programmable computers/machines that each includes a processor, a non-transitory machine-readable medium or other article of manufacture that is readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and one or more output devices. Program code may be applied to data entered using an input device to perform any of the processes described herein and to generate output information.

The system may be implemented, at least in part, via a computer program product, (e.g., in a non-transitory machine-readable storage medium such as, for example, a non-transitory computer-readable medium), for execution by, or to control the operation of, data processing apparatus (e.g., a programmable processor, a computer, or multiple computers). Each such program may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs may be implemented in assembly or machine language. The language may be a compiled or an interpreted language and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network. A computer program may be stored on a non-transitory machine-readable medium that is readable by a general or special purpose programmable computer for configuring and operating the computer when the non-transitory machine-readable medium is read by the computer to perform the processes described herein. For example, the processes described herein may also be implemented as a non-transitory machine-readable storage medium, configured with a computer program, where upon execution, instructions in the computer program cause the computer to operate in accordance with the processes. A non-transitory machine-readable medium may include but is not limited to a hard drive, compact disc, flash memory, non-volatile memory, volatile memory, magnetic diskette and so forth but does not include a transitory signal per se.

The processes described herein are not limited to the specific examples described. For example, the processes 400, 500 and 600 are not limited to the specific processing order of FIGS. 4 to 6, respectively. Rather, any of the processing blocks of FIGS. 4 to 6 may be re-ordered, combined or removed, performed in parallel or in serial, as necessary, to achieve the results set forth above.

The processing blocks (for example, in the processes 400, 500 and 600) associated with implementing the system may be performed by one or more programmable processors executing one or more computer programs to perform the functions of the system. All or part of the system may be implemented as special purpose logic circuitry (e.g., an FPGA (field-programmable gate array) and/or an ASIC (application-specific integrated circuit)). All or part of the system may be implemented using electronic hardware circuitry that includes electronic devices such as, for example, at least one of a processor, a memory, a programmable logic device or a logic gate.

Elements of different embodiments described herein may be combined to form other embodiments not specifically set forth above. Other embodiments not specifically described herein are also within the scope of the following claims.
