Conflicts can occur in replication environments where a document can be updated concurrently by two or more writers. This is especially common in mobile environments, where unreliable network connections may prevent concurrent changes from multiple devices from being synchronized in a timely fashion, thereby resulting in conflicts.
In this post, we discuss the fundamentals of how conflicts are handled in Couchbase Mobile and outline the role of applications in resolving conflicts. In a following post, we will discuss a related topic of managing Revision Trees and database sizes in a Revision Based Document Management system.
The illustration below describes a scenario where conflicts can occur in mobile environments.
The Couchbase Mobile stack includes the Couchbase Lite embedded database running locally on devices and Sync Gateway in the cloud, which is typically backed by Couchbase Server persisting the data in the cloud. The Sync Gateway handles the replication of documents across the devices. It is conceivable that a document can be updated by multiple devices at the same time.
Multi Version Concurrency Control
Couchbase Mobile uses the Multi Version Concurrency Control (MVCC) technique to handle conflicts. In this method, every document is assigned a unique, system-generated revision ID. This ID is in addition to the document ID, which remains the same across document revisions. Every change to a document, be it a modification or a delete, is treated as a new revision of the document and is hence assigned a new revision ID.
Every time a change is to be made to an existing document, the writer must include the revision ID of the current revision of the document that is being updated. A new revision is created for the change and added as a child node to the current revision that was being updated, resulting in a Revision Tree for the document.
Every document has an associated Revision Tree that grows over the lifetime of the document. We will discuss techniques to manage the size of the tree in a future post – stay tuned!
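The parent/child rule described above can be sketched with a toy in-memory model. This is only an illustration of the idea, not how Couchbase Lite or Sync Gateway store revisions. The names Revision, RevisionTree and add_revision, as well as the document and revision IDs, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Revision:
    rev_id: str
    parent_id: Optional[str]  # None only for the first revision of a document
    body: dict
    children: List[str] = field(default_factory=list)


class RevisionTree:
    """Toy in-memory revision tree for one document (illustration only)."""

    def __init__(self, doc_id: str) -> None:
        self.doc_id = doc_id
        self.revisions: Dict[str, Revision] = {}

    def add_revision(self, rev_id: str, body: dict,
                     parent_id: Optional[str] = None) -> Revision:
        if parent_id is None:
            # Only the very first revision may omit a parent.
            if self.revisions:
                raise ValueError("an update must name the revision it modifies")
        elif parent_id not in self.revisions:
            raise KeyError(f"unknown parent revision: {parent_id}")
        revision = Revision(rev_id, parent_id, body)
        self.revisions[rev_id] = revision
        if parent_id is not None:
            # The new revision becomes a child node of the revision it updated.
            self.revisions[parent_id].children.append(rev_id)
        return revision


# Three successive updates produce a single, unbranched chain.
tree = RevisionTree("user::42")
tree.add_revision("1-aaa", {"name": "Ann"})
tree.add_revision("2-bbb", {"name": "Ann B."}, parent_id="1-aaa")
tree.add_revision("3-ccc", {"name": "Ann Brown"}, parent_id="2-bbb")
```

Because every writer has to name the revision it started from, two writers that both start from the same revision each add their own child to it, and the tree branches.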
At a very high level, every document in Couchbase Mobile V1.4 consists of a document ID, current revision ID, JSON body and Metadata. The Metadata, among other things, holds the revision history for the document. The Metadata is a “behind the scenes” concept, and user applications should never really worry about it. In fact, in a future version of Couchbase, the metadata will be moved out of the document completely and into an XATTR.
In addition, every document revision has an associated TTL value (that defaults to 5 minutes).
In an MVCC-based system, every update, including a delete operation, creates a document revision. Deleted revisions are called “Tombstone” revisions. A tombstone is essentially a special revision that has the “_deleted” property set to true. Tombstone revisions are replicated like any other revision, but they are special in that a query on Couchbase Lite will not return them.
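As a rough sketch of that behavior, the snippet below models a tombstone as a body whose “_deleted” property is true and shows a query that skips it. The helper names tombstone_body and query_visible and the sample documents are hypothetical, and a real query engine does far more than this filter.

```python
from typing import Dict, List


def tombstone_body() -> Dict[str, object]:
    """Body of a deletion revision: only the "_deleted" marker is kept here."""
    return {"_deleted": True}


def query_visible(current_revisions: Dict[str, dict]) -> List[str]:
    """Return the IDs of documents whose current revision is not a tombstone."""
    return sorted(
        doc_id
        for doc_id, body in current_revisions.items()
        if not body.get("_deleted", False)
    )


current = {
    "order::1": {"item": "coffee"},
    "order::2": tombstone_body(),  # deleted, but still stored and replicated
}
visible = query_visible(current)
```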
The Revision ID Structure
A Revision ID is of the format <generation ID> "-" <content hash ID>
- generation ID (new revision) = generation ID (parent of the revision) + 1
  The very first revision that is created when a document is created has a generation ID of 1.
- content hash ID = hash computed from the contents of the revision
  This implies that two revisions of a document with identical content will have the same content hash ID.
Note: As an optimization, if two writers make identical changes to a document concurrently, resulting in two incoming revisions with the same revision ID, Couchbase Mobile will only store a single revision.
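The format above can be illustrated with a small sketch. The hash function and the way the body is serialized are assumptions made for this example only (SHA-1 over JSON with sorted keys). The digest that Couchbase Mobile actually computes may use different inputs and a different algorithm. The function name next_revision_id is hypothetical.

```python
import hashlib
import json
from typing import Optional


def next_revision_id(parent_rev_id: Optional[str], body: dict) -> str:
    """Build "<generation ID>-<content hash ID>" for a new revision."""
    if parent_rev_id is None:
        generation = 1  # first revision of a new document
    else:
        # generation ID (new revision) = generation ID (parent) + 1
        generation = int(parent_rev_id.split("-", 1)[0]) + 1
    # Assumption: hash a canonical JSON encoding so that key order is irrelevant.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    content_hash = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return f"{generation}-{content_hash}"


first = next_revision_id(None, {"name": "Ann"})
# Two devices make the identical change on top of the same parent revision.
from_device_a = next_revision_id(first, {"name": "Ann", "city": "Oslo"})
from_device_b = next_revision_id(first, {"city": "Oslo", "name": "Ann"})
```

In this sketch both devices arrive at the same revision ID, which is the situation the note above describes: only one revision needs to be stored.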
In an MVCC-based system, a conflict occurs if the system encounters a branch in the tree. From the discussion on revision trees, you can infer that this is the case when there are two or more leaf nodes in the tree.
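A minimal sketch of that check, assuming the tree is available as a mapping from each revision ID to its parent revision ID (None for the first revision). The function names and the sample revision IDs are hypothetical.

```python
from typing import Dict, List, Optional


def leaf_revisions(parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Leaves are the revisions that no other revision names as its parent."""
    parents = {p for p in parent_of.values() if p is not None}
    return sorted(rev for rev in parent_of if rev not in parents)


def has_conflict(parent_of: Dict[str, Optional[str]]) -> bool:
    """Per the definition above: two or more leaf nodes means a branch exists."""
    return len(leaf_revisions(parent_of)) >= 2


linear = {"1-aaa": None, "2-bbb": "1-aaa"}
branched = {
    "1-aaa": None,
    "2-bbb": "1-aaa",  # written by one device
    "2-ccc": "1-aaa",  # written concurrently by another device
    "3-ddd": "2-ccc",
}
```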
When a conflict occurs, Couchbase Mobile still needs to select a “winner” or “Current Revision” among the conflicting leaf nodes. Picking a winner does not imply that the conflict is resolved.
Couchbase Mobile picks a winner deterministically. Because the process is deterministic, there is no need for Couchbase Mobile nodes to communicate with each other about picking winners. They all pick the same one using the following criteria:
Case 1: All leaf revisions are undeleted
- The winner is the undeleted leaf revision on the longest revision branch
Case 2: All leaf revisions are deleted
- The winner is the deleted leaf revision on the longest revision branch
Case 3: Some leaf revisions are deleted and some are undeleted
- The winner is the undeleted leaf revision on the longest revision branch
Case 4: There is a tie
- The winner is the one whose revision ID sorts higher in a simple ASCII comparison
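The four cases can be folded into a single ordering, sketched below. The sketch assumes that the generation ID at the front of a revision ID equals the length of its branch, which follows from the generation rule described earlier. The names Leaf and pick_winner are hypothetical, and this is not the actual Couchbase Mobile implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Leaf:
    rev_id: str
    deleted: bool = False

    @property
    def generation(self) -> int:
        # The generation ID is the part of the revision ID before the first "-".
        return int(self.rev_id.split("-", 1)[0])


def pick_winner(leaves: Iterable[Leaf]) -> Leaf:
    """Prefer undeleted leaves, then the longest branch, then the higher ID."""
    return max(
        leaves,
        # Tuples compare element by element: undeleted beats deleted, then the
        # higher generation wins, then the revision ID that sorts higher.
        key=lambda leaf: (not leaf.deleted, leaf.generation, leaf.rev_id),
    )


case1 = pick_winner([Leaf("2-aaa"), Leaf("3-bbb")])
case2 = pick_winner([Leaf("2-xxx", deleted=True), Leaf("3-aaa", deleted=True)])
case3 = pick_winner([Leaf("4-ccc", deleted=True), Leaf("2-aaa")])
case4 = pick_winner([Leaf("3-abc"), Leaf("3-abd")])
```

Since the ordering depends only on the leaves themselves, any node that holds the same revision tree computes the same winner without coordinating with other nodes.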
Who Must Handle Conflicts?
Although Couchbase Mobile picks a Current Revision among conflicting revisions, it is ultimately the responsibility of the application to resolve conflicts, for the following reasons:
- The choice of the winner among conflicting revisions may be based on criteria other than the ones that the System uses to deterministically pick a winner. The current revision selected by Couchbase may not be the right choice for the application.
- There may not be a clear winner among the conflicting revisions. In this case, there may be a need to merge changes from the conflicting revisions. The specifics of a merge depend on application semantics.
- Even if the application decides to go with the revision that Couchbase picked as the winner, the conflicting revisions still remain in the database. So, the application must still delete the non-winning revisions from the Revision Tree in order to prevent the tree from growing very large with unused conflicting revisions. This will be discussed in depth in an upcoming post.
Options for Handling Conflicts
There are two options for handling conflicts:
Option 1: Pick a Revision among conflicting revisions
In this option, the application picks one of the conflicting revisions as the winner and “tombstones” the rest.
The application can retain the “Current Revision” picked by Couchbase as the winner or it can pick a different revision among the conflicting revisions. Regardless, it is important to delete the non-winners so they can be purged during Database Compaction as discussed in the next section.
Option 2: N-way-merge
In this option, the changes from conflicting revisions are merged according to application semantics. The merged changes go into a new revision, which becomes the current/winning revision. The non-winning branches are tombstoned.
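Both options can be sketched as functions that decide which new revisions to write, expressed as (parent revision ID, new body) pairs. Everything here is a hypothetical illustration: the function names, the field-union merge policy and the sample data are assumptions, and writing the resulting revisions through the Couchbase Lite API is left out.

```python
from typing import Callable, Dict, List, Tuple

Body = dict
NewRevision = Tuple[str, Body]  # (parent revision ID, body of the new child)


def tombstone_losers(leaves: Dict[str, Body], winner_id: str) -> List[NewRevision]:
    """Option 1: keep one leaf and add a tombstone child to every other leaf."""
    if winner_id not in leaves:
        raise KeyError(f"unknown leaf revision: {winner_id}")
    return [
        (rev_id, {"_deleted": True})
        for rev_id in sorted(leaves)
        if rev_id != winner_id
    ]


def merge_and_tombstone(leaves: Dict[str, Body], winner_id: str,
                        merge: Callable[[List[Body]], Body]) -> List[NewRevision]:
    """Option 2: write the merged body on one branch, tombstone the others."""
    merged = merge([leaves[rev_id] for rev_id in sorted(leaves)])
    return [(winner_id, merged)] + tombstone_losers(leaves, winner_id)


def union_fields(bodies: List[Body]) -> Body:
    """Example merge policy (an assumption): later bodies overwrite earlier ones."""
    merged: Body = {}
    for body in bodies:
        merged.update(body)
    return merged


leaves = {
    "2-aaa": {"name": "Ann", "phone": "555-0100"},
    "2-bbb": {"name": "Ann", "email": "ann@example.com"},
}
option1 = tombstone_losers(leaves, "2-bbb")
option2 = merge_and_tombstone(leaves, "2-bbb", union_fields)
```

The merge policy is the part that depends on application semantics. A field-by-field union suits some documents, while others may need per-field rules or input from the user.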
What if Conflicts Are Not Resolved?
If an application decides to go with the winner picked by Couchbase Mobile and does not explicitly resolve conflicts, you can end up with a Revision tree with a large number of branches and leaf nodes. This will have undesirable consequences on the size of the document and consequently, on the size of the database. Couchbase Mobile has automatic processes in place to alleviate some of this but the application is ultimately responsible for ensuring that unwanted leaf revisions are tombstoned.
We will discuss the details of database and revision tree management in a following post, so stay tuned.
This post was an introduction to the concurrency control system in Couchbase Mobile. One important consideration in an MVCC-based system is managing the size of the revision trees and preventing them from bloating. This aspect will be discussed in an upcoming post.
If you have questions or feedback, please leave a comment below or feel free to reach out to me on Twitter @rajagp or email me at firstname.lastname@example.org. The Couchbase Forums are another good place to reach out with questions.