Document conflicts can occur in distributed environments that support data synchronization, where a document can be updated concurrently by more than one writer. This is especially common in mobile environments, where unreliable network connections may prevent concurrent changes from multiple devices from being synchronized in a timely fashion. Conflicts can also occur on the client side, for instance, if a document is being updated locally while changes to the same document are being pulled down to the client from a remote server. Hence, there is a need to resolve these document conflicts. Couchbase Mobile 2.0 introduces “Automatic Conflict Resolution”, or “Conflict Free” mode, in which conflicts are handled automatically and there are effectively no conflicting document revisions in the Couchbase database.

As an app developer, you probably won’t ever have to do anything on the app side to deal with conflicts, because the system’s default conflict resolver will handle them for you automatically. However, if you need to, you have the option of being notified of conflicts so that you can take appropriate action. This post discusses the fundamentals of how document conflicts are automatically handled in Couchbase Mobile 2.0 using the “automatic conflict resolution” or “conflict free” mode.

 

Assumptions

This post assumes that you are familiar with the architecture of the Couchbase Mobile stack, which includes the Couchbase Lite embedded database for your mobile clients, the Sync Gateway and Couchbase Server. If you are new to the platform, I would recommend catching up with these resources.

Background

If you haven’t done so, it is worth reading the earlier post on demystifying conflict resolution, which discusses the fundamentals of how conflict resolution is handled in “non-conflict-free” mode in Couchbase Mobile, including an overview of the Multi Version Concurrency Control (MVCC) system.

Document Revision Tree

To understand the conflict resolution process, you need a basic understanding of how documents are stored. In Couchbase Mobile, every revision of a document is assigned a unique system-generated revision ID, or revID. This ID is in addition to the document ID, which remains the same across document revisions. Every change to a document, be it a modification or a delete, is treated as a new revision of the document and is hence assigned a new revision ID.

In addition, every document revision has a “generationID” associated with it. The first time a document is created, a revision gets created with a “generationID” of 1. Every subsequent revision update will increment this number.

Every time a change (edit or delete) is to be made to an existing document, the writer must include the revision ID of the current revision of the document that is being updated. A new revision is created for the change and added as a child node to the current revision that was being updated, resulting in a Revision Tree for the document.

NOTE: The “revID” must be treated as an opaque value by the app, and the app must not try to generate the “revID” itself.

How Document Conflicts Occur

At a high level, a conflict occurs when changes are made to the same parent revision of a document by multiple writers.
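To make the revision tree and the notion of a conflict concrete, here is a minimal sketch in Swift. It is a toy model, not the Couchbase Lite API: the `Revision` and `RevisionTree` types, the `addRevision` function and the `generation-suffix` revID format are all assumptions made for illustration (real revIDs are opaque and system generated).

```swift
// Toy model of a document revision tree. NOT the Couchbase Lite API;
// every name here is hypothetical.
struct Revision {
    let revID: String         // opaque in real apps; made up here for illustration
    let generation: Int       // 1 for the first revision, parent + 1 afterwards
    let parentRevID: String?  // nil for the first revision
}

struct RevisionTree {
    private(set) var revisions: [String: Revision] = [:]

    // Adds a child under `parentRevID`. The writer must name the revision
    // it is updating; an unknown parent is refused.
    mutating func addRevision(suffix: String, parentRevID: String?) -> Revision? {
        var generation = 1
        if let parentID = parentRevID {
            guard let parent = revisions[parentID] else { return nil }
            generation = parent.generation + 1
        }
        let rev = Revision(revID: "\(generation)-\(suffix)",
                           generation: generation,
                           parentRevID: parentRevID)
        revisions[rev.revID] = rev
        return rev
    }

    // Leaves are revisions that no other revision names as its parent.
    var leaves: [Revision] {
        let parents = Set(revisions.values.compactMap { $0.parentRevID })
        return revisions.values
            .filter { !parents.contains($0.revID) }
            .sorted { $0.revID < $1.revID }
    }

    // In this toy, more than one leaf means two writers updated the same parent.
    var hasConflict: Bool { return leaves.count > 1 }
}

var tree = RevisionTree()
let rev1 = tree.addRevision(suffix: "a", parentRevID: nil)!
_ = tree.addRevision(suffix: "a", parentRevID: rev1.revID)  // writer A updates rev 1
let conflictAfterFirstWriter = tree.hasConflict
_ = tree.addRevision(suffix: "b", parentRevID: rev1.revID)  // writer B updates the same parent
let conflictAfterSecondWriter = tree.hasConflict

print(conflictAfterFirstWriter)       // false
print(conflictAfterSecondWriter)      // true
print(tree.leaves.map { $0.revID })   // ["2-a", "2-b"]
```

The sketch ignores deletions and tombstoned branches, which a real revision tree also has to account for.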

Conflicts At Couchbase Lite

The illustration below describes a scenario where an app is updating a document revision in memory while the same revision is being updated as a result of a pull replication.

Conflicts At Sync Gateway

The illustration below describes a scenario where two clients push up changes to the same revision of the document.

Conflict Resolution in Couchbase Mobile 2.0

In “Conflict Free” mode, the Sync Gateway essentially rejects all conflicting updates. Conflicts are automatically resolved at the time the document is created, updated or deleted.
The implication is that no conflicting document revisions are stored in the database, which is therefore effectively “Conflict Free”. We will take a look at how this works.

Automatic Conflict Resolution in Couchbase Lite 2.0

Every document in Couchbase Lite 2.0 is associated with a “conflict resolver” that is executed if a conflict occurs when an attempt is made to save or delete a document. The conflict resolver function chooses a “winner” among the conflicting revisions and this winner is added as a child to the document revision tree.

There are two concurrency control policies supported by Couchbase Lite.

Concurrency Control Policy

Last Write Always Wins (Default):

This is also the default conflict resolver policy. In this case, the last update to the database always wins.

When you call saveDocument with no ConcurrencyControl argument, this is the policy that is in effect.

This is an example of the call in Swift.

The examples below should clarify the implications of this policy.

  • In the first example, our save / update request succeeds even though the document in the database was updated since the last time we read it. The document in the database could have been updated by a different thread in the app as a result of some external trigger, such as a pull replication or a data fetch from a remote server. The implication is that any changes made to the document between the time we read it and the time we updated it are overwritten.
  • In the second example, our save / update request succeeds even though the document in the database was deleted since the last time we read it. The document in the database could have been deleted by a different thread in the app as a result of some external trigger, such as a pull replication or a data fetch from a remote server. The implication is that a deleted document could be resurrected as a result of the save.
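The two implications above can be reproduced with a small in-memory model. This is a sketch under stated assumptions, not the Couchbase Lite API: `ToyDatabase`, `StoredDoc` and their methods are hypothetical names, and a single generation counter stands in for the revision tree.

```swift
// Toy in-memory store illustrating "last write always wins".
// Hypothetical types; this is NOT the Couchbase Lite API.
struct StoredDoc {
    var body: [String: String]
    var generation: Int
    var isDeleted: Bool
}

final class ToyDatabase {
    private var docs: [String: StoredDoc] = [:]

    func read(_ id: String) -> StoredDoc? { return docs[id] }

    // Last write wins: the save is applied on top of whatever is currently
    // stored, even if that differs from what the caller originally read.
    func save(_ id: String, body: [String: String]) {
        let currentGeneration = docs[id]?.generation ?? 0
        docs[id] = StoredDoc(body: body, generation: currentGeneration + 1, isDeleted: false)
    }

    // A delete is just another revision, marked as deleted.
    func delete(_ id: String) {
        guard let current = docs[id] else { return }
        docs[id] = StoredDoc(body: [:], generation: current.generation + 1, isDeleted: true)
    }
}

let db = ToyDatabase()
db.save("user1", body: ["name": "Ann"])

// Our thread reads the document and prepares an edit...
var edit = db.read("user1")!.body
edit["city"] = "Paris"

// ...while another writer (say, a pull replication) updates and then deletes it.
db.save("user1", body: ["name": "Ann", "phone": "555-0100"])
db.delete("user1")

// Our save still succeeds: the intervening "phone" change is overwritten
// and the deleted document is resurrected.
db.save("user1", body: edit)
let result = db.read("user1")!
print(result.isDeleted, result.generation, result.body["phone"] ?? "nil")  // false 4 nil
```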
Fail On Conflict

While we anticipate that the default concurrency control policy of “last write wins” should work for most cases, you can override the default behavior by specifying that you wish to be notified in case of a conflict while saving the document. This can be done by including the optional ConcurrencyControl argument as part of the save request.

A return value of false with a ConcurrencyControl policy of failOnConflict indicates that the document save failed as a result of a conflict. We will now examine how you can handle conflict errors.
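The behavior of a save that fails on conflict can be sketched as follows. This is a toy model and not the Couchbase Lite API: `ToyDatabase`, `DocSnapshot` and `ConcurrencyPolicy` are hypothetical names, and the sketch assumes a conflict is detected by comparing the generation the caller read with the generation currently stored.

```swift
// Hypothetical names, loosely modelled on the policy names used in this post.
enum ConcurrencyPolicy { case lastWriteWins, failOnConflict }

struct DocSnapshot {
    let id: String
    var body: [String: String]
    let generation: Int   // generation the caller read; 0 for a brand-new document
}

final class ToyDatabase {
    private var stored: [String: DocSnapshot] = [:]

    func read(_ id: String) -> DocSnapshot? { return stored[id] }

    // Returns false (and writes nothing) when the policy is failOnConflict
    // and the stored document has changed since `doc` was read.
    @discardableResult
    func save(_ doc: DocSnapshot, policy: ConcurrencyPolicy = .lastWriteWins) -> Bool {
        let currentGeneration = stored[doc.id]?.generation ?? 0
        if policy == .failOnConflict && currentGeneration != doc.generation {
            return false
        }
        stored[doc.id] = DocSnapshot(id: doc.id, body: doc.body,
                                     generation: currentGeneration + 1)
        return true
    }
}

let db = ToyDatabase()
db.save(DocSnapshot(id: "user1", body: ["name": "Ann"], generation: 0))

var mine = db.read("user1")!      // we read generation 1
mine.body["city"] = "Paris"

var theirs = db.read("user1")!    // another writer reads the same revision...
theirs.body["phone"] = "555-0100"
db.save(theirs)                   // ...and saves first; stored generation is now 2

let saved = db.save(mine, policy: .failOnConflict)
print(saved)  // false: the document changed since we read it
```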

Handling Fail on Conflicts

How you handle conflicts depends on the application semantics. Here are examples of how you can handle them in Swift (which can easily be mapped to other languages).

  • Option 1: Merge the conflicting versions of the document and save

This is an example of how you would merge the conflicting revisions. Again, how you do it is entirely up to your application. This is intended as a reference implementation of how you could handle it.

  • Option 2: Force your save to win

This option is effectively the same as the default concurrency policy of “last write always wins”, except that in this case you examine the contents of the currently saved document and then decide whether to force the save using saveDocument.
Note that you still run the risk of a race condition, in that the document could be updated again before you save it.

  • Option 3: Skip the save

In this case, you examine the contents of the currently saved document and then decide that you would rather keep the currently saved version of the document.
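The three options can be sketched side by side. As before, this is a toy model and not the Couchbase Lite API: `ToyDatabase`, `DocSnapshot`, `ConflictStrategy` and `saveResolvingConflict` are hypothetical names, and the merge rule (our values win for keys present in both versions) is only one of many rules an app might choose.

```swift
// Hypothetical names throughout; this is NOT the Couchbase Lite API.
enum ConcurrencyPolicy { case lastWriteWins, failOnConflict }
enum ConflictStrategy { case merge, forceSave, skip }

struct DocSnapshot {
    let id: String
    var body: [String: String]
    let generation: Int   // generation the caller read; 0 for a new document
}

final class ToyDatabase {
    private var stored: [String: DocSnapshot] = [:]

    func read(_ id: String) -> DocSnapshot? { return stored[id] }

    @discardableResult
    func save(_ doc: DocSnapshot, policy: ConcurrencyPolicy = .lastWriteWins) -> Bool {
        let currentGeneration = stored[doc.id]?.generation ?? 0
        if policy == .failOnConflict && currentGeneration != doc.generation {
            return false
        }
        stored[doc.id] = DocSnapshot(id: doc.id, body: doc.body,
                                     generation: currentGeneration + 1)
        return true
    }
}

// Tries a failOnConflict save first; on conflict, applies the chosen strategy.
// Returns true if our content (or a merge of it) ended up in the database.
@discardableResult
func saveResolvingConflict(_ mine: DocSnapshot, in db: ToyDatabase,
                           strategy: ConflictStrategy) -> Bool {
    if db.save(mine, policy: .failOnConflict) { return true }
    guard let current = db.read(mine.id) else { return false }
    switch strategy {
    case .merge:
        // Option 1: start from the stored version and layer our fields on top.
        let merged = current.body.merging(mine.body) { _, ours in ours }
        // Save against the generation we just read, so a further concurrent
        // change is still detected rather than overwritten.
        return db.save(DocSnapshot(id: mine.id, body: merged,
                                   generation: current.generation),
                       policy: .failOnConflict)
    case .forceSave:
        // Option 2: overwrite. Another writer could still slip in before this call.
        return db.save(mine, policy: .lastWriteWins)
    case .skip:
        // Option 3: keep the currently saved version.
        return false
    }
}

let db = ToyDatabase()
db.save(DocSnapshot(id: "user1", body: ["name": "Ann"], generation: 0))

var mine = db.read("user1")!
mine.body["city"] = "Paris"

var theirs = db.read("user1")!
theirs.body["phone"] = "555-0100"
db.save(theirs)   // a concurrent writer gets in first

let merged = saveResolvingConflict(mine, in: db, strategy: .merge)
print(merged)  // true
print(db.read("user1")!.body.keys.sorted())  // ["city", "name", "phone"]
```

Passing `.forceSave` or `.skip` instead of `.merge` exercises the other two options against the same conflict.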

Implications on Creating Documents with same Id

Since the default conflict resolution policy on saveDocument is that the last write always wins, if you attempt to create a document with a docId that already exists in the database, the save will go ahead and update the existing document by adding a new revision to it.

Hence, if you want to ensure that you do not inadvertently update an existing document, you must specify the ConcurrencyControl argument with failOnConflict. The save will then fail with a conflict that you can handle appropriately.

This is very similar to Option 3 described earlier, except that in this case you don’t have to examine the current contents of the document. A conflict failure implies that a document with that ID already exists.

This is an example in Swift.
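The effect on document creation can be sketched with a small model. This is not the Couchbase Lite API: `ToyDatabase`, `saveNew` and `ConcurrencyPolicy` are hypothetical names, and the sketch only covers documents that are built from scratch rather than read from the database first.

```swift
// Hypothetical names; this is NOT the Couchbase Lite API.
enum ConcurrencyPolicy { case lastWriteWins, failOnConflict }

final class ToyDatabase {
    private(set) var revisionCount: [String: Int] = [:]
    private var bodies: [String: [String: String]] = [:]

    // Saves a document that was created from scratch with the given ID.
    @discardableResult
    func saveNew(id: String, body: [String: String],
                 policy: ConcurrencyPolicy = .lastWriteWins) -> Bool {
        let existingRevisions = revisionCount[id] ?? 0
        // With failOnConflict, an existing document with this ID is a conflict.
        if existingRevisions > 0 && policy == .failOnConflict { return false }
        revisionCount[id] = existingRevisions + 1
        bodies[id] = body
        return true
    }

    func body(of id: String) -> [String: String]? { return bodies[id] }
}

let db = ToyDatabase()
let created = db.saveNew(id: "user1", body: ["name": "Ann"], policy: .failOnConflict)
let duplicate = db.saveNew(id: "user1", body: ["name": "Bob"], policy: .failOnConflict)
print(created, duplicate)  // true false

// Without failOnConflict, the same call silently adds a new revision.
let overwritten = db.saveNew(id: "user1", body: ["name": "Bob"])
print(overwritten, db.revisionCount["user1"] ?? 0)  // true 2
```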

Conflict Free Mode in Sync Gateway 2.0

In this mode, the Sync Gateway rejects any revision that causes a conflict with an HTTP 409 error, effectively ensuring that there are no conflicting revisions in the database. Conflicts are handled on the client side during a pull replication.

Push Replication

  1. Client pushes up the revision changes to Sync Gateway.
  2. Sync Gateway detects that the incoming revision is in conflict with the current saved revision on the server (i.e. the incoming revision’s ancestor is not the active revision on the server).
  3. Sync Gateway rejects the revision change with a 409 Error. Couchbase Lite does not really do anything other than log the error. The conflict is subsequently resolved during a pull replication.
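The server-side check in the steps above can be sketched as follows. This is a toy model, not Sync Gateway code: `ToySyncGateway`, `IncomingRevision` and `PushResult` are hypothetical names, and the sketch simplifies "ancestor" to the immediate parent of the incoming revision.

```swift
// Toy model of the check described above; NOT Sync Gateway code.
struct IncomingRevision {
    let docID: String
    let revID: String
    let parentRevID: String?   // nil when the client is creating the document
}

enum PushResult: Equatable {
    case accepted
    case rejectedWithConflict   // reported to the client as a 409 error
}

final class ToySyncGateway {
    private var activeRevID: [String: String] = [:]   // docID -> active revID

    func push(_ rev: IncomingRevision) -> PushResult {
        // Conflict: the incoming revision was not built on the active revision.
        if activeRevID[rev.docID] != rev.parentRevID {
            return .rejectedWithConflict
        }
        activeRevID[rev.docID] = rev.revID
        return .accepted
    }

    func activeRevision(of docID: String) -> String? { return activeRevID[docID] }
}

let gateway = ToySyncGateway()
let create = gateway.push(IncomingRevision(docID: "user1", revID: "1-a", parentRevID: nil))
let fromClientA = gateway.push(IncomingRevision(docID: "user1", revID: "2-a", parentRevID: "1-a"))
// Client B edited revision 1-a as well, but client A got there first.
let fromClientB = gateway.push(IncomingRevision(docID: "user1", revID: "2-b", parentRevID: "1-a"))

print(create, fromClientA, fromClientB)  // accepted accepted rejectedWithConflict
```

After the rejection, the server still holds only client A's revision, which is why client B has to pull before the conflict can be resolved.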

Pull Replication

During a pull replication, if the client detects a conflict, the conflict is resolved using the following deterministic criteria:

  • Deletes always win. An example of this case is shown below.

  • Otherwise, the most recent change (highest generation ID) wins. If the generations are the same, the revision with the max revID wins, i.e. the revID that sorts higher in a simple ASCII comparison. An example of this is shown below.
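These criteria translate into a short comparison function. The sketch below is an illustration and not Couchbase Lite source code: the `Revision` type and `pickWinner` function are hypothetical, and the case where both revisions are deletions (which the criteria above do not spell out) is assumed to fall through to the generation and revID checks.

```swift
// Hypothetical model of the deterministic criteria; NOT Couchbase Lite code.
struct Revision {
    let revID: String
    let generation: Int
    let isDeleted: Bool
}

func pickWinner(_ a: Revision, _ b: Revision) -> Revision {
    // 1. Deletes always win.
    if a.isDeleted != b.isDeleted { return a.isDeleted ? a : b }
    // 2. Otherwise the higher generation wins.
    if a.generation != b.generation { return a.generation > b.generation ? a : b }
    // 3. Same generation: the revID that sorts higher byte by byte wins.
    return b.revID.utf8.lexicographicallyPrecedes(a.revID.utf8) ? a : b
}

let deleted = Revision(revID: "2-x", generation: 2, isDeleted: true)
let newerLive = Revision(revID: "3-y", generation: 3, isDeleted: false)
print(pickWinner(deleted, newerLive).revID)  // 2-x (the delete wins)

let older = Revision(revID: "2-z", generation: 2, isDeleted: false)
print(pickWinner(older, newerLive).revID)    // 3-y (higher generation)

let rev2a = Revision(revID: "2-a", generation: 2, isDeleted: false)
let rev2b = Revision(revID: "2-b", generation: 2, isDeleted: false)
print(pickWinner(rev2a, rev2b).revID)        // 2-b (higher revID)
```

Because the function depends only on the two revisions, every client that sees the same pair picks the same winner, which is what makes the outcome deterministic.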

 

We will now examine the two cases that can arise during pull replication:

Server Branch is the Winner

  1. Client pulls the revision changes from the Sync Gateway
  2. Client detects that the incoming revision is in conflict with the current saved revision (i.e. the incoming revision and saved revision share a common parent)
  3. Client calls the conflict resolver function which determines the winner between the current saved revision and the server revision.
    1. Since the incoming revision (Rev2-B) from the server is more recent than the local revision (Rev2-A), the server revision is selected as the winner. Note that this would also be the case if the revision on the server was deleted.
  4. The server branch is grafted to the local revision tree (and the local branch is tombstoned)
  5. Subsequently, when changes are pushed to the server, there will be no conflicts and the server syncs up with the client side.

Local Branch is the Winner

  1. Client pulls the revision changes from the Sync Gateway
  2. Client detects that the incoming revision is in conflict with the current saved revision (i.e. the incoming revision and saved revision share a common parent)
  3. Client calls the conflict resolver function which determines the winner between the current saved revision and the server revision.
    1. Since the local revision (Rev2-B) is more recent than the incoming server revision (Rev2-A), the local revision is selected as the winner. Note that this would also be the case if the revision on the local branch was deleted.
  4. The server branch is grafted to the local revision tree and a new revision, Rev3, is added which corresponds to the contents of the winning revision, Rev2-B.
  5. Subsequently, when changes are pushed to the server, there will be no conflicts and the server syncs up with the client side.
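The two pull cases can be sketched together. This is a toy model and not Couchbase Lite code: `Revision`, `PullOutcome`, `serverWins` and `resolveOnPull` are hypothetical names, the revID given to the new revision is made up, and attaching that new revision as a child of the server revision is my reading of why the later push no longer conflicts.

```swift
// Toy model of client-side resolution during a pull; hypothetical names throughout.
struct Revision {
    let revID: String
    let parentRevID: String?
    let generation: Int
    let body: String
    let isDeleted: Bool
}

struct PullOutcome {
    let current: Revision   // revision the local database ends up with
    let needsPush: Bool     // true if a new revision has to go up to the server
}

// Criteria from the Pull Replication section: deletes win, then the higher
// generation, then the revID that sorts higher byte by byte.
func serverWins(local: Revision, server: Revision) -> Bool {
    if local.isDeleted != server.isDeleted { return server.isDeleted }
    if local.generation != server.generation { return server.generation > local.generation }
    return local.revID.utf8.lexicographicallyPrecedes(server.revID.utf8)
}

func resolveOnPull(local: Revision, server: Revision) -> PullOutcome {
    if serverWins(local: local, server: server) {
        // The server branch is grafted in and becomes current; the local
        // branch is tombstoned, so there is nothing new to push.
        return PullOutcome(current: server, needsPush: false)
    }
    // Local wins: the server branch is still grafted in, and a new revision
    // carrying the local content is added (assumed here: as its child).
    let rev3 = Revision(revID: "\(server.generation + 1)-resolved",
                        parentRevID: server.revID,
                        generation: server.generation + 1,
                        body: local.body,
                        isDeleted: local.isDeleted)
    return PullOutcome(current: rev3, needsPush: true)
}

let localRev = Revision(revID: "2-b", parentRevID: "1-a", generation: 2,
                        body: "local edit", isDeleted: false)
let serverRev = Revision(revID: "2-a", parentRevID: "1-a", generation: 2,
                         body: "server edit", isDeleted: false)

let localWinsOutcome = resolveOnPull(local: localRev, server: serverRev)
print(localWinsOutcome.current.revID, localWinsOutcome.current.body,
      localWinsOutcome.needsPush)   // 3-resolved local edit true

// Swapping the roles gives the "server branch is the winner" case.
let serverWinsOutcome = resolveOnPull(local: serverRev, server: localRev)
print(serverWinsOutcome.current.revID, serverWinsOutcome.needsPush)  // 2-b false
```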

NOTE: As you have observed, conflicts are resolved on Couchbase Lite during a pull replication. The implication is that if the replicator is configured as a push-only replicator, Couchbase Lite’s view of the data will diverge from Sync Gateway’s (since some of Couchbase Lite’s attempted writes would fail with a 409). Hence, if you foresee conflicts occurring, it is recommended that the Couchbase Lite replicator be configured in push-pull mode when used in Conflict Free mode.

 

Configuring Conflict Free Mode on Sync Gateway

“Conflict Free” mode on Sync Gateway is configurable through the allow_conflicts property in the Sync Gateway configuration file. It must be set to “false” to enable conflict-free mode. Note that regardless of how this property is configured, no conflicting revisions are added to the Sync Gateway when syncing with Couchbase Lite 2.0 clients, as discussed earlier in the section on replication. The allow_conflicts configuration only has implications for non-2.0 clients and REST API clients. The table below summarizes the implications.

[Table: allow_conflicts configuration]
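The rule described in the paragraph above can be written as a small decision function. This is a sketch of my reading of that paragraph, not Sync Gateway code: `ClientKind` and `storesConflictingRevision` are hypothetical names.

```swift
// Hypothetical model of the rule described above; NOT Sync Gateway code.
enum ClientKind { case couchbaseLite2, olderClientOrREST }

// Returns true when Sync Gateway would store a revision that conflicts with
// the active one, and false when it would reject it.
func storesConflictingRevision(allowConflicts: Bool, client: ClientKind) -> Bool {
    switch client {
    case .couchbaseLite2:
        return false            // never, regardless of allow_conflicts
    case .olderClientOrREST:
        return allowConflicts   // only these clients are affected by the setting
    }
}

for allowConflicts in [true, false] {
    for client in [ClientKind.couchbaseLite2, .olderClientOrREST] {
        let stored = storesConflictingRevision(allowConflicts: allowConflicts, client: client)
        print("allow_conflicts=\(allowConflicts) client=\(client) storesConflict=\(stored)")
    }
}
```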

Impact of Conflict Free mode on Database sizes

The impact of conflict resolution on database size was discussed in detail in our earlier post on managing database sizes. In a system that allows conflicts, the revision tree grows as the number of conflicting revisions grows, which increases the size of the database, so it was important to resolve conflicts in a timely fashion. Sync Gateway has a revs_limit property which determines the size of the revision tree. It defaults to 1000, which means that the metadata corresponding to the last 1000 revisions is stored in Sync Gateway before being pruned away. While setting revs_limit to a large value would negatively impact the database size, it was also important not to set it to a very low value: the metadata corresponding to older revisions was needed to handle conflict resolution, and without it you ran the risk of having a forest of disconnected revision trees for a document. So the minimum permissible value of revs_limit is 20.

However, with conflict-free mode, there is no real need to save the metadata corresponding to older revisions. This implies that revs_limit could be as low as 1, in which case only the latest (active) revision is stored. This brings big savings in database size.

Acknowledgements

Special thanks to Pasin Suriyentrakorn from the Mobile Engineering team for his feedback on the Couchbase Lite sections, to Adam Fraser, also from the Mobile Engineering team, for his feedback on the Sync Gateway sections, and to Daniel Petersen from the Mobile Engineering team for his feedback.

What Next

This blog post discussed how conflicts are automatically handled in Couchbase Mobile 2.0. You can download Couchbase Mobile 2.0 from our downloads page.

If you have questions or feedback, please leave a comment below or feel free to reach out to me on Twitter at @rajagp or email me at priya.rajagopal@couchbase.com. The Couchbase Forums are another great place to reach out with questions.

 

Posted by Priya Rajagopal , Principal Product Manager

Priya Rajagopal is a Principal Product Manager for Couchbase focusing on the Couchbase Mobile platform. She has been professionally developing software for over 19 years in several technical leadership positions, with the last 7 years focussed on mobile technologies. She has spent a decade in software R&D and is a co-inventor on 22 technology related patents.

11 Comments

  1. Hi Priya,

    If Mobile 2.0 introduces the new “Conflict Free” mode, what is the alternative mode? Especially since Gateway only support “Conflict Free” as EE. How would conflicts be resolved/not resolved on the Community edition?

    Using ConcurrencyControl it’s possible to handle and resolve conflicts manually on Save() at the local client side. For conflicts that result from replication you say:
    > Client calls the conflict resolver function
    Where is this conflict resolver function set? Beta 2 doesn’t have the old ConflictResolver on the ReplicatorConfiguration. Does this refer to the internal automatic resolver, or is this something that can be resolved manually? If this is automatic only, would it not be strange that internal local save conflicts can be handled in a custom way, but external replicated conflicts can not?
    Also, how would this be handled differently when you don’t have “Conflict Free” (EE) mode?

    Thank you.

  2. Hi!
    In CE, conflicts on the SGW would be handled the way they were in 1.x. Basically, you would not get the 409 error on a push conflict.
    Yes, this is the internal conflict resolver. You cannot override that in 2.0.
    You are right in your observation that conflicts on local saves can be detected by the app, but you cannot do that for conflicts resulting from replication. But also note that the default conflict resolution policy is different in the two cases: unlike the local save case, in the replication case it is deterministic, and we think that in most cases users would prefer a deterministic state.

    The objective is to be “conflict free”. Hence, in both cases, the conflict resolution happens automatically at the time that the document is updated (local or remote pull). With that goal in mind, unlike the former case, where we can reject a local save, you cannot quite do the same on a pull replication.

    Now, that said, we can consider providing an option to override the automatic conflict resolver. That will be post 2.0, depending on user demand. Do you see the need for that?

    1. In the 2.0 documentation it states:
      > Starting in Couchbase Lite 2.0, document conflicts must be resolved using automatic conflict resolution or in the application.
      So if the CE is meant to use the v1.x method instead, this is not true?

      All the items used in v1.x to manage revisions and conflicts, such as conflict resolvers, document Revision class, etc. seem to all have been removed from the 2.x API. So if the 2.x CE is meant to use the v1.x method, how? Everything seems to have been designed for the new automatic method only. The CE Gateway would still have multiple revisions for documents in a conflicted state, until they get resolved somewhere. So then clients need to be able to see which documents are in a conflicted state, and what revisions exist.

      I do see a need to override the automatic conflict resolver, since you allow to override it on the client save() method. Sure, if you employ continuous replication, then conflicts are most likely to occur on save, since changes are replicated instantly (hopefully). But if you are mostly working offline, and only occasionally replicating, then conflicts are mostly going to occur when replicating, not on save(). In that scenario having the advanced custom merging capabilities on save() are much less useful, and you would be ‘stuck’ with just the default to the longest revision tree. It would seem to me that if you wanted to handle the conflict on save(), you would also want to ability to handle the conflict from replication the same way.

      Sure the technical problem is different, since you can’t ‘reject’ the pull. In v1.x this was solved with having the multiple revisions in a conflicted state, and then resolving it. I think you would still want to have this option in v2.x somehow. I don’t think the technical challenge of implementing this should be the reason to cripple the functionality compared to v1.x, and compared to the capabilities of save().

  3. The third panel of the first diagram (in “Conflicts At Couchbase Lite”) features the caption: “App tries to save locally modified version of document. Document is not in conflict!”

    The illustration seems to clearly depict a conflict. Is there an error in the caption?

  4. Further to avia_bdg’s comment above, I would like some clarification on the conflict resolution strategy in regard to replication. By my reading of the article and the dialogue above, it sounds as though the current design of CBL 2.0 virtually guarantees data loss by offline clients that use Sync Gateway. This would make the product completely unusable.

    Without a means of being able to deterministically resolve conflicts that result from a pull replication, the client rolls the dice on whether its changes will be thrown out. Merges—a fundamental requirement in many applications, including my principal project, where we currently use CBL 1.x—are no longer possible. That’s a show-stopper.

    Please assure me that I’ve misunderstood.

    1. Thank you for your feedback. Your understanding is right. We will evaluate bringing this in…

  5. Very good questions!
    The simpler one first – We will evaluate the option of overriding the custom conflict resolution policy on replication – but that would be post 2.0.
    Now for the other question –
    To clarify, Couchbase Lite 2.0 clients will always use conflict free mode (CE or EE). The distinction between CE and EE is on the SGW side.
    For the gory details –
    To clarify the distinction on the SGW side, pushes from 2.0 clients will always be in “conflict free” mode. The implication of this, without going into too much detail on the protocol, is that the SGW (EE and CE) will reject the revision if it causes a conflict (as discussed in the blog post). The CE versus EE distinction applies when updates come in from non CBL 2.0 clients / REST API clients. EE SGW will prevent conflicts from any of these clients, but you can have conflicts in SGW CE mode when interacting with non CBL 2.0 clients. Those conflicts will be handled the usual 1.x way, and CBL 2.0 clients will pull the winning revision from the server.

    1. Luckily I don’t think we’ll have this 1.x/2.x mixing problem, so we should be ok with CE then.

      I think Ben and I pretty much agree that we need some way to custom resolve conflicts resulting from pull replications.

      We are applying CBL on Xamarin mobile, on devices that will be offline in sparse internet availability environments and one-shot replicate occasionally. Since this is a mobile app we are looking at having an autosave feature for our data form – in case the user suspends the app and it is closed, wants to make partial edits,… This means potentially a high rate of revisions, which would be stored locally only until they are pushed. I ran some tests, and this means that if I edit with autosave 2 or 3 times, a single edit made offline on another device made at a later date will lose because fewer revisions exist. This is not going to be acceptable. The last edit (by datetime) would be better, custom code to do some merging would be even better, number of edits (the current automatic resolver algorithm) is not ok for this.

      Even if we change from instant autosave to fewer saves or manual saves, the problem remains: the automatic conflict resolution algorithm is only going to work for online scenarios, and not be acceptable for offline scenarios, where clients may be offline for shorter or longer times (weeks or months), and make many edits while offline, or make edits at different ratios and amounts from other offline clients, and more numerous older edits may throw out fewer newer edits.

      1. Thank you for the details. Your feedback is helpful. We will evaluate this further …will keep you posted

      2. Quick update : I removed the note on CE/ EE ….I added a new section on SGW configuration to clarify the same. There isn’t a distinction at CE/EE level.
