As a Developer Advocate at Couchbase, I go to many events and talk with a great many developers. When talking about Couchbase, I get a lot of questions about the durability of the data being created. Couchbase is a memory-first, persist-to-disk-later database: every operation happens against the memory layer before it asynchronously reaches the disk, which keeps things incredibly fast. The question that always comes up is: how can you be sure the data is saved in the event of a node failure?

This is a great question, and we’re going to look at how to guard against this problem should it ever happen.

We’re going to see a few examples, all very similar, around different scenarios of data durability. These examples will be in Node.js, but the commands will be similar regardless of which Couchbase SDK you wish to use.

Let’s start with the first scenario. We have a massively multiplayer online role-playing game (MMORPG) like World of Warcraft or similar, and we need to periodically save the game state for a particular player so that when they sign back in later, their position, statistics, and items are all preserved. In this scenario, because saves happen frequently and latency is everything, saving to the cache is probably good enough. We can write code like the following:
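The snippet below is a minimal sketch of such a save rather than a definitive implementation. It assumes the 2.x line of the Couchbase Node.js SDK, where `bucket` would come from something like `(new couchbase.Cluster("couchbase://localhost")).openBucket("default")`. The function name `savePlayerState`, the `player::` key prefix, and the shape of the state document are hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper. `bucket` is assumed to be an already opened
// Couchbase bucket from the 2.x Node.js SDK.
function savePlayerState(bucket, playerId, state, callback) {
    // No durability options are passed, so the callback fires as soon as
    // the active node has accepted the write into memory.
    bucket.upsert("player::" + playerId, state, function (error, result) {
        if (error) {
            return callback(error, null);
        }
        callback(null, result);
    });
}
```

Because the bucket is passed in as an argument, the same function can be exercised against a stub object in tests without a running cluster.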

The player state is saved and the game receives a response saying it is alright to move along. If the node fails before the save has hit the disk or been replicated, it probably isn’t too big a deal because of the frequent saves.

Now let’s come up with another scenario. Let’s say we are creating user accounts for our application. While we could rely on the memory layer, we want a little more confidence that our data will reach the disk. After all, if the node fails, we can’t recover what was only in memory, but we can recover what was on disk. So instead, we adjust our operation a bit:
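A sketch of the adjusted operation, under the same assumptions as before (the 2.x Node.js SDK and an already opened `bucket`), might look like this. The `persist_to` option is the one this article discusses. The function name `createUserAccount`, the `user::` key prefix, and the account fields are hypothetical.

```javascript
// Hypothetical helper. `bucket` is assumed to be an already opened
// Couchbase bucket from the 2.x Node.js SDK.
function createUserAccount(bucket, account, callback) {
    // persist_to: 1 asks the SDK to wait until the document has been
    // written to disk on one node before invoking the callback.
    var options = { persist_to: 1 };
    bucket.insert("user::" + account.username, account, options, function (error, result) {
        if (error) {
            return callback(error, null);
        }
        callback(null, result);
    });
}
```

Newer SDK releases expose durability settings under different option names, so it is worth checking the documentation for the version in use.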

We’ve added persist_to, which lets us specify how many nodes must persist the data to disk before a response is returned. In our case, knowing that the data sits on disk on a single node is enough.

Now let’s come up with our third and final scenario. Let’s say we have created a banking application. It is critical that our transactions exist on disk, and it is critical that this data be replicated in case of node or disk failure. Losing any of a customer’s banking data could result in massive penalties.

With that said, our creation operation changes a bit:
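As a hedged sketch of that stricter creation, again assuming the 2.x Node.js SDK and an already opened `bucket`, both `persist_to` and `replicate_to` can be passed in the options object. The function name `recordTransaction`, the `transaction::` key prefix, and the document fields are hypothetical.

```javascript
// Hypothetical helper. `bucket` is assumed to be an already opened
// Couchbase bucket from the 2.x Node.js SDK.
function recordTransaction(bucket, transactionId, transaction, callback) {
    // Wait for the document to be on disk on one node and copied to two
    // replica nodes before the callback is invoked.
    var options = { persist_to: 1, replicate_to: 2 };
    bucket.insert("transaction::" + transactionId, transaction, options, function (error, result) {
        if (error) {
            // A durability failure surfaces here; the caller decides
            // whether to retry or report it.
            return callback(error, null);
        }
        callback(null, result);
    });
}
```

Note that replicating to two nodes presumes the bucket is configured with at least two replicas. With fewer, the requirement cannot be met and the operation would be expected to report an error.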

In the above code we are saying that after we create a document, we want it to be persisted to at least one disk and replicated to at least two other nodes before we receive a response. We can adjust these numbers as necessary, but remember, the higher the durability requirement, the slower the response from the operation.

Conclusion

There is quite a bit you can accomplish with Couchbase. While many situations won’t need you to define your own durability requirements, being able to decide through the SDK whether your data should reach cache, disk, or replica is a good thing. Just keep in mind that the stronger your requirements, the slower the response, not because Couchbase is slow to do these things, but because more has to happen before the operation returns.

If you want to learn more about using the Couchbase Node.js SDK, check out the SDK documentation. If you want to learn more about data durability, check out the Couchbase Developer Portal.

Author

Posted by Nic Raboy, Developer Advocate, Couchbase

Nic Raboy is an advocate of modern web and mobile development technologies. He has experience in Java, JavaScript, Golang and a variety of frameworks such as Angular, NativeScript, and Apache Cordova. Nic writes about his development experiences related to making web and mobile development easier to understand.
