What separates a document store from a key-value store?

A document database can query and manipulate the contents of what it stores, whereas the value remains an opaque mystery to the key-value store.

With Couchbase Server 4.5, we’ve introduced our new sub-document API to give you greater query and mutation access to the meat of your documents.

Sub-doc basics

So, what’s the tl;dr of Couchbase’s new sub-document features?

  • Retrieve parts of a document.
  • Change parts of a document by sending the changes to Couchbase Server.
  • Available now in all official Couchbase Server SDKs.
  • Complements the querying offered by views and the querying and mutation offered by N1QL.

Okay, let’s get onto the why.

Why sub-document operations?

Let’s say we have a 1 MB JSON document in Couchbase that stores a chat log between two people. Each time the chat participants write something new, we want to add it to the log document.

Without sub-document operations, we’d need to:

  1. fetch the entire document
  2. deserialise it in our application
  3. add the new line of the conversation
  4. serialise it back to JSON
  5. send the full document back to Couchbase to be stored.

Using Couchbase’s Python client, it would look something like this:
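As a minimal sketch of those five steps, the snippet below swaps the real bucket for a hypothetical in-memory dictionary called `store`, so it runs without a cluster. The document key, the `messages` field and the `append_message_full_doc` helper are all invented for illustration. In a real client, steps 1 and 5 would each be a network call carrying the whole document.

```python
import json

# Hypothetical stand-in for a bucket: document keys mapped to JSON text.
# A real client would move these strings across the network.
store = {"chat::alice::bob": json.dumps({"messages": ["hi", "hello"]})}


def append_message_full_doc(store, key, message):
    """Append a chat line by rewriting the whole document.

    Returns the number of bytes that would have crossed the wire.
    """
    raw = store[key]                  # 1. fetch the entire document
    doc = json.loads(raw)             # 2. deserialise it
    doc["messages"].append(message)   # 3. add the new line
    updated = json.dumps(doc)         # 4. serialise it back to JSON
    store[key] = updated              # 5. send the full document back
    return len(raw.encode("utf-8")) + len(updated.encode("utf-8"))


bytes_moved = append_message_full_doc(store, "chat::alice::bob", "how are you?")
print(json.loads(store["chat::alice::bob"])["messages"])
print(bytes_moved)
```

The byte count grows with the size of the document, not the size of the new message, which is the cost the rest of this post is about.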

Seems pretty wasteful. Now consider the speed at which people tend to chat in instant messaging conversations. Then perhaps think of how many concurrent chats could be taking place.

Couchbase Server can happily play its part there, but this approach puts an unnecessary burden on your network and makes you maintain application code to handle what you might reasonably consider database-layer functions.

With sub-document operations, we get to do a lot less. Adding a new line to a chat log would look more like this, again in Python:
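Here is a minimal sketch of the idea, again using a hypothetical in-memory stand-in (`server_docs`) rather than a live cluster. The `send_append` and `array_append` helpers are invented for this example. With the 2.x Python SDK, the equivalent call should be along the lines of `bucket.mutate_in(key, SD.array_append(path, value))`, with `SD` being the `couchbase.subdocument` module, but treat that as a pointer to the SDK documentation rather than a tested listing.

```python
import json

# Hypothetical stand-in for the server side: document keys mapped to parsed JSON.
server_docs = {"chat::alice::bob": {"messages": ["hi", "hello"]}}


def array_append(server_docs, key, path, value):
    """Append value to the array at a dotted path. This part runs 'on the server'."""
    target = server_docs[key]
    for field in path.split("."):
        target = target[field]
    target.append(value)


def send_append(key, path, value):
    """Client side: only the key, the path and the new value cross the wire."""
    payload = json.dumps({"key": key, "path": path, "value": value})
    request = json.loads(payload)  # what the server receives
    array_append(server_docs, request["key"], request["path"], request["value"])
    return len(payload.encode("utf-8"))


bytes_sent = send_append("chat::alice::bob", "messages", "how are you?")
print(server_docs["chat::alice::bob"]["messages"])
print(bytes_sent)
```

Here the request size depends only on the key, the path and the new line, however large the chat log has become.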

We send the new line to Couchbase Server, specifying the document and where in the document to make the change.

That way, we avoid a round-trip, save network bandwidth by sending only what has changed, and move responsibility for updating the document from the application to the database layer.

How sub-document operations work

Your Couchbase SDK uses three APIs to work with Couchbase Server:

  • key-value, using the memcached binary protocol
  • views, using the CouchDB REST protocol
  • N1QL, using N1QL’s REST protocol.

The SDK abstracts away the details but, if you’ve worked with Couchbase Server, you know which type of interaction you’re having.

Sub-document operations happen using an extension of the memcached protocol that handles your key-value operations.

Finding your sub-document

To perform a sub-document operation you need two things:

  • the document key
  • the path of the sub-document within the parent document.

What do we mean by path?

The path is the location within the document where you want to operate. It uses the same notation as N1QL: dots separate nested field names and square brackets index into arrays.

Let’s say we’re working with customer profiles for a simple online store. Here’s an example profile document:

The path to the country field in our customer’s delivery address would be:

Similarly, if the purchases are listed new to old, to find the id of the customer’s second most recent purchase we’d use:

So, our sub-document is whichever portion of the document is found at the path we provide. It could be an enormous object or a single index in an array.
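To make the idea of a path concrete, here is a small sketch that resolves a path against a document in plain Python. The `profile` document, its field names and the `resolve` helper are all hypothetical, invented for this example. It handles only dotted field names and numeric array indexes, which is a simplification of what the server accepts.

```python
import re

# Hypothetical customer profile, invented for this sketch.
profile = {
    "name": "A. Customer",
    "addresses": {
        "billing": {"city": "Leeds", "country": "United Kingdom"},
        "delivery": {"city": "Lyon", "country": "France"},
    },
    # Purchases listed new to old.
    "purchases": [{"id": 1003}, {"id": 1002}, {"id": 1001}],
}

# A path token is either a field name or an array index such as [1].
TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve(doc, path):
    """Walk the document one token at a time and return the sub-document found."""
    current = doc
    for name, index in TOKEN.findall(path):
        current = current[int(index)] if index else current[name]
    return current


print(resolve(profile, "addresses.delivery.country"))  # France
print(resolve(profile, "purchases[1].id"))             # 1002
print(resolve(profile, "addresses.billing"))           # a whole nested object
```

The last call shows that a sub-document need not be a single value: the same path notation can select an entire object.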

Using sub-document operations

At the bucket level, things are really simple. There are just two new methods:

  • lookupIn(key)
  • mutateIn(key).

In the Java SDK, they are implemented using the builder pattern. These two bucket methods each create an object, tied to a particular document, on which we do the actual work of querying or changing our sub-document.

Let’s take a look at an example in Java, where we’re finding the delivery address of our customer.

First up, we’re creating a builder object that is primed with our customer’s document. We can then perform the sub-document operations on that builder object.

If we’re using the sample JSON document from earlier in this post, results will be:

We can simplify this query by chaining the whole thing together:

As we build more complex queries and mutations, we can chain many operations together. In the Java SDK, doLookup denotes the end of the query and sends it for execution.
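To show how a builder with a terminating call hangs together, here is a toy version of the pattern in Python. This is not the Java SDK: `LookupBuilder`, `lookup_in`, `do_lookup` and the sample document are hypothetical names written for this sketch, and the lookup runs against an in-memory dictionary. The point is only that each `get` records a path and returns the builder, and nothing is executed until the final call.

```python
class LookupBuilder:
    """Collects lookup paths for one document and runs them on do_lookup()."""

    def __init__(self, docs, key):
        self._doc = docs[key]
        self._paths = []

    def get(self, path):
        self._paths.append(path)
        return self  # returning self is what makes chaining possible

    def do_lookup(self):
        """Execute every recorded lookup and return the results keyed by path."""
        results = {}
        for path in self._paths:
            current = self._doc
            for field in path.split("."):
                current = current[field]
            results[path] = current
        return results


def lookup_in(docs, key):
    """Entry point: create a builder tied to a single document."""
    return LookupBuilder(docs, key)


# Hypothetical data for the example.
docs = {
    "customer123": {
        "name": "A. Customer",
        "addresses": {"delivery": {"city": "Lyon", "country": "France"}},
    }
}

result = lookup_in(docs, "customer123").get("addresses.delivery").get("name").do_lookup()
print(result)
```

Because the lookups are only collected until `do_lookup` is called, a real client can send them all to the server in one request.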

As we saw in the instant messaging example earlier, we can also modify documents using the sub-document API.

Let’s say we want to change some parts of our customer’s billing address. Back in Python, we could use the sub-document API to do it like this:

Just as in Java, we first select the document by its key and then perform a series of operations on its contents.
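As a rough sketch of that "select the document, then apply a series of operations" shape, the snippet below runs several mutations against a hypothetical in-memory document. The `upsert`, `remove` and `mutate_in` helpers here are written for this example and only imitate the style of the SDK; the document and its fields are invented too. The sketch applies the operations to a copy and swaps it in only if every one succeeds, which is a design choice of the sketch rather than a statement about the server.

```python
import copy

# Hypothetical customer document, invented for this sketch.
docs = {
    "customer123": {
        "addresses": {
            "billing": {
                "line1": "1 Old Road",
                "line2": "Flat 2",
                "country": "United Kingdom",
            }
        }
    }
}


def upsert(path, value):
    """Describe an operation that sets the field at path, creating it if needed."""
    return ("upsert", path, value)


def remove(path):
    """Describe an operation that deletes the field at path."""
    return ("remove", path, None)


def mutate_in(docs, key, *specs):
    """Apply every spec to a copy of the document, then store the copy."""
    working = copy.deepcopy(docs[key])
    for op, path, value in specs:
        *parents, leaf = path.split(".")
        target = working
        for field in parents:
            target = target[field]  # raises KeyError if a parent is missing
        if op == "upsert":
            target[leaf] = value
        elif op == "remove":
            del target[leaf]        # raises KeyError if the field is missing
        else:
            raise ValueError("unknown operation: " + op)
    docs[key] = working             # reached only if every operation succeeded


mutate_in(
    docs,
    "customer123",
    upsert("addresses.billing.line1", "2 New Street"),
    remove("addresses.billing.line2"),
    upsert("addresses.billing.country", "Ireland"),
)
print(docs["customer123"]["addresses"]["billing"])
```

If any operation fails partway through, the stored document is left exactly as it was.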

Next steps

For details on how to use the sub-document API in your preferred language, check out the updated SDK documentation.

You can also read Mark’s post, coming soon, which goes into more detail on using the sub-document API.

Author

Posted by Matthew Revell, Lead Developer Advocate, EMEA, Couchbase

Matthew Revell is Lead Developer Advocate, EMEA, at Couchbase. He developed a global strategy for putting Couchbase front of mind among developers.
