Not too long ago I wrote about using the Couchbase Server subdocument API with the Go SDK. Subdocument operations are incredibly useful if you’d like to change or access part of a potentially huge NoSQL document. Because only the affected fragment is sent rather than the whole document, they save network resources and are great for performance.

A colleague of mine asked how one might do bulk subdocument mutations by key, similar to what I demonstrated in my tutorial titled Using Golang to get Multiple Couchbase Documents by Key in a Single Operation. The short answer is that you can’t with a single operation, but because Golang is so incredibly fast and awesome, you can do things in parallel and get the same results.

We’re going to see how to asynchronously perform subdocument mutations based on a list of document ids with the Go programming language and Couchbase.

Let’s pick a real-world scenario that we want to accomplish. Say we have a professional networking website that receives tens of millions of requests a second. Performance is a must, so we decide to batch some of those requests together as best we can. Say 100 people just updated their profiles to include Golang as one of their skills. We want to append it to each person’s list of skills.

Going forward, we’re going to be doing everything in a main.go file somewhere in our $GOPATH. Open this file and include the following:

Let’s break down what we have so far.

Because we plan to do things in parallel with goroutines, we need to know when it is safe to terminate our application. The WaitGroup allows us to keep track of our asynchronous tasks and wait until they’ve all finished before continuing. Since we’re going to be processing data with goroutines, we’ll need a channel that all goroutines can use at the same time. Each of our goroutines will be an instance of the worker function.

In the main method we are declaring all the keys that will receive the mutation. In a production scenario, the application business logic will probably aggregate this list of keys.

We’re also establishing a connection to Couchbase.

With the foundation in place, let’s take a further look at the main method.

In Go, we can spin up a ridiculously large number of goroutines that will run in parallel. Of course the real number you can spin up is dependent on your hardware, but for now, let’s be conservative with two. For every worker that we start, we increment the WaitGroup counter. As these goroutines stop, the counter decreases, which will eventually unblock the application and allow it to terminate.

You’ll also notice that we’ve added a channel for our string data. Each of our desired document ids is added to the channel and then the channel is closed. You’ll see why we do this when we define the worker logic.

The above snippet is our worker function logic. When the function terminates, the deferred call executes, which decrements the WaitGroup counter.

Each worker runs in a loop until there is nothing left to process. Each loop iteration takes an id from the data channel. If ok is false, the channel has been closed and emptied, so we end the loop. If we do get an id, we do a mutation on that document and append a new string at the skills path, which we assume to be an array. If the array does not exist within the document, one will be created.

If there is an error for any reason, for example because the key doesn’t exist, we print that an error happened.

The full code to this simple demo is as follows:

Again, these subdocument mutations to the skills array happen in parallel through goroutines. For more information on using goroutines to do things concurrently, check out a previous tutorial I wrote on the subject titled, Concurrent Golang Applications with Goroutines and Channels.

Conclusion

You just saw another demo of doing subdocument mutations with Couchbase and Golang. This time we explored doing things in parallel instead of trying to use one of the bulk operators. By doing things in parallel we get nearly the same performance as doing bulk operations on a list of keys.

Author

Posted by Nic Raboy, Developer Advocate, Couchbase

Nic Raboy is an advocate of modern web and mobile development technologies. He has experience in Java, JavaScript, Golang and a variety of frameworks such as Angular, NativeScript, and Apache Cordova. Nic writes about his development experiences related to making web and mobile development easier to understand.

One Comment

  1. Hi Nic, great article. My use case is to perform bulk CRUD operations on around a hundred sub-documents, given a list of document keys. As you said and concluded in the article:

    A colleague of mine asked how one might do bulk subdocument mutations by key similar to how I demonstrated in my tutorial titled, Using Golang to get Multiple Couchbase Documents by Key in a Single Operation. The short answer is that you can’t with a single operation, but because Golang is so incredibly fast and awesome, you could do things in parallel and get the same results.

    This time we explored doing things in parallel instead of trying to use one of the bulk operators. By doing things in parallel we get nearly the same performance as doing bulk operations on a list of keys.

    I want to confirm: since we will be performing hundreds of network calls anyway, how is this efficient?
