Couchbase 4.x has a feature that can really help with Global Secondary Index (GSI) creation, and many people do not know about it. It is the ability to queue index creation statements and then build the indexes in parallel for efficiency and performance.

If you need to create multiple indexes at one time, you can defer the creation of the indexes. Then, when you are ready, you tell the cluster to build those indexes. For example, say I need to create two indexes, but I also want high availability for them, so I need a copy of each index on each of my two Index Service (IS) nodes. In other words, four CREATE INDEX statements in total, each with defer_build set to true.
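As a rough sketch of what those four statements could look like, the Python snippet below renders them as N1QL strings. The bucket name `travel-sample`, the index names, the indexed fields, and the node addresses `node1:8091` and `node2:8091` are placeholders assumed for illustration, not values from a real cluster. The `WITH` options use `nodes` to pin each copy to one IS node and `defer_build` to postpone the build.

```python
import json

# All names below are hypothetical placeholders for illustration.
BUCKET = "travel-sample"
IS_NODES = ["node1:8091", "node2:8091"]
INDEX_KEYS = {"idx_type": ["type"], "idx_name": ["name"]}


def create_deferred_index(name, bucket, fields, node):
    """Render a CREATE INDEX statement pinned to one node with the build deferred."""
    options = json.dumps({"nodes": [node], "defer_build": True})
    keys = ", ".join("`{}`".format(f) for f in fields)
    return "CREATE INDEX `{}` ON `{}`({}) USING GSI WITH {};".format(
        name, bucket, keys, options
    )


# One copy of each index per IS node: 2 indexes x 2 nodes = 4 statements.
statements = [
    create_deferred_index("{}_{}".format(base, i), BUCKET, fields, node)
    for base, fields in INDEX_KEYS.items()
    for i, node in enumerate(IS_NODES, start=1)
]

for statement in statements:
    print(statement)
```

The printed statements can then be run through cbq or whichever query client you use.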

If you did nothing else, these indexes would stay in deferred/created status. The next statement you need to run is BUILD INDEX, which tells the cluster to actually start the build.
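A minimal sketch of a single BUILD INDEX statement that names all four deferred indexes might be rendered like this. The bucket and index names are illustrative placeholders, not names from an actual deployment.

```python
def build_index_statement(bucket, index_names):
    """Render one BUILD INDEX statement that names every deferred index."""
    names = ", ".join("`{}`".format(n) for n in index_names)
    return "BUILD INDEX ON `{}`({}) USING GSI;".format(bucket, names)


# Hypothetical bucket and index names, for illustration only.
deferred = ["idx_type_1", "idx_type_2", "idx_name_1", "idx_name_2"]
build_statement = build_index_statement("travel-sample", deferred)
print(build_statement)
```

Listing every deferred index in one statement is what lets the cluster start all of the builds together.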

At this point the IS nodes will begin building the indexes in parallel and, even better, each node uses the same Database Change Protocol (DCP) stream from the Data Service nodes for all of its builds, which makes better use of network I/O. If you built these without deferring, you’d run one CREATE INDEX statement and wait while the Index Service node built that index from a DCP stream. When that one was complete, you’d run the next CREATE statement and wait again while another DCP stream carried the same data. You’d keep doing that until all four were complete, all serially.

With defer_build and then the BUILD INDEX statement, you submit all the CREATE statements at once and then go get coffee. Not only will the indexes be built in parallel on each IS node and across the two IS nodes, but each IS node will need only one DCP stream of the bucket’s data. So the whole process completes much faster and far more efficiently. This is especially useful when scripting the creation of a lot of indexes, since you can group them into batches.
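To illustrate the batching idea, here is a hedged sketch of a script that groups index definitions into batches and emits, for each batch, the deferred CREATE INDEX statements followed by one BUILD INDEX. The helper names, bucket, fields, node addresses, and batch size are all assumptions made for this example. It only generates the statements; running them, and waiting for one batch to finish building before submitting the next, is left to whatever client you use.

```python
import json


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batch_statements(bucket, definitions, batch_size):
    """Per batch: deferred CREATE INDEX statements, then one BUILD INDEX."""
    script = []
    for batch in batched(definitions, batch_size):
        for name, fields, node in batch:
            options = json.dumps({"nodes": [node], "defer_build": True})
            keys = ", ".join("`{}`".format(f) for f in fields)
            script.append(
                "CREATE INDEX `{}` ON `{}`({}) USING GSI WITH {};".format(
                    name, bucket, keys, options
                )
            )
        names = ", ".join("`{}`".format(name) for name, _, _ in batch)
        script.append("BUILD INDEX ON `{}`({}) USING GSI;".format(bucket, names))
    return script


# Hypothetical definitions: three indexes, each copied to two IS nodes.
definitions = [
    ("idx_{}_{}".format(field, i), [field], node)
    for field in ("type", "name", "city")
    for i, node in enumerate(["node1:8091", "node2:8091"], start=1)
]

script = batch_statements("travel-sample", definitions, batch_size=4)
for line in script:
    print(line)
```

With six definitions and a batch size of four, the script contains two batches: four CREATE statements and a BUILD, then two CREATE statements and a BUILD.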

Give it a try and report back. For more information refer to the documentation on CREATE INDEX and BUILD INDEX.

Author

Posted by Kirk Kirkconnell, Senior Solutions Engineer, Couchbase

Kirk Kirkconnell was a Senior Solutions Engineer at Couchbase working with customers in multiple capacities to assist them in architecting, deploying, and managing Couchbase. His expertise is in operations, hosting, and support of large-scale application and database infrastructures.
