Couchbase and Full-text Search: The Couchbase Transport for Elasticsearch

Marty Schoch of Couchbase

UPDATED October 9, 2013

The Couchbase Transport for ElasticSearch has been updated to work with Couchbase 2.2.0 and ElasticSearch 0.90.5. The detailed steps below have been updated to reflect the new version numbers.

Now available for your next content app. See http://blog.couchbase.com/announcing-release-couchbase-plug-elasticsearch

Couchbase Server 2.0 adds powerful indexing and querying capabilities through its distributed map-reduce implementation. Beyond that, many applications, particularly content applications, also need full-text search. Today we are releasing a developer preview of the Couchbase Transport Plugin for ElasticSearch. The plugin uses the new Cross Data Center Replication functionality, which will be a part of Couchbase Server 2.0. With this new transport you can get started with Couchbase and ElasticSearch easily, and this post explains how to have the integration up and running in minutes.

Getting Started

First, we'll need a Couchbase cluster. For our purposes a single instance running on your development machine is fine. Download and install the right package for your platform. I'll use the 64-bit RPM package of Couchbase Server 2.2.0:

$ sudo rpm --install couchbase-server-enterprise_2.2.0_x86_64_openssl098.rpm

While going through the install wizard you can use all the default settings. However, be sure to select the "beer-sample" sample bucket to follow along with the examples below.

Now we're ready to install ElasticSearch. The current version of the plug-in works with ElasticSearch version 0.90.5. Download the archive and unpack it:

$ curl -sLO \
http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.5.tar.gz
$ tar zxvf elasticsearch-0.90.5.tar.gz

Now that we have the ElasticSearch core, let's add some plugins.

The first is called "head". It provides a basic web UI to ElasticSearch so that we can more easily see what is going on. Installation is as simple as:

$ cd elasticsearch-0.90.5
$ bin/plugin -install mobz/elasticsearch-head
-> Installing mobz/elasticsearch-head...
Trying https://github.com/mobz/elasticsearch-head/zipball/master... (assuming site plugin)
Downloading .....................................DONE
Identified as a _site plugin, moving to _site structure ...
Installed head

Now we are ready to install the Couchbase plug-in. Installation is just as simple as any other plugin.

$ bin/plugin -install transport-couchbase -url http://packages.couchbase.com.s3.amazonaws.com/releases/elastic-search-adapter/1.2.0/elasticsearch-transport-couchbase-1.2.0.zip
-> Installing transport-couchbase...
Trying http://packages.couchbase.com.s3.amazonaws.com/releases/elastic-search-adapter/1.2.0/elasticsearch-transport-couchbase-1.2.0.zip...
Downloading .....................................DONE
Installed transport-couchbase

Now that the plug-in is installed we need to set the password for the Couchbase transport.

$ echo "couchbase.password: password" >> config/elasticsearch.yml

This sets the password to the literal string "password". The username defaults to "Administrator", but can be set with "couchbase.username".
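
If you want to set the username as well, both settings can be written in one step instead of using the echo command. This is a minimal sketch: the function name set_couchbase_credentials is made up for illustration, and only the two setting keys come from the plug-in itself.

```shell
# set_couchbase_credentials FILE USER PASSWORD
# Appends the two plug-in settings named in this post to an elasticsearch.yml file.
# The function name is hypothetical; values containing YAML special characters
# would need quoting, which this sketch does not attempt.
set_couchbase_credentials() {
  printf 'couchbase.username: %s\ncouchbase.password: %s\n' "$2" "$3" >> "$1"
}

# Example (run from the elasticsearch-0.90.5 directory):
#   set_couchbase_credentials config/elasticsearch.yml Administrator password
```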

All the plug-ins we need are installed, so let's go ahead and start ElasticSearch.

$ bin/elasticsearch

This starts ElasticSearch and puts it into the background.

Configuring ElasticSearch

With ElasticSearch running, we need to install an index template that sets up some good defaults for working with Couchbase. All of these settings can be overridden by your own templates, or updated on a per-index basis.

$ curl -XPUT http://localhost:9200/_template/couchbase -d @plugins/transport-couchbase/couchbase_template.json
{"ok":true,"acknowledged":true}

Finally, for each Couchbase bucket we want to search we'll need to create a corresponding index in ElasticSearch.

$ curl -XPUT http://localhost:9200/beer-sample
{"ok":true,"acknowledged":true}
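If you replicate more than one bucket, the same PUT has to be repeated for each index. A small sketch of one way to do that, assuming the index names match the bucket names; the helper name index_urls and the ES_URL variable are made up for illustration, and gamesim-sample merely stands in for a second bucket.

```shell
# Print one index URL per bucket name given as an argument.
# ES_URL is an assumed variable; it defaults to the address used in this post.
index_urls() {
  for bucket in "$@"; do
    printf '%s/%s\n' "${ES_URL:-http://localhost:9200}" "$bucket"
  done
}

# Create each index (run this only while ElasticSearch is up):
#   index_urls beer-sample gamesim-sample | while read -r url; do
#     curl -XPUT "$url"; echo
#   done
```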

That's it. Now we're ready to configure the Couchbase side.

Configuring Couchbase Server

The Couchbase plug-in for ElasticSearch works by taking advantage of the Cross Datacenter Replication (XDCR) feature introduced in Couchbase Server 2.0. The first step is to navigate to the Replications tab in the UI. Now click the Create Cluster Reference button. Name this cluster ElasticSearch and point it to the address 127.0.0.1:9091 (the default port used by this plug-in). The username field should be Administrator and the password field should match the value you configured in the previous section.

Finally, to start the replication press the Create Replication button. Select the beer-sample bucket, the remote cluster ElasticSearch and type in the name of the remote bucket beer-sample.

Once you press the Replicate button the data will begin transferring.
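
The same two UI steps can probably be scripted as well. The sketch below assumes the Couchbase Server 2.x REST endpoints /pools/default/remoteClusters and /controller/createReplication on the admin port 8091. The variable names, the run helper, and the admin credentials are assumptions for illustration, so adjust them to your own setup. With DRY_RUN=1 the commands are only printed.

```shell
# Assumed values; none of these variable names come from Couchbase itself.
CB_URL="${CB_URL:-http://localhost:8091}"   # Couchbase admin REST port
CB_USER="${CB_USER:-Administrator}"         # Couchbase admin user (assumption)
CB_PASS="${CB_PASS:-password}"              # Couchbase admin password (assumption)
ES_PASS="${ES_PASS:-password}"              # couchbase.password from elasticsearch.yml

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Register the ElasticSearch plug-in as a remote cluster named ElasticSearch.
create_cluster_reference() {
  run curl -s -u "$CB_USER:$CB_PASS" "$CB_URL/pools/default/remoteClusters" \
    -d name=ElasticSearch -d hostname=127.0.0.1:9091 \
    -d username=Administrator -d "password=$ES_PASS"
}

# Start continuous replication of bucket $1 into the index of the same name.
create_replication() {
  run curl -s -u "$CB_USER:$CB_PASS" "$CB_URL/controller/createReplication" \
    -d "fromBucket=$1" -d toCluster=ElasticSearch -d "toBucket=$1" \
    -d replicationType=continuous
}

# Usage:
#   create_cluster_reference
#   create_replication beer-sample
```

On Couchbase Server 2.2 the replication call also takes a type parameter that selects the XDCR protocol version; check the plug-in documentation for the value it expects before relying on this sketch.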

Integration Complete

Let's check on the progress of the transfer by visiting http://127.0.0.1:9200/_plugin/head/

The document count will increase in ElasticSearch and will ultimately be higher than the document count we see in Couchbase. This is OK and happens because we're also storing replication checkpoint documents. Now let's try running a simple search to verify the data was properly indexed.

Let's search for one of my favorite beers by opening the following URL in your browser: http://localhost:9200/beer-sample/_search?q=duchesse

You should see the JSON output below:

We see that the search matched exactly one document. But notice that in the source section we don't see the contents of the beer document. This is by design: the index template we installed disables storing the full body of the document. The reason is simple: we already have incredibly fast access to the documents in Couchbase. Let's go back to the Couchbase UI and look at the document returned by the search.

From the Couchbase UI navigate to the Data Buckets tab, then click the Documents button next to the beer-sample bucket. In the document ID field we'll paste in the ID returned by the search, 110fa6d6b2.

Success! This is the beer we were looking for.

Next Steps

At this point you have your Couchbase cluster continuously replicating data to your ElasticSearch cluster. We've only shown a very basic ElasticSearch query; much more complex queries are possible.
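
As a first step beyond the q= URL parameter, a search can be sent as a JSON request body. The sketch below builds a query_string query with a result limit. The helper name search_body is made up for illustration, and the term is inserted without escaping, so it must not contain double quotes or backslashes.

```shell
# search_body TERM [SIZE]
# Prints a request body that runs TERM as a query_string query and returns
# at most SIZE hits (default 10). Hypothetical helper; no JSON escaping is done.
search_body() {
  printf '{"query":{"query_string":{"query":"%s"}},"size":%d}' "$1" "${2:-10}"
}

# Usage (with ElasticSearch running):
#   curl -XPOST http://localhost:9200/beer-sample/_search -d "$(search_body duchesse 5)"
```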

For more information: