Blog Post

Cross Data Center Replication - A step-by-step guide for Amazon AWS

Marty Schoch of Couchbase Published

One of the most exciting new features of Couchbase Server 2.0 is the addition of Cross Data Center Replication (XDCR). Using this feature you can increase the reliability of your application by operating in multiple data-centers and improve performance for your users by storing their data closer to their physical location.

Getting started with Cross Data Center Replication is easy, and Amazon EC2 provides a great place to take it for a test drive.

Infrastructure

First, we'll need 2 Couchbase clusters in separate regions. I've chosen to use the US East (N. Virginia) and US West (N. California) regions. In each region I've provisioned 2 servers (m1.large) running the standard Amazon Linux AMI. Once the servers are provisioned, you'll need to install Couchbase Server 2.0 beta on each server.

IMPORTANT: There is one substantial change that must be made on each server. By default, Couchbase Server will identify the server local IP address and use this for all cluster communication. That works fine within the region, but will not work across the internet. To work around this issue we will configure Couchbase to explicitly use the public DNS name provided by Amazon for each server. This will correctly resolve to the internal IP address for intra-cluster communication and to the public IP address for inter-cluster communication.

  1. Stop the server
    sudo /etc/init.d/couchbase-server stop
  2. Edit the file /opt/couchbase/bin/couchbase-server
  3. Find the last line in the _start section
  4. Add a '\' to the end of the line
    ...\
  5. Add a new line immediately afterwards that reads:
    -name ns_1@<Amazon Public DNS name>
  6. Delete the files under:
    • /opt/couchbase/var/lib/couchbase/data/*
    • /opt/couchbase/var/lib/couchbase/mnesia/*
    • /opt/couchbase/var/lib/couchbase/config/config.dat
  7. Start the server
    sudo /etc/init.d/couchbase-server start

A good way to check that your servers have been properly configured with their public DNS name is to look at the Server tab in the Couchbase Server UI. You should see the servers listed with their DNS names, if you see their private IP Addresses recheck the previous steps.

Load Sample Dataset

So that we have some documents to replicate, I'll load the beer-sample dataset into the east coast cluster.

$ cd /opt/couchbase/bin/tools
$ ./cbdocloader -u Administrator -p password -b default -n 127.0.0.1:8091 ../../samples/beer-sample.zip
{'username': 'Administrator', 'node': '127.0.0.1:8091', 'password': 'Couchbase', 'bucket': 'default', 'ram_quota': 100} ['../../samples/beer-sample.zip']
[2012-10-09 14:29:02,833] - [rest_client] [140174671374080] - INFO - existing buckets : [u'default']
[2012-10-09 14:29:02,834] - [rest_client] [140174671374080] - INFO - found bucket default
working with beer.json
working with 110fa32467.json
working with 110fe062a9.json
...long output omitted...
working with 110f179fca.json
working with 110f25fe73.json
View: _design/beer/_view/brewery_beers
View: _design/beer/_view/by_location

Go West, Young Documents

Let's start with a uni-directional replication from the east coast cluster to the west coast cluster. I'll navigate to the XDCR tab of my east coast cluster. Press the "Create Cluster Reference" button. In the IP/hostname field be sure to use the public DNS name of one of the servers in the remote cluster.

Now press the "Create Replication" button. Select the "default" bucket, the "WestSide" cluster, and type "default" for the remote cluster.

Now the XDCR tab should look like this:

Do not be alarmed, the status column will continue to say "Starting Up", that is normal. At this point all documents from the east coast cluster will be replication to the west coast cluster.

Now let's navigate to the "Data Buckets" tab. Click on the "default" bucket. Scroll down towards the bottom, and expand the section labeled "Outgoing Replication".

Initially the changes queue will be high, and the documentes checked and documents replicated will be 0. As the replication proceeds, the changes queue will drain to 0 and the documents checked and replicated will settle on the total document count. Here is what it looks like as it finishes replicating the documents (NOTE: XDCR replications are continuous, although it finished replicating all the changes so far, it will continue running and transferring future changes)

Let's take a look at the west coast cluster in the Couchbase Server UI.

The document count in the default bucket matches the value from the east coast cluster. Replication between data centers is working!

Return Trip

Many use cases require bi-directional replication, so let's go ahead and set up replication from the west coast back to the east coast. The steps are the same as before, only the cluster and DNS names change. Again, be sure to use the public DNS names of the Amazon servers.

To test that bi-directional replication is working, we need to make a change to the data on this side.

After my CouchConf SF presentation, we got a tweet from @PabstBlueRibbon informing us that our beer-sample dataset needs an update.

So let's take care of that now. In our dataset the value is currently 0. I'm going to edit the document with ID 110fa43a2e and change it to 12.

After saving the change, if we look at the bucket stats on the west coast cluster, we'll see the change being replicated.

Notice how the changes queue briefly went up t 1, and we see the 1 document was replicated. It's also worth noting that although it has checked all the documents it only replicated the one that changed, this shows that XDCR is checking revisions and only replicating the necessary data.

Now let's load the document on the east coast cluster to verify it worked.

It worked, the ibu is showing the value 12 on both clusters!

Best Practices

I've tried to keep the examples here simple, but there two issues you'll want to consider before taking this into production.

Server Names

In the example here I've directly used the EC2 instance public DNS names. This is not recommended in production because Amazon instances IP addresses (and their associated public DNS names) can change. We recommend one of two options:

  • Use a 3rd-party dynamic DNS service. Whenever the server's IP address changes, update a CNAME record that points to your server's public DNS name. It's important that this be a CNAME record and that it point to the public DNS name because you need the address to resolve to the correct address both inside and outside Amazon.
  • Use an Amazon Elastic IP Address. These do not cost any extra money (when in use), but you may need to request more from Amazon.
Data Security

Data transfered by XDCR is sent unencrypted and when replicating between Amazon regions this means it is transitting the public internet.

  • You can use XDCR to connect clusters in different availability zones without transitting the public internet. This doesn't provide as much reliability but it avoids the potential security issue.
  • You can use a 3rd-party VPN service to tunnel data between your Amazon regions.

More Information

Hopefully this tutorial has shown how you can have XDCR up and running in a matter of minutes in AWS. For more information about Couchbase XDCR see: