This blog will show how to create stateful containers in Kubernetes using Amazon EBS. Couchbase Server is a stateful container. This means that the state of the container needs to be carried with it. In Kubernetes, the smallest atomic unit of running a container is a pod. So a Couchbase Server container will run as a pod. And by default, all data stored in Couchbase Server is stored on the same host.
[Figure: Couchbase Server pod with storage local to the host]

This figure was originally explained in Kubernetes Cluster on Amazon and Expose Couchbase Service. In addition, it shows storage local to the host.
Pods are ephemeral and may be restarted on a different host. A Kubernetes Volume outlives any containers that run within the pod, and data is preserved across container restarts. However, the volume ceases to exist when the pod ceases to exist. This is solved by Persistent Volumes, which provide persistent, cluster-scoped storage for applications that require long-lived data.

Creating and using a persistent volume is a three-step process:
  1. Provision: An administrator provisions networked storage in the cluster, such as AWS ElasticBlockStore volumes. This is called a PersistentVolume.
  2. Request storage: A user requests storage for pods by using claims. Claims can specify levels of resources (CPU and memory), specific sizes, and access modes (e.g. can be mounted once read/write or many times read-only). This is called a PersistentVolumeClaim.
  3. Use claim: Claims are mounted as volumes and used in pods for storage.

Specifically, this blog will show how to use an AWS ElasticBlockStore as a PersistentVolume, create a PersistentVolumeClaim, and then claim it in a pod.

[Figure: Couchbase Server pod using a PersistentVolume backed by Amazon EBS]

Complete source code for this blog is at: github.com/arun-gupta/couchbase-kubernetes.

Provision AWS Elastic Block Storage

The following restrictions need to be met if Amazon ElasticBlockStore is used as a PersistentVolume with Kubernetes:

  • the nodes on which pods are running must be AWS EC2 instances
  • those instances need to be in the same region and availability-zone as the EBS volume
  • EBS only supports a single EC2 instance mounting a volume

Create an AWS Elastic Block Storage:
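The volume can be created with the AWS CLI; a minimal sketch, assuming a 5 GB gp2 volume (the size matches the PersistentVolume defined later in this post):

    # create a 5 GB EBS volume in the us-west-2a availability zone
    aws ec2 create-volume \
      --region us-west-2 \
      --availability-zone us-west-2a \
      --size 5 \
      --volume-type gp2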

The us-west-2 region and the us-west-2a availability zone are used here, and so the Kubernetes cluster needs to start in the same region and availability zone as well. The command prints a JSON description of the new volume.

Check if the volume is available:
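A sketch, using the VolumeId returned by the previous command (shown here as a placeholder):

    # query the volume; the State field reads "available" once the volume is ready
    aws ec2 describe-volumes \
      --region us-west-2 \
      --volume-ids <VolumeId>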

Note the unique identifier for the volume in the VolumeId attribute of the output. You can also verify the EBS volume in the AWS Console:

[Screenshot: EBS volume in the AWS Console]

Start Kubernetes Cluster

Download Kubernetes 1.3.3, untar it and start the cluster on Amazon:
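A sketch of these steps, assuming the kube-up.sh script bundled with 1.3-era releases (the download URL and environment variable names are typical for that release line; the settings correspond to the three points noted below):

    # download and extract the Kubernetes 1.3.3 release
    curl -LO https://github.com/kubernetes/kubernetes/releases/download/v1.3.3/kubernetes.tar.gz
    tar xzf kubernetes.tar.gz

    # start the cluster on AWS in the same zone as the EBS volume
    export KUBERNETES_PROVIDER=aws
    export KUBE_AWS_ZONE=us-west-2a
    export NODE_SIZE=m3.large
    export NUM_NODES=3
    ./kubernetes/cluster/kube-up.sh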

Three points to note here:

  • The zone in which the cluster is started is explicitly set to us-west-2a. This matches the zone where the EBS storage volume was created.
  • By default, each node size is m3.medium. Here it is set to m3.large.
  • By default, 1 master and 4 worker nodes are created. Here only 3 worker nodes are created.

The script prints the cluster details once it is up.

Read more details about starting a Kubernetes cluster on Amazon.

Couchbase Server Pod w/o Persistent Storage

Let’s create a Couchbase Server pod without persistent storage. This means that if the pod is rescheduled on a different host, it will not have access to the data created on it. Here are quick steps to run a Couchbase Server pod and expose it outside the cluster:
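A sketch of these steps, assuming the arungupta/couchbase image and the kubectl.sh wrapper used throughout this post (--generator=run/v1 creates a replication controller rather than a deployment):

    # run Couchbase Server as a pod managed by a replication controller
    kubectl.sh run couchbase --image=arungupta/couchbase --generator=run/v1

    # expose the RC outside the cluster through an AWS load balancer
    kubectl.sh expose rc couchbase --target-port=8091 --port=8091 --type=LoadBalancer

    # show the service details, including the ingress load balancer address
    kubectl.sh describe svc couchbase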

Read more details at Kubernetes cluster at Amazon. The last command shows the ingress load balancer address. Access the Couchbase Server Web Console at <load-balancer-address>:8091.

[Screenshot: service details with the ingress load balancer address]

Log in to the console using the login Administrator and the password password. The main page of the Couchbase Server Web Console shows up:

[Screenshot: Couchbase Server Web Console main page]

A default travel-sample bucket is already created by the arungupta/couchbase image. This bucket is shown in the Data Buckets tab:

[Screenshot: Data Buckets tab with the travel-sample bucket]

Click on the Create New Data Bucket button to create a new data bucket. Give it the name k8s, take all the defaults, and click on the Create button to create the bucket:

[Screenshot: creating the k8s bucket]

The created bucket is shown in the Data Buckets tab:

[Screenshot: k8s bucket in the Data Buckets tab]

Check the status of the pod, delete it, and then watch the new pod being created, as shown below.
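A sketch of these steps (the generated pod-name suffix is shown as a placeholder):

    # check the status of the pod
    kubectl.sh get po

    # delete the pod; the replication controller schedules a replacement
    kubectl.sh delete po couchbase-<suffix>

    # watch the new pod being created
    kubectl.sh get -w po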

Access the Web Console again and see that the bucket does not exist:

[Screenshot: the k8s bucket no longer exists]

Let’s clean up the resources created:
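A sketch, assuming the pod was started with kubectl.sh run as shown earlier:

    # delete the replication controller (and its pod) and the service
    kubectl.sh delete rc couchbase
    kubectl.sh delete svc couchbase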

Couchbase Server Pod with Persistent Storage

Now, let’s run a Couchbase Server pod with persistent storage. As discussed above, let’s create a PersistentVolume and claim the volume.

Request storage

Like any other Kubernetes resources, a persistent volume is created by using a resource description file:
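A sketch of the definition, along the lines of couchbase-pv.yml in the repo linked below (the resource name is illustrative, and the volume ID is a placeholder for the one created earlier):

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: couchbase-pv
    spec:
      capacity:
        storage: 5Gi            # 5 GB of storage
      accessModes:
        - ReadWriteOnce         # mountable read/write by a single node
      awsElasticBlockStore:
        volumeID: <VolumeId>    # the EBS volume created earlier
        fsType: ext4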

The important pieces of information here are:

  • A storage of 5 GB is created
  • The storage can be mounted by only one node for reading/writing
  • The volumeID attribute specifies the EBS volume created earlier

Read more details about the definition of this file at kubernetes.io/docs/user-guide/persistent-volumes/. This file is available at: github.com/arun-gupta/couchbase-kubernetes/blob/master/pv/couchbase-pv.yml.
The volume itself can be created as:
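Using the kubectl.sh wrapper and the file path from the repo:

    kubectl.sh create -f pv/couchbase-pv.yml
    # expected output (resource name per the sketch above):
    #   persistentvolume "couchbase-pv" created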

The output confirms that the PersistentVolume is created.

Use claim

A PersistentVolumeClaim can be created using this resource file:
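A sketch along the lines of couchbase-pvc.yml in the repo (the claim name is illustrative):

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: couchbase-pvc
    spec:
      accessModes:
        - ReadWriteOnce         # must be satisfiable by the PersistentVolume
      resources:
        requests:
          storage: 5Gi          # request 5 GB, matching the PersistentVolume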

In our case, both the PersistentVolume and the PersistentVolumeClaim are 5 GB, but they don’t have to be. Read more details about the definition of this file at kubernetes.io/docs/user-guide/persistent-volumes/#persistentvolumeclaims.
This file is at github.com/arun-gupta/couchbase-kubernetes/blob/master/pv/couchbase-pvc.yml. The claim can be created as:
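Again with the kubectl.sh wrapper and the file path from the repo:

    kubectl.sh create -f pv/couchbase-pvc.yml
    # expected output (claim name per the sketch above):
    #   persistentvolumeclaim "couchbase-pvc" created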

The output confirms that the claim is created.

Create RC with Persistent Volume Claim

Create a Couchbase Replication Controller using this resource file:
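A sketch of the RC definition (resource names are illustrative; the image, mount path, and claim reference follow the key parts listed below):

    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: couchbase
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: couchbase
        spec:
          containers:
          - name: couchbase
            image: arungupta/couchbase         # Couchbase Server Docker image
            ports:
            - containerPort: 8091
            volumeMounts:
            - name: couchbase-data
              mountPath: /opt/couchbase/var    # where Couchbase Server stores its data
          volumes:
          - name: couchbase-data
            persistentVolumeClaim:
              claimName: couchbase-pvc         # the claim created above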

Key parts here are:

  • The resource defines a Replication Controller using the arungupta/couchbase Docker image
  • volumeMounts defines which volumes are going to be mounted. /opt/couchbase/var is the directory where Couchbase Server stores all the data.
  • volumes defines the different volumes that can be used in this RC definition

Create the RC as:
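With the kubectl.sh wrapper (the resource file name is an assumption; the post does not spell it out):

    kubectl.sh create -f pv/couchbase-rc.yml
    # expected output:
    #   replicationcontroller "couchbase" created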

The output confirms that the replication controller is created.

Check for the pod with kubectl.sh get -w po and wait until it reaches the Running state.

Expose the RC as a service, get all the services, and then describe the couchbase service to see the ingress load balancer address, as shown below.
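A sketch of these steps:

    # expose the RC outside the cluster through an AWS load balancer
    kubectl.sh expose rc couchbase --target-port=8091 --port=8091 --type=LoadBalancer

    # list all the services
    kubectl.sh get svc

    # show the service details, including the ingress load balancer address
    kubectl.sh describe svc couchbase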

Wait ~3 minutes for the load balancer to settle. Access the Couchbase Server Web Console at <load-balancer-address>:8091. Once again, only the travel-sample bucket exists. This is created by the arungupta/couchbase image used in the RC definition.

Show Stateful Containers

Let’s create a new bucket. Give it the name kubernetes-pv, take all the defaults, and click on the Create button to create the bucket.

[Screenshot: creating the kubernetes-pv bucket]

The bucket now shows up in the console:

[Screenshot: kubernetes-pv bucket created]

Terminate the Couchbase Server pod and see the state get restored: get the pods, delete the pod, and watch a new pod get recreated, as shown below.
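A sketch of these steps (the generated pod-name suffix is shown as a placeholder):

    # get the pods and note the generated pod name
    kubectl.sh get po

    # delete the pod; the RC schedules a replacement
    kubectl.sh delete po couchbase-<suffix>

    # watch the replacement pod being created
    kubectl.sh get -w po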

And now when you access the Couchbase Web Console, the bucket created earlier still exists:

[Screenshot: kubernetes-pv bucket still exists]

That’s because the data was stored in the backing EBS storage.

Cleanup Kubernetes Cluster

Shut down the Kubernetes cluster:
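A sketch, assuming the kube-down.sh script from the same release:

    ./kubernetes/cluster/kube-down.sh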

And detach the volume:
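A sketch, with the VolumeId shown as a placeholder:

    aws ec2 detach-volume --region us-west-2 --volume-id <VolumeId>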

Complete source code for this blog is at: github.com/arun-gupta/couchbase-kubernetes.

Enjoy!

Author

Posted by Arun Gupta, VP, Developer Advocacy, Couchbase

Arun Gupta is the vice president of developer advocacy at Couchbase. He has built and led developer communities for 10+ years at Sun, Oracle, and Red Hat. He has deep expertise in leading cross-functional teams to develop and execute strategy, planning and execution of content, marketing campaigns, and programs. Prior to that he led engineering teams at Sun and is a founding member of the Java EE team. Gupta has authored more than 2,000 blog posts on technology. He has extensive speaking experience in more than 40 countries on myriad topics and is a JavaOne Rock Star for three years in a row. Gupta also founded the Devoxx4Kids chapter in the US and continues to promote technology education among children. An author of several books on technology, an avid runner, a globe trotter, a Java Champion, a JUG leader, NetBeans Dream Team member, and a Docker Captain, he is easily accessible at @arungupta.

2 Comments

  1. Nice article. Thanks for putting it together.

    A clearly stated assumption is that the example was for a single AZ. But for actual production use, you’d want to deploy your k8s cluster across multiple AZs, which is quite easy to do with kops. This throws several wrenches into the deployment of PVs. I haven’t been able to find any good examples of how to do this. Might you know of any? Or are you maybe willing to update this example to support multiple AZs?

    Reading the k8s docs, it appears that we’d need to use a StorageClass resource for each AZ and mark the PersistentVolume with the class, but I can’t figure out how to get a Deployment (or ReplicationController) to pick the correct StorageClass based on which AZ it’s in, which isn’t known until the pod is scheduled on a Node.

    Also, the travel-sample didn’t show up for me either time, not that it’s important. And the second kubectl.sh expose rc couchbase --target-port=8091 --port=809--type=LoadBalancer should be kubectl expose rc couchbase --target-port=8091 --port=8091 --type=LoadBalancer

    Might also be interesting to update it to use kops (or reference your other great article on doing that) and replace ReplicationController with Deployment.

    1. Did you find a solution for this? I have been trying to create a CB cluster with PVs. It all seems to work the first time they start up, but once I kill a CB pod the new pod cannot join the CB cluster anymore; its IP seems to be blocked and the CB master pod cannot connect back to the new worker.

      Cheers
