Arkadiusz Borucki works as a Site Reliability Engineer at Amadeus, focused on NoSQL databases and automation. In his day-to-day work, he uses Couchbase, MongoDB, Oracle, Python, and Ansible. He’s a self-proclaimed big data enthusiast, interested in data store technologies, distributed systems, analytics, and automation. He speaks at conferences and user groups in the United States and Europe. You can find him on Twitter at @_Aras_B.


Motivation: Why use Infrastructure as Code

Many IT teams still rely on manual configuration to manage infrastructure – old procedures and outdated shell scripts are still in use.

Sometimes members of one team use different procedures and scripts on the same database farm. Those people may leave the company without sharing their knowledge or tips. This approach results in problems, errors, slow deployments, and inconsistent environments.

Server farms are getting bigger and bigger, and data size is growing from gigabytes to terabytes or petabytes. Single machines are no longer able to handle this amount of data. Therefore, we have to scale our databases horizontally: use more machines and distribute data across them.

With two, five, or ten “old school” clusters, a setup based on procedures and scripts may be sufficient. The problems arise when the farm grows rapidly.

  • What to do when your deployment has hundreds of servers? 
  • How to make sure the environment is consistent?
  • How to control what is installed on the machines?
  • How to track all the changes?

Information about infrastructure setup must be centralized. Infrastructure should be treated like software – as code that can be managed with the same tools and processes used by software developers. Use code to describe the infrastructure: create a model of your Couchbase deployment as code under version control. You will not only be able to track who has done what, you can also roll back to an earlier configuration. The Couchbase deployment will be consistent because the same settings are applied to every machine. To prevent future problems and outages, your Couchbase farm should be consistent, and its configuration should be centralized and divided between production and non-production environments.

You can test new setups and settings in a test or development branch before you apply those changes to production!

Codify everything

Use code to describe the infrastructure. Use Ansible for physical or virtual server management (patching, upgrades, configuration management, network management, new cluster deployments, orchestration).

Version everything

Use Git to manage your infrastructure-as-code repository. Git is an open-source distributed version control system. Use a branching model appropriate to your business needs (for example, a production branch and a test branch).

Manage your Couchbase deployment with Git


One repository

Use one Git infrastructure-as-code repository per organization or company. In one Git repository you can have several branches (production, development, test, staging, etc.).

Ansible: How to operate a distributed Couchbase cluster

Manual operations on a database farm are time- and resource-consuming. More manual operations bring more human mistakes, more overhead, and more inconsistency.

Can you imagine a farm with 400 servers? How much time would it take to log into every machine and change settings? What if you skip one or two machines? What if you apply different settings on a few machines by mistake?

Ansible is a perfect tool for configuration management and orchestration of your infrastructure. By using Ansible you can adopt Infrastructure as Code: keep the Couchbase server definitions in a Git repository, track changes, and use all the advantages that come with Git version control.

Use the Ansible git module to deploy changes from a Git repository to your distributed database farm. The Ansible git module checks out code from the specified Git URL into the destination directory.

  • Ansible is agentless and uses a push approach (SSH).
  • Ansible is based on YAML files.
  • It is a good alternative to Puppet.
  • Ansible reduces manual steps on servers.
  • Ansible can help reduce operational overhead by as much as 95%.

Automation

  • Reduce overhead and human mistakes, speed up processes, provide consistency – use Ansible for your Couchbase farm automation.

Orchestration

  • Let’s put logic on top of automation and eliminate repetitive steps. Ansible can also be used as an orchestrator!

Automation is concerned with a single task – starting the Couchbase service, configuring a cluster, stopping a Couchbase process.

Orchestration is concerned with automating the execution of a whole workflow, that is, a sequence of such tasks run in a defined order.
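To make the distinction concrete, here is a minimal sketch of an orchestrated workflow: a rolling restart that handles one node at a time. The group name couchstg and the timeout are assumptions for illustration; couchbase-server is assumed to be the service name on the target hosts. A production workflow would normally add failover and rebalance steps around the restart.

```yaml
# Hypothetical rolling restart: "serial: 1" makes Ansible finish all tasks
# on one node before moving on to the next one.
- name: Rolling restart of Couchbase nodes
  hosts: couchstg            # assumed inventory group
  serial: 1
  become: true
  tasks:
    - name: Restart the Couchbase service
      ansible.builtin.service:
        name: couchbase-server   # assumed service name
        state: restarted

    - name: Wait until the admin port answers again
      ansible.builtin.wait_for:
        port: 8091
        timeout: 300             # illustrative value
```

Each task on its own is automation; the ordering and the one-node-at-a-time rule are what make it orchestration.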

# Example git checkout from Ansible Playbook
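A minimal sketch of such a checkout, using the ansible.builtin.git module. The repository URL, destination path, and branch name are placeholders, not values from a real deployment.

```yaml
# Hypothetical play: repo URL, dest path, and branch are placeholders.
- name: Check out the Couchbase infrastructure code
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Clone or update the infrastructure repository
      ansible.builtin.git:
        repo: "https://git.example.com/infra/couchbase.git"
        dest: /opt/infra/couchbase
        version: production      # branch, tag, or commit id
```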

Use the version option to specify a particular branch, tag, or commit id. Once you pull code from the Git repository you can apply it to your Couchbase deployment. You can apply it to all servers or just to part of the farm. You can also specify a list of hosts in the Ansible inventory file and run it like this:

# Example run Ansible Playbook for cluster “couchstg”
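As a sketch, the inventory could define the couchstg group in YAML format, and the playbook run can then be limited to that group. The file names and host names below are hypothetical.

```yaml
# inventory.yml (hypothetical file name and host names)
all:
  children:
    couchstg:
      hosts:
        couchstg-01.example.com:
        couchstg-02.example.com:
        couchstg-03.example.com:

# Run a playbook against that group only (couchbase.yml is a placeholder):
#   ansible-playbook -i inventory.yml couchbase.yml --limit couchstg
```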

 

Git: What can we keep in Git?

Server configuration repositories:

  • The default filesystems layout
  • List of required Linux packages
  • Kernel parameters
  • Required users and groups
  • Cron scripts
  • Security settings

Couchbase/Ansible repositories:

  • Couchbase cluster definition
  • Couchbase Ansible playbooks
  • Couchbase roles
  • Couchbase hosts inventory files
  • RBAC config
  • XDCR config

Ansible: What should be automated?

  • Cluster deployment
  • Upgrades
  • Scaling
  • Resilience
  • Monitoring
  • Alerting
  • Security settings
  • Backup and restore
  • Couchbase cluster rebalance
  • Couchbase failover
  • Couchbase bucket creation
  • Linux kernel and security patching
  • Any manual activity from the Couchbase GUI or shell

Manual cluster installations and manual node management should not be supported. Automate as much as possible and always push code changes to Git. Couchbase provides REST API endpoints, and from an Ansible playbook you can call them with the HTTP methods GET, POST, PUT, and DELETE.

The Couchbase REST API allows you to make any change on a Couchbase farm without a single click in the GUI.

An Ansible playbook can also run Couchbase CLI commands:
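For instance, a read-only GET call can be sketched with the ansible.builtin.uri module. The group name couchstg and the variables cb_admin_user and cb_admin_password are assumptions; the /pools/default endpoint returns cluster details, including the list of nodes.

```yaml
# Hypothetical play: group name and credential variables are placeholders.
- name: Read cluster details over the REST API
  hosts: couchstg
  gather_facts: false
  tasks:
    - name: GET /pools/default from one node
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:8091/pools/default"
        method: GET
        url_username: "{{ cb_admin_user }}"
        url_password: "{{ cb_admin_password }}"
        force_basic_auth: true
        return_content: true
      register: cluster_info
      run_once: true

    - name: Show how many nodes the cluster reports
      ansible.builtin.debug:
        msg: "Nodes in cluster: {{ cluster_info.json.nodes | length }}"
      run_once: true
```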

# Example auto failover (CLI commands) from Ansible Playbook
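A minimal sketch, assuming couchbase-cli is installed under /opt/couchbase/bin and that the credentials are supplied through the hypothetical variables cb_admin_user and cb_admin_password. The 120 second timeout is an illustrative value.

```yaml
# Hypothetical play: group name, credentials, and timeout are placeholders.
- name: Enable auto failover
  hosts: couchstg
  gather_facts: false
  tasks:
    - name: Turn on auto failover with a 120 second timeout
      ansible.builtin.command:
        argv:
          - /opt/couchbase/bin/couchbase-cli
          - setting-autofailover
          - --cluster
          - "{{ inventory_hostname }}:8091"
          - --username
          - "{{ cb_admin_user }}"
          - --password
          - "{{ cb_admin_password }}"
          - --enable-auto-failover
          - "1"
          - --auto-failover-timeout
          - "120"
      run_once: true     # a cluster-wide setting, so one node is enough
      no_log: true       # keep the password out of the Ansible output
```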

or

# Example rebalance from Ansible Playbook
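A rebalance can be sketched the same way. As before, the install path, the group name, and the credential variables are assumptions.

```yaml
# Hypothetical play: group name and credential variables are placeholders.
- name: Rebalance the cluster
  hosts: couchstg
  gather_facts: false
  tasks:
    - name: Start a rebalance through couchbase-cli
      ansible.builtin.command:
        argv:
          - /opt/couchbase/bin/couchbase-cli
          - rebalance
          - --cluster
          - "{{ inventory_hostname }}:8091"
          - --username
          - "{{ cb_admin_user }}"
          - --password
          - "{{ cb_admin_password }}"
      run_once: true     # rebalance is triggered once for the whole cluster
      no_log: true
```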

 

# Example Install couchbase server from Ansible Playbook
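A minimal sketch of an installation play. It assumes that a package repository providing a package named couchbase-server is already configured on the hosts; the group name is a placeholder as well.

```yaml
# Hypothetical play: assumes the package repository is already configured.
- name: Install Couchbase Server
  hosts: couchstg
  become: true
  tasks:
    - name: Install the Couchbase Server package
      ansible.builtin.package:
        name: couchbase-server   # assumed package name
        state: present

    - name: Make sure the service is running and enabled at boot
      ansible.builtin.service:
        name: couchbase-server
        state: started
        enabled: true
```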

 

Couchbase: REST API

The Couchbase REST API enables you to manage a Couchbase server deployment, as well as perform operations such as storing design documents and querying for results, directly from an Ansible playbook.

You can easily make Couchbase REST API calls from your Ansible code. You can also create custom Ansible roles for Couchbase!

Couchbase offers the following REST APIs:

  • Cluster API – The Cluster REST API manages cluster operations
  • Server nodes API – The Server nodes REST API manages nodes in a cluster
  • Server groups API – The server groups REST API refers to the Rack Zone Awareness feature, which enables logical groupings of servers on a cluster where each server group physically belongs to a rack or availability zone
  • Buckets API – The Buckets REST API creates, deletes, flushes, and retrieves information about buckets and bucket operations
  • Views API – The Views REST API is used to index and query JSON documents.
  • XDCR API – The XDCR REST API is used to manage Cross Datacenter Replication (XDCR) operations
  • Logs API – The Logs REST API provides the REST API endpoints for retrieving log and diagnostic information as well as how an SDK can add entries into a log
  • User API – The User REST API manages the read-only user, which is created with the /settings/readOnlyUser URI endpoint; only one read-only user can be created
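As an illustration of the Buckets API, the sketch below creates a bucket with a POST request from the ansible.builtin.uri module. The bucket name, RAM quota, group name, and credential variables are all hypothetical, and the form parameters accepted may differ between Couchbase versions.

```yaml
# Hypothetical play: bucket name, quota, and credentials are placeholders.
- name: Create a bucket over the REST API
  hosts: couchstg
  gather_facts: false
  tasks:
    - name: POST /pools/default/buckets
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:8091/pools/default/buckets"
        method: POST
        url_username: "{{ cb_admin_user }}"
        url_password: "{{ cb_admin_password }}"
        force_basic_auth: true
        body_format: form-urlencoded
        body:
          name: demo_bucket      # illustrative bucket name
          bucketType: couchbase
          ramQuotaMB: 256        # illustrative quota
        status_code: 202         # the request is accepted asynchronously
      run_once: true
```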

# Example flush (empty the contents of the specified bucket) via REST API from Ansible Playbook
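A minimal sketch of such a call, assuming flush is enabled on the bucket. The variable cb_bucket, the group name, and the credential variables are hypothetical.

```yaml
# Hypothetical play: bucket variable and credentials are placeholders.
- name: Flush a bucket over the REST API
  hosts: couchstg
  gather_facts: false
  tasks:
    - name: POST to the doFlush endpoint of the bucket
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:8091/pools/default/buckets/{{ cb_bucket }}/controller/doFlush"
        method: POST
        url_username: "{{ cb_admin_user }}"
        url_password: "{{ cb_admin_password }}"
        force_basic_auth: true
        status_code: 200
      run_once: true     # flushing once removes the data cluster-wide
```

Flushing deletes every document in the bucket, so a play like this belongs behind a confirmation step or in non-production inventories only.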

Summary

In the modern world, where data grows faster than ever and we need more and more machines to store and maintain it, centralized management, automation, and orchestration are very important. Consistency, reduced overhead, fewer human errors, and faster processes are good reasons to start using infrastructure as code together with the related automation and orchestration tools and techniques.

A distributed Couchbase cluster is a perfect candidate for this approach. Couchbase works well with tools like Ansible and also provides a useful REST API. Couchbase REST API methods can be called from Ansible playbooks or Python scripts.

DevOps practice increases the velocity and stability of deployments while also reducing failure recovery time and software update lead times.

In the second part of this tutorial I will show, step by step, how to build an Ansible role for a Couchbase cluster using Couchbase REST API methods and command line commands.

Author

Posted by Laura Czajkowski, Developer Community Manager, Couchbase

Laura Czajkowski is the Snr. Developer Community Manager at Couchbase overseeing the community. She’s responsible for our monthly developer newsletter.
