How Many Nodes? Part 3: Hardware considerations
The first two parts of this series covered an overview of the 5 factors of sizing a Couchbase Server cluster, and a more in depth look at various use cases and features and the effects they have on sizing. The fourth part will look at the metrics for monitoring a Couchbase Server cluster in order to determine the proper sizing and when to grow it.
Now I'd like to provide some guidance on hardware recommendations. While it would be easy for us at Couchbase to demand very hard-and-fast requirements for hardware, it wouldn't be at all appropriate given the vast difference in use cases and application/infrastructure environments that our customers want to use the software in. Take for example the difference between a customer running on their own physical hardware on their own datacenter where they have access to 512GB RAM systems, FusionIO cards and 10Gig-E networking versus a customer running within Amazon's AWS. Couchbase's customers span these two extremes and just about every configuration in between...you can see how hard it is for us to provide specific recommendations.
At the highest level, Couchbase is designed and tested to run on "commodity hardware". As you can tell from above, "commodity" means different things to different people. I always ask customers what type of hardware profile they available or want or plan with. We then size the cluster keeping in mind the benefits of horizontal scalability and the number nodes needed support the dataset and workoad...as opposed to saying you will definitely need 6 HP DL 380s with 64GB of RAM, 24 cores of CPU and 2TB of SSD. Our hardware recommendation will be at the intersection of your dataset and workload requirements, the resources available to you and cost. Our documentation covers this briefly as well: Resource Requirements
Okay, enough with the disclaimers, let's dig into the details. In the previous parts of this series, I listed the 5 factors of sizing Couchbase:
- Data Distribution
I'm going to start with the easiest first (at the bottom) and work my way up.
1. Data Distribution
The easiest :-) Any production deployment of Couchbase should have no less than 3 nodes. The reasoning behind this is for autofailover safety, ease of upgrades and further scale.
The 2nd easiest :-S. In general, you likely have very little choice regarding what type of network you have available. These days, anything 1Gig-E or above will be sufficient for Couchbase. If your application is particularly latency sensitive or requires extremely high throughput, you will benefit from end-to-end connectivity of 10Gig-E.
Put as few "hops" between your Couchbase nodes and the clients in terms of firewalls and routers. If you do need to have those in between, just realize that your latency will be affected.
When setting up XDCR, the network variability becomes greater but it also becomes less important due to how we've architected that feature.
Couchbase is more about CPU cores than the speed of each one (2.4GHz versus 2.6GHz versus 3.0 GHz versus Intel versus AMD will not make a difference). Couchbase is multi-threaded in many ways and will make use of many cores...but it's also not a requirement to have 32 or 64 or more cores (most would agree those are not exactly "commodity"). Start with 4 cores for a basic workload, and then add additional cores as needed for views and XDCR. See the previous parts of this series for more details.
This is going to be a big one, as well it should be. Couchbase is a database, but unlike any other. Where Oracle/MySQL/Postgres rely very heavily on disk performance, Couchbase separates the main application performance from disk IO which means the requirements are a lot lower. Lower, but not non-existent.
- Use the best "local" storage you have available. Our best practice and architecture revolves around a distributed system. We recommend against using a centralized backend storage system such as a SAN or NAS. While these may be able to live up to the performance needs, they present a single bottleneck / point of failure (even if HA) that limits the distributed nature of Couchbase. The benefits of having a shared storage may outweigh this, but it’s something to take into consideration.
- EBS < Virtual Drive < 7200 < 10K < SSD < FusionIO: I'll discuss cloud and VM hardware considerations later on, but it just makes sense that faster disks are faster. We typically see about 1000 or less writes per second per node for EBS, around 1500 for 10K drives and well over 6000 for SSD's. I've seen write rates over 30k per second per node...you get the picture: it all depends. If your workload is going to be such that you rely heavily on disk performance either for heavy reads (from disk), sustained writes, index updating, and/or XDCR then you'll want to think more about your disk layer. I realize "heavy" is a very vague term, but there's no other way to put it and I can only recommend that you leverage the expertise of our engineering and field teams here at Couchbase to get specific details about your application.
- RAID is not a requirement, but if you have a standard deployment configuration, generally RAID 0 or 10 is better than 1 or 5. You’ll want to use RAID for better throughput and latency, not for better redundancy since Couchbase will already have replicas across the cluster. In cases where you have the disk drives and space available and are storing large (>100GB) amounts of data per-node, RAID 5 may actually be worthwhile to prevent having to failover and rebalance when a disk fails.
- If you're relying on views, it's a good idea to have two separate disks and split the data and views between them for the best performance and efficiency. See more about views in the previous parts of this series.
- Disk space is usually cheap, get as much as you can afford. See my previous posts for how our append-only file format and compaction needs to be taken into account.
This one's actually pretty easy too...generally get as much RAM as you can across the cluster (based upon your sizing calculation needs). The sweetspot for Couchbase is usually around 8GB-128GB per node. While there are certainly exceptions; lower than 8 doesn’t leave much space available for headroom or disk caching and higher than 128 starts to put a very large amount of data under the responsiblity of just one node’s availability and available resources.
A few other considerations:
- While it may seem intuitive to try and use all available memory, it’s actually a best practice to leave some RAM available outside of the Couchbase quota. Most modern operating systems want a few gigabytes (Windows usually a bit more than Linux), and there may be other processes running on these nodes such as monitoring agents. There are also needs for IO caching both for views and for the general functioning of the system. We typically recommend about 60-80% of an system’s RAM to be allocated to Couchbase’s quota, leaving the rest for headroom and memory needs outside of Couchbase itself.
- You're better off with 6x32GB nodes rather than 3x64GB
- There will not be a noticeable difference between different types of RAM or speeds.
- Linux (ensure the most efficient and largest use of RAM):
- Have 5-10GB of swap space configured. While we certainly don’t expect to use it, we also want a bit of safety net from the Linux OOM killer
- Disable THP (transparent huge pages, sometimes known as anonymous hyge pages). While they can be beneficial in some cases, we've seen problems at our customers with them turned on (see your particular operating system and version for instructions on disabling THP).
- Set swappiness to 0
- Recent evidence suggests that disabling Zone Reclaim on NUMA systems can be beneficial (http://engineering.linkedin.com/performance/optimizing-linux-memory-management-low-latency-high-throughput-databases)
Many times we’re asked whether it is okay to run Couchbase on the same servers as the application. While there is no technical limitation to doing so, it would not be our best practice recommendation for a few reasons:
- Your sizing calculations become more complicated when factoring in requirements of other technologies.
- Most other applications actually have different hardware/sizing requirements than Couchbase and so the ability to assign resources where they matter is a benefit of splitting them out.
- Scaling gets more complex. Imagine a 3 node application farm with 3 nodes of Couchbase running on the same servers. Now you want to scale the application but you don’t need to scale Couchbase with it. Do you have 5 application servers, only 3 of which have Couchbase running?
- Administration is harder. Same environment as above...now you need to reboot your application tier, but you don’t want Couchbase to go down with it.
To VM or not to VM?
It's no secret that physical hardware will give you the best performance and the most efficient use of resources. However, a very large portion of Couchbase customers are using VMs and it's a deployment that we test and support. Some things to keep in mind:
- Treat Couchbase with respect...don't overcommit the host machine with other applications. Overcommitting CPU and RAM especially can lead to very unexpected (and hard to diagnose) results. Some of this can be controlled with actual settings on the vm host (like for RAM) and some of it is just best practice (like not allocating more virtual CPU’s than you have physical cores)
- Network is going to be a bit slower with VMs...maybe not noticeable, but there is an effect.
- The same numerical requirements apply for CPU and RAM
- Local storage is better than going back to a shared SAN for the same reasons as above for a distributed environment.
- Our general advice of more nodes rather than less is even stronger with VMs
- Make sure you don't have more than one Couchbase node per physical machine
- You can sacrifice on some of the above for non-production environments, but keep in mind that you may run into performance or stability issues. If you plan to have the same production datasets and same workloads in test, UAT and/or staging, you're going to have the same resource requirements in those environments as well.
- There doesn't seem to be a difference between choice of virtualization technologies
In the Clouds...
A "cloud" in this context is a deployment environment where the hardware is truly decoupled from the software for every component and in my mind is different than just running on VMs. Around 50% of our customers are deployed in Amazon. A handful others are in in some other cloud vendor such as GAE/Rackspace/Softlayer/Savvis/etc and a bunch are running their own private cloud. Keeping with the theme from the beginning, I'm in no position to tell you which infrastructure you must deploy Couchbase in...you're going to tell me.
Many/all of the same considerations apply to clouds as they do to VMs and we have some more specific documentation on cloud considerations here.
- RAM: You usually have less RAM available per node
- Disk: A single EBS drive is (potentially inconsistently) slow. We recommend an LVM/striped-RAID of 4-6 drives per node for the best performance and optimized/provisioned IOPS EBS drives are good to consider. EC2 also provides SSD-based instances which, while very costly at moment, would be the ultimate choice for performance. Keep in mind that you also have the ability to “pick and choose” the disk types for your install directory, data files directory and view/index directory.
- Data Distribution: Again, plan on using more nodes rather than less.
- Availability Zones: While we don't yet do anything special for deploying Couchbase across racks or zones (it is coming), it's still a good idea to split your cluster across zones. That way, the failure of one zone won't render your entire dataset unavailable. Using XDCR between zones may be a good consideration to provide added resilience.
"Scale up" vs "Scale out"
Couchbase’s elasticity via rebalancing allows not only for the seamless maintenance and upgrading of software, but for hardware as well. By adding new node of higher capacity and swapping them with the older nodes of lower capacity, you can achieve vertical scalability...all without loss of performance or availability of data to a running application.
Knowing when to increase the capacity of your nodes is going to depend on your own use case, resource availability and underlying cost variables. At one extreme, we would not recommend 1, 2 or 3 nodes of very beefy hardware...nor would we recommend 1000 very small nodes. Some general rules of thumb are:
- Scale out with the smallest node size possible up to about 15-20 nodes, and then consider scaling up in order to shrink the node count
- When scaling up, don't choose hardware that shrinks the node count below 6...rinse and repeat #1 above
- If your workload is memory bound, be careful about shrinking the node count with respect to the disk throughput you've got...the more RAM each node holds, the more disk IO will be needed by each node for both steady state workoad, compaction and especially rebalancing, etc.
- If your workload is disk bound, it's better to have more nodes to distribute the workload across rather than incrementally increasing the disk performance of each node in place
A good example of this is a recent customer who's sizing dictated 32 nodes of spinning disks due to a write heavy workload. By looking at SSD's, they were able to bring that down to a count of 9...fitting well into #1 and #2 above and easily offsetting the cost of the SSD's.
To start with, your minimum hardware requirements should look something like this:
- 3 nodes
- 4+GB RAM
- 4 CPU Cores (+1 per bucket, +1 per design document, +1 per XDCR stream, +1 per reader/writer worker beyond the default of 3)
- The best "local" storage you have available
- The lowest latency, highest throughput network you have available
At the end of the day, your overall architecture will be much better served by having more nodes of lower capacities/power than having less nodes of much increased power (and cost).
And remember, if in doubt, the Couchbase field team is on hand to assess your environment and provide guidance on your specific needs. We want to ensure you have the smoothest possible experience with the product.
In the next and fourth entry of this series, I will discuss the various metrics and monitoring practices for understanding the proper sizing of a Couchbase Server cluster.