MongoDB published an independent benchmark comparing the performance of MongoDB, Apache Cassandra, and Couchbase Server in single-node deployments, countering the one we published comparing the performance of MongoDB and Couchbase Server in 9-node deployments. MongoDB performs well when it 1) is limited to a single node, 2) doesn’t store a lot of data, and 3) doesn’t support a lot of users. This is a sweet spot for MongoDB.
MongoDB raised awareness of NoSQL databases by making it easy for developers to build a proof of concept or small application. However, MongoDB can’t meet the rigorous demands of production deployments. Couchbase Server, on the other hand, shines when deployed as a distributed database. It scales with ease to store more data, support more users, and provide higher throughput and lower latency access to data.
1) Single Node Benchmark Fails to Address Scalability Requirements
If you want to see how well a database will perform with a small data set and a few users, benchmark it with a single node deployment. If you want to see how well it will perform in a production environment with a large data set and many users, benchmark it with a clustered deployment.
It’s important to measure performance not only at scale, but also while meeting enterprise requirements such as high availability.
Why didn’t MongoDB compare the performance of distributed deployments? Well, it’s difficult for MongoDB to scale beyond a single node.
As noted by InformationWeek (link), scaling is not linear. Adding nodes to a MongoDB replica set will not increase write performance because every write will still be executed by a single node – the primary node. The same is true for MongoDB shards – every write will still be executed by a shard’s primary node. If there are three shards with three nodes per shard, writes will be executed by only the three primary nodes; the other six nodes add no write capacity.
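The arithmetic behind that last example can be sketched in a few lines. This is only an illustrative model, assuming every shard is a replica set with exactly one primary; the function name `write_executing_nodes` is ours and is not part of any MongoDB API.

```python
def write_executing_nodes(shards: int, nodes_per_shard: int) -> int:
    """Count the nodes that execute writes in a sharded deployment.

    Assumption: each shard is a replica set with exactly one primary,
    and only the primary executes writes.
    """
    if shards < 1 or nodes_per_shard < 1:
        raise ValueError("shards and nodes_per_shard must be at least 1")
    # One primary per shard; extra replica set members do not execute writes.
    return shards


# Compare total nodes with write-executing nodes for a few layouts.
for shards, nodes_per_shard in [(1, 1), (1, 3), (3, 3)]:
    total = shards * nodes_per_shard
    writers = write_executing_nodes(shards, nodes_per_shard)
    print(f"{shards} shard(s) x {nodes_per_shard} node(s): "
          f"{total} total, {writers} executing writes")
```

Under this model, growing a replica set from one node to three leaves the number of write-executing nodes unchanged, which is the point made above.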
2) Benchmark Applied a Different Write Scenario for Couchbase Server – Not an Apples-to-Apples Comparison
MongoDB performed one operation per write – the update. However, MongoDB inadvertently had Couchbase Server perform two operations per write: one read, one update. This limited the write performance of Couchbase Server.
MongoDB (with WiredTiger) and Couchbase Server leverage document-level locking. If two clients update the same document at the same time, one of them will fail and it will have to retry the update. This is the case for both MongoDB and Couchbase Server. It was the write scenario for MongoDB in this benchmark, but not for Couchbase Server.
Another write scenario is when you need to ensure a client can’t update a document if it’s been updated by a different client first. In this write scenario, Couchbase Server supports compare-and-swap while MongoDB recommends the “Update Document if Current” pattern. This was the write scenario for Couchbase Server, but not MongoDB.
Why would MongoDB have Couchbase Server perform compare-and-swap, but not implement its own “Update Document if Current” pattern?
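To make the difference between the two write scenarios concrete, here is a minimal sketch using a toy in-memory store. It is not the Couchbase Server or MongoDB client API, and it is not the benchmark's code: `VersionedStore`, `blind_write`, and `checked_write` are hypothetical names, and the version counter merely stands in for a compare-and-swap value or an "Update Document if Current" check.

```python
class VersionedStore:
    """Toy in-memory document store; every document carries a version."""

    def __init__(self):
        self._docs = {}        # key -> (version, value)
        self.operations = 0    # number of store operations performed

    def get(self, key):
        """Read a document; returns (version, value)."""
        self.operations += 1
        return self._docs[key]

    def put(self, key, value):
        """Unconditional update: a single operation, last writer wins."""
        self.operations += 1
        version = self._docs.get(key, (0, None))[0]
        self._docs[key] = (version + 1, value)

    def put_if_current(self, key, value, expected_version):
        """Conditional update: fails if the document changed since the read."""
        self.operations += 1
        version, _ = self._docs[key]
        if version != expected_version:
            return False
        self._docs[key] = (version + 1, value)
        return True


def blind_write(store, key, value):
    """First scenario: one operation per write (the update)."""
    store.put(key, value)


def checked_write(store, key, update):
    """Second scenario: read, then update only if still current; retry on conflict."""
    while True:
        version, current = store.get(key)
        if store.put_if_current(key, update(current), version):
            return


store = VersionedStore()
store.put("doc", 1)

store.operations = 0
blind_write(store, "doc", 2)
print("blind write operations:", store.operations)

store.operations = 0
checked_write(store, "doc", lambda value: value + 1)
print("checked write operations:", store.operations)
```

With no concurrent writers, the checked write costs one read plus one update for every write, while the blind write costs only the update. A fair comparison applies the same scenario to both databases.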
3) Benchmark Utilized Outdated Couchbase Client, Rather than Current Client
MongoDB chose to use an outdated client library released in 2013 for Couchbase Server, which limited the performance of Couchbase Server. We released a new client library last September, built on Netty and RxJava, followed by minor releases in February and March.
Why would MongoDB benchmark Couchbase Server with an outdated client library but benchmark itself with its latest client library?
4) Single Node Durability versus Distributed Database Durability
The point of durability is to ensure data is not lost when a server fails. In this benchmark, because it was performed with single-node deployments, data can only be durable when it’s written to disk. It’s the same limitation as that of traditional relational databases.
Today, distributed databases rely on a modern approach to durability that distributes the risk of data loss – they replicate data to multiple nodes. Couchbase Server is unique in that while it writes to disk like a conventional database, it leverages faster memory-to-memory replication between nodes too. The data is not only durable, it’s highly available. It can be replicated to nodes on different servers, different racks, or different data centers.
That being said, if MongoDB had used the latest client library, Couchbase Server write performance would have been at least 10x higher. The two-year-old client library (1.1.8) waited a minimum of 100ms before checking whether the write had been written to disk. A later release (1.4.x) waited 10ms. The latest release (2.x) waits 10µs. That's why you should always benchmark databases with their latest client libraries, not two-year-old ones.
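A rough way to see why the polling interval matters: if a client thread issues one durable write at a time and cannot confirm the write until the minimum wait has elapsed, the wait alone caps that thread's write rate. The sketch below is a back-of-the-envelope bound under that assumption. It ignores network and disk time, so real rates would be lower, and `max_durable_writes_per_second` is an illustrative helper, not a client library function.

```python
# Minimum wait before the durability check, per client release, in microseconds
# (100ms, 10ms, and 10µs as quoted above).
MIN_WAIT_MICROSECONDS = {
    "1.1.8": 100_000,
    "1.4.x": 10_000,
    "2.x": 10,
}


def max_durable_writes_per_second(min_wait_microseconds: int) -> int:
    """Upper bound on sequential durable writes per second for one thread.

    Assumption: each write must wait at least min_wait_microseconds before
    it can be confirmed, and the thread issues writes one at a time.
    """
    if min_wait_microseconds <= 0:
        raise ValueError("min_wait_microseconds must be positive")
    return 1_000_000 // min_wait_microseconds


for release, wait in MIN_WAIT_MICROSECONDS.items():
    bound = max_durable_writes_per_second(wait)
    print(f"client {release}: at most {bound} durable writes/s per thread")
```

Under this model, the 100ms wait caps a thread at 10 durable writes per second and the 10ms wait at 100, while the 10µs wait is no longer the limiting factor in practice.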
MongoDB Rules Single Node Deployments
MongoDB is well-suited to a proof of concept or small application that has a small data set and a handful of users. Couchbase Server is better-suited to applications with more data, more users, and higher throughput / lower latency requirements – those that benefit from a distributed deployment. In fact, Couchbase Server is often selected to power mission-critical applications – small or large, consumer or enterprise, social or gaming – where traditional relational databases fail to provide the scalability or performance required.
FYI – We benchmarked MongoDB and Couchbase Server with 9-node deployments.