Let's dissect the Fall 2014 NoSQL Benchmark.

Apache Cassandra / DataStax Enterprise. MongoDB. Couchbase Server.

Go.

Hardware

                 Server (×8)                  Client (×32)
Processor        Intel Xeon E5-2620 v2 (×2)   Intel Core i5-4440
Memory           64GB                         8GB
Storage (Data)   100GB SATA 6Gb/s SSD (×2)    —
Storage (OS)     80GB SATA 6Gb/s SSD          —

Network: 96Gb/s (1Gb/s Switch with 48 Ports)

Wait, 96Gb/s? It's a full duplex matrix: 48 ports × 1Gb/s in each direction = 96Gb/s of aggregate switching capacity. (link)

Why not benchmark with dozens of servers?

I prefer benchmarks performed on bare-metal, on-premise servers. It's not cheap.

Data

             4 Nodes      6 Nodes      8 Nodes
Entries      20M          30M          40M
Copies       2            2            2
Fields       10           10           10
Field Size   150 bytes    150 bytes    150 bytes
Entry Size   1,500 bytes  1,500 bytes  1,500 bytes
Data Size    55.8GB       83.8GB       111.6GB
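
The data sizes follow directly from the table: entries × fields × field size × copies. A quick sanity check (a sketch; the table's GB figures line up if read as binary gigabytes, i.e. 2^30 bytes):

```java
public class DataSize {
    public static void main(String[] args) {
        long entries = 20_000_000L; // 4-node deployment
        int fields = 10;            // fields per entry
        int fieldBytes = 150;       // bytes per field
        int copies = 2;             // replicas of each entry

        // 20M entries x 10 fields x 150 bytes x 2 copies = 60,000,000,000 bytes
        long totalBytes = entries * fields * fieldBytes * copies;
        System.out.printf("%.1f GB%n", totalBytes / Math.pow(2, 30)); // ~55.9
    }
}
```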

Why not benchmark with terabytes of data?

The goal is to identify how well databases perform and scale when the working set is in memory. The working set could have been 32GB, or it could have been 384GB. It wouldn't have mattered. It fit in memory. The working set could have been 10% of the data, or it could have been 90% of the data. It wouldn't have mattered. It fit in memory.

Consider customers streaming movies from Netflix. There are many, many customers. However, only a fraction of them are streaming movies at any given point in time. That's a working set.

Consider web and mobile application users. I play Words with Friends with my wife. We play two to three games at a time. Every now and then I'll launch the app. I'll spend a few, if not several, minutes playing the right words to crush her. We're all about Amazon Prime. However, we only shop once in a while. When we do, we're part of a working set: customers who are currently shopping. Not all customers.

Then there is real-time data. Consider clickstream analysis and sensor data. The working set is users currently visiting the web site. Not all users. The working set is the last minute, the last hour, or the last day of sensor data. Not all sensor data.

While an in-memory working set improves read performance, it does not improve write performance.

All data was persisted to disk.

Configuration

Consistency

Apache Cassandra, eventually consistent (consistency level ONE). MongoDB and Couchbase Server, consistent.
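
For reference, this is roughly how consistency level ONE is set with the DataStax Java driver of that era. It's a sketch, not the benchmark's client code; the keyspace, table, and column names are illustrative, YCSB-style placeholders.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class CassandraConsistency {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ycsb");

        // ONE: a single replica acknowledges the operation. Fast, but the
        // second copy may briefly lag behind (eventual consistency).
        Statement read = new SimpleStatement(
                "SELECT * FROM usertable WHERE y_id = 'user1'");
        read.setConsistencyLevel(ConsistencyLevel.ONE);
        session.execute(read);

        cluster.close();
    }
}
```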

Durability / Persistence

Apache Cassandra, MongoDB, and Couchbase Server: asynchronous persistence.
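
Asynchronous persistence means a write is acknowledged once it's safely in memory; the flush to disk happens in the background. As a hedged illustration, the Couchbase Java SDK of that era (1.x) behaved this way by default and let you opt into blocking on persistence; the bucket name and credentials here are placeholders.

```java
import com.couchbase.client.CouchbaseClient;
import net.spy.memcached.PersistTo;

import java.net.URI;
import java.util.Arrays;

public class AsyncPersistence {
    public static void main(String[] args) throws Exception {
        CouchbaseClient client = new CouchbaseClient(
                Arrays.asList(URI.create("http://127.0.0.1:8091/pools")),
                "default", "");

        // Default: acknowledged once the value is in memory;
        // the disk write happens asynchronously.
        client.set("key", 0, "value").get();

        // Opt-in synchronous alternative: block until the value
        // has been persisted on the active node.
        client.set("key", 0, "value", PersistTo.MASTER).get();

        client.shutdown();
    }
}
```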

Note: The commit log was disabled in Apache Cassandra.
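
In Cassandra this is a per-keyspace setting: turning off durable writes skips the commit log, so writes land in the memtable and only reach disk when the memtable is flushed. A minimal sketch with the DataStax Java driver (keyspace name illustrative):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class DisableCommitLog {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // durable_writes = false skips the commit log for this keyspace.
        // replication_factor = 2 matches the two copies noted below.
        session.execute(
            "CREATE KEYSPACE IF NOT EXISTS ycsb " +
            "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2} " +
            "AND durable_writes = false");

        cluster.close();
    }
}
```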

Replication

There were two copies of the data.

Topology

Apache Cassandra and Couchbase Server, one node per server. MongoDB, two nodes per server.

Note: MongoDB relies on a master / slave topology. To ensure every server responded to client read / write requests, one master and one slave (from different shards) were installed on every server.

Workloads

Read Heavy: 95% Reads / 5% Writes
Balanced: 50% Reads / 50% Writes

Database Nodes & Client Threads

The benchmarks were performed on deployments of four, six, and eight servers.

The benchmarks were performed with an increasing number of client threads.
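
To make that concrete, here's a minimal sketch of what such a client looks like: N threads driving a 95/5 read/write mix against the database and reporting throughput. The read() and write() stubs are hypothetical stand-ins for the actual database calls; the benchmark itself used a full-featured harness.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class LoadGenerator {
    // Hypothetical stand-ins for real database client calls.
    static void read(String key)  { /* db.get(key) */ }
    static void write(String key) { /* db.set(key, value) */ }

    public static void main(String[] args) throws Exception {
        int threads = 248;  // swept upward across benchmark runs
        int seconds = 60;
        AtomicLong ops = new AtomicLong();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(seconds);

        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                while (System.nanoTime() < deadline) {
                    String key = "user" + rnd.nextInt(20_000_000);
                    if (rnd.nextInt(100) < 95) read(key); // Read Heavy: 95% reads
                    else write(key);                      //             5% writes
                    ops.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(seconds + 5, TimeUnit.SECONDS);
        System.out.printf("%d threads: %.0f ops/s%n",
                threads, ops.get() / (double) seconds);
    }
}
```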

Results

Read Heavy

Apache Cassandra

Throughput peaks between 248 and 496 client threads: 99K ops/s.
Adding nodes increases throughput.

Latency is never less than 1ms.
Adding nodes decreases latency.

MongoDB

Throughput peaks between 132 and 198 client threads: 227K ops/s.
Adding nodes increases throughput.

Latency is less than 1ms up to 231 client threads: 200K ops/s.
Adding nodes decreases latency.

Couchbase Server

Throughput peaks at 2,970 client threads: 1.62M ops/s.
Adding nodes increases throughput.

Latency is less than 1ms up to 1,320 client threads: 1.44M ops/s.
Adding nodes decreases latency.

MongoDB v Couchbase Server

Let's compare the throughput and latency of MongoDB and Couchbase Server under equivalent load.

Couchbase Server throughput increases under increasing load. MongoDB, not so much.

Couchbase Server latency is less than 0.6ms up to 264 client threads: 642K ops/s.
MongoDB latency is less than 1ms up to 231 client threads: 200K ops/s.

Explanation

MongoDB read latency benefits from memory-mapped files. However, memory-mapped files do not perform as well as direct memory access. That, and MongoDB relies on an archaic, complex topology. In fact, the benchmark compared MongoDB with 16 nodes to Couchbase Server with 8 nodes.

Apache Cassandra v Couchbase Server

Couchbase Server throughput increases under increasing load. Apache Cassandra, not so much.

Couchbase Server latency is less than 0.7ms up to 792 client threads: 1.22M ops/s.
Apache Cassandra latency is never less than 1ms.

Explanation

Apache Cassandra relies on a row cache that performs poorly and is disabled by default (JIRA).

Balanced Read / Write

Apache Cassandra

Throughput peaks between 272 and 544 client threads: 89K ops/s.
Adding nodes increases throughput.

Latency is less than 1ms up to 680 client threads: 79K ops/s.
Adding nodes decreases latency.

MongoDB

Throughput peaks at 256 client threads: 85K ops/s.
Adding nodes increases throughput.

Latency is never less than 1ms.
Adding nodes decreases latency.

Couchbase Server

Throughput peaks at 3,000 client threads: 1.18M ops/s.
Adding nodes increases throughput.

Latency is less than 1ms up to 480 client threads: 499K ops/s.
Adding nodes decreases latency.

MongoDB v Couchbase Server

Let's compare the throughput and latency of MongoDB and Couchbase Server under equivalent load.

Couchbase Server throughput increases under increasing load. MongoDB, not so much.

Couchbase Server latency is less than 0.8ms up to 360 client threads: 457K ops/s.
MongoDB latency is never less than 1ms.

Explanation

Well, a single lock doesn't help (manual | JIRA). Document-level locking has been requested since 2010.

Apache Cassandra v Couchbase Server

Couchbase Server throughput increases under increasing load. Apache Cassandra, not so much.

Couchbase Server latency is less than 1ms up to 480 client threads: 499K ops/s.
Apache Cassandra latency is less than 1ms up to 680 client threads: 79K ops/s.

Explanation

Apache Cassandra write latency benefits from the memtable, but Couchbase Server was engineered for concurrency.

Conclusion

                   Reads                             Writes
                   Low Throughput   High Throughput  Low Throughput   High Throughput
Apache Cassandra   Poor             Poor             Great            Poor
MongoDB            Great            Poor             Poor             Poor
Couchbase Server   Great            Great            Great            Good

Questions?

Comment or join the discussion over on Reddit or HN.

You can download the white paper here.

Author

Posted by Shane Johnson, Director, Product Marketing, Couchbase

Shane K Johnson was the Director of Product Marketing at Couchbase. Prior to Couchbase, he occupied various roles in development and evangelism with a background in Java and distributed systems. He has consulted with organizations in the financial, retail, telecommunications, and media industries to draft and implement architectures that relied on distributed systems for data and analysis.
