You may have heard that MongoDB has issues with scaling out. You may have heard that Viber is replacing MongoDB with Couchbase Server. Have you heard that MongoHQ has to scale up?

I agree with MongoHQ on one thing. It’s easier to scale up. While scaling up may be the only solution for MongoDB, it is not the right solution. What happens when there is no more room to scale up? I think that MongoHQ would prefer to scale out, but they realized that MongoDB was not engineered for it.

It results in a loss of functionality…

In MongoDB, for instance, horizontal scaling means losing out on unique secondary indexes (one of the few schema constraints MongoDB currently supports) for a given collection.

It requires difficult decisions…

In MongoDB, this choice is the “shard key decision” that you will hear experienced MongoDB users talk about with undertones of dread.

It is complex to configure and maintain…

But once you’ve got your sharded database into production, then there’s the underlying reality that your system has more “moving parts” in it than might be ideal because more moving parts mean more things can go wrong.

It results in a loss of performance…

In a sharded setup, the network connectivity between the Mongo router, config servers, and each shard influences overall performance.

Couchbase Server, on the other hand, was engineered to scale out. It does not result in a loss of functionality. It does not require difficult decisions. It is not complex to configure and maintain. It does not result in a loss of performance.

  • Couchbase Server does not require manual configuration of sharding. It is automatic.
  • Couchbase Server does not have a lot of moving parts. There is one node type.
  • When Couchbase Server is scaled out, the performance is increased.
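To make the points above concrete, here is a minimal sketch of hash-based partitioning in the style of Couchbase Server's vBuckets. It is an illustration under stated assumptions, not Couchbase's actual implementation: the function names are hypothetical, the plain CRC32-modulo hash is a simplification of what real clients do, and the round-robin map stands in for the map that the cluster itself computes and hands to clients.

```python
import zlib
from collections import Counter

# Couchbase Server partitions each bucket into 1024 vBuckets by default.
NUM_VBUCKETS = 1024


def vbucket_for(key):
    """Map a document key to a vBucket.

    Simplified assumption: plain CRC32 modulo the vBucket count.
    """
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_vbucket_map(nodes):
    """Assign every vBucket to a node round-robin (hypothetical helper).

    The result is a list where index = vBucket id and value = owning node.
    """
    return [nodes[vb % len(nodes)] for vb in range(NUM_VBUCKETS)]


def node_for(key, vbucket_map):
    """Find the node that owns a key: hash to a vBucket, then look up its owner."""
    return vbucket_map[vbucket_for(key)]


def vbuckets_per_node(vbucket_map):
    """Count how many vBuckets each node owns."""
    return Counter(vbucket_map)


three_nodes = build_vbucket_map(["node-a", "node-b", "node-c"])
four_nodes = build_vbucket_map(["node-a", "node-b", "node-c", "node-d"])

print(vbuckets_per_node(three_nodes))
print(vbuckets_per_node(four_nodes))
print(node_for("user::1001", four_nodes))
```

The application never chooses a shard key: the key is hashed, and the map says which node owns the result. Going from three nodes to four drops each node's share from roughly a third of the vBuckets to a quarter, which is why each node handles fewer reads and writes as the cluster grows. This sketch does not model rebalancing; the round-robin assignment reshuffles far more vBuckets than a real rebalance would need to move.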

I think the section on oplog tailing was rather troublesome. I think it highlights the fact that MongoDB was not engineered to scale out. It turns out that administrators and developers rely on a log file for insight and integration. The problem with scaling out MongoDB is that it requires developers to tail multiple log files. There is a log file per node.

Couchbase Server, on the other hand, does not require administrators and developers to tail a log file for insight and integration. Administrators can monitor Couchbase Server from any node. Developers can integrate with Elasticsearch or Apache Hadoop without having to interact with individual Couchbase Server nodes. The topology is transparent. That’s the benefit of a distributed system with a shared nothing architecture.

It’s worth pointing out that Couchbase Server supports cross data center replication. It allows the data in a cluster to be replicated to a different cluster. In fact, this is how we integrate with Elasticsearch. To Couchbase Server, the Elasticsearch cluster is simply another cluster to replicate data to. That it happens to be an Elasticsearch cluster is irrelevant.

Join the conversation over at reddit (link).
Join the conversation over at Hacker News (link).

Further Reading

Why We Start Scaling Vertically (Google Web Cache)

Note: MongoHQ has removed the original post and is now redirecting to a new one.

Topology: The Architecture of Distributed Systems

Note: This includes an overview of the moving parts within a MongoDB deployment that is scaled out.

Couchbase Server Elasticsearch Integration (page | doc)

Couchbase Server Apache Hadoop Integration (page | doc)

Posted by Shane Johnson, Director, Product Marketing, Couchbase

Shane K Johnson was the Director of Product Marketing at Couchbase. Prior to Couchbase, he held various roles in development and evangelism with a background in Java and distributed systems. He has consulted with organizations in the financial, retail, telecommunications, and media industries to draft and implement architectures that relied on distributed systems for data and analysis.

8 Comments

  1. so couchbase is better?

    1. I would not say one is better or worse. It really depends on your need. If you do enough research, you will see all the NoSQL vendors will have a story of various migrations from one potential datastore to another (including Riak). Just two cents..

      {full disclosure Basho employee}

      This is a great topic though, scaling out. Some thoughts:

      As Couchbase adds nodes however, how do various workload performances scale (i.e. Write/Read ratio) given vBuckets replicas are not equal participants, there are two replication strategies: 1:n master/slave and 1..n chained.

      At “large scale”, failures occur more (i.e. node-time outs, split-brain, partial failures), how does the system address failures and what amount of human intervention is required?

      This comment isn’t meant to provoke and put Riak in a favorable light as it does these things “better” ... but to ask those reading to think and ask questions when reading any materials. Riak has its own set of limitations such as JSON field queries, etc. and Couchbase is fantastic for those on Memcache API.

      Technology at the end of the day is just a tool and should be evaluated per use/case. I hope this helps raluca. Cheers. 🙂

      1. Hey Frank,

        I’ve never had a bad thing to say about Riak, and I still don’t. That being said, I’m not entirely sure I understand your comment. As nodes are added, the vBuckets are evenly spread across the cluster. Adding nodes increases performance as each node handles reads and writes for fewer vBuckets. There is only one replication strategy. It writes to all replicas (one-to-many). It does not daisy chain the writes to replicas.

    2. Yes, but you don’t have to take my word for it. I think the performance issues of MongoDB are well known.

      You don’t have to look too far to find folks in the MongoDB community telling you that you don’t need sharding. When someone tells you that you don’t need something, it’s because they can’t do it. The first example that comes to mind is the iPhone with copy & paste 😉

  2. Well, sounds like an infomercial ..
    Difficult to compare Couchbase with MongoDB.
    While MongoDB is a general purpose NoSQL, Couchbase is designed for caching: most data set has to be in memory, no complex querying, etc ..
    Also Couchbase’s hashing means lower availability as a node down means whole data set is impacted ..

    1. 1. Couchbase Server is not designed for caching. It can be deployed as a cache, as a key / value store, as a document database, or as all three. That’s the benefit of a great architecture, and it’s something you have to get right in the beginning. It’s why MongoDB has the issues it does.

      2. Hashing has nothing to do with availability. No, a failed node does not impact the whole data set. You can configure replicas. If a node fails, one of the replicas can be ‘promoted’. There is no loss of availability. Even if there are no replicas, a failed node only results in the loss of the data on said node.

      1. 1. When most of dataset has to be in memory for Couchbase to perform, it is optimized for caching ..
        2. I have no way to control where data lives due to consistent hashing in Couchbase (unlike MongoDB where I can use range partitioning). Hence a node down can cause many requests to fail (till fail over is complete after 30 secs period) as documents will be distributed across nodes. Adding additional hashing algo will allow to solve that issue.

        1. 1. Why do you keep saying that most of the data has to be in memory? Doesn’t MongoDB use memory mapped files? A “cache” implies that the data is either transient or is persisted in a separate database. When Couchbase Server is used as a key / value store or as document database, neither is true.

          2. The problem with range partitioning is that it limits performance until the database is full. This is why most (NoSQL) databases rely on consistent hashing.

          3. Couchbase Server is a CP distributed system. It maintains consistency. Yes, it will wait 30 seconds before failing over a node. This is to ensure that the node in question is down as opposed to slow. If it is slow, the data that it owns may or may not be available. You don’t want to fail over every node every time it takes too long to respond to a request. It would be unstable. While the data owned by a questionable node may or may not be available for 30 seconds, the data owned by the other nodes is available.
