From time to time we get questions on topics that do not directly fit into the documentation because they dig deeper into the internals of the client libraries. In this series we’ll cover different themes of interest; this post looks at how the SDK bootstraps itself.

Note that this post was written with the Java SDK 2.5.9 / 2.6.0 releases in mind. Things might change over time, but the overall approach should stay mostly the same.

First Steps

To know which remote hosts to ask for a config, we rely on the list of hostnames that the user provides through the CouchbaseCluster API. So, for example, if this code is provided:
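
(A minimal sketch; in a real application the hostnames would point at actual cluster nodes.)

    // Store node1, node2 and node3 as seed hosts; no connection is established yet
    Cluster cluster = CouchbaseCluster.create("node1", "node2", "node3");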

Then we are going to use node1, node2 and node3 as part of the initial bootstrap process. It’s important to understand at this point that no actual connection is established just yet. The code just stores these three hosts in the ConfigurationProvider as seed hosts. They are not stored as strings though, but rather as a NetworkAddress, which provides a convenient and more feature-rich wrapper over the plain JDK InetAddress. For example, it can be modified through system properties to disallow reverse DNS lookups or to force IPv4 over IPv6 even if the JDK is configured differently.

If no host is provided, 127.0.0.1 will be set as a seed host.

For readers who are also using an SDK that relies on the connection string for configuration: the Java SDK will parse such a string when given, but will not use any of its settings other than the list of bootstrap nodes. Every configuration option must be applied through the CouchbaseEnvironment and its builder, or through system properties where available.

One last thing we need to cover before we can open a bucket and get into the weeds. If the CouchbaseEnvironment is configured with dnsSrvEnabled, then during the cluster construction phase the code will perform a DNS SRV lookup to retrieve the actual hosts instead of using the DNS SRV reference host directly. The lookup itself utilizes standard JVM classes.
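
Here is a sketch of what enabling it looks like (the hostname is a placeholder acting as the DNS SRV reference host; typically only a single seed host is supplied in this mode):

    // Enable DNS SRV lookups; the single hostname is the DNS SRV reference host
    CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
        .dnsSrvEnabled(true)
        .build();
    Cluster cluster = CouchbaseCluster.create(env, "couchbase.example.com");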

Getting a Config

Now that we have a list of seed nodes to work with, as soon as a bucket gets opened the real work begins. Let’s consider the following code snippet as a starting point:
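
(A sketch, reusing the cluster reference from above; the bucket name is just a placeholder.)

    // Opening the bucket kicks off the config bootstrap described below
    Bucket bucket = cluster.openBucket("travel-sample");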

The act of opening a bucket has lots of small steps that need to be completed, but the first and most important step is to fetch an initial, good configuration from one of the nodes in the cluster.

The ConfigurationProvider builds up a chain of config loaders, where each loader has a different mechanism of trying to get a configuration. Currently there are two different loaders in place:

CarrierLoader: tries to fetch the configuration via the memcached binary protocol from the KV service (port 11210 by default).
HttpLoader: tries to fetch the config from the cluster manager (port 8091 by default).

The CarrierLoader takes precedence since it is the more scalable option (the KV service can handle far more requests per second than the cluster manager); if it is not available, the SDK will fall back to the HttpLoader. One reason why it could fail is that certain (older) Couchbase Server releases don’t support this mechanism. Other common failures include port 11210 being blocked by firewalls (which leads to other problems down the road anyways…) or sockets timing out.

The HttpLoader itself also has two HTTP endpoints to fetch from (again depending on the Couchbase Server version, for backwards compatibility): one that provides a terse config and one that provides a verbose one. The verbose one has been there since the Couchbase Server 1.8/2.0 days and is guaranteed to work, but newer features are likely not exposed through it.

You get the theme here: the newest/most scalable approaches are tried first with fallback options for older clients and custom configurations. In general the cluster manager is always responsible for managing and providing the current up-to-date config, but it is sent to the KV service as well so that the SDKs can retrieve it with a special command.

As a special case for older servers, if the config was loaded through the HttpLoader, the SDK will attach a streaming connection to one of the HTTP endpoints to receive config updates. If the CarrierLoader was successful, we’ll instead poll for a config via the binary protocol at a configurable interval, since that is more resource efficient on the server side (the KV service is much better suited to handle potentially lots of requests than the cluster manager, which is why the carrier loading was implemented in the first place).
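
If you want to tune that interval, the environment builder exposes it as a setting; here is a sketch, assuming the configPollInterval option available in recent 2.x releases (value in milliseconds):

    // Poll for a carrier config every 10 seconds
    CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
        .configPollInterval(10000)
        .build();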

We’ll cover the different error cases later on; for now, assume we were able to get a valid configuration from one of the loaders. At this point it is sent back to the configuration provider, where it is checked and then sent into the RequestHandler.

It is possible that the configuration provider does not forward the configuration, the main reason being that it is outdated or not new enough. Every configuration carries a revision, and the SDK internally tracks this revision number to only propagate configurations which are newer than the one already processed. During a rebalance operation this can happen quite often, and if you have debug logging enabled you can spot it by looking for a log line with the message: Not applying new configuration, older or same rev ID.
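
Conceptually (this is a simplified sketch, not the actual provider code), the check boils down to comparing revision numbers:

    // Simplified sketch, not the actual SDK code: only configs with a newer
    // revision than the one currently applied are propagated further.
    boolean shouldApply(long currentRev, long incomingRev) {
        // An older or identical revision corresponds to the
        // "Not applying new configuration, older or same rev ID" debug log line
        return incomingRev > currentRev;
    }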

As soon as the RequestHandler  gets a new config pushed it starts a reconfiguration. The reconfiguration has only one job: to take the configuration provided by the server and change the internal SDK’s state to mirror it. This can mean adding or removing node references, service endpoints or sockets. The SDK mirrors the common language of the server, so it manages Nodes which provide Services that can have one or more Endpoints.

If you enable debug logging, you can see during bootstrap how the different nodes, services and endpoints are enabled based on the configuration provided:

You can also see that it just ignores certain operations if the desired outcome is already achieved. You’ll see most of these messages when new configurations arrive, that is during bootstrap, rebalance or failover.

Reconfiguration, much like rebalance on the server, is an “online” operation. This means that in-flight operations are not affected most of the time, and if they are, they are rescheduled and put back into the ringbuffer until they succeed or time out.

As soon as the first reconfiguration has succeeded, our bucket open request is completed and the Bucket handle is returned on the synchronous API. Herein lies a non-obvious tradeoff: we could complete the open bucket request right away, but then we would not catch any errors during bootstrap. Waiting for the first reconfiguration to complete allows us to catch more errors right away, but it also means that the more nodes are added to the cluster, the longer it will take, so make sure to measure the bootstrap performance for your cluster. The default timeout of 5 seconds should be plenty for most cases, but in slower or larger cloud deployments it can make sense to increase it.

You can do this by either modifying the environment:
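
(A sketch; connectTimeout is given in milliseconds and also governs the synchronous bucket open, and the bucket name is a placeholder.)

    // Raise the connect timeout to 10 seconds for all bucket opens on this environment
    CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
        .connectTimeout(10000)
        .build();
    Cluster cluster = CouchbaseCluster.create(env, "node1");
    Bucket bucket = cluster.openBucket("travel-sample");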

or directly on the openBucket  call:
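
(A sketch using the openBucket overload that takes an explicit timeout; the bucket name is a placeholder.)

    // Use a 10 second timeout for this bucket open only
    Bucket bucket = cluster.openBucket("travel-sample", 10, TimeUnit.SECONDS);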

Bootstrap Failures and Configuration

What happens if one or all of the provided seed hosts are not available? To answer this question we need to take a quick look at how the loader tries to fetch the configurations.

For every host in the seed node list, a primary and a fallback loader are assembled, and all of them fire off in parallel trying to grab a proper config. Once the first one is found, the observable sequence is unsubscribed and only that config is used going forward. Using this approach rather than sequencing through the hosts one by one provides faster bootstrap times, but it may also lead to more errors in the log if one or more nodes are not reachable. Since we are optimizing for the good case though (in the failure case something is wrong anyways), this seems like an acceptable tradeoff.
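
As a simplified illustration (not the actual loader code), the pattern resembles merging the per-host config attempts with RxJava, which the 2.x SDK builds on, and keeping whichever emits first:

    // Simplified sketch: "attempts" is a hypothetical list with one Observable per
    // seed host and loader; all of them are subscribed in parallel and the sequence
    // is unsubscribed as soon as the first raw config arrives.
    Observable<String> firstConfig(List<Observable<String>> attempts) {
        return Observable.merge(attempts).take(1);
    }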

In essence, as long as there is a good node to bootstrap from in the seed node list, the SDK will be able to pick up a configuration. Let’s consider a special case where, in an MDS (multi-dimensional scaling) setup, you instruct the SDK to only bootstrap from a node which does not contain the KV service (let’s say only the query service and, of course, the cluster manager).

If you have debug logging enabled, you’ll see the SDK trying to fetch a config from the KV service and being rejected with an error:

The specific error doesn’t matter, but later in the log you can see how the loader switches to the HTTP fallback which is available:

One implication of this is that even if the KV service is available on another node, in this case the HttpLoader will still open its streaming connection to fetch subsequent configurations.

It is possible to customize the bootstrap behavior through environment settings:

  • bootstrapHttpEnabled: If the HttpLoader is activated at all (true by default).
  • bootstrapCarrierEnabled: If the CarrierLoader is activated at all (true by default).
  • bootstrapHttpDirectPort: Which port the HttpLoader should initially contact if SSL is not enabled (default 8091).
  • bootstrapHttpSslPort: Which port the HttpLoader should initially contact if SSL is enabled (default 18091).
  • bootstrapCarrierDirectPort: Which port the CarrierLoader should initially contact if SSL is not enabled (default 11210).
  • bootstrapCarrierSslPort: Which port the CarrierLoader should initially contact if SSL is enabled (default 11207).

These settings have sane defaults and in general do not need to be modified. They can become useful in debugging scenarios or when custom builds are tested inside Couchbase.
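
For completeness, here is a sketch of overriding a couple of these settings on the environment builder:

    // Sketch: disable the HTTP fallback entirely and explicitly set the carrier
    // port (11210 happens to be the default anyway)
    CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
        .bootstrapHttpEnabled(false)
        .bootstrapCarrierDirectPort(11210)
        .build();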

Final Thoughts

Usually bootstrap is smooth and resilient to individual node failures, but if something goes wrong and the bucket doesn’t open properly, the first order of action is always to look at the logs at INFO/WARN level and, if possible, with DEBUG logging enabled. This usually provides insight into which sockets did not connect properly or where in the bootstrap phase the client had to stop (because of errors or timeouts).

Also remember that if you run large clusters, specifying a longer bucket open timeout than the default 5 seconds might make sense if the network is slow or has a higher latency variance than usual.

Author

Posted by Michael Nitschinger

Michael Nitschinger works as a Principal Software Engineer at Couchbase. He is the architect and maintainer of the Couchbase Java SDK, one of the first completely reactive database drivers on the JVM. He also authored and maintains the Couchbase Spark Connector. Michael is active in the open source community and a contributor to various other projects such as RxJava and Netty.

2 Comments

  1. Susantha Bathige October 4, 2018 at 12:24 pm

    Nice article! Thanks.
    Just curious about what logs at server side has the bootstrap info. Is it “memcached.log”? Appreciate if you could shed more light into this. Thanks.

  2. Yes, the memcached.log on the server side has some of the information. Although it is not logging a lot (other than the newer “hello” command) since it could be very spammy.

    Anything specifically you are looking for?
