This release is another bug fix/stability release concentrating on improving the retry algorithm for Views and adding more refined logging to the client, along with a few other miscellaneous fixes.

Improving the consistency of View operations is important given that Couchbase is a distributed database and nodes can leave or enter the cluster at any time. This becomes problematic for Couchbase clients when, for example, a specific node leaves the cluster while an operation targeting that node is in mid-flight. The question is, what to do next? We _could_ simply fail and allow the hosting application to handle the error and perhaps retry the operation by its own means, but we don’t feel that is the correct approach or the type of experience that an application developer using the client would appreciate.

With 1.3.4, when a view operation fails, the client uses the HTTP status code and some heuristics to determine whether to retry the operation or to bubble the error message back to the application with its success value set to false. If a retry is in order, the client uses an exponential back-off strategy: the pause between retries doubles on each retry until the hard cap is reached or the operation succeeds. The first attempt is not counted, but the client pauses before each subsequent retry: 1ms, 2ms, 4ms, etc. This ensures that the client gives the cluster time to resolve any stability issues before outright failing the operation, without hammering the cluster with requests in what would amount to a DoS attack.
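To make the strategy concrete, here is a minimal sketch of a retry loop with exponential back-off. It is written in Python purely as an illustration and is not the client's actual code: the function name `execute_with_backoff`, the `RETRYABLE_STATUSES` set, and the shape of the result dictionary are all assumptions, and the real client's heuristics are more involved than a status-code lookup.

```python
import time

# Hypothetical set of HTTP status codes treated as transient failures.
# The post does not list the codes the real client retries on.
RETRYABLE_STATUSES = {500, 503}


def execute_with_backoff(operation, max_retries, sleep=time.sleep):
    """Call operation() until it succeeds or the retry cap is reached.

    operation must return a (status_code, body) tuple. The pause before
    each retry doubles: 1 ms, 2 ms, 4 ms, and so on. The first attempt is
    not counted as a retry. Failures are reported through the "success"
    field rather than raised.
    """
    delay_ms = 1
    retries = 0
    while True:
        status, body = operation()
        if status == 200:
            return {"success": True, "status": status, "body": body, "retries": retries}
        # Give up on non-transient errors or once the hard cap is reached.
        if status not in RETRYABLE_STATUSES or retries >= max_retries:
            return {"success": False, "status": status, "body": body, "retries": retries}
        sleep(delay_ms / 1000.0)  # time.sleep takes seconds
        delay_ms *= 2
        retries += 1
```

Passing `sleep` as a parameter is only a convenience for testing the loop without real pauses; a caller would normally leave it at its default.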

The algorithm is tunable via the ViewRetryCount property within the configuration and defaults to 2. Note that with this setting of 2, the client will try the operation four times before giving up: the first try is not counted, then the client will retry after 1ms, then 2ms, and finally 4ms. You can choose a setting between 0 and 10; 0 will disable the retries and 10 will do its last retry after a 1024ms pause (the total elapsed time is the sum of all the pauses between the first attempt and the last attempt, so it is considerably longer than 1024ms). Note that this algorithm may change in subsequent releases.
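As a rough illustration of the arithmetic, the sketch below computes the pauses implied by the description above, taking it literally: a setting of n yields pauses of 1ms up to 2^n ms, and 0 disables retries. `backoff_schedule` is a hypothetical helper written in Python for illustration; it is not part of the client, and the handling of settings other than 0, 2, and 10 is an assumption extrapolated from those three cases.

```python
def backoff_schedule(view_retry_count):
    """Return the list of pauses (in ms) for a given ViewRetryCount.

    Assumption: follows the post literally, so a setting of n gives
    pauses of 2**0 through 2**n ms, and 0 means no retries at all.
    """
    if not 0 <= view_retry_count <= 10:
        raise ValueError("ViewRetryCount must be between 0 and 10")
    if view_retry_count == 0:
        return []
    return [2 ** i for i in range(view_retry_count + 1)]


default_pauses = backoff_schedule(2)   # [1, 2, 4]
max_pauses = backoff_schedule(10)      # last pause is 1024 ms
total_max_wait_ms = sum(max_pauses)    # 2047 ms spent pausing in total
```

Under these assumptions the default setting spends 7ms pausing in total, while a setting of 10 spends 2047ms, not counting the time each request itself takes.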

Official release notes can be found here.

Posted by Jeff Morris, Software Engineer, Couchbase

3 Comments

  1. Hi Jeff,

    Recently we faced some connectivity issues with Couchbase from our .NET code. While debugging, we found that the number of TCP connections being created to the Couchbase server was very high (almost 10k per server). A little more digging showed that the connection pressure had to do with the way we were using the Couchbase client. By reading through the code and writing some unit tests, we found that the connection pool is associated with the CouchbaseClient object. Our code was creating a new client object for every request and relying on the destructor to dispose of the object. As a result there was immense connection pressure on the server, which resulted in connection timeouts and other connectivity issues.

    Based on the above observations we have changed our code to use the client object as a singleton and reduced the minimum connection pool size from 10 to 3. These changes seem to have resolved the issue. No connectivity issues were recorded in the last 3 days of stress testing.

    I wanted to understand from you whether this is the intended usage pattern. I can share more details if you need.

    Really appreciate your comments on this.

    Thanks

    -VM

    1. Hi VM –

      Yes, you are correct, the client should be a long-lived object; you create it when the process or appdomain starts and you destroy it before the process is destroyed. A singleton or public static instance is perfect; the client is thread-safe.

      You can read more here: http://docs.couchbase.com/couc

      From what I can tell, you are doing things “the right” way.

      -Jeff

