Sean Lynch's blog

October 26, 2010

Why Membase Uses Erlang

I'm asked less and less often, now that Erlang is becoming more popular, why Membase chose Erlang for its cluster management and process supervision component. Common alternatives people suggest are Java, C++, Python, Ruby, and, more recently, node.js and Clojure (which would be my top choice if Erlang were off limits to me).


April 20, 2010

Tuning Memcached Timeouts for a Cloud Environment

These days, more and more apps are running in the cloud, and they're starting to take memcached with them. For example, as we announced earlier this week, nearly 300 applications are using NorthScale's memcached as a service on Heroku's Ruby-based PaaS cloud platform.

In the past, most environments using memcached have run it on a single, controlled LAN: usually the frontend web servers sit on the DMZ, without even the usual firewall or router between the DMZ and the database. In this environment, one can reasonably expect that a server failure is far more likely than even a single dropped packet, and that waiting for a retransmit will take longer than a hit to the database. It therefore makes sense to set extremely aggressive timeouts for memcached operations, on the order of 100-250 ms or less.
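As a rough illustration of that policy, here is a minimal sketch that treats a slow or unreachable memcached server as a cache miss and goes to the database instead. It uses only Python's standard socket module and the memcached text protocol's "get" command. The names fetch, parse_get_reply, load_from_db and LAN_TIMEOUT_S are hypothetical and chosen for this example, and the reply parsing is deliberately simplified (it assumes the whole reply arrives in a single read), so a real client library would be the better choice in practice.

```python
import socket

# Assumption for this sketch: the upper end of the 100-250 ms range.
LAN_TIMEOUT_S = 0.25


def parse_get_reply(reply):
    """Return the value from a memcached text-protocol 'get' reply, or None on a miss.

    Simplified: assumes the complete reply is already in `reply`.
    """
    if not reply.startswith(b"VALUE "):
        return None  # b"END\r\n" (miss) or an empty/unexpected reply
    header, _, rest = reply.partition(b"\r\n")
    length = int(header.split()[3])  # header is: VALUE <key> <flags> <bytes>
    return rest[:length]


def fetch(key, host, port, load_from_db, timeout_s=LAN_TIMEOUT_S):
    """Try memcached first; on timeout, failure, or miss, call load_from_db(key)."""
    try:
        # The timeout applies to the connect and to each later send/recv.
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(b"get " + key.encode("ascii") + b"\r\n")
            reply = sock.recv(4096)
    except OSError:
        # Covers socket timeouts, refused connections, and resets.
        return load_from_db(key)
    value = parse_get_reply(reply)
    return value if value is not None else load_from_db(key)
```

A caller might write fetch("user:42", "127.0.0.1", 11211, load_user), where load_user is whatever function reads the record from the database. The timeout is a parameter because the right value depends on the network.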

In contrast, cloud networking environments tend to be far less controlled, since they're shared with other customers, and even the location of a given service is not necessarily under the control of the user.


March 15, 2010

Avoiding Death Spirals in Distributed Systems

It's easy to build a distributed service that works perfectly under ideal conditions but fails under real-world load. Death spirals are among the most common of these failures, and their most common causes can be avoided by following some simple design guidelines.
