December 16, 2010

libmembase – a C interface to Membase

Membase is "on the wire" compatible with any memcached server if you connect to the standard memcached port (registered by myself back in 2009), so you should be able to access membase with any "memcapable" client. Backing this port is our membase proxy named moxi, and behind the scenes it will do SASL authentication and proxy your requests to the correct membase server containing the item you want. One of the things that sets Membase apart from Memcached is that we store each item in a given vbucket that is mapped to a server. When you grow or shrink the cluster, membase will move the vbuckets to new servers.
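To make the vbucket idea a bit more concrete, here is a minimal sketch of the two-step lookup: key to vbucket, then vbucket to server. Everything in it is an assumption made for illustration: the `sketch_*` names are made up, the hash is plain FNV-1a rather than whatever Membase really uses, and eight vbuckets is far fewer than a real cluster would have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_VBUCKETS 8 /* illustrative only */

/* FNV-1a: a stand-in hash for this sketch, NOT Membase's real hash. */
static uint32_t sketch_hash(const void *key, size_t nkey)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < nkey; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* The vbucket depends only on the key, never on the cluster layout. */
static uint16_t sketch_vbucket_for_key(const void *key, size_t nkey)
{
    return (uint16_t)(sketch_hash(key, nkey) % NUM_VBUCKETS);
}

/* vbucket -> server index; this table is what changes on a rebalance. */
struct sketch_vbucket_map {
    int server[NUM_VBUCKETS];
};

static int sketch_server_for_key(const struct sketch_vbucket_map *map,
                                 const void *key, size_t nkey)
{
    return map->server[sketch_vbucket_for_key(key, nkey)];
}

/* Model "moving a vbucket" by pointing its slot at another server. */
static void sketch_move_vbucket(struct sketch_vbucket_map *map,
                                uint16_t vb, int new_server)
{
    assert(vb < NUM_VBUCKETS);
    map->server[vb] = new_server;
}
```

The point of the indirection is that a key never changes vbucket; growing or shrinking the cluster only rewrites entries in the vbucket-to-server table, and a client that knows the table can go straight to the right node.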

There is no such thing as a free lunch, so accessing membase through moxi "costs" more than talking directly to the individual nodes yourself. We like to refer to clients that talk directly to the nodes as "smart clients." As a developer on Memcached I need to test various stuff, so I went ahead and hacked together a quick prototype of such a library to ease my testing. Initially I wanted to extend libmemcached with this functionality, but that seemed to be a big (and risky) change I didn't have the guts to do at the time.

The current state of the library is far from production quality, and it supports only a minimal list of features. So why announce it now? Well, I don't think I'll find the time to implement everything myself, so I'm hoping that people will join me in adding features to the library when they need something that isn't there...

I've designed the library to be 100% callback based and integrated with libevent, making it easy for you to plug it into your application.

So let's say you want to create a TAP stream and listen to all of the modifications that happen in your cluster. All you need to do would be:
 

struct event_base *evbase = event_init();

libmembase_t instance = libmembase_create(host, username, passwd, bucket, evbase);
libmembase_connect(instance);

libmembase_tap_filter_t filter;
libmembase_callback_t callbacks = {
   .tap_mutation = tap_mutation
};
libmembase_set_callbacks(instance, &callbacks);
libmembase_tap_cluster(instance, filter, true);

Then you would implement the tap callback function as:
 

static void tap_mutation(libmembase_t instance,
                         const void *key, size_t nkey,
                         const void *data, size_t nbytes,
                         uint32_t flags, uint32_t exp,
                         const void *es, size_t nes)
{
   // Do whatever you want with the object
}

And that's all you need to do to tap your entire cluster :-) Let's extend the example to tap multiple buckets from the same code.
 

struct event_base *evbase = event_init();

libmembase_t instance1 = libmembase_create(host, username, passwd, bucket1, evbase);
libmembase_t instance2 = libmembase_create(host, username, passwd, bucket2, evbase);
libmembase_connect(instance1);
libmembase_connect(instance2);

libmembase_tap_filter_t filter;
libmembase_callback_t callbacks = {
   .tap_mutation = tap_mutation
};
libmembase_set_callbacks(instance1, &callbacks);
libmembase_set_callbacks(instance2, &callbacks);
libmembase_tap_cluster(instance1, filter, false);
libmembase_tap_cluster(instance2, filter, false);

event_base_loop(evbase, 0);


The instance handle is passed to the callback function so you should be able to tell which bucket each mutation event belongs to.
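One simple way to do that bookkeeping is a small lookup table from handle to bucket name. The sketch below is not part of libmembase: `register_bucket` and `bucket_for_handle` are hypothetical helpers, and the handle is treated as an opaque `const void *` on the assumption that an instance handle can be compared for equality like a pointer.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INSTANCES 4

/* One row per instance we created: opaque handle -> bucket name. */
struct bucket_entry {
    const void *handle;
    const char *bucket;
};

static struct bucket_entry registry[MAX_INSTANCES];
static size_t num_registered;

/* Remember which bucket a handle belongs to; returns -1 if the table is full. */
static int register_bucket(const void *handle, const char *bucket)
{
    assert(handle != NULL);
    if (num_registered == MAX_INSTANCES) {
        return -1;
    }
    registry[num_registered].handle = handle;
    registry[num_registered].bucket = bucket;
    ++num_registered;
    return 0;
}

/* Look up the bucket name for a handle; NULL if it was never registered. */
static const char *bucket_for_handle(const void *handle)
{
    for (size_t i = 0; i < num_registered; ++i) {
        if (registry[i].handle == handle) {
            return registry[i].bucket;
        }
    }
    return NULL;
}
```

You would call something like `register_bucket` right after creating each instance, and `bucket_for_handle` at the top of the mutation callback with the handle it was given.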

As I said, all of the functions in the API are callback based, so if you want to retrieve an object you have to register a callback for get before calling libmembase_mget. For example:
 

libmembase_callback_t callbacks = {
    .get = get_callback
};
libmembase_set_callbacks(instance, &callbacks);
libmembase_mget(instance, num_keys, (const void * const *)keys, nkey);

// If you don't want to run your own event loop, you can call the following method
// that will run all spooled commands and wait for their replies before breaking out
// of the event loop
libmembase_execute(instance);

 

The signature for the get callback looks like: 

void get_callback(libmembase_t instance, libmembase_error_t error,
                  const void *key, size_t nkey,
                  const void *bytes, size_t nbytes,
                  uint32_t flags, uint64_t cas)
{
   // do whatever you want...
}
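Since the callback hands you the key as a pointer plus a length, it is safest to assume (and this is an assumption, not something the library promises) that the key is not NUL-terminated. Here is a small sketch of a helper you could call from inside such a callback to build a log line; `describe_item` is a made-up name and not part of libmembase.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "key (N bytes, flags 0x..)" into out.
 * "%.*s" prints at most nkey characters, so the key does not need
 * a terminating NUL. Returns whatever snprintf returns.
 */
static int describe_item(char *out, size_t outlen,
                         const void *key, size_t nkey,
                         size_t nbytes, uint32_t flags)
{
    return snprintf(out, outlen, "%.*s (%zu bytes, flags 0x%x)",
                    (int)nkey, (const char *)key,
                    nbytes, (unsigned int)flags);
}
```

Inside a get callback you would pass along the `key`, `nkey`, `nbytes` and `flags` arguments you were given, then print or log the resulting buffer.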

So what is missing from the library right now? 

• Proper error handling. Right now I'm using asserts and abort() to handle error situations, causing your application to crash... you don't want that in production ;-)
• Timeouts. Right now it will only time out on TCP timeouts.
• A lot of operations! I'm only supporting get/add/replace/set...
• Fetching replicas.
• Gracefully handling changes in the vbucket list.
• +++

Do you feel like hacking on some of them?

For more on where to get and build libmemcached, click here.
