February 16, 2010

The Memcached Way

One terrific part of the NorthScale startup adventure has been that we've been lucky to have so many great interactions with the memcached community -- including folks who've used memcached to power some of the largest and most popular web applications and sites on the planet. It seems appropriate as we launch NorthScale to take a moment to pull together a few war stories and lessons learned to date, and roll them up into a larger pattern that I'd like to call "The Memcached Way."

The first part of The Memcached Way is...

Simplicity Wins

Memcached is simple, down to its bones, both in its design and in its philosophical approach to solving problems. This simplicity is why memcached has become so popular and why it delivers on its promise of relieving pain.

Memcached has a very simple usage pattern. It's easy to understand (just like a hashtable), and every developer can grok its simple key-value API of get, set, and delete -- even at 2AM when we're under the gun to get the website working again. It's also simple in that you can introduce memcached in a very piecemeal fashion -- just take care of the home page first with some memcached calls so you can breathe again. Tomorrow, look for the next set of slow queries and handle them with memcached. Then do some more memcached'ing the day after that. No need for a big app or systems rearchitecture where you call in the big, expensive guns. All the extra verbs (incr/decr, add/replace, etc.) are just as dead simple to understand and use.

Memcached has a simple performance philosophy -- each command must have an O(1) implementation. Because every memcached command follows that requirement, you know memcached is blazing fast and will remain blazing fast. For every command. If it's not an O(1) operation, it won't go into the core memcached server.

As an aside, there are new "range" commands that have been specified as part of the memcached binary protocol. These are meant to return the set of key/value results whose keys fall within a given range. However, given the O(1) performance rule, it's highly likely the core memcached server software will not implement the range commands out-of-the-box, since the usual implementations of range operations take more than O(1). Range command implementations will, instead, be left as an engineering exercise for memcached extension developers, and we're especially excited on this front!

Back to the simplicity story...

Memcached has a simple, easy-to-implement networking protocol. That allowed for fast adoption by client-library developers in the community, which meant you could use memcached from your favorite programming language, because somebody had likely already written a client library for it. And, if you hated the client-library implementation or thought it was brain-dead, it was easy to write your own, better one and push it out to the world. Unfortunately, for some programming languages, that left a little too much choice: sometimes the default client library for a language was the slowest one, or it didn't implement all the new protocol features. Rod Ebrahimi here at NorthScale has taken a whack at this client-library "proliferation" issue, and has been contacting client-library implementers, running tests, and coming up with a NorthScale "preferred" or "recommended" set of client-library implementations to provide some high-quality signal to y'all. Thank you Rod!
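To give a feel for how little a client library has to do, here is a rough Python sketch of building and parsing a couple of messages in memcached's text protocol. It only handles a single-key get and a basic set, the helper names are hypothetical, and the flags and expiry defaults of 0 are assumptions for the example; a real client also needs sockets, error replies, and multi-key responses.

```python
def build_set(key, value, flags=0, exptime=0):
    """Build a text-protocol 'set' request; value must be bytes."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key):
    """Build a text-protocol 'get' request for one key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw):
    """Parse the reply to a single-key get.

    A hit looks like: VALUE <key> <flags> <bytes>\r\n<data>\r\nEND\r\n
    A miss is just:   END\r\n
    Returns the data bytes, or None on a miss.
    """
    if raw == b"END\r\n":
        return None
    header, _, rest = raw.partition(b"\r\n")
    parts = header.split()
    if len(parts) != 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected response header")
    length = int(parts[3])
    # Slice by the declared length, since the data itself may contain \r\n.
    return rest[:length]


request = build_set("greeting", b"hello")
hit = parse_get_response(b"VALUE greeting 0 5\r\nhello\r\nEND\r\n")
miss = parse_get_response(b"END\r\n")
```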

Again, back to simplicity...

Memcached has a simple scaling strategy -- so it's foolproof. Because memcached follows a simple "shared nothing" server design, it scales out just by adding servers. Memcached servers simply don't talk to each other, meaning a whole morass of difficult and subtle mesh and O(N^2) issues is completely avoided with out-of-the-box memcached. Instead of having a single master lookup server, for example, memcached punts the issue to the user's application or clients, which must track their own server membership lists. Now, we've seen lots of duct-tape and hand-rolled solutions in the wild as folks tried to handle this punt in myriad ways, and in a later blog post, I'll describe the NorthScale pre-canned solution that brings simplicity to the client-side server-list management issue.

Simple just works. Do a job and do it simply and well and it's easier to succeed. Thanks to the memcached community for proving it! More on The Memcached Way in forthcoming blog posts.
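For readers who haven't seen the client-side "punt" in practice, here is one naive way it can look, sketched in Python: the client keeps its own server list and hashes each key to pick a server. This is an assumption-laden illustration, not the NorthScale solution mentioned above; pick_server and the example.com host names are hypothetical.

```python
import zlib


def pick_server(key, servers):
    """Map a key to one server from the client's own membership list.

    Uses a CRC32 of the key modulo the list length. The servers never
    coordinate; every client with the same list picks the same server.
    """
    if not servers:
        raise ValueError("server list is empty")
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


# Hypothetical membership list, maintained by the application itself.
servers = [
    "cache1.example.com:11211",
    "cache2.example.com:11211",
    "cache3.example.com:11211",
]

chosen = pick_server("user:42", servers)
```

The catch with plain modulo hashing is that adding or removing a server changes where most keys map, which is one reason many client libraries use consistent hashing instead, and part of why server-list management ends up hand-rolled.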
