June 21, 2012

Why Database Technology Matters

[This blog was syndicated from http://damienkatz.net/]

Sometimes I get so down in the weeds of database technology, I forget why databases are so fascinating to me, why I found them so important to begin with. ACID. Latency, bandwidth, durability, performance, scalability. Bits and bytes. Virtual this, cloud that. Blah blah blah. Who the fuck cares?

I care.

Dear lord I care. I care so much it hurts.

"A database is an organized collection of data, today typically in digital form." -Wikipedia

I think about databases so much. So so much. New schemes for expanding their capacity, new ways of making them work, new ways of making them faster, more reliable, new ways of making them accessible to more developers and users.

I spend so much time thinking about them, it's embarrassing. As much time as I spend thinking about them, I feel like I should know so much more than I do.

HTTP, JSON, memcached, elastic clusters, developer accessibility, incremental map/reduce, distributed indexing, intra-cluster replication, cross-cluster replication, tail-append generational storage, disk fragmentation, memory fragmentation, memory/storage hierarchy, disk latency, write amplification, data compression, multi-core, multi-threading, inverted indexes, language parsing, interpreter runtimes, message passing, shared memory, recovery-oriented architectures. All that stuff that makes a database tick.

Why do I spend so much time on this? Why have I spent so many years on them?
Why do they fascinate me so much? Why did I quit my job and build an open source database engine with my own money, when I wasn't wealthy and I had a family to support?

Why the hell did I do that?

Because I think database technologies are among the most important fundamental advancements of humanity and our collective consciousness. I think databases are as important as telecommunications and the internet. I think they are as important as any scholarly library -- and that libraries are the earliest non-digital databases. I think databases are almost as important as the invention of the written word.

Forget SQL. Forget network, document or object databases. Forget the relational algebra. Forget schemas. Forget joins and normalization. Forget ACID. Forget Map/Reduce.

Think knowledge representation. Think knowledge collection, transformation, aggregation, sharing. Think knowledge discovery.

Think of humanity and its collective mind expanding.

When IBM was at the absolute height of its power, they were the richest, most powerful company on the planet. They primarily sold mainframes for a lot of money, and at the core of those mainframes were big database engines, providing a big competitive advantage their customers gladly paid for.

Google has created a database that indexes the internet. They are a force because they found ways to find meaning in the massive amounts of information already available. They are a very visible example of changing the way humanity thinks.

File systems are very simple databases. People have been building all sorts of searching and aggregation technology on top of them for many years, to better unlock all that knowledge and information stored within.

Email? Email technology is essentially databases that you can send messages to. It's old fashioned and simple, and yet our email systems keep getting more clever about ways to show us what's in our unstructured personal databases.

Databases don't have to be huge to have a huge impact. SQLite makes databases accessible on small devices. It's the most deployed database on the planet. It's often easy to miss the impact. When it's billions of small installations, it starts to look like air. Something that's just there, all around us. But add it up and the impact is huge.

And of course big bad Oracle. As much as people love to hate them, they've made reliable database technology very accessible, something you can bet your business on, year after year. They are great at not just making the technology work, but the complete ecosystem around it, something necessary for enterprises and mission critical uses. There is a lot to criticize about them, but much to praise as well.

So yes, I care. I care deeply. I care about the big picture. And I care about the bits and bytes. I care about the ridiculously complex details most people will never see. I care about the boring stuff that makes the bigger stuff happen. And sometimes I forget why I care about it. Sometimes I lose sight of the big picture as I'm so focused on making the details work.

And sometimes I remember. And I feel incredibly lucky and privileged for the opportunities to have a positive impact on the collective mind of humanity. And my reward is to know, in some small way, that I've succeeded. And I want to do more. This is important stuff, the most important and effective way I know how to contribute to the world. It matters to me.
