Alex Sirbu, R&D Team Lead at Lentiq, has seven years’ experience architecting and building reliable distributed systems.

Lentiq is a Kubernetes-based, multi-cloud data lake-as-a-service startup, which means we manage other people’s data and workflows every day for a living. To do this properly, we need a fast, reliable, and scalable database to store all our internal information, one that also lets us implement a microservices-oriented architecture. Couchbase has proven time and again that it covers all these requirements, and it has been our database of choice ever since the project started. This post presents our history of integrating and using Couchbase as the sole persistent internal storage layer at Lentiq.

Choice

Why did we choose Couchbase in the first place? The answer stems from our previous experience with it. Our team moved from the bare metal cloud project at Bigstep, our parent company, to the new Lentiq project. Some of our colleagues had worked on the Couchbase product used by Bigstep, running performance tests (which showed how well it worked) and, at some point, using it as a cache in front of a slower relational database. This previous experience with Couchbase proved valuable when we had to choose our internal storage layer for Lentiq.

An important factor in our selection was the architecture of the new product. Because we were splitting the application into multiple decoupled microservices, we needed to apply the same decoupling principle to the database layer. This would further enforce the separation of concerns between the modules: the only way for a microservice to obtain information from another service would be via exposed interfaces – no database bypassing allowed. We also needed to move fast and experiment without having to deal with strict schemas, so we decided to take the NoSQL route for this project. And, last but not least, as the project was to be written in Java and Spring, easy integration with the database from that stack was another essential factor.

Taking all these requirements into consideration, and seeing how our old friend Couchbase checked all the boxes, we decided to use Couchbase for our MVP – and haven’t changed our minds since!

 

Architecture

The latter two requirements are pretty straightforward, but how did we use Couchbase to achieve the first requirement – a microservices-based architecture?

The obvious solution was to allocate a bucket per microservice, giving each a separate user that can only access its own bucket, with the cluster access credentials handed to each service upon startup. Easy, right? By using decoupled modules, we managed to (mostly) have only one object type with which each microservice works closely. This object has both public fields (external, accessible by outside microservices) and internal fields. All the fields are saved inside the bucket, but the object is cleaned and exposed only with its public fields via the external interfaces of its owner.
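As a minimal sketch of the public/internal split described above (not the actual Lentiq code), the owning service can persist every field but hand out only a filtered copy. The class name, the field names, and the fixed set of public fields are all hypothetical, chosen for illustration only:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical document owned by a single microservice; all names are illustrative.
final class DatasetDocument {
    // Fields the owning service is willing to expose through its external interface.
    static final Set<String> PUBLIC_FIELDS = Set.of("id", "name", "status");

    private final Map<String, Object> fields = new LinkedHashMap<>();

    DatasetDocument put(String key, Object value) {
        fields.put(key, value);
        return this;
    }

    // What is persisted in the service's own bucket: every field.
    Map<String, Object> toStoredForm() {
        return new LinkedHashMap<>(fields);
    }

    // What other microservices receive: a copy with the internal fields stripped.
    Map<String, Object> toPublicView() {
        Map<String, Object> view = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            if (PUBLIC_FIELDS.contains(entry.getKey())) {
                view.put(entry.getKey(), entry.getValue());
            }
        }
        return view;
    }
}

class PublicViewDemo {
    public static void main(String[] args) {
        DatasetDocument doc = new DatasetDocument()
                .put("id", "ds-1")
                .put("name", "clickstream")
                .put("status", "READY")
                .put("storagePath", "internal/location"); // internal field, never exposed
        System.out.println("stored: " + doc.toStoredForm());
        System.out.println("public: " + doc.toPublicView());
    }
}
```

Keeping the filtering inside the owner means no other service ever needs to know which internal fields exist.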

This allocation also helped with scaling and reliability. As we had few microservices at the beginning, we put all the buckets inside the same Couchbase cluster. As we added more and more, we grouped them by traffic and object size and split them across multiple clusters, further isolating them and allowing independent scaling.

 

Queries

By using the Spring Data Couchbase library, connecting to Couchbase and actually using it for reading and writing data becomes easy.

Common read and write operations (insert, update, delete, findById) are provided by Couchbase repositories and templates (CouchbaseRepository and CouchbaseTemplate). What helps us a lot is how easily these basic operations can be extended. For example, querying for all the project environments inside the mini-data lake we call a data pool is done by simply having the following repository class:

For more complex querying (e.g., searching for a value inside an array), using the N1QL query language is as easy as adding an annotation to the given function. The following example returns all the datasets that have a given status and contain the required visibility in their visibility array:

The n1ql.filter field is autopopulated with “_class=**SpecifiedClassForRepository**” to return only the requested type of objects, allowing multiple types of objects to be stored inside the same bucket without having to do anything special.
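To make the shape of such a query concrete, here is a rough plain-Java sketch that assembles a comparable N1QL statement by hand, with the type filter written out explicitly. It does not use Spring Data Couchbase: the bucket name, the entity class name, and the status and visibility field names are assumptions, and the SELECT clause is simplified compared to what the library generates.

```java
// Hand-rolled illustration of a N1QL statement with a type filter and an
// array search; this is not Spring Data Couchbase's own code.
final class N1qlStatementSketch {

    // Mirrors the idea behind the filter placeholder: restrict results to one entity type.
    static String typeFilter(String entityClassName) {
        return "`_class` = \"" + entityClassName + "\"";
    }

    // $1 is the requested status, $2 the visibility value searched for inside the array.
    static String datasetsByStatusAndVisibility(String bucket, String entityClassName) {
        String keyspace = "`" + bucket + "`";
        return "SELECT " + keyspace + ".* FROM " + keyspace
                + " WHERE " + typeFilter(entityClassName)
                + " AND status = $1"
                + " AND ANY v IN visibility SATISFIES v = $2 END";
    }

    public static void main(String[] args) {
        // Hypothetical bucket and entity class names.
        System.out.println(datasetsByStatusAndVisibility("datasets", "com.example.Dataset"));
    }
}
```

The ANY ... SATISFIES ... END clause is what performs the search inside the array, while the `_class` condition keeps other object types stored in the same bucket out of the result.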

Locking

Not only are we using Couchbase for storing and querying data, but we’re also using it to perform locking for our services.

One example where we need locking is when a client wants to create a new application to crunch some data, and we have to check whether the user has sufficient resources to boot up the application. Doesn’t look like a straightforward need for locking, right? Well, as our environment is collaborative and multiple people can work on the same project, multiple clients can create applications inside the same environment at the same time. So, if we don’t lock that particular environment, then even if each application would fit on its own, booting them up at the same time can exceed the actual physical resources, and one application – or both, if the applications have multiple components – could fail. Thus, the requirement is to perform the check while holding a lock, and to release it only once the new application can be taken into account by the other create operation.

What is our locking solution then? Pessimistic locking, in the form of getAndLock and unlock, which blocks access to a given document while the lock is kept. Here’s a simplified example:
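A minimal sketch of the check-under-lock pattern follows. It uses an in-memory stand-in rather than the Couchbase SDK, so InMemoryLockStore, ApplicationLauncher, the "lock::" key prefix, and the CPU quota are all hypothetical; a real lock would also time out, and a caller that finds the document locked would typically retry instead of failing.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for a document store with getAndLock/unlock semantics.
// It is NOT the Couchbase SDK; names and behaviour are simplified assumptions.
final class InMemoryLockStore {
    private final Set<String> documents = new HashSet<>();
    private final Map<String, Long> locks = new HashMap<>(); // id -> CAS of the lock holder
    private long nextCas = 1;

    synchronized void insert(String id) {
        documents.add(id);
    }

    // Returns a CAS token identifying the holder; fails if the document is missing or locked.
    synchronized long getAndLock(String id) {
        if (!documents.contains(id)) {
            throw new IllegalStateException("document not found: " + id);
        }
        if (locks.containsKey(id)) {
            throw new IllegalStateException("document already locked: " + id);
        }
        long cas = nextCas++;
        locks.put(id, cas);
        return cas;
    }

    // Only the holder of the matching CAS token may release the lock.
    synchronized void unlock(String id, long cas) {
        Long holder = locks.get(id);
        if (holder == null || holder != cas) {
            throw new IllegalStateException("CAS mismatch for: " + id);
        }
        locks.remove(id);
    }
}

// Hypothetical service that checks a per-environment CPU quota while holding the lock.
final class ApplicationLauncher {
    private final InMemoryLockStore store;
    private final int cpuQuota;
    private final Map<String, Integer> usedCpu = new ConcurrentHashMap<>();

    ApplicationLauncher(InMemoryLockStore store, int cpuQuota) {
        this.store = store;
        this.cpuQuota = cpuQuota;
    }

    boolean tryCreate(String environmentId, int requestedCpu) {
        String lockId = "lock::" + environmentId;
        long cas = store.getAndLock(lockId);
        try {
            int used = usedCpu.getOrDefault(environmentId, 0);
            if (used + requestedCpu > cpuQuota) {
                return false; // not enough resources left in this environment
            }
            // Record the reservation before releasing the lock, so the next
            // create operation sees this application in its own check.
            usedCpu.put(environmentId, used + requestedCpu);
            return true;
        } finally {
            store.unlock(lockId, cas);
        }
    }
}

class LockingDemo {
    public static void main(String[] args) {
        InMemoryLockStore store = new InMemoryLockStore();
        store.insert("lock::env-1");
        ApplicationLauncher launcher = new ApplicationLauncher(store, 8);
        System.out.println(launcher.tryCreate("env-1", 6));
        System.out.println(launcher.tryCreate("env-1", 4));
    }
}
```

The important detail is that the reservation is written inside the locked section; releasing the lock right after the check alone would reopen the race.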

 

The main issue we encountered with this approach is that, for getAndLock to work every time, the document it locks must already exist, and there is no atomic getOrCreateAndLock. To solve this, since the lock is tied to the environment it guards, we integrated the creation and deletion of the lock document into the creation and deletion flows of the environment. Thus, the document already exists by the time a client gets to create an application (as the full environment creation process needs to have finished beforehand), so the lock can be properly taken. As an added bonus, once the environment gets deleted, the document is deleted as well and doesn’t remain in the bucket doing nothing.
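The lifecycle coupling can be sketched as follows, again in plain Java with a Set standing in for the bucket that holds the lock documents. EnvironmentLifecycle, its method names, and the "lock::" key prefix are hypothetical and only illustrate the idea:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical environment service; the Set stands in for the bucket holding lock documents.
final class EnvironmentLifecycle {
    private final Set<String> lockDocuments = ConcurrentHashMap.newKeySet();

    static String lockId(String environmentId) {
        return "lock::" + environmentId;
    }

    // Creating the environment also creates its lock document, so a later
    // lock attempt never has to deal with a missing document.
    void createEnvironment(String environmentId) {
        // ... provisioning of the environment itself is omitted ...
        lockDocuments.add(lockId(environmentId));
    }

    // Deleting the environment removes the lock document along with it.
    void deleteEnvironment(String environmentId) {
        // ... teardown of the environment itself is omitted ...
        lockDocuments.remove(lockId(environmentId));
    }

    // True when a lock document exists and a lock could therefore be taken.
    boolean lockDocumentExists(String environmentId) {
        return lockDocuments.contains(lockId(environmentId));
    }
}

class LifecycleDemo {
    public static void main(String[] args) {
        EnvironmentLifecycle lifecycle = new EnvironmentLifecycle();
        lifecycle.createEnvironment("env-1");
        System.out.println(lifecycle.lockDocumentExists("env-1"));
        lifecycle.deleteEnvironment("env-1");
        System.out.println(lifecycle.lockDocumentExists("env-1"));
    }
}
```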

 

Conclusion

At Lentiq, we know that having a reliable and fast storage layer is of the utmost importance, both for our clients and for our internal services. Couchbase has proven to us that we made the right decision when we chose it as the database layer for our internal microservices, and we wholeheartedly recommend Couchbase if you are in need of a proper NoSQL database!

This post is brought to you by the Couchbase Community Writing Program

Posted by Laura Czajkowski, Developer Community Manager, Couchbase

Laura Czajkowski is the Snr. Developer Community Manager at Couchbase overseeing the community. She’s responsible for our monthly developer newsletter.
