Michael  K. Aboagye works as an application security engineer and web pentester with good knowledge in Linux, Git, PHP, Java, RDMS and Nosql database development. He likes to dabble in devops also.

Michael K. Aboagye

Document-Oriented Database and N1QL

For this article, I will share my knowledge on why a document-oriented database is entirely different from a relational database management system, and why a document-oriented database performs far better than a relational database management system in a world where big data is progressing steadily. Finally, I will show how to create a document-oriented database with Couchbase and how to use SQL syntax to run against JSON documents.

Brief Introduction to Couchbase

Couchbase, like MongoDB, is a NoSQL product or database system. NoSQL in this context simply means that data is stored without the aid of a structured query language; data is stored in documents. Couchbase is a combination of two products: Memcached and Couchbase.

Couchbase combines advantages of both Memcached and Couchbase to provide scale-out performance over a cluster of nodes. Couchbase is among a number of NoSQL products which has compromised eventual consistency to place more emphasis on availability, scalability, and partition tolerance.

According to the CAP theorem by Eric Brewer, it is impossible for distributed systems such as MongoDB or Couchbase to fulfill all the three concepts – consistency, availability, and partial tolerance. Thus, Couchbase addresses eventual consistency through its advanced replication feature, cross datacenter replication (XDCR), to effectively manage node failure, a common problem for most relational database management systems. provides a weaker form of consistency known as eventual consistency through replication. Moreover, Couchbase has its own way of dealing with node failure which is a common problem of most relational database management systems. Couchbase implements a vBucket known as a logical partition or sharding. Each data item in a vBucket is stored on all present nodes via replication. Thus, in case of node failure, Couchbase makes one of the replica vBuckets available automatically.

A Couchbase Server node comes with two major components – Data Manager and Cluster Manager.

Cluster Manager – is in charge of the Couchbase cluster environment. It handles the rebalancing of data between nodes, provides a cluster map that helps clients requesting particular data to know where to look for that data, monitoring nodes, gathering of statistics, and logging.

Data Manager – simply manages data storage and retrieval in documents. It contains the memory cache, query engine, and disk persistence mechanism.

Finally, Couchbase manages data in buckets. A bucket in Couchbase is similar to a database in Microsoft SQL Server or MySQL. A bucket is made up of related resources but different schemas. When there is more than one or two application–driven databases, you need two buckets to manage data coming from those two applications separately.

Difference Between RDBMS and Document-Oriented Database

A relational database system is completely different than a document-oriented database in terms of concept and structure (schema). Moreover, a relational database is built using SQL commands which are not too difficult to code and read.

In a relational database, tables are linked to each other through foreign keys. Foreign key is a concept in a SQL database system whereby a foreign key consists of common fields which are used to link tables. Also, schemas in relational databases are quite rigid as compared to a document-oriented database.

Let’s assume Aero Air has a relational database to manage its flights. The database consists of three tables, namely: Flight Schedule, Flight Information, and Flight Routes.

Flight ID Flight Name NumSeats Distance
001 Boeing Erv 5,000 1318km
002 Airbus Erv 1,000 1578km
003 Max Erv 2,000 2967km

 

Routes Code Flight ID From To
AeroAir001 001 Glasgow Paris
AeroAir002 002 Brussels Bengaluru
AeroAir003 003 Hamburg Moscow

 

Schedule Code Flight ID Dep Time Arrival Time
AeroAir-010 001 1700 GMT 2000 GMT
AeroAir-012 002 1600 GMT 1300 GMT
AeroAir-013 003 1400 GMT 1400 GMT

In a document-oriented database, schema of a database differs in every document in a database. A document-oriented database allows dynamic schema; thus a document may contain a specific field or key different from other documents. In a typical document database, SQL commands are not allowed. However, you can use N1QL, also referred to as “Nickel,” to run SQL against your documents.

N1QL is a specific query language for Couchbase that allows you to bring both worlds of NoSQL and SQL together. You can use N1QL to run against a JSON document, a feature that makes N1QL interesting, which we will explore later.

For our previous example of Aero Air database of flights, we can do the same using Couchbase by naming our bucket “AeroAir” to manage the stored data. Through the AeroAir bucket, Couchbase developers can edit data in documents.

Creating a Document-Oriented Database with Couchbase

Creating a document-oriented database iwith Couchbase requires basic knowledge in JSON format.

First, let’s assume you have already installed Couchbase Server (eitherEnterprise edition or Community edition).

Here is an overview of Couchbase Server:

Overview

How to Create a Bucket in Couchbase

Before you can create documents, you need to create a bucket to manage stored data. You can enter in any name you prefer; I chose to name the bucket AeroAirImage 2

As you can see, we have one active bucket, which is the AeroAir bucket.

Active bucket 1

Now let’s assume AeroAir has a simple document made up of flight details relevant to travelers or customers.

DOCUMENT 1

Remember, a document-oriented database is entirely different and quite flexible in comparison to a relational database belonging to the SQL family. With a document-oriented database, you can include extra fields in other documents as well as exclude them.

Let’s create a flight document for Boeing ERV. On the far right AeroAir, you can see that there are two buttons labeled Documents and View. Click in the document button to create a new flight document for our AeroAir bucket.

You create a new document by editing predefined data in JSON format with your own data.

PREDINE

As you can see, I have created a flight document for Boeing ERV in the AeroAir bucket.

new doc

In addition, I have created four documents in the AeroAir bucket. Each document represents flight information data, flight schedule data, and route information. You can go on to create as many documents as you want for the AeroAir bucket.

new doc set

Below is an illustration of how data in a document looks. Leaving a trailing comma in a JSON document is considered valid unless the comma in the last element precedes the closing tag.

json example

When you include an additional comma, you are notified of a JSON error. Check out the example below:

JSON ERROR

As stated earlier, with a document-oriented database you can input fields which are absent in some documents but present in other documents. It does not affect the database entirely because a document-oriented database has a very flexible schema. Check the example between flight Boeing ERV and Boeing FFH.

Beoing ERV example

Boeing ffh example

As you can see, although both documents are in the AeroAir bucket, one document has a field which is absent in another document, yet our database or bucket is not affected.

In Couchbase, accessing data in a document is quite different from a relational database where you rely on primary key to retrieve a particular value in a row. In Couchbase you can use the Document_ID of a document to access a specific document in a bucket once the document ID is valid and the document exists. Similar to a relational database, a document ID should be unique.

Thus, in our AeroAir database example, I used the name of the flights as document ID.

DOCU id

Now that we have created documents in the AeroAir data, it is time for us to query our documents.

Couchbase provides the N1QL query language, which is similar to SQL syntax, but it can operate on JSON documents and results are returned in JSON format.

Basic knowledge in MySQL or generally SQL is enough to take you through the basics and probably towards advanced N1QL commands. Like SQL, N1QL consists of data manipulation language and data definition language.

Basically, I will use clauses such as SELECT, WHERE, and FROM to query documents in AeroAir. Like SQL, N1QL offers more than SELECT, FROM, and WHERE. For instance, you can use the UPDATE clause if you want to update a value in your document, or INSERT if you want to include new data.

Once you know SQL syntax off hand, using Couchbase N1QL against JSON documents should not be a problem.

We will just focus on how to SELECT, FROM, and WHERE using the following clauses to query documents in AeroAir using Couchbase N1QL.

SELECT – You can use this clause to select a specific field in a document, including an asterisk symbol after the SELECT clause returns all fields in documents.

FROM  You can use this clause to the data bucket to work with or operate on.

WHERE  You can use this clause indicating conditions that retrieved documents must have or satisfy.

Couchbase provides a Query Workbench to run queries in the web console and a cbq tool if you prefer to use the command line.

Using the Query Workbench

Now that we have some documents in our buckets, let’s try some basic queries on one or two documents.

First and foremost, click on the QUERY tab next to the INDEXES tab or in between DATA BUCKETS and INDEXES. As you can see, there is a bucket analysis which shows the current bucket being managed by the system. Also on the right, there is the result panel showing where the returned result in JSON format is displayed.

queryaeroair

I have certain fields in the documents to make our query runs more interesting. Let’s say we are interested in a flight fixed with internet Wi-Fi. Running query in the web console without creating a primary index on the bucket AeroAir ends up as error.

create index

Now try running the same query in the web console. You can see that this time around the query executed successfully.

flightname

You can see that the results are returned in JSON format. We queried for a flight with internet speed more than 100 kbps. It returned the Boeing FFH flight as the only flight satisfying the query condition. The comparison operator simply means greater than 100 kilobytes per second.

You can specify a particular condition without using comparison operators. Let’s say we want find out which flight has Jack Charisma assigned as the head pilot.

jacvkcharism a

 

Interestingly, although N1QL uses SQL syntax, it can run against JSON documents. This is what makes Couchbase with N1QL interesting and a better alternative to a relational database.

This post is part of the Couchbase Community Writing Program

Author

Posted by Laura Czajkowski, Developer Community Manager, Couchbase

Laura Czajkowski is the Snr. Developer Community Manager at Couchbase overseeing the community. She’s responsible for our monthly developer newsletter.

Leave a reply