Structured Query Language (SQL) was originally designed as an intuitive query language for relational data stores. NoSQL, relatively nascent in comparison, still requires compatibility with SQL as many BI tools and applications understand it. Couchbase supports its own query language, N1QL, which extends SQL for querying JSON documents stored in Couchbase Server. The Simba Couchbase ODBC and JDBC drivers enable users to “have their cake and eat it too” by enabling users to leverage both SQL and N1QL. One of the ways it achieves this is with the Collaborative Query Execution (CQE) feature offered by SimbaEngine X. This article explains how Simba Couchbase drivers use CQE, by leveraging the similarities between SQL and N1QL, to ensure optimal performance.

JOIN Operation

N1QL supports Lookup Joins, which can be roughly translated into analogous JOINs in SQL. Here we will explain how Simba Couchbase drivers leverage the CQE feature to pass down the JOIN operations to Couchbase Server.

Consider the beer-sample data set with beer and brewery documents. Using Simba Couchbase drivers, these documents can be mapped to two different tables – beer and brewery – and users can execute JOIN queries on these two tables.

Say if a user wants to find out all of the beers that are brewed in the state of California, they would issue a SQL statement as follows:

To generate the result for this SQL query you would have to do the following:

  • Retrieve all the beer documents from Couchbase Server to the client
  • Retrieve all the brewer documents from Couchbase Server to the client
  • Execute the join
  • Filter out the results where state is California.

 

This is not very efficient as this can put a lot of load on the client-side if the joining tables are really big. Moreover, JOIN and WHERE operations are supported in N1QL. Transferring the load to the server-side would result in improved performance.
Simba Couchbase drivers, do exactly that. With CQE, Simba Couchbase drivers translate the given SQL into analogous N1QL, which can be passed down to Couchbase Server for execution. The above SQL statement can be translated to N1QL as follows:

 

Underneath the hood, Simba Couchbase drivers use Simba SQL Engine for query parsing, preparation and execution. Prior to query execution, the drivers create an Algebraic Expression Tree (AE-Tree) representation. The SQL statement takes this form just before Simba SQL Engine transforms it into an execution plan and executes it. Prior to execution, the driver can review the plan and see if it can execute any part of it. This is called the pass-down step.

Just before the pass-down step (as described in the AETree optimization article), the AE-Tree looks like this:

 

And after pass-down, the AETree looks like this:

 

As you can see the AE-Tree only contains one node that represents the joined table, which means everything else, including joining and filtering, are passed down to Couchbase Server. Less data sent across the wire from Couchbase Server to driver and fewer operations done client-side = better performance!

UNNEST Operation

N1QL supports UNNEST clause, which conceptually performs a join of a nested array with its parent object.

In Simba Couchbase drivers, any nested array is mapped to a virtual table and users can JOIN the parent table with the virtual table. This again uses the CQE capability of SimbaEngine X and leverages the N1QL UNNEST clause.

For example, all the brewery documents in the beer-sample data set contain nested arrays called brewery_address. In Simba Couchbase drivers these documents are mapped to a brewery table and a brewery_address virtual table. Say if a user wants to get all the brewery names after joining brewery and brewery_address, they would issue a SQL statement as follows:

Simba Couchbase drivers translate the above SQL statement to N1QL as follows:

 

Before pass-down, the AE-Tree looks like this:

 

After pass-down, the AE-Tree looks like this:

As you can see the AE-Tree again only contains one node that represents the joined table, which means everything else is passed down to Couchbase Server for execution. Again, the more operations occurring server-side, and the less data coming back across the wire to the driver, the more performant the query.

These two examples highlight how the Simba Couchbase drivers utilize the full potential of Couchbase Server, by passing down supported N1QL operations down to the server to execute. For any SQL features that are not supported by N1QL, the drivers execute the operations (client-side). In this way, the drivers ensure breadth of support for SQL and N1QL as well as high performance.

Now Is Your Turn

Download a free evaluation of Couchbase ODBC/JDBC drivers and try them with the new Couchbase Server 4.5!

Posted by Nowrin Joyita, Computer Scientist, Simba Technologies

Leave a reply