Couchbase Engagement Database Platform

Woo-Hoo!!! Have you heard of Engagement Databases? I am sure, “No”, not until Couchbase announced it with 5.0 release. In today’s digital economy, success of a business is dependent on its ability to leverage successful digital transformation to provide superior customer engagement & experience. DNA of all this lies in data, and that calls for a powerful database platform which can tackle the data problems in Systems of Engagement at scale, at ease, and at the roots. These problems are pretty unique and hence the solutions should be too. Superior understanding of the problems in this space and elegant & efficient solutions to those problems led to the discovery of a new category of databases, we call Engagement Databases. Such a database is complementary to the Transactional and Analytical databases you might have in your data fabric infrastructure. Following picture shows how they fit together.

N1QLis the SQL extended query language designed to enable your applications leverage the power of Engagement Databases & JSON. N1QL has all sophistication and bells & whistles necessary to deal with flexible schemas and the hierarchically structured JSON data. You can write simple declarative queries using N1QL to naturally deal with such data.

Couchbase 5.0 brings a number of rich query functionality and performance optimizations in N1QL to meet the demands of customer engagement focussed applications. Many of the functionality and performance improvements also include corresponding enhancements in GSI Indexes to help N1QL fully leverage the optimizations. See Couchbase Server 5.0 what’s new for complete list and details.

Adaptive Indexes

Adaptive Indexes are special type of indexes in Couchbase that can index all or a set of specified fields of a document in extremely adaptable way. Unlike composite GSI indexes, such an index is generic in nature. Let me explain.. Adaptive Indexes can efficiently lookup any of the index-key values without restrictions such as prefix index-key ordering, in typical composite indexes. This enables efficient ad-hoc search queries that can use WHERE clause predicates on arbitrary fields without requiring to create multiple composite indexes or different combinations of index keys.

For example, consider a user profile use case. A person’s profile may need to be searched based on any of the personal attributes such as first name, last name, age, city, address, job, title, company, foaf, etc. Similarly, a hotel room availability may be searched based on wide criteria, such as room facilities, amenities, price, location, distance from specific points of interest etc. In such scenarios, traditional secondary composite indexes can’t be used effectively (see section Contrast with Composite Indexes to understand the issues). Adaptive indexes solve this problem by creating one generic index that can run such ad hoc search queries effectively and efficiently.

For example, following figure(1) depicts how three different indexes I1, I2, I3 are needed for the 3 queries Q1, Q2, Q3. Further, a query on any other non-indexed fields is not possible without creating respective indexes. Moreover, a query on just age, such as Q4, will not be able to use the index I2, as it needs the prefix index-key id aswell. Figure(2) shows how a single adaptive index can help queries on all fields irrespective of the index-key order.

Note that, adaptive indexes is not a panacea. These figures may give an impression that adaptive indexes are better than simple/composite indexes. It is not entirely true. If you know queries (and index-keys) apriori, then it is always better to use normal secondary indexes. Because, they would be faster than adaptive indexes. Remember that, adaptive indexes can be much bigger in size as they index multiple/all index keys and corresponding values. So, they may not perform as fast as simple secondary indexes. However, adaptive indexes solve the problems that secondary indexes can’t with little overhead.

Life is all about trade-offs!! For more details and examples, refer to the Adaptive Index documentation.

Powerful Subquery expressions

N1QL is a powerful query language. It is SQL for JSON, and it has all necessary language extensions and expressions to deal with many subtleties in the JSON data modeling, such as:

Flexible schema
Missing fields and data values.
Multi values array values
Nested data where a field of a document can be another fully formed JSON object

These kind of things don’t exist in relational databases, and hence SQL din’t have to deal with it. However, by design, N1QL is required to address all nuances in working with JSON data. In fact, these are the kind of complex data characteristics and flexibility, and scalability, and simplicity, that has gravitated modern day applications to an Engagement Database Platform like Couchbase.

One such functionality to elegantly simplify working with array data is provided by N1QL’s support for Subquery expressions on Arrays. This allows you to treat an array as a keyspace of documents, and enable you to use all power of N1QL on the array treating the array elements as documents of the keyspace. You may not realize, but that’s much more powerful than how it reads ;-)

For example, consider following customer order data:

{
    "type": "order",
    "customerId": "Prasad_4444",
    "order_number": "12345",
    "products": [
        {
            "productId": "prod1",
            "quantity": 1,
            "price": 39.99
        },
        {
            "productId": "prod2",
            "quantity": 2,
            "price": 49.99
        }
    ]
}

{

"type": "order",

"customerId": "Prasad_4444",

"order_number": "12345",

"products": [

{

"productId": "prod1",

"quantity": 1,

"price": 39.99

{

"productId": "prod2",

"quantity": 2,

"price": 49.99

}

]

}

Here, the array field products[] contains the order details about all the products ordered. Now, let’s try to write a query to find the total value of the order. The query should run through each element of the products[] array, and compute the sum of quantity * price. Without using this feature (for ex: when using Couchbase 4.5), the query would look like:

SELECT d1.customerId, 
       ((SELECT RAW SUM(d2u.price * d2u.quantity)
         FROM default d2 
         USE KEYS meta(d1).id 
         UNNEST d2.products d2u
         WHERE d2.customerId = d1.customerId
         GROUP BY meta(d2).id))[0] AS total
FROM default d1 ;

SELECT d1.customerId,

((SELECT RAW SUM(d2u.price * d2u.quantity)

FROM default d2

USE KEYS meta(d1).id

UNNEST d2.products d2u

WHERE d2.customerId = d1.customerId

GROUP BY meta(d2).id))[0] AS total

FROM default d1 ;

This query itself is made possible by the wonderful UNNEST clause in N1QL’s SELECT statement. However, same query can be rewritten as below, using the Subquery Expression on Arrays:

MySQL
SELECT customerId,
       (SELECT RAW SUM(p.price * p.quantity)
        FROM d.products AS p)[0] AS total
FROM default AS d;

				
					
				1
2
3
4

						SELECT customerId,
       (SELECT RAW SUM(p.price * p.quantity)
        FROM d.products AS p)[0] AS total
FROM default AS d;

					

			

Isn’t it ridiculously simple and intuitive!! Infact, it would perform better too. That’s how N1QL empowers application developers, and an Engagement Database enables customers tackle their data challenges in Systems of Engagement applications. There is a lot more to talk about this wonderful feature.. See documentation and this blog.

Btw, imagine modeling this simple data with an RDBMS and write equivalent query in SQL, or another NoSql database. I am sure that can be a fun problem to solve if you wanna lose your weekend!!

External Data Access & Federated Queries

One of the data explosion problems is that customer engaging applications may need to find & integrate with relevant data from multiple internal & external sources. It is more and more important for data-centric applications to stitch their data with external data sources provided by:

Data service providers such as Google geocoding API, or Yahoo API etc.
Data providers such as www.data.gov
Micro services such as Couchbase Full Text Search, other Couchbase Clusters, or any other data sources that can provide JSON data.

Couchbase Server 5.0 adds a new N1QL function CURL(), which enables N1QL queries to interact and integrate with external JSON data sources available over HTTP/REST. This allows federated N1QL queries against external data sources. For example:

You can access Google geocoding API to get more location details about a given location stored in your database.

MySQL
SELECT CURL("https://maps.googleapis.com/maps/api/geocode/json",
            {
                 "data":["address=santa+cruz","components=country:ES"],
                 "get":true
            });

				
					
				1
2
3
4
5

						SELECT CURL("https://maps.googleapis.com/maps/api/geocode/json",
            {
                 "data":["address=santa+cruz","components=country:ES"],
                 "get":true
            });

					

			

You can use Yahoo finance data API to fetch more details about a given stock symbol stored in your database.

MySQL
SELECT meta().id, to_number(hdp_low) AS hdp_low 
FROM bucket_name
LET hdp_low = curl("https://query.yahooapis.com/v1/public/yql", 
                   {"data":"q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22HDP%22)&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback="
                   }).query.results.quote.DaysLow
WHERE to_number(hdp_low) < $min_threshold;

				
					
				1
2
3
4
5
6

						SELECT meta().id, to_number(hdp_low) AS hdp_low 
FROM bucket_name
LET hdp_low = curl("https://query.yahooapis.com/v1/public/yql", 
                   {"data":"q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22HDP%22)&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback="
                   }).query.results.quote.DaysLow
WHERE to_number(hdp_low) < $min_threshold;

					

			

Similarly, you can use FTS indexes in your N1QL query for full-text-searching, or access data from other Couchbase Clusters or other JSON data sources in your N1QL queries (using all the power of N1QL expressions on the external data). For more information, see CURL Function.

RBAC for N1QL Statements

Role Based Access Control (RBAC) is vital for security administration in enterprise deployments. Couchbase Server 5.0 brings RBAC for applications whereby users can be added to the system with various predefined roles. Query service adds support for RBAC with following major features:

Statement level roles are added for executing various N1QL statements such as SELECT, UPDATE, INSERT, and to access system keyspaces.
New N1QL statements GRANT and REVOKE are added aligned with the SQL standard.
New system keyspaces user_info, my_user_info, and applicable_roles are added to access various role assignments details.

Find more information here.

Monitoring and Profiling N1QL Queries

This feature enables fine grain and detailed query monitoring and profiling. This provides detailed information about various query phases, query operators involved in the processing of the query, and corresponding execution timings and characteristics (such as intermediate results). All monitoring information is also captured in various system keyspaces such as system:active_requests and system:completed_requests. This helps in diagnosing troubled queries and performance issues. Following picture shows the visual plan and query profile details in the query work bench.

New query parameters are provided to enable, disable, and control the monitoring capabilities, and can be configured for a query engine or individual queries. See Monitoring N1QL Queries for details.

N1QL Performance Improvements

In Couchbase Server 5.0, N1QL and the Global Secondary Indexes together enable many performance optimizations.

Complex Predicate Pushdown: Supports efficient and exact pushdown of complex AND/OR predicates to index. This helps in:
- avoiding spurious results from Indexer in certain queries where predicates have range scans on prefixing index-keys.
- efficiently using multiple indexes for different conditions of an OR predicate etc.
Index Projection: In the version 5.0, N1QL adds support to request only the required information from Index, and thus avoid any unnecessary processing.
Support is added to create indexes with ASC/DESC specified for individual index keys, and leverage index-order for queries with aligned ORDER BY clause.
Operators such as OFFSET, MAX(), MIN(), DISTINCT, COUNT (DISTINCT field) are pushed down to indexer.

For more details, see Index Pushdown Optimizations.

Summary

I am sure this is pretty exciting and overwhelming. We are super excited to share our vision on Engagement Databases and innovate N1QL to help you gear up!! In this article we talked about many rich N1QL feature and performance enhancements in the Couchbase Server 5.0 Engagement Database Platform. Drop a line if you have any comments or questions. Cheers!!

Prasad Varakur, Principal Product Manager, Couchbase

Products

See How Capella Stacks Up

See How Capella Stacks Up

By Industry

By Need

Why NoSQL

What is NoSQL and why choose it?

Popular Docs

By Developer Role

Developer Playground

Start a Capella session

Resource Center

Education

Certification Exams 2023

Get Couchbase certified

About

Partnerships

Our Services

Partners: Register a Deal

Ready to register a deal with Couchbase?

Marriott

Gear up with N1QL to the era of Engagement Databases