Couchbase Collections available with the upcoming version 7 brings Couchbase JSON database closer to that of other RDBMS. Collections will make it easier for customers who are looking to migrate their current RDBMS data model to Couchbase document model. This allows customers to benefit from Couchbase distributed shared-nothing database architecture, bringing high availability, and horizontal scalability. In addition to the flexibility of the data model, Couchbase N1QL is the only NoSQL database that provides N1QL, and extended SQL language, for querying its database.

What is a SCOPE

A bucket scope is equivalent to a schema in RDBMS. It is a logical container for Couchbase collections. Every Couchbase bucket contains a default scope. You can use these default containers directly, or define your own.

What is a COLLECTION

A collection is analogous to a table in RDBMS. Every scope has a default collection. While a collection can be used to store similar types of records, similar to RDBMS tables, there is no schema restriction on what you can store in a collection.

System keyspaces

KeyspacesDescription
system:all_scopesA list of all scopes, including system objects, such as the default scopes
system:scopesAll available scopes except the system scopes
system:all_collectionsA list of all collections, including system objects, such as the default collections
system:collectionsAll available collections except system collections

Key benefits for N1QL Collections

From the N1QL language perspective, both the Data Definition Language(DDL) and Data Manipulation Language (DML) will be simpler as you no longer need a type field on every document.

Query will be simpler and more intuitive

Bucket modelCollection model

 

 

The collection model provides you with the equivalent of the relational table, bridging the similarity between the logical and the physical data model, that RDBMS users have accustomed to.

Simpler N1QL Join condition

Bucket modelCollection model

 

 

The need to use the type field to restrict to a specific table of documents with the bucket model for the JOIN syntax is not longer required in the collection model.

Backward compatibility with Couchbase bucket model

Bucket modelCollection model

 

 

The bucket model is still supported as before. However, the bucket data is actually stored in the default scope and collection. Please note that the namespace prefix is required, and should have the value default: for all references to the scope or collection.

N1QL Query with Collections

With the Collection model, the query engine needs to be aware of the full path of the collection name. This is because a collection name does not need to be unique in a bucket, but only in its own scope. A fully qualified collection name has the following format.

However, you could reference a collection with its relative path by setting the query_context

Query Workbench UI allows you to set the query context by selecting the bucket and scope from the drop-down box.

The query context is also supported in SDKs, REST, and cbq.

Do you need to migrate to the Collections?

You do not need to migrate if you don’t want to.

  • Data – All existing data remains in the same bucket. You can reference your documents using the bucket query syntax, or using the new default scope and collection.
  • Queries – N1QL query syntax for the DDL and DML continue to support the bucket model.
  • Indexes – The existing indexes will remain at the bucket level, and will continue to be available to all of your queries as before.

Migrating from Bucket to Collections

For this section, I will use the travel-sample bucket as an example of to how to migrate from the bucket model to the collection model.

Data Migration

If your documents already have a field to identify their groups, then use that to create the collection. For the travel-sample, we can use the type field as the collection name. In addition, I will also create an inventory scope for all the travel samples collectioms

In the above example, we have created an ‘inventory’ scope and added new collections in the same travel-sample bucket.

Document Key

Document key needs to be unique. The document key that you currently have for your bucket should already be unique because they all exist in the same bucket. For that reason, the same document key should be adequate for use as a collection document key.

Copy the data

Here I use INSERT SELECT to copy the data from the bucket into the individual collection. I also use the existing META().id bucket key for the collection key.

Index conversion

It is highly likely that you will need to modify your existing bucket indexes for them to be effective for the collection model. This section lists out the common patterns for bucket indexes, and recommends the steps to convert them to a collection based index.

Bucket Index with type predicate – i.e. partial index

For a partial index with a specific type field filter, you can simply re-create the new index on the specific collection for the type.

Bucket ModelCollection Model

 

 

Bucket Index without type predicate

A Global Secondary Index (GSI) index can be created for fields that may or may not exist in the document. For instance, an index may include the ‘icao’ field, but not every document may have the ‘icao’ field. For such an index, the indexer will only include the document that has the field. For this kind of index, you may need to be more specific, and create such an index for the collection where the field exists.

Bucket ModelCollection Model

 

 

Bucket index for common field

A bucket index can also be created without specifying the specific type, if the field exists in multiple document types. Consider the example where the field city exists in multiple document types, i.e. airport, landmark and hotel. But the bucket model only has a single def_city index, which can cover all three document types. For this index, you will now need to create a separate index for each collection.

Bucket ModelCollection Model

 

 

Query conversion

Since the underlying data model has changed from a shared bucket to an individual collection based. The queries that you have will also need to be modified. Furthermore, after you have modified the query, you will also need to ensure that the queries are using the new collection based indexes.

Bucket ModelCollection Model

 


 

Documentation

Author

Posted by Binh Le

Binh Le is a Principal Product Manager for Couchbase Query service. Prior to Couchbase, he worked at Oracle and led the product management team for Sales Clould Analytics and CRM OnDemand. Binh holds a Bachelor's Degree in Computer Science from the University of Brighton, UK.

Leave a reply