In my posts on key-value data modelling with Couchbase, the main concerns were:

In a N1QL world, we’re still thinking about similar things and in this post I’ll look at two of them:

  • * how to represent document types
  • * modelling relationships between documents.

Fundamentally, it’s still a story of making the right trade-offs so that you can create a physical data model that most efficiently represents your logical model. All that’s changed are some implementation details.

Do nothing

First up, it’s worth saying that you can query existing JSON data without having to make any changes. N1QL isn’t going to demand you remodel your entire existing database.

However, you can make changes to your model that will make N1QL more efficient and your queries easier to write. And if you’re starting from scratch, you might as well consider N1QL queryability from the outset.

Document types

One of the most important parts of key-value data modelling is key design. In a key-value world, the key bears an enormous responsibility: it must tell you what you’re storing and it must be easy to find again.

For example, the keys for customer records might take the following format:

  • * cust::name the main customer record
  • * cust::name::payment the customer’s payment details

With rich query, the key name’s role changes. Rather than directly asking for the key, you’ll be asking for the data that you want. Those two responsibilities — describing what you’re storing and making the data easy to find — move into the document itself as a document type.

This is where we hit one of the big practical differences between N1QL and SQL: FROM does less work in N1QL today. The scope of a Couchbase bucket is vastly different from the scope of a typical relational table.

SQL FROM narrows the scope of our query to the data that interests us. N1QL FROM tells the query engine which bucket to look at. As a typical bucket contains all the data for one application, we need another way to distinguish types of document.

A customer

Let’s look further at a customer. Here’s a simple record:

{

  "name": "Alan Partridge",

  "email": "alan@example.com",

  "location": "Norwich",

  "phone": "+44-1603-619-331",

  "docInfo":

  {

    "type": "customer",

    "created": "2015-10-22T10:17:24.731Z",

    "schemaVersion": "1.2.3"

  }

}

The type value steps in to narrow the scope of our N1QL query:

SELECT * FROM default WHERE type = customer;

Using the type we can build indexes that contain only the documents of a particular type. For example:

CREATE INDEX customers ON default WHERE type=customer;

The likelihood is that your document schemas already specify a type, particularly if you’ve been using Couchbase views. Either way, when you create a document you should specify its type within the document body.

In future iterations of Couchbase and N1QL, we might see namespacing below the bucket level but types will always give us a low-cost way of differentiating documents.

Relationships between documents

The question of when to embed and when to refer remains an important one when modelling for N1QL. The difference is that N1QL handles these relationships for you through JOINs.

N1QL joins can take place between:

  • * documents in different buckets
  • * documents in the same bucket

In both cases, the JOIN matches a value in the document on the left side with the key name of a document on the right side. So, if one document contains the key name of another document, you can JOIN those two documents.

Let’s add our fictional customer’s most recent purchase to the profile:

{ lastPurchase: "prod::drivinggloves" }

The value here is also the key name of a product document elsewhere in our bucket.

So, to return the name of the customer and the price of their last purchase for every customer living in Norwuch, we’d use the following N1QL:

SELECT r.name, a.price FROM default r JOIN default a ON KEYS r.lastPurchase WHERE r.location = "Norwich";

The result would look something like:

{

  "requestID": "a2284985-541f-491d-b921-4c248f154293",

  "signature":

  {

    "name": "json",

    "price": "json" },

   "results":

    [

      { "name": "Alan Partridge",

      "price": "9.99"

      }

    ],

    "status": "success",

    "metrics":

    {

      "elapsedTime": "5.223111ms",

      "executionTime": "5.124029ms",

      "resultCount": 1,

      "resultSize": 77

    }

}

With N1QL, we can produce one to one, one to many and many to many relationships between documents. So while embedded documents were previously a large part of how we’d model our data for Couchbase, with N1QL they become effortless for us as developers.

In summary

The two main points to take from this post are:

  • your documents should have a type that your queries can use to narrow scope
  • JOINs rely on the primary key of the right-side document appearing in the body of the left-side document.

Ultimately, not much needs to change about your JSON data in order for you to use N1QL. As you dive more into N1QL, you’ll no doubt find document shapes that work better than others. If so, share them with us on the Couchbase forums!

Author

Posted by Matthew Revell, Lead Developer Advocate, EMEA, Couchbase

Matthew Revell is a Lead Dev Advocate, EMEA Couchbase. He developed a global strategy for putting Couchbase front in the minds of the product's developers.

Leave a reply