If you’re writing a web application, you’re probably already familiar with JSON documents. Couchbase supports JSON documents, and sooner or later you will need to import some of them into Couchbase Server.

But just because you inserted data into Couchbase doesn’t mean that it goes directly to disk. Your data is first written to the in-memory object-managed cache and is then persisted to disk asynchronously in the background, completely decoupled from your action.

So what tools does a developer have to get a bunch of JSON data into Couchbase? This post describes the cbdocloader tool in more detail. It saved me a ton of time by letting me import an entire Vancouver tree dataset that I was playing with.

Using cbdocloader

The cbdocloader tool is run from the command line. A typical invocation looks like this:
/opt/couchbase/bin/tools/cbdocloader -u Administrator -p password -n 10.3.2.54:8091 -b bucket_zip -s 10 output

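where
-u is the username
-p is the password
-n is the node IP address and port
-b is the bucket name (if the bucket does not exist, an error will be thrown)
-s is the RAM quota in MB; this parameter is optional (100 MB by default)
The final argument (output in this example) is the directory or .zip file that contains the JSON documents.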
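For scripted or repeated loads, the same command can be assembled programmatically. The following is a minimal Python sketch, assuming the install path and the flags shown in this post. The helper names build_cbdocloader_cmd and run_cbdocloader are hypothetical and are not part of Couchbase.

```python
import subprocess

# Install path as used in this post; adjust for your system.
CBDOCLOADER = "/opt/couchbase/bin/tools/cbdocloader"


def build_cbdocloader_cmd(user, password, node, bucket, source, ram_quota_mb=None):
    """Return the cbdocloader argument list for the flags described in this post."""
    cmd = [CBDOCLOADER, "-u", user, "-p", password, "-n", node, "-b", bucket]
    if ram_quota_mb is not None:
        # -s is optional; leaving it out falls back to the tool's default quota.
        cmd += ["-s", str(ram_quota_mb)]
    # Last argument: the directory or .zip file holding the JSON documents.
    cmd.append(source)
    return cmd


def run_cbdocloader(cmd):
    """Run the command; raises CalledProcessError if the tool exits non-zero."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_cbdocloader_cmd(
        "Administrator", "password", "10.3.2.54:8091", "bucket_zip", "output",
        ram_quota_mb=10,
    )
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if a password or path contains spaces.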

The Vancouver Tree Dataset

The City of Vancouver added a new dataset of street trees to the city’s open data catalog. This dataset includes a full address listing of all boulevard trees on the streets of Vancouver, along with the tree type and other characteristics.

 

Each JSON file in the dataset contains information for all the trees in a particular area. Using a simple Python script, we split each JSON file into multiple files to produce one JSON file per tree. We then loaded the data into Couchbase using the cbdocloader tool.
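The script itself is not shown here, but a split of this kind could be sketched as follows. This is an illustration only, not the script we used: it assumes each source file holds a top-level JSON array of tree objects, and the function name split_trees and the output file naming scheme are hypothetical.

```python
import json
from pathlib import Path


def split_trees(src_file, out_dir):
    """Write one JSON file per tree found in src_file; return the number written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    with open(src_file, encoding="utf-8") as f:
        trees = json.load(f)

    # Assumption: the file is a JSON array with one object per tree.
    if not isinstance(trees, list):
        raise ValueError(f"{src_file}: expected a top-level JSON array")

    prefix = Path(src_file).stem
    for index, tree in enumerate(trees):
        # Name each file after the source file and the tree's position,
        # which keeps names unique across areas.
        target = out / f"{prefix}_{index}.json"
        with open(target, "w", encoding="utf-8") as f:
            json.dump(tree, f)
    return len(trees)
```

Calling split_trees once per area file, with the same output directory each time, produces a single folder that can then be handed to cbdocloader.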

Loading the individual JSON files into Couchbase

The source documents fed into cbdocloader can be in a particular directory or in .zip format.

To load the JSON documents in a folder:
/opt/couchbase/bin/tools/cbdocloader -u Administrator -p password -n 10.3.2.54:8091 -b bucket -s 1000 output

To load a zipped folder that contains JSON documents:
/opt/couchbase/bin/tools/cbdocloader -u Administrator -p password -n 10.3.2.54:8091 -b bucket_zip -s 1000 output.zip
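If the documents are only available as a folder, the archive can be produced with Python's standard library. This is a minimal sketch, and the function name zip_folder is hypothetical. It keeps the folder as the top-level entry inside the archive; whether your cbdocloader version expects a particular internal layout is not covered in this post, so treat that as an assumption to verify.

```python
import shutil
from pathlib import Path


def zip_folder(folder):
    """Zip `folder` next to itself and return the path of the .zip file."""
    folder = Path(folder).resolve()
    # base_name has no extension; make_archive appends ".zip" itself.
    return shutil.make_archive(
        str(folder),             # e.g. /data/output -> /data/output.zip
        "zip",
        root_dir=folder.parent,  # entries are stored relative to this directory
        base_dir=folder.name,    # so every entry starts with "output/"
    )
```

For example, zip_folder("output") writes output.zip alongside the output folder, and that file can be passed to cbdocloader as the final argument.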

Interesting Data Facts

So can you guess how many trees are in the Vancouver Tree dataset?
Clue: it is the item count shown for the bucket once the load completes.

Do you know which Vancouver neighborhood has the tallest tree in the city?

Now that you have loaded the data into Couchbase, try writing a simple view to figure out the answer. We will revisit this question in our view blog series, so stay tuned, folks!


Thanks to Abhinav for putting the screenshots together.

Author

Posted by The Couchbase Team

Jennifer Garcia is a Senior Web Manager at Couchbase Inc. As the website manager, Jennifer has overall responsibility for the website properties including design, implementation, content, and performance.
