Couchbase works well as a source for Apache Kafka through the DCP connector.
It works just as well as a destination for ingested data, because it is fast, memory-first, and stores data reliably.

In this blog post I will show you how to build a simple Java application with a producer and a consumer that saves the messages published to Kafka into Couchbase.

I assume that you already have a Kafka cluster, even if it is only a single-node cluster. If not, follow that installation guide first.

The environment for this post has four parts:

  1. Kafka producer
  2. Apache Kafka queue
  3. Kafka consumer
  4. Couchbase server

Producer

We need the producer to submit messages to our queue.

The queue ingests those messages, and every application subscribed to the topic can read them.

The source of our messages will be a dummy JSON file I created using Mockaroo, which we will split into individual messages and send to the queue.
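To make the splitting step concrete, here is a minimal sketch that cuts the text of a JSON array into one string per top-level object, using only the Java standard library. The class name `JsonArraySplitter` and the sample input are my own assumptions for illustration, and a JSON library would be a perfectly good alternative in a real application.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits the text of a JSON array of objects into one string per top-level object. */
final class JsonArraySplitter {

    static List<String> split(String jsonArray) {
        List<String> objects = new ArrayList<>();
        int depth = 0;             // current nesting depth of { } outside strings
        int start = -1;            // index where the current top-level object began
        boolean inString = false;  // true while scanning inside a JSON string
        boolean escaped = false;   // true right after a backslash inside a string

        for (int i = 0; i < jsonArray.length(); i++) {
            char c = jsonArray.charAt(i);
            if (inString) {
                // Braces inside strings must not affect the depth counter.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0 && start >= 0) {
                    objects.add(jsonArray.substring(start, i + 1));
                    start = -1;
                }
            }
        }
        return objects;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not the Mockaroo file used in this post.
        String sample = "[{\"id\":1,\"name\":\"a}b\"},{\"id\":2,\"tags\":{\"x\":[1,2]}}]";
        for (String message : split(sample)) {
            System.out.println(message);
        }
    }
}
```

Each returned string is a complete JSON object, so it can be sent to the queue as one message.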

Our sample JSON data looks something like this:

The producer code:
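As a rough illustration of the shape such a producer takes (a sketch under stated assumptions, not the exact listing from the GitHub repository), the code below builds typical producer settings and publishes a list of messages. `MessageSender` is a hypothetical stand-in for the Kafka client so that the sketch runs without a broker, and the broker address `localhost:9092` and the topic name `example-topic` are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Sketch of a producer loop; MessageSender is a hypothetical stand-in for the Kafka client. */
final class ProducerSketch {

    /** Hypothetical abstraction over "publish one message to a topic". */
    interface MessageSender {
        void send(String topic, String key, String value);
    }

    /** Settings a Kafka producer typically needs for String keys and values. */
    static Properties producerConfig(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all"); // wait until all in-sync replicas acknowledge a write
        return props;
    }

    /** Publishes every message, keyed by its position in the list, and returns the count. */
    static int publishAll(List<String> messages, String topic, MessageSender sender) {
        int sent = 0;
        for (String message : messages) {
            sender.send(topic, Integer.toString(sent), message);
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) {
        Properties props = producerConfig("localhost:9092"); // assumed broker address
        System.out.println("Producer settings: " + props);

        // With kafka-clients on the classpath, the sender would wrap a real producer:
        //   KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        //   MessageSender sender = (t, k, v) -> producer.send(new ProducerRecord<>(t, k, v));
        // Here an in-memory list stands in for the topic so the sketch runs on its own.
        List<String> topicLog = new ArrayList<>();
        MessageSender sender = (t, k, v) -> topicLog.add(t + " [" + k + "] " + v);

        int sent = publishAll(Arrays.asList("{\"id\":1}", "{\"id\":2}"), "example-topic", sender);
        System.out.println("Sent " + sent + " messages");
        topicLog.forEach(System.out::println);
    }
}
```

Keeping the send call behind a small interface is only a convenience for this sketch; a real producer can call the Kafka client directly.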

Output from the Kafka producer app

Consumer

This one is simple and straightforward: read the messages from the queue and use the Couchbase Java SDK to insert them into Couchbase as documents. For simplicity, I'll use the synchronous Java SDK API, but using the asynchronous API is entirely possible and even recommended.
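The consume-and-store step can be sketched as follows. This is a minimal illustration and not the code from the repository: `Message` and `DocumentStore` are hypothetical stand-ins for a consumed Kafka record and a Couchbase bucket, and deriving the document ID from topic, partition, and offset is my own assumption about a reasonable key scheme.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of the consumer side: turn each consumed message into one upserted document. */
final class ConsumerSketch {

    /** Hypothetical stand-in for a consumed Kafka record. */
    static final class Message {
        final String topic;
        final int partition;
        final long offset;
        final String value;

        Message(String topic, int partition, long offset, String value) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for a bucket that supports upsert (insert or overwrite). */
    interface DocumentStore {
        void upsert(String id, String json);
    }

    /** Assumed key scheme: a message's position in the topic identifies its document. */
    static String documentId(Message m) {
        return m.topic + "::" + m.partition + "::" + m.offset;
    }

    /** Upserts every message in the batch and returns how many upserts were issued. */
    static int store(List<Message> batch, DocumentStore sink) {
        int written = 0;
        for (Message m : batch) {
            sink.upsert(documentId(m), m.value);
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        // In a real consumer, the batch would come from KafkaConsumer.poll(...), and the
        // sink would call the Couchbase SDK, for example with the 2.x Java SDK:
        //   bucket.upsert(JsonDocument.create(id, JsonObject.fromJson(json)));
        // Here an in-memory map stands in for the bucket.
        Map<String, String> bucket = new LinkedHashMap<>();
        Message first = new Message("example-topic", 0, 0L, "{\"id\":1}");
        Message second = new Message("example-topic", 0, 1L, "{\"id\":2}");

        // The first message is delivered twice to mimic a redelivery.
        int written = store(Arrays.asList(first, second, first), bucket::put);
        System.out.println(written + " upserts, " + bucket.size() + " documents");
        bucket.forEach((id, json) -> System.out.println(id + " -> " + json));
    }
}
```

With this key scheme, a redelivered message overwrites its own document instead of creating a duplicate, which is why the sketch uses upsert and not a plain insert.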

Output from the Consumer app

Our Kafka consumer console output

Couchbase Server

Now we can look at the result in Couchbase Server.

Look at the kafkaExample bucket: it is filled with 1,000 documents.

Couchbase buckets

Each document looks something like this:

Sample document

A simple three-part solution.

Note that in a production environment, the producer, the consumer, Kafka, and Couchbase will each run on one or more machines.

The full code, including Maven dependencies, is on GitHub.

Roi.

Author

Posted by Roi Katz, Solution Architect, Couchbase

Roi is a Couchbase Solution Architect, software developer and architect with over 10 years of broad industry experience. He has been a trainer and author of courses with a specialization in Big Data Systems, NoSQL Databases, Couchbase, Distributed Architecture and Cloud Computing.
