Announcing the New Couchbase Spark Connector

Couchbase Spark Connector 1.2.0 is now available. It includes significant new features and enhancements, as well as support for Apache Spark 1.6.

On behalf of our outstanding product development team, I’m very pleased to announce release 1.2.0 of the Couchbase Spark Connector. As usual, the timing is no coincidence: Spark Summit 2016 takes place next week in San Francisco. We are sponsoring the event, so you can find me and my colleagues at the Couchbase booth (B7). We’ll also be giving a talk at 4:15 PM on Wednesday, June 8th, on “Augmenting the Operational Database with Spark”. Come by to hear about the latest developments with Couchbase and Spark.

Release Details

This release brings us up to date with Apache Spark release 1.6, and delivers several new features and enhancements. Specifically, the Couchbase Spark Connector release 1.2.0:

  • Is compatible with Apache Spark 1.6,
  • Adds Java and Scala support for Couchbase view, spatial view and N1QL query APIs on RDDs,
  • Adds support for the new Subdocument lookup API on RDDs and SparkContext (requires Couchbase Server 4.5),
  • Allows the Java API to use Spark SQL directly,
  • Extends the existing Spark SQL filter support to cover all Spark Filters, including nested filter expressions,
  • Adds enhancements to the Spark Streaming support (still experimental at this stage),
  • Includes several other bug fixes and enhancements.

Please see the Couchbase Spark Connector 1.2.0 Release Notes for the full list of features in this release, and the Couchbase Spark Connector documentation for examples and Getting Started information. If you haven’t already seen it, Michael’s blog post also gives an excellent whirlwind tour of the connector.

The Connector is distributed from Maven Central (as well as spark-packages.org), so if you want to experiment with it using the spark-shell, this is all you need to get up and running:

> $SPARK_HOME/bin/spark-shell --packages com.couchbase.client:spark-connector_2.10:1.2.0
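
If you are building a standalone application rather than working in the spark-shell, the same Maven coordinates can be declared in your build file. Here is a minimal sbt sketch, assuming a Scala 2.10 build. The 2.10.6 patch version is an assumption, chosen only to match the _2.10 suffix of the artifact above, and your project's Spark dependencies are not shown.

```scala
// build.sbt (sketch): pulls the connector from Maven Central.
// Assumption: any Scala 2.10.x release should work; 2.10.6 is only illustrative.
scalaVersion := "2.10.6"

// %% appends the Scala binary version, so this resolves to
// com.couchbase.client:spark-connector_2.10:1.2.0
libraryDependencies += "com.couchbase.client" %% "spark-connector" % "1.2.0"
```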

The Road Ahead

It’s been very exciting to talk with customers and partners about how they plan to use Spark and Couchbase together. Their feedback and real-world projects will help guide future features in the connector. If you haven’t already tried the connector, please do, and share your feedback and thoughts on where we should take it next. For our part, we’re looking forward to the new functionality in Spark 2.x (see the pre-release blog post from Databricks), as well as to providing additional enhancements to the Spark Streaming interface.

Happy coding, no bugs and fast in-memory operations!

Author

Posted by David Segleau, Director of Technical Product Marketing, Couchbase

David Segleau is the Director of Technical Product Marketing for Couchbase. Most recently he has been Director of Product Management for the Oracle NoSQL Database, Berkeley DB and Database Mobile Server and VP of Engineering for Sleepycat Software. He has over 30 years of industry experience, including management positions at Britton-Lee, ShareBase, Teradata, Illustra, Informix, ANTs Software, Sleepycat, and Oracle. David has spent his career working with innovative technologies, merging customer and business requirements with product development expertise to solve real-world problems.
