XML data can be converted to JSON, which can be loaded into Couchbase Server (Couchbase Server 5.0 beta now available). Depending on the source of the data, you might be able to use a tool like Talend. But you may also want to write a simple C# .NET application with Newtonsoft’s Json.NET to do it.

XML data

For the purposes of this tutorial, I’m going to use a very simple XML example. If your XML is more complex (multiple attributes, for instance), then your approach will also have to be more complex. (Json.NET can handle all XML to Json conversions, but it follows a specific set of conversion rules). Here’s a sample piece of data:

Notice that I’ve got this XML as a hardcoded string in C#. In a real-life situation, you would likely be pulling XML from a database, a REST API, XML files, etc.

Once you have the raw XML, you can create an XmlDocument object (XmlDocument lives in the System.Xml namespace).

Conversion with Json.NET

Once you have an XmlDocument object, you can use Json.NET to convert that object into a Json representation.

In this example, I’m asking Json.NET to serialize an XML node:

  • I used Formatting.None. If I wanted to display the actual Json, it might be better to use Formatting.Indented

  • The last true specifies that I want to omit the root object. In the XML above, you can think of <Invoice></Invoice> as the root object. I just want the values of the Invoice object. If I didn’t omit the root node, the resultant Json would look like: {"Invoice":{"Timestamp":"1/1/2017 00:01","CustNumber":"12345","AcctNumber":"54321"}}

Saving the Json result

Finally, let’s put the Json into Couchbase. The easiest way to do this would be to again call on JsonConvert to deserialize the Json into a C# object. That object would then be used with Couchbase’s bucket.Insert(…​) method.

With this method, the Json would be stored in Couchbase like so:

XML serialized to object

That might be fine, but often times you’re going to want more control of the format. With Json.NET, we can serialize to a given class, instead of just object. Let’s create an Invoice class like so:

Notice that there is some type information now. The Timestamp is a DateTime and the AcctNumber is an int. The conversion will still work, but the result will be different, according to Json.NET’s conversion rules. (Also check out the full Json.NET documentation if you aren’t familiar with it already).

The result of that insert will look like:

XML serialized to new class object

  • Notice that the timestamp field is different: it’s stored in a more standardized way.

  • The acctNumber field value is not in quotes, indicating that it’s being stored as a number.

  • Finally, notice that the field names are different. This is due to the way Json.NET names Json fields by default. You can specify different names by using the JsonProperty attribute.

That’s it

One more minor thing to point out: I used Guid.NewGuid().ToString() to create arbitrary keys for the documents. If you have value(s) in the XML data that you want to use for a key, you could/should use those value(s) instead.

This blog post was inspired by an email conversation with a Couchbase user. If you have any suggestions on tools, tips, or tricks to make this process easier, please let me know. Or, contact me if there’s something you’d like to see me blog about! You can email me or contact me @mgroves on Twitter.

Posted by Matthew Groves, Developer Advocate

Matthew is a Developer Advocate for Couchbase, and lives in the Central Ohio area. He has experience as a web developer as a consultant, in-house developer, and product developer. He has been a regular speaker at conferences and user groups all over the United States, and he has written AOP in .NET for Manning Books. He has experience in C# and .NET, but also with other web-related tools and technologies like JavaScript and PHP. You can find him on Twitter at @mgroves.

Leave a reply