Thursday, May 19, 2011

MongoDB, CouchDB and OrientDB - a quick comparison

There are many NoSQL databases, each with its own specific purpose and characteristics. This week, my colleagues and I have been looking for a database that can store a set of simple JSON documents without a lot of hassle and overhead.

Although Cassandra and Hadoop are fascinating products with lots of potential, they seemed to be designed to solve different problems. The long list quickly narrowed down to MongoDB, CouchDB and OrientDB. This was more due to time constraints than a lack of choice. So that leaves many products in the NoSQL world for follow-up posts!

The quick comparison consisted of three questions:
  1. How easy is it to set up the server and connect to it using a Java API?
  2. How fast can you store 9300 records in JSON format?
  3. How fast can you read 9300 records in JSON format?
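For readers who want to repeat questions 2 and 3, the measurement can be sketched roughly as follows. This is not the code we actually ran. The `JsonStore` interface, the `InMemoryStore` stand-in and the record layout are all hypothetical, chosen so the sketch runs without any server. For a real comparison you would write one small adapter per database client.

```java
import java.util.ArrayList;
import java.util.List;

public class QuickBench {
    static final int RECORDS = 9300;

    /** Hypothetical minimal abstraction; each database client would get its own adapter. */
    interface JsonStore {
        void write(String json);
        List<String> readAll();
    }

    /** In-memory stand-in (an assumption, not a real database) so the sketch is runnable. */
    static class InMemoryStore implements JsonStore {
        private final List<String> docs = new ArrayList<>();
        public void write(String json) { docs.add(json); }
        public List<String> readAll() { return new ArrayList<>(docs); }
    }

    /** Runs the task once and returns the wall-clock time in milliseconds. */
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Writes 'count' small JSON documents, one call per record. */
    static void writeAll(JsonStore store, int count) {
        for (int i = 0; i < count; i++) {
            store.write("{\"id\": " + i + ", \"name\": \"record-" + i + "\"}");
        }
    }

    public static void main(String[] args) {
        JsonStore store = new InMemoryStore();
        long writeMs = elapsedMillis(() -> writeAll(store, RECORDS));
        long readMs = elapsedMillis(() -> store.readAll());
        System.out.println("Writing " + RECORDS + " records: " + writeMs + "ms");
        System.out.println("Reading " + RECORDS + " records: " + readMs + "ms");
    }
}
```

A single untimed warm-up run before measuring would make the numbers less sensitive to JVM start-up effects, but the sketch keeps to the single-pass spirit of a quick scan.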

Please note that this is not a representative test. It is just a quick scan of first results, unhindered by background knowledge and without the optimizations an expert would apply. This means that the end result only says something about the quality out of the box, not about the ultimate limitations of each product. It must also be mentioned that all tests were performed on 32-bit Windows XP. Perhaps this is not the most suitable environment for NoSQL.


These are our findings:

OrientDB

OrientDB is relatively simple to install. The client API consists of two JARs - less than 850KB in total. The only hurdle in setting up a connection is determining what the connection string should be. It turns out that you need to explicitly create a database before you can connect to it. That is not unreasonable of course.

Writing 9300 records: ± 2,640ms (2nd place in this post)
Reading 9300 records: ± 5,157ms (2nd place in this post)

CouchDB

Installing CouchDB is painful from the start. After downloading, you need to build the server yourself. Or you can download an "unstable" installation package (just quoting the website). Setting up the client required the installation of nine JARs, 1.5MB in total. The CouchDB website suggests using HttpClient 4.0, which results in a ClassNotFoundException. After downgrading to HttpClient 3.1, the client starts working.

Writing 9300 records: ± 45,340ms (3rd place in this post)
Reading 9300 records: ± 196,985ms (3rd place in this post)

MongoDB

MongoDB is one of the most pleasant servers I have ever installed. The only hurdle during startup is that you need to create a /data/db directory on your hard drive. After that, everything is automated for you. The daemon starts up in the blink of an eye. Databases and collections are automatically created when necessary. The client API consists of a single JAR of 240KB and provides a clean and simple API.

Writing 9300 records: ± 1,406ms (1st place in this post)
Reading 9300 records: ± 2,140ms (1st place in this post)

Conclusion

In terms of ease of use and response times in an out-of-the-box, unconfigured, unoptimized installation, MongoDB clearly wins. CouchDB comes last on every point we looked at.

OrientDB should not be disregarded too quickly though. It offers more interfaces and functionality out of the box than MongoDB. For example, for a user-friendly REST interface, MongoDB relies on third-party add-ons while OrientDB offers a nice one out of the box. Another fundamental difference is that OrientDB also offers GraphDB functionality in addition to the document storage. Depending on the requirements, that may justify accepting the performance penalty, which is significant but not dramatic.
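To put that penalty into numbers, the timings reported above can be turned into throughput and slowdown factors. The small sketch below does only that arithmetic. The class and method names are made up for illustration, and the input values are simply the timings from this post. With these numbers, OrientDB comes out at roughly 1.9 times MongoDB's write time and roughly 2.4 times its read time.

```java
public class Throughput {
    /** Records handled per second, given a total time in milliseconds. */
    static double recordsPerSecond(int records, long millis) {
        return records * 1000.0 / millis;
    }

    /** How many times slower 'millis' is compared to the fastest result. */
    static double slowdown(long millis, long baselineMillis) {
        return (double) millis / baselineMillis;
    }

    public static void main(String[] args) {
        int records = 9300;
        // Timings as reported in this post, in milliseconds.
        String[] names = {"MongoDB", "OrientDB", "CouchDB"};
        long[] writeMs = {1406, 2640, 45340};
        long[] readMs = {2140, 5157, 196985};

        for (int i = 0; i < names.length; i++) {
            System.out.printf(
                "%-9s write: %7.0f rec/s (%.1fx)  read: %7.0f rec/s (%.1fx)%n",
                names[i],
                recordsPerSecond(records, writeMs[i]), slowdown(writeMs[i], writeMs[0]),
                recordsPerSecond(records, readMs[i]), slowdown(readMs[i], readMs[0]));
        }
    }
}
```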

If you're looking for absolute efficiency in terms of both performance and ease of use, go with MongoDB!

And there is still a lot more to find out about these products beyond the simple tests in this post. Keep an eye out for follow-up posts!