If you like MongoDB, you might also like OrientDB. It shares many of MongoDB's characteristics and adds extra features, such as a friendly REST interface and a graph database. As a first step toward giving OrientDB some extra attention, here is the source code behind the previous post. It is a bit crude, but it gives a good impression of how simple OrientDB is to work with in its most basic form.
This code reads a CSV text file that uses semicolons rather than commas as separators, splits each line without handling escaped characters (semicolons inside text fields) and stores all fields in OrientDB. It starts by reading the first line of the CSV, which serves as the list of field names. It translates the field names from PascalCase to camelCase and removes a vendor-specific prefix. This step might need some changes for your own CSV file. I cannot provide my test data for legal reasons.
After that, it reads every remaining line from the file and stores the values under the field names from the first row. If a line contains an extra semicolon, the values shift to the wrong fields and the data will contain errors. In a more realistic example, such errors could be detected by comparing the number of values in the split line with the number of field names from the first line. The code in this example should be used with a clean CSV file to prevent these errors.
package eu.adriaandejonge.orient;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

import com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx;
import com.orientechnologies.orient.core.record.impl.ODocument;

public class WriteOrient {

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        File file = new File("E:/Development/uitjes.csv");
        // try-with-resources closes the reader even when the import fails
        try (BufferedReader bufferedReader = new BufferedReader(new FileReader(file))) {
            // The first line holds the field names
            String firstLine = bufferedReader.readLine();
            if (firstLine == null) {
                throw new IllegalStateException("Empty CSV file: " + file);
            }
            String[] columns = firstLine.split(";");
            int length = columns.length;
            for (int i = 0; i < length; i++) {
                // Strip the vendor-specific prefix, then PascalCase -> camelCase
                columns[i] = columns[i].replace("w3s_", "");
                if (!columns[i].isEmpty()) {
                    columns[i] = columns[i].substring(0, 1).toLowerCase()
                            + columns[i].substring(1);
                }
            }

            // OrientDB: create a new local database
            ODatabaseDocumentTx db =
                    new ODatabaseDocumentTx("local:/tmp/demo").create();
            try {
                int cnt = 0;
                String line;
                while ((line = bufferedReader.readLine()) != null) {
                    cnt++;
                    if (cnt % 100 == 0) System.out.println("cnt=" + cnt);
                    // OrientDB: one document of class "uitje" per CSV line
                    ODocument uitje = new ODocument(db, "uitje");
                    // Naive split: escaped semicolons are not handled
                    String[] values = line.split(";");
                    for (int i = 0; i < length && i < values.length; i++) {
                        uitje.field(columns[i], values[i]); // OrientDB: set a field
                    }
                    uitje.save(); // OrientDB: store the document
                }
            } finally {
                db.close(); // OrientDB: release the database, even on failure
            }
            System.out.println("DONE in "
                    + (System.currentTimeMillis() - start) + "ms");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
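The field-count check mentioned before the listing can be tried out without a database. The following is a minimal sketch, not part of the original import code: the class name CsvLineCheck and the method findSuspectLines are made up for illustration, and it assumes the same naive semicolon split as the listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of OrientDB or of the original listing.
class CsvLineCheck {

    /**
     * Returns the 1-based numbers of the data lines whose number of values
     * differs from the number of field names in the header line.
     */
    static List<Integer> findSuspectLines(String header, List<String> dataLines) {
        // Limit -1 keeps trailing empty values, so "Park;" counts as two values
        int expected = header.split(";", -1).length;
        List<Integer> suspect = new ArrayList<>();
        for (int i = 0; i < dataLines.size(); i++) {
            int actual = dataLines.get(i).split(";", -1).length;
            if (actual != expected) {
                suspect.add(i + 1);
            }
        }
        return suspect;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Zoo;Amsterdam",
                "Museum; old town;Utrecht", // extra semicolon inside a text field
                "Park;");                   // empty last value, still two fields
        System.out.println(findSuspectLines("w3s_Name;w3s_City", lines)); // prints [2]
    }
}
```

Note that the sketch splits with a limit of -1, whereas the listing uses a plain split(";"), which drops trailing empty values. Without the limit, a line that merely ends in an empty field would be reported as too short.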
To estimate the amount of code needed to communicate with OrientDB, focus on the lines that use the OrientDB classes ODatabaseDocumentTx and ODocument: creating the database, creating a document, setting its fields, saving it and closing the database. The rest of the code only serves to read the CSV file. The OrientDB-specific code is comparable to code for similar NoSQL databases, like MongoDB and CouchDB. Even though there is no standardized API for these databases yet, you do not have to worry too much about lock-in. As long as you isolate the database-specific code, you can migrate to a different datastore with relatively little effort, provided it shares the same characteristics. OrientDB, MongoDB and CouchDB can all be characterized as NoSQL document stores that are particularly well suited for storing JSON documents with nested key-value pairs.
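One way to keep the database-specific code isolated is to hide it behind a small interface. The sketch below is an illustration only: DocumentStore, InMemoryDocumentStore and CsvImporter are hypothetical names, not part of OrientDB, MongoDB or CouchDB. The in-memory class merely stands in for a real implementation that would wrap the database calls from the listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Import logic that only knows the (hypothetical) DocumentStore interface. */
class CsvImporter {

    /** Stores one document per line and returns the number of lines stored. */
    static int importLines(String[] columns, List<String> lines, DocumentStore store) {
        int count = 0;
        for (String line : lines) {
            String[] values = line.split(";"); // same naive split as the listing
            Map<String, String> fields = new LinkedHashMap<>();
            for (int i = 0; i < columns.length && i < values.length; i++) {
                fields.put(columns[i], values[i]);
            }
            store.save("uitje", fields);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        InMemoryDocumentStore store = new InMemoryDocumentStore();
        try {
            String[] columns = {"name", "city"};
            importLines(columns, Arrays.asList("Zoo;Amsterdam", "Museum;Utrecht"), store);
            System.out.println(store.documents("uitje"));
        } finally {
            store.close();
        }
    }
}

/** Hypothetical boundary: everything database-specific lives behind it. */
interface DocumentStore {
    void save(String className, Map<String, String> fields);
    void close();
}

/** Stand-in that keeps documents in memory; a real one would wrap a database. */
class InMemoryDocumentStore implements DocumentStore {
    private final Map<String, List<Map<String, String>>> byClass = new HashMap<>();

    @Override
    public void save(String className, Map<String, String> fields) {
        byClass.computeIfAbsent(className, k -> new ArrayList<>())
               .add(new LinkedHashMap<>(fields));
    }

    @Override
    public void close() {
        // Nothing to release for the in-memory stand-in
    }

    /** Returns the documents saved under the given class name (possibly empty). */
    List<Map<String, String>> documents(String className) {
        return byClass.getOrDefault(className, new ArrayList<>());
    }
}
```

With this shape, switching datastores would mean writing one new DocumentStore implementation, while the CSV handling in CsvImporter stays untouched. How well that works in practice depends on how closely the two datastores resemble each other.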
Reading data from OrientDB is somewhat similar. You can do a lot more than the code example demonstrates; querying and reading specific fields are just the most basic examples. What the code does demonstrate is that if you simply want to serve JSON documents to the outside world for client-side processing, you don't need to write a lot of code.
package eu.adriaandejonge.orient;

import java.io.FileWriter;
import java.io.IOException;

import com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx;
import com.orientechnologies.orient.core.record.impl.ODocument;

public class ReadOrient {

    public static void main(String[] args) {
        try {
            tryOrient();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static void tryOrient() throws IOException {
        long startTime = System.currentTimeMillis();
        // OrientDB: open the database created by WriteOrient
        ODatabaseDocumentTx db =
                new ODatabaseDocumentTx("local:/tmp/demo").open("admin", "admin");
        try {
            readCollection(db);
        } finally {
            db.close(); // OrientDB: release the database, even on failure
        }
        System.out.println("DONE in "
                + (System.currentTimeMillis() - startTime) + "ms");
    }

    private static void readCollection(ODatabaseDocumentTx db) throws IOException {
        int count = 0;
        // try-with-resources flushes and closes the output file
        try (FileWriter fileWriter = new FileWriter("E:/orient.txt")) {
            // OrientDB: iterate over all documents of class "uitje"
            for (ODocument doc : db.browseClass("uitje")) {
                count++;
                fileWriter.write(doc.toJSON() + "\n");
            }
        }
        System.out.println("# " + count);
    }
}
To test these examples, you need to set up a local instance of OrientDB with the default configuration. You also need to copy the JAR files orientdb-core.jar and orient-commons.jar to your /lib folder. When connecting to a remote server, you need two additional JARs: orientdb-client.jar and orientdb-enterprise.jar. More details on the libraries required to connect can be found in the OrientDB documentation.
This is just a small first step towards an actual application. Let me know what you think of it. Suggestions for improvement and follow-up posts are welcome.