
Getting Started with MongoDB as a Java NoSQL Solution


NoSQL (Not Only SQL) departs from the traditional approach to data persistence and from the norms of the relational model. It does without the tables, fixed schemas, SQL, and rows that the relational model is built upon. The ACID guarantees that once ensured reliability and consistency also seem insufficient for data handling in distributed environments, because applications that handle business data today are large and are often spread across many machines. Before NoSQL, relational schemas were either replicated or partitioned horizontally across those machines to keep up with the data flow, an approach that tends to become infeasible or too complex to maintain in the long run. A new model was sought under these circumstances, and NoSQL emerged with some fresh ideas. It gave rise to a multitude of database implementations such as CouchDB, Cassandra, HBase, Neo4j, and others; MongoDB is a prominent one among them. This article explores the relationship between Java and NoSQL from the perspective of MongoDB and provides some ground-up information in a concise manner before going hands-on in Java.

NoSQL in a Nutshell

NoSQL is a generic term for a family of databases that fall into four broad categories; MongoDB belongs to one of them:

  • Graph stores keep information in a graph data structure, with edges connecting nodes of information. Examples: Neo4j, HyperGraphDB, and so forth.
  • Key-value stores are the simplest kind: each item is stored as a key-value pair, similar to the HashMap data structure in Java.
  • Columnar (wide-column) stores keep information in a wide-column structure instead of the rows found in a relational model. Typical examples are Cassandra, HBase, and the like.
  • Document stores pair a key with a complex data structure called a document. The data structure is complex in the sense that it may contain different key-value pairs, key-array pairs, or nested documents. MongoDB and CouchDB belong to this category.
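To make the difference between the key-value and document categories concrete, here is a minimal sketch that uses only plain Java collections. It is an in-memory illustration, not how any of the databases above store data; the class name `StoreShapes` and the keys and values in it are made up for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class StoreShapes {

    // Key-value style: one opaque value per key.
    static Map<String, String> keyValueStore() {
        Map<String, String> store = new HashMap<>();
        store.put("session:42", "user=alice;ttl=3600"); // hypothetical key and value
        return store;
    }

    // Document style: the value under a key is itself structured.
    static Map<String, Map<String, Object>> documentStore() {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Oslo");
        address.put("zip", "0150");

        Map<String, Object> user = new LinkedHashMap<>();
        user.put("name", "alice");                           // key-value pair
        user.put("roles", Arrays.asList("admin", "editor")); // key-array pair
        user.put("address", address);                        // nested document

        Map<String, Map<String, Object>> store = new HashMap<>();
        store.put("user:42", user);
        return store;
    }

    public static void main(String[] args) {
        System.out.println(keyValueStore().get("session:42"));

        Map<String, Object> user = documentStore().get("user:42");
        System.out.println(user.get("roles"));
        System.out.println(((Map<?, ?>) user.get("address")).get("city"));
    }
}
```

In the key-value case the application has to parse the stored string itself, whereas in the document case the individual fields, the array, and the nested document can each be addressed directly.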

In spite of the differences, there are some basic characteristics common to all the categories. To name a few:

  • Format Neutral: NoSQL databases can store data in different formats such as documents, graphs, key-value pairs, and more.
  • Joinless: Complex SQL join queries are not required; related data can be retrieved using simple document-oriented interfaces.
  • Schemaless: The data structures used to store data need not be defined beforehand. They can grow and shrink according to changing needs. (Cassandra is an exception, however.)
  • And more…

ACID Rules Vs. CAP Theorem

The ACID (Atomicity, Consistency, Isolation, and Durability) rules are the backbone of an RDBMS, and every transaction in the relational model holds firmly to these qualities. But in a large environment, strictly enforcing them is often at odds with availability and performance. The ACID rules fell short in practice, especially in distributed environments, and this made people look beyond them.

A typical relational database transaction usually locks a part of the database so that the data stays in a consistent state until the transaction either completes successfully or fails. Any other request for the locked part in the meantime is denied. This is fine when dealing with a low stream of requests and smaller data. However, this stringent rule compromises availability for an enterprise that deals with a huge request stream and large data clusters, something like a shopping rush at amazon.com. And, because databases are usually partitioned across multiple servers, the system must function even if communication among the servers is unreliable.

The CAP theorem, proposed by Eric Brewer, names three requirements to weigh when designing an application for a distributed architecture: Consistency, Availability, and Partition tolerance (CAP). Put simply, a distributed system cannot guarantee all three at the same time. At most two of them can be chosen as the deciding factors, and which two depends on the underlying application architecture.

Figure 1: The CAP theorem

MongoDB in the Scenario

MongoDB is a document-oriented database and is consistent by default. It achieves partition tolerance by means of replica sets: a write operation is recorded in a log, and that log is asynchronously replicated to secondary databases. MongoDB was not designed around transactions in the manner of an RDBMS, where a set of operations is wrapped in sophisticated SQL statements; multi-document transactions only arrived later, in version 4.0. It does, however, allow an atomic update operation that works on a complex document structure. This trade-off buys simplicity, scalability, and fast performance. However, the relational model still reigns supreme where strict transaction semantics are sought.
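The asynchronous log replication described above can be pictured with a toy in-memory model. This is only a sketch of the idea, assuming a single primary and a single secondary; the class names `ToyReplicaSet`, `Primary`, `Secondary`, and `OplogEntry` are invented here and are not part of MongoDB or its driver.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyReplicaSet {

    // One logged write: set `key` to `value`.
    static final class OplogEntry {
        final String key;
        final String value;

        OplogEntry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    static final class Primary {
        final Map<String, String> data = new HashMap<>();
        final List<OplogEntry> oplog = new ArrayList<>();

        // A write is applied locally and appended to the log.
        // It does not wait for any secondary to catch up.
        void write(String key, String value) {
            data.put(key, value);
            oplog.add(new OplogEntry(key, value));
        }
    }

    static final class Secondary {
        final Map<String, String> data = new HashMap<>();
        private int applied = 0; // number of log entries already replayed

        // Replay whatever the primary has logged since the last sync.
        void syncFrom(Primary primary) {
            while (applied < primary.oplog.size()) {
                OplogEntry entry = primary.oplog.get(applied++);
                data.put(entry.key, entry.value);
            }
        }
    }

    public static void main(String[] args) {
        Primary primary = new Primary();
        Secondary secondary = new Secondary();

        primary.write("Title", "World of Disney");
        System.out.println("primary:               " + primary.data.get("Title"));
        System.out.println("secondary before sync: " + secondary.data.get("Title"));

        secondary.syncFrom(primary);
        System.out.println("secondary after sync:  " + secondary.data.get("Title"));
    }
}
```

Between the write and the sync, a read from the secondary in this model returns stale data, which is one reason reading from the primary is the consistent default.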

Database Design

MongoDB is schemaless. That, however, does not mean that MongoDB stores unordered, gibberish information in the database. Schemaless essentially means that the database is not bound by any predefined columns or data types, which allows structural flexibility both semantically and syntactically. A document describing a book, for example, might look like this:

{
   "Type": "Book",
   "Title": "World of Disney",
   "ISBN": "111-222-333-4444-0",
   "Publisher": "DisneyLand",
   "Author": [
      "Mouse, Mickey",
      "Duck, Donald",
      "Mouse, Minnie"
   ]
}

Observe that the document contains key-value pairs where, say, “Title” is the key and “World of Disney” is the value. The keys are strings, and a value can be one of several data types, such as a string, an array, binary data, and so forth. MongoDB stores data in BSON (binary JSON) format.

Also, note that a document is the item that contains the actual data; it plays a role similar to that of a row in a relational model.
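Unlike rows in a table, documents that sit side by side do not have to share the same fields. The following sketch models that with plain Java maps; it is an in-memory illustration only, and the helper `doc`, the class `SchemalessCollection`, and the second book are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SchemalessCollection {

    // Hypothetical helper: builds a document from alternating keys and values.
    static Map<String, Object> doc(Object... keysAndValues) {
        Map<String, Object> document = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keysAndValues.length; i += 2) {
            document.put((String) keysAndValues[i], keysAndValues[i + 1]);
        }
        return document;
    }

    static List<Map<String, Object>> books() {
        List<Map<String, Object>> books = new ArrayList<>();
        books.add(doc("Title", "World of Disney",
                      "ISBN", "111-222-333-4444-0",
                      "Author", Arrays.asList("Mouse, Mickey", "Duck, Donald", "Mouse, Minnie")));
        // A second document with a different shape: no ISBN, one extra field.
        books.add(doc("Title", "Untitled Draft", "Pages", 120));
        return books;
    }

    // Returns the titles of all documents that contain the given field.
    static List<String> titlesWithField(List<Map<String, Object>> docs, String field) {
        List<String> titles = new ArrayList<>();
        for (Map<String, Object> document : docs) {
            if (document.containsKey(field)) {
                titles.add((String) document.get("Title"));
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> books = books();
        System.out.println(titlesWithField(books, "ISBN"));  // only the first book has an ISBN
        System.out.println(titlesWithField(books, "Title")); // both books have a title
    }
}
```

Because nothing enforces a shape, code that reads such documents has to be prepared for missing fields, as the `containsKey` check does here.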

Figure 2: Comparing the MongoDB database model to the relational database model

The term collection comes up frequently in MongoDB; it means a container that stores documents. MongoDB can be compared to the relational model as in Figure 2, but the two are definitely not used in the same way. Ordinary collections in MongoDB grow as needed. There are also capped collections, which hold only a fixed amount of data.
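The behavior of a capped collection can be sketched in a few lines of plain Java: once the limit is reached, each new document pushes out the oldest one. This is a simplified in-memory analogy that caps by document count, whereas MongoDB caps primarily by size in bytes; the class `CappedLog` and its methods are invented for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class CappedLog {
    private final int maxDocuments;
    private final Deque<String> documents = new ArrayDeque<>();

    CappedLog(int maxDocuments) {
        if (maxDocuments <= 0) {
            throw new IllegalArgumentException("maxDocuments must be positive");
        }
        this.maxDocuments = maxDocuments;
    }

    // Inserts a document, evicting the oldest one when the cap is reached.
    void insert(String document) {
        if (documents.size() == maxDocuments) {
            documents.removeFirst();
        }
        documents.addLast(document);
    }

    // Returns the documents in insertion order, oldest first.
    List<String> contents() {
        return new ArrayList<>(documents);
    }

    public static void main(String[] args) {
        CappedLog log = new CappedLog(3);
        for (int i = 1; i <= 5; i++) {
            log.insert("event-" + i);
        }
        System.out.println(log.contents()); // prints [event-3, event-4, event-5]
    }
}
```

This kind of fixed-size, insertion-ordered container suits data such as recent log entries, where only the latest items matter.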

Let’s put these rudimentary ideas into practice with a very simple example of connecting to MongoDB from Java.

An Example: Connecting to MongoDB from Java

MongoDB does not use JDBC; it ships its own Java driver. Like a JDBC driver, however, the MongoDB Java driver can be downloaded and added to a Java project in much the same way. The Java code to test connectivity is as follows:

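package org.mano.example;

import com.mongodb.DB;
import com.mongodb.MongoClient;

public class MongoDbExample {

   public static void main(String[] args) {
      MongoClient mongoClient = null;
      try {
         // Creating the client does not by itself prove the server is up.
         mongoClient = new MongoClient("localhost", 27017);
         // getDB() is the legacy API; newer drivers offer getDatabase().
         DB db = mongoClient.getDB("test");
         // Listing collection names forces a round trip to the server,
         // so an unreachable server surfaces as an exception here.
         db.getCollectionNames();
         System.out.println("Connection established");
      } catch (Exception e) {
         System.err.println(e.getMessage());
      } finally {
         // Release the client's resources.
         if (mongoClient != null) {
            mongoClient.close();
         }
      }
   }
}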

Conclusion

NoSQL database models are still too young to have established a strong foothold as a de facto standard, given the maturity and stability offered by the veteran relational model. But the progress seen in recent years is impressive. It is not hard to see the need for a different approach to data persistence in a distributed environment, and NoSQL is a viable alternative. This, however, does not spell the demise of the relational model; rather, NoSQL complements it where the need arises. The relational model is still well suited to tasks such as retrieving records through complicated SQL queries, processing fine-grained data within a record, and the like, something NoSQL databases have yet to solve. Nonetheless, there will always be cases best handled by a NoSQL database and others that are a perfect fit for an RDBMS.
