Introduction
Azure DocumentDB is a NoSQL solution provided by Microsoft in the cloud. It falls under the ‘Document’ category: data is stored in JSON format. DocumentDB refers to its database entities as ‘resources’. To use the resources, we need a DocumentDB database account, which can be created with an Azure subscription.
A database account can contain a set of databases, and each database can have multiple collections. Collections, in turn, hold documents (the records) along with items such as stored procedures. DocumentDB also allows managing user access permissions at the DocumentDB account level. We will see these features in more detail.
Architecture
To begin with, let’s understand the concept of a ‘resource’ in DocumentDB. One of the basic resources is a ‘Database account’, which can have multiple capacity units, each a set of databases and blob storage. After creating a database account, the next step is to create a ‘Database’, which acts as a logical container, similar to a namespace. A Database can have a set of ‘Collections’, each of which is a container for storing Documents. A Document is JSON data that represents a record. In addition to these, we also have ‘Users’ and ‘Permissions’. A User is a namespace to which permissions are assigned, and a Permission is an authorization token that lets a user access a resource. Figure 1 shows the structure of a DocumentDB account.
Figure 1: The structure of a DocumentDB account
The resources are further classified as ‘system resources’ and ‘user-defined resources’. In the diagram shown in Figure 1, resources like ‘DocumentDB account’, ‘Databases’, ‘Users’, ‘Permissions’, ‘Collections’, ‘Stored procedures’, ‘Triggers’, and ‘UDFs’ are all system resources and have fixed schemas. Documents and attachments, however, are ‘user-defined’ and have a flexible schema.
DocumentDB supports stored procedures and triggers; this makes it possible to update multiple documents in one transaction. However, it supports only a single transaction scope: all the documents participating in the transaction must belong to one collection.
The data types supported by DocumentDB are the same as those supported by JSON: native types such as strings, numbers, and Booleans, and complex types such as nested objects and arrays. JSON has no native date-time type, so date-time values are typically stored as strings or numbers. DocumentDB also supports bulk insert operations, in which all the documents in the bulk operation are treated as one transaction; the transaction is committed only if all the documents are inserted successfully.
DocumentDB also supports resource caching, and its functionality is exposed as a RESTful service.
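Because the service is RESTful, each resource in the hierarchy described above is addressed by a path that mirrors its position under the database account. The following is a minimal sketch of how such paths are composed. The class and method names (ResourcePaths, DatabasePath, and so on) are hypothetical and are not part of the client library, and the IDs passed in are example values only; in practice the client library builds these links for you.

```csharp
using System;

// Hypothetical helper, not part of the DocumentDB client library.
// It only illustrates how resource paths nest under one another.
public static class ResourcePaths
{
    public static string DatabasePath(string dbId)
    {
        return "dbs/" + dbId;
    }

    public static string CollectionPath(string dbId, string collId)
    {
        return DatabasePath(dbId) + "/colls/" + collId;
    }

    public static string DocumentPath(string dbId, string collId, string docId)
    {
        return CollectionPath(dbId, collId) + "/docs/" + docId;
    }

    public static string UserPath(string dbId, string userId)
    {
        return DatabasePath(dbId) + "/users/" + userId;
    }

    public static string PermissionPath(string dbId, string userId, string permId)
    {
        return UserPath(dbId, userId) + "/permissions/" + permId;
    }
}

public static class ResourcePathsDemo
{
    public static void Main()
    {
        // Example IDs only; a real document ID is whatever the document's id property holds.
        Console.WriteLine(ResourcePaths.DocumentPath("StudentDatabase", "StudentList", "1"));
        Console.WriteLine(ResourcePaths.PermissionPath("StudentDatabase", "user1", "readStudents"));
    }
}
```

Note that users and permissions hang off the database, while documents hang off a collection inside the database.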
How to Use DocumentDB
To evaluate DocumentDB, log in to http://portal.azure.com with your account.
Configuration
After navigating to the portal, click the ‘New’ button and select DocumentDB.
Figure 2: Starting a new DocumentDB project
Fill in the ‘ID’ and the ‘Location’ and click Create; this creates a DocumentDB account. Once the account is created, it appears as a tile on the home page. Click the tile to see the details.
Figure 3: Viewing details
Figure 4: Viewing the Keys
The Keys option (Point 1 in Figure 4) provides the values you need to connect to the DocumentDB account. Point 2 in Figure 4 lists the databases available in this account. Clicking a database displays the collections available within it, and clicking a collection displays the records it contains.
Figure 5: Viewing the Keys, Part 2
Point 1 in Figure 5 displays the different collections in the database. Click Document Explorer (Point 2) to bring up the document explorer on the right, as shown in Point 3 of Figure 5. It displays the documents in the collection. Click the individual record to see the details, as shown in Figure 6:
Figure 6: Viewing the details
Let’s write the code to add and update documents with the help of an example.
Connect to the DocumentDB Server from C# Code
Open Visual Studio 2013, click “Manage NuGet Packages”, and install “Microsoft Azure DocumentDB Client Library”. Ensure that “Include Prerelease” is selected.
Figure 7: Selecting “Include Prerelease”
To connect to the database and create documents, we first need to retrieve the URL and the key to connect. This information can be retrieved by clicking on “Keys” in Figure 4, Point 1.
Add the following namespaces:
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
Sample code to create a connection is:
private static string URI = "<URL>";
private static string Key = "<Key>";
private static DocumentClient client = new DocumentClient(new Uri(URI), Key);
Create a Schema Definition
In this example, I am using a console application to add and update records. We create two classes: Student and Marks. We will create the student first and then update the student with his or her marks.
public class Student
{
    public string id { get; set; }
    public string StudentNo { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
    public string Address { get; set; }
    public Marks StudentMarks { get; set; }
}

public class Marks
{
    public int Marks1 { get; set; }
    public int Marks2 { get; set; }
    public int Marks3 { get; set; }
}
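Since documents are stored as JSON, it can help to see the shape that a Student with nested Marks maps to. The following is a minimal sketch, assuming Json.NET (Newtonsoft.Json) is referenced; the client library package normally brings it in as a dependency. The StudentJsonDemo class and its ToJson method are hypothetical names used only for illustration, and the JSON that the service actually stores may contain additional system properties.

```csharp
using System;
using Newtonsoft.Json;

public class Marks
{
    public int Marks1 { get; set; }
    public int Marks2 { get; set; }
    public int Marks3 { get; set; }
}

public class Student
{
    public string id { get; set; }
    public string StudentNo { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
    public string Address { get; set; }
    public Marks StudentMarks { get; set; }
}

// Hypothetical demo class; not part of the DocumentDB client library.
public static class StudentJsonDemo
{
    // Serializes a student to indented JSON so the nesting is easy to read.
    public static string ToJson(Student student)
    {
        return JsonConvert.SerializeObject(student, Formatting.Indented);
    }

    public static void Main()
    {
        Student joseph = new Student
        {
            StudentNo = "3",
            Name = "Joseph S",
            Email = "Joseph.S@abc.com",
            Address = "Delhi, India",
            StudentMarks = new Marks { Marks1 = 10, Marks2 = 20, Marks3 = 30 }
        };

        // StudentMarks appears as a nested JSON object inside the student.
        Console.WriteLine(ToJson(joseph));
    }
}
```

The nested Marks object becomes a nested JSON object, which is why no separate table or join is needed to keep a student's marks with the student.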
We use two helper methods that check whether the database or collection already exists. If it doesn’t yet exist, the method creates it; otherwise, it returns the existing database or collection.
// Returns the database with the given ID, creating it if it does not exist.
public static async Task<Database> GetDatabase(string databaseName)
{
    Database existing = client.CreateDatabaseQuery()
        .Where(db => db.Id == databaseName)
        .AsEnumerable()
        .FirstOrDefault();
    if (existing != null)
    {
        return existing;
    }
    return await client.CreateDatabaseAsync(new Database { Id = databaseName });
}

// Returns the collection with the given ID, creating it if it does not exist.
public static async Task<DocumentCollection> GetCollection(Database database, string collName)
{
    DocumentCollection existing = client.CreateDocumentCollectionQuery(database.SelfLink)
        .Where(coll => coll.Id == collName)
        .AsEnumerable()
        .FirstOrDefault();
    if (existing != null)
    {
        return existing;
    }
    return await client.CreateDocumentCollectionAsync(database.SelfLink, new DocumentCollection { Id = collName });
}
Note: To resolve the Query methods, ensure that the “using Microsoft.Azure.Documents.Linq;” namespace is added.
Now, we will use the helper methods to create the database and collection:
// .Result blocks until the asynchronous call completes, which is acceptable in a console application.
Database database = GetDatabase("StudentDatabase").Result;
DocumentCollection collection = GetCollection(database, "StudentList").Result;
Insert and Retrieve Information
To insert documents into the collection, we first create two objects of type Student.
Student Joseph = new Student() { StudentNo = "3", Name = "Joseph S", Email = "Joseph.S@abc.com", Address = "Delhi, India" };
Student Priya = new Student() { StudentNo = "4", Name = "Priya R", Email = "Priya.R@abc.com", Address = "Seattle, US" };
The following commands insert the documents into the collection.
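// Wait for each insert to finish; otherwise the program may continue before the documents are stored.
client.CreateDocumentAsync(collection.SelfLink, Joseph).Wait();
client.CreateDocumentAsync(collection.SelfLink, Priya).Wait();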
After the student documents are created, we update the students with their marks. Now, we create two marks objects.
Marks JosephM = new Marks() { Marks1 = 10, Marks2 = 20, Marks3 = 30 };
Marks PriyaM = new Marks() { Marks1 = 11, Marks2 = 21, Marks3 = 31 };
To update the existing documents, we first read the documents and then update them. To do an update, we need the internal ID (a GUID) that Azure generates for the document. For that reason, ‘id’ is added to the ‘Student’ class but is not explicitly set in the code.
We also create an ‘UpdateDocument’ method to write the modified document back to the collection. To do that, we use the client.UpsertDocumentAsync method.
// Writes the modified student back to the collection, replacing the document that has the same id.
public static async Task<Document> UpdateDocument(DocumentCollection coll, Student student)
{
    return await client.UpsertDocumentAsync(coll.SelfLink, student);
}
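The reason the ‘id’ matters for an upsert can be pictured with an in-memory stand-in. The following is only a sketch of the semantics, not of how the service is implemented: the InMemoryCollection class is hypothetical, documents are reduced to plain strings, and a dictionary plays the role of the collection. A document without an id is inserted under a newly generated one; a document whose id already exists replaces the stored copy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a collection; it illustrates upsert semantics only.
public class InMemoryCollection
{
    private readonly Dictionary<string, string> docs = new Dictionary<string, string>();

    // Inserts the body under a new id when none is given,
    // or replaces the stored body when the id already exists.
    public string Upsert(string id, string body)
    {
        if (string.IsNullOrEmpty(id))
        {
            id = Guid.NewGuid().ToString(); // mimics a generated id
        }
        docs[id] = body;
        return id;
    }

    public int Count
    {
        get { return docs.Count; }
    }

    public string Get(string id)
    {
        return docs[id];
    }
}

public static class UpsertDemo
{
    public static void Main()
    {
        InMemoryCollection coll = new InMemoryCollection();

        // First call: no id yet, so this is an insert.
        string id = coll.Upsert(null, "Joseph S, no marks");

        // Second call: same id, so the stored document is replaced, not duplicated.
        coll.Upsert(id, "Joseph S, marks 10/20/30");

        Console.WriteLine(coll.Count);   // prints 1
        Console.WriteLine(coll.Get(id)); // prints Joseph S, marks 10/20/30
    }
}
```

This is why the students are read back from the collection before being updated: the objects returned by the query carry the generated id, so the upsert replaces the existing documents rather than adding new ones.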
Use the following code to retrieve documents from the collection. It uses a select clause that is similar to the corresponding SQL statement. The code assigns the marks to the two student records and calls the ‘UpdateDocument’ method to update the collection.
// SELECT * returns every document in the collection; each one is deserialized into a Student.
foreach (Student student in client.CreateDocumentQuery<Student>(collection.SelfLink, "SELECT * FROM StudentList"))
{
    if (student.Name == "Joseph S")
        student.StudentMarks = JosephM;
    else
        student.StudentMarks = PriyaM;
    Document doc = UpdateDocument(collection, student).Result;
    Console.WriteLine("Student Name: " + student.Name + ", Address: " + student.Address + ", Email: " + student.Email);
}
After running this code, we can go to the ‘Document Explorer’ view (see Figure 5, Point 2) to see the changes.
Features
DocumentDB is a NoSQL solution offered on Azure. It relies on Azure’s features to scale up or down. Because the documents are stored in JSON format, the data types supported by JSON apply automatically. In addition, it supports stored procedures and triggers. These help with bulk transactions because they commit or roll back all the documents involved in the transaction. Collections are stored in capacity units, and each capacity unit includes 10 GB of storage along with throughput. Users can be assigned permissions (read, contribute, and owner) to the Database account.
Summary
DocumentDB is a NoSQL solution on Azure. It is targeted at mobile applications and web solutions that don’t require a relational database. It’s simple to use and, because it’s in the cloud, all the URLs are pre-configured and globally available. All the data is stored in JSON format, which helps in rapid development.