http://www.developer.com/

Mnesia: A Distributed DBMS Rooted in Concurrency


February 10, 2010

Mnesia is a complete database management system (DBMS) included in Erlang/OTP, an open source development environment for concurrent programming based on the Erlang language. Because Mnesia was designed as a distributed DBMS, distributing, replicating and fragmenting your data with it (even among nodes spread around the world) takes very little effort. You only need to run an Erlang node on each machine where you want the Mnesia database to live.

The story of how Mnesia got its name is probably apocryphal, but legend has it that the original name was Amnesia until an Ericsson executive declared that a database couldn't be called something associated with forgetting. So, the engineers dropped the "A" and built Mnesia, something that remembers all. Mnesia offers the fault tolerance feature of any program implemented in Erlang, and you can interact with the database in Erlang itself. With Mnesia, Erlang becomes a database language.

In this article, I highlight some of Mnesia's important features. The file test.erl provided in the source code download contains some simple functions, each of which groups together a few lines of code. I mainly call them from an Erlang shell.

Unless absolutely necessary, I do not explain all the aspects of Erlang syntax or basic concepts. You can easily find information about those in the online documentation. Pay particular attention to atoms, variables, tuples, and lists. You can even find useful background information in my DevX article "Writing Parallel Programs with Erlang," where I explore the concurrent aspects of the language.

Nodes: The Heart of Erlang

The concept of nodes is at the heart of Erlang. Every Erlang shell could be a node, provided that it is started with a name and a secret word called a cookie. Let's see Erlang in action with some examples. I have three Debian Linux machines on my LAN (charlie, delta and nemo) where I have installed Erlang by simply running:

apt-get install erlang


I chose a nonsensical string (AXHMTYGVJDNJIFGYADNJ) as the secret word and first, second and third as the node names. Then, I ran these two commands, the first on charlie and the second on delta:

erl -sname first -setcookie AXHMTYGVJDNJIFGYADNJ
 
erl -sname second -setcookie AXHMTYGVJDNJIFGYADNJ


I can now connect the two Erlang nodes, first@charlie and second@delta, with a single ping from charlie:

(first@charlie)1> net_adm:ping(second@delta).                            
pong


I can test the connection by asking which nodes are connected:

(first@charlie)2> nodes().
[second@delta]


As you can see, I already have two cooperative Erlang nodes that can easily communicate with each other.
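When more than two nodes are involved, it can be handy to check them all in one call. The following is a minimal sketch, not part of test.erl: the module name cluster_check and the function ping_all/1 are hypothetical names chosen for illustration. It relies only on net_adm:ping/1, which returns pong when the node answers and pang when it does not.

```erlang
-module(cluster_check).
-export([ping_all/1]).

%% Hypothetical helper (not part of the article's test.erl).
%% Pings every node in Nodes and pairs each node name with the result:
%% pong if the node answered, pang if it could not be reached.
ping_all(Nodes) ->
    [{Node, net_adm:ping(Node)} || Node <- Nodes].
```

From first@charlie you could call cluster_check:ping_all([second@delta, third@nemo]). A pang in the result usually points to a wrong node name, a different cookie, or a network problem.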

Mnesia Schema

The test.erl file in the source code download contains two functions, test:start_schema(NodesList) and test:start_tables(), that have to be run only once. They deal with creating the Mnesia schema, defining the tables, and populating those tables with initial data.

The first operation you need to complete is creating a Mnesia schema, before you start the Mnesia engine on any of the connected nodes. A Mnesia schema is a special table backed by a local directory on the filesystem. This directory can contain several files and must be unique for each node. You create the schema once for each database. By default its directory is created under the current directory with the name Mnesia.node_name@host, but you can override the default with the erl command option -mnesia dir '"database_dir"' (the directory is passed as a quoted Erlang string).

The function mnesia:create_schema(NodesList) takes a list of already connected Erlang nodes as its argument. If you run the following from one of the connected nodes:

mnesia:create_schema([local_node_name@localhost]).


the Mnesia directory will exist only on the local node. But if you run this:

mnesia:create_schema([local_node_name@localhost,
                      remote_node_name1@host1,
                      remote_node_name2@host2]).


all the nodes in the list will have a copy of the Mnesia schema and the system will keep all the copies synchronized. This is an important detail with regard to redundancy and failover issues.
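The article does not list the body of test:start_schema, so the following is only a sketch of what such a one-time setup step might look like. The module name schema_setup and the function init/1 are hypothetical. It assumes that all the nodes in the list are already connected and that Mnesia is stopped on every one of them, because mnesia:create_schema fails otherwise.

```erlang
-module(schema_setup).
-export([init/1]).

%% Hypothetical one-time setup (not the article's test:start_schema).
%% Creates the schema on every node in Nodes, then starts Mnesia on
%% each of them with a single rpc:multicall.
init(Nodes) ->
    case mnesia:create_schema(Nodes) of
        ok ->
            %% Results holds one return value per reachable node,
            %% BadNodes lists the nodes that could not be called.
            {Results, BadNodes} = rpc:multicall(Nodes, mnesia, start, []),
            {ok, Results, BadNodes};
        {error, Reason} ->
            %% For example, the schema already exists on one of the nodes.
            {error, Reason}
    end.
```

Returning the error instead of crashing makes it easier to run the function from the shell and read why the schema could not be created.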

To put this in practice, I distribute the file test.erl on my three nodes and compile it with this command:

c(test).


To create a schema on both the nodes first@charlie and second@delta, I run the following from one of the two nodes:

mnesia:create_schema([first@charlie,second@delta]).


Then I launch mnesia:start() on the other node too, because Mnesia has to be running on every node that will hold table replicas.

Mnesia Tables

Defining the database tables is another one-time operation, which you can perform on one of the nodes that have a copy of the schema. With the mnesia:create_table function, you can design the structure of the table as well as set many important table attributes. I create and populate table1 and table2 by running:

(first@charlie)3> test:start_tables().
 
=INFO REPORT==== 4-Jan-2010::16:22:27 ===
    application: mnesia
    exited: stopped
    type: temporary
stopped


Provided that the file test.erl has also been compiled on second@delta, I could just as well run the above function on delta. Take a look at the following code to see how the previously cited function mnesia:create_table is used:

mnesia:create_table(table1,
[
    {type, set},
    {attributes, record_info(fields, table1)},
    {disc_copies, [first@charlie, second@delta]}  
])


As you can see, besides the table name the function takes a list of {key, value} tuples that determine the table characteristics. In this case, I create a "set" table named table1 whose columns are defined in the record directive at the top of the file:

-record(table1, {table1_id, name, color, number}).


Note the disc_copies key, which specifies where a hard disk copy of the table will exist. This means that you can tune the redundancy of a Mnesia database at the table level. In this case, table1 has copies on the hard disk on the nodes first@charlie and second@delta. In the next section, I show this function in action.
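The call to mnesia:create_table returns {atomic, ok} on success and {aborted, Reason} on failure, so it is worth checking the result. Here is a minimal sketch under stated assumptions: the module name table_setup and the function create/1 are hypothetical, and treating an already existing table as success is a design choice of this sketch, not something the article prescribes.

```erlang
-module(table_setup).
-export([create/1]).

-record(table1, {table1_id, name, color, number}).

%% Hypothetical wrapper (not part of the article's test.erl).
%% Creates table1 as a "set" table with a disc copy on every node in Nodes.
create(Nodes) ->
    Options = [{type, set},
               {attributes, record_info(fields, table1)},
               {disc_copies, Nodes}],
    case mnesia:create_table(table1, Options) of
        {atomic, ok} ->
            ok;
        {aborted, {already_exists, table1}} ->
            %% Assumption: running the setup twice should not be an error.
            ok;
        {aborted, Reason} ->
            %% For example, Mnesia is not running on one of the nodes.
            {error, Reason}
    end.
```

On the cluster described here, the call would be table_setup:create([first@charlie, second@delta]).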

Testing Mnesia Nodes

On the second@delta node, I run:

(second@delta)5> test:select().
[{table1,1,"record1","brown",1724},
 {table1,2,"record2","orange",2367},
 {table1,3,"record3","red",7834}]


Then I shut down the second@delta node, exiting from the shell.

On the first@charlie node, I insert a new record on table1:

(first@charlie)6> test:insert().
ok
(first@charlie)7> test:select().
[{table1,1,"record1","brown",1724},
 {table1,2,"record2","orange",2367},
 {table1,3,"record3","red",7834},
 {table1,4,"record4","orange",8888}]


Now, I power on the second@delta node again and start the Mnesia engine with mnesia:start(). Then I extract table1 records:

(second@delta)2> test:select(). 
[{table1,1,"record1","brown",1724},
 {table1,2,"record2","orange",2367},
 {table1,3,"record3","red",7834},
 {table1,4,"record4","orange",8888}]


The table1 copy on second@delta also contains the record inserted on the node first@charlie while second@delta was down. This means that the table replica on second@delta synchronized itself transparently with the one on first@charlie during startup.

The ram_copies and disc_only_copies keys work the same way: ram_copies lists the nodes that keep the table only in RAM, and disc_only_copies lists the nodes that keep it only on disk. (A disc_copies replica lives in RAM and is also written to disk.) For performance reasons, keeping a RAM copy of the table on a node is good practice, but a RAM-only copy of course provides no persistence on its own. The crucial point to understand here is Mnesia's out-of-the-box mirroring of tables.
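To see which storage type a node uses for a table, you can ask mnesia:table_info for the storage_type item. The sketch below is illustrative only: the module name storage_demo and the throwaway table scratch are hypothetical, and the table is created RAM-only on the local node so that the example does not need a disc schema.

```erlang
-module(storage_demo).
-export([run/0]).

%% Hypothetical demo (not part of the article's test.erl).
%% Creates a RAM-only table on the local node, reads back its
%% storage type, and removes the table again.
run() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(scratch, [{ram_copies, [node()]}]),
    %% storage_type is reported from the point of view of the local node.
    Type = mnesia:table_info(scratch, storage_type),
    {atomic, ok} = mnesia:delete_table(scratch),
    Type.
```

Here the function returns ram_copies. For a table such as table1 in this article, asking the same question on first@charlie would report disc_copies.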

Adding a Mnesia Node

The third machine on my LAN is called nemo. I start the Erlang node third@nemo with this command and then start Mnesia on the node:

erl -sname third -setcookie AXHMTYGVJDNJIFGYADNJ


The system is so flexible that from one of the other two nodes I can add an additional node with this command:

(first@charlie)15> mnesia:change_config(extra_db_nodes,[third@nemo]).


As I wrote before, the Mnesia schema is a table: a special one, but ultimately a table like any other. So, the command above adds a RAM copy of the schema table on third@nemo. After this operation, you can access the remote tables table1 and table2 on first@charlie and second@delta from third@nemo. In fact, if I ask for information about the Mnesia system on nemo I receive:

(third@nemo)4> mnesia:system_info().
...
running db nodes   = [first@charlie,second@delta,third@nemo]
...
remote             = [table1]
ram_copies         = [schema]
...


Add Table Copies

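Of course, you can also change the storage type of the schema table on third@nemo on the fly. Because third@nemo already holds a RAM copy of the schema, the call that converts it is mnesia:change_table_copy_type (mnesia:add_table_copy is for giving a node a replica it does not have yet):

(first@charlie)16> mnesia:change_table_copy_type(schema, third@nemo, disc_copies).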


Adding new copies of the schema and changing their storage type are dynamic operations that can be applied to any table in a Mnesia database.
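As an illustration of applying the same idea to an ordinary table, here is a sketch that connects a new node and then gives it its own replica of a table. The module name join_cluster and the function add_node/3 are hypothetical. The sketch assumes Mnesia is already running on both the calling node and the new node.

```erlang
-module(join_cluster).
-export([add_node/3]).

%% Hypothetical helper (not part of the article's test.erl).
%% Connects Node to the running Mnesia cluster, then adds a replica of
%% Tab on it with the given storage type (for example ram_copies).
add_node(Node, Tab, StorageType) ->
    case mnesia:change_config(extra_db_nodes, [Node]) of
        {ok, Connected} ->
            case lists:member(Node, Connected) of
                true ->
                    %% Returns {atomic, ok} or {aborted, Reason}.
                    mnesia:add_table_copy(Tab, Node, StorageType);
                false ->
                    {error, {not_connected, Node}}
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```

With the nodes used in this article, a call could look like join_cluster:add_node(third@nemo, table1, ram_copies).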

Now you know how easy it is to start nodes, build a Mnesia engine over them, and add further nodes without stopping any service and in a completely transparent way. To improve performance and achieve a high degree of redundancy, each node could have its own replicas of both the schema and some other table(s).

Transactions

Mnesia clearly is a true distributed DBMS (when you perform an operation on one copy of the schema, it is automatically propagated to all other replicas and fragments), but is it also a transactional database? Can you perform a transaction, namely a set of operations grouped in a unit of work that is atomic, consistent, isolated and durable (ACID), against a Mnesia database? The answer is yes, and the rest of this section explains how you implement this.

Think of the all-or-nothing operations that you have to execute on a database. For example, say I want to delete an existing record (#1) and insert a new record (#5), as shown in the function atomic_op here:

atomic_op() ->
    Row = #table1{table1_id=5, name="record5", color="black", number=4598},
    mnesia:delete({table1,1}),
    mnesia:write(Row).


But I absolutely don't want to delete record #1 without inserting record #5. Moreover, I don't want someone else to perform some operation on record #1 between my deletion and my insertion. In other words, I want the atomicity and the isolation of the two operations preserved on all nodes involved. For this purpose, all I need to do is embed my atomic_op function inside the mnesia:transaction function as shown below:

op() ->
    F = fun atomic_op/0,
    {atomic, Val} = mnesia:transaction(F),
    Val.


Here you can see the Erlang mechanism that lets me bind the variable F to my function atomic_op (fun atomic_op/0) so that I can pass it as an argument to mnesia:transaction. Running the two operations this way guarantees not only their atomicity and isolation, but also their consistency and durability.

The mnesia:transaction function manages all the concurrency issues that affect transactions, such as setting and releasing locks, so you don't have to. Mnesia also avoids deadlocks: a transaction that cannot acquire a lock is forced to release all the locks it already holds and is then restarted. Because the transaction function may therefore run several times, code with side effects, such as a print on the screen (io:format), can produce many repeated messages on standard output before the transaction finally succeeds.
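One way to avoid repeated output is to keep side effects out of the transaction function and act on the result afterwards. The following sketch shows that pattern. The module name tx_demo and its functions are hypothetical, and for the sake of a self-contained example it assumes a RAM-only table1 on the local node rather than the replicated disc-based table used in this article.

```erlang
-module(tx_demo).
-export([swap/0]).

-record(table1, {table1_id, name, color, number}).

%% Assumption: a local RAM-only table1 is enough for this demo.
%% If the table already exists, create_table returns {aborted, _},
%% which is deliberately ignored here.
ensure_table() ->
    ok = mnesia:start(),
    _ = mnesia:create_table(table1,
                            [{attributes, record_info(fields, table1)}]),
    ok = mnesia:wait_for_tables([table1], 5000).

%% Deletes record #1 and writes record #5 as one unit of work.
swap() ->
    ok = ensure_table(),
    Row = #table1{table1_id = 5, name = "record5",
                  color = "black", number = 4598},
    %% Only database operations inside the fun: it may run more than once.
    F = fun() ->
                ok = mnesia:delete({table1, 1}),
                mnesia:write(Row)
        end,
    case mnesia:transaction(F) of
        {atomic, ok} ->
            %% The side effect happens once, after the commit.
            io:format("swap committed~n"),
            ok;
        {aborted, Reason} ->
            {error, Reason}
    end.
```

Matching on both {atomic, Result} and {aborted, Reason} also avoids the badmatch crash that a bare {atomic, Val} = mnesia:transaction(F) would cause when a transaction is aborted.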

Dirty Operations

For those times when systems require faster access to data, Mnesia offers dirty functions that manipulate tables without transaction overhead. Using these functions boosts performance, making Mnesia a soft real-time DBMS. However, transaction atomicity and isolation are severely compromised.

The test.erl file in the code download contains the following functions:

  • op: a transaction function that deletes record #1 and inserts record #5
  • reverse_op: a transaction function that deletes record #5 and inserts record #1
  • mop: executes op and reverse_op a given number of times
  • dirty_op, dirty_reverse_op, and dirty_mop: the dirty versions of the above functions

First, I run test:mop(Counter) with Counter = 10000 on both the first@charlie and second@delta nodes, so that the function keeps working for a reasonable amount of time. The function runs and ends on both nodes without any consistency problems. If I concurrently run test:select() on third@nemo, I always receive a list with three records, which is what op and reverse_op should leave behind, because each of them applies its mnesia:delete and mnesia:write in one shot.

The same test concurrently running the dirty version, test:dirty_mop(Counter), on both nodes produces a lot of error messages. Also, sometimes the test:select() on third@nemo returns a list with two or four records.
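The article does not list dirty_op, so the following is only a sketch of what a dirty version of the delete-and-insert pair might look like. The module name dirty_demo is hypothetical, and the sketch again assumes a local RAM-only table1 so that it is self-contained. It uses mnesia:dirty_delete/1, mnesia:dirty_write/1 and mnesia:dirty_read/1.

```erlang
-module(dirty_demo).
-export([run/0]).

-record(table1, {table1_id, name, color, number}).

%% Hypothetical demo (not the article's dirty_op).
run() ->
    ok = mnesia:start(),
    %% Assumption: a local RAM-only table1; an {aborted, _} result for
    %% an already existing table is deliberately ignored.
    _ = mnesia:create_table(table1,
                            [{attributes, record_info(fields, table1)}]),
    ok = mnesia:wait_for_tables([table1], 5000),
    %% Two independent steps with no locks and no transaction around
    %% them: another process can read the table between the two calls.
    ok = mnesia:dirty_delete({table1, 1}),
    ok = mnesia:dirty_write(#table1{table1_id = 5, name = "record5",
                                    color = "black", number = 4598}),
    mnesia:dirty_read({table1, 5}).
```

The gap between the two calls is where a concurrent reader can see one record too few, which matches the inconsistent record counts described above.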

Query List Comprehensions

Mnesia doesn't have a SQL-like language for interacting with the data in its tables. Instead, you use Erlang itself as the database language, querying the database through Erlang's Query List Comprehension module (qlc). The qlc module builds on a powerful construct of the Erlang syntax called list comprehension.

When you want to define a set in mathematics, you use the set-builder notation that qualifies the elements of the set by stating their properties. Now, consider how this notation and the Erlang list comprehension syntax compare.

Let A be the set of natural numbers that are less than 10 and equal to their own squares. Here is the mathematical formula, in set-builder notation:

A = { x ∈ ℕ | x < 10, x = x² }

In Erlang, given that the lists:seq(0,9) function returns the list of the integers from 0 to 9, you obtain the above-mentioned list of elements with the following syntax:

1> A = [X || X <- lists:seq(0,9), X == X * X].
[0,1]


The similarity between the two formulas should stand out. You can also think about list comprehension as a tool to build lists from lists. Here is the general syntax, where Expression contains a set of operations on the elements generated and filtered by the qualifiers:

[Expression || Qualifier1, Qualifier2, ...]
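To make the roles of the expression and the qualifiers concrete, here are two small comprehensions in a hypothetical module named lc_demo. A qualifier is either a generator (Pattern <- List) or a filter (a boolean test), and with several generators the rightmost one varies fastest.

```erlang
-module(lc_demo).
-export([odd_squares/0, pairs/0]).

%% One generator and one filter: squares of the odd numbers from 1 to 5.
odd_squares() ->
    [X * X || X <- lists:seq(1, 5), X rem 2 =:= 1].

%% Two generators and one filter: every {X, Y} pair except those with X = 2.
pairs() ->
    [{X, Y} || X <- [1, 2, 3], Y <- [a, b], X =/= 2].
```

Calling lc_demo:odd_squares() returns [1,9,25], and lc_demo:pairs() returns [{1,a},{1,b},{3,a},{3,b}].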


Now you understand how qlc provides an interface to query Mnesia databases. Before using this module, however, you have to include the header file qlc.hrl in your code. It ships with the stdlib application, so -include_lib can locate it without an absolute path:

-include_lib("stdlib/include/qlc.hrl").


In my file, test.erl, the function test:select() returns a list with all table1 records. If you pay attention to this line:

Handle = qlc:q([X || X <- mnesia:table(table1)])


you will realize that the argument for the function qlc:q is just a list comprehension whose generator is the function mnesia:table, which returns a query handle for the content of the table. The function qlc:q in turn returns a query handle that has to be evaluated by qlc:e, which collects and returns all table data in a list.

QueryList = qlc:e(Handle).


Note that all these functions act inside a mnesia:transaction for the reasons explained previously.
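Putting the pieces together, a query with a filter might look like the sketch below. It is not the article's test:select: the module name qlc_demo and the function orange_names/0 are hypothetical, and the sketch assumes a local RAM-only table1 that it fills with two sample rows so that it can run on its own.

```erlang
-module(qlc_demo).
-export([orange_names/0]).

%% include_lib resolves the path relative to the stdlib application.
-include_lib("stdlib/include/qlc.hrl").

-record(table1, {table1_id, name, color, number}).

%% Hypothetical demo: returns the names of all table1 rows whose
%% color is "orange".
orange_names() ->
    ok = mnesia:start(),
    %% Assumption: a local RAM-only table1; an {aborted, _} result for
    %% an already existing table is deliberately ignored.
    _ = mnesia:create_table(table1,
                            [{attributes, record_info(fields, table1)}]),
    ok = mnesia:wait_for_tables([table1], 5000),
    F = fun() ->
                ok = mnesia:write(#table1{table1_id = 2, name = "record2",
                                          color = "orange", number = 2367}),
                ok = mnesia:write(#table1{table1_id = 3, name = "record3",
                                          color = "red", number = 7834}),
                %% Build the query handle, then evaluate it.
                Handle = qlc:q([X#table1.name || X <- mnesia:table(table1),
                                                 X#table1.color =:= "orange"]),
                qlc:e(Handle)
        end,
    {atomic, Names} = mnesia:transaction(F),
    Names.
```

The filter plays the role of a WHERE clause, and the expression before the || selects which part of each row ends up in the result list.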

You can see a small example of a join between the tables table1 and table2 in the function test:join(). Here, the query list comprehension is:

Handle = qlc:q([X#table1.number || X <- mnesia:table(table1),
                                   Y <- mnesia:table(table2),
                                   X#table1.number > 2000,
                                   X#table1.table1_id =:= Y#table2.table2_id
])


For all intents and purposes, records are tuples with named fields, which have to be declared in a directive (a line whose first character is a minus sign). To access one field of a record, you can use the dot syntax. For example, X#table1.number returns the value of the field number, provided that X evaluates to a table1 record.
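A short sketch can show both sides of this: the record syntax and the plain tuple underneath it. The module name record_demo and its functions are hypothetical, and the sample values are taken from the table1 rows shown earlier.

```erlang
-module(record_demo).
-export([sample/0, number_of/1]).

-record(table1, {table1_id, name, color, number}).

%% Builds a table1 record using named fields.
sample() ->
    #table1{table1_id = 3, name = "record3", color = "red", number = 7834}.

%% Reads one field with the dot syntax. Row must be a table1 record.
number_of(Row) ->
    Row#table1.number.
```

Calling record_demo:number_of(record_demo:sample()) returns 7834, and record_demo:sample() on its own is the tuple {table1,3,"record3","red",7834}, which is the same shape that test:select() prints.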

Explore Mnesia

I have highlighted some of Mnesia's important features, but certainly not all. Hopefully, what you have learned will arouse your curiosity and inspire you to investigate Mnesia further. I invite you to take a deeper look into Mnesia's other features in order to discover its full potential.

Code Download

  • test.erl

For Further Reading

  • Erlang online documentation
  • MNESIA Database Management System (presentation)

About the Author

Roberto Giorgetti is an IT manager and technical writer based in Italy. He is mainly interested in open source technology in business and industrial areas. Roberto holds a degree in Nuclear Engineering.
