
Mnesia: A Distributed DBMS Rooted in Concurrency

  • February 10, 2010
  • By Roberto Giorgetti

Testing Mnesia Nodes

On the second@delta node, I run:

(second@delta)5> test:select().

Then I shut down the second@delta node, exiting from the shell.

On the first@charlie node, I insert a new record on table1:

(first@charlie)6> test:insert().
(first@charlie)7> test:select().

Now, I power on the second@delta node again and start the Mnesia engine with mnesia:start(). Then I extract table1 records:

(second@delta)2> test:select(). 

The table1 copy also contains the records inserted on the node first@charlie while second@delta was down. This means that the replica on second@delta synchronized itself transparently with the one on first@charlie during startup.

From this you can infer the meaning of the ram_copies and disc_only_copies keys for a table: ram_copies keeps the replica in RAM only, while disc_only_copies keeps it on disk only. For performance reasons, keeping a RAM copy of the table on a node is a good practice. However, a RAM-only copy does not make the data persistent. The crucial point to understand here is Mnesia's out-of-the-box mirroring of tables.
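As a rough illustration of the storage-type keys, the sketch below creates a table whose replica on the local node lives in RAM only. The module name storage_sketch, the table name demo, and its fields are hypothetical and not part of test.erl. Swapping ram_copies for disc_copies or disc_only_copies would select the other storage types, which need a disc-based schema on the node.

```erlang
-module(storage_sketch).
-export([create_ram_table/0]).

%% Hypothetical record, used only for this illustration.
-record(demo, {id, value}).

%% Create the table 'demo' with a RAM-only replica on the local node.
%% Mnesia must already be running (mnesia:start()).
%% Returns {atomic, ok} on success or {aborted, Reason} otherwise.
create_ram_table() ->
    mnesia:create_table(demo,
                        [{attributes, record_info(fields, demo)},
                         {ram_copies, [node()]}]).
```

After a successful call, mnesia:table_info(demo, storage_type) should report ram_copies for the local node.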

Adding a Mnesia Node

The third machine on my LAN is called nemo. I start the Erlang node third@nemo with this command and then start Mnesia on the node:

erl -sname third -setcookie AXHMTYGVJDNJIFGYADNJ

The system is so flexible that from one of the other two nodes I can add an additional node with this command:

(first@charlie)15> mnesia:change_config(extra_db_nodes,[third@nemo]).

As I wrote before, the Mnesia schema is a table (a special one, but ultimately a table like any other). So, the command above adds a RAM copy of the schema table on third@nemo. After this operation, you can access the remote tables table1 and table2 on first@charlie and second@delta from third@nemo. In fact, if I ask for information about the Mnesia system on nemo I receive:

(third@nemo)4> mnesia:system_info().
running db nodes   = [first@charlie,second@delta,third@nemo]
remote             = [table1]
ram_copies         = [schema]

Add Table Copies

Of course, you also can change the nature of the schema table on third@nemo on the fly with:

(first@charlie)16> mnesia:change_table_copy_type(schema, third@nemo, disc_copies).

Adding new copies of the schema and changing their nature are dynamic operations that could be applied to every table of a Mnesia schema.
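The same dynamic operations can be sketched for an ordinary table. The module and function names below are hypothetical. Only mnesia:add_table_copy/3 and mnesia:change_table_copy_type/3 are Mnesia's own calls. Treat this as a sketch that assumes Mnesia is running on every node involved.

```erlang
-module(replica_sketch).
-export([add_ram_replica/2, to_disc/2]).

%% Add a RAM replica of Tab on Node.
%% Returns {atomic, ok} or {aborted, Reason}.
add_ram_replica(Tab, Node) ->
    mnesia:add_table_copy(Tab, Node, ram_copies).

%% Change the existing replica of Tab on Node to keep a copy on disk.
%% This needs a disc-based schema on Node.
to_disc(Tab, Node) ->
    mnesia:change_table_copy_type(Tab, Node, disc_copies).
```

For example, replica_sketch:add_ram_replica(table1, third@nemo) would ask Mnesia to mirror table1 on the third node, and the call is expected to return {aborted, Reason} if the table is unknown or the replica already exists.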

Now you know how easy it is to start nodes, build a Mnesia engine over them, and add further nodes without stopping any service and in a completely transparent way. To improve performance and achieve a high degree of redundancy, each node could have its own replicas of both the schema and some other table(s).


Mnesia clearly is a true distributed DBMS (when you perform an operation on one copy of the schema, it is automatically propagated to all other replicas and fragments), but is it also a transactional database? Can you perform a transaction, namely a set of operations grouped in a unit of work that is atomic, consistent, isolated and durable (ACID), against a Mnesia database? The answer is yes, and the rest of this section explains how you implement this.

Think of the all-or-nothing operations that you have to execute on a database. For example, say I want to delete an existing record (#1) and insert a new record (#5), as shown in the function atomic_op here:

atomic_op() ->
    Row = #table1{table1_id=5, name="record5", color="black", number=4598},
    ok = mnesia:delete({table1, 1}),   % delete record #1
    mnesia:write(Row).                 % insert record #5

But I absolutely don't want to delete record #1 without inserting record #5. Moreover, I don't want someone else to perform some operation on record #1 between my deletion and my insertion. In other words, I want the atomicity and the isolation of the two operations preserved on all nodes involved. For this purpose, all I need to do is embed my atomic_op function inside the mnesia:transaction function as shown below:

op() ->
    F = fun atomic_op/0,
    {atomic, Val} = mnesia:transaction(F),
    Val.

You can see in action the Erlang mechanism that allows me to bind the variable F to my function atomic_op in order to give it as an argument to mnesia:transaction. This guarantees not only atomicity and isolation, but also consistency and durability for my two operations.

The mnesia:transaction function manages all the concurrency issues that affect transactions, such as setting and releasing locks, so you don't have to manage them yourself. Mnesia also avoids deadlocks: a transaction that cannot acquire a lock is forced to release all the locks it already holds, is put on hold, and is then restarted. Because the function passed to mnesia:transaction may therefore run several times, code with side effects inside it, such as a print on the screen (io:format), can produce a lot of repetitive messages on standard output before the transaction succeeds.
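One way to avoid the repeated output is to keep side effects out of the transaction function and act on its result instead. The sketch below assumes that table1 has the fields table1_id, name, color, and number in that order. The module name tx_sketch and the function swap are hypothetical and not taken from test.erl.

```erlang
-module(tx_sketch).
-export([swap/0]).

%% Assumed layout of table1, inferred from the fields used in atomic_op.
-record(table1, {table1_id, name, color, number}).

%% Delete record #1 and insert record #5 in a single transaction.
%% The fun contains only Mnesia calls, so a restart repeats no output.
swap() ->
    Row = #table1{table1_id = 5, name = "record5",
                  color = "black", number = 4598},
    F = fun() ->
            ok = mnesia:delete({table1, 1}),
            ok = mnesia:write(Row),
            done
        end,
    %% Printing happens once, after the transaction has finished.
    case mnesia:transaction(F) of
        {atomic, done} ->
            io:format("swap committed~n"),
            ok;
        {aborted, Reason} ->
            io:format("swap aborted: ~p~n", [Reason]),
            {error, Reason}
    end.
```

The message is printed exactly once per call, however many times Mnesia has to restart the fun.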

Dirty Operations

For those times when systems require faster access to data, Mnesia offers dirty functions that manipulate tables without transaction overhead. Using these functions boosts performance, making Mnesia a soft real-time DBMS. However, transaction atomicity and isolation are severely compromised.

The test.erl file in the code download contains the following functions:

  • op: a transaction function that deletes record #1 and inserts record #5
  • reverse_op: a transaction function that deletes record #5 and inserts record #1
  • mop: executes op and reverse_op a given number of times
  • dirty_op, dirty_reverse_op, and dirty_mop: the dirty versions of the above functions

First, I run test:mop(Counter) with Counter = 10000 on both the first@charlie and second@delta nodes, so that the function works for a reasonable amount of time. The function ran and ended on both nodes without any consistency problems. If I concurrently run test:select() on third@nemo, I always receive a list with three records, which is what both op and reverse_op must leave behind, because each applies mnesia:delete and mnesia:write in one shot.

The same test concurrently running the dirty version, test:dirty_mop(Counter), on both nodes produces a lot of error messages. Also, sometimes the test:select() on third@nemo returns a list with two or four records.
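To see why the record count can drift, consider a guess at what a dirty version of op might look like. The real dirty_op in test.erl is not reproduced in this article, so the module name dirty_sketch and the function body are assumptions. The calls mnesia:dirty_delete/1 and mnesia:dirty_write/1 are Mnesia's own.

```erlang
-module(dirty_sketch).
-export([dirty_op/0]).

%% Assumed layout of table1, inferred from the fields used in atomic_op.
-record(table1, {table1_id, name, color, number}).

%% Delete record #1 and insert record #5 without a transaction.
dirty_op() ->
    Row = #table1{table1_id = 5, name = "record5",
                  color = "black", number = 4598},
    %% Takes effect immediately: no locks are held across the two calls.
    ok = mnesia:dirty_delete({table1, 1}),
    %% A reader that runs at this point sees neither record #1 nor #5.
    ok = mnesia:dirty_write(Row).
```

Nothing ties the two calls together, so a concurrent test:select() can observe the table between them, and two nodes running the dirty functions at the same time can interleave their deletes and writes.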

Query List Comprehensions

Mnesia doesn't have a SQL-like language to interact with data in tables, but you can use Erlang itself as a database language through the Query List Comprehension module (qlc). The qlc module builds on a powerful construct of the Erlang syntax called list comprehension.

When you want to define a set in mathematics, you use the set-builder notation, which qualifies the elements of the set by stating their properties. Now, consider how this notation and the Erlang list comprehension syntax compare.

Let A be the set of natural numbers that are less than 10 and equal to their own squares. Here is the mathematical formula:

$A = \{\, x \in \mathbb{N} \mid x < 10,\ x = x^2 \,\}$

The Set-Builder Notation

In Erlang, given that the lists:seq(0,9) function returns the list of all integers from 0 to 9, you obtain the above-mentioned list of elements with the following syntax:

1> A = [X || X <- lists:seq(0,9), X == X * X].

The similarity between the two formulas should stand out. You can also think about list comprehension as a tool to build lists from lists. Here is the general syntax, where Expression contains a set of operations on the elements generated and filtered by the qualifiers:

[Expression || Qualifier1, Qualifier2, ...]
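As a small sketch of the general syntax, the module below uses one generator with a filter, and then two generators with a filter. The module and function names are hypothetical.

```erlang
-module(lc_sketch).
-export([squares_of_evens/1, pairs/1]).

%% One generator and one filter: squares of the even numbers in 0..N.
squares_of_evens(N) ->
    [X * X || X <- lists:seq(0, N), X rem 2 =:= 0].

%% Two generators and one filter: pairs {X, Y} with X =< Y and X + Y =:= N.
pairs(N) ->
    [{X, Y} || X <- lists:seq(1, N),
               Y <- lists:seq(X, N),
               X + Y =:= N].
```

Here lc_sketch:squares_of_evens(9) evaluates to [0,4,16,36,64] and lc_sketch:pairs(6) evaluates to [{1,5},{2,4},{3,3}]. The second function has the same shape as a join: every combination of the two generators is considered, and the filter keeps only the matching ones.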

Now you understand how qlc provides an interface to query Mnesia databases. Before using this module, however, you have to include its header file, qlc.hrl, which ships with the stdlib application. Put the following line in your code:

-include_lib("stdlib/include/qlc.hrl").

In my file, test.erl, the function test:select() returns a list with all table1 records. If you pay attention to this line:

Handle = qlc:q([X || X <- mnesia:table(table1)])

you will realize that the argument for the function qlc:q is just a list comprehension whose generator is the function mnesia:table, which exposes the content of the table to qlc. The function qlc:q returns a query handle that has to be evaluated by qlc:e, which collects and returns all table data in a list.

QueryList = qlc:e(Handle).

Note that all these functions act inside a mnesia:transaction for the reasons explained previously.
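Putting these pieces together, a function like test:select() might look roughly like the sketch below. The actual body in test.erl is not shown in this article, so the module name select_sketch and the exact structure are assumptions.

```erlang
-module(select_sketch).
-export([select/0]).

%% Needed for the qlc:q/1 parse transform.
-include_lib("stdlib/include/qlc.hrl").

%% Return all records of table1 as a list.
select() ->
    F = fun() ->
            %% Build the query handle, then evaluate it.
            Handle = qlc:q([X || X <- mnesia:table(table1)]),
            qlc:e(Handle)
        end,
    %% The query runs inside a transaction, like the write operations.
    {atomic, Rows} = mnesia:transaction(F),
    Rows.
```

The pattern match on {atomic, Rows} makes the function crash if the transaction is aborted. A case expression could be used instead if you prefer to return the abort reason to the caller.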

You can see a small example of a join between the tables table1 and table2 in the function test:join(). Here, the query list comprehension is:

Handle = qlc:q([X#table1.number || X <- mnesia:table(table1),
                                   Y <- mnesia:table(table2),
                                   X#table1.number > 2000,
                                   X#table1.table1_id =:= Y#table2.table2_id])

For all intents and purposes, records are tuples with named fields, which have to be declared in a -record directive (a directive is a line whose first character is a minus sign). To access one field of a record, you can use the dot syntax. For example, X#table1.number returns the value of the field number, provided that X evaluates to a table1 record.
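A minimal sketch of a record declaration and of the field-access syntax follows. The field order of table1 is assumed from the fields used in atomic_op, and the module name record_sketch is hypothetical.

```erlang
-module(record_sketch).
-export([number_of/1, demo/0]).

%% Record directive: declares the record table1 and its named fields.
-record(table1, {table1_id, name, color, number}).

%% Field access: X must evaluate to a table1 record.
number_of(X) ->
    X#table1.number.

%% Build a record, read one field, and compare the record with
%% the plain tuple that represents it at run time.
demo() ->
    R = #table1{table1_id = 5, name = "record5",
                color = "black", number = 4598},
    {number_of(R), R =:= {table1, 5, "record5", "black", 4598}}.
```

Calling record_sketch:demo() returns {4598,true}, which shows that a record is a tuple whose first element is the record name.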

Explore Mnesia

I have highlighted some of Mnesia's important features, but certainly not all. Hopefully, what you have learned will arouse your curiosity and inspire you to investigate Mnesia further. I invite you to take a deeper look into Mnesia's other features in order to discover its full potential.

Code Download

  • test.erl

For Further Reading

  • Erlang online documentation
  • MNESIA Database Management System (presentation)

About the Author

Roberto Giorgetti is an IT manager and technical writer based in Italy. He is mainly interested in open source technology in business and industrial areas. Roberto holds a degree in Nuclear Engineering.