This chapter describes the Mnesia transaction system and the transaction properties which make Mnesia a fault tolerant, distributed database management system.
This chapter also covers the locking functions, including table locks and sticky locks, as well as alternative functions which bypass the transaction system in favor of improved speed and reduced overhead. These functions are called "dirty operations". We also describe the usage of nested transactions. This chapter contains the following sections:
Transactions are an important tool when designing fault-tolerant, distributed systems. A Mnesia transaction is a mechanism by which a series of database operations can be executed as one functional block. The functional block which is run as a transaction is called a Functional Object (Fun), and this code can read, write, or delete Mnesia records. The Fun is evaluated as a transaction which either commits or aborts. If the transaction succeeds in executing the Fun, it replicates the action on all nodes involved; if an error occurs, it aborts.
The following example shows a transaction which raises the salary of employees with certain employee numbers.
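As a minimal sketch of such a transaction, assuming a hypothetical `employee` record with the fields `emp_no`, `name` and `salary` (the module name `raise_demo` and the RAM-only setup are illustration choices, not part of the original text):

```erlang
-module(raise_demo).
-export([setup/0, raise/2, salary/1]).

%% Hypothetical record; the field list is an assumption for this sketch.
-record(employee, {emp_no, name, salary}).

%% Start Mnesia with a RAM-only schema and create a small employee table.
setup() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(employee,
        [{attributes, record_info(fields, employee)}]),
    mnesia:transaction(fun() ->
        mnesia:write(#employee{emp_no = 1, name = "Ann", salary = 5})
    end).

%% Read the record under a write lock, add Raise, and write it back.
%% Either both steps take effect or neither does.
raise(Eno, Raise) ->
    F = fun() ->
            [E] = mnesia:read(employee, Eno, write),
            New = E#employee{salary = E#employee.salary + Raise},
            mnesia:write(New)
        end,
    mnesia:transaction(F).

%% Helper for inspecting the result.
salary(Eno) ->
    {atomic, [E]} =
        mnesia:transaction(fun() -> mnesia:read(employee, Eno) end),
    E#employee.salary.
```

If the employee number does not exist, the match `[E] = ...` fails inside the Fun and the whole transaction aborts, so no partial update is left behind.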
The transaction
The Mnesia transaction system facilitates the construction of reliable, distributed systems by providing the following important properties:
Atomicity means that database changes which are executed by a transaction take effect on all nodes involved, or on none of the nodes. In other words, the transaction either succeeds entirely, or it fails entirely.
Atomicity is particularly important when we want to
atomically write more than one record in the same
transaction. The
Mnesia is a distributed DBMS where data can be replicated on several nodes. In many such applications, it is important that a series of write operations is performed atomically inside a transaction. The atomicity property ensures that a transaction takes effect on all nodes, or on none at all.
Consistency. This transaction property ensures that a transaction always leaves the DBMS in a consistent state. For example, Mnesia ensures that inconsistencies will not occur if Erlang, Mnesia or the computer crashes while a write operation is in progress.
Isolation. This transaction property ensures that transactions which execute on different nodes in a network, and access and manipulate the same data records, will not interfere with each other.
The isolation property makes it possible to concurrently execute
the
The isolation property is extremely useful in the following situation: an employee (with employee number 123) exists, and two processes (P1 and P2) concurrently try to raise the salary of that employee. The initial value of the employee's salary is, for example, 5. Process P1 starts to execute, reads the employee record, and adds 2 to the salary. At this point, P1 is for some reason preempted and P2 gets the opportunity to run. P2 reads the record, adds 3 to the salary, and finally writes a new employee record with the salary set to 8. Now P1 starts to run again and writes its employee record with the salary set to 7, thus effectively overwriting and undoing the work performed by P2. The update performed by P2 is lost.
A transaction system makes it possible to concurrently execute two or more processes which manipulate the same record. The programmer does not need to check that the updates are synchronized; this is overseen by the transaction handler. All programs accessing the database through the transaction system may be written as if they had sole access to the data.
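The scenario above can be sketched as follows. The `staff` record and the module name are hypothetical stand-ins for the employee table, and the sketch assumes a single node with a RAM-only table. Two processes raise the same salary concurrently, each inside its own transaction:

```erlang
-module(isolation_demo).
-export([run/0]).

%% Hypothetical record standing in for the employee table.
-record(staff, {emp_no, salary}).

%% The write lock taken by mnesia:read/3 serializes concurrent
%% raises of the same record, so no update is lost.
raise(Eno, Raise) ->
    mnesia:transaction(fun() ->
        [E] = mnesia:read(staff, Eno, write),
        mnesia:write(E#staff{salary = E#staff.salary + Raise})
    end).

run() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(staff,
        [{attributes, record_info(fields, staff)}]),
    {atomic, ok} = mnesia:transaction(fun() ->
        mnesia:write(#staff{emp_no = 123, salary = 5})
    end),
    Parent = self(),
    %% P1 adds 2 and P2 adds 3, concurrently.
    Pids = [spawn(fun() -> Parent ! {self(), raise(123, R)} end)
            || R <- [2, 3]],
    [receive {Pid, {atomic, ok}} -> ok end || Pid <- Pids],
    {atomic, [#staff{salary = S}]} =
        mnesia:transaction(fun() -> mnesia:read(staff, 123) end),
    S.
```

Whichever process runs first, both raises are applied and `run/0` returns 10 rather than 7 or 8.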
Durability. This transaction property ensures that changes made to the DBMS by a transaction are permanent. Once a transaction has been committed, all changes made to the database are durable - i.e. they are written safely to disc and will not be corrupted or disappear.
The durability feature described does not entirely apply to situations where Mnesia is configured as a "pure" primary memory database.
Different transaction managers employ different strategies to satisfy the isolation property. Mnesia uses the standard technique of two-phase locking. This means that locks are set on records before they are read or written. Mnesia uses five different kinds of locks.
Mnesia employs a strategy whereby functions such as
Deadlocks can occur when concurrent processes set and release locks on the same records. Mnesia employs a "wait-die" strategy to resolve these situations. If Mnesia suspects that a deadlock can occur when a transaction tries to set a lock, the transaction is forced to release all its locks and sleep for a while. The Fun in the transaction is then evaluated again.
For this reason, it is important that the code inside the Fun given to
This transaction could write the text
The Mnesia programmer cannot prioritize one particular transaction to execute before other transactions which are waiting to execute. As a result, the Mnesia DBMS transaction system is not suitable for hard real-time applications. However, Mnesia contains other features that have real-time properties.
Mnesia dynamically sets and releases locks as
transactions execute, therefore, it is very dangerous to execute code with
transaction side-effects. In particular, a
If a transaction terminates abnormally, Mnesia will automatically release the locks held by the transaction.
We have shown examples of a number of functions that can be used inside a transaction. The following list shows the simplest Mnesia functions that work with transactions. It is important to realize that these functions must be embedded in a transaction. If no enclosing transaction (or other enclosing Mnesia activity) exists, they will all fail.
As previously stated, the locking strategy used by Mnesia is to lock one record when we read a record, and lock all replicas of a record when we write a record. However, there are applications which use Mnesia mainly for its fault-tolerant qualities, and these applications may be configured with one node doing all the heavy work, and a standby node which is ready to take over in case the main node fails. Such applications may benefit from using sticky locks instead of the normal locking scheme.
A sticky lock is a lock which stays in place at a node after the transaction which first acquired the lock has terminated. To illustrate this, assume that we execute the following transaction:
F = fun() ->
        mnesia:write(#foo{a = kalle})
    end,
mnesia:transaction(F).
The
Normal locking requires:
If we use sticky locks, we must first change the code as follows:
F = fun() ->
        mnesia:s_write(#foo{a = kalle})
    end,
mnesia:transaction(F).
This code uses the
It is much more efficient to set a local lock than it is to set a networked lock, and for this reason sticky locks can benefit applications that use a replicated table and perform most of the work on only one of the nodes.
If a record is stuck at node
Mnesia supports read and write locks on whole tables as a complement to the normal locks on single records. As previously stated, Mnesia sets and releases locks automatically, and the programmer does not have to code these operations. However, transactions which read and write a large number of records in a specific table will execute more efficiently if we start the transaction by setting a table lock on this table. This will block other concurrent transactions from the table. The following two functions are used to set explicit table locks for read and write operations:
Alternate syntax for acquisition of table locks is as follows:
mnesia:lock({table, Tab}, read)
mnesia:lock({table, Tab}, write)
The matching operations in Mnesia may either lock the entire table or just a single record (when the key is bound in the pattern).
Write locks are normally acquired on all nodes where a replica of the table resides (and is active). Read locks are acquired on one node (the local one if a local replica exists).
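A minimal sketch of a bulk update that starts by taking a table lock, assuming a hypothetical `reading` record and a single-node, RAM-only table:

```erlang
-module(table_lock_demo).
-export([run/0]).

%% Hypothetical record used only for this sketch.
-record(reading, {id, value}).

run() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(reading,
        [{attributes, record_info(fields, reading)}]),
    {atomic, ok} = mnesia:transaction(fun() ->
        lists:foreach(
            fun(I) -> mnesia:write(#reading{id = I, value = I}) end,
            lists:seq(1, 100))
    end),
    %% Take one write lock on the whole table up front; the
    %% per-record reads and writes below are then covered by it.
    Double = fun() ->
        mnesia:lock({table, reading}, write),
        lists:foreach(
            fun(Key) ->
                [R] = mnesia:read(reading, Key),
                mnesia:write(R#reading{value = R#reading.value * 2})
            end,
            mnesia:all_keys(reading))
    end,
    {atomic, ok} = mnesia:transaction(Double),
    {atomic, [#reading{value = V}]} =
        mnesia:transaction(fun() -> mnesia:read(reading, 100) end),
    V.
```

The trade-off is concurrency: while the table lock is held, other transactions that need the table have to wait.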
The function
mnesia:lock({global, GlobalKey, Nodes}, LockKind)
LockKind ::= read | write | ...
The lock is acquired on the LockItem on all Nodes in the nodes list.
In many applications, the overhead of processing a transaction may result in a loss of performance. Dirty operations are shortcuts which bypass much of the processing and increase the speed of the operation.
Dirty operations are useful in many situations, for example in a datagram routing application where Mnesia stores the routing table and it is too time consuming to start a whole transaction every time a packet is received. For this reason, Mnesia has functions which manipulate tables without using transactions. This alternative to transaction processing is known as a dirty operation. However, it is important to realize the trade-off in avoiding the overhead of transaction processing:
The major advantage of dirty operations is that they execute much faster than equivalent operations that are processed as functional objects within a transaction.
Dirty operations
are written to disc if they are performed on a table of type
A dirty operation will ensure a certain level of consistency. For example, it is not possible for dirty operations to return garbled records. Hence, each individual read or write operation is performed in an atomic manner.
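As a minimal sketch in the spirit of the routing example, assuming a hypothetical `route` record and a RAM-only table, the dirty functions can be called directly without any enclosing transaction:

```erlang
-module(dirty_demo).
-export([run/0]).

%% Hypothetical routing record used only for this sketch.
-record(route, {dest, next_hop}).

run() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(route,
        [{attributes, record_info(fields, route)}]),
    %% No transaction and no locks: each call is atomic on its own,
    %% but the three calls together are not.
    ok = mnesia:dirty_write(#route{dest = "10.0.0.0/8", next_hop = "gw1"}),
    [#route{next_hop = Hop}] = mnesia:dirty_read(route, "10.0.0.0/8"),
    ok = mnesia:dirty_delete(route, "10.0.0.0/8"),
    {Hop, mnesia:dirty_read(route, "10.0.0.0/8")}.
```

Here `run/0` returns `{"gw1", []}`: the record is visible immediately after the write and gone after the delete.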
All dirty functions execute a call to
Records in
If there are no records at all in the table, this function
will return the atom
Returns the list of records that are associated with Slot
in a table. It can be used to traverse a table in a manner
similar to the
The behavior of this function is undefined if the
table is written on while being
traversed.
Counters are positive integers with a value greater than or
equal to zero. Updating a counter will add the
There exists no special counter records in
Mnesia. However, records on the form of
It is not possible to have transaction protected updates of counter records.
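A minimal sketch of a counter update, assuming a hypothetical `hits` table whose records have the two-attribute form `{hits, Key, Integer}`:

```erlang
-module(counter_demo).
-export([run/0]).

run() ->
    ok = mnesia:start(),
    %% A counter table holds records of the form {Tab, Key, Integer}.
    {atomic, ok} = mnesia:create_table(hits, [{attributes, [key, value]}]),
    ok = mnesia:dirty_write({hits, page, 0}),
    %% Each call adds the increment and returns the new value.
    1 = mnesia:dirty_update_counter(hits, page, 1),
    6 = mnesia:dirty_update_counter(hits, page, 5),
    mnesia:dirty_read(hits, page).
```

The read at the end returns `[{hits, page, 6}]`.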
There are two significant differences when using this function instead of reading the record, performing the arithmetic, and writing the record:
In Mnesia, all records in a table must have the same name. All
the records must be instances of the same
record type. The record name does not, however, have to be
the same as the table name, even though that is the case in
most of the examples in this document. If a table is created
without the
mnesia:create_table(subscriber, [])
However, if the table is created with an explicit record name as argument, as shown below, it is possible to store subscriber records in both of the tables regardless of the table names:
TabDef = [{record_name, subscriber}],
mnesia:create_table(my_subscriber, TabDef),
mnesia:create_table(your_subscriber, TabDef).
In order to access such
tables it is not possible to use the simplified access functions
as described earlier in the document. For example,
writing a subscriber record into a table requires a
mnesia:write(subscriber, #subscriber{}, write)
mnesia:write(my_subscriber, #subscriber{}, sticky_write)
mnesia:write(your_subscriber, #subscriber{}, write)
The following simplified piece of code illustrates the relationship between the simplified access functions used in most examples and their more flexible counterparts:
mnesia:dirty_write(Record) ->
    Tab = element(1, Record),
    mnesia:dirty_write(Tab, Record).

mnesia:dirty_delete({Tab, Key}) ->
    mnesia:dirty_delete(Tab, Key).

mnesia:dirty_delete_object(Record) ->
    Tab = element(1, Record),
    mnesia:dirty_delete_object(Tab, Record).

mnesia:dirty_update_counter({Tab, Key}, Incr) ->
    mnesia:dirty_update_counter(Tab, Key, Incr).

mnesia:dirty_read({Tab, Key}) ->
    mnesia:dirty_read(Tab, Key).

mnesia:dirty_match_object(Pattern) ->
    Tab = element(1, Pattern),
    mnesia:dirty_match_object(Tab, Pattern).

mnesia:dirty_index_match_object(Pattern, Attr) ->
    Tab = element(1, Pattern),
    mnesia:dirty_index_match_object(Tab, Pattern, Attr).

mnesia:write(Record) ->
    Tab = element(1, Record),
    mnesia:write(Tab, Record, write).

mnesia:s_write(Record) ->
    Tab = element(1, Record),
    mnesia:write(Tab, Record, sticky_write).

mnesia:delete({Tab, Key}) ->
    mnesia:delete(Tab, Key, write).

mnesia:s_delete({Tab, Key}) ->
    mnesia:delete(Tab, Key, sticky_write).

mnesia:delete_object(Record) ->
    Tab = element(1, Record),
    mnesia:delete_object(Tab, Record, write).

mnesia:s_delete_object(Record) ->
    Tab = element(1, Record),
    mnesia:delete_object(Tab, Record, sticky_write).

mnesia:read({Tab, Key}) ->
    mnesia:read(Tab, Key, read).

mnesia:wread({Tab, Key}) ->
    mnesia:read(Tab, Key, write).

mnesia:match_object(Pattern) ->
    Tab = element(1, Pattern),
    mnesia:match_object(Tab, Pattern, read).

mnesia:index_match_object(Pattern, Attr) ->
    Tab = element(1, Pattern),
    mnesia:index_match_object(Tab, Pattern, Attr, read).
As previously described, a functional object (Fun) performing
table access operations as listed below may be
passed on as arguments to the function
mnesia:write/3 (write/1, s_write/1)
mnesia:delete/3 (delete/1, s_delete/1)
mnesia:delete_object/3 (delete_object/1, s_delete_object/1)
mnesia:read/3 (read/1, wread/1)
mnesia:match_object/3 (match_object/1)
mnesia:select/3 (select/2)
mnesia:foldl/3 (foldl/4, foldr/3, foldr/4)
mnesia:all_keys/1
mnesia:index_match_object/4 (index_match_object/2)
mnesia:index_read/3
mnesia:lock/2 (read_lock_table/1, write_lock_table/1)
mnesia:table_info/2
These functions will be performed in a
transaction context involving mechanisms like locking, logging,
replication, checkpoints, subscriptions, commit protocols,
etc. However, the same function may also be
evaluated in other activity contexts.
The following activity access contexts are currently supported:
transaction
sync_transaction
async_dirty
sync_dirty
ets
By passing the same "fun" as argument to the function
By passing the same "fun" as argument to the function
By passing the same "fun" as an argument to the function
You can check if your code is executed within a transaction with
Mnesia tables with storage type ram_copies and disc_copies
are implemented internally as "ets-tables" and
it is possible for applications to access these tables
directly. This is only recommended if all options have been weighed
and the possible outcomes are understood. By passing the earlier
mentioned "fun" to the function
The Fun may also be passed as an argument to the function
The callback module does
not have to access real Mnesia tables, it is free to do whatever
it likes as long as the callback interface is fulfilled.
In Appendix C "The Activity Access Call Back Interface" the source
code for one alternate implementation is provided
(mnesia_frag.erl). The context sensitive function
QLC queries may be performed in all these activity contexts (transaction, sync_transaction, async_dirty, sync_dirty and ets). The ets activity will only work if the table has no indices.
The mnesia:dirty_* functions always execute with async_dirty semantics, regardless of the activity access context in which they are invoked. They may even be invoked without any enclosing activity access context.
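As a minimal sketch of how one Fun can be evaluated in each of the contexts listed above, assuming a hypothetical `setting` record and a RAM-only table, the following uses `mnesia:activity/2` and reports what `mnesia:is_transaction/0` sees in each case:

```erlang
-module(activity_demo).
-export([run/0]).

%% Hypothetical record used only for this sketch.
-record(setting, {key, value}).

run() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(setting,
        [{attributes, record_info(fields, setting)}]),
    ok = mnesia:dirty_write(#setting{key = mode, value = fast}),
    %% One Fun, evaluated unchanged in several access contexts.
    F = fun() ->
            [#setting{value = V}] = mnesia:read(setting, mode),
            {V, mnesia:is_transaction()}
        end,
    [{Ctx, mnesia:activity(Ctx, F)}
     || Ctx <- [transaction, sync_transaction,
                async_dirty, sync_dirty, ets]].
```

Unlike `mnesia:transaction/1`, `mnesia:activity/2` returns the result of the Fun directly and exits if the activity fails, so the caller does not unwrap an `{atomic, Result}` tuple.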
Transactions may be nested in an arbitrary fashion. A child transaction
must run in the same process as its parent. When a child transaction
aborts, the caller of the child transaction will get the
return value
No locks are released when child transactions terminate. Locks
created by a sequence of nested transactions are kept until
the topmost transaction terminates. Furthermore, any updates
performed by a nested transaction are only propagated
in such a manner that the parent of the nested transaction
sees the updates. No final commitment is done until
the top level transaction terminates.
So, although a nested transaction returns
The ability to have nested transactions with the same semantics as top level transactions makes it easier to write library functions that manipulate Mnesia tables.
Say for example that we have a function that adds a new subscriber to a telephony system:
add_subscriber(S) ->
    mnesia:transaction(fun() ->
        case mnesia:read( ..........
This function needs to be called as a transaction.
Now assume that we wish to write a function that
both calls the
It is also possible to mix different activity access contexts while nesting, but the dirty ones (async_dirty, sync_dirty and ets) will inherit the transaction semantics if they are called inside a transaction, and thus they will grab locks and use two or three phase commit.
add_subscriber(S) ->
    mnesia:transaction(fun() ->
       %% Transaction context
       mnesia:read({some_tab, some_data}),
       mnesia:sync_dirty(fun() ->
           %% Still in a transaction context.
           case mnesia:read( ..) ..end), end).
add_subscriber2(S) ->
    mnesia:sync_dirty(fun() ->
       %% In dirty context
       mnesia:read({some_tab, some_data}),
       mnesia:transaction(fun() ->
           %% In a transaction context.
           case mnesia:read( ..) ..end), end).
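A minimal sketch of the nesting behavior described in this section, assuming a hypothetical `subscriber` record and a RAM-only table. The parent transaction writes one record, then starts a child transaction that writes another record and aborts:

```erlang
-module(nested_demo).
-export([run/0]).

%% Hypothetical record used only for this sketch.
-record(subscriber, {number, name}).

run() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(subscriber,
        [{attributes, record_info(fields, subscriber)}]),
    Outer = fun() ->
        ok = mnesia:write(#subscriber{number = 1, name = "kept"}),
        %% The child writes a record and then aborts; only the
        %% child's work is discarded and the parent carries on.
        mnesia:transaction(fun() ->
            ok = mnesia:write(#subscriber{number = 2, name = "dropped"}),
            mnesia:abort(changed_my_mind)
        end)
    end,
    {atomic, InnerResult} = mnesia:transaction(Outer),
    {InnerResult, lists:sort(mnesia:dirty_all_keys(subscriber))}.
```

The parent sees `{aborted, changed_my_mind}` as an ordinary return value and still commits, so only key 1 ends up in the table.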
When it is not possible to use
mnesia:select(Tab, MatchSpecification, LockKind) ->
transaction abort | [ObjectList]
mnesia:select(Tab, MatchSpecification, NObjects, Lock) ->
transaction abort | {[Object],Continuation} | '$end_of_table'
mnesia:select(Cont) ->
transaction abort | {[Object],Continuation} | '$end_of_table'
mnesia:match_object(Tab, Pattern, LockKind) ->
transaction abort | RecordList
These functions match a
The pattern provided to the functions must be a valid record,
and the first element of the provided tuple must be the
Use the function
Wildpattern = mnesia:table_info(employee, wild_pattern),
%% Or use
Wildpattern = #employee{_ = '_'},
For the employee table the wild pattern will look like:
{employee, '_', '_', '_', '_', '_', '_'}.
In order to constrain the match you must replace some
of the
Pat = #employee{sex = female, _ = '_'},
F = fun() -> mnesia:match_object(Pat) end,
Females = mnesia:transaction(F).
It is also possible to use the match function if we want to check the equality of different attributes. Assume that we want to find all employees who happen to have an employee number which is equal to their room number:
Pat = #employee{emp_no = '$1', room_no = '$1', _ = '_'},
F = fun() -> mnesia:match_object(Pat) end,
Odd = mnesia:transaction(F).
The function
Select can be used to add additional constraints and create
output which can not be done with
The second argument to select is a
A detailed explanation of match specifications can be found in the ERTS User's Guide (Match Specifications in Erlang), and the ets and dets documentation may provide some additional information.
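As a minimal sketch of a match specification passed to `mnesia:select/2`, assuming a hypothetical `person` record that stands in for the employee table, the following selects the names of all females with a salary above 10:

```erlang
-module(select_demo).
-export([run/0]).

%% Hypothetical record standing in for the employee table.
-record(person, {emp_no, name, sex, salary}).

run() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(person,
        [{attributes, record_info(fields, person)}]),
    People = [#person{emp_no = 1, name = "Ann",  sex = female, salary = 12},
              #person{emp_no = 2, name = "Bob",  sex = male,   salary = 15},
              #person{emp_no = 3, name = "Cleo", sex = female, salary = 8}],
    {atomic, ok} = mnesia:transaction(fun() ->
        lists:foreach(fun mnesia:write/1, People)
    end),
    %% A match specification is a list of {MatchHead, Guards, Result}.
    MatchHead = #person{name = '$1', sex = female, salary = '$2', _ = '_'},
    Guard = {'>', '$2', 10},
    Result = '$1',
    {atomic, Names} = mnesia:transaction(fun() ->
        mnesia:select(person, [{MatchHead, [Guard], [Result]}])
    end),
    Names.
```

The guard and the result expression are what a plain pattern cannot express: the pattern alone can bind `'$2'`, but not compare it with 10 or return only the name.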
The functions
There is a severe performance penalty in using
If the key attribute is bound in a pattern, the match operation
is very efficient. However, if the key attribute in a pattern is
given as
QLC queries can also be used to search Mnesia tables. By
using
If no options are specified, a read lock will be acquired, 100 results will be returned in each chunk, and select will be used to traverse the table, i.e.:
mnesia:table(Tab) ->
    mnesia:table(Tab, [{n_objects, 100}, {lock, read}, {traverse, select}]).
The function
Mnesia provides a couple of functions which iterate over all the records in a table.
mnesia:foldl(Fun, Acc0, Tab) -> NewAcc | transaction abort
mnesia:foldr(Fun, Acc0, Tab) -> NewAcc | transaction abort
mnesia:foldl(Fun, Acc0, Tab, LockType) -> NewAcc | transaction abort
mnesia:foldr(Fun, Acc0, Tab, LockType) -> NewAcc | transaction abort
These functions iterate over the mnesia table
The first time the
The difference between
These functions might be used to find records in a table
when it is impossible to write constraints for
For example, finding all the employees who have a salary below 10 could look like:
Constraint =
    fun(Emp, Acc) when Emp#employee.salary < 10 ->
           [Emp | Acc];
       (_, Acc) ->
           Acc
    end,
Find = fun() -> mnesia:foldl(Constraint, [], employee) end,
mnesia:transaction(Find).
Raising the salary to 10 for everyone with a salary below 10 and returning the sum of all raises:
Increase =
    fun(Emp, Acc) when Emp#employee.salary < 10 ->
           OldS = Emp#employee.salary,
           ok = mnesia:write(Emp#employee{salary = 10}),
           Acc + 10 - OldS;
       (_, Acc) ->
           Acc
    end,
IncLow = fun() -> mnesia:foldl(Increase, 0, employee, write) end,
mnesia:transaction(IncLow).
The iterator functions are versatile, but take care with performance and memory utilization when using them on large tables.
Call these iteration functions on nodes that contain a replica of the
table. Each call to the function
Mnesia also provides some functions that make it possible for
the user to iterate over the table. The order of the
iteration is unspecified if the table is not of the
mnesia:first(Tab) -> Key | transaction abort
mnesia:last(Tab) -> Key | transaction abort
mnesia:next(Tab,Key) -> Key | transaction abort
mnesia:prev(Tab,Key) -> Key | transaction abort
mnesia:snmp_get_next_index(Tab,Index) -> {ok, NextIndex} | endOfTable
The order of first/last and next/prev are only valid for
If records are written and deleted during the traversal, use
Writing or deleting in transaction context creates a local copy of each modified record, so modifying each record in a large table uses a lot of memory. Mnesia will compensate for every written or deleted record during the iteration in a transaction context, which may reduce the performance. If possible, avoid writing or deleting records in the same transaction before iterating over the table.
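A minimal sketch of a key traversal with `mnesia:first/1` and `mnesia:next/2`, assuming a hypothetical `item` record stored in a RAM-only `ordered_set` table so that the key order is defined:

```erlang
-module(iterate_demo).
-export([run/0]).

%% Hypothetical record used only for this sketch.
-record(item, {key, value}).

run() ->
    ok = mnesia:start(),
    %% ordered_set gives first/next a defined key order.
    {atomic, ok} = mnesia:create_table(item,
        [{type, ordered_set}, {attributes, record_info(fields, item)}]),
    {atomic, ok} = mnesia:transaction(fun() ->
        lists:foreach(
            fun(K) -> mnesia:write(#item{key = K, value = K * K}) end,
            [3, 1, 2])
    end),
    {atomic, Keys} = mnesia:transaction(fun() ->
        walk(mnesia:first(item), [])
    end),
    Keys.

%% Follow next/2 until the end-of-table marker is returned.
walk('$end_of_table', Acc) -> lists:reverse(Acc);
walk(Key, Acc) -> walk(mnesia:next(item, Key), [Key | Acc]).
```

The traversal only reads keys and does not write or delete during the iteration, which keeps it clear of the compensation cost described above.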
In dirty context, i.e.