1997-2009 Ericsson AB. All Rights Reserved.

The contents of this file are subject to the Erlang Public License, Version 1.1, (the "License"); you may not use this file except in compliance with the License. You should have received a copy of the Erlang Public License along with this software. If not, it can be retrieved online at http://www.erlang.org/. Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License.

Transactions and Other Access Contexts

Claes Wikström, Hans Nilsson and Håkan Mattsson

This chapter describes the Mnesia transaction system and the transaction properties which make Mnesia a fault tolerant, distributed database management system.

Also covered in this chapter are the locking functions, including table locks and sticky locks, as well as alternative functions which bypass the transaction system in favor of improved speed and reduced overheads. These functions are called "dirty operations". We also describe the usage of nested transactions. This chapter contains the following sections:

Transaction properties, which include atomicity, consistency, isolation, and durability

Locking

Dirty operations

Record names vs table names

Activity concept and various access contexts

Nested transactions

Pattern matching

Iteration

Transaction Properties

Transactions are an important tool when designing fault tolerant, distributed systems. A Mnesia transaction is a mechanism by which a series of database operations can be executed as one functional block. The functional block which is run as a transaction is called a Functional Object (Fun), and this code can read, write, or delete Mnesia records. The Fun is evaluated as a transaction which either commits, or aborts. If a transaction succeeds in executing Fun it will replicate the action on all nodes involved, or abort if an error occurs.

The following example shows a transaction which raises the salary of certain employee numbers.

The transaction raise(Eno, Raise) -> contains a Fun made up of four lines of code. This Fun is called by the statement mnesia:transaction(F) and returns a value.
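The listing that this paragraph describes is not reproduced at this point. The following is a minimal sketch of what such a function could look like. The module name is hypothetical, and the employee record layout is an assumption based on the company example from Chapter 2; only the emp_no and salary fields matter here. Mnesia must be running and the employee table must exist.

```erlang
-module(raise_sketch).
-export([raise/2]).

%% ASSUMPTION: record layout taken from the company example.
-record(employee, {emp_no, name, salary, sex, phone, room_no}).

%% Raise the salary of employee number Eno by Raise.
raise(Eno, Raise) ->
    F = fun() ->
                %% Read with a write lock, since the record is updated below.
                [E] = mnesia:read(employee, Eno, write),
                Salary = E#employee.salary + Raise,
                New = E#employee{salary = Salary},
                mnesia:write(New)
        end,
    %% Returns {atomic, ok} on commit, or {aborted, Reason} otherwise.
    mnesia:transaction(F).
```

If no employee with the given number exists, the match on [E] fails and the transaction returns {aborted, Reason} without changing the table.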

The Mnesia transaction system facilitates the construction of reliable, distributed systems by providing the following important properties:

The transaction handler ensures that a Fun which is placed inside a transaction does not interfere with operations embedded in other transactions when it executes a series of operations on tables.

The transaction handler ensures that either all operations in the transaction are performed successfully on all nodes atomically, or the transaction fails without permanent effect on any of the nodes.

The Mnesia transactions have four important properties, which we call Atomicity, Consistency, Isolation, and Durability, or ACID for short. These properties are described in the following sub-sections.

Atomicity

Atomicity means that database changes which are executed by a transaction take effect on all nodes involved, or on none of the nodes. In other words, the transaction either succeeds entirely, or it fails entirely.

Atomicity is particularly important when we want to atomically write more than one record in the same transaction. The raise/2 function, shown as an example above, writes one record only. The insert_emp/3 function, shown in the program listing in Chapter 2, writes the record employee as well as employee relations such as at_dep and in_proj into the database. If we run this latter code inside a transaction, then the transaction handler ensures that the transaction either succeeds completely, or not at all.

Mnesia is a distributed DBMS where data can be replicated on several nodes. In many such applications, it is important that a series of write operations are performed atomically inside a transaction. The atomicity property ensures that a transaction takes effect on all nodes, or on none at all.

Consistency

Consistency. This transaction property ensures that a transaction always leaves the DBMS in a consistent state. For example, Mnesia ensures that inconsistencies will not occur if Erlang, Mnesia or the computer crashes while a write operation is in progress.

Isolation

Isolation. This transaction property ensures that transactions which execute on different nodes in a network, and access and manipulate the same data records, will not interfere with each other.

The isolation property makes it possible to concurrently execute the raise/2 function. A classical problem in concurrency control theory is the so called "lost update problem".

The isolation property is extremely useful in a situation such as the following. Two processes, P1 and P2, concurrently try to raise the salary of the same employee (employee number 123). The initial value of the employee's salary is, for example, 5. Process P1 starts to execute, reads the employee record, and adds 2 to the salary. At this point, P1 is for some reason preempted and P2 has the opportunity to run. P2 reads the record, adds 3 to the salary, and finally writes a new employee record with the salary set to 8. Now P1 starts to run again and writes its employee record with the salary set to 7, thus effectively overwriting and undoing the work performed by P2. The update performed by P2 is lost.

A transaction system makes it possible to concurrently execute two or more processes which manipulate the same record. The programmer does not need to check that the updates are synchronized; this is overseen by the transaction handler. All programs accessing the database through the transaction system may be written as if they had sole access to the data.
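The arithmetic of the lost update scenario can be replayed with plain variables. The sketch below does not use Mnesia at all; it only mirrors the schedule described above (initial salary 5, P1 adds 2, P2 adds 3), and the module and function names are hypothetical.

```erlang
-module(lost_update_sketch).
-export([interleaved/0, serialized/0]).

%% The unprotected schedule: both processes read before either writes.
interleaved() ->
    Initial = 5,
    P1Seen = Initial,        % P1 reads salary 5 and is then preempted
    P2Seen = Initial,        % P2 reads the same value 5
    _P2Wrote = P2Seen + 3,   % P2 writes 8
    P1Wrote = P1Seen + 2,    % P1 resumes and writes 7, overwriting the 8
    P1Wrote.                 % final salary is 7: the raise of 3 is lost

%% With isolation, the raises behave as if run one after the other.
serialized() ->
    Initial = 5,
    AfterP1 = Initial + 2,   % P1 commits 7
    AfterP1 + 3.             % P2 then reads 7 and commits 10
```

Calling interleaved() yields 7, while serialized() yields 10, which is the result both raises were meant to produce.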

Durability

Durability. This transaction property ensures that changes made to the DBMS by a transaction are permanent. Once a transaction has been committed, all changes made to the database are durable - i.e. they are written safely to disc and will not be corrupted or disappear.

The durability feature described does not entirely apply to situations where Mnesia is configured as a "pure" primary memory database.

Locking

Different transaction managers employ different strategies to satisfy the isolation property. Mnesia uses the standard technique of two-phase locking. This means that locks are set on records before they are read or written. Mnesia uses five different kinds of locks.

Read locks. A read lock is set on one replica of a record before it can be read.

Write locks. Whenever a transaction writes to a record, write locks are first set on all replicas of that particular record.

Read table locks. If a transaction traverses an entire table in search of a record which satisfies some particular property, it is most inefficient to set read locks on the records, one by one. It is also very memory consuming, since the read locks themselves may take up considerable space if the table is very large. For this reason, Mnesia can set a read lock on an entire table.

Write table locks. If a transaction writes a large number of records to one table, it is possible to set a write lock on the entire table.

Sticky locks. These are write locks that stay in place at a node after the transaction which initiated the lock has terminated.

Mnesia employs a strategy whereby functions such as mnesia:read/1 acquire the necessary locks dynamically as the transactions execute. Mnesia automatically sets and releases the locks and the programmer does not have to code these operations.

Deadlocks can occur when concurrent processes set and release locks on the same records. Mnesia employs a "wait-die" strategy to resolve these situations. If Mnesia suspects that a deadlock can occur when a transaction tries to set a lock, the transaction is forced to release all its locks and sleep for a while. The Fun in the transaction is then evaluated again, and this may happen more than once.

For this reason, it is important that the code inside the Fun given to mnesia:transaction/1 is pure. Some strange results can occur if, for example, messages are sent by the transaction Fun. The following example illustrates this situation:
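The listing itself is not reproduced at this point. A minimal sketch of the kind of transaction meant here could look as follows. The module name is hypothetical and the employee record layout is an assumption based on the company example; the essential part is the io:format/2 call, a side effect placed inside the Fun.

```erlang
-module(bad_raise_sketch).
-export([bad_raise/2]).

%% ASSUMPTION: record layout taken from the company example.
-record(employee, {emp_no, name, salary, sex, phone, room_no}).

bad_raise(Eno, Raise) ->
    F = fun() ->
                [E] = mnesia:read({employee, Eno}),
                Salary = E#employee.salary + Raise,
                New = E#employee{salary = Salary},
                %% Side effect: printed again every time the Fun is restarted.
                io:format("Trying to write ... ~n", []),
                mnesia:write(New)
        end,
    mnesia:transaction(F).
```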

This transaction could write the text "Trying to write ... " a thousand times to the terminal. Mnesia does guarantee, however, that each and every transaction will eventually run. As a result, Mnesia is not only deadlock free, but also livelock free.

The Mnesia programmer cannot prioritize one particular transaction to execute before other transactions which are waiting to execute. As a result, the Mnesia DBMS transaction system is not suitable for hard real time applications. However, Mnesia contains other features that have real time properties.

Mnesia dynamically sets and releases locks as transactions execute. It is therefore very dangerous to execute code with side effects inside a transaction. In particular, a receive statement inside a transaction can lead to a situation where the transaction hangs and never returns, which in turn can cause locks not to be released. This situation could bring the whole system to a standstill since other transactions which execute in other processes, or on other nodes, are forced to wait for the defective transaction.

If a transaction terminates abnormally, Mnesia will automatically release the locks held by the transaction.

We have shown examples of a number of functions that can be used inside a transaction. The following list shows the simplest Mnesia functions that work with transactions. It is important to realize that these functions must be embedded in a transaction. If no enclosing transaction (or other enclosing Mnesia activity) exists, they will all fail.

mnesia:transaction(Fun) -> {aborted, Reason} | {atomic, Value}. This function executes one transaction with the functional object Fun as the single parameter.

mnesia:read({Tab, Key}) -> transaction abort | RecordList. This function reads all records with Key as key from table Tab. This function has the same semantics regardless of the location of Tab. If the table is of type bag, the read({Tab, Key}) can return an arbitrarily long list. If the table is of type set, the list is either of length one, or [].

mnesia:wread({Tab, Key}) -> transaction abort | RecordList. This function behaves the same way as the previously listed read/1 function, except that it acquires a write lock instead of a read lock. If we execute a transaction which reads a record, modifies the record, and then writes the record, it is slightly more efficient to set the write lock immediately. In cases where we issue a mnesia:read/1, followed by a mnesia:write/1, the first read lock must be upgraded to a write lock when the write operation is executed.

mnesia:write(Record) -> transaction abort | ok. This function writes a record into the database. The Record argument is an instance of a record. The function returns ok, or aborts the transaction if an error should occur.

mnesia:delete({Tab, Key}) -> transaction abort | ok. This function deletes all records with the given key.

mnesia:delete_object(Record) -> transaction abort | ok. This function deletes records with object id Record. This function is used when we want to delete only some records in a table of type bag.

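As a rough illustration of how the functions listed above combine, the sketch below reads a record with a write lock, updates it, and deletes a record by key. The module and function names are hypothetical, and the employee record layout is an assumption based on the company example.

```erlang
-module(basic_ops_sketch).
-export([set_phone/2, remove/1]).

%% ASSUMPTION: record layout taken from the company example.
-record(employee, {emp_no, name, salary, sex, phone, room_no}).

%% Read-modify-write: wread/1 takes the write lock at once, so the
%% lock does not have to be upgraded when write/1 is called.
set_phone(Eno, Phone) ->
    F = fun() ->
                case mnesia:wread({employee, Eno}) of
                    [E] -> mnesia:write(E#employee{phone = Phone});
                    []  -> mnesia:abort(no_such_employee)
                end
        end,
    mnesia:transaction(F).

%% Delete all records stored under the key Eno.
remove(Eno) ->
    mnesia:transaction(fun() -> mnesia:delete({employee, Eno}) end).
```

Both functions return {atomic, ok} on success; set_phone/2 returns {aborted, no_such_employee} when the key is missing, because mnesia:abort/1 terminates the transaction with that reason.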
Sticky Locks

As previously stated, the locking strategy used by Mnesia is to lock one record when we read a record, and lock all replicas of a record when we write a record. However, there are applications which use Mnesia mainly for its fault-tolerant qualities, and these applications may be configured with one node doing all the heavy work, and a standby node which is ready to take over in case the main node fails. Such applications may benefit from using sticky locks instead of the normal locking scheme.

A sticky lock is a lock which stays in place at a node after the transaction which first acquired the lock has terminated. To illustrate this, assume that we execute the following transaction:

      F = fun() ->
            mnesia:write(#foo{a = kalle})
          end,
      mnesia:transaction(F).

The foo table is replicated on the two nodes N1 and N2.

Normal locking requires:

one network rpc (2 messages) to acquire the write lock

three network messages to execute the two-phase commit protocol.

If we use sticky locks, we must first change the code as follows:

      F = fun() ->
            mnesia:s_write(#foo{a = kalle})
          end,
      mnesia:transaction(F).

This code uses the s_write/1 function instead of the write/1 function. The s_write/1 function sets a sticky lock instead of a normal lock. If the table is not replicated, sticky locks have no special effect. If the table is replicated, and we set a sticky lock on node N1, this lock will then stick to node N1. The next time we try to set a sticky lock on the same record at node N1, Mnesia will see that the lock is already set and will not do a network operation in order to acquire the lock.

It is much more efficient to set a local lock than it is to set a networked lock, and for this reason sticky locks can benefit applications that use a replicated table and perform most of the work on only one of the nodes.

If a record is stuck at node N1 and we try to set a sticky lock for the record on node N2, the record must be unstuck. This operation is expensive and will reduce performance. The unsticking is done automatically if we issue s_write/1 requests at N2.

Table Locks

Mnesia supports read and write locks on whole tables as a complement to the normal locks on single records. As previously stated, Mnesia sets and releases locks automatically, and the programmer does not have to code these operations. However, transactions which read and write a large number of records in a specific table will execute more efficiently if we start the transaction by setting a table lock on this table. This will block other concurrent transactions from the table. The following two functions are used to set explicit table locks for read and write operations:

mnesia:read_lock_table(Tab) Sets a read lock on the table Tab

mnesia:write_lock_table(Tab) Sets a write lock on the table Tab

Alternate syntax for acquisition of table locks is as follows:

      mnesia:lock({table, Tab}, read)
      mnesia:lock({table, Tab}, write)

The matching operations in Mnesia may either lock the entire table or just a single record (when the key is bound in the pattern).
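A transaction that touches every record in a table is a typical case for an explicit table lock. The following sketch takes one write lock on the whole table before updating each record. The module and function names are hypothetical, the employee record layout is an assumption based on the company example, and the table is assumed to be of type set.

```erlang
-module(table_lock_sketch).
-export([raise_all/1]).

%% ASSUMPTION: record layout taken from the company example.
-record(employee, {emp_no, name, salary, sex, phone, room_no}).

%% Add Raise to the salary of every employee in one transaction.
raise_all(Raise) ->
    F = fun() ->
                %% One table lock instead of one write lock per record.
                mnesia:write_lock_table(employee),
                Keys = mnesia:all_keys(employee),
                lists:foreach(
                  fun(Key) ->
                          [E] = mnesia:read(employee, Key, write),
                          Salary = E#employee.salary + Raise,
                          mnesia:write(E#employee{salary = Salary})
                  end, Keys),
                length(Keys)
        end,
    %% Returns {atomic, NumberOfUpdatedRecords} on commit.
    mnesia:transaction(F).
```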

Global Locks

Write locks are normally acquired on all nodes where a replica of the table resides (and is active). Read locks are acquired on one node (the local one if a local replica exists).

The function mnesia:lock/2 is intended to support table locks (as mentioned previously) but also for situations when locks need to be acquired regardless of how tables have been replicated:

      mnesia:lock({global, GlobalKey, Nodes}, LockKind)

      LockKind ::= read | write | ...

The lock is acquired on the LockItem on all Nodes in the nodes list.
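A global lock can serve as a cluster-wide critical section that is independent of any table. The sketch below is one possible wrapper, not an established idiom: the module and function names are hypothetical, and the choice of mnesia:system_info(running_db_nodes) as the node list is an assumption made for illustration.

```erlang
-module(global_lock_sketch).
-export([with_global_lock/2]).

%% Run Fun inside a transaction while holding a write lock on GlobalKey.
with_global_lock(GlobalKey, Fun) ->
    mnesia:transaction(
      fun() ->
              %% ASSUMPTION: lock on every node currently running Mnesia.
              Nodes = mnesia:system_info(running_db_nodes),
              mnesia:lock({global, GlobalKey, Nodes}, write),
              Fun()
      end).
```

The lock is held until the enclosing transaction terminates, in the same way as other locks acquired inside a transaction.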

Dirty Operations

In many applications, the overhead of processing a transaction may result in a loss of performance. Dirty operations are shortcuts which bypass much of this processing and thereby increase speed.

Dirty operations are useful in many situations, for example in a datagram routing application where Mnesia stores the routing table, and it is time consuming to start a whole transaction every time a packet is received. For this reason, Mnesia has functions which manipulate tables without using transactions. This alternative to transaction processing is known as a dirty operation. However, it is important to realize the trade-off in avoiding the overhead of transaction processing:

The atomicity and the isolation properties of Mnesia are lost.

The isolation property is compromised, because other Erlang processes, which use transactions to manipulate the data, do not get the benefit of isolation if we simultaneously use dirty operations to read and write records from the same table.

The major advantage of dirty operations is that they execute much faster than equivalent operations that are processed as functional objects within a transaction.

Dirty operations are written to disc if they are performed on a table of type disc_copies, or type disc_only_copies. Mnesia also ensures that all replicas of a table are updated if a dirty write operation is performed on a table.

A dirty operation will ensure a certain level of consistency. For example, it is not possible for dirty operations to return garbled records. Hence, each individual read or write operation is performed in an atomic manner.

All dirty functions execute a call to exit({aborted, Reason}) on failure. Even if the following functions are executed inside a transaction no locks will be acquired. The following functions are available:

mnesia:dirty_read({Tab, Key}). This function reads record(s) from Mnesia.

mnesia:dirty_write(Record). This function writes the record Record.

mnesia:dirty_delete({Tab, Key}). This function deletes record(s) with the key Key.

mnesia:dirty_delete_object(Record). This function is the dirty operation alternative to the function delete_object/1.

mnesia:dirty_first(Tab). This function returns the "first" key in the table Tab.

Records in set or bag tables are not sorted. However, there is a record order which is not known to the user. This means that it is possible to traverse a table by means of this function in conjunction with the dirty_next/2 function.

If there are no records at all in the table, this function will return the atom '$end_of_table'. It is not recommended to use this atom as the key for any user records.

mnesia:dirty_next(Tab, Key). This function returns the "next" key in the table Tab. This function makes it possible to traverse a table and perform some operation on all records in the table. When the end of the table is reached the special key '$end_of_table' is returned. Otherwise, the function returns a key which can be used to read the actual record.

The behavior is undefined if any process performs a write operation on the table while we traverse the table with the dirty_next/2 function. This is because write operations on a Mnesia table may lead to internal reorganizations of the table itself. This is an implementation detail, but remember the dirty functions are low level functions.

mnesia:dirty_last(Tab). This function works exactly like mnesia:dirty_first/1 but returns the last object in Erlang term order for the ordered_set table type. For all other table types, mnesia:dirty_first/1 and mnesia:dirty_last/1 are synonyms.

mnesia:dirty_prev(Tab, Key). This function works exactly like mnesia:dirty_next/2 but returns the previous object in Erlang term order for the ordered_set table type. For all other table types, mnesia:dirty_next/2 and mnesia:dirty_prev/2 are synonyms.
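The traversal pattern built from dirty_first/1 and dirty_next/2 can be sketched as a small recursive loop. The module and function names below are hypothetical; the sketch counts the records in a table and, as noted above, is only well defined if nothing writes to the table during the walk.

```erlang
-module(dirty_walk_sketch).
-export([count_records/1]).

%% Count all records in Tab by walking its keys with dirty operations.
count_records(Tab) ->
    walk(Tab, mnesia:dirty_first(Tab), 0).

%% '$end_of_table' marks both an empty table and the end of the walk.
walk(_Tab, '$end_of_table', Acc) ->
    Acc;
walk(Tab, Key, Acc) ->
    %% A key may map to several records if the table is of type bag.
    Records = mnesia:dirty_read(Tab, Key),
    walk(Tab, mnesia:dirty_next(Tab, Key), Acc + length(Records)).
```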

mnesia:dirty_slot(Tab, Slot)

Returns the list of records that are associated with Slot in a table. It can be used to traverse a table in a manner similar to the dirty_next/2 function. A table has a number of slots that range from zero to some unknown upper bound. The function dirty_slot/2 returns the special atom '$end_of_table' when the end of the table is reached.

The behavior of this function is undefined if the table is written to while being traversed. mnesia:read_lock_table(Tab) may be used to ensure that no transaction-protected writes are performed during the iteration.

mnesia:dirty_update_counter({Tab,Key}, Val).

Counters are non-negative integers, that is, integers with a value greater than or equal to zero. Updating a counter adds Val to the counter, where Val is a positive or negative integer.

There are no special counter records in Mnesia. However, records of the form {TabName, Key, Integer} can be used as counters, and can be persistent.

It is not possible to have transaction protected updates of counter records.

There are two significant differences when using this function instead of reading the record, performing the arithmetic, and writing the record:

it is much more efficient

the dirty_update_counter/2 function is performed as an atomic operation although it is not protected by a transaction. Accordingly, no table update is lost if two processes simultaneously execute the dirty_update_counter/2 function.

mnesia:dirty_match_object(Pat). This function is the dirty equivalent of mnesia:match_object/1.

mnesia:dirty_select(Tab, Pat). This function is the dirty equivalent of mnesia:select/2.

mnesia:dirty_index_match_object(Pat, Pos). This function is the dirty equivalent of mnesia:index_match_object/2.

mnesia:dirty_index_read(Tab, SecondaryKey, Pos). This function is the dirty equivalent of mnesia:index_read/3.

mnesia:dirty_all_keys(Tab). This function is the dirty equivalent of mnesia:all_keys/1.

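Returning to dirty_update_counter/2 described above, a counter table could be set up and used roughly as follows. The table name counter, its attribute names, and the module and function names are all hypothetical; the table is created with default options, which is an assumption made to keep the sketch short.

```erlang
-module(counter_sketch).
-export([create/0, bump/1]).

%% Records in this table have the form {counter, Key, Integer}.
create() ->
    mnesia:create_table(counter, [{attributes, [key, value]}]).

%% Add 1 to the counter stored under Name and return the new value.
%% The update is atomic even though no transaction is involved.
bump(Name) ->
    mnesia:dirty_update_counter(counter, Name, 1).
```

Several processes may call bump/1 on the same key concurrently without losing increments, which is the property stated in the list above.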
Record Names versus Table Names

In Mnesia, all records in a table must have the same name. All the records must be instances of the same record type. The record name does not, however, have to be the same as the table name, even though that is the case in most of the examples in this document. If a table is created without the record_name property, as in the code below, all records in the table must have the same name as the table:

mnesia:create_table(subscriber, [])

However, if the table is created with an explicit record name as argument, as shown below, it is possible to store subscriber records in both of the tables regardless of the table names:

      TabDef = [{record_name, subscriber}],
      mnesia:create_table(my_subscriber, TabDef),
      mnesia:create_table(your_subscriber, TabDef).

In order to access such tables it is not possible to use the simplified access functions described earlier in the document. For example, writing a subscriber record into a table requires the mnesia:write/3 function instead of the simplified functions mnesia:write/1 and mnesia:s_write/1:

      mnesia:write(subscriber, #subscriber{}, write)
      mnesia:write(my_subscriber, #subscriber{}, sticky_write)
      mnesia:write(your_subscriber, #subscriber{}, write)

The following simplified piece of code illustrates the relationship between the simplified access functions used in most examples and their more flexible counterparts:

      mnesia:dirty_write(Record) ->
        Tab = element(1, Record),
        mnesia:dirty_write(Tab, Record).
      mnesia:dirty_delete({Tab, Key}) ->
        mnesia:dirty_delete(Tab, Key).
      mnesia:dirty_delete_object(Record) ->
        Tab = element(1, Record),
        mnesia:dirty_delete_object(Tab, Record).
      mnesia:dirty_update_counter({Tab, Key}, Incr) ->
        mnesia:dirty_update_counter(Tab, Key, Incr).
      mnesia:dirty_read({Tab, Key}) ->
        mnesia:dirty_read(Tab, Key).
      mnesia:dirty_match_object(Pattern) ->
        Tab = element(1, Pattern),
        mnesia:dirty_match_object(Tab, Pattern).
      mnesia:dirty_index_match_object(Pattern, Attr) ->
        Tab = element(1, Pattern),
        mnesia:dirty_index_match_object(Tab, Pattern, Attr).
      mnesia:write(Record) ->
        Tab = element(1, Record),
        mnesia:write(Tab, Record, write).
      mnesia:s_write(Record) ->
        Tab = element(1, Record),
        mnesia:write(Tab, Record, sticky_write).
      mnesia:delete({Tab, Key}) ->
        mnesia:delete(Tab, Key, write).
      mnesia:s_delete({Tab, Key}) ->
        mnesia:delete(Tab, Key, sticky_write).
      mnesia:delete_object(Record) ->
        Tab = element(1, Record),
        mnesia:delete_object(Tab, Record, write).
      mnesia:s_delete_object(Record) ->
        Tab = element(1, Record),
        mnesia:delete_object(Tab, Record, sticky_write).
      mnesia:read({Tab, Key}) ->
        mnesia:read(Tab, Key, read).
      mnesia:wread({Tab, Key}) ->
        mnesia:read(Tab, Key, write).
      mnesia:match_object(Pattern) ->
        Tab = element(1, Pattern),
        mnesia:match_object(Tab, Pattern, read).
      mnesia:index_match_object(Pattern, Attr) ->
        Tab = element(1, Pattern),
        mnesia:index_match_object(Tab, Pattern, Attr, read).

Activity Concept and Various Access Contexts

As previously described, a functional object (Fun) performing table access operations as listed below may be passed on as arguments to the function mnesia:transaction/1,2,3:

mnesia:write/3 (write/1, s_write/1)

mnesia:delete/3 (delete/1, s_delete/1)

mnesia:delete_object/3 (delete_object/1, s_delete_object/1)

mnesia:read/3 (read/1, wread/1)

mnesia:match_object/2 (match_object/1)

mnesia:select/3 (select/2)

mnesia:foldl/3 (foldl/4, foldr/3, foldr/4)

mnesia:all_keys/1

mnesia:index_match_object/4 (index_match_object/2)

mnesia:index_read/3

mnesia:lock/2 (read_lock_table/1, write_lock_table/1)

mnesia:table_info/2

These functions will be performed in a transaction context involving mechanisms like locking, logging, replication, checkpoints, subscriptions, commit protocols etc. However, the same functions may also be evaluated in other activity contexts.

The following activity access contexts are currently supported:

transaction

sync_transaction

async_dirty

sync_dirty

ets

By passing the same "fun" as argument to the function mnesia:sync_transaction(Fun [, Args]) it will be performed in synced transaction context. Synced transactions wait until all active replicas have committed the transaction (to disc) before returning from the mnesia:sync_transaction call. Using sync_transaction is useful for applications that are executing on several nodes and want to be sure that the update is performed on the remote nodes before a remote process is spawned or a message is sent to a remote process, and also when combining transaction writes with dirty_reads. This is also useful in situations where an application performs frequent or voluminous updates which may overload Mnesia on other nodes.

By passing the same "fun" as argument to the function mnesia:async_dirty(Fun [, Args]) it will be performed in dirty context. The function calls will be mapped to the corresponding dirty functions. This will still involve logging, replication and subscriptions but there will be no locking, local transaction storage or commit protocols involved. Checkpoint retainers will be updated but will be updated "dirty". Thus, they will be updated asynchronously. The functions will wait for the operation to be performed on one node but not the others. If the table resides locally no waiting will occur.

By passing the same "fun" as an argument to the function mnesia:sync_dirty(Fun [, Args]) it will be performed in almost the same context as mnesia:async_dirty/1,2. The difference is that the operations are performed synchronously. The caller will wait for the updates to be performed on all active replicas. Using sync_dirty is useful for applications that are executing on several nodes and want to be sure that the update is performed on the remote nodes before a remote process is spawned or a message is sent to a remote process. This is also useful in situations where an application performs frequent or voluminous updates which may overload Mnesia on other nodes.

You can check if your code is executed within a transaction with mnesia:is_transaction/0; it returns true when called inside a transaction context and false otherwise.

Mnesia tables with storage type ram_copies and disc_copies are implemented internally as "ets-tables" and it is possible for applications to access these tables directly. This is only recommended if all options have been weighed and the possible outcomes are understood. By passing the earlier mentioned "fun" to the function mnesia:ets(Fun [, Args]) it will be performed but in a very raw context. The operations will be performed directly on the local ets tables assuming that the local storage type is ram_copies and that the table is not replicated on other nodes. Subscriptions will not be triggered nor checkpoints updated, but this operation is blindingly fast. Disc resident tables should not be updated with the ets-function since the disc will not be updated.

The Fun may also be passed as an argument to the function mnesia:activity/2,3,4 which enables usage of customized activity access callback modules. It can either be obtained directly by stating the module name as argument or implicitly by usage of the access_module configuration parameter. A customized callback module may be used for several purposes, such as providing triggers, integrity constraints, run time statistics, or virtual tables.

The callback module does not have to access real Mnesia tables, it is free to do whatever it likes as long as the callback interface is fulfilled.

In Appendix C "The Activity Access Call Back Interface" the source code for one alternate implementation is provided (mnesia_frag.erl). The context sensitive function mnesia:table_info/2 may be used to provide virtual information about a table. One usage of this is to perform QLC queries within an activity context with a customized callback module. By providing table information about table indices and other QLC requirements, QLC may be used as a generic query language to access virtual tables.

QLC queries may be performed in all these activity contexts (transaction, sync_transaction, async_dirty, sync_dirty and ets). The ets activity will only work if the table has no indices.

The mnesia:dirty_* functions always execute with async_dirty semantics regardless of which activity access context is in effect. They may even be invoked without any enclosing activity access context.
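The idea that one Fun can be evaluated in several access contexts can be sketched with mnesia:activity/2, which takes the context as its first argument. The module and function names below are hypothetical, and some_tab stands for any existing table.

```erlang
-module(activity_sketch).
-export([lookup/2]).

%% Evaluate the same Fun in the access context given by Context, e.g.
%% transaction, sync_transaction, async_dirty, sync_dirty or ets.
lookup(Context, Key) ->
    Fun = fun() ->
                  %% is_transaction/0 reports whether we are in a
                  %% transaction context; read/1 adapts to the context.
                  {mnesia:is_transaction(), mnesia:read({some_tab, Key})}
          end,
    mnesia:activity(Context, Fun).
```

Note that mnesia:activity/2 returns the value of the Fun directly and exits on failure, unlike mnesia:transaction/1, which wraps the result as {atomic, Value} or {aborted, Reason}.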

Nested transactions

Transactions may be nested in an arbitrary fashion. A child transaction must run in the same process as its parent. When a child transaction aborts, the caller of the child transaction will get the return value {aborted, Reason} and any work performed by the child will be erased. If a child transaction commits, the records written by the child will be propagated to the parent.

No locks are released when child transactions terminate. Locks created by a sequence of nested transactions are kept until the topmost transaction terminates. Furthermore, any updates performed by a nested transaction are only propagated in such a manner so that the parent of the nested transaction sees the updates. No final commitment will be done until the top level transaction is terminated. So, although a nested transaction returns {atomic, Val}, if the enclosing parent transaction is aborted, the entire nested operation is aborted.

The ability to have nested transactions with the same semantics as top level transactions makes it easier to write library functions that manipulate Mnesia tables.

Say for example that we have a function that adds a new subscriber to a telephony system:

      add_subscriber(S) ->
          mnesia:transaction(fun() ->
              case mnesia:read( ..........
    

This function needs to be called as a transaction. Now assume that we wish to write a function that both calls the add_subscriber/1 function and is in itself protected by the context of a transaction. By simply calling the add_subscriber/1 from within another transaction, a nested transaction is created.
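A complete, if simplified, version of this idea might look as follows. It is a sketch only: the subscriber record fields, the abort reason, and the add_subscribers/1 wrapper are hypothetical and are not taken from the elided listing above. A table named subscriber is assumed to exist.

```erlang
-module(nested_sketch).
-export([add_subscriber/1, add_subscribers/1]).

%% ASSUMPTION: a hypothetical record layout for illustration only.
-record(subscriber, {id, name}).

%% Library function protected by its own transaction.
add_subscriber(S) ->
    mnesia:transaction(
      fun() ->
              case mnesia:read({subscriber, S#subscriber.id}) of
                  []  -> mnesia:write(S);
                  [_] -> mnesia:abort({already_exists, S#subscriber.id})
              end
      end).

%% Calling add_subscriber/1 from inside a transaction nests it.
add_subscribers(Subscribers) ->
    mnesia:transaction(
      fun() ->
              lists:foreach(
                fun(S) ->
                        case add_subscriber(S) of
                            {atomic, ok} -> ok;
                            %% Child aborted: abort the parent as well.
                            {aborted, Reason} -> mnesia:abort(Reason)
                        end
                end, Subscribers)
      end).
```

If any subscriber already exists, the outer transaction aborts and none of the subscribers written by earlier child transactions are committed.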

It is also possible to mix different activity access contexts while nesting, but the dirty ones (async_dirty, sync_dirty and ets) will inherit the transaction semantics if they are called inside a transaction, and thus they will grab locks and use two- or three-phase commit.

      add_subscriber(S) ->
          mnesia:transaction(fun() ->
             %% Transaction context 
             mnesia:read({some_tab, some_data}),
             mnesia:sync_dirty(fun() ->
                 %% Still in a transaction context.
                 case mnesia:read( ..) ..end), end).
      add_subscriber2(S) ->
          mnesia:sync_dirty(fun() ->
             %% In dirty context 
             mnesia:read({some_tab, some_data}),
             mnesia:transaction(fun() ->
                 %% In a transaction context.
                 case mnesia:read( ..) ..end), end).
    
Pattern Matching

When it is not possible to use mnesia:read/3 Mnesia provides the programmer with several functions for matching records against a pattern. The most useful functions of these are:

mnesia:select(Tab, MatchSpecification, LockKind) -> transaction abort | [ObjectList] mnesia:select(Tab, MatchSpecification, NObjects, Lock) -> transaction abort | {[Object],Continuation} | '$end_of_table' mnesia:select(Cont) -> transaction abort | {[Object],Continuation} | '$end_of_table' mnesia:match_object(Tab, Pattern, LockKind) -> transaction abort | RecordList

These functions match a Pattern against all records in table Tab. In a mnesia:select call, Pattern is a part of the MatchSpecification described below. The match is not necessarily performed as an exhaustive search of the entire table. By utilizing indices and bound values in the key of the pattern, the actual work done by the function may be condensed into a few hash lookups. Using ordered_set tables may reduce the search space if the keys are partially bound.

The pattern provided to the functions must be a valid record, and the first element of the provided tuple must be the record_name of the table. The special element '_' matches any data structure in Erlang (also known as an Erlang term). The special elements '$<number>' behave as Erlang variables, i.e. they match anything, bind the first occurrence, and match subsequent occurrences of that variable against the bound value.

Use the function mnesia:table_info(Tab, wild_pattern) to obtain a basic pattern which matches all records in a table or use the default value in record creation. Do not make the pattern hard coded since it will make your code more vulnerable to future changes of the record definition.

      Wildpattern = mnesia:table_info(employee, wild_pattern),
      %% Or use
      Wildpattern = #employee{_ = '_'},

For the employee table the wild pattern will look like:

      {employee, '_', '_', '_', '_', '_', '_'}.

In order to constrain the match you must replace some of the '_' elements. The code for matching out all female employees looks like:

      Pat = #employee{sex = female, _ = '_'},
      F = fun() -> mnesia:match_object(Pat) end,
      Females = mnesia:transaction(F).

It is also possible to use the match function to check the equality of different attributes. Assume that we want to find all employees who happen to have an employee number equal to their room number:

      Pat = #employee{emp_no = '$1', room_no = '$1', _ = '_'},
      F = fun() -> mnesia:match_object(Pat) end,
      Odd = mnesia:transaction(F).

The function mnesia:match_object/3 lacks some important features that mnesia:select/3 has. For example, mnesia:match_object/3 can only return the matching records, and it cannot express constraints other than equality. If we want to find the names of the male employees on the second floor we could write:
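A minimal sketch of such a query is given here under stated assumptions: the employee record is taken to have the six fields listed below, room_no is taken to be an integer, and rooms 200 to 299 are taken to be on the second floor. None of these details are fixed by this section, so adjust the match head and guard to the real data layout.

```erlang
-module(select_sketch).
-export([male_names_on_second_floor/0]).

%% Assumed record layout (six fields, matching the wild pattern above).
-record(employee, {emp_no, name, salary, sex, phone, room_no}).

male_names_on_second_floor() ->
    %% '$1' binds the name and '$2' binds the room number.
    MatchHead = #employee{name = '$1', sex = male, room_no = '$2', _ = '_'},
    %% Assumption: integer room numbers 200..299 are on the second floor.
    Guard = [{'>=', '$2', 200}, {'<', '$2', 300}],
    %% Return only the name, not the whole record.
    Result = '$1',
    F = fun() ->
            mnesia:select(employee, [{MatchHead, Guard, [Result]}])
        end,
    mnesia:transaction(F).
```

The guard expresses a range constraint and the result is a single attribute, two things a plain match_object pattern cannot express.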

Select can be used to add additional constraints and create output which can not be done with mnesia:match_object/3.

The second argument to select is a MatchSpecification. A MatchSpecification is a list of MatchFunctions, where each MatchFunction is a tuple {MatchHead, MatchCondition, MatchBody}. MatchHead is the same pattern used in mnesia:match_object/3 described above. MatchCondition is a list of additional constraints applied to each record, and MatchBody is used to construct the return values.

A detailed explanation of match specifications can be found in the ERTS User's Guide, Match Specifications in Erlang; the ets and dets documentation may provide additional information.

The functions select/4 and select/1 are used to get a limited number of results, where the Continuation is used to get the next chunk of results. Mnesia treats NObjects as a recommendation only, so more or fewer results than specified by NObjects may be returned in the result list. Even the empty list may be returned although there are more results to collect.
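Chunked selection can be sketched as a small loop. The module and function names are hypothetical, and the chunk size of 10 is an arbitrary choice. The function must be called from within a transaction or another activity context.

```erlang
-module(select_chunks_sketch).
-export([collect_all/2]).

%% Collects every result of MatchSpec on Tab, fetching roughly 10 at a time.
collect_all(Tab, MatchSpec) ->
    loop(mnesia:select(Tab, MatchSpec, 10, read), []).

%% No more chunks: flatten the collected chunks in their original order.
loop('$end_of_table', Acc) ->
    lists:append(lists:reverse(Acc));
%% Objects may be empty even though more results remain, so the loop
%% stops only on '$end_of_table', never on an empty chunk.
loop({Objects, Cont}, Acc) ->
    loop(mnesia:select(Cont), [Objects | Acc]).
```

A real caller would normally process each chunk as it arrives instead of accumulating all of them, since accumulation defeats the purpose of limiting the result size.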

There is a severe performance penalty in using mnesia:select/[1|2|3|4] after any modifying operation has been done on that table in the same transaction. That is, avoid using mnesia:write/1 or mnesia:delete/1 before a mnesia:select in the same transaction.

If the key attribute is bound in a pattern, the match operation is very efficient. However, if the key attribute in a pattern is given as '_', or '$1', the whole employee table must be searched for records that match. Hence if the table is large, this can become a time consuming operation, but it can be remedied with indices (refer to Chapter 5: Indexing) if mnesia:match_object is used.
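As a sketch of the index remedy, the following adds an index on the sex attribute and then matches on it. The employee record layout and the function names are assumptions made for illustration, and the employee table is assumed to exist already.

```erlang
-module(index_match_sketch).
-export([index_sex/0, females/0]).

%% Assumed record layout; not defined in this section.
-record(employee, {emp_no, name, salary, sex, phone, room_no}).

%% Adds a secondary index on the sex attribute of the employee table.
index_sex() ->
    mnesia:add_table_index(employee, sex).

%% The key (emp_no) is unbound in this pattern, but sex is bound,
%% so an index on sex lets Mnesia avoid searching the whole table.
females() ->
    Pat = #employee{sex = female, _ = '_'},
    mnesia:transaction(fun() -> mnesia:match_object(Pat) end).
```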

QLC queries can also be used to search Mnesia tables. By using mnesia:table/[1|2] as the generator inside a QLC query you let the query operate on a Mnesia table. The Mnesia-specific options to mnesia:table/2 are {lock, Lock}, {n_objects, Integer} and {traverse, SelMethod}. The lock option specifies whether Mnesia should acquire a read or write lock on the table, and n_objects specifies how many results should be returned in each chunk to QLC. The last option, traverse, specifies which function Mnesia should use to traverse the table. By default select is used, but by passing {traverse, {select, MatchSpecification}} as an option to mnesia:table/2 the user can specify their own view of the table.

If no options are specified, a read lock will be acquired, 100 results will be returned in each chunk, and select will be used to traverse the table, i.e.:

mnesia:table(Tab) -> mnesia:table(Tab, [{n_objects,100},{lock, read}, {traverse, select}]).
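A small QLC query over a Mnesia table might look like the sketch below. The employee record layout, the module name, and the chunk size of 50 are assumptions chosen for illustration. Note that qlc.hrl must be included for the list comprehension to be compiled as a query.

```erlang
-module(qlc_sketch).
-export([low_salary_names/1]).

%% Required so that qlc:q/1 list comprehensions are transformed into queries.
-include_lib("stdlib/include/qlc.hrl").

%% Assumed record layout; not defined in this section.
-record(employee, {emp_no, name, salary, sex, phone, room_no}).

%% Returns the names of all employees whose salary is below Limit.
low_salary_names(Limit) ->
    F = fun() ->
            Table = mnesia:table(employee, [{lock, read}, {n_objects, 50}]),
            Q = qlc:q([E#employee.name || E <- Table,
                                          E#employee.salary < Limit]),
            qlc:e(Q)
        end,
    mnesia:transaction(F).
```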

The function mnesia:all_keys(Tab) returns all keys in a table.

Iteration

Mnesia provides a couple of functions which iterate over all the records in a table.

      mnesia:foldl(Fun, Acc0, Tab) -> NewAcc | transaction abort
      mnesia:foldr(Fun, Acc0, Tab) -> NewAcc | transaction abort
      mnesia:foldl(Fun, Acc0, Tab, LockType) -> NewAcc | transaction abort
      mnesia:foldr(Fun, Acc0, Tab, LockType) -> NewAcc | transaction abort

These functions iterate over the Mnesia table Tab and apply the function Fun to each record. The Fun takes two arguments: the first is a record from the table and the second is the accumulator. The Fun returns a new accumulator.

The first time the Fun is applied, Acc0 is the second argument. The next time the Fun is called, the return value from the previous call is used as the second argument. The term returned by the last call to the Fun is the return value of the fold[lr] function.

The difference between foldl and foldr is the order in which the table is accessed for ordered_set tables. For every other table type the functions are equivalent.

LockType specifies what type of lock is acquired for the iteration; the default is read. If records are written or deleted during the iteration, a write lock should be acquired.

These functions might be used to find records in a table when it is impossible to write constraints for mnesia:match_object/3, or when you want to perform some action on certain records.

For example, finding all the employees who have a salary below 10 could look like:

      Constraint =
          fun(Emp, Acc) when Emp#employee.salary < 10 ->
                  [Emp | Acc];
             (_, Acc) ->
                  Acc
          end,
      Find = fun() -> mnesia:foldl(Constraint, [], employee) end,
      mnesia:transaction(Find).

Raising the salary to 10 for everyone with a salary below 10 and returning the sum of all raises:

      Increase =
          fun(Emp, Acc) when Emp#employee.salary < 10 ->
                  OldS = Emp#employee.salary,
                  ok = mnesia:write(Emp#employee{salary = 10}),
                  Acc + 10 - OldS;
             (_, Acc) ->
                  Acc
          end,
      IncLow = fun() -> mnesia:foldl(Increase, 0, employee, write) end,
      mnesia:transaction(IncLow).

A lot of nice things can be done with the iterator functions but some caution should be taken about performance and memory utilization for large tables.

Call these iteration functions on nodes that contain a replica of the table. Each call to the function Fun accesses the table, and if the table resides on another node this generates a lot of unnecessary network traffic.

Mnesia also provides some functions that make it possible for the user to iterate over the table. The order of the iteration is unspecified if the table is not of the ordered_set type.

      mnesia:first(Tab) -> Key | transaction abort
      mnesia:last(Tab) -> Key | transaction abort
      mnesia:next(Tab,Key) -> Key | transaction abort
      mnesia:prev(Tab,Key) -> Key | transaction abort
      mnesia:snmp_get_next_index(Tab,Index) -> {ok, NextIndex} | endOfTable

The ordering of first/last and next/prev is only meaningful for ordered_set tables. For all other table types they are synonyms. When the end of the table is reached, the special key '$end_of_table' is returned.
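A read-only key traversal with first and next can be sketched as below. The module and function names are hypothetical, and the read lock on the table is an illustrative choice for a traversal that does not modify anything.

```erlang
-module(traverse_sketch).
-export([keys/1]).

%% Returns all keys of Tab in traversal order, inside a transaction.
keys(Tab) ->
    F = fun() ->
            %% Lock the whole table up front so it cannot change under us.
            mnesia:read_lock_table(Tab),
            walk(Tab, mnesia:first(Tab), [])
        end,
    mnesia:transaction(F).

%% '$end_of_table' marks the end of the traversal (or an empty table).
walk(_Tab, '$end_of_table', Acc) ->
    lists:reverse(Acc);
walk(Tab, Key, Acc) ->
    walk(Tab, mnesia:next(Tab, Key), [Key | Acc]).
```

For an ordered_set table the keys come back in key order; for other table types the order is unspecified.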

If records are written or deleted during the traversal, use mnesia:fold[lr]/4 with a write lock, or call mnesia:write_lock_table/1 when using first and next.

Writing or deleting in transaction context creates a local copy of each modified record, so modifying each record in a large table uses a lot of memory. Mnesia will compensate for every written or deleted record during the iteration in a transaction context, which may reduce the performance. If possible, avoid writing or deleting records in the same transaction before iterating over the table.

In dirty context, i.e. sync_dirty or async_dirty, the modified records are not stored in a local copy; instead, each record is updated separately. This generates a lot of network traffic if the table has a replica on another node, and it has all the other drawbacks that dirty operations have. For the mnesia:first/1 and mnesia:next/2 commands in particular, the same drawbacks as described above for dirty_first and dirty_next apply, i.e. no writes to the table should be done during iteration.