Age | Commit message | Author |
|
Bad timing could lead to hanging transactions after a mnesia down from a
node with sticky locks.
Excellent bug report from janchochol
Situation:
* node A and B have copies of table T
* node A owns a sticky lock on table T
* node A goes down (e.g. crash)
* node B tries to perform transactional operation on table T
(e.g. mnesia:select)
In this situation the first transaction on node B (and possibly later
ones) can hang indefinitely.
This is caused by a race condition: the transaction process sends a lock
request to node A and waits for the reply. Since node A is down, it
never replies, so the process on node B is stuck forever.
The reason is that the message sent to the mnesia_locker gen_server from
mnesia_locker:mnesia_down can arrive after the mnesia_locker gen_server
has already replied to transaction processes with {switch, N, Req},
where node N is down.
Monitoring the remote process when sending a request to another node
should be a safe solution.
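The monitoring idea can be sketched roughly as follows. This is not the
actual mnesia_locker change; the module name lock_request_sketch, the
function request/3 and the {From, Ref, Req} / {Ref, Reply} message shapes
are assumptions made for illustration only.

```erlang
-module(lock_request_sketch).
-export([request/3]).

%% Hypothetical helper: send Req to the process registered as Name on
%% Node and wait for its reply. The monitor guarantees that the caller
%% gets a 'DOWN' message if the remote process or node is gone, so the
%% receive below cannot block forever.
request(Node, Name, Req) ->
    Ref = erlang:monitor(process, {Name, Node}),
    {Name, Node} ! {self(), Ref, Req},
    receive
        {Ref, Reply} ->
            %% Reply arrived; drop the monitor and any pending 'DOWN'.
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, _Object, Reason} ->
            %% Remote side is unavailable (e.g. noproc, noconnection).
            {error, {down, Reason}}
    end.
```

With this shape, a request sent to a node that has already gone down
returns an error tuple instead of leaving the transaction process waiting.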
|
timer:send_interval behaves badly when resuming from sleep on some
platforms. For example, if I sleep for 10 minutes, and have a
send_interval running once per minute, when I resume, 10 messages
will be sent immediately, eliminating the benefit of only running
the work periodically. This is admittedly a separate bug with
send_interval, but the workaround is straightforward, and also
protects from messages piling up in the queue when the work takes
longer than the interval.
This patch fixes piled up error reports on resume from sleep:
** WARNING ** Mnesia is overloaded: {dump_log, write_threshold}
You'll still be warned if mnesia is overloaded, just not repeatedly.
Additionally, erlang:send_after is more efficient than using the
timer module equivalent [1]
[1] http://www.erlang.org/doc/efficiency_guide/commoncaveats.html#id57251
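The workaround described above can be illustrated with a small loop that
re-arms a one-shot timer only after the work has finished. This is a
sketch under assumptions, not the mnesia patch itself: the module name
dump_tick_sketch, the tick and stop messages and the Work fun are all
hypothetical.

```erlang
-module(dump_tick_sketch).
-export([start/2]).

%% Start a process that runs Work() roughly every IntervalMs milliseconds.
start(IntervalMs, Work) ->
    spawn(fun() ->
                  arm(IntervalMs),
                  loop(IntervalMs, Work)
          end).

%% Arm a single one-shot timer; at most one tick is ever pending.
arm(IntervalMs) ->
    erlang:send_after(IntervalMs, self(), tick).

loop(IntervalMs, Work) ->
    receive
        tick ->
            Work(),
            %% Re-arm only after the work is done, so ticks cannot pile
            %% up after a long sleep or when Work() is slow.
            arm(IntervalMs),
            loop(IntervalMs, Work);
        stop ->
            ok
    end.
```

Because the next timer is started only once the previous tick has been
handled, resuming from sleep produces one tick rather than a burst.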
|
Since the table loader also sets (table) write locks, a special
lock type, 'load', was needed. Unfortunately, this affects mnesia
activity callbacks that redefine the lock operation.
|
With {majority, true} set for a table, write transactions will
abort if they cannot commit to a majority of the nodes that
have a copy of the table. Currently, the implementation hooks
into the prepare_commit, and forces an asymmetric transaction
if the commit set affects any table with the majority flag set.
In the commit itself, the transaction will abort if it cannot
satisfy the majority requirement for all tables involved in the
transaction.
A future optimization might be to abort already when a write
lock is attempted on such a table (/-object) and the lock cannot
be set on enough nodes.
This functionality introduces the possibility to automatically
"fence off" a table in the presence of failures.
This is a first implementation. Only basic tests have been
performed.
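The majority requirement itself is simple arithmetic. The sketch below is
not the code from this commit; the module majority_sketch and the
function has_majority/2 are hypothetical names, and the node lists are
assumed to be supplied by the caller.

```erlang
-module(majority_sketch).
-export([has_majority/2]).

%% CopyHolders: all nodes that hold a copy of the table.
%% Reachable:   nodes the commit can currently reach.
%% True only when strictly more than half of the copy holders are
%% reachable, so two disjoint partitions can never both qualify.
has_majority(CopyHolders, Reachable) ->
    Alive = [N || N <- CopyHolders, lists:member(N, Reachable)],
    length(Alive) > length(CopyHolders) div 2.
```

For example, with copies on three nodes a commit that reaches two of them
passes the check, while with copies on four nodes reaching two does not.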
|
With help from Kostis
|
A process that calls mnesia:subscribe(activity) will receive the message:
{mnesia_activity_event, ActivityID, complete}
when any activity that caused a change to a database has finished
committing its changes. This allows a subscriber to collect messages
already available through the mnesia:subscribe({table, ...}) system
to group them as completed transactions.
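One way a subscriber might group table events into completed
transactions is sketched below. It works on a list of already received
messages, so it does not need a running mnesia. The completion message
shape is taken from the text above and the table event shape
{Op, Record, ActivityId} from the simple table events; the shapes in a
given OTP release may differ, and the module name activity_group_sketch
is hypothetical.

```erlang
-module(activity_group_sketch).
-export([group/1]).

%% Fold received messages into {Completed, Pending}:
%%   Completed: [{ActivityId, [{Op, Record}]}] in completion order
%%   Pending:   map of ActivityId => operations not yet completed
group(Msgs) ->
    {Done, Pending} = lists:foldl(fun step/2, {[], #{}}, Msgs),
    {lists:reverse(Done), Pending}.

%% Simple table event: remember the operation under its activity id.
step({mnesia_table_event, {Op, Rec, ActivityId}}, {Done, Pending}) ->
    Ops = maps:get(ActivityId, Pending, []),
    {Done, Pending#{ActivityId => [{Op, Rec} | Ops]}};
%% Completion event (shape assumed from the commit message): emit the
%% collected operations in arrival order.
step({mnesia_activity_event, ActivityId, complete}, {Done, Pending}) ->
    Ops = lists:reverse(maps:get(ActivityId, Pending, [])),
    {[{ActivityId, Ops} | Done], maps:remove(ActivityId, Pending)};
%% Ignore anything else.
step(_Other, Acc) ->
    Acc.
```

Operations belonging to activities that have not yet reported completion
stay in the pending map until their completion event arrives.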
|
invoking mnesia:sync_transaction/[1,2]. Thanks Igor Ribeiro
Sucupira.
|