author | Dan Gudmundsson <[email protected]> | 2013-10-10 10:55:46 +0200 |
---|---|---|
committer | Dan Gudmundsson <[email protected]> | 2013-11-25 12:33:16 +0100 |
commit | 9237801d22a38d2643ffe94ab626c4d2815012dd (patch) | |
tree | 7ffb6c1af05db1c45d30f6ce72f8548934f52b60 /lib/mnesia/src/mnesia_monitor.erl | |
parent | 3abf1b5ef82478b152581152ad3ec749e8b7edaa (diff) | |
mnesia: Synchronize lock cleanup after mnesia down
Bad timing could lead to hanging transactions after a mnesia down from a
node with sticky locks.
Excellent bug report from janchochol
Situation:
* node A and B have copies of table T
* node A owns a sticky lock on table T
* node A goes down (e.g. crash)
* node B tries to perform a transactional operation on table T
(e.g. mnesia:select)
In this situation it is possible that the first (and maybe later)
transactions on node B will hang indefinitely.
This is caused by a race condition: the transaction process sends a
lock request to node A and waits for a reply. Since node A is down,
it will never send a reply, so the process on node B is stuck
forever.
The reason is that the message sent to the mnesia_locker gen_server from
mnesia_locker:mnesia_down can be received after the mnesia_locker gen_server
has already replied to transaction processes with {switch, N, Req},
where node N is down.
Monitoring the remote process when sending a request to another node
should be a safe solution.
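The monitoring idea can be illustrated with a minimal sketch. This is not the code from this commit and not Mnesia's lock protocol: the module name `monitored_request`, the `{request, From, Ref, Req}` / `{reply, Ref, Reply}` messages, and the `start_echo/1` helper are all hypothetical. Only `erlang:monitor/2`, `erlang:demonitor/2`, and the `'DOWN'` message format are standard Erlang.

```erlang
-module(monitored_request).
-export([call/3, start_echo/1]).

%% Send Request to the process registered as Name on Node and wait
%% for its reply. The message shapes are assumptions made for this
%% sketch, not Mnesia's.
call(Name, Node, Request) ->
    %% Set up the monitor before sending, so that a target that is
    %% already gone, or that dies while we wait, produces a 'DOWN'
    %% message instead of leaving us blocked in receive.
    Ref = erlang:monitor(process, {Name, Node}),
    {Name, Node} ! {request, self(), Ref, Request},
    receive
        {reply, Ref, Reply} ->
            %% Got the answer: drop the monitor and flush any 'DOWN'
            %% message that may already be in the mailbox.
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, _Object, Reason} ->
            %% The target process or its node is not available.
            {error, Reason}
    end.

%% Hypothetical helper: a trivial server that echoes each request.
start_echo(Name) ->
    Pid = spawn(fun echo_loop/0),
    true = register(Name, Pid),
    Pid.

echo_loop() ->
    receive
        {request, From, Ref, Request} ->
            From ! {reply, Ref, Request},
            echo_loop();
        stop ->
            ok
    end.
```

With this pattern a caller waiting on a dead process or a disconnected node gets `{error, Reason}` (typically `noproc` or `noconnection`) rather than waiting forever, which is the failure mode described above.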
Diffstat (limited to 'lib/mnesia/src/mnesia_monitor.erl')
-rw-r--r-- | lib/mnesia/src/mnesia_monitor.erl | 6 |
1 files changed, 1 insertions, 5 deletions
diff --git a/lib/mnesia/src/mnesia_monitor.erl b/lib/mnesia/src/mnesia_monitor.erl
index 7a788238fc..438da65158 100644
--- a/lib/mnesia/src/mnesia_monitor.erl
+++ b/lib/mnesia/src/mnesia_monitor.erl
@@ -482,11 +482,7 @@ handle_cast({mnesia_down, mnesia_controller, Node}, State) ->
     mnesia_tm:mnesia_down(Node),
     {noreply, State};
 
-handle_cast({mnesia_down, mnesia_tm, {Node, Pending}}, State) ->
-    mnesia_locker:mnesia_down(Node, Pending),
-    {noreply, State};
-
-handle_cast({mnesia_down, mnesia_locker, Node}, State) ->
+handle_cast({mnesia_down, mnesia_tm, Node}, State) ->
     Down = {mnesia_down, Node},
     mnesia_lib:report_system_event(Down),
     GoingDown = lists:delete(Node, State#state.going_down),