This section is to be read with the
This is a new behavior in Erlang/OTP 19.0. It has been thoroughly reviewed, is stable enough to be used by at least two heavy OTP applications, and is here to stay. Depending on user feedback, we do not expect but can find it necessary to make minor not backward compatible changes into Erlang/OTP 20.0.
Established Automata Theory does not deal much with how a state transition is triggered, but assumes that the output is a function of the input (and the state) and that they are some kind of values.
For an Event-Driven State Machine, the input is an event that triggers a state transition and the output is actions executed during the state transition. It can analogously to the mathematical model of a Finite-State Machine be described as a set of relations of the following form:
State(S) x Event(E) -> Actions(A), State(S')
These relations are interpreted as follows:
if we are in state
As
Like most
The
In mode
StateName(EventType, EventContent, Data) -> ... code for actions here ... {next_state, NewStateName, NewData}.
This form is used in most examples here for example in section
In mode
handle_event(EventType, EventContent, State, Data) -> ... code for actions here ... {next_state, NewState, NewData}
See section
Both these modes allow other return tuples; see
The two
This can be done, for example, by focusing on one state at the time and for every state ensure that all events are handled. Alternatively, you can focus on one event at the time and ensure that it is handled in every state. You can also use a mix of these strategies.
With
This mode fits well when you have a regular state diagram, like the ones in this chapter, which describes all events and actions belonging to a state visually around that state, and each state has its unique name.
With
This mode works equally well when you want to focus on
one event at the time or on
one state at the time, but function
The mode enables the use of non-atom states, for example,
complex states or even hierarchical states.
If, for example, a state diagram is largely alike
for the client side and the server side of a protocol,
you can have a state
The
StateName(enter, _OldState, Data) -> ... code for state entry actions here ... {keep_state, NewData}; StateName(EventType, EventContent, Data) -> ... code for actions here ... {next_state, NewStateName, NewData}.
Depending on how your state machine is specified,
this can be a very useful feature,
but it forces you to handle the state enter calls in all states.
See also the
In the first section
There are more specific state-transition actions
that a callback function can order the
For details, see the
Events are categorized in different
The following is a complete list of event types and where they come from:
A door with a code lock can be seen as a state machine. Initially, the door is locked. When someone presses a button, an event is generated. Depending on what buttons have been pressed before, the sequence so far can be correct, incomplete, or wrong. If correct, the door is unlocked for 10 seconds (10,000 milliseconds). If incomplete, we wait for another button to be pressed. If wrong, we start all over, waiting for a new button sequence.
This code lock state machine can be implemented using
gen_statem:start_link({local,?NAME}, ?MODULE, Code, []).
button(Digit) ->
gen_statem:cast(?NAME, {button,Digit}).
init(Code) ->
do_lock(),
Data = #{code => Code, remaining => Code},
{ok, locked, Data}.
callback_mode() ->
state_functions.
locked(
cast, {button,Digit},
#{code := Code, remaining := Remaining} = Data) ->
case Remaining of
[Digit] ->
do_unlock(),
{next_state, open, Data#{remaining := Code},
[{state_timeout,10000,lock}]};
[Digit|Rest] -> % Incomplete
{next_state, locked, Data#{remaining := Rest}};
_Wrong ->
{next_state, locked, Data#{remaining := Code}}
end.
open(state_timeout, lock, Data) ->
do_lock(),
{next_state, locked, Data};
open(cast, {button,_}, Data) ->
{next_state, open, Data}.
do_lock() ->
io:format("Lock~n", []).
do_unlock() ->
io:format("Unlock~n", []).
terminate(_Reason, State, _Data) ->
State =/= locked andalso do_lock(),
ok.
code_change(_Vsn, State, Data, _Extra) ->
{ok, State, Data}.
]]>
The code is explained in the next sections.
In the example in the previous section,
gen_statem:start_link({local,?NAME}, ?MODULE, Code, []).
]]>
The first argument,
If the name is omitted, the
The second argument,
The interface functions (
The third argument,
The fourth argument,
If name registration succeeds, the new
do_lock(),
Data = #{code => Code, remaining => Code},
{ok,locked,Data}.
]]>
Function
Function
state_functions.
]]>
Function
The function notifying the code lock about a button event is
implemented using
gen_statem:cast(?NAME, {button,Digit}).
]]>
The first argument is the name of the
The event is made into a message and sent to the
case Remaining of
[Digit] -> % Complete
do_unlock(),
{next_state, open, Data#{remaining := Code},
[{state_timeout,10000,lock}]};
[Digit|Rest] -> % Incomplete
{next_state, locked, Data#{remaining := Rest}};
[_|_] -> % Wrong
{next_state, locked, Data#{remaining := Code}}
end.
open(state_timeout, lock, Data) ->
do_lock(),
{next_state, locked, Data};
open(cast, {button,_}, Data) ->
{next_state, open, Data}.
]]>
If the door is locked and a button is pressed, the pressed
button is compared with the next correct button.
Depending on the result, the door is either unlocked
and the
If the pressed button is incorrect, the server data restarts from the start of the code sequence.
If the whole code is correct, the server changes states
to
In state
When a correct code has been given, the door is unlocked and
the following tuple is returned from
10,000 is a time-out value in milliseconds.
After this time (10 seconds), a time-out occurs.
Then,
do_lock(),
{next_state, locked, Data};
]]>
The timer for a state time-out is automatically cancelled
when the state machine changes states. You can restart
a state time-out by setting it to a new time, which cancels
the running timer and starts a new. This implies that
you can cancel a state time-out by restarting it with
time
Sometimes events can arrive in any state of the
Consider a
gen_statem:call(?NAME, code_length).
...
locked(...) -> ... ;
locked(EventType, EventContent, Data) ->
handle_event(EventType, EventContent, Data).
...
open(...) -> ... ;
open(EventType, EventContent, Data) ->
handle_event(EventType, EventContent, Data).
handle_event({call,From}, code_length, #{code := Code} = Data) ->
{keep_state, Data, [{reply,From,length(Code)}]}.
]]>
This example uses
If mode
handle_event_function.
handle_event(cast, {button,Digit}, State, #{code := Code} = Data) ->
case State of
locked ->
case maps:get(remaining, Data) of
[Digit] -> % Complete
do_unlock(),
{next_state, open, Data#{remaining := Code},
[{state_timeout,10000,lock}]};
[Digit|Rest] -> % Incomplete
{keep_state, Data#{remaining := Rest}};
[_|_] -> % Wrong
{keep_state, Data#{remaining := Code}}
end;
open ->
keep_state_and_data
end;
handle_event(state_timeout, lock, open, Data) ->
do_lock(),
{next_state, locked, Data}.
...
]]>
If the
If it is necessary to clean up before termination, the shutdown
strategy must be a time-out value and the
process_flag(trap_exit, true),
do_lock(),
...
]]>
When ordered to shut down, the
In this example, function
State =/= locked andalso do_lock(),
ok.
]]>
If the
gen_statem:stop(?NAME).
]]>
This makes the
A time-out feature inherited from
It is ordered by the state transition action
This type of time-out is useful to for example act on inactivity. Let us restart the code sequence if no button is pressed for say 30 seconds:
{next_state, locked, Data#{remaining := Code}};
locked(
cast, {button,Digit},
#{code := Code, remaining := Remaining} = Data) ->
...
[Digit|Rest] -> % Incomplete
{next_state, locked, Data#{remaining := Rest}, 30000};
...
]]>
Whenever we receive a button event we start an event time-out
of 30 seconds, and if we get an event type
An event time-out is cancelled by any other event so you either get some other event or the time-out event. It is therefore not possible nor needed to cancel or restart an event time-out. Whatever event you act on has already cancelled the event time-out...
The previous example of state time-outs only work if the state machine stays in the same state during the time-out time. And event time-outs only work if no disturbing unrelated events occur.
You may want to start a timer in one state and respond
to the time-out in another, maybe cancel the time-out
without changing states, or perhaps run multiple
time-outs in parallel. All this can be accomplished with
Here is how to accomplish the state time-out
in the previous example by instead using a generic time-out
named
case Remaining of
[Digit] ->
do_unlock(),
{next_state, open, Data#{remaining := Code},
[{{timeout,open_tm},10000,lock}]};
...
open({timeout,open_tm}, lock, Data) ->
do_lock(),
{next_state,locked,Data};
open(cast, {button,_}, Data) ->
{keep_state,Data};
...
]]>
Just as
Another way to handle a late time-out can be to not cancel it, but to ignore it if it arrives in a state where it is known to be late.
The most versatile way to handle time-outs is to use
Erlang Timers; see
Here is how to accomplish the state time-out in the previous example by instead using an Erlang Timer:
case Remaining of
[Digit] ->
do_unlock(),
Tref = erlang:start_timer(10000, self(), lock),
{next_state, open, Data#{remaining := Code, timer => Tref}};
...
open(info, {timeout,Tref,lock}, #{timer := Tref} = Data) ->
do_lock(),
{next_state,locked,maps:remove(timer, Data)};
open(cast, {button,_}, Data) ->
{keep_state,Data};
...
]]>
Removing the
If you need to cancel a timer because of some other event, you can use
Another way to handle a late time-out can be to not cancel it, but to ignore it if it arrives in a state where it is known to be late.
If you want to ignore a particular event in the current state
and handle it in a future state, you can postpone the event.
A postponed event is retried after the state has
changed, that is,
Postponing is ordered by the state transition
In this example, instead of ignoring button events
while in the
{keep_state,Data,[postpone]};
...
]]>
Since a postponed event is only retried after a state change,
you have to think about where to keep a state data item.
You can keep it in the server
This is not important if you do not postpone events. But if you later decide to start postponing some events, then the design flaw of not having separate states when they should be, might become a hard to find bug.
It is not uncommon that a state diagram does not specify how to handle events that are not illustrated in a particular state in the diagram. Hopefully this is described in an associated text or from the context.
Possible actions: ignore as in drop the event (maybe log it) or deal with the event in some other state as in postpone it.
Erlang's selective receive statement is often used to describe simple state machine examples in straightforward Erlang code. The following is a possible implementation of the first example:
spawn(
fun () ->
true = register(?NAME, self()),
do_lock(),
locked(Code, Code)
end).
button(Digit) ->
?NAME ! {button,Digit}.
locked(Code, [Digit|Remaining]) ->
receive
{button,Digit} when Remaining =:= [] ->
do_unlock(),
open(Code);
{button,Digit} ->
locked(Code, Remaining);
{button,_} ->
locked(Code, Code)
end.
open(Code) ->
receive
after 10000 ->
do_lock(),
locked(Code, Code)
end.
do_lock() ->
io:format("Locked~n", []).
do_unlock() ->
io:format("Open~n", []).
]]>
The selective receive in this case causes implicitly
A selective receive cannot be used from a
The state transition
Both mechanisms have the same theoretical time and memory complexity, while the selective receive language construct has smaller constant factors.
Say you have a state machine specification
that uses state entry actions.
Allthough you can code this using self-generated events
(described in the next section), especially if just
one or a few states has got state entry actions,
this is a perfect use case for the built in
You return a list containing
process_flag(trap_exit, true),
Data = #{code => Code},
{ok, locked, Data}.
callback_mode() ->
[state_functions,state_enter].
locked(enter, _OldState, Data) ->
do_lock(),
{keep_state,Data#{remaining => Code}};
locked(
cast, {button,Digit},
#{code := Code, remaining := Remaining} = Data) ->
case Remaining of
[Digit] ->
{next_state, open, Data};
...
open(enter, _OldState, _Data) ->
do_unlock(),
{keep_state_and_data, [{state_timeout,10000,lock}]};
open(state_timeout, lock, Data) ->
{next_state, locked, Data};
...
]]>
You can repeat the state entry code by returning one of
It can sometimes be beneficial to be able to generate events
to your own state machine.
This can be done with the state transition
You can generate events of any existing
One example for this is to pre-process incoming data, for example decrypting chunks or collecting characters up to a line break. Purists may argue that this should be modelled with a separate state machine that sends pre-processed events to the main state machine. But to decrease overhead the small pre-processing state machine can be implemented in the common state event handling of the main state machine using a few state data variables that then sends the pre-processed events as internal events to the main state machine.
The following example uses an input model where you give the lock
characters with
gen_statem:call(?NAME, {chars,Chars}).
enter() ->
gen_statem:call(?NAME, enter).
...
locked(enter, _OldState, Data) ->
do_lock(),
{keep_state,Data#{remaining => Code, buf => []}};
...
handle_event({call,From}, {chars,Chars}, #{buf := Buf} = Data) ->
{keep_state, Data#{buf := [Chars|Buf],
[{reply,From,ok}]};
handle_event({call,From}, enter, #{buf := Buf} = Data) ->
Chars = unicode:characters_to_binary(lists:reverse(Buf)),
try binary_to_integer(Chars) of
Digit ->
{keep_state, Data#{buf := []},
[{reply,From,ok},
{next_event,internal,{button,Chars}}]}
catch
error:badarg ->
{keep_state, Data#{buf := []},
[{reply,From,{error,not_an_integer}}]}
end;
...
]]>
If you start this program with
This section includes the example after most of the mentioned modifications and some more using state enter calls, which deserves a new state diagram:
Notice that this state diagram does not specify how to handle
a button event in the state
Using state functions:
gen_statem:start_link({local,?NAME}, ?MODULE, Code, []).
stop() ->
gen_statem:stop(?NAME).
button(Digit) ->
gen_statem:cast(?NAME, {button,Digit}).
code_length() ->
gen_statem:call(?NAME, code_length).
init(Code) ->
process_flag(trap_exit, true),
Data = #{code => Code},
{ok, locked, Data}.
callback_mode() ->
[state_functions,state_enter].
locked(enter, _OldState, #{code := Code} = Data) ->
do_lock(),
{keep_state, Data#{remaining => Code}};
locked(
timeout, _,
#{code := Code, remaining := Remaining} = Data) ->
{keep_state, Data#{remaining := Code}};
locked(
cast, {button,Digit},
#{code := Code, remaining := Remaining} = Data) ->
case Remaining of
[Digit] -> % Complete
{next_state, open, Data};
[Digit|Rest] -> % Incomplete
{keep_state, Data#{remaining := Rest}, 30000};
[_|_] -> % Wrong
{keep_state, Data#{remaining := Code}}
end;
locked(EventType, EventContent, Data) ->
handle_event(EventType, EventContent, Data).
open(enter, _OldState, _Data) ->
do_unlock(),
{keep_state_and_data, [{state_timeout,10000,lock}]};
open(state_timeout, lock, Data) ->
{next_state, locked, Data};
open(cast, {button,_}, _) ->
{keep_state_and_data, [postpone]};
open(EventType, EventContent, Data) ->
handle_event(EventType, EventContent, Data).
handle_event({call,From}, code_length, #{code := Code}) ->
{keep_state_and_data, [{reply,From,length(Code)}]}.
do_lock() ->
io:format("Locked~n", []).
do_unlock() ->
io:format("Open~n", []).
terminate(_Reason, State, _Data) ->
State =/= locked andalso do_lock(),
ok.
code_change(_Vsn, State, Data, _Extra) ->
{ok,State,Data}.
]]>
This section describes what to change in the example
to use one
[handle_event_function,state_enter].
%% State: locked
handle_event(
enter, _OldState, locked,
#{code := Code} = Data) ->
do_lock(),
{keep_state, Data#{remaining => Code}};
handle_event(
timeout, _, locked,
#{code := Code, remaining := Remaining} = Data) ->
{keep_state, Data#{remaining := Code}};
handle_event(
cast, {button,Digit}, locked,
#{code := Code, remaining := Remaining} = Data) ->
case Remaining of
[Digit] -> % Complete
{next_state, open, Data};
[Digit|Rest] -> % Incomplete
{keep_state, Data#{remaining := Rest}, 30000};
[_|_] -> % Wrong
{keep_state, Data#{remaining := Code}}
end;
%%
%% State: open
handle_event(enter, _OldState, open, _Data) ->
do_unlock(),
{keep_state_and_data, [{state_timeout,10000,lock}]};
handle_event(state_timeout, lock, open, Data) ->
{next_state, locked, Data};
handle_event(cast, {button,_}, open, _) ->
{keep_state_and_data,[postpone]};
%%
%% Any state
handle_event({call,From}, code_length, _State, #{code := Code}) ->
{keep_state_and_data, [{reply,From,length(Code)}]}.
...
]]>
Notice that postponing buttons from the
The example servers so far in this chapter print the full internal state in the error log, for example, when killed by an exit signal or because of an internal error. This state contains both the code lock code and which digits that remain to unlock.
This state data can be regarded as sensitive, and maybe not what you want in the error log because of some unpredictable event.
Another reason to filter the state can be that the state is too large to print, as it fills the error log with uninteresting details.
To avoid this, you can format the internal state
that gets in the error log and gets returned from
StateData =
{State,
maps:filter(
fun (code, _) -> false;
(remaining, _) -> false;
(_, _) -> true
end,
Data)},
case Opt of
terminate ->
StateData;
normal ->
[{data,[{"State",StateData}]}]
end.
]]>
It is not mandatory to implement a
The callback mode
One reason to use this is when you have a state item
that when changed should cancel the
Suppose now that we call
So we make the
If a process now calls
We define the state as
gen_statem:start_link(
{local,?NAME}, ?MODULE, {Code,LockButton}, []).
stop() ->
gen_statem:stop(?NAME).
button(Digit) ->
gen_statem:call(?NAME, {button,Digit}).
code_length() ->
gen_statem:call(?NAME, code_length).
set_lock_button(LockButton) ->
gen_statem:call(?NAME, {set_lock_button,LockButton}).
init({Code,LockButton}) ->
process_flag(trap_exit, true),
Data = #{code => Code, remaining => undefined},
{ok, {locked,LockButton}, Data}.
callback_mode() ->
[handle_event_function,state_enter].
handle_event(
{call,From}, {set_lock_button,NewLockButton},
{StateName,OldLockButton}, Data) ->
{next_state, {StateName,NewLockButton}, Data,
[{reply,From,OldLockButton}]};
handle_event(
{call,From}, code_length,
{_StateName,_LockButton}, #{code := Code}) ->
{keep_state_and_data,
[{reply,From,length(Code)}]};
%%
%% State: locked
handle_event(
EventType, EventContent,
{locked,LockButton}, #{code := Code, remaining := Remaining} = Data) ->
case {EventType, EventContent} of
{enter, _OldState} ->
do_lock(),
{keep_state, Data#{remaining := Code}};
{timeout, _} ->
{keep_state, Data#{remaining := Code}};
{{call,From}, {button,Digit}} ->
case Remaining of
[Digit] -> % Complete
{next_state, {open,LockButton}, Data,
[{reply,From,ok}]};
[Digit|Rest] -> % Incomplete
{keep_state, Data#{remaining := Rest},
[{reply,From,ok}, 30000]};
[_|_] -> % Wrong
{keep_state, Data#{remaining := Code},
[{reply,From,ok}]}
end
end;
%%
%% State: open
handle_event(
EventType, EventContent,
{open,LockButton}, Data) ->
case {EventType, EventContent} of
{enter, _OldState} ->
do_unlock(),
{keep_state_and_data, [{state_timeout,10000,lock}]};
{state_timeout, lock} ->
{next_state, {locked,LockButton}, Data};
{{call,From}, {button,Digit}} ->
if
Digit =:= LockButton ->
{next_state, {locked,LockButton}, Data,
[{reply,From,locked}]};
true ->
{keep_state_and_data,
[postpone]}
end
end.
do_lock() ->
io:format("Locked~n", []).
do_unlock() ->
io:format("Open~n", []).
terminate(_Reason, State, _Data) ->
State =/= locked andalso do_lock(),
ok.
code_change(_Vsn, State, Data, _Extra) ->
{ok,State,Data}.
format_status(Opt, [_PDict,State,Data]) ->
StateData =
{State,
maps:filter(
fun (code, _) -> false;
(remaining, _) -> false;
(_, _) -> true
end,
Data)},
case Opt of
terminate ->
StateData;
normal ->
[{data,[{"State",StateData}]}]
end.
]]>
It can be an ill-fitting model for a physical code lock
that the
If you have many servers in one node
and they have some state(s) in their lifetime in which
the servers can be expected to idle for a while,
and the amount of heap memory all these servers need
is a problem, then the memory footprint of a server
can be mimimized by hibernating it through
It is rather costly to hibernate a process; see
We can in this example hibernate in the
case {EventType, EventContent} of
{enter, _OldState} ->
do_unlock(),
{keep_state_and_data,
[{state_timeout,10000,lock},hibernate]};
...
]]>
The atom
To change that we would need to insert
action
Another not uncommon scenario is to use the event time-out to triger hibernation after a certain time of inactivity.
This server probably does not use heap memory worth hibernating for. To gain anything from hibernation, your server would have to produce some garbage during callback execution, for which this example server can serve as a bad example.