Copyright 1997-2014 Ericsson AB. All Rights Reserved.

The contents of this file are subject to the Erlang Public License, Version 1.1, (the "License"); you may not use this file except in compliance with the License. You should have received a copy of the Erlang Public License along with this software. If not, it can be retrieved online at http://www.erlang.org/. Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License.

Supervisor Behaviour

This section should be read in conjunction with supervisor(3), where all details about the supervisor behaviour are described.

Supervision Principles

A supervisor is responsible for starting, stopping, and monitoring its child processes. The basic idea of a supervisor is that it shall keep its child processes alive by restarting them when necessary.

Which child processes to start and monitor is specified by a list of child specifications. The child processes are started in the order specified by this list, and terminated in the reversed order.

Example

The callback module for a supervisor starting the server from the gen_server chapter could look like this:

-module(ch_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(ch_sup, []).

init(_Args) ->
    SupFlags = #{strategy => one_for_one, intensity => 1, period => 5},
    ChildSpecs = [#{id => ch3,
                    start => {ch3, start_link, []},
                    restart => permanent,
                    shutdown => brutal_kill,
                    type => worker,
                    modules => [ch3]}],
    {ok, {SupFlags, ChildSpecs}}.

The SupFlags variable in the return value from init/1 represents the supervisor flags.

The ChildSpecs variable in the return value from init/1 is a list of child specifications.

Supervisor Flags

This is the type definition for the supervisor flags:

sup_flags() = #{strategy => strategy(),         % optional
                intensity => non_neg_integer(), % optional
                period => pos_integer()}        % optional
  strategy() = one_for_all
             | one_for_one
             | rest_for_one
             | simple_one_for_one

strategy specifies the restart strategy.

intensity and period specify the maximum restart intensity.

Restart Strategy

The restart strategy is specified by the strategy key in the supervisor flags map returned by the callback function init:

SupFlags = #{strategy => Strategy, ...}

The strategy key is optional in this map. If it is not given, it defaults to one_for_one.

one_for_one

If a child process terminates, only that process is restarted.

Figure: One_For_One Supervision

one_for_all

If a child process terminates, all other child processes are terminated, and then all child processes, including the terminated one, are restarted.

Figure: One_For_All Supervision

rest_for_one

If a child process terminates, the rest of the child processes (that is, the child processes after the terminated process in start order) are terminated. Then the terminated child process and the rest of the child processes are restarted.

simple_one_for_one

See simple-one-for-one supervisors.
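
The three strategies above differ only in which children are terminated and restarted when one child exits. As a rough sketch (the module strategy_sketch and the function affected/3 are hypothetical names, not part of OTP, and the restart type of each child is ignored here), the affected set could be computed from the list of child ids in start order:

```erlang
-module(strategy_sketch).
-export([affected/3]).

%% affected(Strategy, Id, Ids) returns the ids of the children that are
%% terminated and restarted when the child Id exits.
%% Ids is the list of all child ids in start order.
affected(one_for_one, Id, _Ids) ->
    [Id];
affected(one_for_all, _Id, Ids) ->
    Ids;
affected(rest_for_one, Id, Ids) ->
    %% Keep Id and every child started after it.
    lists:dropwhile(fun(X) -> X =/= Id end, Ids).
```

For example, with children [a, b, c] started in that order, a crash in b affects only b under one_for_one, all three under one_for_all, and b and c under rest_for_one.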

Maximum Restart Intensity

The supervisors have a built-in mechanism to limit the number of restarts which can occur in a given time interval. This is specified by the two keys intensity and period in the supervisor flags map returned by the callback function init:

SupFlags = #{intensity => MaxR, period => MaxT, ...}

If more than MaxR restarts occur within the last MaxT seconds, the supervisor terminates all the child processes and then itself.

When the supervisor terminates, the next higher level supervisor takes some action. It either restarts the terminated supervisor or terminates itself.

The intention of the restart mechanism is to prevent a situation where a process repeatedly dies for the same reason, only to be restarted again.

The keys intensity and period are optional in the supervisor flags map. If they are not given, they default to 1 and 5, respectively.
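
The rule can be illustrated with a sliding window over restart timestamps. This is only a sketch of the idea, not the supervisor's actual implementation; the module intensity_sketch and the function add_restart/4 are hypothetical, and timestamps are assumed to be plain integers in seconds:

```erlang
-module(intensity_sketch).
-export([add_restart/4]).

%% Restarts is a list of timestamps (in seconds) of earlier restarts.
%% Returns {ok, Recent} if the new restart is allowed, or
%% {shutdown, Recent} if more than MaxR restarts fall within MaxT seconds.
add_restart(Now, Restarts, MaxR, MaxT) ->
    %% Record the new restart and forget those older than MaxT seconds.
    Recent = [T || T <- [Now | Restarts], Now - T =< MaxT],
    case length(Recent) > MaxR of
        true  -> {shutdown, Recent};
        false -> {ok, Recent}
    end.
```

With the default values (intensity 1, period 5), a single restart is tolerated, a second restart within 5 seconds makes the supervisor give up, and a second restart 10 seconds later is tolerated again.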

Child Specification

This is the type definition for a child specification:

child_spec() = #{id => child_id(),       % mandatory
                 start => mfargs(),      % mandatory
                 restart => restart(),   % optional
                 shutdown => shutdown(), % optional
                 type => worker(),       % optional
                 modules => modules()}   % optional
  child_id() = term()
  mfargs() = {M :: module(), F :: atom(), A :: [term()]}
  modules() = [module()] | dynamic
  restart() = permanent | transient | temporary
  shutdown() = brutal_kill | timeout()
  worker() = worker | supervisor

id is used to identify the child specification internally by the supervisor.

The id key is mandatory.

Note that this identifier has on occasion been called "name". As far as possible, the terms "identifier" or "id" are now used, but to keep backwards compatibility, some occurrences of "name" can still be found, for example in error messages.

start defines the function call used to start the child process. It is a module-function-arguments tuple used as apply(M, F, A).

It should be (or result in) a call to supervisor:start_link, gen_server:start_link, gen_fsm:start_link, or gen_event:start_link, or to a function compliant with these functions; see supervisor(3) for details.

The start key is mandatory.

restart defines when a terminated child process shall be restarted.

A permanent child process is always restarted. A temporary child process is never restarted (not even when the supervisor's restart strategy is rest_for_one or one_for_all and a sibling's death causes the temporary process to be terminated). A transient child process is restarted only if it terminates abnormally, that is, with an exit reason other than normal, shutdown, or {shutdown,Term}.

The restart key is optional. If it is not given, the default value permanent will be used.
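
The three restart types amount to a small decision table over the exit reason. A minimal sketch, assuming a hypothetical module restart_sketch with a function should_restart/2 (neither is part of OTP):

```erlang
-module(restart_sketch).
-export([should_restart/2]).

%% should_restart(RestartType, ExitReason) -> boolean()
should_restart(permanent, _Reason) -> true;
should_restart(temporary, _Reason) -> false;
%% A transient child is restarted only after an abnormal exit.
should_restart(transient, normal) -> false;
should_restart(transient, shutdown) -> false;
should_restart(transient, {shutdown, _Term}) -> false;
should_restart(transient, _Reason) -> true.
```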

shutdown defines how a child process shall be terminated.

brutal_kill means that the child process is unconditionally terminated using exit(Child, kill). An integer timeout value (in milliseconds) means that the supervisor tells the child process to terminate by calling exit(Child, shutdown) and then waits for an exit signal back. If no exit signal is received within the specified time, the child process is unconditionally terminated using exit(Child, kill). If the child process is another supervisor, the shutdown value should be set to infinity to give the subtree enough time to shut down. Setting it to infinity is also allowed if the child process is a worker.

Be careful when setting the shutdown time to infinity when the child process is a worker. Because, in this situation, the termination of the supervision tree depends on the child process, it must be implemented in a safe way and its cleanup procedure must always return.

The shutdown key is optional. If it is not given, and the child is of type worker, the default value 5000 will be used; if the child is of type supervisor, the default value infinity will be used.
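
The shutdown procedure described above can be sketched as follows. This is a simplified illustration, not the supervisor's own code: it uses a monitor instead of the link that a real supervisor has to its child, and the module shutdown_sketch and the function stop/2 are hypothetical names.

```erlang
-module(shutdown_sketch).
-export([stop/2]).

%% Shutdown is brutal_kill, a timeout in milliseconds, or infinity.
%% Returns the exit reason reported for the child.
stop(Pid, brutal_kill) ->
    Ref = erlang:monitor(process, Pid),
    exit(Pid, kill),
    wait_down(Ref, Pid);
stop(Pid, Timeout) ->
    Ref = erlang:monitor(process, Pid),
    %% Ask the child to terminate first.
    exit(Pid, shutdown),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    after Timeout ->
        %% No exit within the allowed time: terminate unconditionally.
        exit(Pid, kill),
        wait_down(Ref, Pid)
    end.

wait_down(Ref, Pid) ->
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    end.
```

A child that does not trap exits dies at once with reason shutdown. A child that traps exits gets the chance to clean up, and is killed only if it has not exited when the timeout expires.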

type specifies if the child process is a supervisor or a worker.

The type key is optional. If it is not given, the default value worker will be used.

modules should be a list with one element [Module], where Module is the name of the callback module, if the child process is a supervisor, gen_server or gen_fsm. If the child process is a gen_event, the value shall be dynamic.

This information is used by the release handler during upgrades and downgrades, see Release Handling.

The modules key is optional. If it is not given, it defaults to [M], where M comes from the child's start {M,F,A}.
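
The default values described above can be summarised in a few lines. The following is a sketch only; the module child_spec_defaults and the function fill/1 are hypothetical and simply mirror the defaults stated in this section:

```erlang
-module(child_spec_defaults).
-export([fill/1]).

%% fill(Spec) returns Spec with every optional key present.
%% The mandatory keys id and start must already be given.
fill(#{id := _, start := {M, _F, _A}} = Spec) ->
    Type = maps:get(type, Spec, worker),
    Shutdown = case Type of
                   worker -> 5000;
                   supervisor -> infinity
               end,
    Defaults = #{restart => permanent,
                 shutdown => Shutdown,
                 type => Type,
                 modules => [M]},
    %% Keys given in Spec take precedence over the defaults.
    maps:merge(Defaults, Spec).
```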

Example: The child specification to start the server ch3 in the example above looks like:

#{id => ch3, start => {ch3, start_link, []}, restart => permanent, shutdown => brutal_kill, type => worker, modules => [ch3]}

or simplified, relying on the default values:

#{id => ch3, start => {ch3, start_link, []}, shutdown => brutal_kill}

Example: A child specification to start the event manager from the chapter about gen_event:

#{id => error_man, start => {gen_event, start_link, [{local, error_man}]}, modules => dynamic}

Both server and event manager are registered processes which can be expected to be accessible at all times, thus they are specified to be permanent.

ch3 does not need to do any cleaning up before termination, so no shutdown time is needed and brutal_kill is sufficient. error_man may need some time for the event handlers to clean up, so the shutdown time is set to 5000 ms (which is the default value).

Example: A child specification to start another supervisor:

#{id => sup, start => {sup, start_link, []}, restart => transient, type => supervisor} % will cause default shutdown=>infinity

Starting a Supervisor

In the example above, the supervisor is started by calling ch_sup:start_link():

start_link() -> supervisor:start_link(ch_sup, []).

ch_sup:start_link calls the function supervisor:start_link/2. This function spawns and links to a new process, a supervisor.

The first argument, ch_sup, is the name of the callback module, that is, the module where the init callback function is located. The second argument, [], is a term which is passed as-is to the callback function init. Here, init does not need any input data and ignores the argument.

In this case, the supervisor is not registered. Instead its pid must be used. A name can be specified by calling supervisor:start_link({local, Name}, Module, Args) or supervisor:start_link({global, Name}, Module, Args).
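
As an illustration of the registered variant, the following minimal sketch starts a locally registered supervisor without any children. The module name named_sup is hypothetical; ?MODULE expands to that name, so the supervisor is registered under the name of its callback module:

```erlang
-module(named_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

%% Register the supervisor locally under the name named_sup.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Default supervisor flags and no static children.
init(_Args) ->
    {ok, {#{}, []}}.
```

After named_sup:start_link() has returned {ok, Pid}, the supervisor can be addressed by name, for example supervisor:which_children(named_sup), and whereis(named_sup) returns Pid.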

The new supervisor process calls the callback function ch_sup:init([]). init shall return {ok, {SupFlags, ChildSpecs}}:

init(_Args) ->
    SupFlags = #{},
    ChildSpecs = [#{id => ch3,
                    start => {ch3, start_link, []},
                    shutdown => brutal_kill}],
    {ok, {SupFlags, ChildSpecs}}.

The supervisor then starts all its child processes according to the given child specifications. In this case, there is one child process, ch3.

Note that supervisor:start_link is synchronous. It does not return until all child processes have been started.

Adding a Child Process

In addition to the static supervision tree, we can also add dynamic child processes to an existing supervisor with the following call:

supervisor:start_child(Sup, ChildSpec)

Sup is the pid, or name, of the supervisor. ChildSpec is a child specification.

Child processes added using start_child/2 behave in the same manner as the other child processes, with the following important exception: If a supervisor dies and is re-created, then all child processes which were dynamically added to the supervisor will be lost.

Stopping a Child Process

Any child process, static or dynamic, can be stopped in accordance with the shutdown specification:

supervisor:terminate_child(Sup, Id)

The child specification for a stopped child process is deleted with the following call:

supervisor:delete_child(Sup, Id)

Sup is the pid, or name, of the supervisor. Id is the value associated with the id key in the child specification.

As with dynamically added child processes, the effects of deleting a static child process are lost if the supervisor itself restarts.
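
The life cycle of a dynamically added child, from start_child/2 through terminate_child/2 to delete_child/2, can be tried out with a small self-contained module. Everything named here (the module dyn_demo, the worker function start_worker/0, and the child id w1) is hypothetical and only serves the illustration:

```erlang
-module(dyn_demo).
-behaviour(supervisor).

-export([start_link/0, init/1, start_worker/0, run/0]).

start_link() ->
    supervisor:start_link(?MODULE, []).

%% No static children; all children are added with start_child/2.
init(_Args) ->
    {ok, {#{}, []}}.

%% A trivial worker: a linked process that waits for a stop message.
start_worker() ->
    Pid = spawn_link(fun() -> receive stop -> ok end end),
    {ok, Pid}.

run() ->
    {ok, Sup} = start_link(),
    Spec = #{id => w1, start => {?MODULE, start_worker, []}},
    {ok, Pid} = supervisor:start_child(Sup, Spec),
    true = is_process_alive(Pid),
    %% Stop the child; its specification stays with the supervisor.
    ok = supervisor:terminate_child(Sup, w1),
    [{w1, undefined, worker, _Modules}] = supervisor:which_children(Sup),
    %% Remove the specification of the stopped child.
    ok = supervisor:delete_child(Sup, w1),
    [] = supervisor:which_children(Sup),
    ok.
```

Note that delete_child/2 is called only after the child has been stopped.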

Simple-One-For-One Supervisors

A supervisor with restart strategy simple_one_for_one is a simplified one_for_one supervisor, where all child processes are dynamically added instances of the same child specification.

Example of a callback module for a simple_one_for_one supervisor:

-module(simple_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(simple_sup, []).

init(_Args) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 0,
                 period => 1},
    ChildSpecs = [#{id => call,
                    start => {call, start_link, []},
                    shutdown => brutal_kill}],
    {ok, {SupFlags, ChildSpecs}}.

When started, the supervisor will not start any child processes. Instead, all child processes are added dynamically by calling:

supervisor:start_child(Sup, List)

Sup is the pid, or name, of the supervisor. List is an arbitrary list of terms which will be added to the list of arguments specified in the child specification. If the start function is specified as {M, F, A}, the child process is started by calling apply(M, F, A++List).

For example, adding a child to simple_sup above:

supervisor:start_child(Pid, [id1])

results in the child process being started by calling apply(call, start_link, []++[id1]), or actually:

call:start_link(id1)

A child under a simple_one_for_one supervisor can be terminated with

supervisor:terminate_child(Sup, Pid)

where Sup is the pid, or name, of the supervisor and Pid is the pid of the child.
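
To see how the argument lists are combined, the following self-contained sketch uses a start function with one fixed and one per-child argument. The module sofo_demo, the function start_worker/2, and the atoms base, one, and two are hypothetical and chosen only for illustration:

```erlang
-module(sofo_demo).
-behaviour(supervisor).

-export([start_link/0, init/1, start_worker/2, run/0]).

start_link() ->
    supervisor:start_link(?MODULE, []).

init(_Args) ->
    SupFlags = #{strategy => simple_one_for_one},
    ChildSpecs = [#{id => worker,
                    start => {?MODULE, start_worker, [base]},
                    shutdown => brutal_kill}],
    {ok, {SupFlags, ChildSpecs}}.

%% Called as apply(sofo_demo, start_worker, [base] ++ [Extra]).
start_worker(base, Extra) ->
    Pid = spawn_link(fun() -> receive stop -> Extra end end),
    {ok, Pid}.

run() ->
    {ok, Sup} = start_link(),
    %% Each call appends its list to the arguments in the child spec.
    {ok, C1} = supervisor:start_child(Sup, [one]),
    {ok, C2} = supervisor:start_child(Sup, [two]),
    %% Children are identified by pid, not by id.
    ok = supervisor:terminate_child(Sup, C1),
    false = is_process_alive(C1),
    true = is_process_alive(C2),
    ok.
```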

Because a simple_one_for_one supervisor could have many children, it shuts them all down asynchronously. This means that the children will do their cleanup in parallel and therefore the order in which they are stopped is not defined.

Stopping

Since the supervisor is part of a supervision tree, it will automatically be terminated by its supervisor. When asked to shut down, it will terminate all child processes in reversed start order according to the respective shutdown specifications, and then terminate itself.