1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
|
<?xml version="1.0" encoding="latin1" ?>
<!DOCTYPE chapter SYSTEM "chapter.dtd">
<chapter>
<header>
<copyright>
<year>1997</year><year>2011</year>
<holder>Ericsson AB. All Rights Reserved.</holder>
</copyright>
<legalnotice>
The contents of this file are subject to the Erlang Public License,
Version 1.1, (the "License"); you may not use this file except in
compliance with the License. You should have received a copy of the
Erlang Public License along with this software. If not, it can be
retrieved online at http://www.erlang.org/.
Software distributed under the License is distributed on an "AS IS"
basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
the License for the specific language governing rights and limitations
under the License.
</legalnotice>
<title>Supervisor Behaviour</title>
<prepared></prepared>
<docno></docno>
<date></date>
<rev></rev>
<file>sup_princ.xml</file>
</header>
<p>This section should be read in conjunction with
<c>supervisor(3)</c>, where all details about the supervisor
behaviour is given.</p>
<section>
<title>Supervision Principles</title>
<p>A supervisor is responsible for starting, stopping and
monitoring its child processes. The basic idea of a supervisor is
that it should keep its child processes alive by restarting them
when necessary.</p>
<p>Which child processes to start and monitor is specified by a
list of <seealso marker="#spec">child specifications</seealso>.
The child processes are started in the order specified by this
list, and terminated in the reversed order.</p>
</section>
<section>
<title>Example</title>
<p>The callback module for a supervisor starting the server from
the <seealso marker="gen_server_concepts#ex">gen_server chapter</seealso>
could look like this:</p>
<marker id="ex"></marker>
<code type="none">
-module(ch_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1]).
start_link() ->
supervisor:start_link(ch_sup, []).
init(_Args) ->
{ok, {{one_for_one, 1, 60},
[{ch3, {ch3, start_link, []},
permanent, brutal_kill, worker, [ch3]}]}}.</code>
<p><c>one_for_one</c> is the <seealso marker="#strategy">restart strategy</seealso>.</p>
<p>1 and 60 defines the <seealso marker="#frequency">maximum restart frequency</seealso>.</p>
<p>The tuple <c>{ch3, ...}</c> is a <seealso marker="#spec">child specification</seealso>.</p>
</section>
<section>
<marker id="strategy"></marker>
<title>Restart Strategy</title>
<section>
<title>one_for_one</title>
<p>If a child process terminates, only that process is restarted.</p>
<marker id="sup4"></marker>
<image file="../design_principles/sup4.gif">
<icaption>One_For_One Supervision</icaption>
</image>
</section>
<section>
<title>one_for_all</title>
<p>If a child process terminates, all other child processes are
terminated and then all child processes, including
the terminated one, are restarted.</p>
<marker id="sup5"></marker>
<image file="../design_principles/sup5.gif">
<icaption>One_For_All Supervision</icaption>
</image>
</section>
<section>
<title>rest_for_one</title>
<p>If a child process terminates, the 'rest' of the child
processes -- i.e. the child processes after the terminated
process in start order -- are terminated. Then the terminated
child process and the rest of the child processes are restarted.</p>
</section>
</section>
<section>
<marker id="frequency"></marker>
<title>Maximum Restart Frequency</title>
<p>The supervisors have a built-in mechanism to limit the number of
restarts which can occur in a given time interval. This is
determined by the values of the two parameters <c>MaxR</c> and
<c>MaxT</c> in the start specification returned by the callback
function <c>init</c>:</p>
<code type="none">
init(...) ->
{ok, {{RestartStrategy, MaxR, MaxT},
[ChildSpec, ...]}}.</code>
<p>If more than <c>MaxR</c> number of restarts occur in the last
<c>MaxT</c> seconds, then the supervisor terminates all the child
processes and then itself.</p>
<p>When the supervisor terminates, then the next higher level
supervisor takes some action. It either restarts the terminated
supervisor, or terminates itself.</p>
<p>The intention of the restart mechanism is to prevent a situation
where a process repeatedly dies for the same reason, only to be
restarted again.</p>
</section>
<section>
<marker id="spec"></marker>
<title>Child Specification</title>
<p>This is the type definition for a child specification:</p>
<code type="none"><![CDATA[
{Id, StartFunc, Restart, Shutdown, Type, Modules}
Id = term()
StartFunc = {M, F, A}
M = F = atom()
A = [term()]
Restart = permanent | transient | temporary
Shutdown = brutal_kill | integer() >=0 | infinity
Type = worker | supervisor
Modules = [Module] | dynamic
Module = atom()]]></code>
<list type="bulleted">
<item>
<p><c>Id</c> is a name that is used to identify the child
specification internally by the supervisor.</p>
</item>
<item>
<p><c>StartFunc</c> defines the function call used to start
the child process. It is a module-function-arguments tuple
used as <c>apply(M, F, A)</c>.</p>
<p>It should be (or result in) a call to
<c>supervisor:start_link</c>, <c>gen_server:start_link</c>,
<c>gen_fsm:start_link</c> or <c>gen_event:start_link</c>.
(Or a function compliant with these functions, see
<c>supervisor(3)</c> for details.</p>
</item>
<item>
<p><c>Restart</c> defines when a terminated child process should
be restarted.</p>
<list type="bulleted">
<item>A <c>permanent</c> child process is always restarted.</item>
<item>A <c>temporary</c> child process is never restarted.</item>
<item>A <c>transient</c> child process is restarted only if it
terminates abnormally, i.e. with another exit reason than
<c>normal</c>.</item>
</list>
</item>
<item>
<marker id="shutdown"></marker>
<p><c>Shutdown</c> defines how a child process should be
terminated.</p>
<list type="bulleted">
<item><c>brutal_kill</c> means the child process is
unconditionally terminated using <c>exit(Child, kill)</c>.</item>
<item>An integer timeout value means that the supervisor tells
the child process to terminate by calling
<c>exit(Child, shutdown)</c> and then waits for an exit
signal back. If no exit signal is received within
the specified time, the child process is unconditionally
terminated using <c>exit(Child, kill)</c>.</item>
<item>If the child process is another supervisor, it should be
set to <c>infinity</c> to give the subtree enough time to
shutdown.</item>
</list>
</item>
<item>
<p><c>Type</c> specifies if the child process is a supervisor or
a worker.</p>
</item>
<item>
<p><c>Modules</c> should be a list with one element
<c>[Module]</c>, where <c>Module</c> is the name of
the callback module, if the child process is a supervisor,
gen_server or gen_fsm. If the child process is a gen_event,
<c>Modules</c> should be <c>dynamic</c>.</p>
<p>This information is used by the release handler during
upgrades and downgrades, see
<seealso marker="release_handling">Release Handling</seealso>.</p>
</item>
</list>
<p>Example: The child specification to start the server <c>ch3</c>
in the example above looks like:</p>
<code type="none">
{ch3,
{ch3, start_link, []},
permanent, brutal_kill, worker, [ch3]}</code>
<p>Example: A child specification to start the event manager from
the chapter about
<seealso marker="events#mgr">gen_event</seealso>:</p>
<code type="none">
{error_man,
{gen_event, start_link, [{local, error_man}]},
permanent, 5000, worker, dynamic}</code>
<p>Both the server and event manager are registered processes which
can be expected to be accessible at all times, thus they are
specified to be <c>permanent</c>.</p>
<p><c>ch3</c> does not need to do any cleaning up before
termination, thus no shutdown time is needed but
<c>brutal_kill</c> should be sufficient. <c>error_man</c> may
need some time for the event handlers to clean up, thus
<c>Shutdown</c> is set to 5000 ms.</p>
<p>Example: A child specification to start another supervisor:</p>
<code type="none">
{sup,
{sup, start_link, []},
transient, infinity, supervisor, [sup]}</code>
</section>
<section>
<marker id="super_tree"></marker>
<title>Starting a Supervisor</title>
<p>In the example above, the supervisor is started by calling
<c>ch_sup:start_link()</c>:</p>
<code type="none">
start_link() ->
supervisor:start_link(ch_sup, []).</code>
<p><c>ch_sup:start_link</c> calls the function
<c>supervisor:start_link/2</c>. This function spawns and links to
a new process, a supervisor.</p>
<list type="bulleted">
<item>The first argument, <c>ch_sup</c>, is the name of
the callback module, that is the module where the <c>init</c>
callback function is located.</item>
<item>The second argument, [], is a term which is passed as-is to
the callback function <c>init</c>. Here, <c>init</c> does not
need any indata and ignores the argument.</item>
</list>
<p>In this case, the supervisor is not registered. Instead its pid
must be used. A name can be specified by calling
<c>supervisor:start_link({local, Name}, Module, Args)</c> or
<c>supervisor:start_link({global, Name}, Module, Args)</c>.</p>
<p>The new supervisor process calls the callback function
<c>ch_sup:init([])</c>. <c>init</c> is expected to return
<c>{ok, StartSpec}</c>:</p>
<code type="none">
init(_Args) ->
{ok, {{one_for_one, 1, 60},
[{ch3, {ch3, start_link, []},
permanent, brutal_kill, worker, [ch3]}]}}.</code>
<p>The supervisor then starts all its child processes according to
the child specifications in the start specification. In this case
there is one child process, <c>ch3</c>.</p>
<p>Note that <c>supervisor:start_link</c> is synchronous. It does
not return until all child processes have been started.</p>
</section>
<section>
<title>Adding a Child Process</title>
<p>In addition to the static supervision tree, we can also add
dynamic child processes to an existing supervisor with
the following call:</p>
<code type="none">
supervisor:start_child(Sup, ChildSpec)</code>
<p><c>Sup</c> is the pid, or name, of the supervisor.
<c>ChildSpec</c> is a <seealso marker="#spec">child specification</seealso>.</p>
<p>Child processes added using <c>start_child/2</c> behave in
the same manner as the other child processes, with the following
important exception: If a supervisor dies and is re-created, then
all child processes which were dynamically added to the supervisor
will be lost.</p>
</section>
<section>
<title>Stopping a Child Process</title>
<p>Any child process, static or dynamic, can be stopped in
accordance with the shutdown specification:</p>
<code type="none">
supervisor:terminate_child(Sup, Id)</code>
<p>The child specification for a stopped child process is deleted
with the following call:</p>
<code type="none">
supervisor:delete_child(Sup, Id)</code>
<p><c>Sup</c> is the pid, or name, of the supervisor.
<c>Id</c> is the id specified in the <seealso marker="#spec">child specification</seealso>.</p>
<p>As with dynamically added child processes, the effects of
deleting a static child process is lost if the supervisor itself
restarts.</p>
</section>
<section>
<title>Simple-One-For-One Supervisors</title>
<p>A supervisor with restart strategy <c>simple_one_for_one</c> is
a simplified one_for_one supervisor, where all child processes are
dynamically added instances of the same process.</p>
<p>Example of a callback module for a simple_one_for_one supervisor:</p>
<code type="none">
-module(simple_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1]).
start_link() ->
supervisor:start_link(simple_sup, []).
init(_Args) ->
{ok, {{simple_one_for_one, 0, 1},
[{call, {call, start_link, []},
temporary, brutal_kill, worker, [call]}]}}.</code>
<p>When started, the supervisor will not start any child processes.
Instead, all child processes are added dynamically by calling:</p>
<code type="none">
supervisor:start_child(Sup, List)</code>
<p><c>Sup</c> is the pid, or name, of the supervisor.
<c>List</c> is an arbitrary list of terms which will be added to
the list of arguments specified in the child specification. If
the start function is specified as <c>{M, F, A}</c>, then
the child process is started by calling
<c>apply(M, F, A++List)</c>.</p>
<p>For example, adding a child to <c>simple_sup</c> above:</p>
<code type="none">
supervisor:start_child(Pid, [id1])</code>
<p>results in the child process being started by calling
<c>apply(call, start_link, []++[id1])</c>, or actually:</p>
<code type="none">
call:start_link(id1)</code>
<p>A child under a <c>simple_one_for_one</c> supervisor can be terminated
with</p>
<code type="none">
supervisor:terminate_child(Sup, Pid)</code>
<p>where <c>Sup</c> is the pid, or name, of the supervisor and
<c>Pid</c> is the pid of the child.</p>
<p>Because a <c>simple_one_for_one</c> supervisor could have many children,
it shuts them all down at same time. So, order in which they are stopped is
not defined. For the same reason, it could have an overhead with regards to
the <c>Shutdown</c> strategy.</p>
</section>
<section>
<title>Stopping</title>
<p>Since the supervisor is part of a supervision tree, it will
automatically be terminated by its supervisor. When asked to
shutdown, it will terminate all child processes in reversed start
order according to the respective shutdown specifications, and
then terminate itself.</p>
</section>
</chapter>
|