aboutsummaryrefslogtreecommitdiffstats
path: root/lib/stdlib/src/supervisor.erl
AgeCommit message (Collapse)Author
2016-09-05Don't report error for shutdown exit tupleMagnus Henoch
When a simple_one_for_one supervisor is shutting down, and a child exits with an exit reason of the form {shutdown, Term}, handle it as if the exit reason were 'shutdown', without printing an error report. This makes the behaviour of wait_dynamic_children match that of do_restart. This fixes ERL-163.
2016-06-08Merge branch 'josevalim/supervisor-get-callback-module/PR-1000/OTP-13619'Siri Hansen
* josevalim/supervisor-get-callback-module/PR-1000/OTP-13619: Return callback module in supervisor format_status
2016-06-04Return callback module in supervisor format_statusJosé Valim
The previous implementation of supervisor:get_callback_module/1 used sys:get_status/1 to get the supervisor inner state and retrieve the callback module. Such implementation forbids any other supervisor implementation that has an internal state different than the #state{} record in supervisor.erl. This patch allows supervisors to return the callback module as part of the sys:get_status/1 data, no longer coupling the callback module implementation with the inner #state{} record. Notice we have kept the clause matching the previous sys:get_status/1 reply for backwards compatibility purposes.
2016-05-25Merge branch 'josevalim/supervisor-try-again-restart/PR-1001/OTP-13618'Siri Hansen
* josevalim/supervisor-try-again-restart/PR-1001/OTP-13618: Avoid potential timer bottleneck on supervisor restart
2016-04-28Enhance map specs in erts, stdlib, runtime_toolsMagnus Lång
Using the new type syntax, we can specify which keys are required, and which are optional in a way Dialyzer could use.
2016-03-31Avoid potential timer bottleneck on supervisor restartJosé Valim
Prior to this patch, when the supervisor was unable to restart a child, it used the timer process to execute a function later on, allowing other messages in the supervisor queue to be consumed. This patch keeps the same idea but bypasses the timer process by directly sending a message to the supervisor process itself.
2016-03-15update copyright-yearHenrik Nord
2016-02-10Merge branch 'maint'Siri Hansen
Conflicts: lib/stdlib/src/supervisor.erl
2016-02-03Speed up supervisor:count_children/1; simple_one_for_oneRory Byrne
Speed up supervisor:count_children/1 for simple_one_for_one supervisors. This is achieved by avoiding looping through all the child process and verifying that each one is alive. For a supervisor with 100,000 'temporary' children the count-time will drop from approx 25ms to about 0.005ms. For a supervisor with 100,000 'permanent' or 'transient' children the count-time will drop from approx 30ms to about 0.005ms. This avoids having the supervisor block for an extended period while the count takes place. Under normal circumstances the accuracy of the result should also improve since the duration is too short for many processes to die during the count.
2016-02-03Add supervisor:get_callback_module/1Siri Hansen
This function is used by release_handler during upgrade. This was earlier implemented in the release_handler, but it required a copy og the definition of the supervisor's internal state, which caused problems when this state was updated.
2015-10-08stdlib: Refactor the supervisor module's stateHans Bolinder
The field 'dynamics' in #state{} is a union of two opaque types, which is possibly problematic. Tagging the types should make the code safe for warnings from future versions of Dialyzer.
2015-10-08Update Kernel and STDLIBHans Bolinder
Record field types have been modified due to commit 8ce35b2: "Take out automatic insertion of 'undefined' from typed record fields".
2015-10-02Merge branch 'nybek/supervisor_reporting_error' into maintZandra
* nybek/supervisor_reporting_error: Fix supervisor reporting error
2015-06-18Change license text to APLv2Bruce Yinhe
2015-06-09Fix supervisor:get_childspec/2 for simple_one_for_oneRory Byrne
A bug in supervisor:get_childspec/2 results in {error, simple_one_for_one} being returned on every call when the supervisor strategy is simple_one_for_one. This commit includes a small refactoring which brings together the two 'start_child' clauses to make the code easier to follow.
2015-06-09Fix supervisor reporting errorRory Byrne
When a simple_one_for_one supervisor is shutting down, if one or more of its children fail to terminate within the 'shutdown' timeout, then the following reporting error will occur: * If there is only one child that fails to exit on time, then no supervisor report will be generated. * If more than one child fails to exit on time, then the reporting of 'nb_children' in the supervisor report will be 1 less than it should be.
2015-05-26Don't store child that returns ignore in simple supervisorJames Fish
If a child of a simple_one_for_one returns ignore from its start function no longer store the child for any restart type. It is not possible to restart or delete the child because the supervisor is a simple_one_for_one. Previously a simple_one_for_one would crash, potentially without shutting down all of its children, when it tried to shutdown a child with undefined pid. Previous only one undefined pid child was stored in a simple_one_for_one supervisor no matter how many times the child start function returned ignore.
2015-04-22supervisor: Correct restart handlingBjörn Gustavsson
fbaa0bec replaced the use of now/0 with erlang:monotonic_time/1 but at the same time introduced a bug in inPeriod/3 so that it would always return 'true' (the subtraction Time - Now would always result in a non-positive number that would always be less than Period). The symptoms of the bug is that when a child has been restarted the maximum number of times allowed, the supervisor will terminate, regardless of how much time that elapses between the restarts. There was no test case that detected this problem. Add the missing test case to ensure that this bug stays killed. Reported-by: Rafał Studnicki
2015-03-20Merge branch 'rickard/time_api/OTP-11997'Rickard Green
* rickard/time_api/OTP-11997: (22 commits) Update primary bootstrap inets: Suppress deprecated warning on erlang:now/0 inets: Cleanup of multiple copies of functions Add inets_lib with common functions used by multiple modules inets: Update comments Suppress deprecated warning on erlang:now/0 Use new time API and be back-compatible in inets Remove unused functions and removed redundant test asn1 test SUITE: Eliminate use of now/0 Disable deprecated warning on erlang:now/0 in diameter_lib Use new time API and be back-compatible in ssh Replace all calls to now/0 in CT with new time API functions test_server: Replace usage of erlang:now() with usage of new API Replace usage of erlang:now() with usage of new API Replace usage of erlang:now() with usage of new API Replace usage of erlang:now() with usage of new API Replace usage of erlang:now() with usage of new API otp_SUITE: Warn for calls to erlang:now/0 Replace usage of erlang:now() with usage of new API Multiple timer wheels Erlang based BIF timer implementation for scalability Implement ethread events with timeout ... Conflicts: bootstrap/bin/start.boot bootstrap/bin/start_clean.boot bootstrap/lib/compiler/ebin/beam_asm.beam bootstrap/lib/compiler/ebin/compile.beam bootstrap/lib/kernel/ebin/auth.beam bootstrap/lib/kernel/ebin/dist_util.beam bootstrap/lib/kernel/ebin/global.beam bootstrap/lib/kernel/ebin/hipe_unified_loader.beam bootstrap/lib/kernel/ebin/inet_db.beam bootstrap/lib/kernel/ebin/inet_dns.beam bootstrap/lib/kernel/ebin/inet_res.beam bootstrap/lib/kernel/ebin/os.beam bootstrap/lib/kernel/ebin/pg2.beam bootstrap/lib/stdlib/ebin/dets.beam bootstrap/lib/stdlib/ebin/dets_utils.beam bootstrap/lib/stdlib/ebin/erl_tar.beam bootstrap/lib/stdlib/ebin/escript.beam bootstrap/lib/stdlib/ebin/file_sorter.beam bootstrap/lib/stdlib/ebin/otp_internal.beam bootstrap/lib/stdlib/ebin/qlc.beam bootstrap/lib/stdlib/ebin/random.beam bootstrap/lib/stdlib/ebin/supervisor.beam bootstrap/lib/stdlib/ebin/timer.beam erts/aclocal.m4 erts/emulator/beam/bif.c erts/emulator/beam/erl_bif_info.c erts/emulator/beam/erl_db_hash.c erts/emulator/beam/erl_init.c erts/emulator/beam/erl_process.h erts/emulator/beam/erl_thr_progress.c erts/emulator/beam/utils.c erts/emulator/sys/unix/sys.c erts/preloaded/ebin/erlang.beam erts/preloaded/ebin/erts_internal.beam erts/preloaded/ebin/init.beam erts/preloaded/src/erts_internal.erl lib/common_test/test/ct_hooks_SUITE_data/cth/tests/empty_cth.erl lib/diameter/src/base/diameter_lib.erl lib/kernel/src/os.erl lib/ssh/test/ssh_basic_SUITE.erl system/doc/efficiency_guide/advanced.xml
2015-03-20Replace usage of erlang:now() with usage of new APIRickard Green
2014-10-20New function supervisor:get_childspec/2Siri Hansen
Takes the name of the child (or Pid, in the case of a simple_one_for_one supervisor) and returns the map which specifies the child.
2014-10-20Allow maps for supervisor flags and child specsSiri Hansen
Earlier, supervisor flags and child specs were given as tuples. While this is kept for backwards compatibility, it is now also allowed to give these parameters as maps: -type sup_flags() :: #{strategy => strategy(), % optional intensity => non_neg_integer(), % optional period => pos_integer()} % optional -type child_spec() :: #{id => child_id(), % mandatory start => mfargs(), % mandatory restart => restart(), % optional shutdown => shutdown(), % optional type => worker(), % optional modules => modules()} % optional Default values are as follows: Supervisor flags: strategy: one_for_one intensity: 1 period: 5 Child specs: restart: permanent type: worker shutdown: 5000 for workers, 'infinity' for supervisors modules: [M], where M comes from the child's start {M,F,A} Some of these default values are quite hard to decide on, since there really is no "most common way". It always depends on the use case. So we decided that the most important reason for having default values is to lower the start barrier and get "something" running. For production use, most systems must be fine tuned in this respect anyway. This is how we reasoned about it: Strategy: just pick one - and 'one_for_one' was most used in OTP. Max restart frequency (intensity/period): by allowing one restart only we keep the important supervisor feature of restarting children, but we also avoid the possibility of scaling to a huge amount of restarts if the supervisor tree is deep. Restart: just pick one - and 'permanent' is fairly common. Shutdown for workers: to avoid the confusion of why the terminate function is not executed, we decided not to use 'brutal_kill'. Which number to use is probably not that important, so we chose 5000. Shutdown for supervisors: the recommended shutdown value for supervisors is 'infinity', so we decied to use that. Having the same (integer) value as for workers can give very strange results. Type: just pick one - and we believe that 'worker' is most common
2014-10-08Rebase supervisorSiri Hansen
Remove duplicated code related to checking of supervisor flags.
2014-02-23Deprecate pre-defined built-in typesHans Bolinder
The types array(), dict(), digraph(), gb_set(), gb_tree(), queue(), set(), and tid() have been deprecated. They will be removed in OTP 18.0. Instead the types array:array(), dict:dict(), digraph:graph(), gb_set:set(), gb_tree:tree(), queue:queue(), sets:set(), and ets:tid() can be used. (Note: it has always been necessary to use ets:tid().) It is allowed in OTP 17.0 to locally re-define the types array(), dict(), and so on. New types array:array/1, dict:dict/2, gb_sets:set/1, gb_trees:tree/2, queue:queue/1, and sets:set/1 have been added.
2014-02-14Merge branch 'calebh/fix-regestry-type-annotation'Henrik Nord
* calebh/fix-regestry-type-annotation: Fix alternative registry type annotations in supervisor OTP-11707
2014-02-07Fix alternative registry type annotations in supervisorCaleb
The type annotations for alternative registries using the {via,Module,Name} syntax was not given for sup_name() and sup_ref() in the supervisor module. The type annotation was inconsistent with the documentation and causes Dialyzer to complain about valid supervisor:start_link and supervisor:start_child calls.
2013-06-12Update copyright yearsBjörn-Egil Dahlberg
2013-05-06Fix unmatched_return warnings in stdlibSiri Hansen
2013-05-06Fix unmatched_returns warnings in STDLIB and KernelHans Bolinder
2013-04-04Fix rest_for_one and one_for_all restarting a child not terminatedJames Fish
In rest_for_one and one_for_all supervisors one child dying can cause multiple children to be restarted. Previously if the child that caused the restart is started successfully but another child fails to start, the supervisor would not terminate this child with the other successfully restarted children as no record of the pid was kept. Thus the supervisor would try to start this child again. This could lead to multiples of the same child or if the child is registered cause repeated attempts at starting this child - until the max restart threshold was reached. Now the child that failed to start becomes the restarting child, instead of staying with the same child, for the next restart attempt. This has the following side effects: 1) In one_for_all the new version of the child that original died is terminated before a restart attempt is made. 2) In rest_for_one all succesfully restarted children are not terminated and restarting continues from the child that failed to start.
2012-10-11If supervisor:start_link fails to start child, add child id to error reasonSiri Hansen
2012-10-05Have supervisor send errors up the chainTomas Pihl
If a child fails to start, supervisor relies upon error_logger which does not work when IO is inhibited. Instead pass the error up the chain and let someone else use a proper Reason for any possible printouts.
2012-03-30Update copyright yearsBjörn-Egil Dahlberg
2012-03-16Merge branch 'rj/fix-supervisor-shutdown-doc' into maintGustav Simonsson
* rj/fix-supervisor-shutdown-doc: Fix small typo in kernel app doc Cosmetic: split very long lines from supervisor doc Fix supervisor doc: Shutdown, MaxR and MaxT type specs Add the type restrictions in the code comments Remove trailing spaces OTP-9987
2012-03-05Leave control back to gen_server during supervisor's restart loopSiri Hansen
When an attempt to restart a child failed, supervisor would earlier keep the execution flow and try to restart the child over and over again until it either succeeded or the restart frequency limit was reached. If none of these happened, supervisor would hang forever in this loop. This commit adds a timer of 0 ms where the control is left back to the gen_server which implements the supervisor. This way any incoming request to the supervisor will be handled - which could help breaking the infinite loop - e.g. shutdown request for the supervisor or for the problematic child. This introduces some incompatibilities in stdlib due to new return values from supervisor: * restart_child/2 can now return {error,restarting} * delete_child/2 can now return {error,restarting} * which_children/1 returns a list of {Id,Child,Type,Mods}, where Child, in addition to the old pid() or 'undefined', now also can be 'restarting'.
2012-02-24Add the type restrictions in the code commentsRicardo Catalinas Jiménez
2012-02-24Remove trailing spacesRicardo Catalinas Jiménez
2011-12-20Don't save child spec for temporary child if child's start func returns ignoreSiri Hansen
Supervisor should never keep child specs for dead temporary children.
2011-11-29Fix dialyzer warnings in supervisorSiri Hansen
Dialyzer complained over a mismatch between the callback spec of Mod:code_change in gen_server and the spec of supervisor:code_change (which is the implementation of a gen_server Mod:code_change). This commit changes the callback spec to allow {error,Reason} as return value. Also, release_handler is updated to handle this return value.
2011-10-28Handle undefined pid when reporting error from supervisorSiri Hansen
2011-10-20Merge branch 'cf/simple_one_for_one_shutdown'Henrik Nord
* cf/simple_one_for_one_shutdown: Explain how dynamic child processes are stopped Stack errors when dynamic children are stopped Explicitly kill dynamic children in supervisors Conflicts: lib/stdlib/doc/src/supervisor.xml OTP-9647
2011-10-10Allow an infinite timeout to shutdown worker processesChristopher Faulet
Now, in child specification, the shutdown value can also be set to infinity for worker children. This restriction was removed because this is not always possible to predict the shutdown time for a worker. This is highly application-dependent.
2011-10-07Add '-callback' attributes in stdlib's behavioursStavros Aronis
Replace the behaviour_info(callbacks) export in stdlib's behaviours with -callback' attributes for all the callbacks.
2011-09-16Stack errors when dynamic children are stoppedChristopher Faulet
Because a simple_one_for_one supervisor can have many workers, we stack errors during its shutdown to report only one message for each encountered error type. Instead of reporting the child's pid, we use the number of concerned children.
2011-09-16Explicitly kill dynamic children in supervisorsChristopher Faulet
According to the supervisor's documentation: "Important note on simple-one-for-one supervisors: The dynamically created child processes of a simple-one-for-one supervisor are not explicitly killed, regardless of shutdown strategy, but are expected to terminate when the supervisor does (that is, when an exit signal from the parent process is received)." All is fine as long as we stop simple_one_for_one supervisor manually. Dynamic children catch the exit signal from the supervisor and leave. But, if this happens when we stop an application, after the top supervisor has stopped, the application master kills all remaining processes associated to this application. So, dynamic children that trap exit signals can be killed during their cleanup (here we mean inside terminate/2). This is unpredictable and highly time-dependent. In this commit, supervisor module is patched to explicitly terminate dynamic children accordingly to the shutdown strategy. NOTE: Order in which dynamic children are stopped is not defined. In fact, this is "almost" done at the same time.
2011-08-30Merge branch 'dev' into majorHenrik Nord
2011-08-23fix supervisors restarting temporary childrenFred Hebert
In the current implementation of supervisors, temporary children should never be restarted. However, when a temporary child is restarted as part of a one_for_all or rest_for_one strategy where the failing process is not the temporary child, the supervisor still tries to restart it. Because the supervisor doesn't keep some of the MFA information of temporary children, this causes the supervisor to hit its restart limit and crash. This patch fixes the behaviour by inserting a clause in terminate_children/2-3 (private function) that will omit temporary children when building a list of killed processes, to avoid having the supervisor trying to restart them again. Only supervisors in need of restarting children used the list, so the change should be of no impact for the functions that called terminate_children/2-3 only to kill all children. The documentation has been modified to make this behaviour more explicit.
2011-06-17Handle exit reason {shutdown,Term} as shutdown for children of supervisorSiri Hansen
In R13B proc_lib, gen_server and gen_fsm were all changed to handle exit reason {shutdown,Term} in the same way as exit reason 'shutdown', i.e. no crash reports are generated. This is an update of supervisor to do the same, i.e. handle these two exit reasons in the same way. This means that for children with restart type 'transient' there will be no attempt to restart the process if it terminates with reason {shutdown,Term}, and there will be no supervisor report.
2011-05-12Types and specifications have been modified and addedHans Bolinder
2011-05-04Change list to set in supervisor for saving pids of dynamic temprary childrenSiri Hansen
Since initial arguments of temporary children under simple_one_for_one supervisors are not saved, only a list of pids was stored in such supervisors. When adding/deleting many children, this would scale badly. To avoid this the list is now changed to a set.