Copyright 1996-2009 Ericsson AB. All Rights Reserved.
The contents of this file are subject to the Erlang Public License,
Version 1.1, (the "License"); you may not use this file except in
compliance with the License. You should have received a copy of the
Erlang Public License along with this software. If not, it can be
retrieved online at http://www.erlang.org/.
Software distributed under the License is distributed on an "AS IS"
basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
the License for the specific language governing rights and limitations
under the License.
heart
Magnus Fröberg, 1998-01-28, Rev. A

heart
Heartbeat Monitoring of an Erlang Runtime System

This module contains the interface to the heart process.
heart sends periodic heartbeats to an external port
program, which is also named heart. The purpose of
the heart port program is to check that the Erlang runtime system
it is supervising is still running. If the port program has not
received any heartbeats within HEART_BEAT_TIMEOUT seconds
(default is 60 seconds), the system can be rebooted. Also, if
the system is equipped with a hardware watchdog timer and is
running Solaris, the watchdog can be used to supervise the entire
system.
An Erlang runtime system that is to be monitored by a heart program
should be started with the command line flag -heart (see
also erl(1)). The heart
process is then started automatically:
% erl -heart ...
For the system to be rebooted because of missing heartbeats
or a terminated Erlang runtime system, the environment variable
HEART_COMMAND must be set before the system is started.
If this variable is not set, a warning is printed but
the system does not reboot. However, if the hardware watchdog is
used, it still triggers a reboot HEART_BEAT_BOOT_DELAY
seconds later (default is 60).
To reboot on the Windows platform, HEART_COMMAND can be
set to heart -shutdown (included in the Erlang delivery)
or to any other suitable program that can trigger a
reboot.
The hardware watchdog will not be started under Solaris if
the environment variable HW_WD_DISABLE is set.
The HEART_BEAT_TIMEOUT and HEART_BEAT_BOOT_DELAY
environment variables can be used to configure the heart timeouts.
They can be set in the operating system shell before Erlang is
started, or be specified on the command line:
% erl -heart -env HEART_BEAT_TIMEOUT 30 ...
The value (in seconds) must be in the range 10 < X <= 65535.
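The range rule above can be checked before the node is started or restarted. The following is a minimal sketch, not part of heart itself: the module name heart_timeout_check and both function names are hypothetical, and it only restates the documented range 10 < X <= 65535.

```erlang
-module(heart_timeout_check).
-export([valid_timeout/1, check_env/1]).

%% Hypothetical helper: true when Seconds lies in the documented
%% range 10 < X =< 65535, false for anything else.
valid_timeout(Seconds) when is_integer(Seconds) ->
    Seconds > 10 andalso Seconds =< 65535;
valid_timeout(_) ->
    false.

%% Reads an environment variable such as "HEART_BEAT_TIMEOUT" and
%% classifies its value against the same range.
check_env(Name) ->
    case os:getenv(Name) of
        false ->
            unset;
        Value ->
            case string:to_integer(Value) of
                {Int, ""} when is_integer(Int) ->
                    case valid_timeout(Int) of
                        true -> {ok, Int};
                        false -> {out_of_range, Int}
                    end;
                _ ->
                    {not_an_integer, Value}
            end
    end.
```

For example, heart_timeout_check:check_env("HEART_BEAT_TIMEOUT") returns unset when the variable is not defined in the environment of the running node.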
Note that if the system clock is adjusted by
more than HEART_BEAT_TIMEOUT seconds, heart will
time out and try to reboot the system. This can happen, for
example, if the system clock is adjusted automatically by
NTP (Network Time Protocol).
In the following descriptions, all functions fail with reason
badarg if heart is not started.
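Code that may also run on nodes started without -heart can guard against this failure. A minimal sketch, assuming a hypothetical wrapper module heart_safe; the error tuple it returns is an invention of this example, not part of the heart interface:

```erlang
-module(heart_safe).
-export([get_cmd/0]).

%% Hypothetical wrapper: returns the result of heart:get_cmd/0 when
%% heart is running, and {error, heart_not_started} when the call
%% fails with reason badarg.
get_cmd() ->
    try
        heart:get_cmd()
    catch
        _:badarg ->
            {error, heart_not_started}
    end.
```

The catch clause matches any exception class with reason badarg, so the sketch does not depend on how the failure is raised.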
set_cmd(Cmd) -> ok | {error, {bad_cmd, Cmd}}
Set a temporary reboot command
Cmd = string()

Sets a temporary reboot command. This command is used when
the system should be rebooted with a command other than the one
specified in the HEART_COMMAND environment variable.
The new Erlang runtime system will (if it
misbehaves) use the environment variable
HEART_COMMAND to reboot.
Limitations: The length of the Cmd command string
must be less than 2047 characters.
clear_cmd() -> ok
Clear the temporary boot command

Clears the temporary boot command. If the system terminates,
the normal HEART_COMMAND is used to reboot.
get_cmd() -> {ok, Cmd}
Get the temporary reboot command
Cmd = string()

Gets the temporary reboot command. If the command has been cleared,
the empty string is returned.
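The three functions are typically used together: install a temporary command, do the work that needs it, then clear it so that HEART_COMMAND applies again. A minimal sketch under the assumption that the node was started with -heart; the module name heart_cmd_example and the function with_temporary_cmd/2 are hypothetical:

```erlang
-module(heart_cmd_example).
-export([with_temporary_cmd/2]).

%% Hypothetical helper: runs Fun() while Cmd is installed as the
%% temporary reboot command, then clears the temporary command again,
%% even if Fun() raises an exception.
with_temporary_cmd(Cmd, Fun) when is_list(Cmd), is_function(Fun, 0) ->
    case heart:set_cmd(Cmd) of
        ok ->
            try
                Fun()
            after
                heart:clear_cmd()
            end;
        {error, {bad_cmd, _}} = Error ->
            %% The command was rejected; nothing was installed.
            Error
    end.
```

While Fun() is running, heart:get_cmd() is expected to return {ok, Cmd}; after the helper returns, it returns the empty string again. On a node started without -heart the call fails with reason badarg, as described above.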