inviso: (Latin) to go to see, visit, inspect, look at.
The Inviso trace system consists of one or several runtime components, one running on each Erlang node doing tracing, and one control component which can run on any node with available processor power. Inviso may also be part of a higher layer trace tool. See the inviso-tool as an example. The implementation is spread out over the Runtime_tools and the Inviso Erlang/OTP applications. Erlang modules necessary to run the runtime component are located in Runtime_tools and are therefore assumed to be available on any node. Even though Inviso is introduced with Erlang/OTP R11B, the runtime component is implemented with backward compatibility in mind, meaning that it is possible to compile and run it on older Erlang/OTP releases.
This document describes the control and runtime components of the Inviso trace system.
Inviso is built on the Erlang trace BIFs and the standard linked-in trace-port drivers for efficient trace message logging. This means that Inviso cannot coexist at runtime with any other trace tool using the trace BIFs.
This is a short step-by-step description of how tracing using Inviso can be done.
This "recipe" is also valid when tracing in a non-distributed environment. The only difference is that the function calls that do not take a node name as argument are used. The runtime component then runs on the same node as the control component.
A simple example illustrating the recipe above. It traces on two nodes: node1, where the control component also runs, and node2, which is a remote node from the control component's perspective. The example uses a mixture of API calls specifying which nodes to trace on and API functions working on all added nodes. In this example the two are interchangeable, since all nodes known to the control component participate in the same way.
Eshell V5.5  (abort with ^G)
(node1@hurin)1> application:start(runtime_tools).
ok
(node1@hurin)2> inviso:start().
{ok,<0.56.0>}
(node1@hurin)3> inviso:add_nodes([node(),node2@hurin],mytag).
{ok,[{'node1@hurin',{ok,new}},
     {'node2@hurin',{ok,new}}]}
(node1@hurin)4> inviso:init_tracing(
    [{node(),[{trace,{file,"tracefile_node1.log"}},{ti,{file,"trace_node1.ti"}}]},
     {node2@hurin,[{trace,{file,"tracefile_node2.log"}},{ti,{file,"trace_node2.ti"}}]}]).
{ok,[{'node1@hurin',{ok,[{trace_log,ok},{ti_log,ok}]}},
     {'node2@hurin',{ok,[{trace_log,ok},{ti_log,ok}]}}]}
(node1@hurin)5> inviso:tpm_localnames([node(),node2@hurin]).
{ok,[{'node1@hurin',{{ok,1},{ok,1}}},
     {'node2@hurin',{{ok,1},{ok,1}}}]}
(node1@hurin)6> inviso:tpl([node(),node2@hurin],code,which,'_',[]).
{ok,[{'node1@hurin',{ok,[2]}},
     {'node2@hurin',{ok,[2]}}]}
(node1@hurin)7> inviso:tf(all,[call,timestamp]).
{ok,[{'node1@hurin',{ok,"/"}},
     {'node2@hurin',{ok,"-"}}]}
(node1@hurin)8> code:which(ordset).
non_existing
(node1@hurin)9> inviso:stop_tracing().
{ok,[{'node1@hurin',{ok,idle}},
     {'node2@hurin',{ok,idle}}]}
(node1@hurin)10> inviso:fetch_log([node2@hurin],".","aprefix_").
{ok,[{'node2@hurin',
      {complete,[{trace_log,[{ok,"aprefix_tracefile_node2.log"}]},
                 {ti_log,[{ok,"aprefix_trace_node2.ti"}]}]}}]}
(node1@hurin)11> inviso:list_logs([node()]).
{ok,[{'node1@hurin',
      {ok,[{trace_log,".",["tracefile_node1.log"]},
           {ti_log,".",["trace_node1.ti"]}]}}]}
(node1@hurin)12> inviso_lfm:merge(
    [{node(),[{trace_log,["tracefile_node1.log"]},
              {ti_log,["trace_node1.ti"]}]},
     {node2@hurin,[{trace_log,["aprefix_tracefile_node2.log"]},
                   {ti_log,["aprefix_trace_node2.ti"]}]}],"theoutfile.txt").
{ok,15}
(node1@hurin)13> inviso:clear().
{ok,[{'node1@hurin',{ok,{new,running}}},
     {'node2@hurin',{ok,{new,running}}}]}
(node1@hurin)14> inviso:stop_nodes().
{ok,[{'node2@hurin',ok},
     {'node1@hurin',ok}]}
(node1@hurin)15>
Incarnation runtime tags are used to identify an incarnation of a runtime component. An incarnation is one "start-up" of a runtime component on a specific Erlang node. Examining the incarnation runtime tag can be necessary when a user wants to connect to (adopt) an already running runtime component. This may be the case if the runtime component was autostarted, or if the control component terminated without killing the runtime component. While the user was not in control of the runtime component, it may very well have terminated and been restarted. If it was restarted without the user's knowledge, its incarnation runtime tag has most likely changed. If the current incarnation runtime tag is not what it is supposed to be, the user can therefore conclude that the runtime component is not "doing" what is expected.
The runtime tag is set at runtime component start-up. This is either done when it is started manually by a call to
A runtime component has a state and a status. The possible states are:
The status describes if the runtime component is
Meta tracing is a trace mechanism separate from the regular tracing. It is normally used by a trace-tool to learn about function calls made anywhere in an Erlang node. A typical example is that there is a possibility in Inviso to get pids translated to registered name in the final formatted trace-log (for processes having registered names). This is done by meta-tracing on the BIF
Meta tracing in Inviso is done by the
The runtime meta tracer can also be used to translate pids to own identifiers. The only thing needed is one or several association points in the form of function calls which will only be made if an association is done in the system. The pid and own-identifier must be arguments and/or return values from the same function call.
The runtime meta tracer can furthermore be used to achieve side-effects during tracing, like turning tracing on or off.
It may sometimes be necessary to wait for a meta traced function to return before it can be decided what to do. This may be due to that one piece of information to make the decision is in the arguments to the function, the other in the return value. This kind of logic can be programmed to be executed by the inviso meta tracer. In order for the inviso meta tracer to "remember" function-call arguments until the function return trace message arrives, a
The default public loop data structure is a tuple of size two. The first element in that tuple is used by the predefined meta tracing for capturing locally registered names. The second element is free to use for any other purpose. In the default implementation the elements of the tuple must be lists of tuples, where each sub-tuple represents one waiting call. The last element of that tuple must be a now-stamp (as returned by the BIF
The inviso meta tracer "cleans" the public loop data structure approximately once every minute. The reason for this is that entries in the public loop data structure may become abandoned. Receiving the call meta trace-message can cause information to be written into the public loop data structure. That entry is normally used and removed when the return_trace meta trace-message arrives. But if the meta traced function causes an exception, for instance because the process crashes while executing the function body, no return_trace message will come. The function which normally removes the entry is therefore never called.
The default clean-function assumes that every item in the public loop data tuple is a list, and that each list contains tuples whose last element is a "now-stamp". The default clean-function considers an entry older than 30 seconds to be abandoned.
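The cleaning rule described above can be sketched as follows. This is only an illustration of the described behaviour, not the implementation found in Inviso: the module name publld_clean_sketch and the function clean/2 are hypothetical, and the current time is passed in as an argument so that the function stays pure.

```erlang
-module(publld_clean_sketch).
-export([clean/2]).

%% Hypothetical sketch: Now is a now-stamp {MegaSecs,Secs,MicroSecs} and
%% PublLD is a tuple in which every element is a list of tuples, each
%% having a now-stamp as its last element.
clean(Now, PublLD) when is_tuple(PublLD) ->
    list_to_tuple([clean_list(Now, L) || L <- tuple_to_list(PublLD)]).

%% Keep only the entries which are not considered abandoned.
clean_list(Now, Entries) ->
    [E || E <- Entries, not abandoned(Now, E)].

%% An entry is considered abandoned when it is older than 30 seconds.
%% timer:now_diff/2 returns the difference in microseconds.
abandoned(Now, Entry) ->
    Stamp = element(tuple_size(Entry), Entry),
    timer:now_diff(Now, Stamp) > 30 * 1000000.
```

Calling publld_clean_sketch:clean(erlang:timestamp(), PublLD) would then return a tuple of the same size with the old entries dropped from each list.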
When activating meta tracing for a function for the purpose of writing pid-alias associations in the trace information file, a call-func and possibly also a return-func is specified. These functions will be called when a meta trace message arrives at the inviso meta tracer as a result of function calls or returns for this meta traced function. What exactly to write in the trace information file is dictated by the merge mechanism, since pid-alias translations are done off-line when merging log files. See the chapter on merging and formatting log files for more details.
Simple example where the call to the function
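-module(mytrace).
-export([call_assoc_id/3]).

%% Call-func: emits a {Pid,Alias,alias,NowStamp} entry for the trace information file.
call_assoc_id(_CallingPid,[Pid,Ref],PublLoopData) ->
    {ok,PublLoopData,term_to_binary({Pid,Ref,alias,now()})}.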
(node1@hurin)21> inviso:tpm(connection,assoc_id,2,[],
                            {mytrace,call_assoc_id}).
{ok,[{'node1@hurin',{ok,1}},
     {'node2@hurin',{ok,1}}]}
(node1@hurin)22>
It is of course very likely that the public loop data structure must be extended to host all functions where the meta tracer must delay its action until the function in question returns. What is necessary is to create your own public loop data structure at trace initialization. This is done by using the
Simple example where tracing is initiated with a public loop data structure having 10 places, which allows nine different functions to be meta traced (the place for the locally registered names is mandatory). Note that the BIF
(node1@hurin)4> inviso:init_tracing(
    [{node2@hurin,[{trace,{file,"tracefile_node2.log"}},
                   {ti,{file,"trace_node2.ti",
                        {{erlang,list_to_tuple,[lists:duplicate(10,[])]},
                         void,
                         {inviso_rt_meta,clean_std_publld}}}}]}]).
{ok,[{'node2@hurin',{ok,[{trace_log,ok},{ti_log,ok}]}}]}
(node1@hurin)5>
Since meta tracing is independent of regular tracing and catches any function call to a particular function made in any process, it is well suited to be used to turn things on or off during execution. That trick is done by letting the
In order to trace before any user interaction is possible, an autostart mechanism is implemented. The runtime component is started by the top supervisor of the Runtime_Tools application. Hence the Runtime_Tools application must be part of the boot script for autostart tracing to work. The Runtime_Tools application must of course be started before any application that is to be traced. Do note that application startup is not entirely synchronous: just because the application controller has begun starting the next application, Runtime_Tools is not necessarily fully up and running.
The autostart mechanism is configurable. The runtime component comes with a standard autostart configuration, only missing two text-files to be completely operational.
The autostart is controlled by the Runtime_Tools application configuration parameter
An
autostart(RuntimeToolsArg) = {MFA,Options,Tag} | any()
If MFA does not properly point out a function possible to call with
As mentioned above, Inviso comes with a complete implementation of autostart sufficient for most situations.
The default autostart module is
Its
The config file must be an ASCII text file containing one or more tuples, each terminated by a period. The following parameters are recognized:
Optional parameter where
Optional parameter controlling how initialization shall be done. The control component will spawn a separate process to do the initializations by doing
Optional parameter specifying the options for the runtime component itself. See
Optional parameter specifying the runtime component tag. If missing the default tag will be
Example:
{repeat,1}.
{mfa,{inviso_autostart_server,
      init,
      [[{tracerdata,{file,"mylogfile"}},
        {cmdfiles,["a_trace_case.txt"]},
        {bindings,[{'M',mymod},{'F','_'},{'Arity','_'}]},
        {translations,[]}]]}}.
The example file results in the start of a runtime component given no specific options. There will only be one autostart since the repeat parameter is set to 1. Tracing will be initiated by the standard initiator (
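Since the config file consists of plain Erlang terms, each terminated by a period, its general shape can be checked with file:consult/1 before it is used for an autostart. The following is a minimal sketch under that assumption. The module name autostart_conf_sketch and the function read/1 are hypothetical and not part of Inviso, and only the repeat and mfa parameters shown in the example file are extracted.

```erlang
-module(autostart_conf_sketch).
-export([read/1]).

%% Hypothetical helper: read a file of dot-terminated tuples and pick out
%% the 'repeat' and 'mfa' parameters. A missing parameter is returned as
%% 'undefined'; how Inviso itself treats a missing parameter is not
%% modelled here.
read(FileName) ->
    case file:consult(FileName) of
        {ok, Terms} ->
            Repeat = proplists:get_value(repeat, Terms),
            MFA = proplists:get_value(mfa, Terms),
            {ok, {Repeat, MFA}};
        {error, Reason} ->
            {error, Reason}
    end.
```

A syntax error in the file, such as a missing period, shows up as an {error,Reason} tuple from file:consult/1, which makes this a cheap sanity check before the file is put to use on a node.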
To further facilitate the standard autostart implementation a default initiator is implemented. To use it, simply specify it as mfa in the config file read by the standard autostart module.
Its
Specifies how tracing is initiated. See
Specifies trace-case files which shall be executed to set the patterns and flags of the trace. See the
Optional parameter specifying how functions in trace-case files shall be translated. This is useful since trace-cases can be written for higher-layer Inviso tools, but must during an autostart execute using
Translations=
[{{Mod1,Func1,Arity},
{Mod2,Func2,{TranslMod,TranslFunc}}},...]
TranslMod:TranslFunc(ListOfOrigArgs)->
ListOfTransformedArgs
Optional parameter specifying the actual values of variables used in the trace-cases.
To facilitate creating the configuration file described above, there are functions in a module named
The node(s) in question must be running since the functionality in the utility library uses distributed Erlang to access the file system.
In order to protect real "live" systems from having a runtime component lingering around without a control component, a dependency property can be specified at runtime component start-up. The property specifies a dependency in milliseconds. If the property is set to 0 (zero), for instance, the runtime component will terminate immediately if its current control component terminates.
If a control component tries to start a runtime component at an Erlang node where there already is a runtime component, the control component will adopt the already existing runtime component if it has no current control component. Otherwise the control component will experience an error, not being able to start a runtime component at that node.
It must also be noted that an autostart runtime component is running without control component, at least before any control component adopts it.
Since Inviso is intended to be used on real "live" systems, it is possible to protect the system against overload, having Inviso suspend tracing should an overload situation occur.
What indicates an overload situation must be programmed and configured outside of Inviso. Inviso can initiate an overload protection, call an overload function periodically and clean-up an overload mechanism should it decide to terminate.
Inside the runtime component, suspending tracing means removing all process trace flags and meta patterns. Reactivating tracing is outside the scope of Inviso, but can be implemented in a tool using Inviso.
Simple example adding a runtime component and making it protect its Erlang node from overload.
inviso:add_node(my_rt_tag,
                [{overload,{{my_ovl,check},
                            15000,
                            {my_ovl,start,[my_port_pgm]},
                            {my_ovl,stop,[my_port_pgm]}}}]).
Immediately when the runtime component is started, it will initiate overload protection by calling
If logging trace messages to a logfile has been used (decided when tracing is initiated) the various log files will be located on the different Erlang nodes participating in the trace. The log files must be merged and formatted for the following reasons:
The first step before any merging can take place is of course to get all log files, including any trace information files, to a location where the logfile merger can access them. This can be done by simply copying the files. However, if the file systems on the Erlang nodes are not that easily accessed, there is a
Inviso comes with two Erlang modules,
Trace messages in the log files must of course be time-stamped for the logfile merger to be capable of correctly merging them. This means using the
The standard inviso log-file reader understands the following trace information file entries:
{Pid,Alias,alias,NowStamp}
{Pid,Alias,unalias,NowStamp}
The
The idea behind trace cases is that someone knowledgeable of a certain system component can write a file specifying the trace-patterns and process trace flags necessary to trace on certain items once and for all. Hence a trace case will most likely be a series of calls to functions setting trace patterns and process trace flags.
However, the actual Erlang nodes and values of arguments given in the trace function calls cannot be static if the trace cases are to be useful and reusable. It must therefore be possible to parameterize a trace case file by introducing variables that get their values at the time of trace case execution. It may also be the case that Inviso is used as a component in a higher layer trace tool. Trace cases may therefore be written calling more complex functions than the low level
Consequently, for trace cases to be useful there must be a function call translation mechanism and an execution environment capable of handling variable bindings.
A trace-case is an ASCII text file consisting of function calls written as they could have been written in the Erlang shell:
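What an execution environment with variable bindings amounts to can be sketched with the standard erl_scan, erl_parse and erl_eval modules. This is only an illustration of the idea, not the mechanism used by Inviso. The module name tracecase_eval_sketch and the function eval/2 are hypothetical.

```erlang
-module(tracecase_eval_sketch).
-export([eval/2]).

%% Hypothetical sketch: evaluate one dot-terminated expression string with
%% the variables in BindingList, a list of {VarName,Value} tuples, bound
%% beforehand. Returns the value of the last expression.
eval(String, BindingList) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    {ok, Exprs} = erl_parse:parse_exprs(Tokens),
    Bindings = lists:foldl(
                 fun({Var, Val}, Acc) -> erl_eval:add_binding(Var, Val, Acc) end,
                 erl_eval:new_bindings(),
                 BindingList),
    {value, Value, _NewBindings} = erl_eval:exprs(Exprs, Bindings),
    Value.
```

For instance, tracecase_eval_sketch:eval("lists:reverse(L).", [{'L',[1,2,3]}]) evaluates the call with L bound to [1,2,3]. A trace-case line is handled the same way, with the variables of the trace case supplied in the binding list.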
modulename:functionname(arg1,arg2,...).
A trace-case may contain any valid function call, including binding new variables which are used later in the trace-case, but:
Example: Trace cases are expected to be written to be executed directly in an Erlang shell (by some utility reading a text file on trace case format) calling
Assume that we have the following trace-case file:
inviso:tpl(Nodes,mymod,'_','_',MS).
inviso:tf(Nodes,all,[call,timestamp]).
For this to work in an autostart the following translation is needed:
[{{inviso,tpl,5},{inviso_rt,tpl,{erlang,tl}}},
{{inviso,tf,3},{inviso_rt,tf,{erlang,tl}}}]
Since transforming the arguments from
Further there must be a variable binding for
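The effect of such a translation list on a single call can be sketched as below. The sketch assumes that a call is represented as a {Module,Function,ArgumentList} tuple. The module name tracecase_transl_sketch and the function translate/2 are hypothetical, and leaving calls without a matching translation unchanged is an assumption made here, not documented Inviso behaviour.

```erlang
-module(tracecase_transl_sketch).
-export([translate/2]).

%% Hypothetical sketch: look up {Mod,Func,Arity} in a translation list of
%% the form [{{Mod1,Func1,Arity},{Mod2,Func2,{TranslMod,TranslFunc}}},...]
%% and rewrite the call accordingly.
translate({M, F, Args}, Translations) ->
    case lists:keyfind({M, F, length(Args)}, 1, Translations) of
        {_Key, {M2, F2, {TMod, TFunc}}} ->
            %% TranslMod:TranslFunc(ListOfOrigArgs) -> ListOfTransformedArgs
            {M2, F2, TMod:TFunc(Args)};
        false ->
            %% Assumption: untranslated calls are kept as they are.
            {M, F, Args}
    end.
```

With the translation list above, a call {inviso,tpl,[Nodes,mymod,'_','_',MS]} becomes {inviso_rt,tpl,[mymod,'_','_',MS]}, since erlang:tl/1 drops the leading node argument.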