Copyright 1999-2016 Ericsson AB. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Match Specifications in Erlang
Patrik Nyblom
1999-06-01
PA1
match_spec.xml
A "match specification" (match_spec) is an Erlang term describing a
small "program" that tries to match something. It can be used
either to control tracing with
erlang:trace_pattern/3
or to search for objects in an ETS table with, for example,
ets:select/2.
The match specification in many ways works like a small function in Erlang,
but is interpreted/compiled by the Erlang runtime system to something much more
efficient than calling an Erlang function. The match specification is also
very limited compared to the expressiveness of real Erlang functions.
The most notable difference between a match specification and an Erlang
fun is the syntax. Match specifications are Erlang terms, not Erlang code.
Also, a match specification has a strange concept of exceptions:
-
An exception (such as badarg) in the MatchCondition
part, which resembles an Erlang guard,
generates immediate failure.
-
An exception in the MatchBody part, which resembles
the body of an Erlang function, is implicitly caught and results in the
single atom 'EXIT'.
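The two exception rules can be observed with a small sketch, assuming the ETS flavor of match specifications and ets:test_ms/2 as a test harness. The tuple {a} and the choice of hd as the failing call are illustrative only; the expressions are meant to be pasted into an Erlang shell.

```erlang
%% Guard position: hd/1 fails on the atom a, so the clause does not match.
GuardSpec = [{{'$1'}, [{'=:=', {hd, '$1'}, x}], ['$1']}],
GuardResult = ets:test_ms({a}, GuardSpec),

%% Body position: the same failing call is caught and replaced by 'EXIT'.
BodySpec = [{{'$1'}, [], [{hd, '$1'}]}],
BodyResult = ets:test_ms({a}, BodySpec),

{GuardResult, BodyResult}.
```

With these inputs the guard variant should report no match ({ok, false}), while the body variant should match and return the atom 'EXIT'.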
Grammar
A match specification used in tracing can be described in the following
informal grammar:
- MatchExpression ::= [ MatchFunction, ... ]
- MatchFunction ::= { MatchHead, MatchConditions, MatchBody }
- MatchHead ::= MatchVariable | '_' |
[ MatchHeadPart, ... ]
- MatchHeadPart ::= term() | MatchVariable | '_'
- MatchVariable ::= '$<number>'
- MatchConditions ::= [ MatchCondition, ...] | []
- MatchCondition ::= { GuardFunction } | { GuardFunction,
ConditionExpression, ... }
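- BoolFunction ::= is_atom |
is_float | is_integer | is_list |
is_number | is_pid | is_port |
is_reference | is_tuple | is_map |
is_binary | is_function | is_record |
is_seq_trace | 'and' | 'or' | 'not' |
'xor' | 'andalso' | 'orelse'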
- ConditionExpression ::= ExprMatchVariable | { GuardFunction } |
{ GuardFunction, ConditionExpression, ... } | TermConstruct
- ExprMatchVariable ::= MatchVariable (bound in the MatchHead) |
'$_' | '$$'
- TermConstruct ::= {{}} | {{ ConditionExpression, ... }} |
[] | [ConditionExpression, ...] | #{} |
#{term() => ConditionExpression, ...} |
NonCompositeTerm | Constant
- NonCompositeTerm ::= term() (not list or tuple or map)
- Constant ::= {const, term()}
- GuardFunction ::= BoolFunction | abs |
element | hd | length | node |
round | size | tl | trunc |
'+' | '-' | '*' | 'div' | 'rem' |
'band' | 'bor' | 'bxor' | 'bnot' |
'bsl' | 'bsr' | '>' | '>=' | '<' | '=<' |
'=:=' | '==' | '=/=' | '/=' |
self | get_tcw
- MatchBody ::= [ ActionTerm ]
- ActionTerm ::= ConditionExpression | ActionCall
- ActionCall ::= {ActionFunction} | {ActionFunction, ActionTerm, ...}
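- ActionFunction ::= set_seq_token |
get_seq_token | message |
return_trace | exception_trace |
process_dump | enable_trace |
disable_trace | trace |
display | caller |
set_tcw | silent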
A match specification used in
ets(3)
can be described in the following informal grammar:
- MatchExpression ::= [ MatchFunction, ... ]
- MatchFunction ::= { MatchHead, MatchConditions, MatchBody }
- MatchHead ::= MatchVariable | '_' |
{ MatchHeadPart, ... }
- MatchHeadPart ::= term() | MatchVariable | '_'
- MatchVariable ::= '$<number>'
- MatchConditions ::= [ MatchCondition, ...] | []
- MatchCondition ::= { GuardFunction } |
{ GuardFunction, ConditionExpression, ... }
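- BoolFunction ::= is_atom |
is_float | is_integer | is_list |
is_number | is_pid | is_port |
is_reference | is_tuple | is_map |
is_binary | is_function | is_record |
'and' | 'or' | 'not' |
'xor' | 'andalso' | 'orelse'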
- ConditionExpression ::= ExprMatchVariable | { GuardFunction } |
{ GuardFunction, ConditionExpression, ... } | TermConstruct
- ExprMatchVariable ::= MatchVariable (bound in the MatchHead) |
'$_' | '$$'
- TermConstruct ::= {{}} | {{ ConditionExpression, ... }} |
[] | [ConditionExpression, ...] | #{} |
#{term() => ConditionExpression, ...} | NonCompositeTerm | Constant
- NonCompositeTerm ::= term() (not list or tuple or map)
- Constant ::= {const, term()}
- GuardFunction ::= BoolFunction | abs |
element | hd | length | node |
round | size | tl | trunc |
'+' | '-' | '*' | 'div' | 'rem' |
'band' | 'bor' | 'bxor' | 'bnot' |
'bsl' | 'bsr' | '>' | '>=' | '<' | '=<' |
'=:=' | '==' | '=/=' | '/=' |
self
- MatchBody ::= [ ConditionExpression, ... ]
Function Descriptions
Functions Allowed in All Types of Match Specifications
The functions allowed in all types of match specifications work as
follows:
is_atom, is_float, is_integer, is_list,
is_number, is_pid, is_port, is_reference,
is_tuple, is_map, is_binary, is_function
-
Same as the corresponding guard tests in Erlang, return
true or false.
is_record
-
Takes an additional parameter, which must be the result
of record_info(size, <record_type>), like in
{is_record, '$1', rectype, record_info(size, rectype)}.
'not'
-
Negates its single argument (anything other
than false gives false).
'and'
-
Returns true if all its arguments (variable
length argument list) evaluate to true, otherwise
false. Evaluation order is undefined.
'or'
-
Returns true if any of its arguments
evaluates to true. Variable length argument
list. Evaluation order is undefined.
'andalso'
-
Works as 'and', but quits evaluating its
arguments when one argument evaluates to something else
than true. Arguments are evaluated left to right.
'orelse'
-
Works as 'or', but quits evaluating as soon
as one of its arguments evaluates to true.
Arguments are evaluated left to right.
'xor'
-
Only two arguments, of which one must be true and the
other false to return true; otherwise
'xor' returns false.
abs, element, hd, length, node,
round, size, tl, trunc, '+',
'-', '*', 'div', 'rem', 'band',
'bor', 'bxor', 'bnot', 'bsl',
'bsr', '>', '>=', '<', '=<',
'=:=', '==', '=/=', '/=',
self
-
Same as the corresponding Erlang BIFs (or operators). In case of
bad arguments, the result depends on the context. In the
MatchConditions part of the expression, the test
fails immediately (like in an Erlang guard). In the
MatchBody part, exceptions are implicitly caught
and the call results in the atom 'EXIT'.
Functions Allowed Only for Tracing
The functions allowed only for tracing work as follows:
is_seq_trace
-
Returns true if a sequential trace token is set
for the current process, otherwise false.
set_seq_token
-
Works as seq_trace:set_token/2, but returns
true on success, and 'EXIT'
on error or bad argument. Only allowed in the
MatchBody part and only allowed when tracing.
get_seq_token
-
Same as seq_trace:get_token/0 and only
allowed in the MatchBody part when tracing.
message
-
Sets an additional message appended to the
trace message sent. Only one additional message can be set in
the body. Later calls replace the appended message.
As a special case, {message, false} disables
sending of trace messages ('call' and 'return_to') for this function
call, just as if the match specification had not matched.
This can be useful if only the side effects of
the MatchBody part are desired.
Another special case is {message, true}, which
sets the default behavior, as if the function had no match
specification; the trace message is sent with no extra information
(if no other calls to message are placed before
{message, true}, it is in fact a "noop").
Takes one argument: the message. Returns true
and can only be used in the MatchBody part
when tracing.
return_trace
-
Causes a return_from trace message to be sent
upon return from the current function. Takes no arguments, returns
true and can only be used in the MatchBody
part when tracing.
If the process trace flag silent is active, the
return_from trace message is inhibited.
Warning: If the traced function is tail-recursive, this
match specification function destroys that property. Hence, if a
match specification executing this function is used on a
perpetual server process, it can only be active for a limited
period of time, or the emulator will eventually use all memory in
the host machine and crash. If this match specification function is
inhibited using process trace flag silent,
tail-recursiveness still remains.
exception_trace
-
Works as return_trace; in addition, if the traced function exits
because of an exception,
an exception_from trace message is generated,
regardless of whether the exception is caught or not.
process_dump
-
Returns some textual information about
the current process as a binary. Takes no arguments and is only
allowed in the MatchBody part when tracing.
enable_trace
-
With one parameter this function turns on tracing like the Erlang
call erlang:trace(self(), true, [P2]), where
P2 is the parameter to
enable_trace.
With two parameters, the first parameter is to be either a process
identifier or the registered name of a process. In this case tracing
is turned on for the designated process in the same way as in the
Erlang call erlang:trace(P1, true, [P2]), where
P1 is the first and P2 is the second argument. The
process P1 gets its trace messages sent to the
same tracer as the process executing the statement uses.
P1 cannot be one of the atoms
all, new or existing
(unless they are registered names).
P2 cannot be
cpu_timestamp or tracer.
Returns true and can only be used in
the MatchBody part when tracing.
disable_trace
-
With one parameter this function disables tracing like the Erlang
call erlang:trace(self(), false, [P2]), where
P2 is the parameter to
disable_trace.
With two parameters this function works as the Erlang call
erlang:trace(P1, false, [P2]), where P1
can be either a process identifier or a registered name and is
specified as the first argument to the match specification function.
P2 cannot be
cpu_timestamp or tracer.
Returns true and can only be used in the
MatchBody part when tracing.
trace
-
With two parameters this function takes a list
of trace flags to disable as first parameter and a list
of trace flags to enable as second parameter. Logically, the
disable list is applied first, but effectively all changes
are applied atomically. The trace flags
are the same as for erlang:trace/3,
not including cpu_timestamp, but including
tracer.
If a tracer is specified in both lists, the tracer in the
enable list takes precedence. If no tracer is specified, the same
tracer as the process executing the match specification is used (not the meta tracer).
If that process does not have a tracer either, the trace flags are ignored.
When using a tracer module,
the module must be loaded before the match specification is
executed. If it is not loaded, the match fails.
With three parameters to this function, the first is
either a process identifier or the registered name of a
process to set trace flags on, the second is the disable
list, and the third is the enable list.
Returns true if any trace property was changed
for the trace target process, otherwise false.
Can only be used in the MatchBody part when
tracing.
caller
-
Returns the calling function as a tuple {Module, Function,
Arity} or the atom undefined if the calling
function cannot be determined. Can only be used in the
MatchBody part when tracing.
Notice that if a "technically built in function" (that is, a
function not written in Erlang) is traced, the
caller function sometimes returns the atom
undefined. The calling
Erlang function is not available during such calls.
display
-
For debugging purposes only. Displays the single argument as an
Erlang term on stdout, which is seldom what is wanted.
Returns true and can only be used in the
MatchBody part when tracing.
get_tcw
-
Takes no argument and returns the value of the node's trace
control word. The same is done by
erlang:system_info(trace_control_word).
The trace control word is a 32-bit unsigned integer intended for
generic trace control. The trace control word can be tested and
set both from within trace match specifications and with BIFs.
This call is only allowed when tracing.
set_tcw
-
Takes one unsigned integer argument, sets the value of
the node's trace control word to the value of the argument,
and returns the previous value. The same is done by
erlang:system_flag(trace_control_word, Value).
It is only allowed to use set_tcw in the
MatchBody part when tracing.
silent
-
Takes one argument. If the argument is true,
the call trace message mode for the current process is set to
silent for this call and all later calls, that is, call trace
messages are inhibited even if
{message, true} is called in the
MatchBody part for a traced function.
This mode can also be activated with flag
silent to
erlang:trace/3.
If the argument is false, the call trace
message mode for the current process is set to normal
(non-silent) for this call and all later calls.
If the argument is not true or
false, the call trace message mode is
unaffected.
All "function calls" must be tuples, even if they take no arguments.
The value of self is the atom()
self, but the value of {self} is
the pid() of the current process.
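The difference between the literal and the call can be checked with a short sketch. It assumes ets:test_ms/2 as a harness and an arbitrary tuple {x} as match target, and is intended for an Erlang shell.

```erlang
%% The bare atom self is a literal; the tuple {self} is a function call.
{ok, AtomResult} = ets:test_ms({x}, [{'_', [], [self]}]),
{ok, CallResult} = ets:test_ms({x}, [{'_', [], [{self}]}]),
{AtomResult, CallResult =:= self()}.
```

The first element should be the atom self, and the second should be true because the match specification runs in the calling process.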
Match target
Each execution of a match specification is done against
a match target term. The format and content of the target term
depend on the context in which the match is done. The match
target for ETS is always a full table tuple. The match target
for call trace is always a list of all function arguments. The
match target for event trace depends on the event type, see
table below.
Context | Type      | Match target               | Description
ETS     |           | {Key, Value1, Value2, ...} | A table object
Trace   | call      | [Arg1, Arg2, ...]          | Function arguments
Trace   | send      | [Receiver, Message]        | Receiving process/port and message term
Trace   | 'receive' | [Node, Sender, Message]    | Sending node, process/port and message term
Match target depending on context
Variables and Literals
Variables take the form '$<number>', where
<number> is an integer between 0 and
100,000,000 (1e+8). The behavior if the number is outside these limits
is undefined. In the MatchHead part, the
special variable '_' matches anything, and never gets
bound (like _ in Erlang).
-
In the MatchCondition/MatchBody parts,
no unbound variables are allowed, so '_' is
interpreted as itself (an atom). Variables can only be bound in the
MatchHead part.
-
In the MatchBody and MatchCondition
parts, only variables bound
previously can be used.
-
As a special case, the following apply in the
MatchCondition/MatchBody parts:
-
The variable '$_' expands to the whole
match target term.
-
The variable '$$' expands to a list of the
values of all bound variables in order (that is,
['$1','$2', ...]).
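A minimal sketch of the two special variables, assuming an ETS-style match against the illustrative tuple {a, b, c} and ets:test_ms/2 as a harness (Erlang shell):

```erlang
%% '$1' binds the first element, '_' skips the second, '$2' binds the third.
Head = {'$1', '_', '$2'},
{ok, Whole} = ets:test_ms({a, b, c}, [{Head, [], ['$_']}]),
{ok, Bound} = ets:test_ms({a, b, c}, [{Head, [], ['$$']}]),
{Whole, Bound}.
```

Here '$_' should yield the whole object {a, b, c}, and '$$' the list [a, c] of bound values in variable order.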
In the MatchHead part, all literals (except the
variables above) are interpreted "as is".
In the MatchCondition/MatchBody parts, the
interpretation is in some ways different. Literals in these parts
can either be written "as is", which works for all literals except
tuples, or by using the special form {const, T},
where T is any Erlang term.
For tuple literals in the match specification, double tuple parentheses
can also be used, that is, construct them as a tuple of
arity one containing a single tuple, which is the one to be
constructed. The "double tuple parenthesis" syntax is useful to
construct tuples from already bound variables, like in
{{'$1', [a,b,'$2']}}. Examples:
Expression            | Variable Bindings  | Result
{{'$1','$2'}}         | '$1' = a, '$2' = b | {a,b}
{const, {'$1', '$2'}} | Irrelevant         | {'$1', '$2'}
a                     | Irrelevant         | a
'$1'                  | '$1' = []          | []
['$1']                | '$1' = []          | [[]]
[{{a}}]               | Irrelevant         | [{a}]
42                    | Irrelevant         | 42
"hello"               | Irrelevant         | "hello"
$1                    | Irrelevant         | 49 (the ASCII value for character '1')
Literals in MatchCondition/MatchBody Parts of a Match
Specification
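Some of these rows can be reproduced with a small sketch. It assumes ets:test_ms/2 as a harness and a match head that binds '$1' = a and '$2' = b; the helper fun Eval is hypothetical and exists only for this illustration (Erlang shell).

```erlang
%% Evaluate one body expression with '$1' = a and '$2' = b.
Eval = fun(Expr) ->
    {ok, R} = ets:test_ms({a, b}, [{{'$1', '$2'}, [], [Expr]}]),
    R
end,
Results = [Eval({{'$1', '$2'}}),        % tuple built from bound variables
           Eval({const, {'$1', '$2'}}), % constant: variables are not expanded
           Eval(['$1']),                % list built from a bound variable
           Eval([{{a}}]),               % list containing a one-element tuple
           Eval(42)].                   % non-composite literal, taken as is
```

With these bindings the list should be [{a,b}, {'$1','$2'}, [a], [{a}], 42].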
Execution of the Match
The execution of the match expression, when the runtime system
decides whether a trace message is to be sent, is as follows:
For each tuple in the MatchExpression list and while
no match has succeeded:
-
Match the MatchHead part against the match target
term, binding the '$<number>' variables
(much like in ets:match/2). If the MatchHead
part cannot match the arguments, the
match fails.
-
Evaluate each MatchCondition (where only
'$<number>' variables previously bound in the
MatchHead part can occur) and expect it to return
the atom true. When a condition does not evaluate
to true, the match fails. If any BIF call
generates an exception, the match also fails.
-
Two cases can occur:
-
If the match specification is executing when tracing:
Evaluate each ActionTerm in the same way as
the MatchConditions, but ignore the return
values. Regardless of what happens in this part, the match has
succeeded.
-
If the match specification is executed when selecting objects
from an ETS table:
Evaluate the expressions in order and return the value of
the last expression (typically there is only one expression
in this context).
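The clause-by-clause evaluation can be illustrated with a sketch for the ETS case. It assumes ets:test_ms/2 as a harness; the atoms big and small are arbitrary return values chosen for this example (Erlang shell).

```erlang
%% Two clauses with the same head; only the first has a condition.
Spec = [{{'$1', '_'}, [{'>', '$1', 5}], [big]},
        {{'$1', '_'}, [], [small]}],
Outcome = {ets:test_ms({9, x}, Spec),     % first clause matches
           ets:test_ms({1, x}, Spec),     % condition fails, second clause matches
           ets:test_ms({1, x, y}, Spec)}. % no head matches a three-element tuple
```

The clauses are tried in order and evaluation stops at the first one that matches, so the expected outcome is {{ok,big},{ok,small},{ok,false}}.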
Differences between Match Specifications in ETS and Tracing
ETS match specifications produce a return value.
Usually the MatchBody contains one single
ConditionExpression that defines the return value
without any side effects. Calls with side effects are not allowed in
the ETS context.
When tracing there is no return value to produce; the
match specification either matches or does not. The effect when the
expression matches is a trace message rather than a returned
term. The ActionTerms are executed as in an imperative
language, that is, for their side effects. Functions with side effects
are also allowed when tracing.
Tracing Examples
Match an argument list of three, where the first and third arguments
are equal:
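One way to write this match specification is sketched below. The check uses erlang:match_spec_test/3 (available from OTP 19), which is an assumption about the reader's runtime; the argument lists are illustrative (Erlang shell).

```erlang
%% '$1' appears twice in the head, so both positions must hold equal terms.
Spec = [{['$1', '_', '$1'], [], []}],
{ok, Match, _, _} = erlang:match_spec_test([a, b, a], Spec, trace),
{ok, NoMatch, _, _} = erlang:match_spec_test([a, b, c], Spec, trace),
{Match, NoMatch}.
```

The first argument list should match (true) and the second should not (false).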
Match an argument list of three, where the second argument is
a number > 3:
[{['_', '$1', '_'],
  [{'>', '$1', 3}],
  []}]
Match an argument list of three, where the third argument is
either a tuple containing argument one and two, or a list
beginning with argument one and two (that is,
[a,b,[a,b,c]] or [a,b,{a,b}]):
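A possible match specification for this case is sketched below, using 'orelse' so that the list tests only run when the tuple comparison fails. The helper fun Try and the use of erlang:match_spec_test/3 (OTP 19 or later) are assumptions made for the illustration (Erlang shell).

```erlang
Spec = [{['$1', '$2', '$3'],
         [{'orelse',
           {'=:=', '$3', {{'$1', '$2'}}},
           {'and',
            {'=:=', '$1', {hd, '$3'}},
            {'=:=', '$2', {hd, {tl, '$3'}}}}}],
         []}],
%% Run the specification against one argument list and keep only the result.
Try = fun(Args) ->
    {ok, Result, _Flags, _Warnings} = erlang:match_spec_test(Args, Spec, trace),
    Result
end,
Results = [Try([a, b, {a, b}]), Try([a, b, [a, b, c]]), Try([a, b, c])].
```

The last argument list makes hd fail inside the condition, which counts as a failed match, so the expected results are [true, true, false].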
The above problem can also be solved as follows:
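A sketch of such an alternative, moving the structural tests into two separate match heads instead of conditions. As before, the helper fun Try and erlang:match_spec_test/3 are assumptions used only to exercise the term (Erlang shell).

```erlang
%% First clause: third argument is the tuple {Arg1, Arg2}.
%% Second clause: third argument is a list starting with Arg1, Arg2.
Spec = [{['$1', '$2', {'$1', '$2'}], [], []},
        {['$1', '$2', ['$1', '$2' | '_']], [], []}],
Try = fun(Args) ->
    {ok, Result, _Flags, _Warnings} = erlang:match_spec_test(Args, Spec, trace),
    Result
end,
Results = [Try([a, b, {a, b}]), Try([a, b, [a, b, c]]), Try([a, b, c])].
```

The expected results are again [true, true, false].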
Match two arguments, where the first is a tuple beginning with
a list that in turn begins with the second argument times
two (that is, [{[4,x],y},2] or [{[8], y, z},4]):
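A sketch of a match specification for this case, with the arithmetic done in the condition part. The helper fun Try, the third (non-matching) argument list, and the use of erlang:match_spec_test/3 are illustrative assumptions (Erlang shell).

```erlang
%% 2 * Arg2 must equal the head of the list stored as element 1 of Arg1.
Spec = [{['$1', '$2'],
         [{'=:=', {'*', 2, '$2'}, {hd, {element, 1, '$1'}}}],
         []}],
Try = fun(Args) ->
    {ok, Result, _Flags, _Warnings} = erlang:match_spec_test(Args, Spec, trace),
    Result
end,
Results = [Try([{[4, x], y}, 2]), Try([{[8], y, z}, 4]), Try([{[5], y}, 2])].
```

The first two argument lists are the ones named in the text and should match; the third should not.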
Match three arguments. When all three are equal and are
numbers, append the process dump to the trace message, otherwise
let the trace message be "as is", but set the sequential trace
token label to 4711:
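A sketch of a match specification with this behavior, shown as a plain term because both bodies have side effects that are only meaningful under real tracing:

```erlang
%% Clause 1: three equal numeric arguments, append the process dump.
%% Clause 2: anything else, set the sequential trace token label.
[{['$1', '$1', '$1'],
  [{is_number, '$1'}],
  [{message, {process_dump}}]},
 {'_', [], [{set_seq_token, label, 4711}]}].
```

The second clause uses '_' as its head, so it accepts any argument list that the first clause rejected.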
As can be noted above, the parameter list can be matched against a
single MatchVariable or an '_'.
To replace the whole parameter list with a single variable is a special
case. In all other cases the MatchHead must be a
proper list.
Generate a trace message only if the trace control word is set to 1:
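A possible match specification, shown as a plain term since its result depends on the node's current trace control word:

```erlang
%% The condition compares the value returned by {get_tcw} with the constant 1.
[{'_', [{'==', {get_tcw}, {const, 1}}], []}].
```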
Generate a trace message only if there is a seq_trace token:
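A sketch of one way to express this. Since is_seq_trace is described above as returning true or false, the call is used directly as the condition here; this form is an assumption of the sketch rather than a quotation of the original example.

```erlang
%% Matches only while a sequential trace token is set for the traced process.
[{'_', [{is_seq_trace}], []}].
```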
Remove the 'silent' trace flag when the first argument is
'verbose', and add it when it is 'silent':
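A possible match specification, using the two-parameter form of trace (disable list first, enable list second). It is shown as a plain term because it changes trace flags when executed.

```erlang
%% '$1' is bound to the whole argument list; {hd, '$1'} is the first argument.
[{'$1', [{'==', {hd, '$1'}, verbose}], [{trace, [silent], []}]},
 {'$1', [{'==', {hd, '$1'}, silent}], [{trace, [], [silent]}]}].
```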
Add a return_trace message if the function is of arity 3:
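A sketch of such a match specification, with a catch-all second clause so that calls of other arities are still traced. The check with erlang:match_spec_test/3 assumes OTP 19 or later and that the returned flag list reports return_trace (Erlang shell).

```erlang
Spec = [{'$1', [{'==', {length, '$1'}, 3}], [{return_trace}]},
        {'_', [], []}],
%% Third element of the result is the list of trace flags to enable.
{ok, _, Flags3, _} = erlang:match_spec_test([a, b, c], Spec, trace),
{ok, _, Flags2, _} = erlang:match_spec_test([a, b], Spec, trace),
{lists:member(return_trace, Flags3), lists:member(return_trace, Flags2)}.
```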
Generate a trace message only if the function is of arity 3 and the
first argument is 'trace':
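A minimal sketch with a single clause, so that argument lists of any other shape do not match. The helper fun Try and erlang:match_spec_test/3 are assumptions used only to exercise the term (Erlang shell).

```erlang
%% The head fixes both the arity (three elements) and the first argument.
Spec = [{[trace, '$2', '$3'], [], []}],
Try = fun(Args) ->
    {ok, Result, _Flags, _Warnings} = erlang:match_spec_test(Args, Spec, trace),
    Result
end,
Results = [Try([trace, a, b]), Try([trace, a]), Try([debug, a, b])].
```

Only the first argument list should match. Adding a catch-all clause such as {'_', [], []} would make every call generate a trace message.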
ETS Examples
Match all objects in an ETS table, where the first element is
the atom 'strider' and the tuple arity is 3, and return the whole
object:
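A sketch of this match specification, checked with ets:test_ms/2 against a few illustrative objects (Erlang shell):

```erlang
%% Three-element head with a fixed first element; '$_' returns the object.
Spec = [{{strider, '_', '_'}, [], ['$_']}],
Outcome = {ets:test_ms({strider, ranger, 87}, Spec),
           ets:test_ms({strider, ranger}, Spec),
           ets:test_ms({gimli, dwarf, 139}, Spec)}.
```

Only the first object has both the right first element and arity 3, so the expected outcome is {{ok,{strider,ranger,87}},{ok,false},{ok,false}}.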
Match all objects in an ETS table with arity > 1 whose first
element is 'gandalf', and return element 2:
[{'$1',
  [{'==', gandalf, {element, 1, '$1'}}, {'>=', {size, '$1'}, 2}],
  [{element, 2, '$1'}]}]
In this example, if the first element had been the key, it is
much more efficient to match that key in the MatchHead
part than in the MatchConditions part.
The search space of the tables is restricted with regards to the
MatchHead so
that only objects with the matching key are searched.
Match tuples of three elements, where the second element is either
'merry' or 'pippin', and return the whole objects:
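A sketch of this match specification used with ets:select/2. The table name hobbits and its contents are made up for the illustration (Erlang shell).

```erlang
Tab = ets:new(hobbits, [set]),
true = ets:insert(Tab, [{1, merry, brandybuck}, {2, pippin, took},
                        {3, frodo, baggins}, {4, merry}]),
Spec = [{{'_', '$1', '_'},
         [{'orelse', {'==', '$1', merry}, {'==', '$1', pippin}}],
         ['$_']}],
%% The order of objects returned from a set table is unspecified, so sort.
Selected = lists:sort(ets:select(Tab, Spec)),
true = ets:delete(Tab),
Selected.
```

The object {4, merry} has only two elements and is therefore skipped, leaving [{1,merry,brandybuck},{2,pippin,took}].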
Function ets:test_ms/2
can be useful for testing complicated ETS matches.