Copyright 2001-2015 Ericsson AB. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
The Abstract Format
Arndt Jonasson
Kenneth Lundin
Jultomten
00-12-01
This document describes the standard representation of parse trees for Erlang
programs as Erlang terms. This representation is known as the abstract format.
Functions dealing with such parse trees are compile:forms/[1,2]
and functions in the modules epp, erl_eval, erl_lint, erl_pp,
erl_parse, and io.
They are also used as input and output for parse transforms (see the module
compile).
We use the function Rep to denote the mapping from an Erlang source
construct C to its abstract format representation R, and write
R = Rep(C).
The word LINE below represents an integer, and denotes the
number of the line in the source file where the construction occurred.
Several instances of LINE in the same construction may denote
different lines.
Since operators are not terms in their own right, when operators are
mentioned below, the representation of an operator should be taken to
be the atom with a printname consisting of the same characters as the
operator.
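The mapping from source text to abstract format can be tried out with the standard scanner and parser. The following is a minimal sketch, not part of the original document; the module name absform_rep and the function rep/1 are hypothetical names chosen for illustration.

```erlang
-module(absform_rep).
-export([rep/1]).

%% Compute the abstract format of a single expression.
%% Source is the expression as a string, ending with a full stop,
%% for example "1 + 2.".
rep(Source) ->
    %% Turn the characters into tokens.
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    %% Parse the tokens into a list holding one abstract expression.
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.
```

Calling absform_rep:rep("1 + 2.") returns an op tuple whose third element is the atom '+', which shows how an operator is represented by the atom with the same printname.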
Module declarations and forms
A module declaration consists of a sequence of forms that are either
function declarations or attributes.
- If D is a module declaration consisting of the forms
F_1, ..., F_k, then
Rep(D) = [Rep(F_1), ..., Rep(F_k)].
- If F is an attribute , then
Rep(F) = .
- If F is an attribute , then
Rep(F) = .
- If F is an attribute , then
Rep(F) = .
- If F is an attribute , then
Rep(F) = .
- If F is an attribute , then
Rep(F) = .
- If F is an attribute , then
Rep(F) = .
- If F is an attribute , then
Rep(F) = .
- If F is a record declaration -record(Name,{V_1, ..., V_k}), then
Rep(F) =
{attribute,LINE,record,{Name,[Rep(V_1), ..., Rep(V_k)]}}. For Rep(V), see below.
- If F is a type attribute (i.e. or
)
where each
is a variable, then Rep(F) =
.
For Rep(T), see below.
- If F is a type spec (i.e. or
)
,
where each is a fun type clause with an
argument sequence of the same length , then
Rep(F) =
.
For Rep(Tc_i), see below.
- If F is a type spec (i.e. or
)
,
where each is a fun type clause with an
argument sequence of the same length , then
Rep(F) =
.
For Rep(Tc_i), see below.
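- If F is a wild attribute -A(T), then
Rep(F) = {attribute,LINE,A,T}.
- If F is a function declaration Name Fc_1 ; ... ; Name Fc_k,
where each Fc_i is a function clause with a
pattern sequence of the same length Arity, then
Rep(F) = {function,LINE,Name,Arity,[Rep(Fc_1), ..., Rep(Fc_k)]}.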
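Individual forms can be inspected by parsing one form at a time. The sketch below assumes that erl_scan:string/1 and erl_parse:parse_form/1 are used directly on source strings; the module name absform_forms and the function form/1 are hypothetical.

```erlang
-module(absform_forms).
-export([form/1]).

%% Compute the abstract format of a single form (an attribute or a
%% function declaration). Source must end with a full stop.
form(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.
```

For example, absform_forms:form("foo(X) -> X.") returns a function tuple carrying the name foo, the arity 1 and a list with one clause, while absform_forms:form("-module(m).") returns an attribute tuple.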
Type clauses
- If T is a fun type clause
Ret]]>, where each
and are types, then
Rep(T) =
.
- If T is a bounded fun type clause ,
where is an unbounded fun type clause and
is a type guard sequence, then Rep(T) =
.
Type guards
- If G is a constraint , where
is an atom and each is a
type, then Rep(G) =
.
- If G is a type definition ,
where is a variable and
is a type, then Rep(G) =
.
Types
- If T is a type definition ,
where is a variable and
is a type, then Rep(T) =
.
- If T is a type union ,
where each is a type, then Rep(T) =
.
- If T is a type range ,
where and are types, then
Rep(T) = .
- If T is a binary operation ,
where is an arithmetic or bitwise binary operator
and and are types, then
Rep(T) = .
- If T is , where is an
arithmetic or bitwise unary operator and is a
type, then Rep(T) = .
- If T is a fun type , then Rep(T) =
.
- If T is a variable , then Rep(T) =
, where is an atom
with a printname consisting of the same characters as
.
- If T is an atomic literal L and L is not a string literal, then
Rep(T) = Rep(L).
- If T is a tuple or map type (i.e.
or ), then Rep(T) =
.
- If T is a type , where each
is a type, then Rep(T) =
.
- If T is a remote type , where
each is a type and and
, then Rep(T) =
.
- If T is the nil type , then Rep(T) =
.
- If T is a list type , where
is a type, then Rep(T) =
.
- If T is a non-empty list type , where
is a type, then Rep(T) =
.
- If T is a map type , where each
is a map pair type, then Rep(T) =
.
- If T is a map pair type V]]>, where
and are types,
then Rep(T) =
.
- If T is a tuple type , where
each is a type, then Rep(T) =
.
- If T is a record type , where
is an atom, then Rep(T) =
.
- If T is a record type ,
where is an atom, then Rep(T) =
.
- If T is a record field type ,
where is an atom, then Rep(T) =
.
- If T is a record field type >]]>, then Rep(T) =
.
- If T is a binary type >]]>, where
is a type, then Rep(T) =
.
- If T is a binary type >]]>,
where is a type, then Rep(T) =
.
- If T is a binary type >]]>,
where and is a type, then
Rep(T) =
.
- If T is a fun type Ret)]]>, then
Rep(T) = .
- If T is a fun type , where
is an unbounded fun type clause,
then Rep(T) = .
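Type attributes and specs are ordinary forms, so their representation can be inspected in the same way as any other form. This is a sketch under the assumption that the standard parser is applied to a source string; the module name absform_types and the function type_form/1 are hypothetical.

```erlang
-module(absform_types).
-export([type_form/1]).

%% Parse a -type or -spec form given as a string ending with a full stop
%% and return its abstract format.
type_form(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.
```

With this helper, a union type such as a | b shows up as a type tuple tagged union, and the argument list of a fun type clause shows up as a type tuple tagged product.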
Record fields
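Each field in a record declaration may have an optional
explicit default initializer expression and an optional type.
- If V is A, then
Rep(V) = {record_field,LINE,Rep(A)}.
- If V is A = E, then
Rep(V) = {record_field,LINE,Rep(A),Rep(E)}.
- If V is A :: T, where A is
an atom and T is a type, then Rep(V) =
{typed_record_field,{record_field,LINE,Rep(A)},Rep(T)}.
- If V is A = E :: T, where
A is an atom, E is an expression and
T is a type, then Rep(V) =
{typed_record_field,{record_field,LINE,Rep(A),Rep(E)},Rep(T)}.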
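The representation of untyped record fields can be checked by parsing a record declaration. The following sketch deliberately leaves out typed fields; the module name absform_record and the function fields/1 are hypothetical.

```erlang
-module(absform_record).
-export([fields/1]).

%% Parse a -record declaration given as a string and return the list of
%% field representations, that is [Rep(V_1), ..., Rep(V_k)].
fields(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, {attribute, _, record, {_Name, Fields}}} = erl_parse:parse_form(Tokens),
    Fields.
```

For the declaration -record(r, {a, b = 1}). the first field has no initializer and the second carries the representation of the integer 1.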
Representation of parse errors and end of file
In addition to the representations of forms, the list that represents
a module declaration (as returned by functions in erl_parse and
epp) may contain tuples {error,E} and {warning,W}, denoting
syntactically incorrect forms and warnings, and {eof,LINE}, denoting an end
of stream encountered before a complete form had been parsed.
Atomic literals
There are five kinds of atomic literals, which are represented in the
same way in patterns, expressions and guards:
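- If L is an integer literal, then
Rep(L) = {integer,LINE,L}.
- If L is a character literal, then
Rep(L) = {char,LINE,L}.
- If L is a float literal, then
Rep(L) = {float,LINE,L}.
- If L is a string literal consisting of the characters
C_1, ..., C_k, then
Rep(L) = {string,LINE,[C_1, ..., C_k]}.
- If L is an atom literal, then
Rep(L) = {atom,LINE,L}.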
Note that negative integer and float literals do not occur as such; they are
parsed as an application of the unary negation operator.
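The note about negative literals is easy to confirm with the parser. This is a small sketch, with the hypothetical module name absform_literals and function literal/1.

```erlang
-module(absform_literals).
-export([literal/1]).

%% Parse a literal (or any single expression) given as a string ending
%% with a full stop and return its abstract format.
literal(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.
```

Parsing "-1." does not give an integer tuple with a negative value; it gives an op tuple that applies the unary '-' operator to the integer 1.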
Patterns
If Ps is a sequence of patterns P_1, ..., P_k, then
Rep(Ps) = [Rep(P_1), ..., Rep(P_k)]. Such sequences occur as the
list of arguments to a function or fun.
Individual patterns are represented as follows:
- If P is an atomic literal L, then Rep(P) = Rep(L).
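- If P is a compound pattern P_1 = P_2, then
Rep(P) = {match,LINE,Rep(P_1),Rep(P_2)}.
- If P is a variable pattern V, then
Rep(P) = {var,LINE,A},
where A is an atom with a printname consisting of the same characters as
V.
- If P is a universal pattern _, then
Rep(P) = {var,LINE,'_'}.
- If P is a tuple pattern {P_1, ..., P_k}, then
Rep(P) = {tuple,LINE,[Rep(P_1), ..., Rep(P_k)]}.
- If P is a nil pattern [], then
Rep(P) = {nil,LINE}.
- If P is a cons pattern [P_h | P_t], then
Rep(P) = {cons,LINE,Rep(P_h),Rep(P_t)}.
- If P is a binary pattern <<P_1:Size_1/TSL_1, ..., P_k:Size_k/TSL_k>>, then
Rep(P) = {bin,LINE,[{bin_element,LINE,Rep(P_1),Rep(Size_1),Rep(TSL_1)}, ..., {bin_element,LINE,Rep(P_k),Rep(Size_k),Rep(TSL_k)}]}.
For Rep(TSL), see below.
An omitted Size is represented by default. An omitted TSL
(type specifier list) is represented by default.
- If P is P_1 Op P_2, where Op is a binary operator (this
is either an occurrence of ++ applied to a literal string or character
list, or an occurrence of an expression that can be evaluated to a number
at compile time),
then Rep(P) = {op,LINE,Op,Rep(P_1),Rep(P_2)}.
- If P is Op P_0, where Op is a unary operator (this is an
occurrence of an expression that can be evaluated to a number at compile
time), then Rep(P) = {op,LINE,Op,Rep(P_0)}.
- If P is a record pattern #Name{Field_1=P_1, ..., Field_k=P_k},
then Rep(P) =
{record,LINE,Name,[{record_field,LINE,Rep(Field_1),Rep(P_1)}, ..., {record_field,LINE,Rep(Field_k),Rep(P_k)}]}.
- If P is #Name.Field, then
Rep(P) = {record_index,LINE,Name,Rep(Field)}.
- If P is ( P_0 ), then
Rep(P) = Rep(P_0),
i.e., parenthesized patterns cannot be distinguished from their bodies.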
Note that every pattern has the same source form as some expression, and is
represented the same way as the corresponding expression.
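That a pattern is represented in the same way as the expression with the same source form can be checked by comparing the two parse results. This is a sketch that assumes both strings are scanned with erl_scan:string/1, so that all annotations refer to line 1; the module name absform_patterns and its functions are hypothetical.

```erlang
-module(absform_patterns).
-export([rep/1, pattern_of_match/1]).

%% Abstract format of a single expression given as a string.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

%% Return Rep(P) for a match expression of the form P = E.
pattern_of_match(Source) ->
    {match, _, Pattern, _Expr} = rep(Source),
    Pattern.
```

The pattern extracted from "{X, _} = Y." is a tuple representation whose second element is the universal pattern, and it is equal to the result of parsing "{X, _}." as an expression.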
Expressions
A body B is a sequence of expressions E_1, ..., E_k, and
Rep(B) = [Rep(E_1), ..., Rep(E_k)].
An expression E is one of the following alternatives:
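- If E is an atomic literal L, then
Rep(E) = Rep(L).
- If E is P = E_0, then
Rep(E) = {match,LINE,Rep(P),Rep(E_0)}.
- If E is a variable V, then
Rep(E) = {var,LINE,A},
where A is an atom with a printname consisting of the same
characters as V.
- If E is a tuple skeleton {E_1, ..., E_k}, then
Rep(E) = {tuple,LINE,[Rep(E_1), ..., Rep(E_k)]}.
- If E is [], then
Rep(E) = {nil,LINE}.
- If E is a cons skeleton [E_h | E_t], then
Rep(E) = {cons,LINE,Rep(E_h),Rep(E_t)}.
- If E is a binary constructor <<V_1:Size_1/TSL_1, ..., V_k:Size_k/TSL_k>>, then
Rep(E) = {bin,LINE,[{bin_element,LINE,Rep(V_1),Rep(Size_1),Rep(TSL_1)}, ..., {bin_element,LINE,Rep(V_k),Rep(Size_k),Rep(TSL_k)}]}.
For Rep(TSL), see below.
An omitted Size is represented by default. An omitted TSL
(type specifier list) is represented by default.
- If E is E_1 Op E_2, where Op is a binary operator,
then Rep(E) = {op,LINE,Op,Rep(E_1),Rep(E_2)}.
- If E is Op E_0, where Op is a unary operator, then
Rep(E) = {op,LINE,Op,Rep(E_0)}.
- If E is #Name{Field_1=E_1, ..., Field_k=E_k}, then
Rep(E) =
{record,LINE,Name,[{record_field,LINE,Rep(Field_1),Rep(E_1)}, ..., {record_field,LINE,Rep(Field_k),Rep(E_k)}]}.
- If E is E_0#Name{Field_1=E_1, ..., Field_k=E_k}, then
Rep(E) =
{record,LINE,Rep(E_0),Name,[{record_field,LINE,Rep(Field_1),Rep(E_1)}, ..., {record_field,LINE,Rep(Field_k),Rep(E_k)}]}.
- If E is #Name.Field, then
Rep(E) = {record_index,LINE,Name,Rep(Field)}.
- If E is E_0#Name.Field, then
Rep(E) = {record_field,LINE,Rep(E_0),Name,Rep(Field)}.
- If E is #{W_1, ..., W_k} where each
W_i is a map assoc or exact field, then Rep(E) =
{map,LINE,[Rep(W_1), ..., Rep(W_k)]}. For Rep(W), see
below.
- If E is E_0#{W_1, ..., W_k} where each
W_i is a map assoc or exact field, then Rep(E) =
{map,LINE,Rep(E_0),[Rep(W_1), ..., Rep(W_k)]}. For
Rep(W), see below.
- If E is catch E_0, then
Rep(E) = {'catch',LINE,Rep(E_0)}.
- If E is E_0(E_1, ..., E_k), then
Rep(E) = {call,LINE,Rep(E_0),[Rep(E_1), ..., Rep(E_k)]}.
- If E is E_m:E_0(E_1, ..., E_k), then
Rep(E) =
{call,LINE,{remote,LINE,Rep(E_m),Rep(E_0)},[Rep(E_1), ..., Rep(E_k)]}.
- If E is a list comprehension [E_0 || W_1, ..., W_k],
where each W_i is a generator or a filter, then
Rep(E) = {lc,LINE,Rep(E_0),[Rep(W_1), ..., Rep(W_k)]}. For Rep(W), see
below.
- If E is a binary comprehension <<E_0 || W_1, ..., W_k>>,
where each W_i is a generator or a filter, then
Rep(E) = {bc,LINE,Rep(E_0),[Rep(W_1), ..., Rep(W_k)]}. For Rep(W), see
below.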
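- If E is begin B end, where B is a body, then
Rep(E) = {block,LINE,Rep(B)}.
- If E is if Ic_1 ; ... ; Ic_k end,
where each Ic_i is an if clause, then
Rep(E) =
{'if',LINE,[Rep(Ic_1), ..., Rep(Ic_k)]}.
- If E is case E_0 of Cc_1 ; ... ; Cc_k end,
where E_0 is an expression and each Cc_i is a
case clause, then
Rep(E) =
{'case',LINE,Rep(E_0),[Rep(Cc_1), ..., Rep(Cc_k)]}.
- If E is try B catch Tc_1 ; ... ; Tc_k end,
where B is a body and each Tc_i is a catch clause, then
Rep(E) =
{'try',LINE,Rep(B),[],[Rep(Tc_1), ..., Rep(Tc_k)],[]}.
- If E is try B of Cc_1 ; ... ; Cc_k catch Tc_1 ; ... ; Tc_n end,
where B is a body,
each Cc_i is a case clause and
each Tc_j is a catch clause, then
Rep(E) =
{'try',LINE,Rep(B),[Rep(Cc_1), ..., Rep(Cc_k)],[Rep(Tc_1), ..., Rep(Tc_n)],[]}.
- If E is try B after A end,
where B and A are bodies, then
Rep(E) =
{'try',LINE,Rep(B),[],[],Rep(A)}.
- If E is try B of Cc_1 ; ... ; Cc_k after A end,
where B and A are bodies and
each Cc_i is a case clause, then
Rep(E) =
{'try',LINE,Rep(B),[Rep(Cc_1), ..., Rep(Cc_k)],[],Rep(A)}.
- If E is try B catch Tc_1 ; ... ; Tc_k after A end,
where B and A are bodies and
each Tc_i is a catch clause, then
Rep(E) =
{'try',LINE,Rep(B),[],[Rep(Tc_1), ..., Rep(Tc_k)],Rep(A)}.
- If E is try B of Cc_1 ; ... ; Cc_k catch Tc_1 ; ... ; Tc_n after A end,
where B and A are bodies,
each Cc_i is a case clause and
each Tc_j is a catch clause, then
Rep(E) =
{'try',LINE,Rep(B),[Rep(Cc_1), ..., Rep(Cc_k)],[Rep(Tc_1), ..., Rep(Tc_n)],Rep(A)}.
- If E is receive Cc_1 ; ... ; Cc_k end,
where each Cc_i is a case clause, then
Rep(E) =
{'receive',LINE,[Rep(Cc_1), ..., Rep(Cc_k)]}.
- If E is receive Cc_1 ; ... ; Cc_k after E_0 -> B_t end,
where each Cc_i is a case clause,
E_0 is an expression and B_t is a body, then
Rep(E) =
{'receive',LINE,[Rep(Cc_1), ..., Rep(Cc_k)],Rep(E_0),Rep(B_t)}.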
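- If E is fun Name / Arity, then
Rep(E) = {'fun',LINE,{function,Name,Arity}}.
- If E is fun Module:Name/Arity, then
Rep(E) = {'fun',LINE,{function,Rep(Module),Rep(Name),Rep(Arity)}}.
(Before the R15 release: Rep(E) = {'fun',LINE,{function,Module,Name,Arity}}.)
- If E is fun Fc_1 ; ... ; Fc_k end
where each Fc_i is a function clause, then Rep(E) =
{'fun',LINE,{clauses,[Rep(Fc_1), ..., Rep(Fc_k)]}}.
- If E is fun Name Fc_1 ; ... ; Name Fc_k end
where Name is a variable and each
Fc_i is a function clause, then Rep(E) =
{named_fun,LINE,Name,[Rep(Fc_1), ..., Rep(Fc_k)]}.
- If E is query [E_0 || W_1, ..., W_k] end,
where each W_i is a generator or a filter, then
Rep(E) = {'query',LINE,{lc,LINE,Rep(E_0),[Rep(W_1), ..., Rep(W_k)]}}.
For Rep(W), see below.
- If E is E_0.Field, a Mnesia record access
inside a query, then
Rep(E) = {record_field,LINE,Rep(E_0),Rep(Field)}.
- If E is ( E_0 ), then
Rep(E) = Rep(E_0),
i.e., parenthesized expressions cannot be distinguished from their bodies.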
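Two of the statements above lend themselves to a quick check: a remote call is wrapped in a remote tuple, and parentheses leave no trace in the abstract format. The sketch below assumes input scanned with erl_scan:string/1, so that every annotation is line 1; the module name absform_exprs and the function rep/1 are hypothetical.

```erlang
-module(absform_exprs).
-export([rep/1]).

%% Abstract format of a single expression given as a string ending
%% with a full stop.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.
```

Comparing absform_exprs:rep("(1 + 2).") with absform_exprs:rep("1 + 2.") yields identical terms, which is why tools that print abstract code have to reinsert parentheses based on operator precedence.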
Generators and filters
When W is a generator or a filter (in the body of a list or binary comprehension), then:
- If W is a generator P <- E, where P is a pattern and
E is an expression, then
Rep(W) = {generate,LINE,Rep(P),Rep(E)}.
- If W is a generator P <= E, where P is a pattern and
E is an expression, then
Rep(W) = {b_generate,LINE,Rep(P),Rep(E)}.
- If W is a filter E, which is an expression, then
Rep(W) = Rep(E).
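The qualifier list of a comprehension can be extracted from the parse result to see how generators and filters are mixed. This is an illustrative sketch; the module name absform_quals and the function qualifiers/1 are hypothetical.

```erlang
-module(absform_quals).
-export([qualifiers/1]).

%% Parse a list or binary comprehension given as a string and return
%% the list [Rep(W_1), ..., Rep(W_k)] of its generators and filters.
qualifiers(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Comprehension]} = erl_parse:parse_exprs(Tokens),
    case Comprehension of
        {lc, _, _Template, Qualifiers} -> Qualifiers;
        {bc, _, _Template, Qualifiers} -> Qualifiers
    end.
```

A filter has no wrapper of its own: in the result for "[X || X <- L, X > 0]." the second element is simply the representation of the expression X > 0.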
Binary element type specifiers
A type specifier list TSL for a binary element is a sequence of type
specifiers TS_1 - ... - TS_k.
Rep(TSL) = [Rep(TS_1), ..., Rep(TS_k)].
When TS is a type specifier for a binary element, then:
- If TS is an atom A, Rep(TS) = A.
- If TS is a couple A:Value where A is an atom and
Value is an integer, Rep(TS) = {A,Value}.
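The size and type specifier list of each binary element, including the default placeholder for omitted parts, can be read off the bin_element tuples. This is a sketch with the hypothetical module name absform_bin and function elements/1.

```erlang
-module(absform_bin).
-export([elements/1]).

%% Parse a binary constructor given as a string and return, for each
%% element, the pair {Rep(Size), Rep(TSL)}; omitted parts appear as
%% the atom default.
elements(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [{bin, _, Elements}]} = erl_parse:parse_exprs(Tokens),
    [{Size, TSL} || {bin_element, _, _Value, Size, TSL} <- Elements].
```

Note that type specifiers are plain atoms or two-element tuples without any line information, unlike most other parts of the abstract format.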
Map assoc and exact fields
When W is an assoc or exact field (in the body of a map), then:
- If W is an assoc field K => V, where
K and V are both expressions,
then Rep(W) = {map_field_assoc,LINE,Rep(K),Rep(V)}.
- If W is an exact field K := V, where
K and V are both expressions,
then Rep(W) = {map_field_exact,LINE,Rep(K),Rep(V)}.
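Map creation and map update differ only in whether the map tuple carries the representation of the expression being updated. The following sketch shows both field kinds; the module name absform_maps and the function rep/1 are hypothetical.

```erlang
-module(absform_maps).
-export([rep/1]).

%% Abstract format of a single map expression given as a string ending
%% with a full stop.
rep(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.
```

Parsing "#{a => 1}." gives a three-element map tuple with one assoc field, while "M#{a := 2}." gives a four-element map tuple that also holds the representation of M.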
Clauses
There are function clauses, if clauses, case clauses
and catch clauses.
A clause is one of the following alternatives:
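- If C is a function clause ( Ps ) -> B
where Ps is a pattern sequence and B is a body, then
Rep(C) = {clause,LINE,Rep(Ps),[],Rep(B)}.
- If C is a function clause ( Ps ) when Gs -> B
where Ps is a pattern sequence,
Gs is a guard sequence and B is a body, then
Rep(C) = {clause,LINE,Rep(Ps),Rep(Gs),Rep(B)}.
- If C is an if clause Gs -> B
where Gs is a guard sequence and B is a body, then
Rep(C) = {clause,LINE,[],Rep(Gs),Rep(B)}.
- If C is a case clause P -> B
where P is a pattern and B is a body, then
Rep(C) = {clause,LINE,[Rep(P)],[],Rep(B)}.
- If C is a case clause P when Gs -> B
where P is a pattern,
Gs is a guard sequence and B is a body, then
Rep(C) = {clause,LINE,[Rep(P)],Rep(Gs),Rep(B)}.
- If C is a catch clause P -> B
where P is a pattern and B is a body, then
Rep(C) = {clause,LINE,[Rep({throw,P,_})],[],Rep(B)}.
- If C is a catch clause X : P -> B
where X is an atomic literal or a variable pattern,
P is a pattern and B is a body, then
Rep(C) = {clause,LINE,[Rep({X,P,_})],[],Rep(B)}.
- If C is a catch clause P when Gs -> B
where P is a pattern, Gs is a guard sequence
and B is a body, then
Rep(C) = {clause,LINE,[Rep({throw,P,_})],Rep(Gs),Rep(B)}.
- If C is a catch clause X : P when Gs -> B
where X is an atomic literal or a variable pattern,
P is a pattern, Gs is a guard sequence
and B is a body, then
Rep(C) = {clause,LINE,[Rep({X,P,_})],Rep(Gs),Rep(B)}.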
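All clause kinds share the same clause tuple, and a case clause always has a one-element pattern list. This can be seen by extracting the clauses of a case expression, as in the sketch below; the module name absform_clauses and the function case_clauses/1 are hypothetical.

```erlang
-module(absform_clauses).
-export([case_clauses/1]).

%% Parse a case expression given as a string and return the list of
%% clause representations [Rep(Cc_1), ..., Rep(Cc_k)].
case_clauses(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [{'case', _, _Expr, Clauses}]} = erl_parse:parse_exprs(Tokens),
    Clauses.
```

In the result, a clause without a guard has the empty list in the guard position, and a clause with a guard has a list of guards, each of which is itself a list of guard tests.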
Guards
A guard sequence Gs is a sequence of guards G_1; ...; G_k, and
Rep(Gs) = [Rep(G_1), ..., Rep(G_k)]. If the guard sequence is
empty, Rep(Gs) = [].
A guard G is a nonempty sequence of guard tests Gt_1, ..., Gt_k, and
Rep(G) = [Rep(Gt_1), ..., Rep(Gt_k)].
A guard test is one of the following alternatives:
- If Gt is an atomic literal L, then Rep(Gt) = Rep(L).
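- If Gt is a variable pattern V, then
Rep(Gt) = {var,LINE,A},
where A is an atom with a printname consisting of the same characters as
V.
- If Gt is a tuple skeleton {Gt_1, ..., Gt_k}, then
Rep(Gt) = {tuple,LINE,[Rep(Gt_1), ..., Rep(Gt_k)]}.
- If Gt is [], then
Rep(Gt) = {nil,LINE}.
- If Gt is a cons skeleton [Gt_h | Gt_t], then
Rep(Gt) = {cons,LINE,Rep(Gt_h),Rep(Gt_t)}.
- If Gt is a binary constructor <<Gt_1:Size_1/TSL_1, ..., Gt_k:Size_k/TSL_k>>, then
Rep(Gt) = {bin,LINE,[{bin_element,LINE,Rep(Gt_1),Rep(Size_1),Rep(TSL_1)}, ..., {bin_element,LINE,Rep(Gt_k),Rep(Size_k),Rep(TSL_k)}]}.
For Rep(TSL), see above.
An omitted Size is represented by default. An omitted TSL
(type specifier list) is represented by default.
- If Gt is Gt_1 Op Gt_2, where
Op is a binary operator, then Rep(Gt) = {op,LINE,Op,Rep(Gt_1),Rep(Gt_2)}.
- If Gt is Op Gt_0, where Op is a unary operator, then
Rep(Gt) = {op,LINE,Op,Rep(Gt_0)}.
- If Gt is #Name{Field_1=Gt_1, ..., Field_k=Gt_k}, then
Rep(Gt) =
{record,LINE,Name,[{record_field,LINE,Rep(Field_1),Rep(Gt_1)}, ..., {record_field,LINE,Rep(Field_k),Rep(Gt_k)}]}.
- If Gt is #Name.Field, then
Rep(Gt) = {record_index,LINE,Name,Rep(Field)}.
- If Gt is Gt_0#Name.Field, then
Rep(Gt) = {record_field,LINE,Rep(Gt_0),Name,Rep(Field)}.
- If Gt is A(Gt_1, ..., Gt_k), where A is an atom, then
Rep(Gt) = {call,LINE,Rep(A),[Rep(Gt_1), ..., Rep(Gt_k)]}.
- If Gt is A_m:A(Gt_1, ..., Gt_k), where A_m is
the atom erlang and A is an atom or an operator, then
Rep(Gt) = {call,LINE,{remote,LINE,Rep(A_m),Rep(A)},[Rep(Gt_1), ..., Rep(Gt_k)]}.
- If Gt is {A_m,A}(Gt_1, ..., Gt_k), where A_m is
the atom erlang and A is an atom or an operator, then
Rep(Gt) = {call,LINE,Rep({A_m,A}),[Rep(Gt_1), ..., Rep(Gt_k)]}.
- If Gt is ( Gt_0 ), then
Rep(Gt) = Rep(Gt_0),
i.e., parenthesized guard tests cannot be distinguished from their bodies.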
Note that every guard test has the same source form as some expression,
and is represented the same way as the corresponding expression.
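The two-level structure of a guard sequence, with guards separated by semicolons and guard tests separated by commas, becomes visible when the guard part of a function clause is extracted. This is a minimal sketch; the module name absform_guards and the function guards/1 are hypothetical.

```erlang
-module(absform_guards).
-export([guards/1]).

%% Parse a function declaration with exactly one clause, given as a
%% string, and return Rep(Gs) for the guard sequence of that clause.
guards(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, {function, _, _Name, _Arity, [Clause]}} = erl_parse:parse_form(Tokens),
    {clause, _, _Patterns, Guards, _Body} = Clause,
    Guards.
```

For a clause head such as f(X) when is_integer(X), X > 0; is_atom(X) the result is a list of two guards, the first holding two guard tests and the second holding one.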
The abstract format after preprocessing
The compilation option debug_info can be given to the
compiler to have the abstract code stored in
the abstract_code chunk in the BEAM file
(for debugging purposes).
In OTP R9C and later, the abstract_code chunk will
contain {raw_abstract_v1,AbstractCode},
where AbstractCode is the abstract code as described
in this document.
In releases of OTP prior to R9C, the abstract code after some more
processing was stored in the BEAM file. The first element of the
tuple would be either abstract_v1 (R7B) or
abstract_v2 (R8B).
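The stored abstract code can be read back with beam_lib. The sketch below compiles a tiny module from forms in memory with the debug_info option and then extracts the abstract_code chunk; the module name absform_debug_info, its functions, and the compiled module m are hypothetical names used for illustration only.

```erlang
-module(absform_debug_info).
-export([abstract_code/0]).

%% Parse one form given as a string ending with a full stop.
form(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.

%% Compile a small module with debug_info and return the contents of
%% its abstract_code chunk.
abstract_code() ->
    Forms = [form("-module(m)."),
             form("-export([foo/0])."),
             form("foo() -> ok.")],
    %% compile:forms/2 returns the BEAM code as a binary.
    {ok, m, Beam} = compile:forms(Forms, [debug_info]),
    %% beam_lib accepts the binary directly, no file is needed.
    {ok, {m, [{abstract_code, Code}]}} = beam_lib:chunks(Beam, [abstract_code]),
    Code.
```

The returned term is a two-element tuple tagged raw_abstract_v1 whose second element is a list of forms. If the module had been compiled without debug_info, beam_lib would report no_abstract_code instead.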