<?xml version="1.0" encoding="latin1" ?>
<!DOCTYPE chapter SYSTEM "chapter.dtd">
<chapter>
<header>
<copyright>
<year>2003</year><year>2009</year>
<holder>Ericsson AB. All Rights Reserved.</holder>
</copyright>
<legalnotice>
The contents of this file are subject to the Erlang Public License,
Version 1.1, (the "License"); you may not use this file except in
compliance with the License. You should have received a copy of the
Erlang Public License along with this software. If not, it can be
retrieved online at http://www.erlang.org/.
Software distributed under the License is distributed on an "AS IS"
basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
the License for the specific language governing rights and limitations
under the License.
</legalnotice>
<title>Types and Function Specifications</title>
<prepared>Kostis Sagonas, Tobias Lindahl, Kenneth Lundin</prepared>
<docno></docno>
<date></date>
<rev></rev>
<file>typespec.xml</file>
</header>
<section>
<title>Introduction of Types</title>
<p>
Erlang is a dynamically typed language. Nevertheless, this section describes
an extension to the Erlang language for declaring sets of Erlang terms
that form a particular type, effectively a specific sub-type of the
set of all Erlang terms.
</p>
<p>
Subsequently, these types can be used to specify types of record fields
and the argument and return types of functions.
</p>
<p>
Type information can be used to document function interfaces,
to provide more information to bug detection tools such as <c>Dialyzer</c>,
and to let documentation tools such as <c>Edoc</c> generate program
documentation of various forms.
It is expected that the type language described in this document will
supersede the purely comment-based <c>@type</c> and
<c>@spec</c> declarations used by <c>Edoc</c>.
</p>
<warning>
The syntax and semantics described here are still preliminary and might be
slightly changed and extended before they become officially supported.
The plan is that this will happen in R14B.
</warning>
</section>
<section>
<marker id="syntax"></marker>
<title>Types and their Syntax</title>
<p>
Types describe sets of Erlang terms.
Types consist of, and are built from, a set of predefined types (e.g. <c>integer()</c>,
<c>atom()</c>, <c>pid()</c>, ...) described below.
A predefined type typically represents an infinite set of Erlang terms that
belong to it.
For example, the type <c>atom()</c> stands for the set of all Erlang atoms.
</p>
<p>
For integers and atoms, we allow for singleton types (e.g. the integers <c>-1</c>
and <c>42</c> or the atoms <c>'foo'</c> and <c>'bar'</c>).
All other types are built using unions of either predefined types or singleton
types. In a type union between a type and one of its sub-types, the sub-type is
absorbed by the super-type, and the union is subsequently treated as if the
sub-type were not a constituent of the union. For example, the type union:
</p>
<pre>
atom() | 'bar' | integer() | 42</pre>
<p>
describes the same set of terms as the type union:
</p>
<pre>
atom() | integer()</pre>
<p>
Because of the sub-type relations that exist between types, types form a lattice
where the topmost element, <c>any()</c>, denotes the set of all Erlang terms and
the bottom-most element, <c>none()</c>, denotes the empty set of terms.
</p>
<p>
The set of predefined types and the syntax for types is given below:
</p>
<pre><![CDATA[
Type :: any() %% The top type, the set of all Erlang terms.
| none() %% The bottom type, contains no terms.
| pid()
| port()
| ref()
| [] %% nil
| Atom
| Binary
| float()
| Fun
| Integer
| List
| Tuple
| Union
| UserDefined %% described in the section on user-defined types
Union :: Type1 | Type2
Atom :: atom()
| Erlang_Atom %% 'foo', 'bar', ...
Binary :: binary() %% <<_:_ * 8>>
| <<>>
| <<_:Erlang_Integer>> %% Base size
| <<_:_*Erlang_Integer>> %% Unit size
| <<_:Erlang_Integer, _:_*Erlang_Integer>>
Fun :: fun() %% any function
| fun((...) -> Type) %% any arity, returning Type
| fun(() -> Type)
| fun((TList) -> Type)
Integer :: integer()
| Erlang_Integer %% ..., -1, 0, 1, ... 42 ...
| Erlang_Integer..Erlang_Integer %% specifies an integer range
List :: list(Type) %% Proper list ([]-terminated)
| improper_list(Type1, Type2) %% Type1=contents, Type2=termination
| maybe_improper_list(Type1, Type2) %% Type1 and Type2 as above
Tuple :: tuple() %% stands for a tuple of any size
| {}
| {TList}
TList :: Type
| Type, TList
]]></pre>
<p>
Because lists are commonly used, they have shorthand type notations.
The type <c>list(T)</c> has the shorthand <c>[T]</c>. The shorthand <c>[T,...]</c> stands for
the set of non-empty proper lists whose elements are of type <c>T</c>.
The only difference between the two shorthands is that <c>[T]</c> may be an
empty list but <c>[T,...]</c> may not.
</p>
<p>
Notice that the shorthand for <c>list()</c>, i.e. the list of elements of unknown type,
is <c>[_]</c> (or <c>[any()]</c>), not <c>[]</c>.
The notation <c>[]</c> specifies the singleton type for the empty list.
</p>
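<p>
The list notations above can be compared side by side in a few type
declarations. This is only an illustrative sketch: the type names are
hypothetical, and the <c>-type</c> attribute used to declare them is the
one described later in this chapter.
</p>
<pre>
-type names()      :: [atom()].     %% same as list(atom()); may be empty
-type digits()     :: [0..9,...].   %% non-empty proper list of digits
-type anything()   :: [_].          %% same as list() and [any()]
-type empty_list() :: [].           %% singleton type: only the empty list
</pre>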
<p>
For convenience, the following types are also built-in.
They can be thought of as predefined aliases for the type unions also shown in
the table. (Some type unions below slightly abuse the syntax of types.)
</p>
<table>
<row>
<cell><b>Built-in type</b></cell><cell><b>Stands for</b></cell>
</row>
<row>
<cell><c>term()</c></cell><cell><c>any()</c></cell>
</row>
<row>
<cell><c>bool()</c></cell><cell><c>'false' | 'true'</c></cell>
</row>
<row>
<cell><c>byte()</c></cell><cell><c>0..255</c></cell>
</row>
<row>
<cell><c>char()</c></cell><cell><c>0..16#10ffff</c></cell>
</row>
<row>
<cell><c>non_neg_integer()</c></cell><cell><c>0..</c></cell>
</row>
<row>
<cell><c>pos_integer()</c></cell><cell><c>1..</c></cell>
</row>
<row>
<cell><c>neg_integer()</c></cell><cell><c>..-1</c></cell>
</row>
<row>
<cell><c>number()</c></cell><cell><c>integer() | float()</c></cell>
</row>
<row>
<cell><c>list()</c></cell><cell><c>[any()]</c></cell>
</row>
<row>
<cell><c>maybe_improper_list()</c></cell><cell><c>maybe_improper_list(any(), any())</c></cell>
</row>
<row>
<cell><c>maybe_improper_list(T)</c></cell><cell><c>maybe_improper_list(T, any())</c></cell>
</row>
<row>
<cell><c>string()</c></cell><cell><c>[char()]</c></cell>
</row>
<row>
<cell><c>nonempty_string()</c></cell><cell><c>[char(),...]</c></cell>
</row>
<row>
<cell><c>iolist()</c></cell><cell><c>maybe_improper_list(
char() | binary() | iolist(), binary() | [])</c></cell>
</row>
<row>
<cell><c>module()</c></cell><cell><c>atom()</c></cell>
</row>
<row>
<cell><c>mfa()</c></cell><cell><c>{atom(),atom(),byte()}</c></cell>
</row>
<row>
<cell><c>node()</c></cell><cell><c>atom()</c></cell>
</row>
<row>
<cell><c>timeout()</c></cell><cell><c>'infinity' | non_neg_integer()</c></cell>
</row>
<row>
<cell><c>no_return()</c></cell><cell><c>none()</c></cell>
</row>
</table>
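<p>
As a rough illustration of how these aliases combine with the type syntax
above, the following sketch declares a few types in terms of built-in ones.
All type names on the left-hand side are hypothetical and chosen only for
this example.
</p>
<pre>
-type port_number() :: 1..65535.
-type endpoint()    :: {node(), port_number()}.
-type callback()    :: mfa() | fun((term()) -> term()).
-type wait()        :: timeout().   %% 'infinity' | non_neg_integer()
</pre>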
<p>
Users are not allowed to define types with the same names as the predefined or
built-in ones.
The compiler checks this, and a violation results in a compilation
error.
(For bootstrapping purposes, it can instead result in just a warning if it
involves a built-in type that has just been introduced.)
</p>
<note>
The following built-in list types also exist,
but they are expected to be rarely used. Hence, they have long names:
</note>
<pre>
nonempty_maybe_improper_list(Type) :: nonempty_maybe_improper_list(Type, any())
nonempty_maybe_improper_list() :: nonempty_maybe_improper_list(any())
</pre>
<p>
where the following two types
define the set of Erlang terms one would expect:
</p>
<pre>
nonempty_improper_list(Type1, Type2)
nonempty_maybe_improper_list(Type1, Type2)
</pre>
<p>
Also for convenience, record notation can be used.
Records are just shorthand for the corresponding tuples.
</p>
<pre>
Record :: #Erlang_Atom{}
| #Erlang_Atom{Fields}
</pre>
<p>
Records have been extended to possibly contain type information.
This is described in the sub-section <seealso marker="#typeinrecords">"Type information in record declarations"</seealso> below.
</p>
</section>
<section>
<title>Type declarations of user-defined types</title>
<p>
As seen, the basic syntax of a type is an atom followed by a pair of
parentheses. New types are declared using <c>-type</c> compiler attributes
as in the following:
</p>
<pre>
-type my_type() :: Type.
</pre>
<p>
where the type name is an atom (<c>'my_type'</c> in the above) followed by
parentheses. <c>Type</c> is a type as defined in the previous section.
A current restriction is that <c>Type</c> can contain only predefined types
or user-defined types which have been previously defined.
This restriction is enforced by the compiler, and violating it results in a
compilation error. (A similar restriction currently exists for records.)
</p>
<p>
This means that currently general recursive types cannot be defined.
Lifting this restriction is future work.
</p>
<p>
Type declarations can also be parameterized by including type variables
between the parentheses. The syntax of type variables is the same as that of
Erlang variables (they start with an uppercase letter).
Naturally, these variables can, and should, appear on the right-hand side of the
definition. A concrete example appears below:
</p>
<pre>
-type orddict(Key, Val) :: [{Key, Val}].
</pre>
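<p>
A parameterized type is used by supplying concrete types for its variables.
The sketch below assumes two hypothetical types, <c>key()</c> and
<c>counters()</c>, and declares each type before its first use, in line
with the ordering restriction mentioned above.
</p>
<pre>
-type key()         :: atom().
-type orddict(K, V) :: [{K, V}].
-type counters()    :: orddict(key(), non_neg_integer()).
</pre>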
</section>
<marker id="typeinrecords"/>
<section>
<title>
Type information in record declarations
</title>
<p>
The types of record fields can be specified in the declaration of the
record. The syntax for this is:
</p>
<pre>
-record(rec, {field1 :: Type1, field2, field3 :: Type3}).
</pre>
<p>
For fields without type annotations, their type defaults to <c>any()</c>.
That is, the above is shorthand for:
</p>
<pre>
-record(rec, {field1 :: Type1, field2 :: any(), field3 :: Type3}).
</pre>
<p>
In the presence of initial values for fields,
the type must be declared after the initialization as in the following:
</p>
<pre>
-record(rec, {field1 = [] :: Type1, field2, field3 = 42 :: Type3}).
</pre>
<p>
Naturally, the initial values for fields should be compatible
with (i.e. a member of) the corresponding types.
The compiler checks this and reports a compilation error
if a violation is detected. For fields without initial values,
the singleton type <c>'undefined'</c> is added to all declared types.
In other words, the following two record declarations have identical
effects:
</p>
<pre>
-record(rec, {f1 = 42 :: integer(),
              f2 :: float(),
              f3 :: 'a' | 'b'}).
-record(rec, {f1 = 42 :: integer(),
              f2 :: 'undefined' | float(),
              f3 :: 'undefined' | 'a' | 'b'}).
</pre>
<p>
Because of this implicit <c>'undefined'</c>, it is recommended that records
contain initializers whenever possible.
</p>
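<p>
To illustrate the effect of initializers, consider the following sketch of
a hypothetical <c>conn</c> record. The first three fields have initial
values that are members of their declared types, so their types stay as
written. The last field has no initializer, so its effective type is
<c>'undefined' | pid()</c>.
</p>
<pre>
-record(conn, {host = "localhost" :: string(),
               port = 8080        :: 1..65535,
               retries = 0        :: non_neg_integer(),
               owner              :: pid()}).
</pre>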
<p>
Once defined, any record, whether or not it contains type information,
can be used as a type with the syntax:
</p>
<pre>
#rec{}
</pre>
<p>
In addition, the record fields can be further specified when using
a record type by adding type information about the field in the following
manner:
</p>
<pre>
#rec{some_field :: Type}
</pre>
<p>
Any unspecified fields are assumed to have the type in the original
record declaration.
</p>
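<p>
As a small sketch of both forms, assume a hypothetical <c>msg</c> record.
The first type below accepts any <c>msg</c> record, while the second
narrows the <c>tag</c> field and leaves <c>payload</c> with the type from
the record declaration (here <c>any()</c>).
</p>
<pre>
-record(msg, {tag :: atom(), payload}).

-type any_msg()  :: #msg{}.
-type ping_msg() :: #msg{tag :: 'ping'}.
</pre>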
</section>
<section>
<title>Specifications (contracts) for functions</title>
<p>
A contract (or specification) for a function is given using the new
compiler attribute <c>'-spec'</c>. The basic format is as follows:
</p>
<pre>
-spec Module:Function(ArgType1, ..., ArgTypeN) -> ReturnType.
</pre>
<p>
The arity of the function has to match the number of arguments,
or else a compilation error occurs.
</p>
<p>
This form can also be used in header files (.hrl) to declare type
information for exported functions.
Then these header files can be included in files that (implicitly or
explicitly) import these functions.
</p>
<p>
For most uses within a given module, the following shorthand is allowed:
</p>
<pre>
-spec Function(ArgType1, ..., ArgTypeN) -> ReturnType.
</pre>
<p>
Also, for documentation purposes, argument names can be given:
</p>
<pre>
-spec Function(ArgName1 :: Type1, ..., ArgNameN :: TypeN) -> RT.
</pre>
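<p>
For instance, a contract with named arguments could look like the sketch
below. The module <c>shapes</c> and the function <c>area/2</c> are
hypothetical and serve only to show a spec next to the function it
describes.
</p>
<pre>
-module(shapes).
-export([area/2]).

%% The spec has two argument types, matching the arity of area/2.
-spec area(Width :: number(), Height :: number()) -> number().
area(Width, Height) -> Width * Height.
</pre>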
<p>
A function specification can be overloaded.
That is, it can have several types, separated by a semicolon (<c>;</c>):
</p>
<pre>
-spec foo(T1, T2) -> T3
; (T4, T5) -> T6.
</pre>
<p>
A current restriction is that the domains of the argument
types cannot overlap. Violating it currently results in a compiler
warning (note: not an error).
For example, the following specification results in a warning:
</p>
<pre>
-spec foo(pos_integer()) -> pos_integer()
; (integer()) -> integer().
</pre>
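<p>
One way to stay within this restriction is to make the argument domains
disjoint. The following sketch assumes a hypothetical function
<c>sign/1</c>, whose three clauses cover negative integers, zero, and
positive integers without overlap.
</p>
<pre>
-module(signs).
-export([sign/1]).

-spec sign(neg_integer()) -> -1
        ; (0)             -> 0
        ; (pos_integer()) -> 1.
sign(0)            -> 0;
sign(N) when N > 0 -> 1;
sign(N) when 0 > N -> -1.
</pre>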
<p>
Type variables can be used in specifications to specify relations for
the input and output arguments of a function.
For example, the following specification defines the type of a
polymorphic identity function:
</p>
<pre>
-spec id(X) -> X.
</pre>
<p>
However, note that the above specification does not restrict the input
and output type in any way.
We can constrain these types by guard-like subtype constraints:
</p>
<pre>
-spec id(X) -> X when is_subtype(X, tuple()).
</pre>
<p>
and provide bounded quantification. Currently,
the <c>is_subtype/2</c> guard is the only guard which can
be used in a <c>'-spec'</c> attribute.
</p>
<p>
The scope of an <c>is_subtype/2</c> constraint is the
<c>(...) -> RetType</c>
specification after which it appears. To avoid confusion,
we suggest that different variables are used in different constituents of
an overloaded contract as in the example below:
</p>
<pre>
-spec foo({X, integer()}) -> X when is_subtype(X, atom())
; ([Y]) -> Y when is_subtype(Y, number()).
</pre>
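<p>
A constrained contract can be placed above a function in the usual way.
The sketch below uses a hypothetical function <c>wrap/1</c> and the
<c>is_subtype/2</c> form described here. Whether a given compiler release
accepts this exact form is an assumption.
</p>
<pre>
-module(wrappers).
-export([wrap/1]).

%% X is bounded by tuple(); the result carries the same X.
-spec wrap(X) -> {ok, X} when is_subtype(X, tuple()).
wrap(X) when is_tuple(X) -> {ok, X}.
</pre>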
<p>
Some functions in Erlang are not meant to return,
either because they define servers or because they are used to
throw exceptions, as in the function below:
</p>
<pre>
my_error(Err) -> erlang:throw({error, Err}).
</pre>
<p>
For such functions we recommend the use of the special <c>no_return()</c>
type for their "return", via a contract of the form:
</p>
<pre>
-spec my_error(term()) -> no_return().
</pre>
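<p>
Putting the pieces together, a caller can rely on such a contract in one
of its clauses. In the sketch below, the module <c>errs</c> and the
function <c>check/1</c> are hypothetical. The second clause of
<c>check/1</c> never returns a value, because <c>my_error/1</c> always
throws.
</p>
<pre>
-module(errs).
-export([my_error/1, check/1]).

-spec my_error(term()) -> no_return().
my_error(Err) -> erlang:throw({error, Err}).

-spec check(integer()) -> pos_integer().
check(N) when N > 0 -> N;
check(N)            -> my_error({not_positive, N}).
</pre>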
</section>
</chapter>