diff options
Diffstat (limited to 'system/doc/reference_manual/typespec.xml')
-rwxr-xr-x | system/doc/reference_manual/typespec.xml | 464 |
1 files changed, 464 insertions, 0 deletions
diff --git a/system/doc/reference_manual/typespec.xml b/system/doc/reference_manual/typespec.xml new file mode 100755 index 0000000000..a3660713e4 --- /dev/null +++ b/system/doc/reference_manual/typespec.xml @@ -0,0 +1,464 @@ +<?xml version="1.0" encoding="latin1" ?> +<!DOCTYPE chapter SYSTEM "chapter.dtd"> + +<chapter> + <header> + <copyright> + <year>2003</year><year>2009</year> + <holder>Ericsson AB. All Rights Reserved.</holder> + </copyright> + <legalnotice> + The contents of this file are subject to the Erlang Public License, + Version 1.1, (the "License"); you may not use this file except in + compliance with the License. You should have received a copy of the + Erlang Public License along with this software. If not, it can be + retrieved online at http://www.erlang.org/. + + Software distributed under the License is distributed on an "AS IS" + basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See + the License for the specific language governing rights and limitations + under the License. + + </legalnotice> + + <title>Types and Function Specifications</title> + <prepared>Kostis Sagonas, Tobias Lindahl, Kenneth Lundin</prepared> + <docno></docno> + <date></date> + <rev></rev> + <file>typespec.xml</file> + </header> + + <section> + <title>Introduction of Types</title> + <p> + Although Erlang is a dynamically typed language this section describes + an extension to the Erlang language for declaring sets of Erlang terms + to form a particular type, effectively forming a specific sub-type of the + set of all Erlang terms. + </p> + <p> + Subsequently, these types can be used to specify types of record fields + and the argument and return types of functions. + </p> + <p> + Type information can be used to document function interfaces, + provide more information for bug detection tools such as <c>Dialyzer</c>, + and can be exploited by documentation tools such as <c>Edoc</c> for + generating program documentation of various forms. + It is expected that the type language described in this document will + supersede and replace the purely comment-based <c>@type</c> and + <c>@spec</c> declarations used by <c>Edoc</c>. + </p> + <warning> + The syntax and semantics described here is still preliminary and might be + slightly changed and extended before it becomes officially supported. + The plan is that this will happen in R14B. + </warning> + </section> + <section> + <marker id="syntax"></marker> + <title>Types and their Syntax</title> + <p> + Types describe sets of Erlang terms. + Types consist and are built from a set of predefined types (e.g. <c>integer()</c>, + <c>atom()</c>, <c>pid()</c>, ...) described below. + Predefined types represent a typically infinite set of Erlang terms which + belong to this type. + For example, the type <c>atom()</c> stands for the set of all Erlang atoms. + </p> + <p> + For integers and atoms, we allow for singleton types (e.g. the integers <c>-1</c> + and <c>42</c> or the atoms <c>'foo'</c> and <c>'bar'</c>). + + All other types are built using unions of either predefined types or singleton + types. In a type union between a type and one of its sub-types the sub-type is + absorbed by the super-type and the union is subsequently treated as if the + sub-type was not a constituent of the union. For example, the type union: + </p> + <pre> + atom() | 'bar' | integer() | 42</pre> + <p> + describes the same set of terms as the type union: + </p> + <pre> +atom() | integer()</pre> + <p> + Because of sub-type relations that exist between types, types form a lattice + where the topmost element, any(), denotes the set of all Erlang terms and + the bottom-most element, none(), denotes the empty set of terms. + </p> + <p> + The set of predefined types and the syntax for types is given below: + </p> + <pre><![CDATA[ +Type :: any() %% The top type, the set of all Erlang terms. + | none() %% The bottom type, contains no terms. + | pid() + | port() + | ref() + | [] %% nil + | Atom + | Binary + | float() + | Fun + | Integer + | List + | Tuple + | Union + | UserDefined %% described in Section 2 + +Union :: Type1 | Type2 + +Atom :: atom() + | Erlang_Atom %% 'foo', 'bar', ... + +Binary :: binary() %% <<_:_ * 8>> + | <<>> + | <<_:Erlang_Integer>> %% Base size + | <<_:_*Erlang_Integer>> %% Unit size + | <<_:Erlang_Integer, _:_*Erlang_Integer>> + +Fun :: fun() %% any function + | fun((...) -> Type) %% any arity, returning Type + | fun(() -> Type) + | fun((TList) -> Type) + +Integer :: integer() + | Erlang_Integer %% ..., -1, 0, 1, ... 42 ... + | Erlang_Integer..Erlang_Integer %% specifies an integer range + +List :: list(Type) %% Proper list ([]-terminated) + | improper_list(Type1, Type2) %% Type1=contents, Type2=termination + | maybe_improper_list(Type1, Type2) %% Type1 and Type2 as above + +Tuple :: tuple() %% stands for a tuple of any size + | {} + | {TList} + +TList :: Type + | Type, TList +]]></pre> + <p> + Because lists are commonly used, they have shorthand type notations. + The type <c>list(T)</c> has the shorthand <c>[T]</c>. The shorthand <c>[T,...]</c> stands for + the set of non-empty proper lists whose elements are of type <c>T</c>. + The only difference between the two shorthands is that <c>[T]</c> may be an + empty list but <c>[T,...]</c> may not. + </p> + <p> + Notice that the shorthand for <c>list()</c>, i.e. the list of elements of unknown type, + is <c>[_]</c> (or <c>[any()]</c>), not <c>[]</c>. + The notation <c>[]</c> specifies the singleton type for the empty list. + </p> + <p> + For convenience, the following types are also built-in. + They can be thought as predefined aliases for the type unions also shown in + the table. (Some type unions below slightly abuse the syntax of types.) + </p> + <table> + <row> + <cell><b>Built-in type</b></cell><cell><b>Stands for</b></cell> + </row> + <row> + <cell><c>term()</c></cell><cell><c>any()</c></cell> + </row> + <row> + <cell><c>bool()</c></cell><cell><c>'false' | 'true'</c></cell> + </row> + <row> + <cell><c>byte()</c></cell><cell><c>0..255</c></cell> + </row> + <row> + <cell><c>char()</c></cell><cell><c>0..16#10ffff</c></cell> + </row> + <row> + <cell><c>non_neg_integer()</c></cell><cell><c>0..</c></cell> + </row> + <row> + <cell><c>pos_integer()</c></cell><cell><c>1..</c></cell> + </row> + <row> + <cell><c>neg_integer()</c></cell><cell><c>..-1</c></cell> + </row> + <row> + <cell><c>number()</c></cell><cell><c>integer() | float()</c></cell> + </row> + <row> + <cell><c>list()</c></cell><cell><c>[any()]</c></cell> + </row> + <row> + <cell><c>maybe_improper_list()</c></cell><cell><c>maybe_improper_list(any(), any())</c></cell> + </row> + <row> + <cell><c>maybe_improper_list(T)</c></cell><cell><c>maybe_improper_list(T, any())</c></cell> + </row> + <row> + <cell><c>string()</c></cell><cell><c>[char()]</c></cell> + </row> + <row> + <cell><c>nonempty_string()</c></cell><cell><c>[char(),...]</c></cell> + </row> + <row> + <cell><c>iolist()</c></cell><cell><c>maybe_improper_list( +char() | binary() | iolist(), binary() | [])</c></cell> + </row> + <row> + <cell><c>module()</c></cell><cell><c>atom()</c></cell> + </row> + <row> + <cell><c>mfa()</c></cell><cell><c>{atom(),atom(),byte()}</c></cell> + </row> + <row> + <cell><c>node()</c></cell><cell><c>atom()</c></cell> + </row> + <row> + <cell><c>timeout()</c></cell><cell><c>'infinity' | non_neg_integer()</c></cell> + </row> + <row> + <cell><c>no_return()</c></cell><cell><c>none()</c></cell> + </row> + </table> + + <p> + Users are not allowed to define types with the same names as the predefined or + built-in ones. + This is checked by the compiler and its violation results in a compilation + error. + (For bootstrapping purposes, it can also result to just a warning if this + involves a built-in type which has just been introduced.) + </p> + <note> + The following built-in list types also exist, + but they are expected to be rarely used. Hence, they have long names: + </note> + <pre> +nonempty_maybe_improper_list(Type) :: nonempty_maybe_improper_list(Type, any()) +nonempty_maybe_improper_list() :: nonempty_maybe_improper_list(any()) + </pre> + <p> + where the following two types + define the set of Erlang terms one would expect: + </p> + <pre> +nonempty_improper_list(Type1, Type2) +nonempty_maybe_improper_list(Type1, Type2) + </pre> + <p> + Also for convenience, we allow for record notation to be used. + Records are just shorthands for the corresponding tuples. + </p> + <pre> +Record :: #Erlang_Atom{} + | #Erlang_Atom{Fields} + </pre> + <p> + Records have been extended to possibly contain type information. + This is described in the sub-section <seealso marker="#typeinrecords">"Type information in record declarations"</seealso> below. + </p> + </section> + + <section> + <title>Type declarations of user-defined types</title> + <p> + As seen, the basic syntax of a type is an atom followed by closed + parentheses. New types are declared using '-type' compiler attributes + as in the following: + </p> + <pre> +-type my_type() :: Type. + </pre> + <p> + where the type name is an atom (<c>'my_type'</c> in the above) followed by + parenthesis. Type is a type as defined in the previous section. + A current restriction is that Type can contain only predefined types + or user-defined types which have been previously defined. + This restriction is enforced by the compiler and results in a + compilation error. (A similar restriction currently exists for records). + </p> + <p> + This means that currently general recursive types cannot be defined. + Lifting this restriction is future work. + </p> + <p> + Type declarations can also be parameterized by including type variables + between the parentheses. The syntax of type variables is the same as + Erlang variables (starts with an upper case letter). + Naturally, these variables can - and should - appear on the RHS of the + definition. A concrete example appears below: + </p> + <pre> +-type orddict(Key, Val) :: [{Key, Val}]. + </pre> + + </section> + + <marker id="typeinrecords"/> + <section> + <title> + Type information in record declarations + </title> + <p> + The types of record fields can be specified in the declaration of the + record. The syntax for this is: + </p> + <pre> +-record(rec, {field1 :: Type1, field2, field3 :: Type3}). + </pre> + <p> + For fields without type annotations, their type defaults to any(). + I.e., the above is a shorthand for: + </p> + <pre> +-record(rec, {field1 :: Type1, field2 :: any(), field3 :: Type3}). + </pre> + <p> + In the presence of initial values for fields, + the type must be declared after the initialization as in the following: + </p> + <pre> +-record(rec, {field1 = [] :: Type1, field2, field3 = 42 :: Type3}). + </pre> + <p> + Naturally, the initial values for fields should be compatible + with (i.e. a member of) the corresponding types. + This is checked by the compiler and results in a compilation error + if a violation is detected. For fields without initial values, + the singleton type <c>'undefined'</c> is added to all declared types. + In other words, the following two record declarations have identical + effects: + </p> + <pre> +-record(rec, {f1 = 42 :: integer(), + f2 :: float(), + f3 :: 'a' | 'b'). + +-record(rec, {f1 = 42 :: integer(), + f2 :: 'undefined' | float(), + f3 :: 'undefined' | 'a' | 'b'). + </pre> + <p> + For this reason, it is recommended that records contain initializers, + whenever possible. + </p> + <p> + Any record, containing type information or not, once defined, + can be used as a type using the syntax: + </p> + <pre> +#rec{} + </pre> + <p> + In addition, the record fields can be further specified when using + a record type by adding type information about the field in the following + manner: + </p> + <pre> +#rec{some_field :: Type} + </pre> + <p> + Any unspecified fields are assumed to have the type in the original + record declaration. + </p> + </section> + + <section> + <title>Specifications (contracts) for functions</title> + <p> + A contract (or specification) for a function is given using the new + compiler attribute <c>'-spec'</c>. The basic format is as follows: + </p> + <pre> +-spec Module:Function(ArgType1, ..., ArgTypeN) -> ReturnType. + </pre> + <p> + The arity of the function has to match the number of arguments, + or else a compilation error occurs. + </p> + <p> + This form can also be used in header files (.hrl) to declare type + information for exported functions. + Then these header files can be included in files that (implicitly or + explicitly) import these functions. + </p> + <p> + For most uses within a given module, the following shorthand is allowed: + </p> + <pre> +-spec Function(ArgType1, ..., ArgTypeN) -> ReturnType. + </pre> + <p> + Also, for documentation purposes, argument names can be given: + </p> + <pre> +-spec Function(ArgName1 :: Type1, ..., ArgNameN :: TypeN) -> RT. + </pre> + <p> + A function specification can be overloaded. + That is, it can have several types, separated by a semicolon (<c>;</c>): + </p> + <pre> +-spec foo(T1, T2) -> T3 + ; (T4, T5) -> T6. + </pre> + <p> + A current restriction, which currently results in a warning + (OBS: not an error) by the compiler, is that the domains of the argument + types cannot be overlapping. + For example, the following specification results in a warning: + </p> + <pre> +-spec foo(pos_integer()) -> pos_integer() + ; (integer()) -> integer(). + </pre> + <p> + Type variables can be used in specifications to specify relations for + the input and output arguments of a function. + For example, the following specification defines the type of a + polymorphic identity function: + </p> + <pre> +-spec id(X) -> X. + </pre> + <p> + However, note that the above specification does not restrict the input + and output type in any way. + We can constrain these types by guard-like subtype constraints: + </p> + <pre> +-spec id(X) -> X when is_subtype(X, tuple()). + </pre> + <p> + and provide bounded quantification. Currently, + the <c>is_subtype/2</c> guard is the only guard which can + be used in a <c>'-spec'</c> attribute. + </p> + <p> + The scope of an <c>is_subtype/2</c> constraint is the + <c>(...) -> RetType</c> + specification after which it appears. To avoid confusion, + we suggest that different variables are used in different constituents of + an overloaded contract as in the example below: + </p> + <pre> +-spec foo({X, integer()}) -> X when is_subtype(X, atom()) + ; ([Y]) -> Y when is_subtype(Y, number()). + </pre> + <p> + Some functions in Erlang are not meant to return; + either because they define servers or because they are used to + throw exceptions as the function below: + </p> + <pre> +my_error(Err) -> erlang:throw({error, Err}). + </pre> + <p> + For such functions we recommend the use of the special no_return() + type for their "return", via a contract of the form: + </p> + <pre> +-spec my_error(term()) -> no_return(). + </pre> + </section> +</chapter> + |