This section describes the "bit syntax", which was added to the Erlang language in release 5.0 (R7). Compared to the original bit syntax prototype by Claes Wikström and Tony Rogvall (presented at the Erlang User's Conference 1999), this implementation differs primarily in the following respects:
the character pairs '<<' and '>>' are used to delimit binary patterns and constructors (not '<' and '>' as in the prototype),
the tail syntax ('|Variable') has been eliminated,
all size expressions must be bound,
a type
lists and tuples cannot be generated
there are no paddings whatsoever.
In Erlang a Bin is used for constructing binaries and matching binary patterns. A Bin is written with the following syntax:
<<E1, E2, ... En>>
A Bin is a low-level sequence of bytes. The purpose of a Bin is to be able to, from a high level, construct a binary,
Bin = <<E1, E2, ... En>>
in which case all elements must be bound, or to match a binary,
<<E1, E2, ... En>> = Bin
where Bin is bound, and where the elements are bound or unbound, as in any match.
Each element specifies a certain segment of the binary. A segment is a set of contiguous bits of the binary (not necessarily on a byte boundary). The first element specifies the initial segment, the second element specifies the following segment, and so on.
The following examples illustrate how binaries are constructed or matched, and how elements and tails are specified.
Example 1: A binary can be constructed from a set of constants or a string literal:
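Bin11 = <<1, 17, 42>>,
Bin12 = <<"abc">>
yields binaries of size 3; binary_to_list(Bin11) evaluates to [1, 17, 42], and binary_to_list(Bin12) evaluates to [97, 98, 99].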
Example 2: Similarly, a binary can be constructed from a set of bound variables:
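A = 1, B = 17, C = 42,
Bin2 = <<A, B, C:16>>
yields a binary of size 4, and binary_to_list(Bin2) evaluates to [1, 17, 00, 42]. Here a size expression is used for the variable C to specify a 16-bit segment of Bin2.
Example 3: A Bin can also be used for matching: if D, E, and F are unbound variables, and Bin2 is bound as in the previous example,
<<D:16, E, F/binary>> = Bin2
yields D = 273, E = 00, and F bound to a binary of size 1: binary_to_list(F) evaluates to [42].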
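As a hedged illustration of Examples 1 to 3, the following sketch builds binaries from constants, from a string literal, and from bound variables, and then matches a 4-byte binary. The module name bin_basics, the function names, and the concrete values are assumptions chosen only for this illustration.

```erlang
-module(bin_basics).
-export([construct/0, match/0]).

%% Build one binary from constants, one from a string literal,
%% and one from bound variables (C occupies a 16-bit segment).
construct() ->
    Bin11 = <<1, 17, 42>>,
    Bin12 = <<"abc">>,
    A = 1, B = 17, C = 42,
    Bin2 = <<A, B, C:16>>,
    {binary_to_list(Bin11), binary_to_list(Bin12), binary_to_list(Bin2)}.

%% Match a 4-byte binary into a 16-bit integer, an 8-bit integer,
%% and the remaining bytes as a binary tail.
match() ->
    Bin2 = <<1, 17, 42:16>>,
    <<D:16, E, F/binary>> = Bin2,
    {D, E, binary_to_list(F)}.
```

Note that the 16-bit segment D spans the first two bytes, so its value is 1*256 + 17.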
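Example 4: The following is a more elaborate example of matching, where Dgram is bound to the consecutive bytes of an IP datagram of IP protocol version 4, and where we want to extract the header and the data of the datagram:
-define(IP_VERSION, 4).
-define(IP_MIN_HDR_LEN, 5).
DgramSize = size(Dgram),
case Dgram of
    <<?IP_VERSION:4, HLen:4, SrvcType:8, TotLen:16, ID:16, Flgs:3, FragOff:13,
      TTL:8, Proto:8, HdrChkSum:16, SrcIP:32, DestIP:32,
      RestDgram/binary>> when HLen >= 5, 4*HLen =< DgramSize ->
        OptsLen = 4*(HLen - ?IP_MIN_HDR_LEN),
        <<Opts:OptsLen/binary, Data/binary>> = RestDgram,
        ...
end.
Here the segment corresponding to the Opts variable has a type modifier specifying that Opts should bind to a binary. All other variables have the default type, which is unsigned integer.
An IP datagram header is of variable length, and its length, measured in the number of 32-bit words, is given in the segment corresponding to HLen, the minimum value of which is 5. It is the segment corresponding to Opts that is variable: if HLen is equal to 5, Opts will be an empty binary.
The tail variables RestDgram and Data bind to binaries, as all tail variables do. Both may bind to empty binaries.
If the first 4-bit segment of Dgram is not equal to 4, or if HLen is less than 5, or if the size of Dgram is less than 4*HLen, the match of Dgram fails.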
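To try the datagram match on concrete input, the fragment can be wrapped in a function and fed a hand-built datagram. This is only a sketch: the module name ip_demo, the functions parse/1 and sample/0, the return tuples, and all field values in the sample are assumptions made for the illustration, and byte_size/1 is used where older releases would use size/1.

```erlang
-module(ip_demo).
-export([parse/1, sample/0]).

-define(IP_VERSION, 4).
-define(IP_MIN_HDR_LEN, 5).

%% Split an IPv4 datagram into header length, options, and payload.
%% Returns 'error' if the version, header length, or size check fails.
parse(Dgram) ->
    DgramSize = byte_size(Dgram),
    case Dgram of
        <<?IP_VERSION:4, HLen:4, _SrvcType:8, _TotLen:16,
          _ID:16, _Flgs:3, _FragOff:13,
          _TTL:8, _Proto:8, _HdrChkSum:16,
          _SrcIP:32, _DestIP:32,
          RestDgram/binary>> when HLen >= 5, 4*HLen =< DgramSize ->
            %% Options occupy whatever the header has beyond 5 words.
            OptsLen = 4*(HLen - ?IP_MIN_HDR_LEN),
            <<Opts:OptsLen/binary, Data/binary>> = RestDgram,
            {ok, HLen, Opts, Data};
        _ ->
            error
    end.

%% Hand-built datagram: 24-byte header (HLen = 6, so 4 option bytes)
%% followed by a 3-byte payload. All field values are made up.
sample() ->
    <<4:4, 6:4, 0:8, 27:16,
      0:16, 0:3, 0:13,
      64:8, 17:8, 0:16,
      16#C0A80001:32, 16#C0A80002:32,
      1, 2, 3, 4,
      "abc">>.
```

The guard guarantees that RestDgram holds at least OptsLen bytes, so the inner match cannot fail once the clause has been selected.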
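Note that "B=<<1>>" will be interpreted as "B =< <1>>", which is a syntax error. The correct way to write the expression is "B = <<1>>".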
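Each segment has the following general syntax:
Value:Size/TypeSpecifierList
Both the Size and the TypeSpecifierList parts can be omitted, so the following variations are also allowed: Value, Value:Size, and Value/TypeSpecifierList.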
Default values will be used for missing specifications. The default values are described in the section "Defaults" below.
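Used in binary construction, the Value part is any expression. Used in binary matching, the Value part must be a literal or a variable.
The Size part of the segment, multiplied by the unit in the TypeSpecifierList, gives the number of bits for the segment.
The TypeSpecifierList is a list of type specifiers separated by hyphens. It can specify the type (integer, float, or binary), the signedness (signed or unsigned), the endianness (for example, big or little), and the unit (written unit:IntegerLiteral).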
Example:
X:4/little-signed-integer-unit:8
This element has a total size of 4*8 = 32 bits, and it contains a signed integer in little-endian order.
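A minimal sketch of this element in use, assuming a hypothetical module seg_demo with hypothetical encode/1 and decode/1 functions: the same specification is used once for construction and once for matching.

```erlang
-module(seg_demo).
-export([encode/1, decode/1]).

%% Size 4 times unit 8 gives a 32-bit segment holding a signed,
%% little-endian integer.
encode(X) ->
    <<X:4/little-signed-integer-unit:8>>.

%% The same specification reads the four bytes back as a signed integer.
decode(<<X:4/little-signed-integer-unit:8>>) ->
    X.
```

Because the byte order is little-endian, the least significant byte comes first in the resulting binary.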
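The default type for a segment is integer. The default type does not depend on the value, even if the value is a literal. For instance, the default type in '<<3.14>>' is integer, not float.
The default Size depends on the type. For integer it is 8. For float it is 64. For binary it is the whole binary.
The default unit depends on the type.
For integer and float it is 1. For binary it is 8.
The default signedness is unsigned.
The default endianness is big.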
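The effect of the defaults can be checked with a small sketch. It assumes the defaults are an 8-bit, unsigned, big-endian integer with unit 1; the module name defaults_demo and its function names are hypothetical.

```erlang
-module(defaults_demo).
-export([same_as_default/1, big_endian/0, unsigned/0]).

%% A bare value uses the defaults: an 8-bit unsigned big-endian integer.
same_as_default(X) ->
    <<X>> =:= <<X:8/integer-unsigned-big-unit:1>>.

%% Big-endian is the default: the most significant byte comes first.
big_endian() ->
    <<1:16>> =:= <<0, 1>>.

%% Unsigned is the default when matching: the byte 255 is read as 255,
%% not as -1.
unsigned() ->
    <<X>> = <<255>>,
    X.
```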
This section describes the rules for constructing binaries using
the bit syntax. Unlike when constructing lists or tuples, the construction
of a binary can fail with a badarg exception.
There can be zero or more segments in a binary to be constructed.
The expression '<<>>' constructs a zero length binary.
Each segment in a binary can consist of zero or more bits.
There are no alignment rules for individual segments, but the total
number of bits in all segments must be evenly divisible by 8,
or in other words, the resulting binary must consist of a whole number
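of bytes. A badarg exception is raised if the resulting binary is not byte-aligned. Example:
<<X:1,Y:6>>
The total number of bits is 7, which is not evenly divisible by 8;
thus, there will be a badarg exception (and a compile error as well). The following example
<<X:1,Y:6,Z:1>>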
will successfully construct a binary of 8 bits, or one byte. (Provided that all of X, Y and Z are integers.)
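The byte-aligned case can be sketched as follows, assuming segment sizes of 1, 6, and 1 bits and a hypothetical module align_demo with a hypothetical pack/3 function. The failing 7-bit case is deliberately left out of the sketch.

```erlang
-module(align_demo).
-export([pack/3]).

%% Three segments of 1, 6, and 1 bits add up to exactly one byte,
%% although no individual segment is byte-aligned.
pack(X, Y, Z) ->
    <<X:1, Y:6, Z:1>>.
```

For example, pack(1, 0, 1) sets only the highest and the lowest bit of the byte.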
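As noted earlier, segments have the following general syntax:
Value:Size/TypeSpecifierList
When constructing binaries, Value and Size can be any Erlang expression. However, for syntactical reasons, both Value and Size must be enclosed in parentheses if the expression consists of anything more than a single literal or variable. The following gives a compiler syntax error:
<<X+1:8>>
This expression must be rewritten to
<<(X+1):8>>
in order to be accepted by the compiler.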
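A small sketch of the parenthesized form, for a computed value and for a computed size. The module name paren_demo and the functions next_byte/1 and sized/2 are hypothetical.

```erlang
-module(paren_demo).
-export([next_byte/1, sized/2]).

%% An arithmetic expression used as a value is wrapped in parentheses.
next_byte(X) ->
    <<(X+1):8>>.

%% The same applies to an expression used as a size (here N bytes).
sized(V, N) ->
    <<V:(N*8)>>.
```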
As syntactic sugar, a literal string may be written instead of an element.
<<"hello">>
which is syntactic sugar for
<<$h,$e,$l,$l,$o>>
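The equivalence between a string literal and a sequence of character segments can be checked directly. The following is a minimal sketch; the module name str_demo and its functions are hypothetical.

```erlang
-module(str_demo).
-export([same/0, framed/0]).

%% A string literal expands to one 8-bit integer segment per character.
same() ->
    <<"hello">> =:= <<$h, $e, $l, $l, $o>>.

%% A string literal can be mixed with other segments.
framed() ->
    <<5, "hello", 0>>.
```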
This section describes the rules for matching binaries using the bit syntax.
There can be zero or more segments in a binary pattern. A binary pattern can occur in every place patterns are allowed, also inside other patterns. Binary patterns cannot be nested.
The pattern '<<>>' matches a zero length binary.
Each segment in a binary can consist of zero or more bits.
A segment of type binary must have a size evenly divisible by 8.
This means that the following head will never match:
foo(<<X:7/binary-unit:1,Y:1/binary-unit:1>>) ->
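As noted earlier, segments have the following general syntax:
Value:Size/TypeSpecifierList
When matching, Value must be either a variable or an integer or floating point literal. Expressions are not allowed. Size must be an integer literal or a previously bound variable. Note that the following is not allowed:
foo(N, <<X:N,T/binary>>) ->
    {X,T}.
The two occurrences of N are not related. The compiler will complain that the N in the size field is unbound.
The correct way to write this example is like this:
foo(N, Bin) ->
    <<X:N,T/binary>> = Bin,
    {X,T}.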
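The working form, where the size variable is bound in the function head and only used in a pattern inside the body, can be sketched as a complete module. The names size_demo and take/2 are hypothetical.

```erlang
-module(size_demo).
-export([take/2]).

%% N is already bound when the pattern in the body is evaluated,
%% so it can be used as the size of the first segment.
take(N, Bin) ->
    <<X:N, T/binary>> = Bin,
    {X, T}.
```

With N = 8 the first byte is extracted as an integer; with N = 16 the first two bytes are combined into one big-endian integer.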
To match out the rest of a binary, specify a binary field without size:
foo(<<A:8,Rest/binary>>) ->
As always, the size of the tail must be evenly divisible by 8.
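A tail without a size is what makes it possible to walk through a binary recursively. The sketch below shows a single split and a byte-wise sum; the module name tail_demo and the functions split/1 and sum/1 are hypothetical.

```erlang
-module(tail_demo).
-export([split/1, sum/1]).

%% Take the first byte and leave the remainder as a binary tail.
split(<<A:8, Rest/binary>>) ->
    {A, Rest}.

%% Walk a binary byte by byte; the tail becomes empty at the end.
sum(<<A:8, Rest/binary>>) -> A + sum(Rest);
sum(<<>>) -> 0.
```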
Assume that we need a function that creates a binary out of a list of triples of integers. A first (inefficient) version of such a function could look like this:
triples_to_bin(T) ->
    triples_to_bin(T, <<>>).
triples_to_bin([{X,Y,Z} | T], Acc) ->
    triples_to_bin(T, <<Acc/binary,X:32,Y:32,Z:32>>);   % inefficient
triples_to_bin([], Acc) ->
    Acc.
The reason for the inefficiency of this function is that for each triple, the binary constructed so far (Acc) is copied.
The efficient way to write this function in R7 is:
triples_to_bin(T) ->
    triples_to_bin(T, []).
triples_to_bin([{X,Y,Z} | T], Acc) ->
    triples_to_bin(T, [<<X:32,Y:32,Z:32>> | Acc]);
triples_to_bin([], Acc) ->
    list_to_binary(lists:reverse(Acc)).
Note that list_to_binary/1 handles deep lists of binaries and small integers.
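The two approaches can be compared side by side in one module. This is a sketch under stated assumptions: each integer of a triple is assumed to occupy a 32-bit segment, and the module name triples_demo and the function names append_version/1 and list_version/1 are hypothetical.

```erlang
-module(triples_demo).
-export([append_version/1, list_version/1]).

%% Appends each triple to the binary accumulated so far.
append_version(T) ->
    append_version(T, <<>>).

append_version([{X,Y,Z} | T], Acc) ->
    append_version(T, <<Acc/binary, X:32, Y:32, Z:32>>);
append_version([], Acc) ->
    Acc.

%% Collects one small binary per triple in a list and converts
%% the reversed list to a single binary at the end.
list_version(T) ->
    list_version(T, []).

list_version([{X,Y,Z} | T], Acc) ->
    list_version(T, [<<X:32, Y:32, Z:32>> | Acc]);
list_version([], Acc) ->
    list_to_binary(lists:reverse(Acc)).
```

Both functions produce the same binary; they differ only in how the intermediate result is accumulated.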