author: Björn Gustavsson <[email protected]> 2015-03-12 15:35:13 +0100
committer: Björn Gustavsson <[email protected]> 2015-03-12 15:38:25 +0100
commit: 6513fc5eb55b306e2b1088123498e6c50b9e7273
tree: 986a133cb88ddeaeb0292f99af67e4d1015d1f62 /system/doc/efficiency_guide/binaryhandling.xml
parent: 42a0387e886ddbf60b0e2cb977758e2ca74954ae
Update Efficiency Guide
Language cleaned up by the technical writers xsipewe and tmanevik from Combitech. Proofreading and corrections by Björn Gustavsson.
Diffstat (limited to 'system/doc/efficiency_guide/binaryhandling.xml')
-rw-r--r-- system/doc/efficiency_guide/binaryhandling.xml 359
1 files changed, 188 insertions, 171 deletions
diff --git a/system/doc/efficiency_guide/binaryhandling.xml b/system/doc/efficiency_guide/binaryhandling.xml
index 4ba1378059..0ac1a7ee32 100644
--- a/system/doc/efficiency_guide/binaryhandling.xml
+++ b/system/doc/efficiency_guide/binaryhandling.xml
@@ -23,7 +23,7 @@
The Initial Developer of the Original Code is Ericsson AB.
</legalnotice>
- <title>Constructing and matching binaries</title>
+ <title>Constructing and Matching Binaries</title>
<prepared>Bjorn Gustavsson</prepared>
<docno></docno>
<date>2007-10-12</date>
@@ -31,10 +31,10 @@
<file>binaryhandling.xml</file>
</header>
- <p>In R12B, the most natural way to write binary construction and matching is now
+ <p>In R12B, the most natural way to construct and match binaries is
significantly faster than in earlier releases.</p>
- <p>To construct at binary, you can simply write</p>
+ <p>To construct a binary, you can simply write as follows:</p>
<p><em>DO</em> (in R12B) / <em>REALLY DO NOT</em> (in earlier releases)</p>
<code type="erl"><![CDATA[
@@ -46,13 +46,12 @@ my_list_to_binary([H|T], Acc) ->
my_list_to_binary([], Acc) ->
Acc.]]></code>
- <p>In releases before R12B, <c>Acc</c> would be copied in every iteration.
- In R12B, <c>Acc</c> will be copied only in the first iteration and extra
- space will be allocated at the end of the copied binary. In the next iteration,
- <c>H</c> will be written in to the extra space. When the extra space runs out,
- the binary will be reallocated with more extra space.</p>
-
- <p>The extra space allocated (or reallocated) will be twice the size of the
+ <p>In releases before R12B, <c>Acc</c> is copied in every iteration.
+ In R12B, <c>Acc</c> is copied only in the first iteration and extra
+ space is allocated at the end of the copied binary. In the next iteration,
+ <c>H</c> is written into the extra space. When the extra space runs out,
+ the binary is reallocated with more extra space. The extra space allocated
+ (or reallocated) is twice the size of the
existing binary data, or 256, whichever is larger.</p>
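The append pattern and the sizing rule described above can be sketched as follows. This is only an illustration: the module name `append_demo` and the helper `reserved_bytes/1` are hypothetical, and the helper merely restates the "twice the size or 256, whichever is larger" sentence as arithmetic. It is not taken from the emulator sources.

```erlang
-module(append_demo).
-export([from_list/1, reserved_bytes/1]).

%% Build a binary by appending one byte per iteration.
%% Each element of List is assumed to be an integer in 0..255.
from_list(List) ->
    from_list(List, <<>>).

from_list([H|T], Acc) ->
    %% Appending to the result of the latest append is the cheap case.
    from_list(T, <<Acc/binary,H>>);
from_list([], Acc) ->
    Acc.

%% Hypothetical helper: the sizing rule from the text expressed as
%% arithmetic (twice the existing data size or 256, whichever is larger).
reserved_bytes(DataSize) ->
    max(2 * DataSize, 256).
```

For a one-byte binary the rule gives 256, which matches the walkthrough of the append example later in this chapter.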
<p>The most natural way to match binaries is now the fastest:</p>
@@ -64,55 +63,79 @@ my_binary_to_list(<<H,T/binary>>) ->
my_binary_to_list(<<>>) -> [].]]></code>
<section>
- <title>How binaries are implemented</title>
+ <title>How Binaries are Implemented</title>
<p>Internally, binaries and bitstrings are implemented in the same way.
- In this section, we will call them <em>binaries</em> since that is what
+ In this section, they are called <em>binaries</em> because that is what
they are called in the emulator source code.</p>
- <p>There are four types of binary objects internally. Two of them are
- containers for binary data and two of them are merely references to
- a part of a binary.</p>
-
- <p>The binary containers are called <em>refc binaries</em>
- (short for <em>reference-counted binaries</em>) and <em>heap binaries</em>.</p>
+ <p>Four types of binary objects are available internally:</p>
+ <list type="bulleted">
+ <item><p>Two are containers for binary data and are called:</p>
+ <list type="bulleted">
+ <item><em>Refc binaries</em> (short for
+ <em>reference-counted binaries</em>)</item>
+ <item><em>Heap binaries</em></item>
+ </list></item>
+ <item><p>Two are merely references to a part of a binary and
+ are called:</p>
+ <list type="bulleted">
+ <item><em>sub binaries</em></item>
+ <item><em>match contexts</em></item>
+ </list></item>
+ </list>
- <p><marker id="refc_binary"></marker><em>Refc binaries</em>
- consist of two parts: an object stored on
- the process heap, called a <em>ProcBin</em>, and the binary object itself
- stored outside all process heaps.</p>
+ <section>
+ <marker id="refc_binary"></marker>
+ <title>Refc Binaries</title>
+ <p>Refc binaries consist of two parts:</p>
+ <list type="bulleted">
+ <item>An object stored on the process heap, called a
+ <em>ProcBin</em></item>
+ <item>The binary object itself, stored outside all process
+ heaps</item>
+ </list>
<p>The binary object can be referenced by any number of ProcBins from any
- number of processes; the object contains a reference counter to keep track
+ number of processes. The object contains a reference counter to keep track
of the number of references, so that it can be removed when the last
reference disappears.</p>
<p>All ProcBin objects in a process are part of a linked list, so that
the garbage collector can keep track of them and decrement the reference
counters in the binary when a ProcBin disappears.</p>
+ </section>
- <p><marker id="heap_binary"></marker><em>Heap binaries</em> are small binaries,
- up to 64 bytes, that are stored directly on the process heap.
- They will be copied when the process
- is garbage collected and when they are sent as a message. They don't
+ <section>
+ <marker id="heap_binary"></marker>
+ <title>Heap Binaries</title>
+ <p>Heap binaries are small binaries, up to 64 bytes, and are stored
+ directly on the process heap. They are copied when the process is
+ garbage-collected and when they are sent as a message. They do not
require any special handling by the garbage collector.</p>
+ </section>
- <p>There are two types of reference objects that can reference part of
- a refc binary or heap binary. They are called <em>sub binaries</em> and
- <em>match contexts</em>.</p>
+ <section>
+ <title>Sub Binaries</title>
+ <p>The reference objects <em>sub binaries</em> and
+ <em>match contexts</em> can reference part of
+ a refc binary or heap binary.</p>
<p><marker id="sub_binary"></marker>A <em>sub binary</em>
is created by <c>split_binary/2</c> and when
a binary is matched out in a binary pattern. A sub binary is a reference
- into a part of another binary (refc or heap binary, never into a another
+ into a part of another binary (refc or heap binary, but never into another
sub binary). Therefore, matching out a binary is relatively cheap because
the actual binary data is never copied.</p>
+ </section>
- <p><marker id="match_context"></marker>A <em>match context</em> is
- similar to a sub binary, but is optimized
- for binary matching; for instance, it contains a direct pointer to the binary
- data. For each field that is matched out of a binary, the position in the
- match context will be incremented.</p>
+ <section>
+ <title>Match Context</title>
+ <marker id="match_context"></marker>
+ <p>A <em>match context</em> is similar to a sub binary, but is
+ optimized for binary matching. For example, it contains a direct
+ pointer to the binary data. For each field that is matched out of
+ a binary, the position in the match context is incremented.</p>
<p>In R11B, a match context was only used during a binary matching
operation.</p>
@@ -122,27 +145,28 @@ my_binary_to_list(<<>>) -> [].]]></code>
context and discard the sub binary. Instead of creating a sub binary,
the match context is kept.</p>
- <p>The compiler can only do this optimization if it can know for sure
+ <p>The compiler can only do this optimization if it knows
that the match context will not be shared. If it would be shared, the
functional properties (also called referential transparency) of Erlang
would break.</p>
+ </section>
</section>
<section>
- <title>Constructing binaries</title>
-
- <p>In R12B, appending to a binary or bitstring</p>
+ <title>Constructing Binaries</title>
+ <p>In R12B, appending to a binary or bitstring
+ is specially optimized by the <em>runtime system</em>:</p>
<code type="erl"><![CDATA[
<<Binary/binary, ...>>
<<Binary/bitstring, ...>>]]></code>
- <p>is specially optimized by the <em>run-time system</em>.
- Because the run-time system handles the optimization (instead of
+ <p>As the runtime system handles the optimization (instead of
the compiler), there are very few circumstances in which the optimization
- will not work.</p>
+ does not work.</p>
- <p>To explain how it works, we will go through this code</p>
+ <p>To explain how it works, let us examine the following code line
+ by line:</p>
<code type="erl"><![CDATA[
Bin0 = <<0>>, %% 1
@@ -152,81 +176,81 @@ Bin3 = <<Bin2/binary,7,8,9>>, %% 4
Bin4 = <<Bin1/binary,17>>, %% 5 !!!
{Bin4,Bin3} %% 6]]></code>
- <p>line by line.</p>
-
- <p>The first line (marked with the <c>%% 1</c> comment), assigns
+ <list type="bulleted">
+ <item>Line 1 (marked with the <c>%% 1</c> comment), assigns
a <seealso marker="#heap_binary">heap binary</seealso> to
- the variable <c>Bin0</c>.</p>
+ the <c>Bin0</c> variable.</item>
- <p>The second line is an append operation. Since <c>Bin0</c>
+ <item>Line 2 is an append operation. As <c>Bin0</c>
has not been involved in an append operation,
a new <seealso marker="#refc_binary">refc binary</seealso>
- will be created and the contents of <c>Bin0</c> will be copied
- into it. The <em>ProcBin</em> part of the refc binary will have
+ is created and the contents of <c>Bin0</c> are copied
+ into it. The <em>ProcBin</em> part of the refc binary has
its size set to the size of the data stored in the binary, while
- the binary object will have extra space allocated.
- The size of the binary object will be either twice the
+ the binary object has extra space allocated.
+ The size of the binary object is either twice the
size of <c>Bin0</c> or 256, whichever is larger. In this case
- it will be 256.</p>
+ it is 256.</item>
- <p>It gets more interesting in the third line.
+ <item>Line 3 is more interesting.
<c>Bin1</c> <em>has</em> been used in an append operation,
- and it has 255 bytes of unused storage at the end, so the three new bytes
- will be stored there.</p>
+ and it has 255 bytes of unused storage at the end, so the 3 new
+ bytes are stored there.</item>
- <p>Same thing in the fourth line. There are 252 bytes left,
- so there is no problem storing another three bytes.</p>
+ <item>Line 4. The same applies here. There are 252 bytes left,
+ so there is no problem storing another 3 bytes.</item>
- <p>But in the fifth line something <em>interesting</em> happens.
- Note that we don't append to the previous result in <c>Bin3</c>,
- but to <c>Bin1</c>. We expect that <c>Bin4</c> will be assigned
- the value <c>&lt;&lt;0,1,2,3,17&gt;&gt;</c>. We also expect that
+ <item>Line 5. Here, something <em>interesting</em> happens. Notice
+ that the new byte is not appended to the previous result in <c>Bin3</c>,
+ but to <c>Bin1</c>. It is expected that <c>Bin4</c> will be assigned
+ the value <c>&lt;&lt;0,1,2,3,17&gt;&gt;</c>. It is also expected that
<c>Bin3</c> will retain its value
(<c>&lt;&lt;0,1,2,3,4,5,6,7,8,9&gt;&gt;</c>).
- Clearly, the run-time system cannot write the byte <c>17</c> into the binary,
+ Clearly, the runtime system cannot write byte <c>17</c> into the binary,
because that would change the value of <c>Bin3</c> to
- <c>&lt;&lt;0,1,2,3,4,17,6,7,8,9&gt;&gt;</c>.</p>
-
- <p>What will happen?</p>
+ <c>&lt;&lt;0,1,2,3,4,17,6,7,8,9&gt;&gt;</c>.</item>
+ </list>
- <p>The run-time system will see that <c>Bin1</c> is the result
+ <p>The runtime system sees that <c>Bin1</c> is the result
from a previous append operation (not from the latest append operation),
- so it will <em>copy</em> the contents of <c>Bin1</c> to a new binary
- and reserve extra storage and so on. (We will not explain here how the
- run-time system can know that it is not allowed to write into <c>Bin1</c>;
+ so it <em>copies</em> the contents of <c>Bin1</c> to a new binary,
+ reserves extra storage, and so on. (How the runtime system knows
+ that it is not allowed to write into <c>Bin1</c> is not explained here;
it is left as an exercise to the curious reader to figure out how it is
done by reading the emulator sources, primarily <c>erl_bits.c</c>.)</p>
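The visible result of the walkthrough can be checked with a small module. This is a sketch: lines 2 and 3 are elided in this diff and are reconstructed here from the values the text gives for `Bin3` and `Bin4`, so treat them as an assumption. The module name `append_walkthrough` is hypothetical.

```erlang
-module(append_walkthrough).
-export([run/0]).

run() ->
    Bin0 = <<0>>,                    %% 1
    Bin1 = <<Bin0/binary,1,2,3>>,    %% 2 (assumed, elided in the diff)
    Bin2 = <<Bin1/binary,4,5,6>>,    %% 3 (assumed, elided in the diff)
    Bin3 = <<Bin2/binary,7,8,9>>,    %% 4
    Bin4 = <<Bin1/binary,17>>,       %% 5 appends to Bin1, not to Bin3
    {Bin4,Bin3}.                     %% 6
```

Whether line 5 copies or writes in place cannot be observed from the returned values. The example only shows that `Bin3` keeps its value after the append to `Bin1`, which is the property the copy protects.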
<section>
- <title>Circumstances that force copying</title>
+ <title>Circumstances That Force Copying</title>
<p>The optimization of the binary append operation requires that
there is a <em>single</em> ProcBin and a <em>single reference</em> to the
ProcBin for the binary. The reason is that the binary object can be
- moved (reallocated) during an append operation, and when that happens
+ moved (reallocated) during an append operation, and when that happens,
the pointer in the ProcBin must be updated. If there would be more than
one ProcBin pointing to the binary object, it would not be possible to
find and update all of them.</p>
- <p>Therefore, certain operations on a binary will mark it so that
+ <p>Therefore, certain operations on a binary mark it so that
any future append operation will be forced to copy the binary.
In most cases, the binary object will be shrunk at the same time
to reclaim the extra space allocated for growing.</p>
- <p>When appending to a binary</p>
+ <p>When appending to a binary as follows, only the binary returned
+ from the latest append operation will support further cheap append
+ operations:</p>
<code type="erl"><![CDATA[
Bin = <<Bin0,...>>]]></code>
- <p>only the binary returned from the latest append operation will
- support further cheap append operations. In the code fragment above,
+ <p>In the preceding code fragment,
appending to <c>Bin</c> will be cheap, while appending to <c>Bin0</c>
will force the creation of a new binary and copying of the contents
of <c>Bin0</c>.</p>
<p>If a binary is sent as a message to a process or port, the binary
will be shrunk and any further append operation will copy the binary
- data into a new binary. For instance, in the following code fragment</p>
+ data into a new binary. For example, in the following code fragment
+ <c>Bin1</c> will be copied in the third line:</p>
<code type="erl"><![CDATA[
Bin1 = <<Bin0,...>>,
@@ -234,12 +258,12 @@ PortOrPid ! Bin1,
Bin = <<Bin1,...>> %% Bin1 will be COPIED
]]></code>
- <p><c>Bin1</c> will be copied in the third line.</p>
-
- <p>The same thing happens if you insert a binary into an <em>ets</em>
- table or send it to a port using <c>erlang:port_command/2</c> or pass it to
+ <p>The same happens if you insert a binary into an Ets
+ table, send it to a port using <c>erlang:port_command/2</c>, or
+ pass it to
<seealso marker="erts:erl_nif#enif_inspect_binary">enif_inspect_binary</seealso>
in a NIF.</p>
+
<p>Matching a binary will also cause it to shrink and the next append
operation will copy the binary data:</p>
@@ -249,22 +273,23 @@ Bin1 = <<Bin0,...>>,
Bin = <<Bin1,...>> %% Bin1 will be COPIED
]]></code>
- <p>The reason is that a <seealso marker="#match_context">match context</seealso>
+ <p>The reason is that a
+ <seealso marker="#match_context">match context</seealso>
contains a direct pointer to the binary data.</p>
- <p>If a process simply keeps binaries (either in "loop data" or in the process
- dictionary), the garbage collector may eventually shrink the binaries.
- If only one such binary is kept, it will not be shrunk. If the process later
- appends to a binary that has been shrunk, the binary object will be reallocated
- to make place for the data to be appended.</p>
+ <p>If a process simply keeps binaries (either in "loop data" or in the
+ process
+ dictionary), the garbage collector can eventually shrink the binaries.
+ If only one such binary is kept, it will not be shrunk. If the process
+ later appends to a binary that has been shrunk, the binary object will
+ be reallocated to make place for the data to be appended.</p>
</section>
-
</section>
<section>
- <title>Matching binaries</title>
+ <title>Matching Binaries</title>
- <p>We will revisit the example shown earlier</p>
+ <p>Let us revisit the example shown at the beginning of this chapter:</p>
<p><em>DO</em> (in R12B)</p>
<code type="erl"><![CDATA[
@@ -272,36 +297,35 @@ my_binary_to_list(<<H,T/binary>>) ->
[H|my_binary_to_list(T)];
my_binary_to_list(<<>>) -> [].]]></code>
- <p>too see what is happening under the hood.</p>
-
- <p>The very first time <c>my_binary_to_list/1</c> is called,
+ <p>The first time <c>my_binary_to_list/1</c> is called,
a <seealso marker="#match_context">match context</seealso>
- will be created. The match context will point to the first
- byte of the binary. One byte will be matched out and the match context
- will be updated to point to the second byte in the binary.</p>
+ is created. The match context points to the first
+ byte of the binary. One byte is matched out and the match context
+ is updated to point to the second byte in the binary.</p>
- <p>In R11B, at this point a <seealso marker="#sub_binary">sub binary</seealso>
+ <p>In R11B, at this point a
+ <seealso marker="#sub_binary">sub binary</seealso>
would be created. In R12B,
the compiler sees that there is no point in creating a sub binary,
because there will soon be a call to a function (in this case,
- to <c>my_binary_to_list/1</c> itself) that will immediately
+ to <c>my_binary_to_list/1</c> itself) that will immediately
create a new match context and discard the sub binary.</p>
- <p>Therefore, in R12B, <c>my_binary_to_list/1</c> will call itself
+ <p>Therefore, in R12B, <c>my_binary_to_list/1</c> calls itself
with the match context instead of with a sub binary. The instruction
- that initializes the matching operation will basically do nothing
+ that initializes the matching operation basically does nothing
when it sees that it was passed a match context instead of a binary.</p>
<p>When the end of the binary is reached and the second clause matches,
the match context will simply be discarded (removed in the next
- garbage collection, since there is no longer any reference to it).</p>
+ garbage collection, as there is no longer any reference to it).</p>
<p>To summarize, <c>my_binary_to_list/1</c> in R12B only needs to create
<em>one</em> match context and no sub binaries. In R11B, if the binary
contains <em>N</em> bytes, <em>N+1</em> match contexts and <em>N</em>
- sub binaries will be created.</p>
+ sub binaries are created.</p>
- <p>In R11B, the fastest way to match binaries is:</p>
+ <p>In R11B, the fastest way to match binaries is as follows:</p>
<p><em>DO NOT</em> (in R12B)</p>
<code type="erl"><![CDATA[
@@ -317,13 +341,14 @@ my_complicated_binary_to_list(Bin, Skip) ->
end.]]></code>
<p>This function cleverly avoids building sub binaries, but it cannot
- avoid building a match context in each recursion step. Therefore, in both R11B and R12B,
+ avoid building a match context in each recursion step.
+ Therefore, in both R11B and R12B,
<c>my_complicated_binary_to_list/1</c> builds <em>N+1</em> match
- contexts. (In a future release, the compiler might be able to generate code
- that reuses the match context, but don't hold your breath.)</p>
+ contexts. (In a future Erlang/OTP release, the compiler might be able
+ to generate code that reuses the match context.)</p>
- <p>Returning to <c>my_binary_to_list/1</c>, note that the match context was
- discarded when the entire binary had been traversed. What happens if
+ <p>Returning to <c>my_binary_to_list/1</c>, notice that the match context
+ was discarded when the entire binary had been traversed. What happens if
the iteration stops before it has reached the end of the binary? Will
the optimization still work?</p>
@@ -336,29 +361,23 @@ after_zero(<<>>) ->
<<>>.
]]></code>
- <p>Yes, it will. The compiler will remove the building of the sub binary in the
- second clause</p>
+ <p>Yes, it will. The compiler will remove the building of the sub binary in
+ the second clause:</p>
<code type="erl"><![CDATA[
-.
-.
-.
+...
after_zero(<<_,T/binary>>) ->
after_zero(T);
-.
-.
-.]]></code>
+...]]></code>
- <p>but will generate code that builds a sub binary in the first clause</p>
+ <p>But it will generate code that builds a sub binary in the first clause:</p>
<code type="erl"><![CDATA[
after_zero(<<0,T/binary>>) ->
T;
-.
-.
-.]]></code>
+...]]></code>
- <p>Therefore, <c>after_zero/1</c> will build one match context and one sub binary
+ <p>Therefore, <c>after_zero/1</c> builds one match context and one sub binary
(assuming it is passed a binary that contains a zero byte).</p>
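For reference, the three clauses of `after_zero/1` quoted in this section can be put together in a runnable module. The module name `after_zero_demo` is hypothetical and the comments restate the behaviour described in the text.

```erlang
-module(after_zero_demo).
-export([after_zero/1]).

%% First clause: T is returned, so a sub binary must be built here.
after_zero(<<0,T/binary>>) ->
    T;
%% Second clause: T is only passed on to the recursive call, so the
%% creation of the sub binary can be delayed.
after_zero(<<_,T/binary>>) ->
    after_zero(T);
%% No zero byte found before the end of the binary.
after_zero(<<>>) ->
    <<>>.
```

Calling `after_zero_demo:after_zero(<<1,2,0,3,4>>)` returns the bytes after the first zero byte. A binary without a zero byte yields the empty binary.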
<p>Code like the following will also be optimized:</p>
@@ -371,12 +390,14 @@ all_but_zeroes_to_list(<<0,T/binary>>, Acc, Remaining) ->
all_but_zeroes_to_list(<<Byte,T/binary>>, Acc, Remaining) ->
all_but_zeroes_to_list(T, [Byte|Acc], Remaining-1).]]></code>
- <p>The compiler will remove building of sub binaries in the second and third clauses,
- and it will add an instruction to the first clause that will convert <c>Buffer</c>
- from a match context to a sub binary (or do nothing if <c>Buffer</c> already is a binary).</p>
+ <p>The compiler removes building of sub binaries in the second and third
+ clauses, and it adds an instruction to the first clause that converts
+ <c>Buffer</c> from a match context to a sub binary (or does nothing if
+ <c>Buffer</c> is a binary already).</p>
- <p>Before you begin to think that the compiler can optimize any binary patterns,
- here is a function that the compiler (currently, at least) is not able to optimize:</p>
+ <p>Before you begin to think that the compiler can optimize any binary
+ patterns, the following function cannot be optimized by the compiler
+ (currently, at least):</p>
<code type="erl"><![CDATA[
non_opt_eq([H|T1], <<H,T2/binary>>) ->
@@ -386,43 +407,43 @@ non_opt_eq([_|_], <<_,_/binary>>) ->
non_opt_eq([], <<>>) ->
true.]]></code>
- <p>It was briefly mentioned earlier that the compiler can only delay creation of
- sub binaries if it can be sure that the binary will not be shared. In this case,
- the compiler cannot be sure.</p>
+ <p>It was mentioned earlier that the compiler can only delay creation of
+ sub binaries if it knows that the binary will not be shared. In this case,
+ the compiler cannot know.</p>
- <p>We will soon show how to rewrite <c>non_opt_eq/2</c> so that the delayed sub binary
- optimization can be applied, and more importantly, we will show how you can find out
- whether your code can be optimized.</p>
+ <p>The following sections show how to rewrite <c>non_opt_eq/2</c> so that
+ the delayed sub binary optimization can be applied and, more importantly,
+ how you can find out whether your code can be optimized.</p>
<section>
- <title>The bin_opt_info option</title>
+ <title>Option bin_opt_info</title>
<p>Use the <c>bin_opt_info</c> option to have the compiler print a lot of
- information about binary optimizations. It can be given either to the compiler or
- <c>erlc</c></p>
+ information about binary optimizations. It can be given either to the
+ compiler or <c>erlc</c>:</p>
<code type="erl"><![CDATA[
erlc +bin_opt_info Mod.erl]]></code>
- <p>or passed via an environment variable</p>
+ <p>or passed through an environment variable:</p>
<code type="erl"><![CDATA[
export ERL_COMPILER_OPTIONS=bin_opt_info]]></code>
- <p>Note that the <c>bin_opt_info</c> is not meant to be a permanent option added
- to your <c>Makefile</c>s, because it is not possible to eliminate all messages that
- it generates. Therefore, passing the option through the environment is in most cases
- the most practical approach.</p>
+ <p>Notice that <c>bin_opt_info</c> is not meant to be a permanent
+ option added to your <c>Makefile</c>s, because it is not possible to
+ eliminate all messages that it generates. Therefore, passing the option through
+ the environment is in most cases the most practical approach.</p>
- <p>The warnings will look like this:</p>
+ <p>The warnings look as follows:</p>
<code type="erl"><![CDATA[
./efficiency_guide.erl:60: Warning: NOT OPTIMIZED: sub binary is used or returned
./efficiency_guide.erl:62: Warning: OPTIMIZED: creation of sub binary delayed]]></code>
- <p>To make it clearer exactly what code the warnings refer to,
- in the examples that follow, the warnings are inserted as comments
- after the clause they refer to:</p>
+ <p>To make it clearer exactly what code the warnings refer to, the
+ warnings in the following examples are inserted as comments
+ after the clause they refer to, for example:</p>
<code type="erl"><![CDATA[
after_zero(<<0,T/binary>>) ->
@@ -434,12 +455,12 @@ after_zero(<<_,T/binary>>) ->
after_zero(<<>>) ->
<<>>.]]></code>
- <p>The warning for the first clause tells us that it is not possible to
- delay the creation of a sub binary, because it will be returned.
- The warning for the second clause tells us that a sub binary will not be
+ <p>The warning for the first clause says that the creation of a sub
+ binary cannot be delayed, because it will be returned.
+ The warning for the second clause says that a sub binary will not be
created (yet).</p>
- <p>It is time to revisit the earlier example of the code that could not
+ <p>Let us revisit the earlier example of the code that could not
be optimized and find out why:</p>
<code type="erl"><![CDATA[
@@ -456,16 +477,16 @@ non_opt_eq([_|_], <<_,_/binary>>) ->
non_opt_eq([], <<>>) ->
true.]]></code>
- <p>The compiler emitted two warnings. The <c>INFO</c> warning refers to the function
- <c>non_opt_eq/2</c> as a callee, indicating that any functions that call <c>non_opt_eq/2</c>
- will not be able to make delayed sub binary optimization.
- There is also a suggestion to change argument order.
- The second warning (that happens to refer to the same line) refers to the construction of
- the sub binary itself.</p>
+ <p>The compiler emitted two warnings. The <c>INFO</c> warning refers
+ to the function <c>non_opt_eq/2</c> as a callee, indicating that any
+ function that calls <c>non_opt_eq/2</c> cannot use the delayed sub binary
+ optimization. There is also a suggestion to change argument order.
+ The second warning (that happens to refer to the same line) refers to
+ the construction of the sub binary itself.</p>
- <p>We will soon show another example that should make the distinction between <c>INFO</c>
- and <c>NOT OPTIMIZED</c> warnings somewhat clearer, but first we will heed the suggestion
- to change argument order:</p>
+ <p>Another example follows that makes the difference between the
+ <c>INFO</c> and <c>NOT OPTIMIZED</c> warnings somewhat clearer, but
+ let us first follow the suggestion to change argument order:</p>
<code type="erl"><![CDATA[
opt_eq(<<H,T1/binary>>, [H|T2]) ->
@@ -485,15 +506,13 @@ match_body([0|_], <<H,_/binary>>) ->
%% sub binary optimization;
%% SUGGEST changing argument order
done;
-.
-.
-.]]></code>
+...]]></code>
<p>The warning means that <em>if</em> there is a call to <c>match_body/2</c>
(from another clause in <c>match_body/2</c> or another function), the
- delayed sub binary optimization will not be possible. There will be additional
- warnings for any place where a sub binary is matched out at the end of and
- passed as the second argument to <c>match_body/2</c>. For instance:</p>
+ delayed sub binary optimization will not be possible. More warnings will
+ occur for any place where a sub binary is matched out at the end of a binary and
+ passed as the second argument to <c>match_body/2</c>, for example:</p>
<code type="erl"><![CDATA[
match_head(List, <<_:10,Data/binary>>) ->
@@ -504,10 +523,10 @@ match_head(List, <<_:10,Data/binary>>) ->
</section>
<section>
- <title>Unused variables</title>
+ <title>Unused Variables</title>
- <p>The compiler itself figures out if a variable is unused. The same
- code is generated for each of the following functions</p>
+ <p>The compiler figures out if a variable is unused. The same
+ code is generated for each of the following functions:</p>
<code type="erl"><![CDATA[
count1(<<_,T/binary>>, Count) -> count1(T, Count+1);
@@ -519,11 +538,9 @@ count2(<<>>, Count) -> Count.
count3(<<_H,T/binary>>, Count) -> count3(T, Count+1);
count3(<<>>, Count) -> Count.]]></code>
- <p>In each iteration, the first 8 bits in the binary will be skipped, not matched out.</p>
-
+ <p>In each iteration, the first 8 bits in the binary will be skipped,
+ not matched out.</p>
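A short sketch makes the equivalence concrete. It uses `count1/2` and `count3/2` as they appear in this diff (`count2/2` is partly elided here and is left out). The module name `count_demo` is hypothetical.

```erlang
-module(count_demo).
-export([count1/2, count3/2]).

%% Anonymous variable: the first byte is skipped.
count1(<<_,T/binary>>, Count) -> count1(T, Count+1);
count1(<<>>, Count) -> Count.

%% Named but unused variable: according to the text, the compiler
%% generates the same code as for count1/2.
count3(<<_H,T/binary>>, Count) -> count3(T, Count+1);
count3(<<>>, Count) -> Count.
```

Both functions return the number of bytes in the binary when called with an initial count of 0. The example shows only that the results agree. That the generated code is identical is the claim made by the text and is not demonstrated by running the functions.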
</section>
-
</section>
-
</chapter>