1 files changed, 234 insertions, 0 deletions
diff --git a/system/doc/efficiency_guide/functions.xml b/system/doc/efficiency_guide/functions.xml
new file mode 100644
index 0000000000..d55b60e39c
--- /dev/null
+++ b/system/doc/efficiency_guide/functions.xml
@@ -0,0 +1,234 @@
+<?xml version="1.0" encoding="latin1" ?>
+<!DOCTYPE chapter SYSTEM "chapter.dtd">
+
+<chapter>
+  <header>
+    <copyright>
+      <year>2001</year><year>2009</year>
+      <holder>Ericsson AB. All Rights Reserved.</holder>
+    </copyright>
+    <legalnotice>
+      The contents of this file are subject to the Erlang Public License,
+      Version 1.1, (the "License"); you may not use this file except in
+      compliance with the License. You should have received a copy of the
+      Erlang Public License along with this software. If not, it can be
+      retrieved online at http://www.erlang.org/.
+    
+      Software distributed under the License is distributed on an "AS IS"
+      basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
+      the License for the specific language governing rights and limitations
+      under the License.
+    
+    </legalnotice>
+
+    <title>Functions</title>
+    <prepared>Bjorn Gustavsson</prepared>
+    <docno></docno>
+    <date>2007-11-22</date>
+    <rev></rev>
+    <file>functions.xml</file>
+  </header>
+
+  <section>
+    <title>Pattern matching</title>
+    <p>Pattern matching in function head and in <c>case</c> and <c>receive</c>
+     clauses are optimized by the compiler. With a few exceptions, there is nothing
+     to gain by rearranging clauses.</p>
+
+    <p>One exception is pattern matching of binaries. The compiler
+    will not rearrange clauses that match binaries. Placing the
+    clause that matches against the empty binary <em>last</em> will usually
+    be slightly faster than placing it <em>first</em>.</p>
+
+    <p>Here is a rather contrived example to show another exception:</p>
+
+    <p><em>DO NOT</em></p>
+    <code type="erl">
+atom_map1(one) -> 1;
+atom_map1(two) -> 2;
+atom_map1(three) -> 3;
+atom_map1(Int) when is_integer(Int) -> Int;
+atom_map1(four) -> 4;
+atom_map1(five) -> 5;
+atom_map1(six) -> 6.</code>
+
+     <p>The problem is the clause with the variable <c>Int</c>.
+     Since a variable can match anything, including the atoms
+     <c>four</c>, <c>five</c>, and <c>six</c> that the following clauses
+     also will match, the compiler must generate sub-optimal code that will
+     execute as follows:</p>
+
+     <p>First the input value is compared to <c>one</c>, <c>two</c>, and
+     <c>three</c> (using a single instruction that does a binary search;
+     thus, quite efficient even if there are many values) to select which
+     one of the first three clauses to execute (if any).</p>
+
+     <p>If none of the first three clauses matched, the fourth clause
+     will match since a variable always matches. If the guard test
+     <c>is_integer(Int)</c> succeeds, the fourth clause will be
+     executed.</p>
+
+     <p>If the guard test failed, the input value is compared to
+     <c>four</c>, <c>five</c>, and <c>six</c>, and the appropriate clause
+     is selected. (There will be a <c>function_clause</c> exception if
+     none of the values matched.)</p>
+
+     <p>Rewriting to either</p>
+
+     <p><em>DO</em></p>
+     <code type="erl"><![CDATA[
+atom_map2(one) -> 1;
+atom_map2(two) -> 2;
+atom_map2(three) -> 3;
+atom_map2(four) -> 4;
+atom_map2(five) -> 5;
+atom_map2(six) -> 6;
+atom_map2(Int) when is_integer(Int) -> Int.]]></code>
+
+     <p>or</p> 
+
+     <p><em>DO</em></p>
+     <code type="erl"><![CDATA[
+atom_map3(Int) when is_integer(Int) -> Int;
+atom_map3(one) -> 1;
+atom_map3(two) -> 2;
+atom_map3(three) -> 3;
+atom_map3(four) -> 4;
+atom_map3(five) -> 5;
+atom_map3(six) -> 6.]]></code>
+
+     <p>will give slightly more efficient matching code.</p>
+
+     <p>Here is a less contrived example:</p>
+
+     <p><em>DO NOT</em></p>
+     <code type="erl"><![CDATA[
+map_pairs1(_Map, [], Ys) ->
+    Ys;
+map_pairs1(_Map, Xs, [] ) ->
+    Xs;
+map_pairs1(Map, [X|Xs], [Y|Ys]) ->
+    [Map(X, Y)|map_pairs1(Map, Xs, Ys)].]]></code>
+
+     <p>The first argument is <em>not</em> a problem. It is variable, but it
+     is a variable in all clauses. The problem is the variable in the second
+     argument, <c>Xs</c>, in the middle clause. Because the variable can
+     match anything, the compiler is not allowed to rearrange the clauses,
+     but must generate code that matches them in the order written.</p>
+
+     <p>If the function is rewritten like this</p>
+
+     <p><em>DO</em></p>
+     <code type="erl"><![CDATA[
+map_pairs2(_Map, [], Ys) ->
+    Ys;
+map_pairs2(_Map, [_|_]=Xs, [] ) ->
+    Xs;
+map_pairs2(Map, [X|Xs], [Y|Ys]) ->
+    [Map(X, Y)|map_pairs2(Map, Xs, Ys)].]]></code>
+
+    <p>the compiler is free rearrange the clauses. It will generate code
+    similar to this</p>
+
+    <p><em>DO NOT (already done by the compiler)</em></p>
+    <code type="erl"><![CDATA[
+explicit_map_pairs(Map, Xs0, Ys0) ->
+    case Xs0 of
+	[X|Xs] ->
+	    case Ys0 of
+		[Y|Ys] ->
+		    [Map(X, Y)|explicit_map_pairs(Map, Xs, Ys)];
+		[] ->
+		    Xs0
+	    end;
+	[] ->
+	    Ys0
+    end.]]></code>
+      
+    <p>which should be slightly faster for presumably the most common case
+    that the input lists are not empty or very short.
+    (Another advantage is that Dialyzer is able to deduce a better type
+    for the variable <c>Xs</c>.)</p>
+  </section>
+
+  <section>
+    <title>Function Calls </title>
+
+    <p>Here is an intentionally rough guide to the relative costs of
+    different kinds of calls. It is based on benchmark figures run on
+    Solaris/Sparc:</p>
+
+    <list type="bulleted">
+    <item>Calls to local or external functions (<c>foo()</c>, <c>m:foo()</c>)
+    are the fastest kind of calls.</item>
+    <item>Calling or applying a fun (<c>Fun()</c>, <c>apply(Fun, [])</c>)
+    is about <em>three times</em> as expensive as calling a local function.</item>
+    <item>Applying an exported function (<c>Mod:Name()</c>,
+    <c>apply(Mod, Name, [])</c>) is about twice as expensive as calling a fun,
+    or about <em>six times</em> as expensive as calling a local function.</item>
+    </list>
+
+    <section>
+       <title>Notes and implementation details</title>
+
+       <p>Calling and applying a fun does not involve any hash-table lookup.
+       A fun contains an (indirect) pointer to the function that implements
+       the fun.</p>
+
+       <warning><p><em>Tuples are not fun(s)</em>.
+       A "tuple fun", <c>{Module,Function}</c>, is not a fun.
+       The cost for calling a "tuple fun" is similar to that
+       of <c>apply/3</c> or worse. Using "tuple funs" is <em>strongly discouraged</em>,
+       as they may not be supported in a future release.</p></warning>
+
+       <p><c>apply/3</c> must look up the code for the function to execute
+       in a hash table. Therefore, it will always be slower than a
+       direct call or a fun call.</p>
+
+       <p>It no longer matters (from a performance point of view)
+       whether you write</p>
+
+       <code type="erl">
+Module:Function(Arg1, Arg2)</code>
+
+       <p>or</p>
+
+       <code type="erl">
+apply(Module, Function, [Arg1,Arg2])</code>
+
+       <p>(The compiler internally rewrites the latter code into the former.)</p>
+
+       <p>The following code</p>
+
+       <code type="erl">
+apply(Module, Function, Arguments)</code>
+
+       <p>is slightly slower because the shape of the list of arguments
+       is not known at compile time.</p>
+    </section>
+  </section>
+
+  <section>
+    <title>Memory usage in recursion</title>
+    <p>When writing recursive functions it is preferable to make them
+      tail-recursive so that they can execute in constant memory space.</p>
+    <p><em>DO</em></p>
+    <code type="none">
+list_length(List) ->
+    list_length(List, 0).
+
+list_length([], AccLen) -> 
+    AccLen; % Base case
+
+list_length([_|Tail], AccLen) ->
+    list_length(Tail, AccLen + 1). % Tail-recursive</code>
+    <p><em>DO NOT</em></p>
+    <code type="none">
+list_length([]) ->
+    0. % Base case
+list_length([_ | Tail]) ->
+    list_length(Tail) + 1. % Not tail-recursive</code>
+  </section>
+
+</chapter>
+