1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
|
<?xml version="1.0" encoding="latin1" ?>
<!DOCTYPE chapter SYSTEM "chapter.dtd">
<chapter>
<header>
<copyright>
<year>2001</year><year>2010</year>
<holder>Ericsson AB. All Rights Reserved.</holder>
</copyright>
<legalnotice>
The contents of this file are subject to the Erlang Public License,
Version 1.1, (the "License"); you may not use this file except in
compliance with the License. You should have received a copy of the
Erlang Public License along with this software. If not, it can be
retrieved online at http://www.erlang.org/.
Software distributed under the License is distributed on an "AS IS"
basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
the License for the specific language governing rights and limitations
under the License.
</legalnotice>
<title>Functions</title>
<prepared>Bjorn Gustavsson</prepared>
<docno></docno>
<date>2007-11-22</date>
<rev></rev>
<file>functions.xml</file>
</header>
<section>
<title>Pattern matching</title>
<p>Pattern matching in function head and in <c>case</c> and <c>receive</c>
clauses are optimized by the compiler. With a few exceptions, there is nothing
to gain by rearranging clauses.</p>
<p>One exception is pattern matching of binaries. The compiler
will not rearrange clauses that match binaries. Placing the
clause that matches against the empty binary <em>last</em> will usually
be slightly faster than placing it <em>first</em>.</p>
<p>Here is a rather contrived example to show another exception:</p>
<p><em>DO NOT</em></p>
<code type="erl">
atom_map1(one) -> 1;
atom_map1(two) -> 2;
atom_map1(three) -> 3;
atom_map1(Int) when is_integer(Int) -> Int;
atom_map1(four) -> 4;
atom_map1(five) -> 5;
atom_map1(six) -> 6.</code>
<p>The problem is the clause with the variable <c>Int</c>.
Since a variable can match anything, including the atoms
<c>four</c>, <c>five</c>, and <c>six</c> that the following clauses
also will match, the compiler must generate sub-optimal code that will
execute as follows:</p>
<p>First the input value is compared to <c>one</c>, <c>two</c>, and
<c>three</c> (using a single instruction that does a binary search;
thus, quite efficient even if there are many values) to select which
one of the first three clauses to execute (if any).</p>
<p>If none of the first three clauses matched, the fourth clause
will match since a variable always matches. If the guard test
<c>is_integer(Int)</c> succeeds, the fourth clause will be
executed.</p>
<p>If the guard test failed, the input value is compared to
<c>four</c>, <c>five</c>, and <c>six</c>, and the appropriate clause
is selected. (There will be a <c>function_clause</c> exception if
none of the values matched.)</p>
<p>Rewriting to either</p>
<p><em>DO</em></p>
<code type="erl"><![CDATA[
atom_map2(one) -> 1;
atom_map2(two) -> 2;
atom_map2(three) -> 3;
atom_map2(four) -> 4;
atom_map2(five) -> 5;
atom_map2(six) -> 6;
atom_map2(Int) when is_integer(Int) -> Int.]]></code>
<p>or</p>
<p><em>DO</em></p>
<code type="erl"><![CDATA[
atom_map3(Int) when is_integer(Int) -> Int;
atom_map3(one) -> 1;
atom_map3(two) -> 2;
atom_map3(three) -> 3;
atom_map3(four) -> 4;
atom_map3(five) -> 5;
atom_map3(six) -> 6.]]></code>
<p>will give slightly more efficient matching code.</p>
<p>Here is a less contrived example:</p>
<p><em>DO NOT</em></p>
<code type="erl"><![CDATA[
map_pairs1(_Map, [], Ys) ->
Ys;
map_pairs1(_Map, Xs, [] ) ->
Xs;
map_pairs1(Map, [X|Xs], [Y|Ys]) ->
[Map(X, Y)|map_pairs1(Map, Xs, Ys)].]]></code>
<p>The first argument is <em>not</em> a problem. It is variable, but it
is a variable in all clauses. The problem is the variable in the second
argument, <c>Xs</c>, in the middle clause. Because the variable can
match anything, the compiler is not allowed to rearrange the clauses,
but must generate code that matches them in the order written.</p>
<p>If the function is rewritten like this</p>
<p><em>DO</em></p>
<code type="erl"><![CDATA[
map_pairs2(_Map, [], Ys) ->
Ys;
map_pairs2(_Map, [_|_]=Xs, [] ) ->
Xs;
map_pairs2(Map, [X|Xs], [Y|Ys]) ->
[Map(X, Y)|map_pairs2(Map, Xs, Ys)].]]></code>
<p>the compiler is free rearrange the clauses. It will generate code
similar to this</p>
<p><em>DO NOT (already done by the compiler)</em></p>
<code type="erl"><![CDATA[
explicit_map_pairs(Map, Xs0, Ys0) ->
case Xs0 of
[X|Xs] ->
case Ys0 of
[Y|Ys] ->
[Map(X, Y)|explicit_map_pairs(Map, Xs, Ys)];
[] ->
Xs0
end;
[] ->
Ys0
end.]]></code>
<p>which should be slightly faster for presumably the most common case
that the input lists are not empty or very short.
(Another advantage is that Dialyzer is able to deduce a better type
for the variable <c>Xs</c>.)</p>
</section>
<section>
<title>Function Calls </title>
<p>Here is an intentionally rough guide to the relative costs of
different kinds of calls. It is based on benchmark figures run on
Solaris/Sparc:</p>
<list type="bulleted">
<item>Calls to local or external functions (<c>foo()</c>, <c>m:foo()</c>)
are the fastest kind of calls.</item>
<item>Calling or applying a fun (<c>Fun()</c>, <c>apply(Fun, [])</c>)
is about <em>three times</em> as expensive as calling a local function.</item>
<item>Applying an exported function (<c>Mod:Name()</c>,
<c>apply(Mod, Name, [])</c>) is about twice as expensive as calling a fun,
or about <em>six times</em> as expensive as calling a local function.</item>
</list>
<section>
<title>Notes and implementation details</title>
<p>Calling and applying a fun does not involve any hash-table lookup.
A fun contains an (indirect) pointer to the function that implements
the fun.</p>
<warning><p><em>Tuples are not fun(s)</em>.
A "tuple fun", <c>{Module,Function}</c>, is not a fun.
The cost for calling a "tuple fun" is similar to that
of <c>apply/3</c> or worse. Using "tuple funs" is <em>strongly discouraged</em>,
as they may not be supported in a future release,
and because there exists a superior alternative since the R10B
release, namely the <c>fun Module:Function/Arity</c> syntax.</p></warning>
<p><c>apply/3</c> must look up the code for the function to execute
in a hash table. Therefore, it will always be slower than a
direct call or a fun call.</p>
<p>It no longer matters (from a performance point of view)
whether you write</p>
<code type="erl">
Module:Function(Arg1, Arg2)</code>
<p>or</p>
<code type="erl">
apply(Module, Function, [Arg1,Arg2])</code>
<p>(The compiler internally rewrites the latter code into the former.)</p>
<p>The following code</p>
<code type="erl">
apply(Module, Function, Arguments)</code>
<p>is slightly slower because the shape of the list of arguments
is not known at compile time.</p>
</section>
</section>
<section>
<title>Memory usage in recursion</title>
<p>When writing recursive functions it is preferable to make them
tail-recursive so that they can execute in constant memory space.</p>
<p><em>DO</em></p>
<code type="none">
list_length(List) ->
list_length(List, 0).
list_length([], AccLen) ->
AccLen; % Base case
list_length([_|Tail], AccLen) ->
list_length(Tail, AccLen + 1). % Tail-recursive</code>
<p><em>DO NOT</em></p>
<code type="none">
list_length([]) ->
0. % Base case
list_length([_ | Tail]) ->
list_length(Tail) + 1. % Not tail-recursive</code>
</section>
</chapter>
|