aboutsummaryrefslogtreecommitdiffstats
path: root/system/doc/extensions/funs.xml
blob: f9c003c8eee3218dc7b99527014c35d591158068 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
<?xml version="1.0" encoding="latin1" ?>
<!DOCTYPE chapter SYSTEM "chapter.dtd">

<chapter>
  <header>
    <copyright>
      <year>1997</year>
      <year>2007</year>
      <holder>Ericsson AB, All Rights Reserved</holder>
    </copyright>
    <legalnotice>
  The contents of this file are subject to the Erlang Public License,
  Version 1.1, (the "License"); you may not use this file except in
  compliance with the License. You should have received a copy of the
  Erlang Public License along with this software. If not, it can be
  retrieved online at http://www.erlang.org/.

  Software distributed under the License is distributed on an "AS IS"
  basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
  the License for the specific language governing rights and limitations
  under the License.

  The Initial Developer of the Original Code is Ericsson AB.
    </legalnotice>

    <title>Programming with Funs</title>
    <prepared>Joe Armstrong</prepared>
    <responsible>Bjarne D&auml;cker</responsible>
    <docno>1</docno>
    <approved>Bjarne D&auml;Ker</approved>
    <checked></checked>
    <date>96-01-16</date>
    <rev>PA1</rev>
    <file>funs.sgml</file>
  </header>
  <p>This section introduces functional objects (Funs). That is a new data type introduced in Erlang 4.4. Functions which takes Funs as arguments, or which return Funs are called higher order functions.</p>
  <list type="bulleted">
    <item>Funs can be passed as arguments to other functions, just like lists or tuples</item>
    <item>functions can be written which return Funs, just like any other data object.</item>
  </list>

  <section>
    <title>Higher Order Functions </title>
    <p>Funs encourages us to encapsulate common patterns of design into functional forms called higher order functions. These functions not only shortens programs, but also produce clearer programs because the intended meaning of the program is explicitly rather than implicitly stated.</p>
    <p>The concepts of higher order functions and procedural abstraction are introduced with two brief examples.</p>

    <section>
      <title>Example 1 - map</title>
      <p>If we want to double every element in a list, we could write a function named <c>double</c>:</p>
      <code type="none">

double([H|T]) -> [2*H|double(T)];
double([])    -> []      </code>
      <p>This function obviously doubles the argument entered as input as follows:</p>
      <pre>

> <input>double([1,2,3,4]).</input>
[2,4,6,8]      </pre>
      <p>We now add the function <c>add_one</c>, which adds one to every element in a list:</p>
      <code type="none">

add_one([H|T]) -> [H+1|add_one(T)];
add_one([])    -> [].      </code>
      <p>These functions, <c>double</c>  and <c>add_one</c>, have a very similar structure. We can exploit this fact and write a function <c>map</c> which expresses this similarity:</p>
      <codeinclude file="funs1.erl" tag="%1" type="erl"></codeinclude>
      <p>We can now express the functions <c>double</c> and <c>add_one</c> in terms of <c>map</c> as follows:</p>
      <code type="none">

double(L)  -> map(fun(X) -> 2*X end, L).
add_one(L) -> map(fun(X) -> 1 + X end, L).      </code>
      <p><c>map(F, List)</c> is a function which takes a function <c>F</c> and a list <c>L</c> as arguments and
        returns the new list which is obtained by applying <c>F</c> to each of the elements in <c>L</c>.</p>
      <p>The process of abstracting out the common features of a number of different programs is called procedural abstraction. Procedural abstraction can be used in order to write several different functions which have a similar structure, but differ only in some minor detail. This is done as follows:</p>
      <list type="ordered">
        <item>write one function which represents the common features of these functions</item>
        <item>parameterize the difference in terms of functions which are passed as arguments to the common function.</item>
      </list>
    </section>

    <section>
      <title>Example 2 - foreach</title>
      <p>This example illustrates procedural abstraction. Initially, we show the following two examples written as conventional functions:</p>
      <list type="ordered">
        <item>all elements of a list are printed onto a stream</item>
        <item>a message is broadcast to a list of processes.</item>
      </list>
      <code type="none">

print_list(Stream, [H|T]) ->
    io:format(Stream, "~p~n", [H]),
    print_list(Stream, T);
print_on_list(Stream, []) ->
    true.      </code>
      <code type="none">

broadcast(Msg, [Pid|Pids]) ->
    Pid ! Msg,
    broadcast(Msg, Pids);
broadcast(_, []) ->
    true.      </code>
      <p>Both these functions have a very similar structure. They both iterate over a list doing something to each element in the list. The "something" has to be carried round as an extra argument to the function which does this.</p>
      <p>The function <c>foreach</c> expresses this similarity:</p>
      <codeinclude file="funs1.erl" tag="%2" type="erl"></codeinclude>
      <p>Using <c>foreach</c>, <c>print_on_list</c> becomes:</p>
      <code type="none">

foreach(fun(H) -> io:format(S, "~p~n~,[H]) end, L)      </code>
      <p><c>broadcast</c> becomes:</p>
      <code type="none">

foreach(fun(Pid) -> Pid ! M end, L)      </code>
      <p><c>foreach</c> is evaluated for its side-effect and not its value. <c>foreach(Fun ,L)</c> calls <c>Fun(X)</c> for each element <c>X</c> in <c>L</c> and the processing occurs in the order in which the elements were defined in <c>L</c>. <c>map</c> does not define the order in which its elements are processed.</p>
    </section>
  </section>

  <section>
    <title>Advantages of Higher Order Functions</title>
    <p>Programming with higher order functions, such as <c>map</c> and <c>foreach</c>, has a number of advantages:</p>
    <list type="bulleted">
      <item>It is much easier to understand the program and the intention of the programmer is clearly expressed in the code. The statement <c>foreach(fun(X) -></c> clearly indicates that the intention of this program is to do something to each element in the list <c>L</c>. We also know that the function which is passed as the first argument of <c>foreach</c> takes one argument <c>X</c>, which will be successively bound to each of the elements in <c>L</c>.</item>
      <item>Functions which take Funs as arguments are much easier to re-use than other functions. </item>
    </list>
  </section>

  <section>
    <title>The Syntax of Funs</title>
    <p>Funs are written with the syntax:</p>
    <code type="none">

F = fun (Arg1, Arg2, ... ArgN) ->
        ...
    end    </code>
    <p>This creates an anonymous function of <c>N</c> arguments and binds it to the variable <c>F</c>.</p>
    <p>If we have already written a function in the same module and wish to pass this function as an argument, we can use the following syntax:</p>
    <code type="none">

F = fun FunctionName/Arity    </code>
    <p>With this form of function reference, the function which is referred to does not need to be exported from the module.</p>
    <p>We can also refer to a function defined in a different module with the following syntax:</p>
    <code type="none">

F = {Module, FunctionName}    </code>
    <p>In this case, the function must be exported from the module in question.</p>
    <p>The follow program illustrates the different ways of creating Funs:</p>
    <codeinclude file="fun_test.erl" tag="%1" type="erl"></codeinclude>
    <p>We can evaluate the fun <c>F</c> with the syntax:</p>
    <code type="none">

F(Arg1, Arg2, ..., Argn)    </code>
    <p>To check whether a term is a Fun, use the test <c>function/1</c>
      in a guard. Example:</p>
    <code type="none">
f(F, Args) when function(F) ->
   apply(F, Args);
f(N, _) when integer(N) ->
   N.    </code>
    <p>Funs are a distinct type. The BIFs erlang:fun_info/1,2 can
      be used to retrieve information about a fun, and the BIF
      erlang:fun_to_list/1 returns a textual representation of a fun.
      The check_process_code/2 BIF returns true if the
      process contains funs that depend on the old version of
      a module.</p>
    <note>
      <p>In OTP R5 and earlier releases, funs were represented
        using tuples.</p>
    </note>
  </section>

  <section>
    <title>Variable Bindings within a Fun</title>
    <p>The scope rules for variables which occur in Funs are as follows:</p>
    <list type="bulleted">
      <item>All variables which occur in the head of a Fun are assumed to be "fresh" variables.</item>
      <item>Variables which are defined before the Fun, and which occur in function calls or guard tests within the Fun, have the values they had outside the Fun.</item>
      <item>No variables may be exported from a Fun.</item>
    </list>
    <p>The following examples illustrate these rules:</p>
    <code type="none">

print_list(File, List) ->
    {ok, Stream} = file:open(File, write),
    foreach(fun(X) -> io:format(Stream,"~p~n",[X]) end, List),
    file:close(Stream).    </code>
    <p>In the above example, the variable <c>X</c> which is defined in the head of the Fun is a new variable. The value of the variable <c>Stream</c> which is used within within the Fun gets its value from the
      <c>file:open</c> line.</p>
    <p>Since any variable which occurs in the head of a Fun is considered a new variable it would be equally valid to write:</p>
    <code type="none">

print_list(File, List) ->
    {ok, Stream} = file:open(File, write),
    foreach(fun(File) -> 
                io:format(Stream,"~p~n",[File]) 
            end, List),
    file:close(Stream).    </code>
    <p>In this example, <c>File</c> is used as the new variable instead of <c>X</c>. This is rather silly since code in the body of the Fun cannot refer to the variable <c>File</c> which is defined outside the Fun. Compiling this example will yield the diagnostic:</p>
    <code type="none">

./FileName.erl:Line: Warning: variable 'File' 
      shadowed in 'lambda head'    </code>
    <p>This reminds us that the variable <c>File</c> which is defined inside the Fun collides with the variable <c>File</c> which is defined outside the Fun.</p>
    <p>The rules for importing variables into a Fun has the consequence that certain pattern matching operations have to be moved into guard expressions and cannot be written in the head of the Fun. For example, we might write the following code if we intend the first clause of <c>F</c> to be evaluated when the value of its argument is <c>Y</c>:</p>
    <code type="none">

f(...) ->
    Y = ...
    map(fun(X) when X == Y ->
             ;
           (_) ->
             ...
        end, ...)
    ...    </code>
    <p>instead of</p>
    <code type="none">

f(...) ->
    Y = ...
    map(fun(Y) ->
             ;
           (_) ->
             ...
        end, ...)
    ...    </code>
  </section>

  <section>
    <title>Funs and the Module Lists</title>
    <p>The following examples show a dialogue with the Erlang shell. All the higher order functions discussed are exported from the module <c>lists</c>.</p>

    <section>
      <title>map</title>
      <codeinclude file="funs1.erl" tag="%1" type="erl"></codeinclude>
      <p><c>map</c> takes a function of one argument and a list of terms. It returns the list obtained by applying the function to every argument in the list.</p>
      <pre>

1> <input>Double = fun(X) -> 2 * X end.</input>
#Fun&lt;erl_eval>
2><input>lists:map(Double, [1,2,3,4,5]).</input>
[2,4,6,8,10]      </pre>
      <p>When a new Fun is defined in the shell, the value of the Fun is printed as <c><![CDATA[Fun#<erl_eval>]]></c></p>
    </section>

    <section>
      <title>any</title>
      <codeinclude file="funs1.erl" tag="%4" type="erl"></codeinclude>
      <p><c>any</c> takes a predicate <c>P</c> of one argument and a list of terms. A predicate is a function which returns <c>true</c> or <c>false</c>. <c>any</c> is true if there is a term <c>X</c> in the list such that <c>P(X)</c> is <c>true</c>.</p>
      <p>We define a predicate <c>Big(X)</c> which is <c>true</c> if its argument is greater that 10.</p>
      <pre>

3> <input>Big =  fun(X) -> if X > 10 -> true; true -> false end end.</input>
#Fun&lt;erl_eval>
4><input>lists:any(Big, [1,2,3,4]).</input>
false.
5> <input>lists:any(Big, [1,2,3,12,5]).</input>
true.      </pre>
    </section>

    <section>
      <title>all</title>
      <codeinclude file="funs1.erl" tag="%3" type="erl"></codeinclude>
      <p><c>all</c> has the same arguments as <c>any</c>. It is true if the predicate applied to all elements in the list is true.</p>
      <pre>

6><input>lists:all(Big, [1,2,3,4,12,6]).</input>   
false
7><input>lists:all(Big, [12,13,14,15]).</input>       
true      </pre>
    </section>

    <section>
      <title>foreach</title>
      <codeinclude file="funs1.erl" tag="%2" type="erl"></codeinclude>
      <p><c>foreach</c> takes a function of one argument and a list of terms. The function is applied to each argument in the list. <c>foreach</c> returns <c>ok</c>. It is used for its side-effect only.</p>
      <pre>

8> <input>lists:foreach(fun(X) -> io:format("~w~n",[X]) end, [1,2,3,4]).</input> 
1
2
3
4
true      </pre>
    </section>

    <section>
      <title>foldl</title>
      <codeinclude file="funs1.erl" tag="%8" type="erl"></codeinclude>
      <p><c>foldl</c> takes a function of two arguments, an accumulator and a list. The function is called with two arguments. The first argument is the successive elements in the list, the second argument is the accumulator. The function must return a new accumulator which is used the next time the function is called.</p>
      <p>If we have a list of lists <c>L = ["I","like","Erlang"]</c>, then we can sum the lengths of all the strings in <c>L</c> as follows:</p>
      <pre>

9> <input>L = ["I","like","Erlang"].</input>
["I","like","Erlang"]
10> <input>lists:foldl(fun(X, Sum) -> length(X) + Sum end, 0, L).</input>                    
11      </pre>
      <p><c>foldl</c> works like a <c>while</c> loop in an imperative language:</p>
      <code type="none">

    L =  ["I","like","Erlang"],
    Sum = 0,
    while( L != []){
        Sum += length(head(L)),
        L = tail(L)
    end      </code>
    </section>

    <section>
      <title>mapfoldl</title>
      <codeinclude file="funs1.erl" tag="%10" type="erl"></codeinclude>
      <p><c>mapfoldl</c> simultaneously maps and folds over a list. The following example shows how to change all letters in <c>L</c> to upper case and count them.</p>
      <p>First upcase:</p>
      <pre>

11> <input>Upcase =  fun(X) when $a =&lt; X,  X =&lt; $z -> X + $A - $a;</input>
<input>(X) -> X</input> 
<input>end.</input>
#Fun&lt;erl_eval>
12> <input>Upcase_word =</input> 
<input>fun(X) -></input> 
<input>lists:map(Upcase, X)</input> 
<input>end.</input>
#Fun&lt;erl_eval>
13><input>Upcase_word("Erlang").</input>
"ERLANG"
14><input>lists:map(Upcase_word, L).</input>
["I","LIKE","ERLANG"]      </pre>
      <p>Now we can do the fold and the map at the same time:</p>
      <pre>

14> <input>lists:mapfoldl(fun(Word, Sum) -></input>
14>    <input>{Upcase_word(Word), Sum + length(Word)}</input>
14>              <input>end, 0, L).</input>
{["I","LIKE","ERLANG"],11}      </pre>
    </section>

    <section>
      <title>filter</title>
      <codeinclude file="funs1.erl" tag="%9" type="erl"></codeinclude>
      <p><c>filter</c> takes a predicate of one argument and a list and returns all element in the list which satisfy the predicate.</p>
      <pre>

15><input>lists:filter(Big, [500,12,2,45,6,7]).</input>
[500,12,45]      </pre>
      <p>When we combine maps and filters we can write very succinct and obviously correct code. For example, suppose we want to define a set difference function. We want to define <c>diff(L1, L2)</c> to be the difference between the lists <c>L1</c> and <c>L2</c>.
        This is the list of all elements in L1 which are not contained in L2. This code can be written as follows:</p>
      <code type="none">

diff(L1, L2) -> 
    filter(fun(X) -> not member(X, L2) end, L1).      </code>
      <p>The AND intersection of the list <c>L1</c> and <c>L2</c> is also easily defined:</p>
      <code type="none">

intersection(L1,L2) -> filter(fun(X) -> member(X,L1) end, L2).      </code>
    </section>

    <section>
      <title>takewhile</title>
      <codeinclude file="funs1.erl" tag="%5" type="erl"></codeinclude>
      <p><c>takewhile(P, L)</c> takes elements <c>X</c> from a list <c>L</c> as long as the predicate <c>P(X)</c> is true.</p>
      <pre>

16><input>lists:takewhile(Big, [200,500,45,5,3,45,6]).</input>  
[200,500,45]      </pre>
    </section>

    <section>
      <title>dropwhile</title>
      <codeinclude file="funs1.erl" tag="%6" type="erl"></codeinclude>
      <p><c>dropwhile</c> is the complement of <c>takewhile</c>.</p>
      <pre>

17> <input>lists:dropwhile(Big, [200,500,45,5,3,45,6]).</input>
[5,3,45,6]      </pre>
    </section>

    <section>
      <title>splitlist</title>
      <codeinclude file="funs1.erl" tag="%7" type="erl"></codeinclude>
      <p><c>splitlist(P, L)</c> splits the list <c>L</c> into the two sub-lists <c>{L1, L2}</c>, where <c>L = takewhile(P, L)</c> and <c>L2 = dropwhile(P, L)</c>.</p>
      <pre>

18><input>lists:splitlist(Big, [200,500,45,5,3,45,6]).</input>          
{[200,500,45],[5,3,45,6]}      </pre>
    </section>

    <section>
      <title>first</title>
      <codeinclude file="funs1.erl" tag="%11" type="erl"></codeinclude>
      <p><c>first</c> returns <c>{true, R}</c>, where <c>R</c> is the first element in a list satisfying a predicate or <c>false</c>:</p>
      <pre>

19><input>lists:first(Big, [1,2,45,6,123]).</input>
{true,45}
20><input>lists:first(Big, [1,2,4,5]). </input>       
false      </pre>
    </section>
  </section>

  <section>
    <title>Funs which Return Funs</title>
    <p>So far, this section has only described functions which take Funs as arguments. It is also possible to write more powerful functions which themselves return Funs. The following examples illustrate these type of functions.</p>

    <section>
      <title>Simple Higher Order Functions</title>
      <p><c>Adder(X)</c> is a function which, given <c>X</c>, returns a new function <c>G</c> such that <c>G(K)</c> returns <c>K + X</c>.</p>
      <pre>

21> <input>Adder = fun(X) -> fun(Y) -> X + Y end end.</input>
#Fun&lt;erl_eval>
22> <input>Add6 = Adder(6).</input>
#Fun&lt;erl_eval>
23><input>Add6(10).</input>
16      </pre>
    </section>

    <section>
      <title>Infinite Lists</title>
      <p>The idea is to write something like:</p>
      <code type="none">

-module(lazy).
-export([ints_from/1]).
ints_from(N) ->
    fun() ->
            [N|ints_from(N+1)]
    end.      </code>
      <p>Then we can proceed as follows:</p>
      <code type="none"><![CDATA[

24> XX = lazy:ints_from(1).
#Fun<lazy>
25> XX().
[1|#Fun<lazy>]
26> hd(XX()).
1
27> Y = tl(XX()).
#Fun<lazy>
28> hd(Y()).
2      ]]></code>
      <p>etc. - this is an example of "lazy embedding"</p>
    </section>

    <section>
      <title>Parsing</title>
      <p>The following examples show parsers of the following type:</p>
      <pre>
Parser(Toks) -> {ok, Tree, Toks1} | fail      </pre>
      <p><c>Toks</c> is the list of tokens to be parsed. A successful parse returns <c>{ok, Tree, Toks1}</c>, where <c>Tree</c> is a parse tree and <c>Toks1</c> is a tail of <c>Tree</c> which contains symbols encountered after the structure which was correctly parsed. Otherwise <c>fail</c> is returned.</p>
      <p>The example which follows illustrates a simple, functional parser which parses the grammar:</p>
      <pre>
(a | b) &amp; (c | d)      </pre>
      <p>The following code defines a function <c>pconst(X)</c> in the module <c>funparse</c>, which returns a Fun which parses a list of tokens.</p>
      <codeinclude file="funparse.erl" tag="%14" type="erl"></codeinclude>
      <p>This function can be used as follows:</p>
      <pre>
29><input>P1 = funparse:pconst(a).</input>
#Fun&lt;hof>
30> <input>P1([a,b,c]).</input>
{ok,{const,a},[b,c]}
31> <input>P1([x,y,z]).</input>     
fail      </pre>
      <p>Next, we define the two higher order functions <c>pand</c> and <c>por</c> which combine primitive parsers to produce more complex parsers. Firstly <c>pand</c>:</p>
      <codeinclude file="funparse.erl" tag="%16" type="erl"></codeinclude>
      <p>Given a parser <c>P1</c> for grammar <c>G1</c>, and a parser <c>P2</c> for grammar <c>G2</c>, <c>pand(P1, P2)</c> 
        returns a parser for the grammar which consists of sequences of tokens which satisfy <c>G1</c> followed by sequences of tokens which satisfy <c>G2</c>.</p>
      <p><c>por(P1, P2)</c> returns a parser for the language described by the grammar <c>G1</c> or <c>G2</c>.</p>
      <codeinclude file="funparse.erl" tag="%15" type="erl"></codeinclude>
      <p>The original problem was to parse the grammar <c><![CDATA[(a | b) & (c | d)]]></c>. The following code addresses this problem: </p>
      <codeinclude file="funparse.erl" tag="%13" type="erl"></codeinclude>
      <p>The following code adds a parser interface to the grammar:</p>
      <codeinclude file="funparse.erl" tag="%12" type="erl"></codeinclude>
      <p>We can test this parser as follows:</p>
      <pre>

32> <input>funparse:parse([a,c]).</input>
{ok,{'and',{'or',1,{const,a}},{'or',1,{const,c}}}}
33> <input>funparse:parse([a,d]).</input> 
{ok,{'and',{'or',1,{const,a}},{'or',2,{const,d}}}}
34> <input>funparse:parse([b,c]).</input>   
{ok,{'and',{'or',2,{const,b}},{'or',1,{const,c}}}}
35> <input>funparse:parse([b,d]).</input> 
{ok,{'and',{'or',2,{const,b}},{'or',2,{const,d}}}}
36> <input>funparse:parse([a,b]).</input>   
fail      </pre>
    </section>
  </section>
</chapter>