Diffstat (limited to 'system/doc/efficiency_guide/listhandling.xml')
-rw-r--r-- | system/doc/efficiency_guide/listhandling.xml | 241 |
1 files changed, 241 insertions, 0 deletions
diff --git a/system/doc/efficiency_guide/listhandling.xml b/system/doc/efficiency_guide/listhandling.xml
new file mode 100644
index 0000000000..e9d2dfe556
--- /dev/null
+++ b/system/doc/efficiency_guide/listhandling.xml
@@ -0,0 +1,241 @@
+<?xml version="1.0" encoding="latin1" ?>
+<!DOCTYPE chapter SYSTEM "chapter.dtd">
+
+<chapter>
+  <header>
+    <copyright>
+      <year>2001</year><year>2009</year>
+      <holder>Ericsson AB. All Rights Reserved.</holder>
+    </copyright>
+    <legalnotice>
+      The contents of this file are subject to the Erlang Public License,
+      Version 1.1, (the "License"); you may not use this file except in
+      compliance with the License. You should have received a copy of the
+      Erlang Public License along with this software. If not, it can be
+      retrieved online at http://www.erlang.org/.
+
+      Software distributed under the License is distributed on an "AS IS"
+      basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
+      the License for the specific language governing rights and limitations
+      under the License.
+
+    </legalnotice>
+
+    <title>List handling</title>
+    <prepared>Bjorn Gustavsson</prepared>
+    <docno></docno>
+    <date>2007-11-16</date>
+    <rev></rev>
+    <file>listHandling.xml</file>
+  </header>
+
+  <section>
+    <title>Creating a list</title>
+
+    <p>Lists can only be built starting from the end and attaching
+    list elements at the beginning. If you use the <c>++</c> operator
+    like this</p>
+
+    <code type="erl">
+List1 ++ List2</code>
+
+    <p>you will create a new list which is a copy of the elements in <c>List1</c>,
+    followed by <c>List2</c>. Looking at how <c>lists:append/2</c> or <c>++</c> would be
+    implemented in plain Erlang, it can be seen clearly that the first list
+    is copied:</p>
+
+    <code type="erl">
+append([H|T], Tail) ->
+    [H|append(T, Tail)];
+append([], Tail) ->
+    Tail.</code>
+
+    <p>So the important thing when recursing and building a list is to
+    make sure that you attach the new elements to the beginning of the list,
+    so that you build <em>a</em> list, and not hundreds or thousands of
+    copies of the growing result list.</p>
+
+    <p>Let us first look at how it should not be done:</p>
+
+    <p><em>DO NOT</em></p>
+    <code type="erl"><![CDATA[
+bad_fib(N) ->
+    bad_fib(N, 0, 1, []).
+
+bad_fib(0, _Current, _Next, Fibs) ->
+    Fibs;
+bad_fib(N, Current, Next, Fibs) ->
+    bad_fib(N - 1, Next, Current + Next, Fibs ++ [Current]).]]></code>
+
+    <p>Here we are not building a list; in each iteration step we
+    create a new list that is one element longer than the previous list.</p>
+
+    <p>To avoid copying the result in each iteration, we must build the list in
+    reverse order and reverse the list when we are done:</p>
+
+    <p><em>DO</em></p>
+    <code type="erl"><![CDATA[
+tail_recursive_fib(N) ->
+    tail_recursive_fib(N, 0, 1, []).
+
+tail_recursive_fib(0, _Current, _Next, Fibs) ->
+    lists:reverse(Fibs);
+tail_recursive_fib(N, Current, Next, Fibs) ->
+    tail_recursive_fib(N - 1, Next, Current + Next, [Current|Fibs]).]]></code>
+
+  </section>
+
+  <section>
+    <title>List comprehensions</title>
+
+    <p>List comprehensions still have a reputation for being slow.
+    They used to be implemented using funs, which used to be slow.</p>
+
+    <p>In recent Erlang/OTP releases (including R12B), a list comprehension</p>
+
+    <code type="erl"><![CDATA[
+[Expr(E) || E <- List]]]></code>
+
+    <p>is basically translated to a local function</p>
+
+    <code type="erl">
+'lc^0'([E|Tail], Expr) ->
+    [Expr(E)|'lc^0'(Tail, Expr)];
+'lc^0'([], _Expr) -> [].</code>
+
+    <p>In R12B, if the result of the list comprehension will <em>obviously</em> not be used,
+    a list will not be constructed. For instance, in this code</p>
+
+    <code type="erl"><![CDATA[
+[io:put_chars(E) || E <- List],
+ok.]]></code>
+
+    <p>or in this code</p>
+
+    <code type="erl"><![CDATA[
+.
+.
+.
+case Var of
+    ... ->
+        [io:put_chars(E) || E <- List];
+    ... ->
+end,
+some_function(...),
+.
+.
+.]]></code>
+
+    <p>the value is neither assigned to a variable, nor passed to another function,
+    nor returned, so there is no need to construct a list and the compiler will simplify
+    the code for the list comprehension to</p>
+
+    <code type="erl">
+'lc^0'([E|Tail], Expr) ->
+    Expr(E),
+    'lc^0'(Tail, Expr);
+'lc^0'([], _Expr) -> [].</code>
+
+  </section>
+
+  <section>
+    <title>Deep and flat lists</title>
+
+    <p><seealso marker="stdlib:lists#flatten/1">lists:flatten/1</seealso>
+    builds an entirely new list. Therefore, it is expensive, and even
+    <em>more</em> expensive than the <c>++</c> operator (which copies its left argument,
+    but not its right argument).</p>
+
+    <p>In the following situations, you can easily avoid calling <c>lists:flatten/1</c>:</p>
+
+    <list type="bulleted">
+      <item>When sending data to a port. Ports understand deep lists
+      so there is no reason to flatten the list before sending it to
+      the port.</item>
+      <item>When calling BIFs that accept deep lists, such as
+      <seealso marker="erts:erlang#list_to_binary/1">list_to_binary/1</seealso> or
+      <seealso marker="erts:erlang#iolist_to_binary/1">iolist_to_binary/1</seealso>.</item>
+      <item>When you know that your list is only one level deep, you can use
+      <seealso marker="stdlib:lists#append/1">lists:append/1</seealso>.</item>
+    </list>
+
+    <p><em>Port example</em></p>
+    <p><em>DO</em></p>
+    <pre>
+      ...
+      port_command(Port, DeepList)
+      ...</pre>
+    <p><em>DO NOT</em></p>
+    <pre>
+      ...
+      port_command(Port, lists:flatten(DeepList))
+      ...</pre>
+
+    <p>A common way to send a zero-terminated string to a port is the following:</p>
+
+    <p><em>DO NOT</em></p>
+    <pre>
+      ...
+      TerminatedStr = String ++ [0], % String="foo" => [$f, $o, $o, 0]
+      port_command(Port, TerminatedStr)
+      ...</pre>
+
+    <p>Instead, do this:</p>
+
+    <p><em>DO</em></p>
+    <pre>
+      ...
+      TerminatedStr = [String, 0], % String="foo" => [[$f, $o, $o], 0]
+      port_command(Port, TerminatedStr)
+      ...</pre>
+
+    <p><em>Append example</em></p>
+    <p><em>DO</em></p>
+    <pre>
+      > lists:append([[1], [2], [3]]).
+      [1,2,3]
+      ></pre>
+    <p><em>DO NOT</em></p>
+    <pre>
+      > lists:flatten([[1], [2], [3]]).
+      [1,2,3]
+      ></pre>
+  </section>
+
+  <section>
+    <title>Why you should not worry about recursive list functions</title>
+
+    <p>In the performance myth chapter, the following myth was exposed:
+    <seealso marker="myths#tail_recursive">Tail-recursive functions
+    are MUCH faster than recursive functions</seealso>.</p>
+
+    <p>To summarize, in R12B there is usually not much difference between
+    a body-recursive list function and a tail-recursive function that reverses
+    the list at the end. Therefore, concentrate on writing beautiful code
+    and forget about the performance of your list functions. In the time-critical
+    parts of your code (and only there), <em>measure</em> before rewriting
+    your code.</p>
+
+    <p><em>Important note</em>: This section talks about list functions that
+    <em>construct</em> lists. A tail-recursive function that does not construct
+    a list runs in constant space, while the corresponding body-recursive
+    function uses stack space proportional to the length of the list.
+    For instance, a function that sums a list of integers should <em>not</em> be
+    written like this</p>
+
+    <p><em>DO NOT</em></p>
+    <code type="erl">
+recursive_sum([H|T]) -> H+recursive_sum(T);
+recursive_sum([]) -> 0.</code>
+
+    <p>but like this</p>
+
+    <p><em>DO</em></p>
+    <code type="erl">
+sum(L) -> sum(L, 0).
+
+sum([H|T], Sum) -> sum(T, Sum + H);
+sum([], Sum) -> Sum.</code>
+  </section>
+</chapter>
+