2001-2009 Ericsson AB. All Rights Reserved.

The contents of this file are subject to the Erlang Public License, Version 1.1, (the "License"); you may not use this file except in compliance with the License. You should have received a copy of the Erlang Public License along with this software. If not, it can be retrieved online at http://www.erlang.org/. Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License.

Functions

Bjorn Gustavsson, 2007-11-22, functions.xml

Pattern matching

Pattern matching in function heads and in case and receive clauses is optimized by the compiler. With a few exceptions, there is nothing to gain by rearranging clauses.

One exception is pattern matching of binaries. The compiler will not rearrange clauses that match binaries. Placing the clause that matches against the empty binary last will usually be slightly faster than placing it first.
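As an illustration of the clause order recommended above, here is a minimal sketch of a function that walks a binary byte by byte. The module name bin_order_sketch and the function sum_bytes are hypothetical names chosen for this example. The clause for the non-empty binary comes first, and the clause for the empty binary, which matches only once at the end, comes last.

```erlang
-module(bin_order_sketch).
-export([sum_bytes/1]).

%% Sum all bytes of a binary (hypothetical example function).
sum_bytes(Bin) when is_binary(Bin) ->
    sum_bytes(Bin, 0).

%% Common case first: there is at least one byte left.
sum_bytes(<<Byte, Rest/binary>>, Acc) ->
    sum_bytes(Rest, Acc + Byte);
%% Empty binary last: this clause matches only once per call.
sum_bytes(<<>>, Acc) ->
    Acc.
```

Any speed difference from this ordering is expected to be small, so measure before relying on it.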

Here is a rather contrived example to show another exception:

DO NOT

atom_map1(one) -> 1;
atom_map1(two) -> 2;
atom_map1(three) -> 3;
atom_map1(Int) when is_integer(Int) -> Int;
atom_map1(four) -> 4;
atom_map1(five) -> 5;
atom_map1(six) -> 6.

The problem is the clause with the variable Int. Since a variable can match anything, including the atoms four, five, and six that the following clauses also will match, the compiler must generate sub-optimal code that will execute as follows:

First the input value is compared to one, two, and three (using a single instruction that does a binary search; thus, quite efficient even if there are many values) to select which one of the first three clauses to execute (if any).

If none of the first three clauses matched, the fourth clause will match since a variable always matches. If the guard test is_integer(Int) succeeds, the fourth clause will be executed.

If the guard test failed, the input value is compared to four, five, and six, and the appropriate clause is selected. (There will be a function_clause exception if none of the values matched.)
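The three steps above can be pictured as two separate selection stages with the guard test in between. The following is a hand-written sketch of that order, not actual compiler output. The module name atom_map_sketch and the function atom_map1_explicit are hypothetical.

```erlang
-module(atom_map_sketch).
-export([atom_map1_explicit/1]).

%% Illustration only: mirrors the matching order described in the text.
atom_map1_explicit(Value) ->
    case Value of
        %% Step 1: select among the first three atoms.
        one -> 1;
        two -> 2;
        three -> 3;
        %% Step 2: the variable clause matches anything; test its guard.
        _ when is_integer(Value) -> Value;
        %% Step 3: the guard failed, so a second selection is needed.
        _ ->
            case Value of
                four -> 4;
                five -> 5;
                six -> 6;
                _ -> erlang:error(function_clause, [Value])
            end
    end.
```

The second, inner selection is the extra work that the variable clause in the middle forces on the generated code.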

Rewriting to either

DO

atom_map2(one) -> 1;
atom_map2(two) -> 2;
atom_map2(three) -> 3;
atom_map2(four) -> 4;
atom_map2(five) -> 5;
atom_map2(six) -> 6;
atom_map2(Int) when is_integer(Int) -> Int.

or

DO

atom_map3(Int) when is_integer(Int) -> Int;
atom_map3(one) -> 1;
atom_map3(two) -> 2;
atom_map3(three) -> 3;
atom_map3(four) -> 4;
atom_map3(five) -> 5;
atom_map3(six) -> 6.

will give slightly more efficient matching code.

Here is a less contrived example:

DO NOT

map_pairs1(_Map, [], Ys) ->
    Ys;
map_pairs1(_Map, Xs, [] ) ->
    Xs;
map_pairs1(Map, [X|Xs], [Y|Ys]) ->
    [Map(X, Y)|map_pairs1(Map, Xs, Ys)].

The first argument is not a problem. It is a variable, but it is a variable in all clauses. The problem is the variable in the second argument, Xs, in the middle clause. Because the variable can match anything, the compiler is not allowed to rearrange the clauses, but must generate code that matches them in the order written.

If the function is rewritten like this

DO

map_pairs2(_Map, [], Ys) ->
    Ys;
map_pairs2(_Map, [_|_]=Xs, [] ) ->
    Xs;
map_pairs2(Map, [X|Xs], [Y|Ys]) ->
    [Map(X, Y)|map_pairs2(Map, Xs, Ys)].

the compiler is free to rearrange the clauses. It will generate code similar to this

DO NOT (already done by the compiler)

explicit_map_pairs(Map, Xs0, Ys0) ->
    case Xs0 of
        [X|Xs] ->
            case Ys0 of
                [Y|Ys] ->
                    [Map(X, Y)|explicit_map_pairs(Map, Xs, Ys)];
                [] ->
                    Xs0
            end;
        [] ->
            Ys0
    end.

which should be slightly faster for presumably the most common case that the input lists are not empty or very short. (Another advantage is that Dialyzer is able to deduce a better type for the variable Xs.)
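To make the behaviour of the rewritten function concrete, here is a small usage sketch. The module name map_pairs_sketch and the function demo/0 are hypothetical. map_pairs2 is written the same way as in the DO example above.

```erlang
-module(map_pairs_sketch).
-export([map_pairs2/3, demo/0]).

%% Same clauses as the DO version in the text.
map_pairs2(_Map, [], Ys) ->
    Ys;
map_pairs2(_Map, [_|_]=Xs, []) ->
    Xs;
map_pairs2(Map, [X|Xs], [Y|Ys]) ->
    [Map(X, Y)|map_pairs2(Map, Xs, Ys)].

%% Apply an addition fun pairwise; leftover elements are kept as they are.
demo() ->
    Add = fun(X, Y) -> X + Y end,
    [map_pairs2(Add, [1,2,3], [10,20,30]),   % lists of equal length
     map_pairs2(Add, [1,2,3], [10]),         % second list runs out first
     map_pairs2(Add, [], [7,8])].            % first list is empty
```

Calling map_pairs_sketch:demo() returns [[11,22,33],[11,2,3],[7,8]].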

Function Calls

Here is an intentionally rough guide to the relative costs of different kinds of calls. It is based on benchmark figures run on Solaris/Sparc:

Calls to local or external functions (foo(), m:foo()) are the fastest kind of calls.

Calling or applying a fun (Fun(), apply(Fun, [])) is about three times as expensive as calling a local function.

Applying an exported function (Mod:Name(), apply(Mod, Name, [])) is about twice as expensive as calling a fun, or about six times as expensive as calling a local function.

Notes and implementation details

Calling and applying a fun does not involve any hash-table lookup. A fun contains an (indirect) pointer to the function that implements the fun.

Tuples are not fun(s). A "tuple fun", {Module,Function}, is not a fun. The cost for calling a "tuple fun" is similar to that of apply/3 or worse. Using "tuple funs" is strongly discouraged, as they may not be supported in a future release, and because there exists a superior alternative since the R10B release, namely the fun Module:Function/Arity syntax.
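As a brief illustration of the recommended alternative, the sketch below creates funs with the fun Module:Function/Arity syntax and uses them like any other fun. The module name external_fun_sketch and the function demo/0 are hypothetical. lists:reverse/1, lists:map/2, and erlang:abs/1 are standard functions.

```erlang
-module(external_fun_sketch).
-export([demo/0]).

demo() ->
    %% A fun that refers to an exported function by module, name, and arity.
    Reverse = fun lists:reverse/1,
    Reversed = Reverse([1,2,3]),
    %% Such a fun can be passed wherever a fun is expected.
    Absolutes = lists:map(fun erlang:abs/1, [-1,2,-3]),
    {Reversed, Absolutes}.
```

Calling external_fun_sketch:demo() returns {[3,2,1],[1,2,3]}.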

apply/3 must look up the code for the function to execute in a hash table. Therefore, it will always be slower than a direct call or a fun call.

It no longer matters (from a performance point of view) whether you write

Module:Function(Arg1, Arg2)

or

apply(Module, Function, [Arg1,Arg2])

(The compiler internally rewrites the latter code into the former.)

The following code

apply(Module, Function, Arguments)

is slightly slower because the shape of the list of arguments is not known at compile time.
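The three forms discussed above can be placed side by side as follows. This is a minimal sketch; the module name call_forms_sketch and the functions double/1 and demo/0 are hypothetical. All three calls give the same result. They differ only in how much is known at compile time.

```erlang
-module(call_forms_sketch).
-export([double/1, demo/0]).

%% Exported so that it can be reached through Module:Function and apply/3.
double(X) -> 2 * X.

demo() ->
    Module = ?MODULE,
    Function = double,
    Arguments = [21],
    %% Number of arguments visible in the source code.
    A = Module:Function(21),
    %% Literal argument list: equivalent to the call above.
    B = apply(Module, Function, [21]),
    %% Argument list is a variable: its length is known only at run time.
    C = apply(Module, Function, Arguments),
    {A, B, C}.
```

Calling call_forms_sketch:demo() returns {42,42,42}.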

Memory usage in recursion

When writing recursive functions it is preferable to make them tail-recursive so that they can execute in constant memory space.

DO

list_length(List) ->
    list_length(List, 0).

list_length([], AccLen) ->
    AccLen; % Base case
list_length([_|Tail], AccLen) ->
    list_length(Tail, AccLen + 1). % Tail-recursive

DO NOT

list_length([]) ->
    0; % Base case
list_length([_ | Tail]) ->
    list_length(Tail) + 1. % Not tail-recursive
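The same accumulator technique carries over to other list traversals. As a further sketch, here is a tail-recursive function that sums a list of numbers. The module name tail_sum_sketch and the function sum are hypothetical names for this example.

```erlang
-module(tail_sum_sketch).
-export([sum/1]).

%% Public entry point: start with an accumulator of 0.
sum(List) ->
    sum(List, 0).

%% Base case: the accumulator holds the final result.
sum([], Acc) ->
    Acc;
%% The recursive call is the last operation, so no stack frame has to
%% be kept for each element.
sum([Head|Tail], Acc) ->
    sum(Tail, Acc + Head).
```

For example, tail_sum_sketch:sum([1,2,3]) returns 6.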