From 0264d301de02c5dd7b2a9a389294ea4a36127046 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?= Date: Tue, 10 Jan 2017 13:51:19 +0100 Subject: Remove paragraph mentioning improvements in R12B --- system/doc/efficiency_guide/introduction.xml | 8 -------- 1 file changed, 8 deletions(-) diff --git a/system/doc/efficiency_guide/introduction.xml b/system/doc/efficiency_guide/introduction.xml index ca4a41c798..b650008ae8 100644 --- a/system/doc/efficiency_guide/introduction.xml +++ b/system/doc/efficiency_guide/introduction.xml @@ -46,14 +46,6 @@ to find out where the performance bottlenecks are and optimize only the bottlenecks. Let other code stay as clean as possible.

-

Fortunately, compiler and runtime optimizations introduced in Erlang/OTP R12B make it easier to write code that is both clean and efficient. For example, the ugly workarounds needed in R11B and earlier releases to get the most speed out of binary pattern matching are no longer necessary. In fact, the ugly code is slower than the clean code (because the clean code has become faster, not because the uglier code has become slower).

-

This Efficiency Guide cannot really teach you how to write efficient code. It can give you a few pointers about what to avoid and what to use, and some understanding of how certain language features are implemented. -- cgit v1.2.3 From 184b2627f8908c8e6af033991ee831c3fb2f9f82 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?= Date: Tue, 10 Jan 2017 13:54:08 +0100 Subject: Don't call byte_size/1 and tuple_size/1 "new" --- system/doc/efficiency_guide/commoncaveats.xml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/system/doc/efficiency_guide/commoncaveats.xml b/system/doc/efficiency_guide/commoncaveats.xml index ecfeff0349..94b1c0b222 100644 --- a/system/doc/efficiency_guide/commoncaveats.xml +++ b/system/doc/efficiency_guide/commoncaveats.xml @@ -148,10 +148,10 @@ multiple_setelement(T0) ->

size/1 returns the size for both tuples and binaries.

-

Using the new BIFs tuple_size/1 and byte_size/1, introduced in R12B, gives the compiler and the runtime system more opportunities for optimization. Another advantage is that the new BIFs can help Dialyzer to find more bugs in your program.

+

Using the BIFs tuple_size/1 and byte_size/1 gives the compiler and the runtime system more opportunities for optimization. Another advantage is that the BIFs give Dialyzer more type information.
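A minimal sketch of the difference (illustrative module and function names, not from the patch):

-module(size_example).
-export([describe/1]).

%% tuple_size/1 and byte_size/1 used in guards tell both the compiler and
%% Dialyzer which type the argument has; size/1 accepts tuples and binaries
%% alike and therefore conveys no such information.
describe(Term) when tuple_size(Term) =:= 2 ->
    pair;
describe(Term) when is_binary(Term) ->
    {binary, byte_size(Term)};
describe(_Term) ->
    other.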

-- cgit v1.2.3 From 071b8c4470cc9f0d6bee6f00e00ca325531b4a01 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?= Date: Tue, 10 Jan 2017 13:55:27 +0100 Subject: Don't mention "tuple funs" at all "Tuples funs" was removed a long time ago. There is no need to even mention them. --- system/doc/efficiency_guide/functions.xml | 9 --------- 1 file changed, 9 deletions(-) diff --git a/system/doc/efficiency_guide/functions.xml b/system/doc/efficiency_guide/functions.xml index 4a8248e65c..1c34888bb5 100644 --- a/system/doc/efficiency_guide/functions.xml +++ b/system/doc/efficiency_guide/functions.xml @@ -183,15 +183,6 @@ explicit_map_pairs(Map, Xs0, Ys0) -> A fun contains an (indirect) pointer to the function that implements the fun.

-

Tuples are not fun(s). A "tuple fun", {Module,Function}, is not a fun. The cost for calling a "tuple fun" is similar to that of apply/3 or worse. Using "tuple funs" is strongly discouraged, as they might not be supported in a future Erlang/OTP release, and because there exists a superior alternative from R10B, namely the fun Module:Function/Arity syntax.
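A minimal sketch of the fun Module:Function/Arity alternative compared with apply/3 (illustrative names, not from the patch):

-module(fun_example).
-export([with_apply/1, with_fun/1]).

%% apply/3 looks up the function in a hash table on every call.
with_apply(List) ->
    apply(lists, reverse, [List]).

%% fun lists:reverse/1 is an external fun; calling it is an ordinary
%% fun call and avoids the per-call lookup that apply/3 performs.
with_fun(List) ->
    F = fun lists:reverse/1,
    F(List).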

-

apply/3 must look up the code for the function to execute in a hash table. It is therefore always slower than a direct call or a fun call.

-- cgit v1.2.3 From 953f57e3a86f3b714a634177e32f630b56a05240 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?= Date: Tue, 10 Jan 2017 14:29:08 +0100 Subject: Remove comparisons of binary handling between R11B and R12B Shorten the text by removing superfluous details about how binary handling was different in R11B. --- system/doc/efficiency_guide/binaryhandling.xml | 81 +++++++++----------------- 1 file changed, 26 insertions(+), 55 deletions(-) diff --git a/system/doc/efficiency_guide/binaryhandling.xml b/system/doc/efficiency_guide/binaryhandling.xml index 0295d18644..91fd9a7cd9 100644 --- a/system/doc/efficiency_guide/binaryhandling.xml +++ b/system/doc/efficiency_guide/binaryhandling.xml @@ -32,12 +32,9 @@ binaryhandling.xml -

In R12B, the most natural way to construct and match binaries is significantly faster than in earlier releases.

+

Binaries can be efficiently built in the following way:

-

To construct a binary, you can simply write as follows:

- -

DO (in R12B) / REALLY DO NOT (in earlier releases)

+

DO

my_list_to_binary(List, <<>>). @@ -47,21 +44,13 @@ my_list_to_binary([H|T], Acc) -> my_list_to_binary([], Acc) -> Acc.]]> -

In releases before R12B, Acc is copied in every iteration. In R12B, Acc is copied only in the first iteration and extra space is allocated at the end of the copied binary. In the next iteration, H is written into the extra space. When the extra space runs out, the binary is reallocated with more extra space. The extra space allocated (or reallocated) is twice the size of the existing binary data, or 256, whichever is larger.

- -

The most natural way to match binaries is now the fastest:

+

Binaries can be efficiently matched like this:

-

DO (in R12B)

+

DO

>) -> [H|my_binary_to_list(T)]; -my_binary_to_list(<<>>) -> [].]]> +my_binary_to_list(<<>>) -> [].]]>
How Binaries are Implemented @@ -138,10 +127,7 @@ my_binary_to_list(<<>>) -> [].]]> pointer to the binary data. For each field that is matched out of a binary, the position in the match context is incremented.

-

In R11B, a match context was only used during a binary matching operation.

- -

In R12B, the compiler tries to avoid generating code that +

The compiler tries to avoid generating code that creates a sub binary, only to shortly afterwards create a new match context and discard the sub binary. Instead of creating a sub binary, the match context is kept.

@@ -155,7 +141,7 @@ my_binary_to_list(<<>>) -> [].]]>
Constructing Binaries -

In R12B, appending to a binary or bitstring +

Appending to a binary or bitstring is specially optimized by the runtime system:

> %% Bin1 will be COPIED

Let us revisit the example in the beginning of the previous section:

-

DO (in R12B)

+

DO

>) -> [H|my_binary_to_list(T)]; @@ -304,15 +290,14 @@ my_binary_to_list(<<>>) -> [].]]> byte of the binary. 1 byte is matched out and the match context is updated to point to the second byte in the binary.

-

In R11B, at this point a sub binary would be created. In R12B, the compiler sees that there is no point in creating a sub binary, because there will soon be a call to a function (in this case,

At this point it would make sense to create a sub binary, but in this particular example the compiler sees that there will soon be a call to a function (in this case, to my_binary_to_list/1 itself) that immediately will create a new match context and discard the sub binary.

-

Therefore, in R12B, my_binary_to_list/1 calls itself +

Therefore my_binary_to_list/1 calls itself with the match context instead of with a sub binary. The instruction that initializes the matching operation basically does nothing when it sees that it was passed a match context instead of a binary.

@@ -321,34 +306,10 @@ my_binary_to_list(<<>>) -> [].]]>
the match context will simply be discarded (removed in the next garbage collection, as there is no longer any reference to it).

-

To summarize, my_binary_to_list/1 in R12B only needs to create one match context and no sub binaries. In R11B, if the binary contains N bytes, N+1 match contexts and N sub binaries are created.

- -

In R11B, the fastest way to match binaries is as follows:

+

To summarize, my_binary_to_list/1 only needs to create one match context and no sub binaries.

-

DO NOT (in R12B)

- - my_complicated_binary_to_list(Bin, 0). - -my_complicated_binary_to_list(Bin, Skip) -> - case Bin of - <<_:Skip/binary,Byte,_/binary>> -> - [Byte|my_complicated_binary_to_list(Bin, Skip+1)]; - <<_:Skip/binary>> -> - [] - end.]]> - -

This function cleverly avoids building sub binaries, but it cannot avoid building a match context in each recursion step. Therefore, in both R11B and R12B, my_complicated_binary_to_list/1 builds N+1 match contexts. (In a future Erlang/OTP release, the compiler might be able to generate code that reuses the match context.)

- -

Returning to my_binary_to_list/1, notice that the match context +

Notice that the match context in my_binary_to_list/1 was discarded when the entire binary had been traversed. What happens if the iteration stops before it has reached the end of the binary? Will the optimization still work?

@@ -544,5 +505,15 @@ count3(<<>>, Count) -> Count.]]> not matched out.

+ +
+ Historical Note + +

Binary handling was significantly improved in R12B. Because code that was efficient in R11B might not be efficient in R12B, and vice versa, earlier revisions of this Efficiency Guide contained some information about binary handling in R11B.

+
+ -- cgit v1.2.3 From fa04f8212d282ea1535c07683660de1a23565b0f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?= Date: Tue, 10 Jan 2017 14:44:00 +0100 Subject: Modernize section about list handling and list comprehensions --- system/doc/efficiency_guide/listhandling.xml | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/system/doc/efficiency_guide/listhandling.xml b/system/doc/efficiency_guide/listhandling.xml index 2ebc877820..ec258d7c2a 100644 --- a/system/doc/efficiency_guide/listhandling.xml +++ b/system/doc/efficiency_guide/listhandling.xml @@ -90,7 +90,7 @@ tail_recursive_fib(N, Current, Next, Fibs) ->

List comprehensions still have a reputation for being slow. They used to be implemented using funs, which used to be slow.

-

In recent Erlang/OTP releases (including R12B), a list comprehension:

+

A list comprehension:

@@ -102,7 +102,7 @@ tail_recursive_fib(N, Current, Next, Fibs) -> [Expr(E)|'lc^0'(Tail, Expr)]; 'lc^0'([], _Expr) -> []. -

In R12B, if the result of the list comprehension will obviously +

If the result of the list comprehension will obviously not be used, a list will not be constructed. For example, in this code:
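A comprehension of roughly this shape, evaluated only for its side effects (a sketch; the exact expression in the guide may differ):

%% The result of the comprehension is never used, so no list is built.
[io:put_chars(E) || E <- List],
ok.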

[]. +

The compiler also understands that assigning to '_' means that the value will not be used. Therefore, the code in the following example will also be optimized:
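A sketch of such code (same assumption as above about the exact expression):

%% Assigning to '_' tells the compiler that the value is unused,
%% so no list is built here either.
_ = [io:put_chars(E) || E <- List],
ok.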

+ + +
@@ -209,11 +217,11 @@ some_function(...),
Recursive List Functions -

In Section 7.2, the following myth was exposed: +

In the section about myths, the following myth was exposed: Tail-Recursive Functions are Much Faster Than Recursive Functions.
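A minimal sketch of the two styles being compared (illustrative function names, not from the patch):

%% Body-recursive version.
double_body([H|T]) -> [2*H|double_body(T)];
double_body([]) -> [].

%% Tail-recursive version that reverses the accumulated result at the end.
double_tail(List) -> double_tail(List, []).

double_tail([H|T], Acc) -> double_tail(T, [2*H|Acc]);
double_tail([], Acc) -> lists:reverse(Acc).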

-

To summarize, in R12B there is usually not much difference between +

There is usually not much difference between a body-recursive list function and tail-recursive function that reverses the list at the end. Therefore, concentrate on writing beautiful code and forget about the performance of your list functions. In the -- cgit v1.2.3 From 9595a90fd301e2049b822c8a4d712b5033a3e9d0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?= Date: Tue, 10 Jan 2017 14:48:26 +0100 Subject: Fix a typo in functions.xml --- system/doc/efficiency_guide/functions.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/system/doc/efficiency_guide/functions.xml b/system/doc/efficiency_guide/functions.xml index 1c34888bb5..1d0f1f68b7 100644 --- a/system/doc/efficiency_guide/functions.xml +++ b/system/doc/efficiency_guide/functions.xml @@ -65,7 +65,7 @@ atom_map1(six) -> 6. thus, quite efficient even if there are many values) to select which one of the first three clauses to execute (if any). - >If none of the first three clauses match, the fourth clause + If none of the first three clauses match, the fourth clause match as a variable always matches. If the guard test is_integer(Int) succeeds, the fourth -- cgit v1.2.3 From 947169af61bdd67d34fabd47a56be04e8468120d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?= Date: Tue, 10 Jan 2017 14:53:03 +0100 Subject: Remove mention of R12B Also don't say that there are no plans to make sharing-preserving copying default; it has been seriously suggested. --- system/doc/efficiency_guide/processes.xml | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/system/doc/efficiency_guide/processes.xml b/system/doc/efficiency_guide/processes.xml index f2d9712f51..bc9daa6666 100644 --- a/system/doc/efficiency_guide/processes.xml +++ b/system/doc/efficiency_guide/processes.xml @@ -146,14 +146,14 @@ loop() ->

Constant Pool -

Constant Erlang terms (also called literals) are now +

Constant Erlang terms (also called literals) are kept in constant pools; each loaded module has its own pool. - The following function does no longer build the tuple every time + The following function does not build the tuple every time it is called (only to have it discarded the next time the garbage collector was run), but the tuple is located in the module's constant pool:

-

DO (in R12B and later)

+

DO

days_in_month(M) -> element(M, {31,28,31,30,31,30,31,31,30,31,30,31}). @@ -235,9 +235,7 @@ true return the same value. Sharing has been lost.
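A sketch of how the loss of sharing can be observed (illustrative function name; erts_debug:size/1 counts a shared subterm once, erts_debug:flat_size/1 counts it every time it occurs):

share_loss() ->
    Shared = lists:seq(1, 100),
    T = {Shared, Shared},
    %% Sharing is present in the original term.
    true = erts_debug:size(T) < erts_debug:flat_size(T),
    self() ! T,
    receive
        Copy ->
            %% The copy made by message passing does not preserve sharing,
            %% so both sizes are now equal.
            erts_debug:size(Copy) =:= erts_debug:flat_size(Copy)
    end.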

In a future Erlang/OTP release, a way to (optionally) preserve sharing might be implemented. There are no plans to make preserving of sharing the default behaviour, as that would penalize the vast majority of Erlang applications.

+ way to (optionally) preserve sharing.

-- cgit v1.2.3