From 6513fc5eb55b306e2b1088123498e6c50b9e7273 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?=
Date: Thu, 12 Mar 2015 15:35:13 +0100
Subject: Update Efficiency Guide
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Language cleaned up by the technical writers xsipewe and tmanevik
from Combitech. Proofreading and corrections by Björn Gustavsson.
---
 system/doc/efficiency_guide/commoncaveats.xml | 135 +++++++++++++-------------
 1 file changed, 65 insertions(+), 70 deletions(-)

diff --git a/system/doc/efficiency_guide/commoncaveats.xml b/system/doc/efficiency_guide/commoncaveats.xml
index 551b0a03e6..71991d342f 100644
--- a/system/doc/efficiency_guide/commoncaveats.xml
+++ b/system/doc/efficiency_guide/commoncaveats.xml
@@ -18,7 +18,6 @@
 basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License.
-
 Common Caveats
@@ -29,49 +28,50 @@
 commoncaveats.xml
-

Here we list a few modules and BIFs to watch out for, and not only +

This section lists a few modules and BIFs to watch out for, not only from a performance point of view.

- The timer module + Timer Module

Creating timers using erlang:send_after/3 - and erlang:start_timer/3 + and + erlang:start_timer/3 is much more efficient than using the timers provided by the - timer module. The - timer module uses a separate process to manage the timers, - and that process can easily become overloaded if many processes + timer module in STDLIB. + The timer module uses a separate process to manage the timers. + That process can easily become overloaded if many processes create and cancel timers frequently (especially when using the SMP emulator).

-

The functions in the timer module that do not manage timers (such as - timer:tc/3 or timer:sleep/1), do not call the timer-server process - and are therefore harmless.

+

The functions in the timer module that do not manage timers + (such as timer:tc/3 or timer:sleep/1), do not call the + timer-server process and are therefore harmless.
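As an illustration of the BIF-based approach described above, the following is a minimal sketch rather than code from the guide. The module name `timer_sketch`, the function `demo/0`, and the message `timeout` are hypothetical; only `erlang:send_after/3` and `erlang:cancel_timer/1` are existing BIFs.

```erlang
-module(timer_sketch).
-export([demo/0]).

%% Hypothetical demo: schedule a message to the calling process with the
%% BIF, so the timer-server process of the timer module is not involved.
demo() ->
    Ref = erlang:send_after(100, self(), timeout),
    receive
        timeout ->
            {fired, Ref}
    after 1000 ->
        %% Give up waiting and cancel the pending timer.
        _ = erlang:cancel_timer(Ref),
        not_fired
    end.
```

Calling `timer_sketch:demo()` blocks the caller until the message arrives or the one-second guard expires.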

list_to_atom/1 -

Atoms are not garbage-collected. Once an atom is created, it will never - be removed. The emulator will terminate if the limit for the number - of atoms (1048576 by default) is reached.

+

Atoms are not garbage-collected. Once an atom is created, it is never + removed. The emulator terminates if the limit for the number + of atoms (1,048,576 by default) is reached.

-

Therefore, converting arbitrary input strings to atoms could be - dangerous in a system that will run continuously. - If only certain well-defined atoms are allowed as input, you can use +

Therefore, converting arbitrary input strings to atoms can be + dangerous in a system that runs continuously. + If only certain well-defined atoms are allowed as input, list_to_existing_atom/1 + can be used to guard against a denial-of-service attack. (All atoms that are allowed - must have been created earlier, for instance by simply using all of them + must have been created earlier, for example, by simply using all of them in a module and loading that module.)

Using list_to_atom/1 to construct an atom that is passed to - apply/3 like this

- + apply/3 as follows is quite expensive and not recommended + in time-critical code:

-apply(list_to_atom("some_prefix"++Var), foo, Args) - -

is quite expensive and is not recommended in time-critical code.

+apply(list_to_atom("some_prefix"++Var), foo, Args)
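The guard based on list_to_existing_atom/1 described earlier in this section could look roughly like the sketch below. The module `atom_sketch`, the function `parse_command/1`, and the atoms `start`, `stop`, and `status` are assumptions made for illustration. The membership check is an extra step added in this sketch, because an atom can already exist without being an allowed input.

```erlang
-module(atom_sketch).
-export([parse_command/1]).

%% Hypothetical whitelist. Mentioning the atoms in a loaded module is
%% what guarantees that they already exist.
allowed() -> [start, stop, status].

parse_command(String) ->
    %% list_to_existing_atom/1 raises badarg instead of creating an atom.
    try list_to_existing_atom(String) of
        Atom ->
            case lists:member(Atom, allowed()) of
                true -> {ok, Atom};
                false -> {error, not_allowed}
            end
    catch
        error:badarg -> {error, unknown_command}
    end.
```

With this shape, arbitrary input strings never add entries to the atom table.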
@@ -81,25 +81,25 @@ apply(list_to_atom("some_prefix"++Var), foo, Args) length of the list, as opposed to tuple_size/1, byte_size/1, and bit_size/1, which all execute in constant time.

-

Normally you don't have to worry about the speed of length/1, - because it is efficiently implemented in C. In time critical-code, though, - you might want to avoid it if the input list could potentially be very long.

+

Normally, there is no need to worry about the speed of length/1, + because it is efficiently implemented in C. In time-critical code, + you might want to avoid it if the input list could potentially be very + long.

Some uses of length/1 can be replaced by matching. - For instance, this code

- + For example, the following code:

foo(L) when length(L) >= 3 -> ... -

can be rewritten to

+

can be rewritten to:

foo([_,_,_|_]=L) -> ... -

(One slight difference is that length(L) will fail if the L - is an improper list, while the pattern in the second code fragment will - accept an improper list.)

+

One slight difference is that length(L) fails if L + is an improper list, while the pattern in the second code fragment + accepts an improper list.

@@ -107,50 +107,49 @@ foo([_,_,_|_]=L) ->

setelement/3 copies the tuple it modifies. Therefore, updating a tuple in a loop - using setelement/3 will create a new copy of the tuple every time.

+ using setelement/3 creates a new copy of the tuple every time.

There is one exception to the rule that the tuple is copied. If the compiler clearly can see that destructively updating the tuple would - give exactly the same result as if the tuple was copied, the call to - setelement/3 will be replaced with a special destructive setelement - instruction. In the following code sequence

- + give the same result as if the tuple was copied, the call to + setelement/3 is replaced with a special destructive setelement + instruction. In the following code sequence, the first setelement/3 + call copies the tuple and modifies the ninth element:

multiple_setelement(T0) -> T1 = setelement(9, T0, bar), T2 = setelement(7, T1, foobar), setelement(5, T2, new_value). -

the first setelement/3 call will copy the tuple and modify the - ninth element. The two following setelement/3 calls will modify +

The two following setelement/3 calls modify the tuple in place.

-

For the optimization to be applied, all of the followings conditions +

For the optimization to be applied, all of the following conditions + must be true:

The indices must be integer literals, not variables or expressions. The indices must be given in descending order. - There must be no calls to other function in between the calls to + There must be no calls to another function in between the calls to setelement/3. The tuple returned from one setelement/3 call must only be used in the subsequent call to setelement/3. -

If it is not possible to structure the code as in the multiple_setelement/1 +

If the code cannot be structured as in the multiple_setelement/1 example, the best way to modify multiple elements in a large tuple is to - convert the tuple to a list, modify the list, and convert the list back to + convert the tuple to a list, modify the list, and convert it back to a tuple.
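The tuple-to-list round trip mentioned above can be sketched as follows. This is one possible shape, not the guide's own code: the module `tuple_sketch`, the function `update_elements/2`, and the `{Index, NewValue}` update format are hypothetical.

```erlang
-module(tuple_sketch).
-export([update_elements/2]).

%% Updates is a list of {Index, NewValue} pairs with 1-based indices.
%% The tuple is converted to a list once, rewritten in a single pass,
%% and converted back, instead of being copied once per setelement/3 call.
update_elements(Tuple, Updates) ->
    Wanted = maps:from_list(Updates),
    Replace = fun(Elem, Index) ->
                      {maps:get(Index, Wanted, Elem), Index + 1}
              end,
    {NewList, _} = lists:mapfoldl(Replace, 1, tuple_to_list(Tuple)),
    list_to_tuple(NewList).
```

For example, `tuple_sketch:update_elements({a,b,c,d}, [{2,x},{4,y}])` rewrites the second and fourth elements in one pass.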

size/1 -

size/1 returns the size for both tuples and binary.

+

size/1 returns the size for both tuples and binaries.

-

Using the new BIFs tuple_size/1 and byte_size/1 introduced - in R12B gives the compiler and run-time system more opportunities for - optimization. A further advantage is that the new BIFs could help Dialyzer +

Using the new BIFs tuple_size/1 and byte_size/1, introduced + in R12B, gives the compiler and the runtime system more opportunities for + optimization. Another advantage is that the new BIFs can help Dialyzer to find more bugs in your program.

@@ -159,22 +158,21 @@ multiple_setelement(T0) ->

It is usually more efficient to split a binary using matching instead of calling the split_binary/2 function. Furthermore, mixing bit syntax matching and split_binary/2 - may prevent some optimizations of bit syntax matching.

+ can prevent some optimizations of bit syntax matching.

DO

 <<Bin1:Num/binary,Bin2/binary>> = Bin,

DO NOT

- {Bin1,Bin2} = split_binary(Bin, Num) - + {Bin1,Bin2} = split_binary(Bin, Num)
- The '--' operator -

Note that the '--' operator has a complexity - proportional to the product of the length of its operands, - meaning that it will be very slow if both of its operands + Operator "--" +

The "--" operator has a complexity + proportional to the product of the length of its operands. + This means that the operator is very slow if both of its operands are long lists:

DO NOT

@@ -182,42 +180,39 @@ multiple_setelement(T0) -> HugeList1 -- HugeList2

Instead use the ordsets - module:

+ module in STDLIB:

DO

HugeSet1 = ordsets:from_list(HugeList1), HugeSet2 = ordsets:from_list(HugeList2), - ordsets:subtract(HugeSet1, HugeSet2) - + ordsets:subtract(HugeSet1, HugeSet2) -

Obviously, that code will not work if the original order +

Obviously, that code does not work if the original order of the list is important. If the order of the list must be - preserved, do like this:

+ preserved, do as follows:

DO

-

Subtle note 1: This code behaves differently from '--' - if the lists contain duplicate elements. (One occurrence - of an element in HugeList2 will remove all +

This code behaves differently from "--" + if the lists contain duplicate elements (one occurrence + of an element in HugeList2 removes all occurrences in HugeList1.)

+

Also, this code compares list elements using the + "==" operator, while "--" uses the "=:=" operator. + If that difference is important, sets can be used instead of + gb_sets, but sets:from_list/1 is much + slower than gb_sets:from_list/1 for long lists.
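An order-preserving subtraction with gb_sets, as discussed above, can be sketched like this. The module `subtract_sketch` and the function `subtract/2` are hypothetical names; `gb_sets:from_list/1` and `gb_sets:is_element/2` are the STDLIB functions.

```erlang
-module(subtract_sketch).
-export([subtract/2]).

%% Keeps the order of List1 and drops every element that occurs in List2.
%% Unlike "--", one occurrence in List2 removes all occurrences in List1.
subtract(List1, List2) ->
    Set = gb_sets:from_list(List2),
    [E || E <- List1, not gb_sets:is_element(E, Set)].
```

For instance, `subtract_sketch:subtract([1,2,3,2,1], [2])` drops both occurrences of 2, whereas `[1,2,3,2,1] -- [2]` removes only the first one.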

-

Subtle note 2: This code compares lists elements using the - '==' operator, while '--' uses the '=:='. If - that difference is important, sets can be used instead of - gb_sets, but note that sets:from_list/1 is much - slower than gb_sets:from_list/1 for long lists.

- -

Using the '--' operator to delete an element +

Using the "--" operator to delete an element from a list is not a performance problem:

OK

- HugeList1 -- [Element] - + HugeList1 -- [Element]