aboutsummaryrefslogtreecommitdiffstats
path: root/lib/compiler/test/beam_except_SUITE.erl
AgeCommit message (Collapse)Author
2019-03-09Optimize tail-recursive calls of BIFsBjörn Gustavsson
BEAM currently does not call BIFs at the end of a function in a tail-recursive way. That is, when calling a BIF at the end of a function, the BIF is first called, and then the stack frame is deallocated, and then control is transferred to the caller. If there is no stack frame when a BIF is called in the tail position, the loader will emit a sequence of three instructions: first an instruction that allocates a stack frame and saves the continuation pointer (`allocate`), then an instruction that calls the BIF (`call_bif`), and lastly an instruction that deallocates the stack frame and returns to the caller (`deallocate_return`). The old compiler would essentially allocate a stack frame for each clause in a function, so it would not be that common that a BIF was called in the tail position when there was no stack frame, so the three-instruction sequence was deemed acceptable. The new compiler only allocates stack frames when truly needed, so the three-instruction BIF call sequence has become much more common. This commit introduces a new `call_bif_only` instruction so that only one instruction will be needed when calling a BIF in the tail position when there is no stack frame. This instruction is also used when there is a stack frame to make it possible to deallocate the stack frame **before** calling the BIF, which may make a subsequent garbage collection at the end of the BIF call cheaper (copying less garbage). The one downside of this change is that the function that called the BIF will not be included in the stack backtrace (similar to how a tail-recursive call to an Erlang function will not be included in the backtrace). That was the quick summary of the commit. Here comes a detailed look at how BIF calls are translated by the loader. The first example is a function that calls `setelement/3` in the tail position: update_no_stackframe(X) -> setelement(5, X, new_value). Here is the BEAM code: {function, update_no_stackframe, 1, 12}. {label,11}. {line,[...]}. {func_info,{atom,t},{atom,update_no_stackframe},1}. {label,12}. {move,{x,0},{x,1}}. {move,{atom,new_value},{x,2}}. {move,{integer,5},{x,0}}. {line,[...]}. {call_ext_only,3,{extfunc,erlang,setelement,3}}. Because there is no stack frame, the `call_ext_only` instruction will be used to call `setelement/3`: {call_ext_only,3,{extfunc,erlang,setelement,3}}. The loader will transform this instruction to a three-instruction sequence: 0000000020BD8130: allocate_tt 0 3 0000000020BD8138: call_bif_e erlang:setelement/3 0000000020BD8148: deallocate_return_Q 0 Using the `call_bif_only` instruction introduced in this commit, only one instruction is needed: 000000005DC377F0: call_bif_only_e erlang:setelement/3 `call_bif_only` calls the BIF and returns to the caller. Now let's look at a function that already has a stack frame when `setelement/3` is called: update_with_stackframe(X) -> foobar(X), setelement(5, X, new_value). Here is the BEAM code: {function, update_with_stackframe, 1, 14}. {label,13}. {line,[...]}. {func_info,{atom,t},{atom,update_with_stackframe},1}. {label,14}. {allocate,1,1}. {move,{x,0},{y,0}}. {line,[...]}. {call,1,{f,16}}. {move,{y,0},{x,1}}. {move,{atom,new_value},{x,2}}. {move,{integer,5},{x,0}}. {line,[...]}. {call_ext_last,3,{extfunc,erlang,setelement,3},1}. Since there is a stack frame, the `call_ext_last` instruction will be used to deallocate the stack frame and call the function: {call_ext_last,3,{extfunc,erlang,setelement,3},1}. Before this commit, the loader would translate this instruction to: 0000000020BD81B8: call_bif_e erlang:setelement/3 0000000020BD81C8: deallocate_return_Q 1 That is, the BIF is called before deallocating the stack frame and returning to the calling function. After this commit, the loader will translate the `call_ext_last` like this: 000000005DC37868: deallocate_Q 1 000000005DC37870: call_bif_only_e erlang:setelement/3 There are still two instructions, but now the stack frame will be deallocated before calling the BIF, which could make the potential garbage collection after the BIF call slightly more efficient (copying less garbage). We could have introduced a `call_bif_last` instruction, but the code for calling a BIF is relatively large and there does not seem be a practical way to share the code between `call_bif` and `call_bif_only` (since the difference is at the end, after the BIF call). Therefore, we did not want to clone the BIF calling code yet another time to make a `call_bif_last` instruction.
2019-01-31Fix internal consistency failure caused by beam_exceptBjörn Gustavsson
Fix a bug where the number of live registers in a `bs_get_tail` instruction was too low. Consider this example: -export([bs_get_tail/2]). bs_get_tail(Bin, Config) -> bs_get_tail_1(Bin, 0, 0, Config). bs_get_tail_1(<<_:32, Rest/binary>>, Z1, Z2, F1) -> {Rest,Z1,Z2,F1}. `beam_validator` would emit the following diagnostics: t: function bs_get_tail_1/4+2: Internal consistency check failed - please report this bug. Instruction: {func_info,{atom,t},{atom,bs_get_tail_1},4} Error: {uninitialized_reg,{x,3}}: Here is the part of the code that generates the `function_clause` exception before the optimization: {test_heap,6,4}. {put_list,{x,3},nil,{x,2}}. {put_list,{integer,0},{x,2},{x,2}}. {put_list,{integer,0},{x,2},{x,2}}. {bs_set_position,{x,1},{x,0}}. {bs_get_tail,{x,1},{x,0},3}. %3 live registers. {test_heap,2,3}. {put_list,{x,0},{x,2},{x,1}}. {move,{atom,function_clause},{x,0}}. {line,[{location,"t.erl",8}]}. {call_ext_only,2,{extfunc,erlang,error,2}}. The `bs_get_tail` instruction expects that 3 registers will be live at this point. `beam_except` rewrites the code like this: {bs_set_position,{x,1},{x,0}}. {bs_get_tail,{x,1},{x,0},3}. %Still 3. Too low. {move,{integer,0},{x,1}}. {move,{integer,0},{x,2}}. {jump,{f,3}}. Now the number of live registers in `bs_get_tail` is too low, because the `{x,3}` register will become undefined. This commit adds code to update the number of live registers in the `bs_get_tail` instruction, producing this code: {bs_set_position,{x,1},{x,0}}. {bs_get_tail,{x,1},{x,0},4}. %Adjusted to 4. {move,{integer,0},{x,1}}. {move,{integer,0},{x,2}}. {jump,{f,3}}.
2019-01-29Enhance optimization of function_clause exceptionsBjörn Gustavsson
There is an optimization for reducing the number of instructions needed to generate a `function_clause`. After the latest improvements of the type optimization pass, that optimization is not always applied. Here is an example: -export([foo/3]). foo(X, Y, Z) -> bar(a, X, Y, Z). bar(a, X, Y, Z) when is_tuple(X) -> {X,Y,Z}. Note that the compiler internally adds a clause to each function to generate a `function_clause` exception. Thus: bar(a, X, Y, Z) when is_tuple(X) -> {X,Y,Z}; bar(A1, A2, A3, A4) -> erlang:error(function_clause, [A1,A2,A3,A4]). Optimizations will rewrite the code basically like this: bar(_, X, Y, Z) when is_tuple(X) -> {X,Y,Z}; bar(_, A2, A3, A4) -> erlang:error(function_clause, [a,A2,A3,A4]). Note the `a` as the first element of the list of arguments. It will prevent the optimization of the `function_clause` exception. The BEAM code for `bar/4` looks like this: {function, bar, 4, 4}. {label,3}. {line,[{location,"t.erl",8}]}. {func_info,{atom,t},{atom,bar},4}. {label,4}. {'%',{type_info,{x,0},{atom,a}}}. {test,is_tuple,{f,5},[{x,1}]}. {test_heap,4,4}. {put_tuple2,{x,0},{list,[{x,1},{x,2},{x,3}]}}. return. {label,5}. {test_heap,8,4}. {put_list,{x,3},nil,{x,0}}. {put_list,{x,2},{x,0},{x,0}}. {put_list,{x,1},{x,0},{x,0}}. {put_list,{atom,a},{x,0},{x,1}}. {move,{atom,function_clause},{x,0}}. {line,[{location,"t.erl",8}]}. {call_ext,2,{extfunc,erlang,error,2}}. The code after label 5 is the clause that generates the `function_clause` exception. This commit generalizes the optimization so that it can be applied for this function: {function, bar, 4, 4}. {label,3}. {line,[{location,"t.erl",8}]}. {func_info,{atom,t},{atom,bar},4}. {label,4}. {'%',{type_info,{x,0},{atom,a}}}. {test,is_tuple,{f,5},[{x,1}]}. {test_heap,4,4}. {put_tuple2,{x,0},{list,[{x,1},{x,2},{x,3}]}}. return. {label,5}. {move,{atom,a},{x,0}}. {jump,{f,3}}. For this particular function, it would be safe to omit the `move` instruction before the `{jump,{f,3}}` instruction, but it would not be safe in general to omit `move` instructions.
2018-10-30beam_except: Generalize translation to func_info instructionsBjörn Gustavsson
The `beam_except` pass replaces some calls to `erlang:error/1` or `erlang:error/2` with specialized instructions in order to reduce the size of the BEAM code. In functions that do binary matching, `beam_except` would fail to translate the instructions that generate a `function_clause` exception. Here is an example: bsum(<<H:16,T/binary>>, Acc) -> bsum(T, Acc+H); bsum(<<>>, Acc) -> Acc. The BEAM code that generates the `function_clause` exception looks like this: {label,4}. {test_heap,2,3}. {put_list,{x,1},nil,{x,1}}. %% The following two instructions prevents the translation. {bs_set_position,{x,2},{x,0}}. {bs_get_tail,{x,2},{x,0},3}. {test_heap,2,2}. {put_list,{x,0},{x,1},{x,1}}. {move,{atom,function_clause},{x,0}}. {line,...}. {call_ext,2,{extfunc,erlang,error,2}}. Make the translation of `function_clause` exceptions smarter, so that the following code will be produced: {label,4}. {bs_set_position,{x,2},{x,0}}. {bs_get_tail,{x,2},{x,0},3}. {jump,{f,1}}. %Jump to `func_info` instruction. (This issue was noticed when looking at the code generated by https://github.com/tomas-abrahamsson/gpb.)
2018-09-21Update copyright yearHenrik Nord
2018-07-06Call test_lib:recompile/1 from init_per_suite/1Björn Gustavsson
Call test_lib:recompile/1 from init_per_suite/1 instead of from all/0. That makes it easy to find the log from the compilation in the log file for the init_per_suite/1 test case.
2016-05-25beam_expect: Correctly handle blocks with multiple allocsBjörn Gustavsson
A negative allocation could be calculated if a block had multiple allocations. Make sure to process the block in the right order so that the correct allocation is processed. Also add an assertion. This bug was often not noticed because beam_type usually silently recalculates the allocation amount in test_heap/2 instructions.
2016-03-15update copyright-yearHenrik Nord
2015-06-18Change license text to APLv2Bruce Yinhe
2014-01-24beam_except: Eliminate compiler crashBjörn Gustavsson
Code such as: bar(X) -> case {X+1} of 1 -> ok end. would crash the beam_except pass of the compiler. The reason for the crash is that the '+' operator would add a line/1 instruction that the beam_except pass was not prepared to handle. Reported-by: Erik Søe Sørensen
2013-01-25Update copyright yearsBjörn-Egil Dahlberg
2012-10-23Correct typo in test suite nameBjörn Gustavsson
It should be beam_except_SUITE, since it tests the beam_except module (introduced in 726f6e4c7afe8ce37b30eedbebe583e7b9bfc51b).