otp.git - Mirror of Erlang/OTP repository.

Age	Commit message	Author
2019-04-29	compiler: Propagate match context position on fail path	John Högberg

2019-04-23	beam_validator: Don't infer types for dead values	John Högberg

2019-03-28	Merge branch 'bjorn/compiler/cuddle-with-tests'	Björn Gustavsson
	* bjorn/compiler/cuddle-with-tests:
	  Verify the highest opcode for the r21 test suites
	  Add test_lib:highest_opcode/1
	  sys_core_fold: Simplify case_expand_var/2
	  beam_validator: Remove uncovered lines in lists_mod_return_type/3
	  Cover return type determination of lists functions
2019-03-26	compiler: Fully disable no_return optimizations in try blocks	John Högberg
	Validation could fail when a function that never returned was used in a try block (see attached test case). It's possible to solve this without disabling the optimization as the generated code is sound, but I'm not comfortable making such a large change this close to the OTP 22 release.
2019-03-25	Verify the highest opcode for the r21 test suites	Björn Gustavsson

2019-03-25	Add test_lib:highest_opcode/1	Björn Gustavsson

2019-03-25	Cover return type determination of lists functions	Björn Gustavsson

2019-03-18	beam_validator: Infer types on both sides of '=:='	John Högberg

2019-03-13	Merge pull request #2177 from bjorng/bjorn/erts/tail-recursive-bifs	Björn Gustavsson
	Optimize tail-recursive calls of BIFs OTP-15674
2019-03-09	Optimize tail-recursive calls of BIFs	Björn Gustavsson
	BEAM currently does not call BIFs at the end of a function in a tail-recursive way. That is, when calling a BIF at the end of a function, the BIF is first called, then the stack frame is deallocated, and then control is transferred to the caller.
	If there is no stack frame when a BIF is called in the tail position, the loader will emit a sequence of three instructions: first an instruction that allocates a stack frame and saves the continuation pointer (`allocate`), then an instruction that calls the BIF (`call_bif`), and lastly an instruction that deallocates the stack frame and returns to the caller (`deallocate_return`).
	The old compiler would essentially allocate a stack frame for each clause in a function, so it would not be that common that a BIF was called in the tail position when there was no stack frame, so the three-instruction sequence was deemed acceptable. The new compiler only allocates stack frames when truly needed, so the three-instruction BIF call sequence has become much more common.
	This commit introduces a new `call_bif_only` instruction so that only one instruction will be needed when calling a BIF in the tail position when there is no stack frame. This instruction is also used when there is a stack frame to make it possible to deallocate the stack frame before calling the BIF, which may make a subsequent garbage collection at the end of the BIF call cheaper (copying less garbage).
	The one downside of this change is that the function that called the BIF will not be included in the stack backtrace (similar to how a tail-recursive call to an Erlang function will not be included in the backtrace).
	That was the quick summary of the commit. Here comes a detailed look at how BIF calls are translated by the loader. The first example is a function that calls `setelement/3` in the tail position:
	    update_no_stackframe(X) ->
	        setelement(5, X, new_value).
	Here is the BEAM code:
	    {function, update_no_stackframe, 1, 12}.
	      {label,11}.
	        {line,[...]}.
	        {func_info,{atom,t},{atom,update_no_stackframe},1}.
	      {label,12}.
	        {move,{x,0},{x,1}}.
	        {move,{atom,new_value},{x,2}}.
	        {move,{integer,5},{x,0}}.
	        {line,[...]}.
	        {call_ext_only,3,{extfunc,erlang,setelement,3}}.
	Because there is no stack frame, the `call_ext_only` instruction will be used to call `setelement/3`:
	    {call_ext_only,3,{extfunc,erlang,setelement,3}}.
	The loader will transform this instruction to a three-instruction sequence:
	    0000000020BD8130: allocate_tt 0 3
	    0000000020BD8138: call_bif_e erlang:setelement/3
	    0000000020BD8148: deallocate_return_Q 0
	Using the `call_bif_only` instruction introduced in this commit, only one instruction is needed:
	    000000005DC377F0: call_bif_only_e erlang:setelement/3
	`call_bif_only` calls the BIF and returns to the caller.
	Now let's look at a function that already has a stack frame when `setelement/3` is called:
	    update_with_stackframe(X) ->
	        foobar(X),
	        setelement(5, X, new_value).
	Here is the BEAM code:
	    {function, update_with_stackframe, 1, 14}.
	      {label,13}.
	        {line,[...]}.
	        {func_info,{atom,t},{atom,update_with_stackframe},1}.
	      {label,14}.
	        {allocate,1,1}.
	        {move,{x,0},{y,0}}.
	        {line,[...]}.
	        {call,1,{f,16}}.
	        {move,{y,0},{x,1}}.
	        {move,{atom,new_value},{x,2}}.
	        {move,{integer,5},{x,0}}.
	        {line,[...]}.
	        {call_ext_last,3,{extfunc,erlang,setelement,3},1}.
	Since there is a stack frame, the `call_ext_last` instruction will be used to deallocate the stack frame and call the function:
	    {call_ext_last,3,{extfunc,erlang,setelement,3},1}.
	Before this commit, the loader would translate this instruction to:
	    0000000020BD81B8: call_bif_e erlang:setelement/3
	    0000000020BD81C8: deallocate_return_Q 1
	That is, the BIF is called before deallocating the stack frame and returning to the calling function.
	After this commit, the loader will translate the `call_ext_last` like this:
	    000000005DC37868: deallocate_Q 1
	    000000005DC37870: call_bif_only_e erlang:setelement/3
	There are still two instructions, but now the stack frame will be deallocated before calling the BIF, which could make the potential garbage collection after the BIF call slightly more efficient (copying less garbage).
	We could have introduced a `call_bif_last` instruction, but the code for calling a BIF is relatively large and there does not seem to be a practical way to share the code between `call_bif` and `call_bif_only` (since the difference is at the end, after the BIF call). Therefore, we did not want to clone the BIF calling code yet another time to make a `call_bif_last` instruction.
2019-03-08	beam_ssa_opt: Fix crash in ssa_opt_float	John Högberg
	For reasons better explained in the source code, ssa_opt_float skips optimizing inside guards but it failed to do so consistently; while the pass never processed guard blocks, it was still possible to erroneously defer error checking to a guard block, crashing the compiler once it realized its state was invalid.
2019-03-06	beam_validator: Fix type subtraction on select_* and inequality	John Högberg
	Type subtraction never resulted in the 'none' type, even when it was obvious that it should. Once that was fixed it became apparent that inequality checks also fell into the same subtraction trap that the type pass warned about in a comment.
	This then led to another funny problem with select_val, consider the following code:
	    {bif,'>=',{f,0},[{x,0},{integer,1}],{x,0}}.
	    {select_val,{x,0},{f,70},{list,[{atom,false},{f,69},
	                                    {atom,true},{f,68}]}}.
	The validator knows that '>=' can only return a boolean, so once it has subtracted 'false' and 'true' it killed the state because all valid branches had been taken, so validation would crash once it tried to branch off the fail label.
2019-03-05	beam_validator: Refactor type conflict resolution	John Högberg
	The current type conflict resolution works well for the example case in the comment, but doesn't handle branched code properly, consider the following:
	    {label,2}.
	      {test,is_tagged_tuple,{f,ignored},[{x,0},3,{atom,r}]}.
	      {allocate_zero,2,1}.
	      {move,{x,0},{y,0}}.
	      %% {y,0} is known to be {r, _, _} now.
	      {get_tuple_element,{x,0},2,{x,0}}.
	      {'try',{y,1},{f,3}}.
	      %% ... snip ...
	      {jump,{f,5}}.
	    {label,3}.
	      {try_case,{y,1}}.
	      %% {x,0} is the error class (an atom), {x,1} is the error term.
	      {test,is_eq_exact,{f,ignored},[{x,0},{y,0}]}.
	      %% ... since tuples and atoms can't meet, the type of {y,0} is
	      %% now {atom,[]} because the current code assumes the type
	      %% we're updating with.
	      {move,{x,1},{x,0}}.
	      {jump,{f,5}}.
	    {label,5}.
	      %% ... joining tuple (block 2) and atom (block 3) means 'term',
	      %% so the get_tuple_element instruction fails to validate
	      %% despite this being unreachable from block 3.
	      {test_heap,3,1}.
	      {get_tuple_element,{y,0},1,{x,1}}.
	      {put_tuple2,{x,0},{list,[{x,1},{x,0}]}}.
	      {deallocate,2}.
	      return.
	This commit kills the state on type conflicts, making unreachable instructions truly unreachable.
2019-02-27	beam_validator: Don't explode when building terms in receive	John Högberg
	Building terms with fragile contents is okay because the GC is disabled during loop_rec, and the resulting term won't be reachable from the root set afterwards. ERL-862
2019-02-27	beam_validator: Track types by value rather than by register	John Högberg
	This is a rather subtle but important distinction. While tracking types on a per-register basis is fairly effective, it forces us to track which registers alias each other, and makes it tricky to infer types over large blocks of code as instruction arguments may have been clobbered between definition and inference. Tracking types on a per-value basis makes us immune to these problems.
2019-02-26	beam_jump: Fail label of select_val is unsafe for move elimination	John Högberg
	Consider the following code:
	    bme(Int) ->
	        TagInt = Int band 2#111,
	        Tag = case TagInt of
	                  0 -> a;
	                  1 -> b;
	                  2 -> c;
	                  3 -> d;
	                  4 -> e;
	                  5 -> f;
	                  6 -> g;
	                  7 -> h
	              end,
	        case Tag of
	            g -> expects_g(TagInt, Tag);
	            h -> expects_h(TagInt, Tag);
	            _ -> Tag = id(Tag), ok
	        end.
	    expects_g(6, Atom) ->
	        Atom = id(g),
	        ok.
	    expects_h(7, Atom) ->
	        Atom = id(h),
	        ok.
	The type optimization pass would recognize that TagInt can only be [0 .. 7], so the first 'case' would select_val over [0 .. 6] and swap out the fail label with the block for 7. A later optimization would merge this block with 'expects_h' in the second case, as the latter is only reachable from the former.
	... but this broke down when the move elimination optimization didn't take the fail label of the first select_val into account. This caused it to believe that the only way to reach 'expects_h' was through the second case when 'Tag' =:= 'h', which made it remove the move instruction added in the first case, passing garbage to expects_h/2.
2019-02-21	sys_core_fold: Remove an unsafe optimization	Björn Gustavsson
	`sys_core_fold` has an optimization of repeated pattern matching. For example, when a record is matched the first time, the pattern is remembered. When the same record is matched again, the matching does not need to be repeated, but the variables bound in the first matching can be re-used.
	It turns out that there is a name capture problem when the old inliner is used. The old inliner is used when explicitly inlining certain functions, and by the compiler test suites for testing the compiler.
	The name capture problem could be eliminated by more aggressive variable renaming when inlining. But, fortunately, given the new SSA passes, this optimization is no longer as essential as it used to be. Removing the optimization turns out to be mostly beneficial, leading to a smaller stack frame in many cases.
	Also remove the optimizations of `element/2`, `is_record/3`, and `setelement/3` from `sys_core_fold`. Because matched patterns are no longer remembered, those optimizations can very rarely be applied any more. (Those same optimizations are already done in `beam_ssa_type`.)
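A minimal sketch of the kind of source code the removed optimization targeted, assuming that two matches of the same variable against the same record count as "repeated pattern matching". The module, record, and function names are hypothetical and do not come from the commit or the test suites.

```erlang
-module(repeated_match_example).
-export([summary/1]).

%% Hypothetical record used only for illustration.
-record(state, {count = 0, name}).

%% S is matched against #state{} twice. The second match repeats the
%% tuple size and record tag tests that the first match has already
%% performed on S, which is the redundancy the old optimization removed.
summary(S) ->
    #state{count = Count} = S,
    #state{name = Name} = S,
    {Name, Count}.
```

Calling `repeated_match_example:summary({state, 3, joe})` returns `{joe, 3}` regardless of whether the redundant tests are optimized away; only the generated code differs.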
2019-02-21	beam_validator: Refactor liveness/stack initialization checks	John Högberg

2019-02-21	beam_validator: Refactor try/catch handling	John Högberg

2019-02-19	Do the destructive setelement optimization in SSA	Björn Gustavsson
	The expansion of record field updates, when more than one field is updated, but not a majority of the fields, will create a sequence of calls to `erlang:setelement(Index, Tuple, Value)` where Tuple in the first call is the original record tuple, and in the subsequent calls Tuple is the result of the previous call. Furthermore, all Index values are constant positive integers, and the first call to `setelement` will have the greatest index. Thus all the following calls do not actually need to test at run-time whether Tuple has type tuple, nor that the index is within the tuple bounds.
	Since OTP R7, the `sys_core_dsetel` pass, run as the very last Core Erlang pass, has optimized this sequence of `setelement` calls to use a special destructive version of `setelement` (called `set_tuple_element`) for all but the very first `setelement` in the sequence.
	It turns out that the presence of the `set_tuple_element` in SSA code is awkward and can prevent or complicate type analysis and aggressive optimizations.
	Therefore, this commit removes the `sys_core_dsetel` pass and reimplements it for SSA code. The optimization will be done in the `beam_ssa_pre_codegen` pass (that is, just before code generation and after running all other SSA code optimization passes).
	In most cases, the resulting BEAM code is identical to previous code. For a few modules, the BEAM code is actually slightly better, with smaller stack frames.
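The expansion described above can be sketched by hand as follows. This is an approximation under stated assumptions, not the compiler's actual output: the module, the record `r`, and both function names are hypothetical, and the real expansion reports a bad record differently from the guard used here. Note that `erlang:setelement/3` takes the tuple as its second argument.

```erlang
-module(dsetel_example).
-export([update/1, update_by_hand/1]).

%% Hypothetical record with seven fields.
-record(r, {a, b, c, d, e, f, g}).

%% Two of seven fields are updated: more than one, but not a majority.
update(R) ->
    R#r{b = new_b, e = new_e}.

%% Hand-written approximation of the expansion. The record tag is
%% element 1, so field b is element 3 and field e is element 6.
%% The call with the greatest index comes first; once it has succeeded,
%% the later call is known to operate on a tuple that is large enough.
update_by_hand(R) when is_record(R, r) ->
    T = erlang:setelement(6, R, new_e),
    erlang:setelement(3, T, new_b).
```

Both functions return the same tuple for a valid record; the destructive optimization only changes how the second and later updates are carried out.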
2019-02-15	inline_SUITE: Make coverage/1 test cheaper	Björn Gustavsson
	The sole purpose of inline_SUITE:coverage/1 is to ensure that all lines are covered in sys_core_inline. Do that in a cheaper way.
2019-02-15	Remove compile_SUITE:big_file/1	Björn Gustavsson
	This test case does not test anything unique that is not tested by other test cases.
2019-02-15	Add test modules that disable all SSA optimizations	Björn Gustavsson
	This makes sure that the SSA optimizations are not essential and may help to cover more code in beam_ssa_pre_codegen and beam_ssa_codegen.
2019-02-15	Cover erl_bifs.erl	Björn Gustavsson

2019-02-15	Cover exception throwing code in beam_ssa_opt	Björn Gustavsson

2019-02-15	Parallelize test of listing files	Björn Gustavsson

2019-02-15	Don't limit the number of processes when running cover	Björn Gustavsson
	With the recent update to cover to use the `counters` module, there is no longer any reason to limit the number of parallel processes when running `cover`.
2019-02-11	compiler tests: Restrict cover to run on the local node	Björn Gustavsson
	This change saves about a minute when running the cover-compiled compiler tests.
2019-02-05	beam_ssa_type: Track the types of tuple elements	John Högberg
	Prior to 294d66a295f6c2101fe3c2da630979ad4e736c08 there wasn't much point to keeping track of tuple element types; they were only known when we had inserted or extracted values from a tuple, and in neither case was it likely that we'd extract the same values again.
	It makes a lot more sense to do so now that type optimizations are applied across functions; if we return a tuple it's very likely that its elements will be extracted soon after, and knowing their types lets us eliminate more type checks.
	Co-authored-by: Björn Gustavsson
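A small illustration of the pattern this message describes: a local function returns a tuple and the caller extracts its elements right away. The module and function names are hypothetical, and whether the compiler actually drops a type check for this exact code is an assumption, not something the commit states.

```erlang
-module(tuple_types_example).
-export([scaled_length/1]).

%% parse/1 is local to the module, so its return type can be inferred:
%% it returns either the atom 'error' or {ok, N} where N is an integer.
parse(L) when is_list(L) -> {ok, length(L)};
parse(_) -> error.

%% The caller extracts N immediately. If the element type is tracked,
%% N is known to be an integer when the multiplication is compiled.
scaled_length(X) ->
    case parse(X) of
        {ok, N} -> N * 2;
        error -> 0
    end.
```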
2019-02-01	inline_SUITE: Don't start a slave node	Björn Gustavsson
	A long time ago it was a good idea to run compiled code in a slave node. Nowadays, not so much.
2019-02-01	Correct test_lib:is_cloned_mod/1	Björn Gustavsson
	test_lib:is_cloned_mod(inline_SUITE) would return true.
2019-02-01	Parallelize compile_SUITE:bc_options/1	Björn Gustavsson

2019-02-01	Remove the optimized_guard/1 test case	Björn Gustavsson
	With the new SSA code passes, the optimized_guard/1 test case has become really bad at finding unnecessary `and` and `or` instructions.
2019-02-01	Merge pull request #2122 from bjorng/bjorn/compiler/fix-beam_except	Björn Gustavsson
	Fix internal consistency failure caused by beam_except
2019-01-31	Fix internal consistency failure caused by beam_except	Björn Gustavsson
	Fix a bug where the number of live registers in a `bs_get_tail` instruction was too low. Consider this example:
	    -export([bs_get_tail/2]).
	    bs_get_tail(Bin, Config) ->
	        bs_get_tail_1(Bin, 0, 0, Config).
	    bs_get_tail_1(<<_:32, Rest/binary>>, Z1, Z2, F1) ->
	        {Rest,Z1,Z2,F1}.
	`beam_validator` would emit the following diagnostics:
	    t: function bs_get_tail_1/4+2:
	      Internal consistency check failed - please report this bug.
	      Instruction: {func_info,{atom,t},{atom,bs_get_tail_1},4}
	      Error:       {uninitialized_reg,{x,3}}:
	Here is the part of the code that generates the `function_clause` exception before the optimization:
	    {test_heap,6,4}.
	    {put_list,{x,3},nil,{x,2}}.
	    {put_list,{integer,0},{x,2},{x,2}}.
	    {put_list,{integer,0},{x,2},{x,2}}.
	    {bs_set_position,{x,1},{x,0}}.
	    {bs_get_tail,{x,1},{x,0},3}.  %3 live registers.
	    {test_heap,2,3}.
	    {put_list,{x,0},{x,2},{x,1}}.
	    {move,{atom,function_clause},{x,0}}.
	    {line,[{location,"t.erl",8}]}.
	    {call_ext_only,2,{extfunc,erlang,error,2}}.
	The `bs_get_tail` instruction expects that 3 registers will be live at this point.
	`beam_except` rewrites the code like this:
	    {bs_set_position,{x,1},{x,0}}.
	    {bs_get_tail,{x,1},{x,0},3}.  %Still 3. Too low.
	    {move,{integer,0},{x,1}}.
	    {move,{integer,0},{x,2}}.
	    {jump,{f,3}}.
	Now the number of live registers in `bs_get_tail` is too low, because the `{x,3}` register will become undefined.
	This commit adds code to update the number of live registers in the `bs_get_tail` instruction, producing this code:
	    {bs_set_position,{x,1},{x,0}}.
	    {bs_get_tail,{x,1},{x,0},4}.  %Adjusted to 4.
	    {move,{integer,0},{x,1}}.
	    {move,{integer,0},{x,2}}.
	    {jump,{f,3}}.
2019-01-31	Merge branch 'maint'	Björn Gustavsson
	* maint: Eliminate bogus warning when using tuple calls
2019-01-30	Eliminate bogus warning when using tuple calls	Björn Gustavsson
	There would be a bogus warning when compiling the following function with the `tuple_calls` option:
	    dispatch(X) ->
	        (list_to_atom("prefix_" ++ atom_to_list(suffix))):doit(X).
	The warning would look like this:
	    no_file: this expression will fail with a 'badarg' exception
	https://bugs.erlang.org/browse/ERL-838
2019-01-29	Enhance optimization of function_clause exceptions	Björn Gustavsson
	There is an optimization for reducing the number of instructions needed to generate a `function_clause`. After the latest improvements of the type optimization pass, that optimization is not always applied. Here is an example:
	    -export([foo/3]).
	    foo(X, Y, Z) ->
	        bar(a, X, Y, Z).
	    bar(a, X, Y, Z) when is_tuple(X) ->
	        {X,Y,Z}.
	Note that the compiler internally adds a clause to each function to generate a `function_clause` exception. Thus:
	    bar(a, X, Y, Z) when is_tuple(X) ->
	        {X,Y,Z};
	    bar(A1, A2, A3, A4) ->
	        erlang:error(function_clause, [A1,A2,A3,A4]).
	Optimizations will rewrite the code basically like this:
	    bar(_, X, Y, Z) when is_tuple(X) ->
	        {X,Y,Z};
	    bar(_, A2, A3, A4) ->
	        erlang:error(function_clause, [a,A2,A3,A4]).
	Note the `a` as the first element of the list of arguments. It will prevent the optimization of the `function_clause` exception. The BEAM code for `bar/4` looks like this:
	    {function, bar, 4, 4}.
	      {label,3}.
	        {line,[{location,"t.erl",8}]}.
	        {func_info,{atom,t},{atom,bar},4}.
	      {label,4}.
	        {'%',{type_info,{x,0},{atom,a}}}.
	        {test,is_tuple,{f,5},[{x,1}]}.
	        {test_heap,4,4}.
	        {put_tuple2,{x,0},{list,[{x,1},{x,2},{x,3}]}}.
	        return.
	      {label,5}.
	        {test_heap,8,4}.
	        {put_list,{x,3},nil,{x,0}}.
	        {put_list,{x,2},{x,0},{x,0}}.
	        {put_list,{x,1},{x,0},{x,0}}.
	        {put_list,{atom,a},{x,0},{x,1}}.
	        {move,{atom,function_clause},{x,0}}.
	        {line,[{location,"t.erl",8}]}.
	        {call_ext,2,{extfunc,erlang,error,2}}.
	The code after label 5 is the clause that generates the `function_clause` exception.
	This commit generalizes the optimization so that it can be applied for this function:
	    {function, bar, 4, 4}.
	      {label,3}.
	        {line,[{location,"t.erl",8}]}.
	        {func_info,{atom,t},{atom,bar},4}.
	      {label,4}.
	        {'%',{type_info,{x,0},{atom,a}}}.
	        {test,is_tuple,{f,5},[{x,1}]}.
	        {test_heap,4,4}.
	        {put_tuple2,{x,0},{list,[{x,1},{x,2},{x,3}]}}.
	        return.
	      {label,5}.
	        {move,{atom,a},{x,0}}.
	        {jump,{f,3}}.
	For this particular function, it would be safe to omit the `move` instruction before the `{jump,{f,3}}` instruction, but it would not be safe in general to omit `move` instructions.
2019-01-28	Fix crash in beam_ssa_type	Björn Gustavsson
	To improve compilation times, beam_ssa_type keeps track of variables that are only used once and does not keep types for those variables. As currently implemented, it turns out to be unsafe. Change it to only keep track of variables that are only used in the terminator of the block they are defined in.
	https://bugs.erlang.org/browse/ERL-840
2019-01-24	Make the beam_validator smarter again, again	John Högberg
	The fix in f9ea85611faca82c7494449ddb8bcb1ef1d194cb didn't consider that the tested register could be aliased.
2019-01-24	compiler: Introduce module-level type optimization	John Högberg
	This commit lets the type optimization pass work across functions, tracking return and argument types to eliminate redundant tests.
2019-01-24	beam_ssa_opt: Add a scaffold for module-level optimizations	John Högberg
	This serves as a base for the upcoming module-level type optimization, but may come in handy for other passes like beam_ssa_funs and beam_ssa_bsm that have their own ad-hoc implementations.
2019-01-21	beam_ssa_type: Fix type subtraction in #b_switch{}	John Högberg
	A switch is equivalent to a series of '=:=', so we have to subtract each value individually from the type. Subtracting a join risks removing too much type information, and managed to narrow "number" into "float" in the attached test case.
2019-01-21	beam_ssa_type: Remove wait_timeout instructions with a timeout of 0	John Högberg

2019-01-18	Merge branch 'bjorn/compiler/beam_validator/ERL-832'	Björn Gustavsson
	* bjorn/compiler/beam_validator/ERL-832: Make the beam_validator smarter (again)
2019-01-17	Make the beam_validator smarter (again)	Björn Gustavsson
	Needed because of the optimizations in 48f20bd589fa69. https://bugs.erlang.org/browse/ERL-832
2019-01-16	Move optimizations of bs_put* instruction to beam_ssa_opt	Björn Gustavsson
	Do the optimizations of bs_put* instructions in beam_ssa_opt and remove the beam_bs pass. This can lead to a slight improvement of compilation times.
2019-01-16	Merge branch 'maint'	Björn Gustavsson
	* maint:
	  beam_type: Eliminate compiler crash when arithmetic expression fails
	Conflicts:
	  lib/compiler/src/beam_type.erl
2019-01-14	beam_type: Eliminate compiler crash when arithmetic expression fails	Björn Gustavsson
	The compiler would crash when compiling code such as:
	    (A / B) band 16#ff
	The type for the expression would be 'none', but beam_type:verified_type/1 did not handle 'none'.
	https://bugs.erlang.org/browse/ERL-829
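For context, the expression has no valid type because `/` always produces a float and `band` only accepts integers, so it fails at run time for every input. A minimal sketch showing that behaviour, with hypothetical module and function names:

```erlang
-module(badarith_example).
-export([low_byte/2, demo/0]).

%% The same shape as the expression in the commit message. A / B is a
%% float whenever the division succeeds, and band rejects floats.
low_byte(A, B) ->
    (A / B) band 16#ff.

%% Even when the division is exact (4 / 2 is 2.0), band raises badarith.
demo() ->
    try low_byte(4, 2) of
        Value -> {ok, Value}
    catch
        error:badarith -> badarith
    end.
```

`badarith_example:demo()` returns the atom `badarith`.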
2019-01-14	Introduce subtraction of types	Björn Gustavsson
	Introduce subtraction of types to allow some redundant tests to be eliminated. Consider this function:
	    foo(L) when is_list(L) ->
	        case L of
	            [_|_] -> non_empty;
	            [] -> empty
	        end.
	After entering the body of the function, L is known to be either a cons cell or nil (otherwise the is_list/1 guard would have failed). If L is not a cons cell, it must be nil. Therefore, the test for nil in the second clause of the case can be eliminated.
	Here is the SSA code with some additional comments for the function before the optimization:
	    function t:foo(_0) {
	    0:
	      @ssa_bool = bif:is_list _0
	      br @ssa_bool, label 4, label 3
	    4:
	      %% _0 is now a list (cons or nil).
	      @ssa_bool:8 = is_nonempty_list _0
	      br @ssa_bool:8, label 9, label 7
	    9:
	      ret literal non_empty
	    7:
	      %% _0 is not a cons (or we wouldn't be here).
	      %% Subtracting cons from the previously known type list
	      %% gives that _0 must be nil.
	      @ssa_bool:10 = bif:'=:=' _0, literal []
	      br @ssa_bool:10, label 11, label 6
	    11:
	      ret literal empty
	    6:
	      _6 = put_tuple literal case_clause, _0
	      %% t.erl:5
	      @ssa_ret:12 = call remote (literal erlang):(literal error)/1, _6
	      ret @ssa_ret:12
	    3:
	      _9 = put_list _0, literal []
	      %% t.erl:4
	      @ssa_ret:13 = call remote (literal erlang):(literal error)/2, literal function_clause, _9
	      ret @ssa_ret:13
	    }
	Type subtraction gives us that _0 must be nil in block 7, allowing us to remove the comparison of _0 with nil. The code for the function can be simplified to:
	    function t:foo(_0) {
	    0:
	      @ssa_bool = bif:is_list _0
	      br @ssa_bool, label 4, label 3
	    4:
	      @ssa_bool:8 = is_nonempty_list _0
	      br @ssa_bool:8, label 9, label 11
	    9:
	      ret literal non_empty
	    11:
	      ret literal empty
	    3:
	      _9 = put_list _0, literal []
	      %% t.erl:4
	      @ssa_ret:13 = call remote (literal erlang):(literal error)/2, literal function_clause, _9
	      ret @ssa_ret:13
	    }