Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
|
|
|
|
* maint:
beam_utils: Handle bs_start_match2 in anno_defs
|
|
|
|
Co-authored-by: John Högberg <[email protected]>
|
|
* maint:
Fix rare bug in binary matching (again)
Conflicts:
lib/compiler/src/beam_bsm.erl
lib/compiler/src/sys_core_bsm.erl
|
|
This commit improves the bit-syntax match optimization pass,
leveraging the new SSA intermediate format to perform much more
aggressive optimizations. Some highlights:
* Watch contexts can be reused even after being passed to a
function or being used in a try block.
* Sub-binaries are no longer eagerly extracted, making it far
easier to keep "happy paths" free from binary creation.
* Trivial wrapper functions no longer disable context reuse.
|
|
2e40d8d1c51a attempted fix a bug in binary matching, but it only
fixed the bug for the minimized test case.
This commit removes the previous fix and fixes the bug in a more
effective way. See the comments in the new code in `sys_core_bsm`
for an explanation.
This commit restores the optimizations in string.erl and dets_v9.erl
that the previous fix disabled.
I have not found any code where this commit will disable optimizations
when they are actually safe. There are some changes to the code
in ssl_cipher.erl in that some bs_start_match2 instruction did not
reuse the binary register for the match context, but the delayed
sub binary optimizations was never applied to the code in the first
place.
https://bugs.erlang.org/browse/ERL-689
|
|
v3_codegen is replaced by three new passes:
* beam_kernel_to_ssa which translates the Kernel Erlang format
to a new SSA-based intermediate format.
* beam_ssa_pre_codegen which prepares the SSA-based format
for code generation, including register allocation. Registers
are allocated using the linear scan algorithm.
* beam_ssa_codegen which generates BEAM assembly code from the
SSA-based format.
It easier and more effective to optimize the SSA-based format before X
and Y registers have been assigned. The current optimization passes
constantly have to make sure no "holes" in the X register assignments
are created (that is, that no X register becomes undefined that an
allocation instruction depends on).
This commit also introduces the following optimizations:
* Replacing of tuple matching of records with the is_tagged_tuple
instruction. (Replacing beam_record.)
* Sinking of get_tuple_element instructions to just before the first
use of the extracted values. As well as potentially avoiding
extracting tuple elements when they are not actually used on all
executions paths, this optimization could also reduce the number
values that will need to be stored in Y registers. (Similar to
beam_reorder, but more effective.)
* Live optimizations, removing the definition of a variable that is
not subsequently used (provided that the operation has no side
effects), as well strength reduction of binary matching by replacing
the extraction of value from a binary with a skip instruction. (Used
to be done by beam_block, beam_utils, and v3_codegen.)
* Removal of redundant bs_restore2 instructions. (Formerly done
by beam_bs.)
* Type-based optimizations across branches. More effective than
the old beam_type pass that only did type-based optimizations in
basic blocks.
* Optimization of floating point instructions. (Formerly done
by beam_type.)
* Optimization of receive statements to introduce recv_mark and
recv_set instructions. More effective with far fewer restrictions
on what instructions are allowed between creating the reference
and entering the receive statement.
* Common subexpression elimination. (Formerly done by beam_block.)
|
|
The compiler generates incorrect code for the following example:
decode_binary(_, <<Length, Data/binary>>) ->
case {Length, Data} of
{0, _} ->
%% When converting the match context back to a binary,
%% Data will be set to the entire original binary,
%% that is, to <<0>> instead of <<>>.
{{0, 0, 0}, Data};
{4, <<Y:16/little, M, D, Rest/binary>>} ->
{{Y, M, D}, Rest}
end.
The problem is the delayed sub binary creation optimization, which
is not safe to do in this case.
This commit introduces a heuristic that will disable the delayed
sub binary creation optimization for this example. Unfortunately, the
heuristic may turn off the optimization when it would actually be
safe. In the OTP codebase, the optimization is turned off in two
instances, once in string.erl and once in dets_v9.erl.
https://bugs.erlang.org/browse/ERL-689
|
|
Call test_lib:recompile/1 from init_per_suite/1 instead of
from all/0. That makes it easy to find the log from the
compilation in the log file for the init_per_suite/1 test
case.
|
|
|
|
|
|
Extend an existing optimization in beam_dead to avoid
creating a match context when matching an empty binary.
|
|
Argument order can prevent the delayed sub binary creation.
Here is an example directly from the Efficiency Guide:
non_opt_eq([H|T1], <<H,T2/binary>>) ->
non_opt_eq(T1, T2);
non_opt_eq([_|_], <<_,_/binary>>) ->
false;
non_opt_eq([], <<>>) ->
true.
When compiling with the bin_opt_info option, there will be a
suggestion to change the argument order.
It turns out sys_core_bsm can itself change the order, not the
order of the arguments of themselves, but the order in which
the arguments are matched. Here is how it can be rewritten in
pseudo Core Erlang code:
non_opt_eq(Arg1, Arg2) ->
case < Arg2,Arg1 > of
< <<H1,T2/binary>>, [H2|T1] > when H1 =:= H2 ->
non_opt_eq(T1, T2);
< <<_,T2/binary>ffff>, [_|T1] > ->
false;
< <<>>, [] >> ->
true
end.
When rewritten like this, the bs_start_match2 instruction will be
the first instruction in the function and it will be possible to
store the match context in the same register as the binary
({x,1} in this case) and to delay the creation of sub binaries.
The switching of matching order also enables many other simplifications
in sys_core_bsm, since there is no longer any need to pass the position
of the pattern as an argument.
We will update the Efficiency Guide in a separate branch before the
release of OTP 21.
|
|
|
|
Add some tests cases written when attempting some new optimizations
that turned out to be unsafe.
|
|
|
|
|
|
|
|
* maint:
Fix incorrect internal consistency failure for binary matching code
|
|
4c31fd0b9665 made the merging of match contexts stricter;
in fact, a little bit too strict.
Two match contexts with different number of slots would
be downgraded to the 'term' type. The correct way is to
keep the match context but set the number of slots to the
lowest number of slots of the two match contexts.
https://bugs.erlang.org/browse/ERL-490
|
|
The compiler could sometimes emit unnecessary 'move'
instructions in the code for binary matching, for
example for this function:
escape(<<Byte, Rest/bits>>, Pos) when Byte >= 127 ->
escape(Rest, Pos + 1);
escape(<<Byte, Rest/bits>>, Pos) ->
escape(Rest, Pos + Byte);
escape(<<_Rest/bits>>, Pos) ->
Pos.
The generated code would look like this:
{function, escape, 2, 2}.
{label,1}.
{line,[{location,"t.erl",17}]}.
{func_info,{atom,t},{atom,escape},2}.
{label,2}.
{test,bs_start_match2,{f,1},2,[{x,0},0],{x,0}}.
{test,bs_get_integer2,
{f,4},
2,
[{x,0},
{integer,8},
1,
{field_flags,[{anno,[17,{file,"t.erl"}]},unsigned,big]}],
{x,2}}.
{'%',{bin_opt,[17,{file,"t.erl"}]}}.
{move,{x,0},{x,3}}. %% UNECESSARY!
{test,is_ge,{f,3},[{x,2},{integer,127}]}.
{line,[{location,"t.erl",18}]}.
{gc_bif,'+',{f,0},4,[{x,1},{integer,1}],{x,1}}.
{move,{x,3},{x,0}}. %% UNECESSARY!
{call_only,2,{f,2}}.
{label,3}.
{line,[{location,"t.erl",20}]}.
{gc_bif,'+',{f,0},4,[{x,1},{x,2}],{x,1}}.
{move,{x,3},{x,0}}. %% UNECESSARY!
{call_only,2,{f,2}}.
{label,4}.
{move,{x,1},{x,0}}.
return.
The redundant 'move' instructions have been marked.
To avoid the 'move' instructions, we can extend the existing
function is_context_unused/1 in v3_codegen. If v3_codegen can
know that the match context will not be used again, it can reuse
the register for the match context and avoid the extra 'move'
instructions.
https://bugs.erlang.org/browse/ERL-444
|
|
|
|
As part of sys_core_fold, variables involved in bit syntax
matching would be annotated when it would be safe for a later
pass to do the delayed sub-binary creation optimization.
An implicit assumption regarding the annotation was that the
code must not be further optimized. That assumption was broken
in 05130e48555891, which introduced a fixpoint iteration
(applying the optimizations until there were no more changes).
That means that a variable could be annotated as safe for
reusing the match context in one iteration, but a later iteration
could rewrite the code in a way that would make the optimization
unsafe.
One way to fix this would be to clear all reuse_for_context
annotations before each iteration. But that would be wasteful.
Instead I chose to fix the problem by moving out the annotation
code to a separate pass (sys_core_bsm) that is run later after
all major optimizations of Core Erlang has been done.
|
|
The compiler produces poor code for complex guard expressions with andalso/orelse.
Here is an example from the filename module:
-define(IS_DRIVELETTER(Letter),(((Letter >= $A) andalso (Letter =< $Z)) orelse
((Letter >= $a) andalso (Letter =< $z)))).
skip_prefix(Name, false) ->
Name;
skip_prefix([L, DrvSep|Name], DrvSep) when ?IS_DRIVELETTER(L) ->
Name;
skip_prefix(Name, _) ->
Name.
beam_bool fails to simplify the code for the guard, leaving several 'bif'
instructions:
{function, skip_prefix, 2, 49}.
{label,48}.
{line,[{location,"filename.erl",187}]}.
{func_info,{atom,filename},{atom,skip_prefix},2}.
{label,49}.
{test,is_ne_exact,{f,52},[{x,1},{atom,false}]}.
{test,is_nonempty_list,{f,52},[{x,0}]}.
{get_list,{x,0},{x,2},{x,3}}.
{test,is_nonempty_list,{f,52},[{x,3}]}.
{get_list,{x,3},{x,4},{x,5}}.
{bif,'=:=',{f,52},[{x,1},{x,4}],{x,6}}.
{test,is_ge,{f,50},[{x,2},{integer,65}]}.
{bif,'=<',{f,52},[{x,2},{integer,90}],{x,7}}.
{test,is_eq_exact,{f,51},[{x,7},{atom,false}]}.
{test,is_ge,{f,50},[{x,2},{integer,97}]}.
{bif,'=<',{f,52},[{x,2},{integer,122}],{x,7}}.
{jump,{f,51}}.
{label,50}.
{move,{atom,false},{x,7}}.
{label,51}.
{bif,'=:=',{f,52},[{x,7},{atom,true}],{x,7}}.
{test,is_eq_exact,{f,52},[{x,6},{atom,true}]}.
{test,is_eq_exact,{f,52},[{x,7},{atom,true}]}.
{move,{x,5},{x,0}}.
return.
{label,52}.
return.
We can add optimizations of guard tests to v3_kernel to achive a better result:
{function, skip_prefix, 2, 49}.
{label,48}.
{line,[{location,"filename.erl",187}]}.
{func_info,{atom,filename},{atom,skip_prefix},2}.
{label,49}.
{test,is_ne_exact,{f,51},[{x,1},{atom,false}]}.
{test,is_nonempty_list,{f,51},[{x,0}]}.
{get_list,{x,0},{x,2},{x,3}}.
{test,is_nonempty_list,{f,51},[{x,3}]}.
{get_list,{x,3},{x,4},{x,5}}.
{test,is_eq_exact,{f,51},[{x,1},{x,4}]}.
{test,is_ge,{f,51},[{x,2},{integer,65}]}.
{test,is_lt,{f,50},[{integer,90},{x,2}]}.
{test,is_ge,{f,51},[{x,2},{integer,97}]}.
{test,is_ge,{f,51},[{integer,122},{x,2}]}.
{label,50}.
{move,{x,5},{x,0}}.
return.
{label,51}.
return.
Looking at the STDLIB application, there were 112 lines of BIF calls in guards
that beam_bool failed to convert to test instructions. This commit eliminates
all those BIF calls.
Here is how I counted the instructions:
$ PATH=$ERL_TOP/bin:$PATH erlc -I ../include -I ../../kernel/include -S *.erl
$ grep "bif,'[=<>]" *.S | grep -v f,0
dets.S: {bif,'=:=',{f,547},[{x,4},{atom,read_write}],{x,4}}.
dets.S: {bif,'=:=',{f,547},[{x,5},{atom,saved}],{x,5}}.
dets.S: {bif,'=:=',{f,589},[{x,5},{atom,read}],{x,5}}.
.
.
.
$ grep "bif,'[=<>]" *.S | grep -v f,0 | wc
112 224 6765
$
|
|
* maint:
beam_bsm: Eliminate unsafe optimization
|
|
The following code causes a compiler failure:
first_after(Data, Offset) ->
case byte_size(Data) > Offset of
false ->
{First, Rest} = {ok, ok},
ok;
true ->
<<_:Offset/binary, Rest/binary>> = Data,
%% 'Rest' saved in y(0) before the call.
{First, _} = match_first(Data, Rest),
%% When beam_bsm sees the code, the following line
%% which uses y(0) has been optimized away.
{First, Rest} = {First, Rest},
First
end.
match_first(_, <<First:1/binary, Rest/binary>>) ->
{First, Rest}.
Here is the error message from beam_validator:
t: function first_after/2+15:
Internal consistency check failed - please report this bug.
Instruction: {call,2,{f,7}}
Error: {multiple_match_contexts,[{x,1},0]}:
Basically, what happens is that at time of code generation,
the variable 'Rest' is needed after the call to match_first/2
and is therefore saved in y(0). When beam_bsm (a late optimization
pass) sees the code, the use of y(0) following the call
to match_first/2 has been optimized away. beam_bsm therefore
assumes that the delayed sub-binary creation is safe. (Actually,
it is safe, but beam_validator does not realize it.)
The bug was caused by two separate commits:
e199e2471a reduced the number of special cases to handle in BEAM
optimization passed by breaking apart the tail-recursive call
instructions (call_only and call_last) into separate instructions.
Unfortunately, the special handling for tail calls was lost, which
resulted in worse code (i.e. the delayed sub-binary creation
optimization could not be applied).
e1aa422290 tried to compensate, but did so in a way that was not
always safe.
Teaching beam_validator that this kind of code is safe would be
expensive.
Instead, we will undo the damage caused by the two
commits. Re-introduce the special handling of tail-recursive calls in
beam_bsm that was lost in the first commit. (Effectively) revert the
change in the second commit.
ERL-268
|
|
During development, a bug in beam_utils caused a compiler failure
in xmerl. If the bug reappears, make sure that we catch it when
compiling the compiler test suite.
|
|
We used to put code that would crash the compiler into
compilation_SUITE_data. That way we would have a failing test case to
remind us to fix a bug.
Nowadays, we generally fix the bug and write the test case at the same
time. Therefore it makes more sense to put the test code directly into
a test suite.
Move out bin_syntax_1 through bin_syntax_5 test cases. Scrap
bin_syntax_6 because it does not longer seems to be relevant.
While we are it, rename the fun_shadow/1 test to size_shadow/1. Also
make sure that the code produces the correct result.
|
|
|
|
* bjorn/compiler/modernize-tests:
Remove ?line macros
Replace use of lists:keysearch/3 with lists:keyfind/3
Eliminate use of doc and suite clauses
Replace ?t with test_server
Replace use of test_server:format/2 with io:format/2
Eliminate use of test_server:fail/0,1
Eliminate use of ?config() macro
Modernize use of timetraps
Eliminate useless helper functions
|
|
|
|
|
|
|
|
Either rely on the default 30 minutes timetrap, or set the timeout
using the supported methods in common_test.
|
|
Binary matching can be confusing. For example:
1> <<-1>> = <<-1>>.
** exception error: no match of right hand side value <<"ÿ">>
2>
When constructing binaries, the value will be masked to fit in
the binary segment. But no such masking happens when matching
binaries.
One solution that we considered was to do the same masking when
matching. We have rejected that solution for several reasons:
* Masking in construction is highly controversial and by some
people considered a bad design decision.
* While masking of unsigned numbers can be understood, masking of
signed numbers it not easy to understand.
* Then there is the question of backward compatibility. Adding
masking to matching would mean that clauses that did not match
earlier would start to match. That means that code that has
never been tested will be executed. Code that has not been
tested will usually not work.
Therefore, we have decided to warn for binary patterns that cannot
possibly match.
While we are it, we will also warn for the following example where
size for a binary segment is invalid:
bad_size(Bin) ->
BadSize = bad_size,
<<42:BadSize>> = Bin.
That example would crash the HiPE compiler because the BEAM compiler
would generate a bs_get_integer2 instruction with an invalid size
field. We can avoid that crash if sys_core_fold not only warns for bad
binary pattern, but also removes the clauses that will not match.
Reported-by: http://bugs.erlang.org/browse/ERL-44
Reported-by: Kostis Sagonas
|
|
As a first step to removing the test_server application as
as its own separate application, change the inclusion of
test_server.hrl to an inclusion of ct.hrl and remove the
inclusion of test_server_line.hrl.
|
|
* maint:
Eliminate crash because of unsafe delaying of sub-binary creation
|
|
The following code would fail to compile:
decode(<<Code/integer, Bin/binary>>) ->
<<C1/integer, B1/binary>> = Bin,
case C1 of
X when X =:= 1 orelse X =:= 2 ->
Bin2 = <<>>;
_ ->
Bin2 = B1
end,
case Code of
1 -> decode(Bin2);
_ -> Bin2
end.
The error message would be:
t: function decode/1+28:
Internal consistency check failed - please report this bug.
Instruction: return
Error: {match_context,{x,0}}:
The beam_bsm pass would delay the creation of a sub-binary when it was
unsafe to do so. The culprit was the btb_follow_branch/3 function that
for performance reasons cached labels that had already been checked.
The problem was the safety of a label also depends on the contents
of the registers. Therefore, the key for caching needs to be both
the label and the register contents.
Reported-by: José Valim
|
|
* maint:
Fix missing filename and line number in warning
Conflicts:
lib/compiler/test/bs_match_SUITE.erl
|
|
When the 'bin_opt_info' is given, warnings without filenames
and line numbers could sometimes be produced:
no_file: Warning: INFO: matching non-variables after
a previous clause matching a variable will prevent delayed
sub binary optimization
The reason for the missing information is that #c_alias{} records lack
location information. There are several ways to fix the problem. The
easiest seems to be to get the location information from the
code).
Noticed-by: José Valim
|
|
|
|
The ASN.1 compiler often generates code similar to:
f(<<0:1,...>>) -> ...;
f(<<1:1,...>>) -> ....
Internally that will be rewritten to (conceptually):
f(<<B:1,Tail/binary>>) ->
case B of
0 ->
case Tail of ... end;
1 ->
case Tail of ... end;
_ ->
error(case_clause)
end.
Since B comes from a bit field of one bit, we know that the only
possible values are 0 and 1. Therefore the error clause can be
eliminated like this:
f(<<B:1,Tail/binary>>) ->
case B of
0 ->
case Tail of ... end;
_ ->
case Tail of ... end
end.
Similarly, we can also a deduce the range for an integer from
a 'band' operation with a literal integer.
While we are at it, also add a test case to improve the coverage.
|
|
|
|
When matching a binary literal as in:
<<"abc">> = Bin
the compiler will produce a sequence of three instructions
(some details in the instructions removed for simplicity):
bs_start_match2 Fail BinReg CtxtReg
bs_match_string Fail CtxtReg "abc"
bs_test_tail2 Fail CtxtReg 0
The sequence can be replaced with:
is_eq_exact Fail BinReg "abc"
|
|
For tidiness, always place .core files in data directories.
|