Age | Commit message (Collapse) | Author |
|
2e40d8d1c51a attempted fix a bug in binary matching, but it only
fixed the bug for the minimized test case.
This commit removes the previous fix and fixes the bug in a more
effective way. See the comments in the new code in `sys_core_bsm`
for an explanation.
This commit restores the optimizations in string.erl and dets_v9.erl
that the previous fix disabled.
I have not found any code where this commit will disable optimizations
when they are actually safe. There are some changes to the code
in ssl_cipher.erl in that some bs_start_match2 instruction did not
reuse the binary register for the match context, but the delayed
sub binary optimizations was never applied to the code in the first
place.
https://bugs.erlang.org/browse/ERL-689
|
|
The compiler generates incorrect code for the following example:
decode_binary(_, <<Length, Data/binary>>) ->
case {Length, Data} of
{0, _} ->
%% When converting the match context back to a binary,
%% Data will be set to the entire original binary,
%% that is, to <<0>> instead of <<>>.
{{0, 0, 0}, Data};
{4, <<Y:16/little, M, D, Rest/binary>>} ->
{{Y, M, D}, Rest}
end.
The problem is the delayed sub binary creation optimization, which
is not safe to do in this case.
This commit introduces a heuristic that will disable the delayed
sub binary creation optimization for this example. Unfortunately, the
heuristic may turn off the optimization when it would actually be
safe. In the OTP codebase, the optimization is turned off in two
instances, once in string.erl and once in dets_v9.erl.
https://bugs.erlang.org/browse/ERL-689
|
|
|
|
If possible, when adding move/2 instructions, try to insert
them into a block. That could potentially allow them to
be optimized.
|
|
|
|
Also slightly refactor the code to simplify the types.
|
|
The following code causes a compiler failure:
first_after(Data, Offset) ->
case byte_size(Data) > Offset of
false ->
{First, Rest} = {ok, ok},
ok;
true ->
<<_:Offset/binary, Rest/binary>> = Data,
%% 'Rest' saved in y(0) before the call.
{First, _} = match_first(Data, Rest),
%% When beam_bsm sees the code, the following line
%% which uses y(0) has been optimized away.
{First, Rest} = {First, Rest},
First
end.
match_first(_, <<First:1/binary, Rest/binary>>) ->
{First, Rest}.
Here is the error message from beam_validator:
t: function first_after/2+15:
Internal consistency check failed - please report this bug.
Instruction: {call,2,{f,7}}
Error: {multiple_match_contexts,[{x,1},0]}:
Basically, what happens is that at time of code generation,
the variable 'Rest' is needed after the call to match_first/2
and is therefore saved in y(0). When beam_bsm (a late optimization
pass) sees the code, the use of y(0) following the call
to match_first/2 has been optimized away. beam_bsm therefore
assumes that the delayed sub-binary creation is safe. (Actually,
it is safe, but beam_validator does not realize it.)
The bug was caused by two separate commits:
e199e2471a reduced the number of special cases to handle in BEAM
optimization passed by breaking apart the tail-recursive call
instructions (call_only and call_last) into separate instructions.
Unfortunately, the special handling for tail calls was lost, which
resulted in worse code (i.e. the delayed sub-binary creation
optimization could not be applied).
e1aa422290 tried to compensate, but did so in a way that was not
always safe.
Teaching beam_validator that this kind of code is safe would be
expensive.
Instead, we will undo the damage caused by the two
commits. Re-introduce the special handling of tail-recursive calls in
beam_bsm that was lost in the first commit. (Effectively) revert the
change in the second commit.
ERL-268
|
|
|
|
The following code would fail to compile:
decode(<<Code/integer, Bin/binary>>) ->
<<C1/integer, B1/binary>> = Bin,
case C1 of
X when X =:= 1 orelse X =:= 2 ->
Bin2 = <<>>;
_ ->
Bin2 = B1
end,
case Code of
1 -> decode(Bin2);
_ -> Bin2
end.
The error message would be:
t: function decode/1+28:
Internal consistency check failed - please report this bug.
Instruction: return
Error: {match_context,{x,0}}:
The beam_bsm pass would delay the creation of a sub-binary when it was
unsafe to do so. The culprit was the btb_follow_branch/3 function that
for performance reasons cached labels that had already been checked.
The problem was the safety of a label also depends on the contents
of the registers. Therefore, the key for caching needs to be both
the label and the register contents.
Reported-by: José Valim
|
|
|
|
Allows for 'creation of sub binary delayed' optimization if
map instructions are in a clause.
Reported-by: José Valim
|
|
|
|
lists:dropwhile/2 and the fun in btb_index_1/2 shows up in the
top 10 list of eprof. Replace dropwhile with a special-purpose
function for a tiny increase in speed.
|
|
We must not do the delayed binary creation optimization if the
code attempts to call the matched out binary. Calling a matchstate
will crash the run-time system.
Reported-by: Loïc Hoguin
|
|
|
|
The handling of bs_start_match2 was both too conservative and too
careless. It was too conservative in that would not do the
optimization if the were copies of the match state in other
registers. It was careless in that it did not consider the
failure branch.
Reorganize the code and fix both these issues. Add a test case
to test that the failure branch is considered.
|
|
When determining whether the delayed creation of sub-binaries
optimizations is applicable, this module some tests whether the
register containg the match state is killed. That is actually a
stronger condition than necessary; since the register is initialized,
it suffices to test whether the register is unused.
|
|
We were too conservative when handling a call when there were copies of
the match context in both x and y registers. Don't give up if there
is are copies of the match context in y registers, as long as those
copies are killed by the code that follows the call.
|
|
Somewhat reduce the code bloat by eliminating special cases.
|
|
Somewhat reduce code bloat.
|
|
Eliminate some code bloat.
|
|
Rewrite the five binary creation instructions to a bs_init
instruction, in order to somewhat reduce code bloat.
|
|
We can remove some code bloat by handling the special instructions
as BIF instructions in the optimization passes. Also note that
bs_utf*_size was not handled by beam_utils:check_liveness/3
(meaning the conservative answer instead of the correct answer
would be returned).
|
|
Seven bs_put_* instructions can be combined into one generic bs_put
instruction to avoid some code bloat. That will also improve some
optimizations (such as beam_trim) that did not handle all bs_put*
variants.
|
|
|
|
|
|
In warning_translate_label/2, gb_trees:lookup/2 is called
to translate from the entry label for a function to its name.
Since the gb_tree has an entry for all functions in the module,
there is no way that the lookup can fail unless there is a
serious bug.
Therefore, use gb_trees:get/2 so that an exception and an
internal compiler error will be generated if the lookup would
ever fail.
|
|
|