Age | Commit message (Collapse) | Author |
|
* bjorn/compiler/misc:
Move select_val optimization from beam_clean to beam_peep
beam_type: Improve optimizations by keeping track of booleans
beam_type: Improve optimization by keeping track of integers
beam_type: Remove unused clause
beam_type: Fix forgotten change of internal representation
beam_dead: Improve optimization of literal binary matching
beam_dead: Optimize select_val instructions
Move out bit syntax optimizations from beam_block
sys_core_fold: Extend the list of BIFs that return integers
v3_codegen: Optimize matching of the final size-less binary segment
Regain full coverage of beam_block
|
|
There is an optimization in beam_clean that will remove values
having the same label as the failure label in a select_val
instruction. Conceptually, this optimization is in the wrong
module since ideally beam_clean is a mandatory pass that should
not do optimizations. Furthermore, this part of beam_clean is
called three times (from beam_dead, beam_peep, and as a compiler
pass from the 'compile' module), but it only does useful one of
the times it is called.
Therefore, move this optimization to the beam_peep pass.
The same optimization is done in beam_dead, but unfortunately it
misses some opportunities for optimization because the code sharing
optimization in beam_jump (share/1) runs after beam_dead. It would
be more satisfactory to have this optimization only in beam_dead,
but it turned out not to be trivial. If we try to run
beam_jump:share/1 before beam_dead, some optimizations will no
longer work in beam_dead because fallthroughs have been eliminated.
For the moment, the possible solutions to this problem seems to
involve more work and more complicated code than the gain from
eliminating the duplicated optimization would gain.
|
|
There is an optimization in beam_block to simplify a select_val
on a known boolean value. We can implement this optimization
in a cleaner way in beam_type and it will also be applicable
in more situations. (When I added the optimization to beam_type
without removing the optimization from beam_block, the optimization
was applied 66 times.)
|
|
The ASN.1 compiler often generates code similar to:
f(<<0:1,...>>) -> ...;
f(<<1:1,...>>) -> ....
Internally that will be rewritten to (conceptually):
f(<<B:1,Tail/binary>>) ->
case B of
0 ->
case Tail of ... end;
1 ->
case Tail of ... end;
_ ->
error(case_clause)
end.
Since B comes from a bit field of one bit, we know that the only
possible values are 0 and 1. Therefore the error clause can be
eliminated like this:
f(<<B:1,Tail/binary>>) ->
case B of
0 ->
case Tail of ... end;
_ ->
case Tail of ... end
end.
Similarly, we can also a deduce the range for an integer from
a 'band' operation with a literal integer.
While we are at it, also add a test case to improve the coverage.
|
|
The clause cannot possibly match, because there will always be
a {bif,...} clause that will match before reaching the fclearerror
instruction.
|
|
30cc5c90 changed the internal representation of catch and
try...catch, but beam_type was not updated in one place.
|
|
When the bit syntax is used to match a single binary literal, the bit
syntax instructions will be replaced with a comparison to a binary
literal. The only problem is that the bs_context_to_binary instruction
will not be eliminated.
Example:
f(<<"string">>) ->
ok.
This function would be translated to:
{function, f, 1, 2}.
{label,1}.
{line,...}.
{func_info,...}.
{label,2}.
{test,is_eq_exact,{f,3},[{x,0},{literal,<<"string">>}]}.
{move,{atom,ok},{x,0}}.
return.
{label,3}.
{bs_context_to_binary,{x,0}}.
{jump,{f,1}}.
The bs_context_to_binary instruction serves no useful purpose,
since {x,0} can never be a match context. Eliminating the
instruction, the resulting code will be:
{function, f, 1, 2}.
{label,1}.
{line,...}.
{func_info,...}.
{label,2}.
{test,is_eq_exact,{f,1},[{x,0},{literal,<<"string">>}]}.
{move,{atom,ok},{x,0}}.
return.
|
|
In a select_val instruction, values associated with a label
which is the same as the failure label can be removed. We
already do this optimization in beam_clean, but it is better
do this sort of optimization before the beam_jump pass.
Also rewrite a select_val instruction with a single value
to is_eq_exact instruction followed by a jump instruction.
|
|
In the future we might want to add more bit syntax optimizations,
but beam_block is already sufficiently complicated. Therefore, move
the bit syntax optimizations out of beam_block into a separate
compiler pass called beam_bs.
|
|
Knowing that a BIF returns an integer makes it possible to
replace '==' with the cheaper '=:=' test.
|
|
Consider the following function:
f(Bin, Bool) ->
case Bin of
<<Val:16/binary,_/binary>> when Bool ->
Val
end.
Simplified, the generated code looks like:
bs_start_match2 Fail Live Bin => Bin
bs_get_integer2 Fail Live Bin size=Sz unit=1 => Val
bs_skip_bits2 Fail Bin size=all unit=8
is_eq_exact Fail Bool true
The code generator will replace the bs_skip_bits2 instruction with
a bs_test_unit instruction if it can be clearly seen that the
context register will not be used again. In this case, it is not
obvious without looking at the code at the Fail label.
However, it turns out that bs_test_unit instruction is always
safe beacuse of the way v3_kernel compiles pattern matching.
It doesn't matter whether the match context will be used again.
If it will be used again, the position in it will *not* be used.
Instead, a bs_restore2 instruction will restore one of the saved
instructions.
|
|
=== OTP-18.1 ===
Changed Applications:
- compiler-6.0.1
- crypto-3.6.1
- debugger-4.1.1
- dialyzer-2.8.1
- diameter-1.11
- erts-7.1
- eunit-2.2.11
- hipe-3.13
- inets-6.0.1
- kernel-4.1
- mnesia-4.13.1
- odbc-2.11.1
- public_key-1.0.1
- sasl-2.6
- ssh-4.1
- ssl-7.1
- stdlib-2.6
- tools-2.8.1
- wx-1.5
Unchanged Applications:
- asn1-4.0
- common_test-1.11
- cosEvent-2.2
- cosEventDomain-1.2
- cosFileTransfer-1.2
- cosNotification-1.2
- cosProperty-1.2
- cosTime-1.2
- cosTransactions-1.3
- edoc-0.7.17
- eldap-1.2
- erl_docgen-0.4
- erl_interface-3.8
- et-1.5.1
- gs-1.6
- ic-4.4
- jinterface-1.6
- megaco-3.18
- observer-2.1
- orber-3.8
- os_mon-2.4
- ose-1.1
- otp_mibs-1.1
- parsetools-2.1
- percept-0.8.11
- reltool-0.7
- runtime_tools-1.9.1
- snmp-5.2
- syntax_tools-1.7
- test_server-3.9
- typer-0.9.9
- webtool-0.9
- xmerl-1.3.8
Conflicts:
OTP_VERSION
erts/vsn.mk
|
|
|
|
d0784035ab fixed a problem with register corruption. Because of
that, opt_moves/2 will never be asked to optimize instructions with
more than two destination registers. Therefore, to regain full
coverage of beam_block, remove the final clause in opt_moves/2.
|
|
* bjorn/compiler/remove-deprecated/OTP-12979:
core_lib: Remove previously deprecated functions
|
|
|
|
* bjorn/cuddle-with-tests: (23 commits)
rand_SUITE: Speed up basic_stats/1
base64_SUITE: Speed up roundtrip/1
lists_SUITE: Test lists:concat/2
lists_SUITE: Test lists:split/2
lists_SUITE: Add a test case for lists:prefix/2
lists_SUITE: Add hof/1 to test all high-order functions
lists_SUITE: Add test for lists:takewhile/1
lists_SUITE: Run test cases in each group in parallel
lists_SUITE: Test lists:keyreplace/4
lists_SUITE: Extend flatten/1 test to also test flatlength/1
lists_SUITE: Correct test of lists:flatten/2
id_transform_SUITE: Modernize test suite
io_proto_SUITE: Speed up determination of default shell
io_proto_SUITE: Refactor up rtnode() and friends
gen_event_SUITE: Remove unnecessary sleep calls
proc_lib: Improve coverage for crash/1
proc_lib_SUITE: Eliminate compiler warnings
io_SUITE: Add coverage/1 to completely cover io_lib_pretty
io_SUITE: Extend coverage of code for testing printable chars
io_SUITE: Speed up test for bad +pc option
...
|
|
|
|
* c-rack/fix-typo3:
Fix typo in call_last/3 spec
Fix typo
Fix typo: message to send is in x(1) not x(0)
Fix another small typo
Fix typo
|
|
|
|
Make sure that all warnings produced when compiling the
test suite contains filenames and line numbers.
|
|
|
|
Instruction get_map_elements might destroy target registers when the fail-label is taken.
Only seen for patterns with two, and only two, target registers.
Specifically: we copy one register, and then jump.
foo(A,#{a := V1, b := V2}) -> ...
foo(A,#{b := V}) -> ...
call foo(value, #{a=>whops, c=>42}).
corresponding assembler:
{test,is_map,{f,5},[{x,1}]}.
{get_map_elements,{f,7},{x,1},{list,[{atom,a},{x,1},{atom,b},{x,2}]}}.
%% if 'a' exists but not 'b' {x,1} is overwritten, jump {f,7}
{move,{integer,1},{x,0}}.
{call_only,3,{f,10}}.
{label,7}.
{get_map_elements,{f,8},{x,1},{list,[{atom,b},{x,2}]}}.
%% {x,1} (src) is read with a corrupt value
{move,{x,0},{x,1}}.
{move,{integer,2},{x,0}}.
{call_only,3,{f,10}}.
The fix is to remove 'opt_moves' pass for get_map_elements instruction
in the case of two or more destinations.
Reported-by: Valery Tikhonov
|
|
|
|
|
|
|
|
|
|
|
|
In 45f469ca0890, the BEAM loader started to use x(1023) as scratch
register for some instructions. Therefore we should not allow
x(1023) to be used in code emitted by the compiler.
|
|
When translating guards to Core Erlang, it is sometimes necessary
to add an is_boolean/1 guard test. Here is an example when it is
necessary:
o(A, B) when A or B ->
ok.
That would be translated to something like:
o(A, B) when ((A =:= true) or (B =:= true)) and
is_boolean(A) and is_boolean(B) ->
ok.
The is_boolean/1 tests are necessary to ensure that the guard
fails for calls such as:
o(true, not_boolean)
However, because of a bug in v3_core, is_boolean/1 tests were
added when they were not necessary. Here is an example:
f(B) when not B -> ok.
That would be translated to:
f(B) when (B =:= false) and is_boolean(B) -> ok.
The following translation will work just as well.
f(B) when B =:= false -> ok.
Correct the bug to suppress those unnecessary is_boolean/1 tests.
|
|
We can rewrite more instances of select_val to is_boolean because
it is not necessary that a particular label follows the select_val.
|
|
Put 'try' instructions inside block to improve the optimization
of allocation instructions. Currently, the compiler only looks
at initialization of y registers inside blocks when determining
which y registers that will be "naturally" initialized.
|
|
Simplify further optimizations by moving safe instructions to
before the 'try' or 'catch' instruction.
|
|
When matching tuples, the pattern matching compiler would generate
code that would fetch all elements of the tuple that will ultimately
be used, *before* testing that (for example) the first element is the
correct record tag. For example:
is_tuple Fail {x,0}
test_arity Fail {x,0} 3
get_tuple_element {x,0} 0 {x,1}
get_tuple_element {x,0} 1 {x,2}
get_tuple_element {x,0} 2 {x,3}
is_eq_exact Fail {x,1} some_tag
If {x,2} and {x,3} are not used at label Fail, we can re-arrange the
code like this:
is_tuple Fail {x,0}
test_arity Fail {x,0} 3
get_tuple_element {x,0} 0 {x,1}
is_eq_exact Fail {x,1} some_tag
get_tuple_element {x,0} 1 {x,2}
get_tuple_element {x,0} 2 {x,3}
Doing that may be beneficial in two ways.
If the branch is taken, we have eliminated the execution of two
unnecessary instructions.
Even if the branch is never or rarely taken, there is the possibility
for more optimizations following the is_eq_exact instructions.
For example, imagine that the code looks like this:
get_tuple_element {x,0} 1 {x,2}
get_tuple_element {x,0} 2 {x,3}
move {x,2} {y,0}
move {x,3} {y,1}
Assuming that {x,2} and {x,3} have no further uses in the code
that follows, that can be rewritten to:
get_tuple_element {x,0} 1 {y,0}
get_tuple_element {x,0} 2 {y,1}
When should we perform this optimization?
At the very latest, it must be done before opt_blocks/1 in
beam_block which does the elimination of unnecessary moves.
Actually, we want do the optimization before the blocks have
been established, since moving instructions out of one block
into another is cumbersome.
Therefore, we will do the optimization in a new pass that is
run before beam_block. A new pass will make debugging easier,
and beam_block already has a fair number of sub passes.
|
|
|
|
Here is an example of a move instruction that could not be optimized
away because the {x,2} register was not killed:
get_tuple_element Reg Pos {x,2}
.
.
.
move {x,2} {y,0}
put_list {x,2} nil Any
We can do the optimization if we replace all occurrences of the {x,2}
register as a source with {y,0}:
get_tuple_element Reg Pos {y,0}
.
.
.
put_list {y,0} nil Dst
|
|
The 'move' optimization was relatively clean until GC BIFs
were introduced. Instead of re-thinking the implementation,
the existing code was fixed and patched.
The current code unsuccessfully attempts to eliminate 'move'
instructions across GC BIF and allocation instructions. We can
simplify the code if we give up as soon as we encounter any
instruction that allocates.
|
|
opt_alloc/1 makes a redundant call to opt/1. It is redundant because
the opt/1 function has already been applied to the instruction
sequence prior to calling opt_alloc/1.
|
|
|
|
|
|
Add the 'da' option to create a list after the beam_a pass. Seeing
how the code looks after beam_a, but before the blocks have been
established, is sometimes useful.
For symmetry, add the 'dz' option, even though it is just a synonym
for 'S'.
|
|
|
|
|
|
|
|
regressions_SUITE will have code snippets which previously crashed the compiler.
This commits includes a test for Maps crash in beam_bool.
|
|
Before beam_split the get_map_elements instruction is still in
blocks and the helper function in beam_jump did not reflect this.
Reported-by: Quviq twitter account
|
|
* bjorn/compiler/spurious-warning:
sys_core_fold: Eliminate warnings for unused terms in effect context
sys_core_fold: Eliminate warnings for unused terms
|
|
* bjorn/compiler/doc:
Update compiler documentation
(Sneaking in OTP-12769 here which is a release note for syntax_tools.
Sorry about that.)
|
|
Language cleaned up by the technical writer tmanevik from
Combitech. Proofreading and corrections by Björn Gustavsson.
|
|
|