Age | Commit message (Collapse) | Author |
|
Commit b76588fb5a introduced an optimization of the compile time of
huge functions with many bs_match_string instructions. The
optimization is done in two passes. The first pass coalesces adjacent
bs_match_string instructions. To avoid copying bitstrings multiple
times, the bitstrings in the instructions are combined in to a (deep)
list. The second pass goes through all instructions in the function
and combines the list of bitstrings to a single bitstring in all
bs_match_string instructions.
The second pass (fix_bs_match_string) is run on all instructions in
each function, even if there are no bs_match_instructions in the
function. While fix_bs_match_string is not a bottleneck (it is a
linear pass), its execution time is noticeable when profiling some
modules.
Move the execution of the second pass to the select_binary()
function so that it will only be executed for instructions that
do binary matching. Also take the opportunity to optimize away
uses of bs_restore2 that occour directly after a bs_save2. That
optimimization is currently done in beam_block, but it can be
done essentially for free in the same pass that fixes up
bs_match_string instructions.
|
|
Profiling shows that the execution time for "turning" y registers
is noticeable for some modules (e.g. S1AP-PDU-Contents from the
asn1 test suite). We can reduce the impact on running time by
special-casing important instructions. In particular, there is
no need to look for y registers in the list argument for a
select_val instruction.
|
|
It is no longer necessary to sort the keys, since the loader
does the sorting.
|
|
For unclear reasons, there are two functions in v3_life that are
almost identical: literal/2 and literal2/2. literal/2 is used
for expressions and literal2/2 for patterns.
It turns out that literal2/2 can do everything that literal/2 can
do, except that it transforms maps differently.
If we adjust v3_codegen to accept the same format of maps in
expressions and patterns, we can rename literal2/2 to literal/2
and use it for expressions and patterns.
|
|
v3_codegen puts the compilation in the process dictionary, but
never uses them.
|
|
The put_map_assoc and put_map_exact instructions in the run-time
system will support that the target register is the same as one of
the source registers. Teach the code generator to take advantage
of that.
The disadvantages of not reusing register when possible is that the
garbage collector may retain dead terms longer than necessary.
|
|
Need heap for maps is zero and fall through is also zero.
|
|
|
|
Removed dead code.
|
|
|
|
Multiple 'nil' cannot happen on instruction level.
Map keys are always unique with literals.
|
|
Literal nil values aren't tagged tuple but the bare atom nil.
The function lists:sort/2 expects the passed function to return true if the first
element is less than or equal to the second, not strictly less than. The original
base clause is changed accordingly.
Reported-by: Ulf Norell
|
|
* Combine multiple get values with one instruction
* Combine multiple check keys with one instruction
|
|
This fixes an error on multiple updates optimization for map pairs.
The error was introduced with moving to term order in Maps.
This also fixes an error where register life time was lost for values
and could result in erroneuos values being emitted in for map pairs.
Simplified v3_codegen by moving multiple update optimizations to v3_kernel.
|
|
|
|
|
|
|
|
All pairs in a Map needs to be in strict ascending key order.
|
|
|
|
Can now handle {list [reg()]} elements in instructions.
|
|
To make it possible to build the entire OTP system, also define
dummys for the instructions in ops.tab.
|
|
Reported-by: Stanislav Seletskiy
|
|
* maint:
Fix compiler crash for binary matching and a complicated guard
|
|
The compiler would crash when attempting to compile a function
head that did binary matching and had a complex expression using
'andalso' and 'not'.
Noticed-by: José Valim
|
|
In modules with huge functions with many bs_match_string
instructions, we can speed up the compilation by combining
adjacent bs_match_strings instruction in v3_codegen (as opposed
to in beam_block where we used to do it).
For instance, on my computer the v3_codegen became more than
twice as fast when compiling the re_testoutput1_split_test module
in the STDLIB test suites.
|
|
Add some comments to make the code generation for binary
construction somewhat clearer. Also get rid of the cg_bo_newregs/2
function that serves no useful purpose. (It was probably
intended to undo the effect of cg_live/2, but note that the
value passed to cg_bo_newregs/2 has not passed through cg_live/2.)
|
|
|
|
|
|
|
|
In Core Erlang and later passes, compiler-generated code can be
indicated in two different ways: by negative line numbers and by
a 'compiler_generated' annotation.
Simplify the code and improve coverage by turning negative line
numbers positive and adding a 'compiler_generated' annotation in
the v3_core pass. That means that Core Erlang and latter passes
do not have deal with negative line numbers.
|
|
If we let v3_kernel make sure that a 'put' operation always has a
destination register, the special case in v3_codegen is not needed.
|
|
In the v3_life pass, it is assumed that a 'match_fail' primop
only occur at the top-level and at the end of a function.
But this code:
do_split_cases(A) ->
case A of
x ->
Z = dummy1;
_ ->
Z = dummy2,
a=b
end,
Z.
will be optimized by sys_core_fold to the following code:
'split_cases'/1 =
fun (_cor0) ->
let <_cor7,Z> =
case _cor0 of
<'x'> when 'true' ->
< 'dummy1','dummy1' >
<_cor6> when 'true' ->
%% Here follows a 'match_fail' primop inside
%% multiple return values:
< primop 'match_fail'({'badmatch','b'}),'dummy2' >
end
in
Z
moving the 'match_fail' primop into a "values" construction.
In the future, we would like to get rid of the v3_life pass (it is
there for historical reasons), so in the mean-time we prefer to not
add more code to it by generalizing the handling of 'match_fail'.
Since the 'match_fail' primop can be simulated by erlang:error/{1,2},
the simplest solution is to translate 'match_fail' to a call to
erlang:error/{1,2} in v3_kernel and remove the handling of
'match_fail' in v3_life and v3_codegen.
It is tempting to get rid of 'match_fail' also in the Core Erlang
format, but there are two issues:
- Removing the support for 'match_fail' completely may break tools
that generate Core Erlang code. We should not do that in a minor
release.
- There is no easy way to generate a 'function_clause' exception
that will remain correct if it will be inlined into another
function. (Calling "erlang:error(function_clause, Args)" is
fine only if it is not inlined into another function.) A good
solution probably involves introducing new instructions, which
is better done in a major release.
Noticed-by: Håkan Matsson
Minimized-test-case-by: Erik Søe Sørensen
|
|
|
|
|
|
By accident a previous instance of St is used, which is
harmless in this case, but leads to worse quality of
the generated code.
|
|
Code such as
foo(A) -> <<A:0>>.
would cause a compiler crash.
|
|
The inliner can cause illegal uses of the bs_context_to_binary/1
instruction, such as:
bs_context_to_binary {literal,"string"}
or
bs_context_to_binary {integer,1}
Remove the bs_context_to_binary/1 instruction if the operand
is not a register (it is clearly not needed).
|
|
* bg/compiler-remove-r11-support:
compiler: Don't support the no_binaries option
erts: Don't support the put_string/3 instruction
compiler: Don't support the no_constant_pool option
compiler: Don't support the r11 option
test_server: Don't support communication with R11 nodes
binary_SUITE: Don't test bit-level binary roundtrips with R11 nodes
erts: Test compatibility of funs with R12 instead of R11
OTP-8531 bg/compiler-remove-r11-support
|
|
The no_constant_pool option was implied by the r11 option. It turns
off the usage of the constant (literal) pool, so that BEAM
instructions that use constants can be loaded in an R11 system.
Since the r11 option has been removed, there is no need to
retain the no_constant_pool option.
|
|
|