Age | Commit message (Collapse) | Author |
|
The expansion of record field updates, when more than one field is
updated, but not a majority of the fields, will create a sequence of
calls to `erlang:setelement(Index, Value, Tuple)` where Tuple in the
first call is the original record tuple, and in the subsequent calls
Tuple is the result of the previous call. Furthermore, all Index
values are constant positive integers, and the first call to
`setelement` will have the greatest index. Thus all the following
calls do not actually need to test at run-time whether Tuple has type
tuple, nor that the index is within the tuple bounds.
Since OTP R7, the `sys_core_dsetel` pass, run as the very last Core
Erlang pass, has optimized this sequence of `setelement` calls to use
a special destructive version of `setelement` (called
`set_tuple_element`) for all but the very first `setelement` in the
sequence.
It turns out that the presence of the `set_tuple_element` in SSA code
is awkward and can prevent or complicate type analysis and aggressive
optimizations.
Therefore, this commit removes the `sys_core_dsetel` pass and
reimplements it for SSA code. The optimization will be done in the
`beam_ssa_pre_codegen` pass (that is, just before code generation and
after running all other SSA code optimization passes).
In most cases, the resulting BEAM code is identical to previous
code. For a few modules, the BEAM code is actually slightly better,
with smaller stack frames.
|
|
This test case does not test anything unique that is not
tested by other test cases.
|
|
|
|
|
|
With the new SSA code passes, the optimized_guard/1 test case has
become really bad at finding unnecessary `and` and `or` instructions.
|
|
This commit lets the type optimization pass work across functions,
tracking return and argument types to eliminate redundant tests.
|
|
Do the optimizations of bs_put* instructions in beam_ssa_opt
and remove the beam_bs pass. This can lead to a slight improvement
of compilation times.
|
|
Share code for semantically equivalent blocks referred to to by `br`
and `switch` instructions.
A similar optimization is done in `beam_jump`, but doing it here as
well is beneficial as it may enable other optimizations. Also, if
there are many semantically equivalent clauses, this optimization can
substanstially decrease compilation times.
|
|
Add the `diffable` option to produce a more diff-friendly output in
a .S file. In a diffable .S file, label numbers starts from 1 in each
function and local function calls use function names instead of labels.
scripts/diffable produces diff-friendly files but only for a hard-coded
sub set of files in OTP. Having the option directly in the compiler
makes is easier to diff the BEAM code for arbitrary source modules.
|
|
* maint:
compiler: Forward +source flag to epp and fix bug in +deterministic
epp: Allow user to set source name independently of input file name
|
|
The source file path as given to `erlc` was included in an implicit
file attribute inserted by epp, even when the +source flag was
set to something else which was a bit surprising. It was also
included when +deterministic was specified, breaking the flag's
promise.
This commit forwards the +source flag to epp so it inserts the
right information, and if +deterministic is given it will be shaved
to just the base name of the file, guaranteeing the same result
regardless of how the input is reached.
|
|
This commit improves the bit-syntax match optimization pass,
leveraging the new SSA intermediate format to perform much more
aggressive optimizations. Some highlights:
* Watch contexts can be reused even after being passed to a
function or being used in a try block.
* Sub-binaries are no longer eagerly extracted, making it far
easier to keep "happy paths" free from binary creation.
* Trivial wrapper functions no longer disable context reuse.
|
|
Most of the optimizations in beam_dead have been superseded
by the optimizations in beam_ssa_dead.
The forward/1 pass of beam_dead has been moved to beam_jump.
The beam_split pass splits blocks that contain instructions with
non-zero labels. Because there are no optimizations left that optimize
instructions within blocks, beam_block never needs to put such
instructions into blocks in the first place. beam_split also moved
'move' instructions out block to help beam_dead. That is no longer
necessary since beam_dead no longer exists.
|
|
|
|
Sometimes when building a tuple, there is no way to avoid an
extra `move` instruction. Consider this code:
make_tuple(A) -> {ok,A}.
The corresponding BEAM code looks like this:
{test_heap,3,1}.
{put_tuple,2,{x,1}}.
{put,{atom,ok}}.
{put,{x,0}}.
{move,{x,1},{x,0}}.
return.
To avoid overwriting the source register `{x,0}`, a `move`
instruction is necessary.
The problem doesn't exist when building a list:
%% build_list(A) -> [A].
{test_heap,2,1}.
{put_list,{x,0},nil,{x,0}}.
return.
Introduce a new `put_tuple2` instruction that builds a tuple in a
single instruction, so that the `move` instruction can be eliminated:
%% make_tuple(A) -> {ok,A}.
{test_heap,3,1}.
{put_tuple2,{x,0},{list,[{atom,ok},{x,0}]}}.
return.
Note that the BEAM loader already combines `put_tuple` and `put`
instructions into an internal instruction similar to `put_tuple2`.
Therefore the introduction of the new instruction will not speed up
execution of tuple building itself, but it will be less work for
the loader to load the new instruction.
|
|
v3_codegen is replaced by three new passes:
* beam_kernel_to_ssa which translates the Kernel Erlang format
to a new SSA-based intermediate format.
* beam_ssa_pre_codegen which prepares the SSA-based format
for code generation, including register allocation. Registers
are allocated using the linear scan algorithm.
* beam_ssa_codegen which generates BEAM assembly code from the
SSA-based format.
It easier and more effective to optimize the SSA-based format before X
and Y registers have been assigned. The current optimization passes
constantly have to make sure no "holes" in the X register assignments
are created (that is, that no X register becomes undefined that an
allocation instruction depends on).
This commit also introduces the following optimizations:
* Replacing of tuple matching of records with the is_tagged_tuple
instruction. (Replacing beam_record.)
* Sinking of get_tuple_element instructions to just before the first
use of the extracted values. As well as potentially avoiding
extracting tuple elements when they are not actually used on all
executions paths, this optimization could also reduce the number
values that will need to be stored in Y registers. (Similar to
beam_reorder, but more effective.)
* Live optimizations, removing the definition of a variable that is
not subsequently used (provided that the operation has no side
effects), as well strength reduction of binary matching by replacing
the extraction of value from a binary with a skip instruction. (Used
to be done by beam_block, beam_utils, and v3_codegen.)
* Removal of redundant bs_restore2 instructions. (Formerly done
by beam_bs.)
* Type-based optimizations across branches. More effective than
the old beam_type pass that only did type-based optimizations in
basic blocks.
* Optimization of floating point instructions. (Formerly done
by beam_type.)
* Optimization of receive statements to introduce recv_mark and
recv_set instructions. More effective with far fewer restrictions
on what instructions are allowed between creating the reference
and entering the receive statement.
* Common subexpression elimination. (Formerly done by beam_block.)
|
|
As a preparation for replacing v3_codegen with a new code generator,
remove unsafe optimization passes. Especially the older compiler
passes have implicit assumptions about how the code is generated.
Remove the optimizations in beam_block (keep the code that creates
blocks) because they are unsafe. beam_block also calls
beam_utils:live_opt/1, which is unsafe.
Remove beam_type because it calls beam_utils:live_opt/1, and also
because it recalculates the number of heaps words and number of live
registers in allocation instructions, thus potentially hiding bugs in
other passes.
Remove beam_receive because it is unsafe.
Remove beam_record because it is the only remaining user
of beam_utils:anno_defs/1.
Remove beam_reorder because it makes much more sense to run it
as an early SSA-based optimization pass.
Remove the now unused functions in beam_utils:
anno_def/1
delete_annos/1
is_killed_block/2
live_opt/1
usage/3
Note that the following test cases will fail because of the
removed optimizations:
compile_SUITE:optimized_guards/1
compile_SUITE:bc_options/1
receive_SUITE:ref_opt/1
|
|
'john/compiler/fix-deterministic-include-paths/OTP-15204/ERL-679' into maint
* john/compiler/fix-deterministic-include-paths/OTP-15204/ERL-679:
Omit include path debug info for +deterministic builds
|
|
Compiling the same file with different include paths resulted in
different files with the `+deterministic` flag even if everything
but the paths were identical. This was caused by the absolute path
of each include directory being unconditionally included in a
debug information chunk.
This commit fixes this by only including this information in
non-deterministic builds.
|
|
Call test_lib:recompile/1 from init_per_suite/1 instead of
from all/0. That makes it easy to find the log from the
compilation in the log file for the init_per_suite/1 test
case.
|
|
|
|
Keys in map patterns are input variables, not pattern variables.
|
|
Instructions that produce more than one result complicate
optimizations. get_list/3 is one of two instructions that
produce multiple results (get_map_elements/3 is the other).
Introduce the get_hd/2 and get_tl/2 instructions
that return the head and tail of a cons cell, respectively,
and use it internally in all optimization passes.
For efficiency, we still want to use get_list/3 if both
head and tail are used, so we will translate matching pairs
of get_hd and get_tl back to get_list instructions.
|
|
|
|
Conflicts:
lib/compiler/src/beam_listing.erl
|
|
* lukas/compiler/add_to_dis/OTP-14784:
compiler: Add +to_dis option that dumps loaded asm
|
|
|
|
* maint:
Recognize 'deterministic' when given in a -compile() attribute
Conflicts:
lib/compiler/src/beam_asm.erl
|
|
* bjorn/cuddle-with-tests:
Skip compile_SUITE:pre_load_check/1 when code is native-compiled
|
|
The compiler option 'deterministic' was only recognized when given
as an option to the compiler, not when it was specified in a
-compile() attribute in the source file.
https://bugs.erlang.org/browse/ERL-498
|
|
The v3_life pass does not do enough to be worth being its own
pass. Essentially it does two things:
* Calculates life-time information starting from the annotations
that v3_kernel provides. That part can be moved into v3_codegen.
* Rewrites the Kernel Erlang records to similar plain tuples
(for example, #k_cons{hd=Hd,tl=Tl} is rewritten to {cons,Hd,Tl}).
That rewriting is not needed and can be eliminated.
|
|
Tracing won't work on modules that are native-compiled.
|
|
This allows compilers built on top of the compile module
to attach external compilation metadata to the compile_info
chunk.
For example, Erlang uses this chunk to store the compiler
version. Elixir and LFE may augment this by also adding
their own compiler versions, which can be useful when
debugging.
The deterministic option does not affect the user supplied
compile_info. It is therefore the responsibility of external
compilers to guarantee any added information does not violate
the determinsitic option, if such option is supported.
Finally, this code moves the building of the compile_info
options to the compile module instead of beam_asm, moving
all of the option mangling code to a single place.
|
|
Tuple calls is the ability to invoke a function on a tuple
as first argument:
1> Var = dict:new().
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}
2> Var:size().
0
This behaviour is considered by most to be undesired and confusing,
especially when it comes to errors. For example, imagine you invoke
"Mod:new()" where a Mod is an atom and you accidentally pass {ok, dict}.
It raises:
{undef,[{ok,new,[{ok,dict}],[]},...]}
As it attempts to invoke ok:new/1, which is really hard to debug
as there is no call to new/1 on the source code.
Furthemore, this behaviour is implemented at the VM level, which
imposes such semantics on all languages running on BEAM.
Since we cannot remove the behaviour above, this proposal makes the
behaviour opt-in with a compiler flag:
-compile(tuple_calls).
This means that, if a codebase relies on this functionality, they
can keep compatibility by adding configuring their build tool to
always use the 'tuple_calls' flag or explicitly on each module.
As long as the compile attribute above is listed, the codebase will
work on old and new Erlang versions alike. The only downside of the
current implementation is that modules compiled on OTP 20 that rely
on 'tuple_calls' will have to be recompiled to run with 'tuple_calls'
on OTP 21+.
|
|
As part of sys_core_fold, variables involved in bit syntax
matching would be annotated when it would be safe for a later
pass to do the delayed sub-binary creation optimization.
An implicit assumption regarding the annotation was that the
code must not be further optimized. That assumption was broken
in 05130e48555891, which introduced a fixpoint iteration
(applying the optimizations until there were no more changes).
That means that a variable could be annotated as safe for
reusing the match context in one iteration, but a later iteration
could rewrite the code in a way that would make the optimization
unsafe.
One way to fix this would be to clear all reuse_for_context
annotations before each iteration. But that would be wasteful.
Instead I chose to fix the problem by moving out the annotation
code to a separate pass (sys_core_bsm) that is run later after
all major optimizations of Core Erlang has been done.
|
|
compile:forms/1,2 is documented to return:
{ok,ModuleName,BinaryOrCode}
However, if one of the options 'from_core', 'from_asm', or
'from_beam' is given, ModuleName will be returned as [].
A worse problem is that is that if one those options are
combined with the 'native' option, compilation will crash.
Correct compile:forms/1,2 to pick up the module name from
the forms provided (either Core Erlang, Beam assembly code,
or a Beam file).
Reported here: https://bugs.erlang.org/browse/ERL-417
|
|
A core dump can't be created with the name 'core' if there is
a directory named 'core'.
Rename the test case from core/1 to core_pp/1 so that the directory
name can be the same as the test case name. It also makes sense
to use a less generic name for the test case.
|
|
Also test other options that turns off certain optimizations or
instruction sets.
|
|
Add a test for utf8 function names
|
|
The test found a bug in v3_kernel_pp which was not
taking into account utf8 atoms. The bug has also
been fixed.
|
|
Remove unused variable warning in compile_SUITE
|
|
|
|
|
|
The new Dbgi chunk returns data in the following format:
{debug_info_v1, Backend, Data}
This allows compilers to store the debug info in different
formats. In order to retrieve a particular format, for
instance, Erlang Abstract Format, one may invoke:
Backend:debug_info(erlang_v1, Module, Data, Opts)
Besides introducing the chunk above, this commit also:
* Changes beam_lib:chunk(Beam, [:abstract_code]) to
read from the new Dbgi chunk while keeping backwards
compatibility with old .beams
* Adds the {debug_info, {Backend, Data}} option to
compile:file/2 and friends that are stored in the
Dbgi chunk. This allows the debug info encryption
mechanism to work across compilers
* Improves dialyzer to work directly on Core Erlang,
allowing languages that do not have the Erlang
Abstract Format to be dialyzer as long as they emit
the new chunk and their backend implementation is
available
Backwards compatibility is kept across the board except
for those calling beam_lib:chunk(Beam, ["Abst"]), as the
old chunk is no longer available. Note however the "Abst"
chunk has always been optional.
Future OTP versions may remove parsing the "Abst" chunk
altogether from beam_lib once Erlang 19 and earlier is no
longer supported.
The current Dialyzer implementation still supports earlier
.beam files and such may also be removed in future versions.
|
|
The test is supposed to compare the Core Erlang code that has been
printed and parsed back. It did compare and print any differences,
but it did not fail when there were differences.
Also fix problems with variable names and maps not comparing
equal when the inliner has been used.
|
|
This allow languages such as Elixir and LFE to attach
extra chunks to the .beam file without having to parse
the beam file after compilation.
This commit also cleans up the interface to beam_asm,
allowing chunks to be passed from the compiler without
a need to change beam_asm API on every new chunk.
|
|
The new chunk stores atoms encoded in UTF-8.
beam_lib has also been modified to handle the new
'utf8_atoms' attribute while the 'atoms' attribute
may be a missing chunk from now on.
The binary_to_atom/2 BIF can now encode any utf8
binary with up to 255 characters.
The list_to_atom/1 BIF can now accept codepoints
higher than 255 with up to 255 characters (thanks
to Björn Gustavsson).
|
|
Add the option 'deterministic' to make it easier to
achieve reproducible builds.
This option omits the {options,...} and {source,...} tuples in
M:module_info(compile), because those options may contain absolute
paths.
The author of ERL-310 suggested that only compiler options that
may contain absolute paths (such as {i,...}) should be excluded. But I
find it confusing to keep only some options.
Alternatives considered: Always omitting this information. Since this
information has been available for a long time, that would probably
break some workflows. As an example that some people care about
{source,...}, 2d785c07fbf9 made it possible to give a compiler option
to set {source,...}.
ERL-310
|
|
Guards should use the more efficient 'test' instructions, not 'bif'
instructions. Add a test to make sure that the optimizations don't
degrade.
We do have to keep an exception list for functions where we can't
replace all 'bif' calls with 'test' instructions. We try to keep
that list a short as practically possible.
|
|
Ensure that correct (not necessarily optimal) code is generated for
Core Erlang code not originating from v3_core.
|