Age | Commit message (Collapse) | Author |
|
Most of the optimizations in beam_dead have been superseded
by the optimizations in beam_ssa_dead.
The forward/1 pass of beam_dead has been moved to beam_jump.
The beam_split pass splits blocks that contain instructions with
non-zero labels. Because there are no optimizations left that optimize
instructions within blocks, beam_block never needs to put such
instructions into blocks in the first place. beam_split also moved
'move' instructions out block to help beam_dead. That is no longer
necessary since beam_dead no longer exists.
|
|
v3_codegen is replaced by three new passes:
* beam_kernel_to_ssa which translates the Kernel Erlang format
to a new SSA-based intermediate format.
* beam_ssa_pre_codegen which prepares the SSA-based format
for code generation, including register allocation. Registers
are allocated using the linear scan algorithm.
* beam_ssa_codegen which generates BEAM assembly code from the
SSA-based format.
It easier and more effective to optimize the SSA-based format before X
and Y registers have been assigned. The current optimization passes
constantly have to make sure no "holes" in the X register assignments
are created (that is, that no X register becomes undefined that an
allocation instruction depends on).
This commit also introduces the following optimizations:
* Replacing of tuple matching of records with the is_tagged_tuple
instruction. (Replacing beam_record.)
* Sinking of get_tuple_element instructions to just before the first
use of the extracted values. As well as potentially avoiding
extracting tuple elements when they are not actually used on all
executions paths, this optimization could also reduce the number
values that will need to be stored in Y registers. (Similar to
beam_reorder, but more effective.)
* Live optimizations, removing the definition of a variable that is
not subsequently used (provided that the operation has no side
effects), as well strength reduction of binary matching by replacing
the extraction of value from a binary with a skip instruction. (Used
to be done by beam_block, beam_utils, and v3_codegen.)
* Removal of redundant bs_restore2 instructions. (Formerly done
by beam_bs.)
* Type-based optimizations across branches. More effective than
the old beam_type pass that only did type-based optimizations in
basic blocks.
* Optimization of floating point instructions. (Formerly done
by beam_type.)
* Optimization of receive statements to introduce recv_mark and
recv_set instructions. More effective with far fewer restrictions
on what instructions are allowed between creating the reference
and entering the receive statement.
* Common subexpression elimination. (Formerly done by beam_block.)
|
|
|
|
1a029efd1ad47f started to run the beam_block pass a second time,
but it did not attempt to combine adjacent blocks.
Combining adjacent blocks leads to many more opportunities for
optimizations.
After doing some diffing in generated code, it turns out that
there is no benefit for beam_split to split out line instructions
from blocks. It seems that the only reason it was done was to
slightly simplify the implementation of the no_line_info option
in beam_clean.
|
|
|
|
c2035ebb8b restricted the get_map_elements instruction so that it
could only occur at the beginning of a block. It turns out that
including it anywhere in a block is unsafe.
Therefore, never put get_map_elements instruction in blocks.
(Also remove the beam_utils:join_even/2 function since it is no
longer used.)
ERL-266
|
|
* henrik/update-copyrightyear:
update copyright-year
|
|
Remove the unreachable instructions after a 'raise' instruction
(e.g. a 'jump' or 'deallocate', 'return') to decrease code size.
|
|
|
|
Put 'try' instructions inside block to improve the optimization
of allocation instructions. Currently, the compiler only looks
at initialization of y registers inside blocks when determining
which y registers that will be "naturally" initialized.
|
|
|
|
* beam_utils:joineven/1 -> beam_utils:join_even/1
* beam_utils:split_even/1 -> beam_utils:split_even/1
|
|
No need to check for fail label zero for get_map_elements in beam_split.
get_map_elements is always used in pattern matching and never in a body.
|
|
|
|
* Combine multiple get values with one instruction
* Combine multiple check keys with one instruction
|
|
|
|
To make it possible to build the entire OTP system, also define
dummys for the instructions in ops.tab.
|
|
To facilitate debugging of compiler bugs, teach the compiler the
'no_dead' option. Since the beam_dead pass used to do the necessary
splitting of basic blocks to expose all labels, we must move that
splitting into a separate pass that is always run.
|