Age | Commit message (Collapse) | Author |
|
https://bugs.erlang.org/browse/ERL-1013
|
|
The compiler would treat the "Unit" of bs_init instructions as
the unit of the result instead of the required unit of the input,
causing is_binary checks to be wrongly optimized away.
|
|
* bjorn/compiler/fix-unsafe-type-inference/OTP-15838:
Fix unsafe negative type inference
# Conflicts:
# lib/compiler/src/beam_ssa_type.erl
|
|
* john/compiler/list_append_type/OTP-15841:
compiler: Fix broken type for erlang:'++'/2
|
|
|
|
The type optimizer pass (`beam_ssa_type`) could make unsafe
negative inferences. That is, incorrectly infer that a variable
could *not* have a particular type.
This bug was found when adding another optimization. It is not
clear how write a failing test case without that added optimization.
|
|
The compiler would not terminate while compiling the following code:
foo(<<N:32>>, Tuple, NewValue) ->
_ = element(N, Tuple),
setelement(N, Tuple, NewValue).
The type analysis pass would attempt to construct a huge list when
attempting analyse the type of `Tuple` after the call to
`setelement/3`.
https://bugs.erlang.org/browse/ERL-948
|
|
Validation could fail when a function that never returned was used
in a try block (see attached test case). It's possible to solve
this without disabling the optimization as the generated code is
sound, but I'm not comfortable making such a large change this
close to the OTP 22 release.
|
|
|
|
* bjorn/compiler/use-lists-types:
beam_ssa_type: Use types from some 'lists' functions
|
|
This commit lets the compiler know about the return
type of some of the functions in the `lists` module.
Knowing about the return will allow the compiler to emit
fewer type test instructions, and also fewer instructions
for throwing `case_clause` or `badmatch` exceptions, thus
producing slightly faster and more compact code.
This change makes the `lists` module a part of the language, but it
could be argued that it already is because several functions
(e.g. `member/2` and `keymember/3`) are implemented in as BIFs in the
runtime system. Therefore, a user cannot simply change the
`lists` module and expect everything to continue working as before.
The compiler will now know the return types for the following
functions:
all/2
any/2
keymember/3
member/2
prefix/2
suffix/2
dropwhile/2
duplicate/2
filter/2
flatten/1
map/2
mapfoldl/3
mapfoldr/3
partition/2
reverse/1
sort/1
splitwith/1
takewhile/1
unzip/1
usort/1
zip/2
zipwith/3
|
|
|
|
|
|
Part of the switch optimization done by `ssa_opt_sw` can be better
done in `beam_ssa_type`.
|
|
|
|
Refactor optimization of `Bool =:= true` to make the code
somewhat simpler, as well as potentially apply the optimization
in more cases.
|
|
The expansion of record field updates, when more than one field is
updated, but not a majority of the fields, will create a sequence of
calls to `erlang:setelement(Index, Value, Tuple)` where Tuple in the
first call is the original record tuple, and in the subsequent calls
Tuple is the result of the previous call. Furthermore, all Index
values are constant positive integers, and the first call to
`setelement` will have the greatest index. Thus all the following
calls do not actually need to test at run-time whether Tuple has type
tuple, nor that the index is within the tuple bounds.
Since OTP R7, the `sys_core_dsetel` pass, run as the very last Core
Erlang pass, has optimized this sequence of `setelement` calls to use
a special destructive version of `setelement` (called
`set_tuple_element`) for all but the very first `setelement` in the
sequence.
It turns out that the presence of the `set_tuple_element` in SSA code
is awkward and can prevent or complicate type analysis and aggressive
optimizations.
Therefore, this commit removes the `sys_core_dsetel` pass and
reimplements it for SSA code. The optimization will be done in the
`beam_ssa_pre_codegen` pass (that is, just before code generation and
after running all other SSA code optimization passes).
In most cases, the resulting BEAM code is identical to previous
code. For a few modules, the BEAM code is actually slightly better,
with smaller stack frames.
|
|
Consider this pseudo code:
f(...) ->
Val = case Expr of
... ->
... ;
... ->
... ;
... ->
my_abort(something_went_wrong)
end,
%% Here follows code that uses Val.
.
.
.
my_abort(Reason) ->
throw({error,Reason}).
The first two clauses in the case will probably provide some
information about the type of the variable `Var`, information
that would be useful for optimizing the code that follows the
case.
However, the third clause would ruin everything. The call
to `my_abort/1` could return anything, and thus `Val` could
also have any type.
294d66a295f6 introduced module-level type analysis, which will in
general keep track of the return type of a local function
call. However, it does not improve the optimization for this specific
function. When a function never returns, that is, when its type is
`none`, it does not propagate the `none` type, but instead pretends
that the return type is `any`.
This commit extends the handling of functions that don't return to
properly handle the `none` type. Any instructions that directly
follows the function that does not return will be discarded, and the
call will be rewritten to a tail-recursive call.
For this specific example, it means that the type for `Val` deduced
from the first two clauses will be retained and can be used for
optimizing the code after the case.
|
|
Prior to 294d66a295f6c2101fe3c2da630979ad4e736c08 there wasn't much
point to keeping track of tuple element types; they were only known
when we had inserted or extracted values from a tuple, and in
neither case was it likely that we'd extract the same values again.
It makes a lot more sense to do so now that type optimizations are
applied across functions; if we return a tuple it's very likely
that its elements will be extracted soon after, and knowing their
types lets us eliminate more type checks.
Co-authored-by: Björn Gustavsson <[email protected]>
|
|
|
|
|
|
To improve compilation times, beam_ssa_type keeps track of variables
that are only used once and don't keep types for those variables. As
currently implemented, it turns to be unsafe. Change it to only keep
track of variables that are only used in the terminator of the block
they are defined in.
https://bugs.erlang.org/browse/ERL-840
|
|
This commit lets the type optimization pass work across functions,
tracking return and argument types to eliminate redundant tests.
|
|
A switch is equivalent to a series of '=:=', so we have to subtract
each value individually from the type. Subtracting a join risks
removing too much type information, and managed to narrow "number"
into "float" in the attached test case.
|
|
|
|
|
|
The beam_ssa_type pass would leave redundant 'succeeded' instructions,
and depend on the live optimization pass to remove them.
Update beam_ssa_type to remove redundant 'succeeded' instructions.
This will not improve the generated code, but will improve compilation
times since it eliminates instructions and variables.
|
|
Improve type optimizations
|
|
Introduce subtraction of types to allow some redundant tests to be
eliminated.
Consider this function:
foo(L) when is_list(L) ->
case L of
[_|_] -> non_empty;
[] -> empty
end.
After entering the body of the function, L is known to be either
a cons cell or nil (otherwise the is_list/1 guard would have failed).
If the L is not a cons cell, it must be nil. Therefore, the test
for nil in the second clause of the case can be eliminated.
Here is the SSA code with some additonal comments for the function
before the optimization:
function t:foo(_0) {
0:
@ssa_bool = bif:is_list _0
br @ssa_bool, label 4, label 3
4:
%% _0 is now a list (cons or nil).
@ssa_bool:8 = is_nonempty_list _0
br @ssa_bool:8, label 9, label 7
9:
ret literal non_empty
7:
%% _0 is not a cons (or we wouldn't be here).
%% Subtracting cons from the previously known type list
%% gives that _0 must be nil.
@ssa_bool:10 = bif:'=:=' _0, literal []
br @ssa_bool:10, label 11, label 6
11:
ret literal empty
6:
_6 = put_tuple literal case_clause, _0
%% t.erl:5
@ssa_ret:12 = call remote (literal erlang):(literal error)/1, _6
ret @ssa_ret:12
3:
_9 = put_list _0, literal []
%% t.erl:4
@ssa_ret:13 = call remote (literal erlang):(literal error)/2, literal function_clause, _9
ret @ssa_ret:13
}
Type subtraction gives us that _0 must be nil in block 7, allowing us to
remove the comparison of _0 with nil. The code for the function can be simplified
to:
function t:foo(_0) {
0:
@ssa_bool = bif:is_list _0
br @ssa_bool, label 4, label 3
4:
@ssa_bool:8 = is_nonempty_list _0
br @ssa_bool:8, label 9, label 11
9:
ret literal non_empty
11:
ret literal empty
3:
_9 = put_list _0, literal []
%% t.erl:4
@ssa_ret:13 = call remote (literal erlang):(literal error)/2, literal function_clause, _9
ret @ssa_ret:13
}
|
|
|
|
When an instruction has been eliminated, the 'succeeded' instruction must
be eliminated.
|
|
The type analysis pass (`beam_ssa_type`) keeps the type information
for all variables that are in scope. For huge functions, the
`join_types/2` function could get really slow when joining two
maps with thousands of variables in each.
Use a conservative approach to discard type information for
variables only used once by a `br` or `switch` in the same
block as the variable is defined in.
This approach reduces the runtime for the `beam_ssa_type` pass
from a few minutes down to few seconds for this module:
https://github.com/aggelgian/cuter/blob/master/src/cuter_binlib.erl
Rejected approach: Calculate liveness information for all variables
and discard type information for any variable that would not be
used again. That turned out to not work because some optimizations
would invalidate the liveness (in particular, substitutions could
lengthen the lifetime for a variable). Trying to update the
liveness information when doing the optimizations would be tricky.
Noticed-by: Kostis Sagonas
|
|
This commit improves the bit-syntax match optimization pass,
leveraging the new SSA intermediate format to perform much more
aggressive optimizations. Some highlights:
* Watch contexts can be reused even after being passed to a
function or being used in a try block.
* Sub-binaries are no longer eagerly extracted, making it far
easier to keep "happy paths" free from binary creation.
* Trivial wrapper functions no longer disable context reuse.
|
|
We chose to refer to variables through their var_name() because we
anticipated the need to annotate them, but it turned out we didn't
really need that, and many things become a lot cleaner if the
entire #b_var{} is used to represent variables.
|
|
In beam_ssa_type, do substitutions similar to what ssa_opt_misc does
to get rid of variables that evaluate to constant values. That
somewhat simplifies the code of beam_ssa_type, and could improve
performance of the compiler since instructions and variables are
eliminated, reducing the amount of work for later passes.
|
|
|
|
Remove the following clause from the `fun` clauses in arith_op_types/2
because it cannot possibly match:
(_, any) -> number;
Here is why it cannot match:
The second argument is the accumulator for lists:foldl/3.
Its initial value is `unknown`.
None of the clauses will update the accumulator to `any`, including
the clause that matches `any` in the first argument -- it will set the
accumulator to `number`.
Thus, the accumulator (second argument) can never be `any` and
the clause can never match. QED.
|
|
|
|
|
|
|
|
v3_codegen is replaced by three new passes:
* beam_kernel_to_ssa which translates the Kernel Erlang format
to a new SSA-based intermediate format.
* beam_ssa_pre_codegen which prepares the SSA-based format
for code generation, including register allocation. Registers
are allocated using the linear scan algorithm.
* beam_ssa_codegen which generates BEAM assembly code from the
SSA-based format.
It easier and more effective to optimize the SSA-based format before X
and Y registers have been assigned. The current optimization passes
constantly have to make sure no "holes" in the X register assignments
are created (that is, that no X register becomes undefined that an
allocation instruction depends on).
This commit also introduces the following optimizations:
* Replacing of tuple matching of records with the is_tagged_tuple
instruction. (Replacing beam_record.)
* Sinking of get_tuple_element instructions to just before the first
use of the extracted values. As well as potentially avoiding
extracting tuple elements when they are not actually used on all
executions paths, this optimization could also reduce the number
values that will need to be stored in Y registers. (Similar to
beam_reorder, but more effective.)
* Live optimizations, removing the definition of a variable that is
not subsequently used (provided that the operation has no side
effects), as well strength reduction of binary matching by replacing
the extraction of value from a binary with a skip instruction. (Used
to be done by beam_block, beam_utils, and v3_codegen.)
* Removal of redundant bs_restore2 instructions. (Formerly done
by beam_bs.)
* Type-based optimizations across branches. More effective than
the old beam_type pass that only did type-based optimizations in
basic blocks.
* Optimization of floating point instructions. (Formerly done
by beam_type.)
* Optimization of receive statements to introduce recv_mark and
recv_set instructions. More effective with far fewer restrictions
on what instructions are allowed between creating the reference
and entering the receive statement.
* Common subexpression elimination. (Formerly done by beam_block.)
|