Age | Commit message (Collapse) | Author |
|
|
|
Part of the switch optimization done by `ssa_opt_sw` can be better
done in `beam_ssa_type`.
|
|
|
|
Refactor optimization of `Bool =:= true` to make the code
somewhat simpler, as well as potentially apply the optimization
in more cases.
|
|
Always rewriting left-associative andalso/orelse to right-associative
will not change the code (except in very rare cases), but it will
make sure that the transformation is tested.
|
|
The expansion of record field updates, when more than one field is
updated, but not a majority of the fields, will create a sequence of
calls to `erlang:setelement(Index, Value, Tuple)` where Tuple in the
first call is the original record tuple, and in the subsequent calls
Tuple is the result of the previous call. Furthermore, all Index
values are constant positive integers, and the first call to
`setelement` will have the greatest index. Thus all the following
calls do not actually need to test at run-time whether Tuple has type
tuple, nor that the index is within the tuple bounds.
Since OTP R7, the `sys_core_dsetel` pass, run as the very last Core
Erlang pass, has optimized this sequence of `setelement` calls to use
a special destructive version of `setelement` (called
`set_tuple_element`) for all but the very first `setelement` in the
sequence.
It turns out that the presence of the `set_tuple_element` in SSA code
is awkward and can prevent or complicate type analysis and aggressive
optimizations.
Therefore, this commit removes the `sys_core_dsetel` pass and
reimplements it for SSA code. The optimization will be done in the
`beam_ssa_pre_codegen` pass (that is, just before code generation and
after running all other SSA code optimization passes).
In most cases, the resulting BEAM code is identical to previous
code. For a few modules, the BEAM code is actually slightly better,
with smaller stack frames.
|
|
* john/compiler/cuddle-validator:
compiler: Allow disabling SSA passes in -compile() directives
beam_validator: Infer types from result of all type test BIFs
beam_validator: Infer BIF argument types
beam_validator: type_test in BIFs that only fail on invalid types
beam_validator: Simplify complex branches
beam_validator: Explain why verify_get_map wipes dst registers
beam_validator: fconv means we have a number
beam_validator: Simplify update_ne/eq_types
beam_validator: Remove pointless fragility propagation
beam_validator: Misc cosmetic refactoring
|
|
This fixes compiling against older OTP versions with the +r?VSN
options, which often expand to disabling certain SSA passes to
avoid new instructions.
|
|
The compiler will usually optimize these into test instructions,
but they'll nevertheless pop up in some cases.
|
|
If we know that a BIF will badarg unless its arguments have certain
types, we can infer that we have at least those types on success.
Note that we can't do this in the general case as the BIF could
fail for reasons other than bad arguments.
|
|
The sole purpose of inline_SUITE:coverage/1 is to ensure
that all lines are covered in sys_core_inline. Do that
in a cheaper way.
|
|
This test case does not test anything unique that is not
tested by other test cases.
|
|
This makes sure that the SSA optimizations are not essential and
may help to cover more code in beam_ssa_pre_codegen and
beam_ssa_codegen.
|
|
|
|
eb0b8da6e816 started to use a binary instead of a string in
bs_match_string instructions.
Remove a clause that attempts to handle the old form of
bs_match_string from old .S files. This is pointless, because an old
.S file is likely to contain a bs_context_to_binary instruction as
well. The bs_context_to_binary is no longer recognized by
`beam_validator`, so those old .S files will not work anyway.
|
|
|
|
|
|
With the recent update to cover to use the `counters` modules,
there is no longer any reason to limit the number of parallel
processes when running `cover`.
|
|
* bjorn/compiler/beam_ssa_opt:
Make sure that beam_ssa_opt optimizes all functions
|
|
* maint:
Add persistent_term:get(Key, DefaultValue)
Make dialyzer faster for left-associative andalso/orelse expressions
|
|
The function `get_call_order_po/2` did not always include all
functions in a module. The method it used was to first find
all leave functions (functions not calling any other function),
and then find all others functions that called any of the leave
functions either directly or indirectly.
Functions that did not call a leave function (directly or indirectly)
would not be included in the list of functions to optimize.
Reimplement `get_call_order_po/2` to use the standard algorithm
for constructing a list of nodes in reverse postorder and then
reverse that list.
|
|
* bjorn/compiler/improve-move-elimination:
beam_jump: Improve elimination of redundant moves
|
|
https://bugs.erlang.org/browse/ERL-851
|
|
|
|
This change saves about about a minute when running the
cover-compiled compiler tests.
|
|
beam_ssa_type: Propagate the 'none' type from calls
|
|
Consider this pseudo code:
f(...) ->
Val = case Expr of
... ->
... ;
... ->
... ;
... ->
my_abort(something_went_wrong)
end,
%% Here follows code that uses Val.
.
.
.
my_abort(Reason) ->
throw({error,Reason}).
The first two clauses in the case will probably provide some
information about the type of the variable `Var`, information
that would be useful for optimizing the code that follows the
case.
However, the third clause would ruin everything. The call
to `my_abort/1` could return anything, and thus `Val` could
also have any type.
294d66a295f6 introduced module-level type analysis, which will in
general keep track of the return type of a local function
call. However, it does not improve the optimization for this specific
function. When a function never returns, that is, when its type is
`none`, it does not propagate the `none` type, but instead pretends
that the return type is `any`.
This commit extends the handling of functions that don't return to
properly handle the `none` type. Any instructions that directly
follows the function that does not return will be discarded, and the
call will be rewritten to a tail-recursive call.
For this specific example, it means that the type for `Val` deduced
from the first two clauses will be retained and can be used for
optimizing the code after the case.
|
|
|
|
Branches where the state is altered on both success and failure
are hard to follow and require juggling the current state, so this
commit adds a helper function to make it easier.
|
|
|
|
|
|
|
|
set_aliased_type already preserves fragility when updating the type
of a fragile register.
|
|
|
|
Optimize ssa_opt_sink for huge functions
|
|
The ssa_opt_sink optimization of beam_ssa_opt could get very slow
for certain huge functions. 9a190cae9bd7 partly addressed this issue
by terminating the optimization early if there happened to be no
get_tuple_element instructions at all in the function.
This commit addresses the issue more directly by making the dominator
calculation in beam_ssa:dominators/1 more efficient. The same
algorithm as before is used, but it is implemented in a more efficient
way based on the ideas in "A Simple, Fast Dominance Algorithm"
(http://www.hipersoft.rice.edu/grads/publications/dom14.pdf).
As well as being more efficient, the new implementation also gives
an explicit representation of the dominator tree, which makes it
possible to simplify and optimize the ssa_opt_sink optimization.
|
|
Prior to 294d66a295f6c2101fe3c2da630979ad4e736c08 there wasn't much
point to keeping track of tuple element types; they were only known
when we had inserted or extracted values from a tuple, and in
neither case was it likely that we'd extract the same values again.
It makes a lot more sense to do so now that type optimizations are
applied across functions; if we return a tuple it's very likely
that its elements will be extracted soon after, and knowing their
types lets us eliminate more type checks.
Co-authored-by: Björn Gustavsson <[email protected]>
|
|
|
|
|
|
|
|
Reduce compilation times
|
|
A long time ago there was a good idea to run compiled code in a
slave node. Nowadays, not so much.
|
|
test_lib:is_cloned_mod(inline_SUITE) would return true.
|
|
There could be a warning with a `no_file` atom instead of filename
and line number.
|
|
|
|
With the new SSA code passes, the optimized_guard/1 test case has
become really bad at finding unnecessary `and` and `or` instructions.
|
|
Two helper functions in beam_ssa_opt and beam_ssa_dead
are body-recursive for no good reason.
While at it, add some clarifying comments to the functions.
|
|
beam_ssa:def_used/2 is used by beam_ssa_pre_codegen when reserving
Y registers.
Do the following optimizations:
* Use an ordset instead of a gb_set. When the only operation performed
on a set is union/2, an ordset will usually be faster, especially when
the result is an ordset.
* Use a cerl_set instead of a gb_set for the set of all possible
predecessors. cerl_sets is usually faster than gb_sets.
|
|
Avoiding calls usually reduces the size of the stack frame and reduces
register shuffling.
|
|
|