Age | Commit message (Collapse) | Author |
|
Move size=all binary clause pruning to v3_kernel
|
|
Tune BEAM instructions for the new compiler (part 1)
|
|
Optimize the beam_ssa_dead sub pass
|
|
The advantage of moving it up is that it reduces the
size of the code emitted by v3_kernel, speeding
v3_kernel itself and beam_kernel_to_ssa pass.
|
|
Prior to this patch, v3_kernel would do multiple
passes on the clauses to group them. This commit
unrolls those passes, making v3_kernel up to 10%
faster in those cases.
|
|
|
|
This is cleaner and slightly faster.
|
|
The general complexity of the shortcut sub pass of `beam_ssa_dead` is
quadratic, but those optimizations will reduce the constant factor
somewhat.
|
|
Refactor the code to avoid putting any variable from a skippable block
into the set of unset variables. Keeping the set of unset variables as
small as possible will make beam_ssa_dead almost twice as fast when
compiling lib/unicode/tokenizer.ex in elixir.
|
|
Consider this code:
foo(X) ->
case X of
{ok,A} -> A;
error -> X
end.
The `is_tagged_tuple` instruction would not be used
because not all instructions in the tuple matching
sequence had the same failure label:
function t:foo(_0) {
0:
@ssa_bool:7 = bif:is_tuple _0
br @ssa_bool:7, label 8, label 4
8:
@ssa_arity = bif:tuple_size _0
@ssa_bool:9 = bif:'=:=' @ssa_arity, literal 2
br @ssa_bool:9, label 6, label 3
6:
_4 = get_tuple_element _0, literal 0
@ssa_bool = bif:'=:=' _4, literal ok
br @ssa_bool, label 5, label 3
5:
_3 = get_tuple_element _0, literal 1
ret _3
4:
@ssa_bool:11 = bif:'=:=' _0, literal error
br @ssa_bool:11, label 10, label 3
10:
ret _0
3:
_2 = put_tuple literal case_clause, _0
%% t.erl:5
@ssa_ret:12 = call remote (literal erlang):(literal error)/1, _2
ret @ssa_ret:12
}
Enhance the ssa_opt_record optimization to use
`is_tagged_tuple` even if all failure labels are not the
same:
function t:foo(_0) {
0:
@ssa_bool:7 = bif:is_tuple _0
br @ssa_bool:7, label 8, label 4
8:
@ssa_bool:9 = is_tagged_tuple _0, literal 2, literal ok
br @ssa_bool:9, label 6, label 3
6:
_3 = get_tuple_element _0, literal 1
ret _3
4:
@ssa_bool:11 = bif:'=:=' _0, literal error
br @ssa_bool:11, label 10, label 3
10:
ret _0
3:
_2 = put_tuple literal case_clause, _0
%% t.erl:5
@ssa_ret:12 = call remote (literal erlang):(literal error)/1, _2
ret @ssa_ret:12
}
The tuple test will be repeated, but since four instructions
are replaced by two instructions, the code will still be faster
and smaller.
|
|
|
|
|
|
We used to cheat by checking if it were possible to meet the Given
and Required types, which caught the most common problems but
potentially let tuple element conflicts pass through.
This was a compromise to let the thing "work" while we were
refactoring the validator, but we can be a lot stricter now that
its type tracking capabilities approach those of the type
optimization pass.
|
|
Building terms with fragile contents is okay because the GC is
disabled during loop_rec, and the resulting term won't be reachable
from the root set afterwards.
ERL-862
|
|
|
|
This is possible now that we track types on a per-value basis, and
no longer need to care whether the source tuple's register has been
clobbered by the time we infer the type.
|
|
|
|
This is a rather subtle but important distinction. While tracking
types on a per-register basis is fairly effective, it forces us to
track which registers alias each other, and makes it tricky to infer
types over large blocks of code as instruction arguments may have
been clobbered between definition and inference.
Tracking types on a per-value basis makes us immune to these
problems.
|
|
|
|
I have no idea how this escaped us for so long...
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Consider the following code:
bme(Int) ->
TagInt = Int band 2#111,
Tag = case TagInt of
0 -> a; 1 -> b; 2 -> c; 3 -> d;
4 -> e; 5 -> f; 6 -> g; 7 -> h
end,
case Tag of
g -> expects_g(TagInt, Tag);
h -> expects_h(TagInt, Tag);
_ -> Tag = id(Tag), ok
end.
expects_g(6, Atom) -> Atom = id(g), ok.
expects_h(7, Atom) -> Atom = id(h), ok.
The type optimization pass would recognize that TagInt can only be
[0 .. 7], so the first 'case' would select_val over [0 .. 6] and swap
out the fail label with the block for 7.
A later optimization would merge this block with 'expects_h' in the
second case, as the latter is only reachable from the former.
... but this broke down when the move elimination optimization didn't
take the fail label of the first select_val into account. This caused it
believe that the only way to reach 'expects_h' was through the second
case when 'Tag' =:= 'h', which made it remove the move instruction
added in the first case, passing garbage to expects_h/2.
|
|
This should be slightly more efficient than converting to/from
lists for large sets.
|
|
* bjorn/compiler/use-lists-types:
beam_ssa_type: Use types from some 'lists' functions
|
|
This commit lets the compiler know about the return
type of some of the functions in the `lists` module.
Knowing about the return will allow the compiler to emit
fewer type test instructions, and also fewer instructions
for throwing `case_clause` or `badmatch` exceptions, thus
producing slightly faster and more compact code.
This change makes the `lists` module a part of the language, but it
could be argued that it already is because several functions
(e.g. `member/2` and `keymember/3`) are implemented in as BIFs in the
runtime system. Therefore, a user cannot simply change the
`lists` module and expect everything to continue working as before.
The compiler will now know the return types for the following
functions:
all/2
any/2
keymember/3
member/2
prefix/2
suffix/2
dropwhile/2
duplicate/2
filter/2
flatten/1
map/2
mapfoldl/3
mapfoldr/3
partition/2
reverse/1
sort/1
splitwith/1
takewhile/1
unzip/1
usort/1
zip/2
zipwith/3
|
|
`sys_core_fold` has an optimization of repeated pattern matching.
For example, when a record is matched the first time, the pattern
is remembered. When the same record is matched again, the matching
does not need to be repeated, but the variables bound in the first
matching can be re-used.
It turns out that that there is a name capture problem when the old
inliner is used. The old inliner is used when explicitly inling
certain functions, and by the compiler test suites for testing the
compiler.
The name capture problem could be eliminated by more aggressive
variable renaming when inlining.
But, fortunately, given the new SSA passes, this optimization is no
longer as essential as it used to be. Removing the optimization
turns out to be mostly benefical, leading to a smaller stack
frame in many cases.
Also remove the optimizations of `element/2`, `is_record/3`, and
`setelement/3` from `sys_core_fold`. Because matched patterns are no
longer remembered, those optimizations can very rarely be applied any
more. (Those same optimizations are already done in `beam_ssa_type`.)
|
|
|
|
For some reason, a `get_tuple_element` instruction was not deemed
suitble for local common sub expression elimination.
It turns out that enabling CSE for `get_tuple_element` is benefical.
It will also be even more benefical in a future commit where some
of the optimizations in `sys_core_fold` are removed.
|
|
* john/compiler/more-validator-cuddling:
beam_validator: Refactor call argument validation
beam_validator: Refactor liveness/stack initialization checks
beam_validator: Refactor try/catch handling
beam_validator: Remember definitions on assignment
beam_validator: Refactor stack trimming
beam_validator: Track definitions of all terms
beam_validator: Remove special handling of map_get/is_map_key
beam_validator: Refactor select_tuple_arity
beam_validator: Treat select_val as a series of '=:='
beam_validator: Treat all bs_get instructions as extractions
beam_validator: Separate BIF/call types more clearly
beam_validator: Assert that no tuple elements are out of bounds
beam_validator: Get rid of the last uses of set_aliased_type
beam_validator: Minor cosmetic refactoring
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Neither can be used for type subtraction, so the default BIF
handler suits them just fine.
|
|
|
|
|
|
While this is strictly only relevant for bs_get_binary2, we should
never build anything while matching a message, so it ought to be
safe to remove this last raw use of propagate_fragility.
|
|
|
|
|
|
Granted, it's replaced with a thin wrapper, but it'll simplify
migration to the new type format.
|