Age | Commit message (Collapse) | Author |
|
If a type only has one clause and if the pattern is literal,
the matching can be done more efficiently by directly comparing
with the literal.
Example:
find(String, "") -> String;
find(String, <<>>) -> String;
find(String, SearchPattern) ->
.
.
.
Without this optimization, the relevant part of the code would look
this:
{test,bs_start_match2,{f,3},2,[{x,1},0],{x,2}}.
{test,bs_test_tail2,{f,4},[{x,2},0]}.
return.
{label,3}.
{test,is_nil,{f,4},[{x,1}]}.
return.
{label,4}.
.
.
.
That is, if {x,1} is a binary, a match context will be built to
test whether {x,1} is an empty binary.
With the optimization, the code will look this:
{test,is_eq_exact,{f,3},[{x,1},{literal,<<>>}]}.
return.
{label,3}.
{test,is_nil,{f,4},[{x,1}]}.
return.
{label,4}.
.
.
.
|
|
Rewrite a catch expression like this:
catch side_effect(),
...
to:
try
side_effect()
catch
_:_ ->
ok
end,
...
A try/catch is more efficient since no stack trace will be built
when an exception occurs.
|
|
Improve the receive optimization to be able to handle code
such as this:
Ref = make_ref(),
try
side_effect()
catch _:_ ->
ok
end,
receive
%% Clauses that all match Ref.
.
.
.
end
|
|
If a try/catch is used to ignore the potential exception caused by some
expression with a side effect such as:
try
side_effect()
catch _Class:_Reason ->
ok
end,
.
.
.
the generated code will be wasteful:
try YReg TryLabel
%% call side_effect() here
try_end Yreg
jump GoodLabel
TryLabel:
try_case YReg
%% try_case has set up registers as follows:
%% x(0) -> error | exit | throw
%% x(1) -> reason
%% x(2) -> raw stack trace data
GoodLabel:
%% This code does not use any x registers.
There will be two code paths that both end up at GoodLabel, and
the try_case instruction will set up registers that are never used.
We can do better by replacing try_case with try_end to obtain this
code:
try YReg TryLabel
%% call side_effect() here
try_end Yreg
jump GoodLabel
TryLabel:
try_end YReg
GoodLabel:
The jump optimizer (beam_jump) can further optimize this code
like this:
try YReg TryLabel
%% call side_effect() here
TryLabel:
try_end YReg
|
|
Optimise size calculation for binary construction
OTP-14654
|
|
* maint:
Fix incorrect internal consistency failure for binary matching code
|
|
4c31fd0b9665 made the merging of match contexts stricter;
in fact, a little bit too strict.
Two match contexts with different number of slots would
be downgraded to the 'term' type. The correct way is to
keep the match context but set the number of slots to the
lowest number of slots of the two match contexts.
https://bugs.erlang.org/browse/ERL-490
|
|
* siri/string-new-api: (28 commits)
hipe (test): Do not use deprecated functions in string(3)
dialyzer (test): Do not use deprecated functions in string(3)
eunit (test): Do not use deprecated functions in string(3)
system (test): Do not use deprecated functions in string(3)
system (test): Do not use deprecated functions in string(3)
mnesia (test): Do not use deprecated functions in string(3)
Deprecate old string functions
observer: Do not use deprecated functions in string(3)
common_test: Do not use deprecated functions in string(3)
eldap: Do not use deprecated functions in string(3)
et: Do not use deprecated functions in string(3)
os_mon: Do not use deprecated functions in string(3)
debugger: Do not use deprecated functions in string(3)
runtime_tools: Do not use deprecated functions in string(3)
asn1: Do not use deprecated functions in string(3)
compiler: Do not use deprecated functions in string(3)
sasl: Do not use deprecated functions in string(3)
reltool: Do not use deprecated functions in string(3)
kernel: Do not use deprecated functions in string(3)
hipe: Do not use deprecated functions in string(3)
...
Conflicts:
lib/eunit/src/eunit_lib.erl
lib/observer/src/crashdump_viewer.erl
lib/reltool/src/reltool_target.erl
|
|
|
|
Add compile_info option to compile
OTP-14615
|
|
This allows compilers built on top of the compile module
to attach external compilation metadata to the compile_info
chunk.
For example, Erlang uses this chunk to store the compiler
version. Elixir and LFE may augment this by also adding
their own compiler versions, which can be useful when
debugging.
The deterministic option does not affect the user supplied
compile_info. It is therefore the responsibility of external
compilers to guarantee any added information does not violate
the determinsitic option, if such option is supported.
Finally, this code moves the building of the compile_info
options to the compile module instead of beam_asm, moving
all of the option mangling code to a single place.
|
|
Optimise equality comparisons
|
|
It turns out it was extremely common to have a following sequence:
move 0 D1
byte_size _ D2
bs_add D1 D2 D
Which is equivalent to just:
byte_size _ D
Similarly another sequence:
move S D1
byte_size _ D2
bs_add D1 D2 D
Can be optimised into:
byte_size _ D2
bs_add S D2 D
Both of those optimisations work with byte_size and bit_size instructions.
|
|
* In both loader and compiler, make sure constants are always the second
operand - many passes of the compiler assume that's always the case.
* In loader rewrite is_eq_exact with same arguments to skip the instruction
and with different constants move one to an x register to maintain
the properly outlined above.
* The same (but in reverse) is done with the is_ne_exact, where we rewrite
to an unconditional jump or add a move to an x register.
* All of the above allow to replace is_eq_exact_fss with is_eq_exact_fyy and
is_ne_exact_fss with is_ne_exact_fSS as those are the only possibilities left.
|
|
The compiler could sometimes emit unnecessary 'move'
instructions in the code for binary matching, for
example for this function:
escape(<<Byte, Rest/bits>>, Pos) when Byte >= 127 ->
escape(Rest, Pos + 1);
escape(<<Byte, Rest/bits>>, Pos) ->
escape(Rest, Pos + Byte);
escape(<<_Rest/bits>>, Pos) ->
Pos.
The generated code would look like this:
{function, escape, 2, 2}.
{label,1}.
{line,[{location,"t.erl",17}]}.
{func_info,{atom,t},{atom,escape},2}.
{label,2}.
{test,bs_start_match2,{f,1},2,[{x,0},0],{x,0}}.
{test,bs_get_integer2,
{f,4},
2,
[{x,0},
{integer,8},
1,
{field_flags,[{anno,[17,{file,"t.erl"}]},unsigned,big]}],
{x,2}}.
{'%',{bin_opt,[17,{file,"t.erl"}]}}.
{move,{x,0},{x,3}}. %% UNECESSARY!
{test,is_ge,{f,3},[{x,2},{integer,127}]}.
{line,[{location,"t.erl",18}]}.
{gc_bif,'+',{f,0},4,[{x,1},{integer,1}],{x,1}}.
{move,{x,3},{x,0}}. %% UNECESSARY!
{call_only,2,{f,2}}.
{label,3}.
{line,[{location,"t.erl",20}]}.
{gc_bif,'+',{f,0},4,[{x,1},{x,2}],{x,1}}.
{move,{x,3},{x,0}}. %% UNECESSARY!
{call_only,2,{f,2}}.
{label,4}.
{move,{x,1},{x,0}}.
return.
The redundant 'move' instructions have been marked.
To avoid the 'move' instructions, we can extend the existing
function is_context_unused/1 in v3_codegen. If v3_codegen can
know that the match context will not be used again, it can reuse
the register for the match context and avoid the extra 'move'
instructions.
https://bugs.erlang.org/browse/ERL-444
|
|
* maint:
Make handling of match contexts stricter
|
|
Enhance optimisations in beam_peep
|
|
beam_validator could fail issue a diagnostic when a register
that was supposed to be a match context was not guaranteed to
be a match context.
The bug was in merging of types. Merging of a match context with
another term would result in a match context. That is wrong. Merging
should produce a more general type, not a narrower type. Also, the
valid slots in two match contexts should be combined with 'band', not
'bor'.
|
|
When cleaning selects, it might happen we're left with only one pair. In such
case convert to a regular test + jump.
|
|
|
|
'john/compiler/fail-labels-in-blocks-otp-18/ERIERL-48/OTP-14522' into maint
* john/compiler/fail-labels-in-blocks-otp-18/ERIERL-48/OTP-14522:
compiler: Fix live regs update on allocate in validator
Take fail labels into account when determining liveness in block ops
Conflicts:
lib/compiler/src/beam_utils.erl
|
|
The state without pruned registers was passed on to test_heap
causing the validator to belive registers that aren't live
actually are live.
|
|
bjorng/bjorn/compiler/improve-case-opt/ERL-452/OTP-14525
Generalize optimization of "one-armed" cases
|
|
Even though, it's not possible to have fall-throughs when entering the otp
pass, it can produce them itself and we're running the pass until fixpoint.
|
|
This makes other optimisations more efficient since we have less labels overall.
|
|
It can happen we have the following situation:
{test,is_tuple,Fail,[R1]}
{test,test_arity,Fail,[R1,N1]}
{get_tuple_element,R1,N2,R2}
{test,is_eq_exaqct,Fail,[R2,Atom]}
{jump,Fail}
Previously, the optimisation would eliminate the last is_eq_exact test, but
we can do more. If the register R2 is not used in Fail, we can eliminate the
get_tuple_element instruction as well as all the preceding tests. Ultimately,
the whole sequence can be replaced by:
{jump,Fail}
|
|
This is especially useful after inlining a function with a case.
Today the compiler would most probably be able to unify all the leafs of the
case during the sharing optimisation, but it would fail to unify the pattern
matching itself.
Naively running the optimisation multiple times wouldn't be able to find the
common code either, because it would differ in jump/fail targets of various
instructions.
To remedy this, after doing each sharing pass we traverse the code backwards
when reversing and update all the jump targets with the new targets that were
discovered during the unification pass. This allows running the optimisation
until fixpoint and makes sure all sharing opportunities will be discovered.
This optimisation also helps with the Elixir's `with/else` construct.
|
|
|
|
* maint:
sys_core_fold: Fix unsafe optimization of non-variable apply
Correct type specification in ssl:prf/5
|
|
A 'case' expression will force a stack frame (essentially in the same
way as a function call), unless it is at the end of a function.
In sys_core_fold there is an optimization that can optimize one-armed
cases such as:
case Expr of
Pat1 ->
DoSomething;
Pat2 ->
erlang:error(bad)
end,
MoreCode.
Because only one arm of the 'case' can succeed, the code after the
case can be move into the successful arm:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad)
end.
Thus, the 'case' is at the end of the function and it will no longer
need a stack frame.
However, the optimization in sys_core_fold would not be applied if
there were more than one failing clause such as in this code:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad);
_ ->
erlang:error(case_clause)
end.
Generalize the optimization to handle any number of failing
clauses at the end of the case.
Reported-by: bugs.erlang.org/browse/ERL-452
|
|
|
|
The sys_core_fold pass would do an unsafe "optimization" when an
apply operation did not have a variable in the function position
as in the following example:
> cat test1.core
module 'test1' ['test1'/2]
attributes []
'i'/1 =
fun (_f) -> _f
'test1'/2 =
fun (_f, _x) ->
apply apply 'i'/1 (_f) (_x)
end
> erlc test1.core
no_file: Warning: invalid function call
Reported-by: Mikael Pettersson
|
|
Introduce a new core pass called sys_core_alias
OTP-14505
|
|
The goal of this pass is to find values that are built from
patterns and generate aliases for those values to remove
pressure from the GC. For example, this code:
example({ok, Val}) ->
{ok, Val}.
shall become:
example({ok, Val} = Tuple) ->
Tuple.
Currently this pass aliases tuple and cons nodes made of literals,
variables and other cons. The tuple/cons may appear anywhere in the
pattern and it will be aliased if used later on.
Notice a tuple/cons made only of literals is not aliased as it may
be part of the literal pool.
|
|
Tuple calls is the ability to invoke a function on a tuple
as first argument:
1> Var = dict:new().
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}
2> Var:size().
0
This behaviour is considered by most to be undesired and confusing,
especially when it comes to errors. For example, imagine you invoke
"Mod:new()" where a Mod is an atom and you accidentally pass {ok, dict}.
It raises:
{undef,[{ok,new,[{ok,dict}],[]},...]}
As it attempts to invoke ok:new/1, which is really hard to debug
as there is no call to new/1 on the source code.
Furthemore, this behaviour is implemented at the VM level, which
imposes such semantics on all languages running on BEAM.
Since we cannot remove the behaviour above, this proposal makes the
behaviour opt-in with a compiler flag:
-compile(tuple_calls).
This means that, if a codebase relies on this functionality, they
can keep compatibility by adding configuring their build tool to
always use the 'tuple_calls' flag or explicitly on each module.
As long as the compile attribute above is listed, the codebase will
work on old and new Erlang versions alike. The only downside of the
current implementation is that modules compiled on OTP 20 that rely
on 'tuple_calls' will have to be recompiled to run with 'tuple_calls'
on OTP 21+.
|
|
|
|
|
|
All keys in an orddict must be unique. sys_core_fold:sub_sub_scope/1
broke that rule. It was probably harmless, but it is better to
avoid such rule violations.
|
|
* hasse/unicode_atoms/OTP-14285:
compiler: Handle (bad) Unicode parse transform module names
kernel: Improve handling of Unicode filenames
stdlib: Handle Unicode atoms in ms_transform
stdlib: Improve Unicode handling of the Erlang parser
stdlib: Handle unknown compiler options with Unicode
stdlib: Handle Unicode macro names
stdlib: Correct Unicode handling in escript
dialyzer: Improve handling of Unicode
parsetools: Improve handling of Unicode atoms
stdlib: Handle Unicode atoms when formatting stacktraces
stdlib: Add more checks of module names to the linter
stdlib: Handle Unicode atoms better in io_lib_format
stdlib: Handle Unicode atoms in c.erl
|
|
|
|
As part of sys_core_fold, variables involved in bit syntax
matching would be annotated when it would be safe for a later
pass to do the delayed sub-binary creation optimization.
An implicit assumption regarding the annotation was that the
code must not be further optimized. That assumption was broken
in 05130e48555891, which introduced a fixpoint iteration
(applying the optimizations until there were no more changes).
That means that a variable could be annotated as safe for
reusing the match context in one iteration, but a later iteration
could rewrite the code in a way that would make the optimization
unsafe.
One way to fix this would be to clear all reuse_for_context
annotations before each iteration. But that would be wasteful.
Instead I chose to fix the problem by moving out the annotation
code to a separate pass (sys_core_bsm) that is run later after
all major optimizations of Core Erlang has been done.
|
|
compile:forms/1,2 is documented to return:
{ok,ModuleName,BinaryOrCode}
However, if one of the options 'from_core', 'from_asm', or
'from_beam' is given, ModuleName will be returned as [].
A worse problem is that is that if one those options are
combined with the 'native' option, compilation will crash.
Correct compile:forms/1,2 to pick up the module name from
the forms provided (either Core Erlang, Beam assembly code,
or a Beam file).
Reported here: https://bugs.erlang.org/browse/ERL-417
|
|
Make it clear that is_tagged_tuple/4 was added in OTP 20 (not R17).
|
|
Functions that can are known be pure can be evaluated at
compile-time if the arguments are literals and if the result is
expressible as a literal.
list_to_ref/1 and list_to_port/1 returns terms that cannot be
expressed as literals, so the optimization is not possible.
The argument for port_to_list/1 is never a literal, so there is
no way to evaluate it at compile-time. Therefore, marking those
functions as pure serves no useful purpose.
Note: list_to_pid/1 *is* marked as pure, but only so that we can test
the code in sys_core_fold that rejects pure functions that evaluate to
at term that is not possible to express as a literal. It is sufficient
to have one pure function of that kind.
|
|
erlang:hash/2 was removed in c5d9b970fb5b3a71.
|
|
Add a test for utf8 function names
|
|
The test found a bug in v3_kernel_pp which was not
taking into account utf8 atoms. The bug has also
been fixed.
|
|
The undocumented compiler option 'slim' is used when compiling
the primary bootstrap. The purpose is to make the bootstrap smaller
and to avoid unnecessary churn in the git repository. That is,
the BEAM file should be different only if the actual code in the
file is different, and not if it has merely been re-compiled on
a different computer.
Two commits have fattened the 'slim' option. In 36f7087ae0f,
extra chunks are included even in slim BEAM files. In dfb899c0229f7,
the "Dbgi" were added as an extra chunk, causing it to be included
in slim files.
Make 'slim' slim again by only including the essential chunks and
the attribute chunk (as was the case before the {extra,...} option
was added).
|
|
|
|
Introduce new "Dbgi" chunk
OTP-14369
|