Age | Commit message (Collapse) | Author |
|
|
|
Add beam_ssa_dead to perform the main optimizations done
by beam_dead:
* Shortcut branches that jump to another block with a branch.
If it can be seen that the second branch will always branch
to a specific block, replace the target of the first branch.
* Combined nested sequences of '=:=' tests and switch
instructions operating on the same variable to a single switch.
Diffing the compiler output, it seems that beam_ssa_dead finds
many more opportunities for optimizations than beam_dead,
although it does not find all opportunities that beam_dead does.
In total, beam_ssa_dead is such improvement over beam_dead that
there is no reason to keep beam_dead as well as beam_ssa_dead.
Note that beam_ssa_dead does not attempt to optimize away
redundant bs_context_binary instructions, because that instruction
will be superseded by new instructions in the near future.
|
|
v3_codegen is replaced by three new passes:
* beam_kernel_to_ssa which translates the Kernel Erlang format
to a new SSA-based intermediate format.
* beam_ssa_pre_codegen which prepares the SSA-based format
for code generation, including register allocation. Registers
are allocated using the linear scan algorithm.
* beam_ssa_codegen which generates BEAM assembly code from the
SSA-based format.
It easier and more effective to optimize the SSA-based format before X
and Y registers have been assigned. The current optimization passes
constantly have to make sure no "holes" in the X register assignments
are created (that is, that no X register becomes undefined that an
allocation instruction depends on).
This commit also introduces the following optimizations:
* Replacing of tuple matching of records with the is_tagged_tuple
instruction. (Replacing beam_record.)
* Sinking of get_tuple_element instructions to just before the first
use of the extracted values. As well as potentially avoiding
extracting tuple elements when they are not actually used on all
executions paths, this optimization could also reduce the number
values that will need to be stored in Y registers. (Similar to
beam_reorder, but more effective.)
* Live optimizations, removing the definition of a variable that is
not subsequently used (provided that the operation has no side
effects), as well strength reduction of binary matching by replacing
the extraction of value from a binary with a skip instruction. (Used
to be done by beam_block, beam_utils, and v3_codegen.)
* Removal of redundant bs_restore2 instructions. (Formerly done
by beam_bs.)
* Type-based optimizations across branches. More effective than
the old beam_type pass that only did type-based optimizations in
basic blocks.
* Optimization of floating point instructions. (Formerly done
by beam_type.)
* Optimization of receive statements to introduce recv_mark and
recv_set instructions. More effective with far fewer restrictions
on what instructions are allowed between creating the reference
and entering the receive statement.
* Common subexpression elimination. (Formerly done by beam_block.)
|
|
* maint:
Fix side-effect optimization when compiling from Core Erlang
Conflicts:
lib/compiler/src/sys_core_fold.erl
|
|
When an expression is only used for its side effects, we try to
remove everything that doesn't tie into a side-effect, but we
went a bit too far when we applied the optimization to funs
defined in such a context. Consider the following:
do letrec 'f'/0 = fun () -> ... whatever ...
in call 'side':'effect'(apply 'f'/0())
'ok'
When f/0 is optimized under the assumption that its return value
is unused, side:effect/1 will be fed the result of the last
side-effecting expression in f/0 instead of its actual result.
https://bugs.erlang.org/browse/ERL-658
Co-authored-by: Björn Gustavsson <[email protected]>
|
|
* maint:
Call test_lib:recompile/1 from init_per_suite/1
beam_debug: Fix printing of floating point registers
|
|
Call test_lib:recompile/1 from init_per_suite/1 instead of
from all/0. That makes it easy to find the log from the
compilation in the log file for the init_per_suite/1 test
case.
|
|
Fold is_function/1,2 during compilation
|
|
|
|
This can often appear in code after inlining some higher-order
functions.
* mark is_function/2 as pure
* track function types in sys_core_fold
* use those types to eval is_function/1,2 at compile-time when possible
|
|
sys_core_fold would crash when attempting to optimize this code:
t() when (#{})#{}->
c.
|
|
Consider a 'case' that exports variables and whose return
value is ignored:
foo(N) ->
case N of
1 ->
Res = one;
2 ->
Res = two
end,
{ok,Res}.
That code will be translated to the following Core Erlang code:
'foo'/1 =
fun (_@c0) ->
let <_@c5,Res> =
case _@c0 of
<1> when 'true' ->
<'one','one'>
<2> when 'true' ->
<'two','two'>
<_@c3> when 'true' ->
primop 'match_fail'({'case_clause',_@c3})
end
in
{'ok',Res}
The exported variables has been rewritten to explicit return
values. Note that the original return value from the 'case' is bound to
the variable _@c5, which is unused.
The corresponding BEAM assembly code looks like this:
{function, foo, 1, 2}.
{label,1}.
{line,[...]}.
{func_info,{atom,t},{atom,foo},1}.
{label,2}.
{test,is_integer,{f,6},[{x,0}]}.
{select_val,{x,0},{f,6},{list,[{integer,2},{f,3},{integer,1},{f,4}]}}.
{label,3}.
{move,{atom,two},{x,1}}.
{move,{atom,two},{x,0}}.
{jump,{f,5}}.
{label,4}.
{move,{atom,one},{x,1}}.
{move,{atom,one},{x,0}}.
{label,5}.
{test_heap,3,2}.
{put_tuple,2,{x,0}}.
{put,{atom,ok}}.
{put,{x,1}}.
return.
{label,6}.
{line,[...]}.
{case_end,{x,0}}.
Because of the test_heap instruction following label 5, the assignment
to {x,0} cannot be optimized away by the passes that optimize BEAM assembly
code.
Refactor the optimizations of 'let' in sys_core_fold to eliminate the
unused variable. Thus:
'foo'/1 =
fun (_@c0) ->
let <Res> =
case _@c0 of
<1> when 'true' ->
'one'
<2> when 'true' ->
'two'
<_@c3> when 'true' ->
primop 'match_fail'({'case_clause',_@c3})
end
in
{'ok',Res}
The resulting BEAM code will look like:
{function, foo, 1, 2}.
{label,1}.
{line,[...]}.
{func_info,{atom,t},{atom,foo},1}.
{label,2}.
{test,is_integer,{f,6},[{x,0}]}.
{select_val,{x,0},{f,6},{list,[{integer,2},{f,3},{integer,1},{f,4}]}}.
{label,3}.
{move,{atom,two},{x,0}}.
{jump,{f,5}}.
{label,4}.
{move,{atom,one},{x,0}}.
{label,5}.
{test_heap,3,1}.
{put_tuple,2,{x,1}}.
{put,{atom,ok}}.
{put,{x,0}}.
{move,{x,1},{x,0}}.
return.
{label,6}.
{line,[...]}.
{case_end,{x,0}}.
|
|
A 'case' expression will force a stack frame (essentially in the same
way as a function call), unless it is at the end of a function.
In sys_core_fold there is an optimization that can optimize one-armed
cases such as:
case Expr of
Pat1 ->
DoSomething;
Pat2 ->
erlang:error(bad)
end,
MoreCode.
Because only one arm of the 'case' can succeed, the code after the
case can be move into the successful arm:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad)
end.
Thus, the 'case' is at the end of the function and it will no longer
need a stack frame.
However, the optimization in sys_core_fold would not be applied if
there were more than one failing clause such as in this code:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad);
_ ->
erlang:error(case_clause)
end.
Generalize the optimization to handle any number of failing
clauses at the end of the case.
Reported-by: bugs.erlang.org/browse/ERL-452
|
|
|
|
Moving a fun into a guard may cause code that is not accepted
by beam_validator.
|
|
Add more filename/line number annotations while translating to
Core Erlang in v3_core, and ensure that sys_core_fold retains
existing annotations. The goal is to avoid that sys_core_fold
generate warnings with "no_file" instead of a filename.
|
|
|
|
|
|
?config is ugly and not recommended. Use proplists:get_value/2
instead.
|
|
As a first step to removing the test_server application as
as its own separate application, change the inclusion of
test_server.hrl to an inclusion of ct.hrl and remove the
inclusion of test_server_line.hrl.
|
|
|
|
For tidiness, always place .core files in data directories.
|
|
The .core or .S files that are compiled in the test cases
may lack module_info/0,1 functions, which will cause problems if
we (for example) try to run eprof later. To avoid that problem,
unload each module directly after testing it.
|
|
The 'try' ... 'catch' is problematic. Firstly, if no optimization
is possible, an exception will always be thrown. Secondly, bugs
in the code will go unnoticed.
|
|
|
|
|
|
Introduce access functions to hide the low-level details of how
type information is implemented.
|
|
* bjorn/compiler/dup-bug-fix/OTP-12453:
Teach case_opt/3 to avoid unnecessary building
sys_core_fold: Optimize let statements more aggressively
Suppress warnings for expressions that are assigned to '_'
trace_bif_SUITE: Ensure that a call to time/0 is not removed
|
|
* maint:
Update primary bootstrap
Correct unsafe optimization of '==' and '/='
Conflicts:
bootstrap/lib/compiler/ebin/sys_core_fold.beam
|
|
Since '=:=' is cheaper than '==', the compiler tries to replace
'==' with '=:=' if the result of comparison will be the same.
As an example:
V == {a,b}
can be rewritten to:
V =:= {a,b}
since the literal on the right side contains no numeric values
that '==' would compare differently to '=:='.
With the introduction of maps, we will need to take them into
account. Since the comparison of maps is planned to change in 18.0,
we will be very conservative and only do the optimization if
both keys and values are non-numeric.
|
|
Given this code:
f(S) ->
F0 = F1 = {S,S},
[F0,F1].
case_opt/3 would "optimize" it like this:
f(S) ->
F1 = {S,S},
F0 = {S,S},
[F0,F1].
Similarly, this code:
g({a,_,_}=T) ->
{b,
[_,_] = [T,none],
x}.
would be rewritten to:
g({a,Tmp1,Tmp2}=T) ->
Tmp3 = {a,Tmp1,Tmp2},
{b,
[Tmp3,none],
x}.
where the tuple is rebuilt instead of using the T variable.
Rewrite case_opt/3 to be more careful while optimizing.
|
|
I originally decided that in 'value' context, rewriting a let statement
where the variables were not in the body to a sequence was not worth
it, because the variables would be unused in only one let in a
thousand lets (roughly).
I have reconsidered.
The main reason is that if we do the rewrite, core_lib:is_var_used/2
will be used much more frequently, which will help us to find bugs
in it sooner.
Another reason is that the way letify/2 is currently implemented
with its own calls to core_lib:is_var_used/2 is only safe as long
as all the bindings are independent of each other. We could make
letify/2 smarter, but if we introduce this new optimization there
is no need.
Measuring compilation speed, I have not seen any significant slowdown.
It seems that although core_lib:is_var_used/2 is called much more
frequently, most calls will be fast because is_var_used/2 will quickly
find a use of the variable.
Also add a test case to cover a line opt_guard_try/1 that was
no longer covered.
|
|
* maint:
Update primary bootstrap
Be more careful about map patterns when evalutating element/2
Do not convert map patterns to map expressions
Conflicts:
bootstrap/lib/compiler/ebin/sys_core_fold.beam
lib/compiler/test/match_SUITE.erl
|
|
We must not convert map patterns to map expressions.
|
|
I have spent too much time lately waiting for 'cover' to finish,
so now its time to optimize the running time of the tests suite
in coverage mode.
Basically, when 'cover' is running, the test suites would not
run any tests in parallel. The reason is that using too many
parallel processes when running 'cover' would be slower than
running them sequentially. But those measurements were made
several years ago, and many improvements have been made to
improve the parallelism of the run-time system.
Experimenting with the test_lib:p_run/2 function, I found that
increasing the number of parallel processes would speed up the
self_compile tests cases in compilation_SUITE. The difference
between using 3 processes or 4 processes was slight, though,
so it seems that we should not use more than 4 processes when
running 'cover'.
We don't want to change test_lib:parallel/0, because there is
no way to limit the number of test cases that will be run in
parallel by common_test. However, there as test suites (such as
andor_SUITE) that don't invoke the compiler at run-time. We can
run the cases in such test suites in parallel even if 'cover'
is running.
|
|
The pass sys_core_fold did not correctly handle non-matching patterns in code
such as:
0 = case <<>> of
<<>> -> 0;
a -> 1
end.
Function case_opt_lit/3 is rewritten in two passes to first remove any
non-matching clause and only then potentially remove the related patterns
in each clause.
Reported-by: Ulf Norell
|
|
Boolean case expressions with redundant clauses could make the compiler
crash:
case X == 0 of
false -> no;
false -> no;
true -> yes
end.
Reported-by: Ulf Norell
|
|
In e12b7d5331c58b41db06cadfa4af75b78b62a2b1, a bug was introduced
that would cause case expressions to be evaluated more than once
if there were aliases in the pattern. Example:
X = Y = io:put_chars("some chars"),
{X,Y}
That would be rewritten to code similar to (but in Core Erlang):
X = io:put_chars("some chars"),
X = io:put_chars("some chars"),
{X,Y}
Make sure that we only evalute the expression once by doing a
transformation similar to (but in Core Erlang):
NewVar = io:put_chars("some chars"),
X = NewVar,
Y = NewVar,
{X,Y}
Reported-by: José Valim
Reported-by: Anthony Ramine
|
|
|
|
Case expressions such as:
case {Expr1,Expr} of
{V1,V2} -> ...
end
are already optimized to not actually build the tuple. Generalize
the optimization to avoid building any kind of composite term,
such as:
case {ok,[A,B]} of
{ok,[X,Y]} -> ...
end
We don't expect programmers to write such code directly, but
inlining can produce such code.
We need to be careful about the warnings we produce. If the case
expression is a literal, it is expected that no warnings should be
produced for clauses that don't match. We must make sure that we
continue to suppress those warnings.
|
|
ErrorInfo is documented to be:
{ErrorLine,Module,ErrorDescriptor}
but for some errors with line numbers it would look like:
{Module,ErrorDescriptor}
Ensure that all ErrorInfo tuples have three elements. Use 'none'
instead of a line number:
{none,Module,ErrorDescriptor}
There already are errors that return 'none' when no line number is
available, but that convention was not documented. Mention it in the
documentation.
Also make sure that the compiler will not print 'none' as a line
number in error messages (if the 'report_errors' option is given) as
that looks stupid. That is, when attempting to compile a non-existing
module, the error message should be:
non-existing.erl: no such file or directory
and not:
non-existing.erl:none: no such file or directory
|
|
* nox/fix-seq-opt/OTP-10818:
Add two tests for unused multiple values in effect context
Forbid multiple values in Core Erlang sequence arguments
|
|
|
|
|
|
|
|
|
|
Run testcases in parallel will make the test suite run slightly
faster. Another reason for this change is that we want more testing
of parallel testcase support in common_test.
|
|
opt_guard_try/1 assumed that it was only operating on guards, and
implicitly assumed that any function call was to BIFs without
any side effects.
|
|
|
|
In 3d0f4a3085f11389e5b22d10f96f0cbf08c9337f (an update to conform
with common_test), in all test_lib:recompile(?MODULE) calls, ?MODULE
was changed to the actual name of the module. That would cause
test_lib:recompile/1 to compile the module with the incorrect
compiler options in cloned modules such as record_no_opt_SUITE,
causing worse coverage.
|