Age | Commit message (Collapse) | Author |
|
5239eb0c62a9 stopped storing patterns that a variable has matched.
The only tuples that are stored in the type database are tuples
that have been previously constructed.
Therefore, the code in sys_core_fold:case_expand_var/2 and friends
that converts a tuple pattern to a tuple construction can be simplified
to used the stored tuple directly.
|
|
`sys_core_fold` has an optimization of repeated pattern matching.
For example, when a record is matched the first time, the pattern
is remembered. When the same record is matched again, the matching
does not need to be repeated, but the variables bound in the first
matching can be re-used.
It turns out that that there is a name capture problem when the old
inliner is used. The old inliner is used when explicitly inling
certain functions, and by the compiler test suites for testing the
compiler.
The name capture problem could be eliminated by more aggressive
variable renaming when inlining.
But, fortunately, given the new SSA passes, this optimization is no
longer as essential as it used to be. Removing the optimization
turns out to be mostly benefical, leading to a smaller stack
frame in many cases.
Also remove the optimizations of `element/2`, `is_record/3`, and
`setelement/3` from `sys_core_fold`. Because matched patterns are no
longer remembered, those optimizations can very rarely be applied any
more. (Those same optimizations are already done in `beam_ssa_type`.)
|
|
* maint:
Fix unsafe optimization of stack trace building
|
|
The `sys_core_fold` pass of the compiler would optimize
away the building of the stacktrace in code such as:
try
...
catch
C:R:Stk ->
erlang:raise(C, {R,Stk}, Stk)
end
That optimization is unsafe and would cause a crash in a later compiler
pass.
|
|
* maint:
Fix side-effect optimization when compiling from Core Erlang
Conflicts:
lib/compiler/src/sys_core_fold.erl
|
|
bjorng/bjorn/compiler/letrec-side-effect-fix/ERL-658/OTP-15188
Fix side-effect optimization when compiling from Core Erlang
|
|
Rewrite erlang:get_stacktrace calls to primop when safe
|
|
* maint:
Eliminate double computation of next var
beam_validator: Fix false diagnostic for a receive nested in a try
|
|
When an expression is only used for its side effects, we try to
remove everything that doesn't tie into a side-effect, but we
went a bit too far when we applied the optimization to funs
defined in such a context. Consider the following:
do letrec 'f'/0 = fun () -> ... whatever ...
in call 'side':'effect'(apply 'f'/0())
'ok'
When f/0 is optimized under the assumption that its return value
is unused, side:effect/1 will be fed the result of the last
side-effecting expression in f/0 instead of its actual result.
https://bugs.erlang.org/browse/ERL-658
Co-authored-by: Björn Gustavsson <[email protected]>
|
|
My compiler benchmarks on modules with huge functions, show the
next_free_variable_name call to be expensive. It turns out one
of the 3 calls to the function was completely ignored.
|
|
This allows taking advantage of further optimisations, like the
raw_raise instruction for code that can't upgrade (yet) to the
new stacktrace syntax for compatibility reasons.
The rewrite is only done when it is safe - when the get_stacktrace
call is the very first thing the handler does.
|
|
Fold is_function/1,2 during compilation
|
|
|
|
This can often appear in code after inlining some higher-order
functions.
* mark is_function/2 as pure
* track function types in sys_core_fold
* use those types to eval is_function/1,2 at compile-time when possible
|
|
sys_core_fold could do unsafe transformations on the
code from the old inliner (invoked using the compiler
option `{inline,[{F/A}]}` to request inlining of specific
functions).
To explain the bug, let's first look at an example that
sys_core_fold handles correctly. Consider this code:
'foo'/2 =
fun (Arg1,Arg2) ->
let <B> = Arg2
in let <A,B> = <B,Arg1>
in {A,B}
In this example, the lets can be completely eliminated,
since the arguments for the lets are variables (as opposed
to expressions). Since the variable B is rebound in the
inner let, `sys_core_fold` must take special care when
doing the substitutions.
Here is the correct result:
'foo'/2 =
fun (Arg1, Arg2) ->
{Arg2,Arg1}
Consider a slight modifictation of the example:
'bar'/2 =
fun (Arg1,Arg2) ->
let <B> = [Arg2]
in let <A,B> = <B,[Arg1]>
in {A,B}
Here some of the arguments for the lets are expressions, so
the lets must be kept. sys_core_fold does not handle this
example correctly:
'bar'/2 =
fun (Arg1,Arg2) ->
let <B> = [Arg2]
in let <B> = [Arg1]
in {B,B}
In the inner let, the variable A has been eliminated and
replaced with the variable B in the body (the first B in
the tuple). Since the B in the outer let is never used,
the outer let will be eliminated, giving:
'bar'/2 =
fun (Arg1,Arg2) ->
let <B> = [Arg1]
in {B,B}
To handle this example correctly, sys_core_fold must
rename the variable B in the inner let like this to
avoid capturing B:
'bar'/2 =
fun (Arg1,Arg2) ->
let <B> = [Arg2]
in let <NewName> = [Arg1]
in {B,NewName}
(Note: The `v3_kernel` pass alreday handles those examples correctly
in case `sys_core_fold` has been disabled.)
|
|
Rewrite a call of a literal external fun to a direct call
OTP-15044
|
|
Rewrite calls such as:
(fun erlang:abs/1)(-42)
to:
erlang:abs(-42)
While we are at it, also add rewriting of apply/2 with a fixed
number of arguments to a direct call of the fun. For example:
apply(F, [A,B])
would be rewritten to:
F(A, B)
https://bugs.erlang.org/browse/ERL-614
|
|
sys_core_fold would crash when attempting to optimize this code:
t() when (#{})#{}->
c.
|
|
Compile external fun expressions to literals
OTP-15003
|
|
The expressions fun M:F/A, when all elements are literals are also
treated as a literal. Since they have consistent representation and
don't depend on the code currently loaded in the VM, this is safe.
This can provide significant performance improvements in code using such
functions extensively - a full function call to erlang:make_fun/3 is
replaced by a single move instruction and no register shuffling or
saving registers to stack is necessary. Additionally, compound data
types that contain such external functions as elements can be treated as
literals too.
The commit also changes the representation of external funs to be a
valid Erlang syntax and adds support for literal external funs to core
Erlang.
|
|
Use integer variable names instead of atoms in v3_core, sys_core_fold,
and v3_kernel to avoid overflowing the atom table.
It is a deliberate design decision to calculate the first free integer
variable name (in sys_core_fold and v3_kernel) instead of somehow
passing it from one pass to another. I don't want that kind of
dependency between compiler passes. Also note that the next free
variable name is not easily available after running the inliner.
|
|
|
|
Consider the following function:
function({function,Name,Arity,CLabel,Is0}, Lc0) ->
try
%% Optimize the code for the function.
catch
Class:Error:Stack ->
io:format("Function: ~w/~w\n", [Name,Arity]),
erlang:raise(Class, Error, Stack)
end.
The stacktrace is retrieved, but it is only used in the call
to erlang:raise/3. There is no need to build a stacktrace
in this function. We can avoid the building if we introduce
an instruction called raw_raise/3 that works exactly like
the erlang:raise/3 BIF except that its third argument must
be a raw stacktrace.
|
|
Consider a 'case' that exports variables and whose return
value is ignored:
foo(N) ->
case N of
1 ->
Res = one;
2 ->
Res = two
end,
{ok,Res}.
That code will be translated to the following Core Erlang code:
'foo'/1 =
fun (_@c0) ->
let <_@c5,Res> =
case _@c0 of
<1> when 'true' ->
<'one','one'>
<2> when 'true' ->
<'two','two'>
<_@c3> when 'true' ->
primop 'match_fail'({'case_clause',_@c3})
end
in
{'ok',Res}
The exported variables has been rewritten to explicit return
values. Note that the original return value from the 'case' is bound to
the variable _@c5, which is unused.
The corresponding BEAM assembly code looks like this:
{function, foo, 1, 2}.
{label,1}.
{line,[...]}.
{func_info,{atom,t},{atom,foo},1}.
{label,2}.
{test,is_integer,{f,6},[{x,0}]}.
{select_val,{x,0},{f,6},{list,[{integer,2},{f,3},{integer,1},{f,4}]}}.
{label,3}.
{move,{atom,two},{x,1}}.
{move,{atom,two},{x,0}}.
{jump,{f,5}}.
{label,4}.
{move,{atom,one},{x,1}}.
{move,{atom,one},{x,0}}.
{label,5}.
{test_heap,3,2}.
{put_tuple,2,{x,0}}.
{put,{atom,ok}}.
{put,{x,1}}.
return.
{label,6}.
{line,[...]}.
{case_end,{x,0}}.
Because of the test_heap instruction following label 5, the assignment
to {x,0} cannot be optimized away by the passes that optimize BEAM assembly
code.
Refactor the optimizations of 'let' in sys_core_fold to eliminate the
unused variable. Thus:
'foo'/1 =
fun (_@c0) ->
let <Res> =
case _@c0 of
<1> when 'true' ->
'one'
<2> when 'true' ->
'two'
<_@c3> when 'true' ->
primop 'match_fail'({'case_clause',_@c3})
end
in
{'ok',Res}
The resulting BEAM code will look like:
{function, foo, 1, 2}.
{label,1}.
{line,[...]}.
{func_info,{atom,t},{atom,foo},1}.
{label,2}.
{test,is_integer,{f,6},[{x,0}]}.
{select_val,{x,0},{f,6},{list,[{integer,2},{f,3},{integer,1},{f,4}]}}.
{label,3}.
{move,{atom,two},{x,0}}.
{jump,{f,5}}.
{label,4}.
{move,{atom,one},{x,0}}.
{label,5}.
{test_heap,3,1}.
{put_tuple,2,{x,1}}.
{put,{atom,ok}}.
{put,{x,0}}.
{move,{x,1},{x,0}}.
return.
{label,6}.
{line,[...]}.
{case_end,{x,0}}.
|
|
Improve handling of #c_seq{}, making sure to simplify a #c_seq{}
as much as possible. With that improvement, we can remove some
special-case code from opt_simple_let_2/6.
|
|
|
|
|
|
|
|
Add syntax in try/catch to retrieve the stacktrace directly
|
|
We used to not care about the number of values returned from the
'after infinity' clause in a receive (because it could never be
executed). It is time to start caring because this will cause problem
when we will soon start to do some more aggressive optimizizations.
|
|
This commit adds a new syntax for retrieving the stacktrace
without calling erlang:get_stacktrace/0. That allow us to
deprecate erlang:get_stacktrace/0 and ultimately remove it.
The problem with erlang:get_stacktrace/0 is that it can keep huge
terms in a process for an indefinite time after an exception. The
stacktrace can be huge after a 'function_clause' exception or a failed
call to a BIF or operator, because the arguments for the call will be
included in the stacktrace. For example:
1> catch abs(lists:seq(1, 1000)).
{'EXIT',{badarg,[{erlang,abs,
[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20|...]],
[]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
{erl_eval,expr,5,[{file,"erl_eval.erl"},{line,431}]},
{shell,exprs,7,[{file,"shell.erl"},{line,687}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]}}
2> erlang:get_stacktrace().
[{erlang,abs,
[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24|...]],
[]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
{erl_eval,expr,5,[{file,"erl_eval.erl"},{line,431}]},
{shell,exprs,7,[{file,"shell.erl"},{line,687}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]
3>
We can extend the syntax for clauses in try/catch to optionally bind
the stacktrace to a variable.
Here is an example using the current syntax:
try
Expr
catch C:E ->
Stk = erlang:get_stacktrace(),
.
.
.
In the new syntax, it would look like:
try
Expr
catch
C:E:Stk ->
.
.
.
Only a variable (not a pattern) is allowed in the stacktrace position,
to discourage matching of the stacktrace. (Matching would also be
expensive, because the raw format of the stacktrace would have to be
converted to the cooked form before matching.)
Note that:
try
Expr
catch E ->
.
.
.
is a shorthand for:
try
Expr
catch throw:E ->
.
.
.
If the stacktrace is to be retrieved for a throw, the 'throw:'
prefix must be explicitly included:
try
Expr
catch throw:E:Stk ->
.
.
.
|
|
Rewrite a catch expression like this:
catch side_effect(),
...
to:
try
side_effect()
catch
_:_ ->
ok
end,
...
A try/catch is more efficient since no stack trace will be built
when an exception occurs.
|
|
bjorng/bjorn/compiler/improve-case-opt/ERL-452/OTP-14525
Generalize optimization of "one-armed" cases
|
|
A 'case' expression will force a stack frame (essentially in the same
way as a function call), unless it is at the end of a function.
In sys_core_fold there is an optimization that can optimize one-armed
cases such as:
case Expr of
Pat1 ->
DoSomething;
Pat2 ->
erlang:error(bad)
end,
MoreCode.
Because only one arm of the 'case' can succeed, the code after the
case can be move into the successful arm:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad)
end.
Thus, the 'case' is at the end of the function and it will no longer
need a stack frame.
However, the optimization in sys_core_fold would not be applied if
there were more than one failing clause such as in this code:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad);
_ ->
erlang:error(case_clause)
end.
Generalize the optimization to handle any number of failing
clauses at the end of the case.
Reported-by: bugs.erlang.org/browse/ERL-452
|
|
The sys_core_fold pass would do an unsafe "optimization" when an
apply operation did not have a variable in the function position
as in the following example:
> cat test1.core
module 'test1' ['test1'/2]
attributes []
'i'/1 =
fun (_f) -> _f
'test1'/2 =
fun (_f, _x) ->
apply apply 'i'/1 (_f) (_x)
end
> erlc test1.core
no_file: Warning: invalid function call
Reported-by: Mikael Pettersson
|
|
|
|
All keys in an orddict must be unique. sys_core_fold:sub_sub_scope/1
broke that rule. It was probably harmless, but it is better to
avoid such rule violations.
|
|
As part of sys_core_fold, variables involved in bit syntax
matching would be annotated when it would be safe for a later
pass to do the delayed sub-binary creation optimization.
An implicit assumption regarding the annotation was that the
code must not be further optimized. That assumption was broken
in 05130e48555891, which introduced a fixpoint iteration
(applying the optimizations until there were no more changes).
That means that a variable could be annotated as safe for
reusing the match context in one iteration, but a later iteration
could rewrite the code in a way that would make the optimization
unsafe.
One way to fix this would be to clear all reuse_for_context
annotations before each iteration. But that would be wasteful.
Instead I chose to fix the problem by moving out the annotation
code to a separate pass (sys_core_bsm) that is run later after
all major optimizations of Core Erlang has been done.
|
|
Code with huge literal case expressions such as the following would
compile very slowly:
case "Very long literal string (thousands of characters)..." of
.
.
.
end.
The reason is that in the case optimization each character in the
string would be handled individually. Fix this bug by handling
literals all at once.
|
|
The compiler produces poor code for complex guard expressions with andalso/orelse.
Here is an example from the filename module:
-define(IS_DRIVELETTER(Letter),(((Letter >= $A) andalso (Letter =< $Z)) orelse
((Letter >= $a) andalso (Letter =< $z)))).
skip_prefix(Name, false) ->
Name;
skip_prefix([L, DrvSep|Name], DrvSep) when ?IS_DRIVELETTER(L) ->
Name;
skip_prefix(Name, _) ->
Name.
beam_bool fails to simplify the code for the guard, leaving several 'bif'
instructions:
{function, skip_prefix, 2, 49}.
{label,48}.
{line,[{location,"filename.erl",187}]}.
{func_info,{atom,filename},{atom,skip_prefix},2}.
{label,49}.
{test,is_ne_exact,{f,52},[{x,1},{atom,false}]}.
{test,is_nonempty_list,{f,52},[{x,0}]}.
{get_list,{x,0},{x,2},{x,3}}.
{test,is_nonempty_list,{f,52},[{x,3}]}.
{get_list,{x,3},{x,4},{x,5}}.
{bif,'=:=',{f,52},[{x,1},{x,4}],{x,6}}.
{test,is_ge,{f,50},[{x,2},{integer,65}]}.
{bif,'=<',{f,52},[{x,2},{integer,90}],{x,7}}.
{test,is_eq_exact,{f,51},[{x,7},{atom,false}]}.
{test,is_ge,{f,50},[{x,2},{integer,97}]}.
{bif,'=<',{f,52},[{x,2},{integer,122}],{x,7}}.
{jump,{f,51}}.
{label,50}.
{move,{atom,false},{x,7}}.
{label,51}.
{bif,'=:=',{f,52},[{x,7},{atom,true}],{x,7}}.
{test,is_eq_exact,{f,52},[{x,6},{atom,true}]}.
{test,is_eq_exact,{f,52},[{x,7},{atom,true}]}.
{move,{x,5},{x,0}}.
return.
{label,52}.
return.
We can add optimizations of guard tests to v3_kernel to achive a better result:
{function, skip_prefix, 2, 49}.
{label,48}.
{line,[{location,"filename.erl",187}]}.
{func_info,{atom,filename},{atom,skip_prefix},2}.
{label,49}.
{test,is_ne_exact,{f,51},[{x,1},{atom,false}]}.
{test,is_nonempty_list,{f,51},[{x,0}]}.
{get_list,{x,0},{x,2},{x,3}}.
{test,is_nonempty_list,{f,51},[{x,3}]}.
{get_list,{x,3},{x,4},{x,5}}.
{test,is_eq_exact,{f,51},[{x,1},{x,4}]}.
{test,is_ge,{f,51},[{x,2},{integer,65}]}.
{test,is_lt,{f,50},[{integer,90},{x,2}]}.
{test,is_ge,{f,51},[{x,2},{integer,97}]}.
{test,is_ge,{f,51},[{integer,122},{x,2}]}.
{label,50}.
{move,{x,5},{x,0}}.
return.
{label,51}.
return.
Looking at the STDLIB application, there were 112 lines of BIF calls in guards
that beam_bool failed to convert to test instructions. This commit eliminates
all those BIF calls.
Here is how I counted the instructions:
$ PATH=$ERL_TOP/bin:$PATH erlc -I ../include -I ../../kernel/include -S *.erl
$ grep "bif,'[=<>]" *.S | grep -v f,0
dets.S: {bif,'=:=',{f,547},[{x,4},{atom,read_write}],{x,4}}.
dets.S: {bif,'=:=',{f,547},[{x,5},{atom,saved}],{x,5}}.
dets.S: {bif,'=:=',{f,589},[{x,5},{atom,read}],{x,5}}.
.
.
.
$ grep "bif,'[=<>]" *.S | grep -v f,0 | wc
112 224 6765
$
|
|
The fixpoint iteration added in 05130e48 makes those calls
superfluous.
|
|
There are two calls opt_not_in_let(). Since 05130e4855
introduced iteration to a fixpoint, only the first call
is needed. Removing the redundant call will slightly speed
up compilation.
|
|
|
|
* maint:
Don't copy funs into guards
|
|
Funs must not be created in guards. The instruction for creating
a fun clobbers all X registers, which is a bad thing to do in
a guard.
|
|
* maint:
Don't let inline_list_funcs degrade optimizations
|
|
83199af0263 refactored sys_core_fold to break out the code for the
inline_lists_funcs option to its own module. Unfortunately, it also
accidentally turned off compile-time evaluation of calls to BIFs with
wholly or partial constant arguments.
For example, the code for the following funtion gets much worse
when inline_list_funcs is used:
b() ->
R0 = #r{},
R1 = setelement(1+2, R0, "deux"),
R2 = setelement(1+3, R1, "trois"),
R3 = setelement(1+5, R2, "cinq"),
R4 = setelement(1+2, R3, "DEUX"),
R4.
ERL-285
|
|
* josevalim/compiler/at-var/PR-1081/OTP-13924:
Use @ in variable names generated by core and kernel
|
|
Run the optimizations until a fixpoint is reached, or until
the maximum iteration count is reached.
The hope is that in the future we can many small optimizations
instead of optimizations that try to do everything in one pass.
This change allows us to remove the ad-hoc calls to expr/2
to run more optimizations on a piece of code.
|
|
The optimization that avoids building a tuple in a case
expression would not work if any clause matched a tuple
as in the following example:
f(A, B) ->
case {A,B} of
{<<X>>,Y} ->
{X,Y}
end.
The generated Core Erlang code would look like this (note
the tuples in the case expression and the pattern):
'f'/2 =
fun (_cor1,_cor0) ->
case {_cor1,_cor0} of
<{#{#<X>(8,1,'integer',['unsigned'|['big']])}#,Y}>
when 'true' ->
{X,Y}
.
.
.
end
It is expected that the code should look like this (note
that tuples have been replaced with "values"):
'f'/2 =
fun (_cor1,_cor0) ->
%% Line 5
case <_cor1,_cor0> of
<#{#<X>(8,1,'integer',['unsigned'|['big']])}#,Y> ->
{X,Y}
.
.
.
end
While at it, also fix bugs in the handling of pattern with
aliases. The bindings were produced in the wrong order (creating
'let's with referring to free variables), but in most cases
the incorrect bindings were discarded later without causing any
harm.
|