Age | Commit message (Collapse) | Author |
|
Argument order can prevent the delayed sub binary creation.
Here is an example directly from the Efficiency Guide:
non_opt_eq([H|T1], <<H,T2/binary>>) ->
non_opt_eq(T1, T2);
non_opt_eq([_|_], <<_,_/binary>>) ->
false;
non_opt_eq([], <<>>) ->
true.
When compiling with the bin_opt_info option, there will be a
suggestion to change the argument order.
It turns out sys_core_bsm can itself change the order, not the
order of the arguments of themselves, but the order in which
the arguments are matched. Here is how it can be rewritten in
pseudo Core Erlang code:
non_opt_eq(Arg1, Arg2) ->
case < Arg2,Arg1 > of
< <<H1,T2/binary>>, [H2|T1] > when H1 =:= H2 ->
non_opt_eq(T1, T2);
< <<_,T2/binary>ffff>, [_|T1] > ->
false;
< <<>>, [] >> ->
true
end.
When rewritten like this, the bs_start_match2 instruction will be
the first instruction in the function and it will be possible to
store the match context in the same register as the binary
({x,1} in this case) and to delay the creation of sub binaries.
The switching of matching order also enables many other simplifications
in sys_core_bsm, since there is no longer any need to pass the position
of the pattern as an argument.
We will update the Efficiency Guide in a separate branch before the
release of OTP 21.
|
|
|
|
Add some tests cases written when attempting some new optimizations
that turned out to be unsafe.
|
|
Consider a 'case' that exports variables and whose return
value is ignored:
foo(N) ->
case N of
1 ->
Res = one;
2 ->
Res = two
end,
{ok,Res}.
That code will be translated to the following Core Erlang code:
'foo'/1 =
fun (_@c0) ->
let <_@c5,Res> =
case _@c0 of
<1> when 'true' ->
<'one','one'>
<2> when 'true' ->
<'two','two'>
<_@c3> when 'true' ->
primop 'match_fail'({'case_clause',_@c3})
end
in
{'ok',Res}
The exported variables has been rewritten to explicit return
values. Note that the original return value from the 'case' is bound to
the variable _@c5, which is unused.
The corresponding BEAM assembly code looks like this:
{function, foo, 1, 2}.
{label,1}.
{line,[...]}.
{func_info,{atom,t},{atom,foo},1}.
{label,2}.
{test,is_integer,{f,6},[{x,0}]}.
{select_val,{x,0},{f,6},{list,[{integer,2},{f,3},{integer,1},{f,4}]}}.
{label,3}.
{move,{atom,two},{x,1}}.
{move,{atom,two},{x,0}}.
{jump,{f,5}}.
{label,4}.
{move,{atom,one},{x,1}}.
{move,{atom,one},{x,0}}.
{label,5}.
{test_heap,3,2}.
{put_tuple,2,{x,0}}.
{put,{atom,ok}}.
{put,{x,1}}.
return.
{label,6}.
{line,[...]}.
{case_end,{x,0}}.
Because of the test_heap instruction following label 5, the assignment
to {x,0} cannot be optimized away by the passes that optimize BEAM assembly
code.
Refactor the optimizations of 'let' in sys_core_fold to eliminate the
unused variable. Thus:
'foo'/1 =
fun (_@c0) ->
let <Res> =
case _@c0 of
<1> when 'true' ->
'one'
<2> when 'true' ->
'two'
<_@c3> when 'true' ->
primop 'match_fail'({'case_clause',_@c3})
end
in
{'ok',Res}
The resulting BEAM code will look like:
{function, foo, 1, 2}.
{label,1}.
{line,[...]}.
{func_info,{atom,t},{atom,foo},1}.
{label,2}.
{test,is_integer,{f,6},[{x,0}]}.
{select_val,{x,0},{f,6},{list,[{integer,2},{f,3},{integer,1},{f,4}]}}.
{label,3}.
{move,{atom,two},{x,0}}.
{jump,{f,5}}.
{label,4}.
{move,{atom,one},{x,0}}.
{label,5}.
{test_heap,3,1}.
{put_tuple,2,{x,1}}.
{put,{atom,ok}}.
{put,{x,0}}.
{move,{x,1},{x,0}}.
return.
{label,6}.
{line,[...]}.
{case_end,{x,0}}.
|
|
|
|
The type optimizations for is_record and test_arity checked whether
the arity was equal to the size stored in the type information,
which is incorrect since said size is the *minimum* size of the
tuple (as determined by previous instructions) and not its exact
size.
A future patch to the 'master' branch will restore these
optimizations in a safe manner.
|
|
Delay creation of stack frames
|
|
|
|
|
|
01835845579e9 fixed some problems, but introduced a bug where
is_not_used/3 would report that a register was not used when it
in fact was.
|
|
|
|
Add syntax in try/catch to retrieve the stacktrace directly
|
|
|
|
|
|
We used to not care about the number of values returned from the
'after infinity' clause in a receive (because it could never be
executed). It is time to start caring because this will cause problem
when we will soon start to do some more aggressive optimizizations.
|
|
|
|
|
|
This commit adds a new syntax for retrieving the stacktrace
without calling erlang:get_stacktrace/0. That allow us to
deprecate erlang:get_stacktrace/0 and ultimately remove it.
The problem with erlang:get_stacktrace/0 is that it can keep huge
terms in a process for an indefinite time after an exception. The
stacktrace can be huge after a 'function_clause' exception or a failed
call to a BIF or operator, because the arguments for the call will be
included in the stacktrace. For example:
1> catch abs(lists:seq(1, 1000)).
{'EXIT',{badarg,[{erlang,abs,
[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20|...]],
[]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
{erl_eval,expr,5,[{file,"erl_eval.erl"},{line,431}]},
{shell,exprs,7,[{file,"shell.erl"},{line,687}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]}}
2> erlang:get_stacktrace().
[{erlang,abs,
[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24|...]],
[]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
{erl_eval,expr,5,[{file,"erl_eval.erl"},{line,431}]},
{shell,exprs,7,[{file,"shell.erl"},{line,687}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]
3>
We can extend the syntax for clauses in try/catch to optionally bind
the stacktrace to a variable.
Here is an example using the current syntax:
try
Expr
catch C:E ->
Stk = erlang:get_stacktrace(),
.
.
.
In the new syntax, it would look like:
try
Expr
catch
C:E:Stk ->
.
.
.
Only a variable (not a pattern) is allowed in the stacktrace position,
to discourage matching of the stacktrace. (Matching would also be
expensive, because the raw format of the stacktrace would have to be
converted to the cooked form before matching.)
Note that:
try
Expr
catch E ->
.
.
.
is a shorthand for:
try
Expr
catch throw:E ->
.
.
.
If the stacktrace is to be retrieved for a throw, the 'throw:'
prefix must be explicitly included:
try
Expr
catch throw:E:Stk ->
.
.
.
|
|
|
|
f9a323d10a9f5d added consistent operand order for equality
comparisons. As a result, beam_dead:turn_op/1 is no longer covered.
We must keep the uncovered lines in beam_dead to ensure that
beam_dead can handle BEAM assembly code from another source than
v3_codegen that might not follow the operand order convention.
The only way to cover the lines is to use BEAM assembly in
the test case.
|
|
|
|
The uncovered line was added in 6753bbcc3fdb0.
|
|
Conflicts:
lib/compiler/src/beam_listing.erl
|
|
* lukas/compiler/add_to_dis/OTP-14784:
compiler: Add +to_dis option that dumps loaded asm
|
|
|
|
* maint:
Recognize 'deterministic' when given in a -compile() attribute
Conflicts:
lib/compiler/src/beam_asm.erl
|
|
* bjorn/cuddle-with-tests:
Skip compile_SUITE:pre_load_check/1 when code is native-compiled
|
|
The compiler option 'deterministic' was only recognized when given
as an option to the compiler, not when it was specified in a
-compile() attribute in the source file.
https://bugs.erlang.org/browse/ERL-498
|
|
The v3_life pass does not do enough to be worth being its own
pass. Essentially it does two things:
* Calculates life-time information starting from the annotations
that v3_kernel provides. That part can be moved into v3_codegen.
* Rewrites the Kernel Erlang records to similar plain tuples
(for example, #k_cons{hd=Hd,tl=Tl} is rewritten to {cons,Hd,Tl}).
That rewriting is not needed and can be eliminated.
|
|
Tracing won't work on modules that are native-compiled.
|
|
* maint:
Fix incorrect internal consistency failure for binary matching code
|
|
4c31fd0b9665 made the merging of match contexts stricter;
in fact, a little bit too strict.
Two match contexts with different number of slots would
be downgraded to the 'term' type. The correct way is to
keep the match context but set the number of slots to the
lowest number of slots of the two match contexts.
https://bugs.erlang.org/browse/ERL-490
|
|
Add compile_info option to compile
OTP-14615
|
|
This allows compilers built on top of the compile module
to attach external compilation metadata to the compile_info
chunk.
For example, Erlang uses this chunk to store the compiler
version. Elixir and LFE may augment this by also adding
their own compiler versions, which can be useful when
debugging.
The deterministic option does not affect the user supplied
compile_info. It is therefore the responsibility of external
compilers to guarantee any added information does not violate
the determinsitic option, if such option is supported.
Finally, this code moves the building of the compile_info
options to the compile module instead of beam_asm, moving
all of the option mangling code to a single place.
|
|
Optimise equality comparisons
|
|
* In both loader and compiler, make sure constants are always the second
operand - many passes of the compiler assume that's always the case.
* In loader rewrite is_eq_exact with same arguments to skip the instruction
and with different constants move one to an x register to maintain
the properly outlined above.
* The same (but in reverse) is done with the is_ne_exact, where we rewrite
to an unconditional jump or add a move to an x register.
* All of the above allow to replace is_eq_exact_fss with is_eq_exact_fyy and
is_ne_exact_fss with is_ne_exact_fSS as those are the only possibilities left.
|
|
The compiler could sometimes emit unnecessary 'move'
instructions in the code for binary matching, for
example for this function:
escape(<<Byte, Rest/bits>>, Pos) when Byte >= 127 ->
escape(Rest, Pos + 1);
escape(<<Byte, Rest/bits>>, Pos) ->
escape(Rest, Pos + Byte);
escape(<<_Rest/bits>>, Pos) ->
Pos.
The generated code would look like this:
{function, escape, 2, 2}.
{label,1}.
{line,[{location,"t.erl",17}]}.
{func_info,{atom,t},{atom,escape},2}.
{label,2}.
{test,bs_start_match2,{f,1},2,[{x,0},0],{x,0}}.
{test,bs_get_integer2,
{f,4},
2,
[{x,0},
{integer,8},
1,
{field_flags,[{anno,[17,{file,"t.erl"}]},unsigned,big]}],
{x,2}}.
{'%',{bin_opt,[17,{file,"t.erl"}]}}.
{move,{x,0},{x,3}}. %% UNECESSARY!
{test,is_ge,{f,3},[{x,2},{integer,127}]}.
{line,[{location,"t.erl",18}]}.
{gc_bif,'+',{f,0},4,[{x,1},{integer,1}],{x,1}}.
{move,{x,3},{x,0}}. %% UNECESSARY!
{call_only,2,{f,2}}.
{label,3}.
{line,[{location,"t.erl",20}]}.
{gc_bif,'+',{f,0},4,[{x,1},{x,2}],{x,1}}.
{move,{x,3},{x,0}}. %% UNECESSARY!
{call_only,2,{f,2}}.
{label,4}.
{move,{x,1},{x,0}}.
return.
The redundant 'move' instructions have been marked.
To avoid the 'move' instructions, we can extend the existing
function is_context_unused/1 in v3_codegen. If v3_codegen can
know that the match context will not be used again, it can reuse
the register for the match context and avoid the extra 'move'
instructions.
https://bugs.erlang.org/browse/ERL-444
|
|
Enhance optimisations in beam_peep
|
|
Remove query keyword residues
|
|
When cleaning selects, it might happen we're left with only one pair. In such
case convert to a regular test + jump.
|
|
query was no longer a keyword since the commit 0dc3a29744bed0b7f483ad72e19773dc0982ea50 2012-11-19.
So the quotes around query, when it used as atom, no needed.
|
|
Conflicts:
OTP_VERSION
|
|
* maint-20:
Updated OTP version
Prepare release
Accept non-binary options as socket-options
Bump version
Fix broken handling of default values in extensions for PER
compiler: Fix live regs update on allocate in validator
Take fail labels into account when determining liveness in block ops
Check for overflow when appending binaries, and error out with system_limit
|
|
'john/compiler/fail-labels-in-blocks-otp-19/ERIERL-48/OTP-14522' into maint-20
* john/compiler/fail-labels-in-blocks-otp-19/ERIERL-48/OTP-14522:
compiler: Fix live regs update on allocate in validator
Take fail labels into account when determining liveness in block ops
|
|
|
|
bjorng/bjorn/compiler/improve-case-opt/ERL-452/OTP-14525
Generalize optimization of "one-armed" cases
|
|
* maint:
sys_core_fold: Fix unsafe optimization of non-variable apply
Correct type specification in ssl:prf/5
|
|
A 'case' expression will force a stack frame (essentially in the same
way as a function call), unless it is at the end of a function.
In sys_core_fold there is an optimization that can optimize one-armed
cases such as:
case Expr of
Pat1 ->
DoSomething;
Pat2 ->
erlang:error(bad)
end,
MoreCode.
Because only one arm of the 'case' can succeed, the code after the
case can be move into the successful arm:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad)
end.
Thus, the 'case' is at the end of the function and it will no longer
need a stack frame.
However, the optimization in sys_core_fold would not be applied if
there were more than one failing clause such as in this code:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad);
_ ->
erlang:error(case_clause)
end.
Generalize the optimization to handle any number of failing
clauses at the end of the case.
Reported-by: bugs.erlang.org/browse/ERL-452
|
|
The sys_core_fold pass would do an unsafe "optimization" when an
apply operation did not have a variable in the function position
as in the following example:
> cat test1.core
module 'test1' ['test1'/2]
attributes []
'i'/1 =
fun (_f) -> _f
'test1'/2 =
fun (_f, _x) ->
apply apply 'i'/1 (_f) (_x)
end
> erlc test1.core
no_file: Warning: invalid function call
Reported-by: Mikael Pettersson
|
|
Add stacktrace entries to BIF calls from emulator
|