Age | Commit message | Author |
|
The compiler could sometimes emit unnecessary 'move'
instructions in the code for binary matching, for
example for this function:
escape(<<Byte, Rest/bits>>, Pos) when Byte >= 127 ->
escape(Rest, Pos + 1);
escape(<<Byte, Rest/bits>>, Pos) ->
escape(Rest, Pos + Byte);
escape(<<_Rest/bits>>, Pos) ->
Pos.
The generated code would look like this:
{function, escape, 2, 2}.
{label,1}.
{line,[{location,"t.erl",17}]}.
{func_info,{atom,t},{atom,escape},2}.
{label,2}.
{test,bs_start_match2,{f,1},2,[{x,0},0],{x,0}}.
{test,bs_get_integer2,
{f,4},
2,
[{x,0},
{integer,8},
1,
{field_flags,[{anno,[17,{file,"t.erl"}]},unsigned,big]}],
{x,2}}.
{'%',{bin_opt,[17,{file,"t.erl"}]}}.
{move,{x,0},{x,3}}. %% UNNECESSARY!
{test,is_ge,{f,3},[{x,2},{integer,127}]}.
{line,[{location,"t.erl",18}]}.
{gc_bif,'+',{f,0},4,[{x,1},{integer,1}],{x,1}}.
{move,{x,3},{x,0}}. %% UNNECESSARY!
{call_only,2,{f,2}}.
{label,3}.
{line,[{location,"t.erl",20}]}.
{gc_bif,'+',{f,0},4,[{x,1},{x,2}],{x,1}}.
{move,{x,3},{x,0}}. %% UNNECESSARY!
{call_only,2,{f,2}}.
{label,4}.
{move,{x,1},{x,0}}.
return.
The redundant 'move' instructions have been marked.
To avoid the 'move' instructions, we can extend the existing
function is_context_unused/1 in v3_codegen. If v3_codegen can
determine that the match context will not be used again, it can reuse
the register for the match context and avoid the extra 'move'
instructions.
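A hedged sketch of the expected code once the register is reused (the
exact operands and liveness counts below are guesses; the point is that
the match context stays in {x,0} and the marked 'move' instructions
disappear):
    {test,is_ge,{f,3},[{x,2},{integer,127}]}.
    {line,[{location,"t.erl",18}]}.
    {gc_bif,'+',{f,0},3,[{x,1},{integer,1}],{x,1}}.
    {call_only,2,{f,2}}.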
https://bugs.erlang.org/browse/ERL-444
|
|
* maint:
Make handling of match contexts stricter
|
|
Enhance optimisations in beam_peep
|
|
beam_validator could fail to issue a diagnostic when a register
that was supposed to be a match context was not guaranteed to
be a match context.
The bug was in merging of types. Merging of a match context with
another term would result in a match context. That is wrong. Merging
should produce a more general type, not a narrower type. Also, the
valid slots in two match contexts should be combined with 'band', not
'bor'.
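A minimal sketch of the corrected merge logic (the function and type
representations below are hypothetical, not the actual beam_validator
code):
    %% Hypothetical types: {match_context,ValidSlots} or 'term'.
    %% Two match contexts merge by intersecting their valid slots
    %% with 'band'; anything else widens to the more general 'term'.
    merge_types({match_context,Valid0}, {match_context,Valid1}) ->
        {match_context,Valid0 band Valid1};
    merge_types(_, _) ->
        term.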
|
|
When cleaning selects, it might happen that we're left with only one pair.
In such a case, convert it to a regular test + jump, as sketched below.
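A sketch of the rewrite (operand names are placeholders):
    {select_val,Reg,Fail,{list,[Val,Lbl]}}
becomes:
    {test,is_eq_exact,Fail,[Reg,Val]}.
    {jump,Lbl}.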
|
|
|
|
'john/compiler/fail-labels-in-blocks-otp-18/ERIERL-48/OTP-14522' into maint
* john/compiler/fail-labels-in-blocks-otp-18/ERIERL-48/OTP-14522:
compiler: Fix live regs update on allocate in validator
Take fail labels into account when determining liveness in block ops
Conflicts:
lib/compiler/src/beam_utils.erl
|
|
The state without pruned registers was passed on to test_heap,
causing the validator to believe that registers that aren't live
actually are live.
|
|
bjorng/bjorn/compiler/improve-case-opt/ERL-452/OTP-14525
Generalize optimization of "one-armed" cases
|
|
Even though it's not possible to have fall-throughs when entering the
pass, it can produce them itself, and we run the pass until fixpoint.
|
|
This makes other optimisations more efficient since we have fewer labels overall.
|
|
It can happen that we have the following situation:
{test,is_tuple,Fail,[R1]}
{test,test_arity,Fail,[R1,N1]}
{get_tuple_element,R1,N2,R2}
{test,is_eq_exact,Fail,[R2,Atom]}
{jump,Fail}
Previously, the optimisation would eliminate the last is_eq_exact test, but
we can do more. If the register R2 is not used in Fail, we can eliminate the
get_tuple_element instruction as well as all the preceding tests. Ultimately,
the whole sequence can be replaced by:
{jump,Fail}
|
|
This is especially useful after inlining a function with a case.
Today the compiler would most probably be able to unify all the leaves of the
case during the sharing optimisation, but it would fail to unify the pattern
matching itself.
Naively running the optimisation multiple times wouldn't be able to find the
common code either, because it would differ in jump/fail targets of various
instructions.
To remedy this, after each sharing pass we traverse the code backwards
while reversing it and update all the jump targets with the new targets that were
discovered during the unification pass. This allows running the optimisation
until fixpoint and makes sure all sharing opportunities will be discovered.
This optimisation also helps with Elixir's `with/else` construct.
|
|
|
|
* maint:
sys_core_fold: Fix unsafe optimization of non-variable apply
Correct type specification in ssl:prf/5
|
|
A 'case' expression will force a stack frame (essentially in the same
way as a function call), unless it is at the end of a function.
In sys_core_fold there is an optimization that can optimize one-armed
cases such as:
case Expr of
Pat1 ->
DoSomething;
Pat2 ->
erlang:error(bad)
end,
MoreCode.
Because only one arm of the 'case' can succeed, the code after the
case can be moved into the successful arm:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad)
end.
Thus, the 'case' is at the end of the function and it will no longer
need a stack frame.
However, the optimization in sys_core_fold would not be applied if
there was more than one failing clause, such as in this code:
case Expr of
Pat1 ->
DoSomething,
MoreCode;
Pat2 ->
erlang:error(bad);
_ ->
erlang:error(case_clause)
end.
Generalize the optimization to handle any number of failing
clauses at the end of the case.
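With the generalization, the example above is rewritten in the same way
as the single-clause case; a sketch of the expected result:
    case Expr of
        Pat1 ->
            DoSomething,
            MoreCode;
        Pat2 ->
            erlang:error(bad);
        _ ->
            erlang:error(case_clause)
    end.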
Reported-by: bugs.erlang.org/browse/ERL-452
|
|
|
|
The sys_core_fold pass would do an unsafe "optimization" when an
apply operation did not have a variable in the function position
as in the following example:
> cat test1.core
module 'test1' ['test1'/2]
attributes []
'i'/1 =
fun (_f) -> _f
'test1'/2 =
fun (_f, _x) ->
apply apply 'i'/1 (_f) (_x)
end
> erlc test1.core
no_file: Warning: invalid function call
Reported-by: Mikael Pettersson
|
|
Introduce a new core pass called sys_core_alias
OTP-14505
|
|
The goal of this pass is to find values that are built from
patterns and generate aliases for those values to remove
pressure from the GC. For example, this code:
example({ok, Val}) ->
{ok, Val}.
shall become:
example({ok, Val} = Tuple) ->
Tuple.
Currently this pass aliases tuple and cons nodes made of literals,
variables and other cons. The tuple/cons may appear anywhere in the
pattern, and it will be aliased if it is used later on.
Notice that a tuple/cons made only of literals is not aliased, as it
may be part of the literal pool.
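The same applies to cons nodes; a sketch of the analogous rewrite (the
function name is made up for illustration):
    example2([H|T]) ->
        [H|T].
shall become:
    example2([H|T] = Cons) ->
        Cons.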
|
|
Tuple calls are the ability to invoke a function on a tuple
as the first argument:
1> Var = dict:new().
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}
2> Var:size().
0
This behaviour is considered by most to be undesired and confusing,
especially when it comes to errors. For example, imagine you invoke
"Mod:new()" where a Mod is an atom and you accidentally pass {ok, dict}.
It raises:
{undef,[{ok,new,[{ok,dict}],[]},...]}
as it attempts to invoke ok:new/1, which is really hard to debug
because there is no call to new/1 in the source code.
Furthermore, this behaviour is implemented at the VM level, which
imposes such semantics on all languages running on BEAM.
Since we cannot remove the behaviour above, this proposal makes the
behaviour opt-in with a compiler flag:
-compile(tuple_calls).
This means that, if a codebase relies on this functionality, it
can keep compatibility by configuring its build tool to always use
the 'tuple_calls' flag, or by adding the attribute explicitly to each module.
As long as the compile attribute above is listed, the codebase will
work on old and new Erlang versions alike. The only downside of the
current implementation is that modules compiled on OTP 20 that rely
on 'tuple_calls' will have to be recompiled to run with 'tuple_calls'
on OTP 21+.
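A minimal sketch of a module that opts in (the module and function
names are made up for illustration):
    -module(tc_example).
    -compile(tuple_calls).   %% opt in to tuple calls for this module only
    -export([demo/0]).

    demo() ->
        D = dict:new(),
        %% With 'tuple_calls' enabled, this dispatches to dict:size(D),
        %% because D is a tuple whose first element is the atom 'dict'.
        D:size().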
|
|
|
|
|
|
All keys in an orddict must be unique. sys_core_fold:sub_sub_scope/1
broke that rule. It was probably harmless, but it is better to
avoid such rule violations.
|
|
* hasse/unicode_atoms/OTP-14285:
compiler: Handle (bad) Unicode parse transform module names
kernel: Improve handling of Unicode filenames
stdlib: Handle Unicode atoms in ms_transform
stdlib: Improve Unicode handling of the Erlang parser
stdlib: Handle unknown compiler options with Unicode
stdlib: Handle Unicode macro names
stdlib: Correct Unicode handling in escript
dialyzer: Improve handling of Unicode
parsetools: Improve handling of Unicode atoms
stdlib: Handle Unicode atoms when formatting stacktraces
stdlib: Add more checks of module names to the linter
stdlib: Handle Unicode atoms better in io_lib_format
stdlib: Handle Unicode atoms in c.erl
|
|
|
|
As part of sys_core_fold, variables involved in bit syntax
matching would be annotated when it would be safe for a later
pass to do the delayed sub-binary creation optimization.
An implicit assumption regarding the annotation was that the
code must not be further optimized. That assumption was broken
in 05130e48555891, which introduced a fixpoint iteration
(applying the optimizations until there were no more changes).
That means that a variable could be annotated as safe for
reusing the match context in one iteration, but a later iteration
could rewrite the code in a way that would make the optimization
unsafe.
One way to fix this would be to clear all reuse_for_context
annotations before each iteration. But that would be wasteful.
Instead, I chose to fix the problem by moving the annotation
code out to a separate pass (sys_core_bsm) that is run later, after
all major optimizations of Core Erlang have been done.
|
|
compile:forms/1,2 is documented to return:
{ok,ModuleName,BinaryOrCode}
However, if one of the options 'from_core', 'from_asm', or
'from_beam' is given, ModuleName will be returned as [].
A worse problem is that if one of those options is
combined with the 'native' option, compilation will crash.
Correct compile:forms/1,2 to pick up the module name from
the forms provided (either Core Erlang, Beam assembly code,
or a Beam file).
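A minimal usage sketch of the corrected behaviour (the module name 'm'
and the variable Core, holding its Core Erlang representation, are
assumptions for illustration):
    %% After the fix, the module name is picked up from the Core Erlang
    %% code itself instead of being returned as [].
    {ok, m, Beam} = compile:forms(Core, [from_core, binary]).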
Reported here: https://bugs.erlang.org/browse/ERL-417
|
|
Make it clear that is_tagged_tuple/4 was added in OTP 20 (not R17).
|
|
Functions that are known to be pure can be evaluated at
compile-time if the arguments are literals and if the result is
expressible as a literal.
list_to_ref/1 and list_to_port/1 return terms that cannot be
expressed as literals, so the optimization is not possible.
The argument for port_to_list/1 is never a literal, so there is
no way to evaluate it at compile-time. Therefore, marking those
functions as pure serves no useful purpose.
Note: list_to_pid/1 *is* marked as pure, but only so that we can test
the code in sys_core_fold that rejects pure functions that evaluate to
a term that is not possible to express as a literal. It is sufficient
to have one pure function of that kind.
|
|
erlang:hash/2 was removed in c5d9b970fb5b3a71.
|
|
Add a test for utf8 function names
|
|
The test found a bug in v3_kernel_pp, which did not take
utf8 atoms into account. The bug has also been fixed.
|
|
The undocumented compiler option 'slim' is used when compiling
the primary bootstrap. The purpose is to make the bootstrap smaller
and to avoid unnecessary churn in the git repository. That is,
the BEAM file should be different only if the actual code in the
file is different, and not if it has merely been re-compiled on
a different computer.
Two commits have fattened the 'slim' option. In 36f7087ae0f,
extra chunks are included even in slim BEAM files. In dfb899c0229f7,
the "Dbgi" were added as an extra chunk, causing it to be included
in slim files.
Make 'slim' slim again by only including the essential chunks and
the attribute chunk (as was the case before the {extra,...} option
was added).
|
|
|
|
Introduce new "Dbgi" chunk
OTP-14369
|
|
* lukas/erts/list_to_port/OTP-14348:
erts: Add erlang:list_to_port/1 debug bif
erts: Auto-import port_to_list for consistency
erts: Polish off erlang:list_to_ref/1
|
|
|
|
Follow the same pattern as pid_to_list
|
|
By moving those options to effects_code_generation/1, there is no need
to explicitly remove them when storing compile
information in the DebugInfo chunk.
|
|
The new Dbgi chunk returns data in the following format:
{debug_info_v1, Backend, Data}
This allows compilers to store the debug info in different
formats. In order to retrieve a particular format, for
instance, Erlang Abstract Format, one may invoke:
Backend:debug_info(erlang_v1, Module, Data, Opts)
Besides introducing the chunk above, this commit also:
* Changes beam_lib:chunks(Beam, [abstract_code]) to
read from the new Dbgi chunk while keeping backwards
compatibility with old .beams
* Adds the {debug_info, {Backend, Data}} option to
compile:file/2 and friends; the given debug info is
stored in the Dbgi chunk. This allows the debug info
encryption mechanism to work across compilers
* Improves dialyzer to work directly on Core Erlang,
allowing languages that do not have the Erlang
Abstract Format to be dialyzed as long as they emit
the new chunk and their backend implementation is
available
Backwards compatibility is kept across the board except
for those calling beam_lib:chunks(Beam, ["Abst"]), as the
old chunk is no longer available. Note, however, that the "Abst"
chunk has always been optional.
Future OTP versions may remove parsing of the "Abst" chunk
altogether from beam_lib once Erlang 19 and earlier are no
longer supported.
The current Dialyzer implementation still supports earlier
.beam files; that support may also be removed in future versions.
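A minimal usage sketch (assuming the new chunk is exposed through
beam_lib under the 'debug_info' chunk reference, that Beam is bound to a
compiled module binary, and that the backend returns {ok, Forms}):
    {ok,{Module,[{debug_info,{debug_info_v1,Backend,Data}}]}} =
        beam_lib:chunks(Beam, [debug_info]),
    %% Ask the backend for the Erlang Abstract Format representation.
    {ok,Forms} = Backend:debug_info(erlang_v1, Module, Data, []).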
|
|
Fixes https://bugs.erlang.org/browse/ERL-406 - a bug introduced in
0377592dc2238f561291be854d2ce859dd9a5fb1
|
|
|
|
The main purpose of these options is compatibility with
old Erlang systems. Since it is no longer possible to
communicate with R15B or earlier, we no longer need the
r12 through r15 options.
|
|
* kill type information only for affected registers in get_map_elements
* bs_get_utf* will produce integers in the Unicode range
This optimises code created by the Elixir compiler, where:
<<x::utf8,_::binary>> when x in 1..10
will compile the guard to
is_integer(X) andalso X >= 1 andalso X =< 10
This allows us to eliminate the is_integer check.
* bs_get_float will produce a float
* allow type information to be carried over other bs instructions, killing
only the affected registers
* kill only x registers after call_fun and apply instructions
|
|
core_scan will now support and require atoms encoded in UTF-8.
|
|
Rewrite the instruction stream on tagged tuple tests.
A tagged tuple is a tuple of any arity with an atom as its first element,
typically a record, an ok-tuple or an error-tuple.
from:
...
{test,is_tuple,Fail,[Src]}.
{test,test_arity,Fail,[Src,Sz]}.
...
{get_tuple_element,Src,0,Dst}.
...
{test,is_eq_exact,Fail,[Dst,Atom]}.
...
to:
...
{test,is_tagged_tuple,Fail,[Src,Sz,Atom]}.
...
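A small source-level sketch of the kind of code that produces this
instruction sequence (the record and function names are made up for
illustration):
    -record(state, {count}).

    get_count(#state{count = N}) -> N.
Matching the #state{} record compiles to the is_tuple, test_arity,
get_tuple_element and is_eq_exact sequence above, which the new
is_tagged_tuple instruction replaces.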
|
|
Code such as the following:
-record(x, {a}).
f(R, N0) ->
N = N0 / 100,
if element(1, R#x.a) =:= 0 ->
N
end.
would fail to compile with the following message:
m: function f/2+19:
Internal consistency check failed - please report this bug.
Instruction: {fmove,{fr,0},{x,1}}
Error: {uninitialized_reg,{fr,0}}:
This bug was introduced in 348b5e6bee2f.
Basically, the beam_type pass placed the fmove instruction in the
wrong place. Instructions that store to floating point registers and
instructions that read from floating point registers are supposed to
be in the same basic block.
Fix the problem by flushing all floating point instructions
before a call to the pseudo-BIF is_record/3, thus making sure that
the fmove instruction is placed in the correct block.
Here is an annotated listing of the relevant part of the .S
file (before the fix):
{test_heap,{alloc,[{words,0},{floats,1}]},2}.
{fconv,{x,1},{fr,0}}.
{fmove,{float,100.0},{fr,1}}.
fclearerror.
{bif,fdiv,{f,0},[{fr,0},{fr,1}],{fr,0}}.
{fcheckerror,{f,0}}.
%% The instruction {fmove,{fr,0},{x,1}} should have
%% been here.
%% Block of instructions expanded from a call to
%% the pseudo-BIF is_record/3. (Expanded in a later
%% compiler pass.)
{test,is_tuple,{f,3},[{x,0}]}.
{test,test_arity,{f,3},[{x,0},2]}.
{get_tuple_element,{x,0},0,{x,2}}.
{test,is_eq_exact,{f,3},[{x,2},{atom,x}]}.
{move,{atom,true},{x,2}}.
{jump,{f,4}}.
{label,3}.
{move,{atom,false},{x,2}}.
{label,4}.
%% End of expansion.
%% The fmove instruction that beam_validator complains
%% about.
{fmove,{fr,0},{x,1}}.
Reported-by: Richard Carlsson
|
|
Binary construction that mixes long literal strings with variables
will make Dialyzer slow. Example:
<<"long string (thousand of characters)",T/binary>>
The string literals in binary construction is translated to one binary
segment per character; all those segments will slow down Dialyzer.
We can speed up Dialyzer if we combine several characters (up to 256)
into a single segment in the binary. This will also slightly speed up
the compiler.
This optimization will make Core Erlang listing files with binary strings
harder to read, but they were not that easy to read before this
change.
ERL-308
|
|
This allows languages such as Elixir and LFE to attach
extra chunks to the .beam file without having to parse
the beam file after compilation.
This commit also cleans up the interface to beam_asm,
allowing chunks to be passed from the compiler without
needing to change the beam_asm API for every new chunk.
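A minimal sketch of how a front end might attach a custom chunk when
compiling (the option name 'extra_chunks', the chunk name <<"ExCk">> and
the variable Forms are assumptions for illustration):
    %% Assumed: Forms holds the Erlang abstract forms for the module.
    Meta = term_to_binary(#{origin => lfe}),
    {ok,Mod,Beam} =
        compile:forms(Forms, [binary, {extra_chunks, [{<<"ExCk">>, Meta}]}]),
    %% The chunk can later be read back with beam_lib:chunks(Beam, ["ExCk"]).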
|