* maint:
Eliminate crash because of unsafe delaying of sub-binary creation
|
|
The following code would fail to compile:
    decode(<<Code/integer, Bin/binary>>) ->
        <<C1/integer, B1/binary>> = Bin,
        case C1 of
            X when X =:= 1 orelse X =:= 2 ->
                Bin2 = <<>>;
            _ ->
                Bin2 = B1
        end,
        case Code of
            1 -> decode(Bin2);
            _ -> Bin2
        end.
The error message would be:
    t: function decode/1+28:
      Internal consistency check failed - please report this bug.
      Instruction: return
      Error: {match_context,{x,0}}:
The beam_bsm pass would delay the creation of a sub-binary when it was
unsafe to do so. The culprit was the btb_follow_branch/3 function that
for performance reasons cached labels that had already been checked.
The problem was that the safety of a label also depends on the
contents of the registers. Therefore, the key for caching needs to be
both the label and the register contents.
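
The caching rule described above can be sketched as follows. This is a
minimal illustration, not the beam_bsm code: the module name, check/3,
and the is_safe/2 stand-in are all hypothetical. It only shows a cache
keyed on both the label and the register contents.

```erlang
-module(bsm_cache_sketch).
-export([check/3]).

%% Hypothetical sketch; none of these names come from beam_bsm.
%% The cache maps {Label, Regs} to a verdict, so a verdict computed for
%% one register state is never reused for a different one.
check(Label, Regs, Cache) ->
    Key = {Label, Regs},
    case maps:find(Key, Cache) of
        {ok, Verdict} ->
            {Verdict, Cache};
        error ->
            Verdict = is_safe(Label, Regs),
            {Verdict, Cache#{Key => Verdict}}
    end.

%% Stand-in for the real analysis (an assumption for illustration):
%% a label counts as safe unless {x,0} holds a match context.
is_safe(_Label, Regs) ->
    maps:get({x,0}, Regs, term) =/= match_context.
```

With a cache keyed on the label alone, the second lookup of the same
label would wrongly reuse the first verdict even if the registers
differ.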
Reported-by: José Valim
|
|
Internally in the v3_core pass, an #imatch{} record represents
a match expression:
Pattern = Expression
If Pattern is a single, unbound variable, #imatch{} will be
rewritten to #iset{}; otherwise it will be rewritten to #icase{}.
To determine how #imatch{} should be translated, the pattern is
processed using upattern/3. The return value from upattern/3 is thrown
away (after having been used for determining how the #imatch{} record
should be translated).
That means that every pattern in an #imatch{} is processed twice,
which is wasteful.
We can easily avoid the double processing of patterns by
introducing a new helper function that determines whether the
pattern is a new variable.
|
|
When manipulating Core Erlang trees it may be useful to perform some
operation when a node is visited, before inspecting its child nodes. The
definition of cerl_trees:mapfold/3 does not allow that, as it applies the
given function only after all the recursive calls on the child nodes
have been completed.
This patch adds a new argument to mapfold: a function that is applied when
a node is first entered.
As an example of its use, consider the case where one wants to move a
'call' node earlier, by adding a 'let' node and replacing the 'call' node
with the defined variable. The name of that variable must be specified
before one traverses the inner tree (especially if such replacements can be
nested).
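
The difference between the entry function and the existing function can
be sketched by recording the visiting order. The sketch assumes the
four-argument form cerl_trees:mapfold(Pre, Post, Initial, Tree) from the
compiler application; the module name and events/0 are hypothetical.

```erlang
-module(mapfold_sketch).
-export([events/0]).

%% Records the order in which nodes are entered and left.
events() ->
    %% {X, 1}: a tuple node with a variable and a literal as children.
    Tree = cerl:c_tuple([cerl:c_var('X'), cerl:c_int(1)]),
    %% Pre runs when a node is first entered, before its children.
    Pre = fun(Node, Acc) -> {Node, [{enter, cerl:type(Node)} | Acc]} end,
    %% Post runs after all children of the node have been processed.
    Post = fun(Node, Acc) -> {Node, [{leave, cerl:type(Node)} | Acc]} end,
    {_NewTree, Acc} = cerl_trees:mapfold(Pre, Post, [], Tree),
    lists:reverse(Acc).
```

The tuple node is entered before any of its children and left only
after all of them, which is what makes it possible to pick a variable
name before traversing the inner tree.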
|
|
Literal maps could cause dialyzer to crash when pretty printing the results.
Reported-by: Chris McGrath <[email protected]>
|
|
* maint:
Fix crash when attempting to update a fun as if it were a map
|
|
The following example would cause an internal consistency
failure in the compiler:
    f() -> ok.
    update() -> (fun f/0)#{u => 42}.
The reason is that internally, v3_core will (incorrectly)
rewrite update/0 to code similar to this:
    update() ->
        if
            is_map(fun f/0) ->
                maps:update(u, 42, fun f/0)
        end.
Since funs are not allowed to be created in guards, incorrect and
unsafe code would be generated.
It is easy to fix the bug. There already is an is_valid_map_src/1
function in v3_core that tests whether the argument for the map update
operation can possibly be a valid map. A fun is represented as a
variable with a special name in Core Erlang, so it would not be
recognized as unsafe. All we'll need to do to fix the bug is to look
closer at variables to ensure they don't represent funs. That will
ensure that the code is rewritten in the correct way:
    update() ->
        error({badmap,fun f/0}).
Reported-by: Thomas Arts
|
|
* maint:
Eliminate crash in v3_codegen
|
|
The following code would crash v3_codegen:
    order(From) ->
        catch
            if
                From#{[] => sufficient} ->
                    saint
            end.
Before explaining the crash, first some background on the stack
frame and the Y registers.
Certain instructions, most notably the 'call' instructions, clobber
all X registers. Before any such instruction, all X registers that
have values that will be used after the call must be saved to Y
registers (i.e. to the stack frame). adjust_stack/4 will be called
when X registers must be saved.
There is also another situation when X registers must be saved, namely
within a 'catch' if we are about to execute any instruction that may
cause an exception. Examples of such instructions are some guard BIFs
(such as length/1) and construction of binaries or maps. Within a
'catch', X registers must be saved because if an exception is
thrown and caught all X registers will be destroyed. The same
adjust_stack/4 function will be called for those instructions, but
only if they occur within a 'catch'.
There is actually one more complication. If there is code in
a guard within a catch, the X registers should not be saved, because
the code in a guard never clobbers any X registers that were alive
before the guard code was entered. v3_codegen is written with the
implicit assumption that code in guards never causes anything
to be saved to Y registers.
The code for building maps and binaries would incorrectly save X
registers within a guard inside a 'catch'.
For construction of binaries, that would mean that a useless but
harmless 'move' instruction was generated.
But for construction of maps, saving to the Y register would not
be harmless. There would be a crash when attempting to merge #sr{}
records. #sr{} records keep track of the contents of X and Y
registers. When two separate code paths are joined (e.g. at the end of
a 'case' statement), the register descriptors must be reconciled.
Basically, the register descriptors for both paths must be identical.
The #sr{} record for one path must not claim that {y,0} contains
a certain value, while another path claims that {y,0} is dead.
Thus, the crash occurs in sr_merge/2 when failing to reconcile the
Y registers.
To fix this bug we will introduce a new function called
maybe_adjust_stack/5. It will save X registers on the stack only
if the code is inside a catch but not inside a guard. We will
change all existing code to use this new function when appropriate.
Reported-by: Thomas Arts
|
|
* maint:
beam_bool: Fix unsafe optimization
|
|
beam_bool would make the following code unsafe (which would be
reported by beam_validator):
    scotland(Echo) ->
        found(case Echo of
                  Echo when true; Echo, Echo, Echo ->
                      Echo;
                  echo ->
                      []
              end,
              Echo = placed).
    found(_, _) -> million.
Basically, beam_bool would see that the 'case' would always return
the value of Echo. Thus:
    scotland(Echo) ->
        found(Echo, Echo = placed).
The only problem is that beam_bool would also remove a 'move'
instruction that would save Echo to the stack. Here is the
assembly code for part of the function:
    {allocate_zero,1,1}.
    {move,{x,0},{y,0}}.    %% Save Echo on stack.
    {bif,'=:=',{f,7},[{x,0},{atom,true}],{x,1}}.
    {bif,'=:=',{f,7},[{x,0},{atom,true}],{x,2}}.
    {bif,'=:=',{f,7},[{x,0},{atom,true}],{x,3}}.
    {bif,'and',{f,7},[{x,2},{x,3}],{x,2}}.
    {bif,'and',{f,7},[{x,1},{x,2}],{x,1}}.
    {jump,{f,8}}.
  {label,7}.
    {move,{atom,false},{x,1}}.
  {label,8}.
    {bif,'or',{f,6},[{atom,true},{x,1}],{x,1}}.
    {test,is_eq_exact,{f,6},[{x,1},{atom,true}]}.    %% Jump never taken.
    {jump,{f,5}}.
  {label,6}.
    {test,is_eq_exact,{f,9},[{x,0},{atom,echo}]}.
    {move,nil,{x,0}}.
    {jump,{f,5}}.
  {label,9}.
    {test_heap,3,0}.
    {put_tuple,2,{x,0}}.
    {put,{atom,case_clause}}.
    {put,{y,0}}.
    {line,[{location,"t.erl",5}]}.
    {call_ext,1,{extfunc,erlang,error,1}}.
    {jump,{f,5}}.
  {label,5}.
    {test,is_eq_exact,{f,12},[{atom,placed},{y,0}]}.
beam_bool would see that the is_eq_exact test at label 8 would
always succeed. It could therefore remove most of the code before
the jump to label 5. Unfortunately it also removed the essential
move of Echo to the stack:
    {allocate_zero,1,1}.
    %% Instruction incorrectly removed: {move,{x,0},{y,0}}.
    {jump,{f,5}}.
  {label,5}.
    {test,is_eq_exact,{f,12},[{atom,placed},{y,0}]}.
The root cause of the problem is that the 'move' instruction is
included in the block of 'bif' instructions before label 8.
Normally the 'move' instruction would not have been discarded,
but because the left operand to the 'or' BIF is 'true', the
entire block of 'bif' instructions is dropped.
As far as I can see, there is no gain by including 'move'
instructions in the first place. There is no way that better
code will be produced. In fact, the entire optimization can
be given up if 'move' instructions are found in the block.
Thus we can fix this bug by never including any 'move' instructions
in the block of 'bif' instructions. We can also remove all the
code that deals with 'move' instructions within blocks.
Reported-by: Thomas Arts
|
|
In most cases, we don't have to seed the random number generator,
as rand:uniform/1 takes care of that itself.
|
|
The 'random' module is used to pad the end of a block with random
bytes. The appropriate function to use in this case is
crypto:rand_bytes/1.
|
|
* maint:
Fix missing filename and line number in warning
Conflicts:
lib/compiler/test/bs_match_SUITE.erl
|
|
When the 'bin_opt_info' option is given, warnings without filenames
and line numbers could sometimes be produced:
    no_file: Warning: INFO: matching non-variables after
    a previous clause matching a variable will prevent delayed
    sub binary optimization
The reason for the missing information is that #c_alias{} records lack
location information. There are several ways to fix the problem. The
easiest seems to be to get the location information from the
code.
Noticed-by: José Valim
|
|
* bjorn/cleanup:
beam_validator: Don't allow an 'undefined' entry label in a function
beam_validator: Remove obsolete DEBUG support
v3_kernel: Speed up compilation of modules with many funs
beam_dict: Speed up storage of funs
beam_asm: Speed up assembly for modules with many exports
sys_core_dsetel: Use a map instead of a dict
sys_pre_expand: Cover coerce_to_float/2
Cover code for callbacks in sys_pre_expand
Cover sys_pre_expand:pattern/2
sys_pre_expand: Remove uncovered clause in pat_bit_size/2
sys_pre_expand: Clean up data structures
sys_pre_expand: Remove vestiges of variable usage tracking
sys_pre_expand: Remove imports of ordsets functions
sys_pre_expand: Remove unnecessary inclusion of erl_bits.hrl
io: Make a fast code path for i/o requests
|
|
Before 912fea0b beam_validator could validate disassembled files.
That's probably why the entry label was allowed to be 'undefined'.
|
|
No one has used the debug support in many years. Also, the debug
support is not free. There are calls to lists:foreach/2 that will be
executed even when debug support is turned off.
|
|
Using a map to store the number of free variables for funs instead of
an orddict will speed up the v3_kernel pass for modules with a huge
number of funs (such as NBAP-PDU-Contents in the asn1 test suite).
|
|
For huge modules with many funs (such as NBAP-PDU-Contents in the asn1
test suite), the call to length/1 in beam_dict:lambda/3 will dominate
the running time of the beam_asm pass.
|
|
Eliminate searching in the list of exported functions in favor of
using a map. For modules with a huge number of exported functions
(such as NBAP-PDU-Contents in the asn1 test suite), that will mean a
significant speed-up.
|
|
For large modules, a map is significantly faster than a dict.
|
|
The atom 'all' can never occur in a size field before sys_pre_expand
has been run.
|
|
The handling of non-remote calls is messy, with several lookups
to determine whether the call is local or to some imported module.
We can simplify the code if we keep a map that immediately gives
us the answer. Here is an example of what the map entries look
like:
    {f,1} => local
    {foldl,3} => {imported,lists}
That is, there should be a local call to f/1 and a remote call
to lists:foldl/3.
Note that there is no longer any need to keep the set of all defined
functions in the state record.
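
A lookup against such a map could look like the sketch below. The
module name, resolve/2, and the returned tuples are hypothetical and
only illustrate the single-lookup idea; they are not taken from
sys_pre_expand.

```erlang
-module(call_map_sketch).
-export([resolve/2]).

%% Hypothetical sketch: one map lookup decides how a non-remote call
%% to Name/Arity should be treated.
resolve({Name, Arity} = FA, Calls) ->
    case maps:find(FA, Calls) of
        {ok, local} ->
            {local_call, Name, Arity};
        {ok, {imported, Mod}} ->
            {remote_call, Mod, Name, Arity};
        error ->
            %% Neither defined locally nor imported; how such calls are
            %% handled is outside the scope of this sketch.
            unknown
    end.
```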
|
|
Before the Core Erlang passes were introduced a long time ago,
sys_pre_expand used to track used and new variables in order to
do lambda lifting (i.e. transform funs into ordinary Erlang
functions). Lambda lifting is now done in v3_kernel.
Remove the few remaining vestiges of variable tracking in the
comments and the code.
|
|
Importing from_list/1 and union/2 from the 'ordsets' module, while at
the same time making explicit calls to the functions with the
same name in the 'gb_sets' module, is confusing. Make all calls
to 'ordsets' explicit.
|
|
A record field type has been modified due to commit 8ce35b2:
"Take out automatic insertion of 'undefined' from typed record fields".
|
|
c288ab87 added beam_reorder to move get_tuple_element instructions.
Compiling code such as the following would crash the compiler:
    alloc(_U1, _U2, R) ->
        V = R#alloc.version,
        Res = id(V),
        _ = id(0),
        Res.
The crash would occur because the following two instructions:
    {get_tuple_element,{x,2},1,{x,1}}.
    {allocate_zero,1,2}.
were swapped and rewritten to:
    {allocate_zero,1,1}.
    {get_tuple_element,{x,2},1,{x,1}}.
That transformation is not safe because the allocate_zero instruction
would kill {x,2}, which is the register that is holding the reference
to the tuple. Only do the transformation when the tuple reference is
in an x register with a lower number than the destination register.
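
The condition in the last sentence can be written as a small predicate.
This is a sketch only: the module and function names are hypothetical,
and the real check in beam_reorder may take more into account.

```erlang
-module(reorder_sketch).
-export([safe_to_move/2]).

%% Hypothetical sketch: moving get_tuple_element past the allocation
%% is allowed only when the tuple lives in an x register with a lower
%% number than the destination register.
safe_to_move({x, Src}, {x, Dst}) -> Src < Dst;
safe_to_move(_, _) -> false.
```

For the instructions in the example above, the source is {x,2} and the
destination is {x,1}, so the predicate rejects the swap.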
|
|
There is an optimization in beam_clean that will remove values
having the same label as the failure label in a select_val
instruction. Conceptually, this optimization is in the wrong
module since ideally beam_clean is a mandatory pass that should
not do optimizations. Furthermore, this part of beam_clean is
called three times (from beam_dead, beam_peep, and as a compiler
pass from the 'compile' module), but it only does useful work one of
the times it is called.
Therefore, move this optimization to the beam_peep pass.
The same optimization is done in beam_dead, but unfortunately it
misses some opportunities for optimization because the code sharing
optimization in beam_jump (share/1) runs after beam_dead. It would
be more satisfactory to have this optimization only in beam_dead,
but it turned out not to be trivial. If we try to run
beam_jump:share/1 before beam_dead, some optimizations will no
longer work in beam_dead because fallthroughs have been eliminated.
For the moment, the possible solutions to this problem seem to
involve more work and more complicated code than would be justified
by eliminating the duplicated optimization.
|
|
There is an optimization in beam_block to simplify a select_val
on a known boolean value. We can implement this optimization
in a cleaner way in beam_type and it will also be applicable
in more situations. (When I added the optimization to beam_type
without removing the optimization from beam_block, the optimization
was applied 66 times.)
|
|
The ASN.1 compiler often generates code similar to:
    f(<<0:1,...>>) -> ...;
    f(<<1:1,...>>) -> ....
Internally that will be rewritten to (conceptually):
    f(<<B:1,Tail/binary>>) ->
        case B of
            0 ->
                case Tail of ... end;
            1 ->
                case Tail of ... end;
            _ ->
                error(case_clause)
        end.
Since B comes from a bit field of one bit, we know that the only
possible values are 0 and 1. Therefore the error clause can be
eliminated like this:
    f(<<B:1,Tail/binary>>) ->
        case B of
            0 ->
                case Tail of ... end;
            _ ->
                case Tail of ... end
        end.
Similarly, we can also deduce the range for an integer from
a 'band' operation with a literal integer.
While we are at it, also add a test case to improve the coverage.
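
The range reasoning can be sketched as plain arithmetic. All names
below are hypothetical and the code is not the compiler's type
analysis; it only shows how a range follows from a bit field size or a
non-negative 'band' mask, and how to tell that the matched values
cover the whole range.

```erlang
-module(range_sketch).
-export([bits_range/1, band_range/1, covers/2]).

%% An unsigned field of N bits holds values 0..2^N-1.
bits_range(N) when is_integer(N), N >= 0 ->
    {0, (1 bsl N) - 1}.

%% X band Mask with a non-negative Mask always lies in 0..Mask.
band_range(Mask) when is_integer(Mask), Mask >= 0 ->
    {0, Mask}.

%% True if every value in the range is matched by some clause, in
%% which case a trailing error clause can never be reached.
covers(Values, {Min, Max}) ->
    (lists:seq(Min, Max) -- Values) =:= [].
```

For a one-bit field, the clauses for 0 and 1 cover the range {0,1}, so
the case_clause branch is dead.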
|
|
The clause cannot possibly match, because there will always be
a {bif,...} clause that will match before reaching the fclearerror
instruction.
|
|
30cc5c90 changed the internal representation of catch and
try...catch, but beam_type was not updated in one place.
|
|
When the bit syntax is used to match a single binary literal, the bit
syntax instructions will be replaced with a comparison to a binary
literal. The only problem is that the bs_context_to_binary instruction
will not be eliminated.
Example:
    f(<<"string">>) ->
        ok.
This function would be translated to:
    {function, f, 1, 2}.
      {label,1}.
        {line,...}.
        {func_info,...}.
      {label,2}.
        {test,is_eq_exact,{f,3},[{x,0},{literal,<<"string">>}]}.
        {move,{atom,ok},{x,0}}.
        return.
      {label,3}.
        {bs_context_to_binary,{x,0}}.
        {jump,{f,1}}.
The bs_context_to_binary instruction serves no useful purpose,
since {x,0} can never be a match context. Eliminating the
instruction, the resulting code will be:
    {function, f, 1, 2}.
      {label,1}.
        {line,...}.
        {func_info,...}.
      {label,2}.
        {test,is_eq_exact,{f,1},[{x,0},{literal,<<"string">>}]}.
        {move,{atom,ok},{x,0}}.
        return.
|
|
In a select_val instruction, values associated with a label
which is the same as the failure label can be removed. We
already do this optimization in beam_clean, but it is better
to do this sort of optimization before the beam_jump pass.
Also rewrite a select_val instruction with a single value
to an is_eq_exact instruction followed by a jump instruction.
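
Both rewrites can be sketched on a simplified instruction format. The
{Value, Label} pair list and the module and function names are
assumptions made for illustration; the real select_val operand layout
is different, and the empty-list case is not described in this commit.

```erlang
-module(select_val_sketch).
-export([simplify/3]).

%% Hypothetical sketch on a simplified representation of select_val.
simplify(Src, Fail, Pairs0) ->
    %% Values that jump to the failure label are redundant.
    Pairs = [P || {_Val, Lbl} = P <- Pairs0, Lbl =/= Fail],
    case Pairs of
        [] ->
            %% Assumption: with nothing left, every value fails.
            [{jump, Fail}];
        [{Val, Lbl}] ->
            %% A single value becomes a test followed by a jump.
            [{test, is_eq_exact, Fail, [Src, Val]}, {jump, Lbl}];
        _ ->
            [{select_val, Src, Fail, Pairs}]
    end.
```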
|
|
In the future we might want to add more bit syntax optimizations,
but beam_block is already sufficiently complicated. Therefore, move
the bit syntax optimizations out of beam_block into a separate
compiler pass called beam_bs.
|
|
Knowing that a BIF returns an integer makes it possible to
replace '==' with the cheaper '=:=' test.
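
The two operators differ only when integers and floats are mixed, which
the sketch below demonstrates. It assumes that the other operand is
also known to be an integer; the exact conditions the compiler checks
are not reproduced here, and agree/2 is a hypothetical helper.

```erlang
-module(eq_sketch).
-export([agree/2]).

%% True when '==' and '=:=' give the same answer for A and B.
agree(A, B) ->
    (A == B) =:= (A =:= B).
```

For example, agree(length([a]), 1) holds because both operands are
integers, while agree(1, 1.0) does not, since 1 == 1.0 is true but
1 =:= 1.0 is false.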
|
|
Consider the following function:
    f(Bin, Bool) ->
        case Bin of
            <<Val:16/binary,_/binary>> when Bool ->
                Val
        end.
Simplified, the generated code looks like:
    bs_start_match2 Fail Live Bin => Bin
    bs_get_integer2 Fail Live Bin size=Sz unit=1 => Val
    bs_skip_bits2 Fail Bin size=all unit=8
    is_eq_exact Fail Bool true
The code generator will replace the bs_skip_bits2 instruction with
a bs_test_unit instruction if it can be clearly seen that the
context register will not be used again. In this case, it is not
obvious without looking at the code at the Fail label.
However, it turns out that the bs_test_unit instruction is always
safe because of the way v3_kernel compiles pattern matching.
It doesn't matter whether the match context will be used again.
If it will be used again, the position in it will *not* be used.
Instead, a bs_restore2 instruction will restore one of the saved
positions.
|
|
d0784035ab fixed a problem with register corruption. Because of
that, opt_moves/2 will never be asked to optimize instructions with
more than two destination registers. Therefore, to regain full
coverage of beam_block, remove the final clause in opt_moves/2.
|
|
* bjorn/compiler/remove-deprecated/OTP-12979:
core_lib: Remove previously deprecated functions
|
|
* c-rack/fix-typo3:
Fix typo in call_last/3 spec
Fix typo
Fix typo: message to send is in x(1) not x(0)
Fix another small typo
Fix typo
|
|
|