Age | Commit message | Author |
|
Strengthen beam_validator to check that the stack is initialized
when an instruction with an {f,0} operand is executed.
For example, the following code sequence:
{allocate,0,1}.
{bif,element,{f,0},[{integer,1},{x,0}],{x,0}}.
should not be accepted because the stack may be scanned if
element/2 fails. That could cause a crash or other undefined
behavior if garbage on the stack looks like a catch tag.
|
|
|
|
beam_validator: Strengthen validation of GC instructions
OTP-14863
|
|
beam_validator did not verify that the Y registers were initialized
before executing the following instructions that could cause a GC:
bs_append/8
bs_init2/6
bs_init_bits/6
gc_bif1/5
gc_bif2/6
gc_bif3/7
test_heap/2
That means that, for example, an incorrect optimization that replaced
an 'allocate_zero' instruction with an 'allocate' instruction when it
was not safe would not be rejected by beam_validator, but would
instead cause a crash or other undefined behavior at runtime.
Also fix a minor bug in beam_type exposed by the stronger checking.
When compiling from .S files, beam_type did not handle the
init/1 instruction and could produce unsafe code.
|
|
The type optimizations for is_record and test_arity checked whether
the arity was equal to the size stored in the type information,
which is incorrect since said size is the *minimum* size of the
tuple (as determined by previous instructions) and not its exact
size.
A future patch to the 'master' branch will restore these
optimizations in a safe manner.
|
|
* lukas/compiler/add_to_dis/OTP-14784:
compiler: Add +to_dis option that dumps loaded asm
|
|
|
|
The compiler option 'deterministic' was only recognized when given
as an option to the compiler, not when it was specified in a
-compile() attribute in the source file.
https://bugs.erlang.org/browse/ERL-498
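For illustration, a minimal module that uses the attribute form might look
like the sketch below. The module name det_example and the function hello/0
are made up for this example.

```erlang
-module(det_example).
%% The option is given in the source file instead of on the command line.
-compile(deterministic).
-export([hello/0]).

%% Trivial function so that the module has something to export.
hello() -> world.
```

With the fix, compiling this file should behave as if 'deterministic' had
been passed directly as a compiler option.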
|
|
4c31fd0b9665 made the merging of match contexts stricter;
in fact, a little bit too strict.
Two match contexts with different numbers of slots would
be downgraded to the 'term' type. The correct way is to
keep the match context but set the number of slots to the
lower of the two slot counts.
https://bugs.erlang.org/browse/ERL-490
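The rule described above can be sketched as follows. The tuple
{match_context, NumSlots} is a hypothetical representation chosen for this
example and is not the validator's actual data structure.

```erlang
-module(ctx_merge_sketch).
-export([merge_ctx/2]).

%% Hypothetical representation: {match_context, NumSlots}.
%% Keep the match context. Only slots that exist in both inputs can be
%% relied on afterwards, so the result gets the smaller slot count.
merge_ctx({match_context, Slots1}, {match_context, Slots2}) ->
    {match_context, min(Slots1, Slots2)}.
```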
|
|
beam_validator could fail to issue a diagnostic when a register
that was supposed to be a match context was not guaranteed to
be a match context.
The bug was in merging of types. Merging of a match context with
another term would result in a match context. That is wrong. Merging
should produce a more general type, not a narrower type. Also, the
valid slots in two match contexts should be combined with 'band', not
'bor'.
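A minimal sketch of the corrected merge rule, assuming a simplified type
representation invented for this example: the atom term for "any term" and
{match_context, ValidBits}, where ValidBits is a bitmask of valid slots.

```erlang
-module(type_merge_sketch).
-export([merge/2]).

%% Hypothetical types, for illustration only:
%%   term                        any term (the most general type)
%%   {match_context, ValidBits}  match context with a bitmask of valid slots

%% Two match contexts: a slot is valid after the merge only if it is
%% valid on both incoming paths, hence 'band' and not 'bor'.
merge({match_context, Bits1}, {match_context, Bits2}) ->
    {match_context, Bits1 band Bits2};
%% Identical types merge to themselves.
merge(Same, Same) ->
    Same;
%% Anything else widens to the more general type.
merge(_, _) ->
    term.
```

For example, merging a match context with any other term yields term, so
a later instruction that requires a match context would be rejected.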
|
|
Merge branch 'john/compiler/fail-labels-in-blocks-otp-18/ERIERL-48/OTP-14522' into maint
* john/compiler/fail-labels-in-blocks-otp-18/ERIERL-48/OTP-14522:
compiler: Fix live regs update on allocate in validator
Take fail labels into account when determining liveness in block ops
Conflicts:
lib/compiler/src/beam_utils.erl
|
|
The state without pruned registers was passed on to test_heap,
causing the validator to believe that registers that aren't live
actually are live.
|
|
|
|
The sys_core_fold pass would do an unsafe "optimization" when an
apply operation did not have a variable in the function position
as in the following example:
> cat test1.core
module 'test1' ['test1'/2]
    attributes []
'i'/1 =
    fun (_f) -> _f
'test1'/2 =
    fun (_f, _x) ->
        apply apply 'i'/1 (_f) (_x)
end
> erlc test1.core
no_file: Warning: invalid function call
Reported-by: Mikael Pettersson
|
|
|
|
|
|
All keys in an orddict must be unique. sys_core_fold:sub_sub_scope/1
broke that rule. It was probably harmless, but it is better to
avoid such rule violations.
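As a reminder of the rule, the sketch below contrasts orddict:store/3, which
keeps keys unique, with consing onto the list directly, which does not. The
module and variable names are made up for this example.

```erlang
-module(orddict_sketch).
-export([demo/0]).

demo() ->
    D0 = orddict:from_list([{a,1},{b,2}]),
    %% store/3 replaces the existing entry for 'a', so keys stay unique.
    Good = orddict:store(a, 10, D0),
    %% Consing onto the list bypasses the API; 'a' now appears twice,
    %% so the result is no longer a valid orddict.
    Bad = [{a,10}|D0],
    {Good, Bad}.
```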
|
|
* hasse/unicode_atoms/OTP-14285:
compiler: Handle (bad) Unicode parse transform module names
kernel: Improve handling of Unicode filenames
stdlib: Handle Unicode atoms in ms_transform
stdlib: Improve Unicode handling of the Erlang parser
stdlib: Handle unknown compiler options with Unicode
stdlib: Handle Unicode macro names
stdlib: Correct Unicode handling in escript
dialyzer: Improve handling of Unicode
parsetools: Improve handling of Unicode atoms
stdlib: Handle Unicode atoms when formatting stacktraces
stdlib: Add more checks of module names to the linter
stdlib: Handle Unicode atoms better in io_lib_format
stdlib: Handle Unicode atoms in c.erl
|
|
|
|
As part of sys_core_fold, variables involved in bit syntax
matching would be annotated when it would be safe for a later
pass to do the delayed sub-binary creation optimization.
An implicit assumption regarding the annotation was that the
code must not be further optimized. That assumption was broken
in 05130e48555891, which introduced a fixpoint iteration
(applying the optimizations until there were no more changes).
That means that a variable could be annotated as safe for
reusing the match context in one iteration, but a later iteration
could rewrite the code in a way that would make the optimization
unsafe.
One way to fix this would be to clear all reuse_for_context
annotations before each iteration. But that would be wasteful.
Instead I chose to fix the problem by moving out the annotation
code to a separate pass (sys_core_bsm) that is run later after
all major optimizations of Core Erlang have been done.
|
|
compile:forms/1,2 is documented to return:
{ok,ModuleName,BinaryOrCode}
However, if one of the options 'from_core', 'from_asm', or
'from_beam' is given, ModuleName will be returned as [].
A worse problem is that if one of those options is
combined with the 'native' option, compilation will crash.
Correct compile:forms/1,2 to pick up the module name from
the forms provided (either Core Erlang, Beam assembly code,
or a Beam file).
Reported here: https://bugs.erlang.org/browse/ERL-417
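The corrected behavior can be exercised with a round trip through Core
Erlang, roughly as sketched below. The module names forms_sketch and
forms_demo are invented for this example, and the sketch assumes that the
'to_core' and 'binary' options together return the Core Erlang module in
memory.

```erlang
-module(forms_sketch).
-export([roundtrip/0]).

roundtrip() ->
    %% Abstract forms for a tiny module; 'forms_demo' is a made-up name.
    Forms = [{attribute,1,module,forms_demo},
             {attribute,2,export,[{f,0}]},
             {function,3,f,0,[{clause,3,[],[],[{atom,3,ok}]}]}],
    %% Produce Core Erlang in memory instead of writing a listing file.
    {ok, forms_demo, Core} = compile:forms(Forms, [to_core, binary]),
    %% Compile from Core Erlang. With the fix, the module name is picked
    %% up from the Core Erlang module instead of being returned as [].
    {ok, Name, Beam} = compile:forms(Core, [from_core]),
    true = is_binary(Beam),
    Name.
```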
|
|
Make it clear that is_tagged_tuple/4 was added in OTP 20 (not R17).
|
|
Functions that are known to be pure can be evaluated at
compile-time if the arguments are literals and if the result is
expressible as a literal.
list_to_ref/1 and list_to_port/1 return terms that cannot be
expressed as literals, so the optimization is not possible.
The argument for port_to_list/1 is never a literal, so there is
no way to evaluate it at compile-time. Therefore, marking those
functions as pure serves no useful purpose.
Note: list_to_pid/1 *is* marked as pure, but only so that we can test
the code in sys_core_fold that rejects pure functions that evaluate to
a term that is not possible to express as a literal. It is sufficient
to have one pure function of that kind.
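The idea can be sketched as follows. This is not the sys_core_fold
implementation; fold_call/3 and is_literal_term/1 are hypothetical helpers
written for this example, and the literal check is deliberately
conservative (it treats all funs as non-literal).

```erlang
-module(fold_sketch).
-export([fold_call/3]).

%% Try to evaluate a call to a pure function on literal arguments.
%% Returns {literal, Value} if the result can be written as a literal,
%% otherwise keep_call (the call must stay in the code).
fold_call(Mod, Fun, Args) ->
    try apply(Mod, Fun, Args) of
        Val ->
            case is_literal_term(Val) of
                true -> {literal, Val};
                false -> keep_call
            end
    catch
        %% A failing call must fail at runtime, not at compile time.
        _:_ -> keep_call
    end.

%% Pids, ports, references, and (conservatively) funs have no literal form.
is_literal_term(T) when is_pid(T); is_port(T);
                        is_reference(T); is_function(T) -> false;
is_literal_term([H|T]) -> is_literal_term(H) andalso is_literal_term(T);
is_literal_term(T) when is_tuple(T) -> is_literal_term(tuple_to_list(T));
is_literal_term(T) when is_map(T) -> is_literal_term(maps:to_list(T));
is_literal_term(_) -> true.
```

Under these assumptions, a call such as length([a,b,c]) folds to a literal,
while list_to_pid("<0.0.0>") is evaluated but then rejected because a pid
cannot be expressed as a literal.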
|
|
erlang:hash/2 was removed in c5d9b970fb5b3a71.
|
|
Add a test for utf8 function names
|
|
The test found a bug in v3_kernel_pp which was not
taking into account utf8 atoms. The bug has also
been fixed.
|
|
The undocumented compiler option 'slim' is used when compiling
the primary bootstrap. The purpose is to make the bootstrap smaller
and to avoid unnecessary churn in the git repository. That is,
the BEAM file should be different only if the actual code in the
file is different, and not if it has merely been re-compiled on
a different computer.
Two commits have fattened the 'slim' option. In 36f7087ae0f,
extra chunks are included even in slim BEAM files. In dfb899c0229f7,
the "Dbgi" chunk was added as an extra chunk, causing it to be included
in slim files.
Make 'slim' slim again by only including the essential chunks and
the attribute chunk (as was the case before the {extra,...} option
was added).
|
|
|
|
Introduce new "Dbgi" chunk
OTP-14369
|
|
* lukas/erts/list_to_port/OTP-14348:
erts: Add erlang:list_to_port/1 debug bif
erts: Auto-import port_to_list for consistency
erts: Polish off erlang:list_to_ref/1
|
|
|
|
Follow the same pattern as pid_to_list
|
|
By moving to effects_code_generation/1, there is no need
to explicitly remove those options when storing compile
information in the DebugInfo chunk.
|
|
The new Dbgi chunk returns data in the following format:
{debug_info_v1, Backend, Data}
This allows compilers to store the debug info in different
formats. In order to retrieve a particular format, for
instance, Erlang Abstract Format, one may invoke:
Backend:debug_info(erlang_v1, Module, Data, Opts)
Besides introducing the chunk above, this commit also:
* Changes beam_lib:chunks(Beam, [abstract_code]) to
read from the new Dbgi chunk while keeping backwards
compatibility with old .beams
* Adds the {debug_info, {Backend, Data}} option to
compile:file/2 and friends; the given data is stored in the
Dbgi chunk. This allows the debug info encryption
mechanism to work across compilers
* Improves dialyzer to work directly on Core Erlang,
allowing languages that do not have the Erlang
Abstract Format to be dialyzed as long as they emit
the new chunk and their backend implementation is
available
Backwards compatibility is kept across the board except
for those calling beam_lib:chunks(Beam, ["Abst"]), as the
old chunk is no longer available. Note however that the "Abst"
chunk has always been optional.
Future OTP versions may remove parsing the "Abst" chunk
altogether from beam_lib once Erlang 19 and earlier are no
longer supported.
The current Dialyzer implementation still supports earlier
.beam files, and that support may also be removed in future versions.
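A minimal sketch of reading the new chunk and asking the backend for the
Erlang Abstract Format, following the format described above. The module
names dbgi_sketch and dbgi_demo are invented for this example.

```erlang
-module(dbgi_sketch).
-export([abstract_code/0]).

abstract_code() ->
    %% Abstract forms for a tiny module; 'dbgi_demo' is a made-up name.
    Forms = [{attribute,1,module,dbgi_demo},
             {attribute,2,export,[{f,0}]},
             {function,3,f,0,[{clause,3,[],[],[{atom,3,ok}]}]}],
    %% Compile in memory with debug information included.
    {ok, dbgi_demo, Beam} = compile:forms(Forms, [debug_info]),
    %% Read the debug info chunk: {debug_info_v1, Backend, Data}.
    {ok, {dbgi_demo, [{debug_info, {debug_info_v1, Backend, Data}}]}} =
        beam_lib:chunks(Beam, [debug_info]),
    %% Ask the backend to convert its data to the Erlang Abstract Format.
    {ok, Abstract} = Backend:debug_info(erlang_v1, dbgi_demo, Data, []),
    Abstract.
```

A tool that needs Core Erlang instead would presumably request a different
format from the same backend, which is what lets other languages plug in
their own backends.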
|
|
Fixes https://bugs.erlang.org/browse/ERL-406 - a bug introduced in
0377592dc2238f561291be854d2ce859dd9a5fb1
|
|
|
|
The main purpose of these options is compatibility with
old Erlang systems. Since it is no longer possible to
communicate with R15B or earlier, we no longer need the
r12 through r15 options.
|
|
* kill type information only for affected registers in get_map_elements
* bs_get_utf* will produce integers in the Unicode range
This optimises code created by the Elixir compiler, where:
<<x::utf8,_::binary>> when x in 1..10
will compile the guard to
is_integer(X) andalso X >= 1 andalso X =< 10
This allows us to eliminate the is_integer check.
* bs_get_float will produce a float
* carry type information over other bs instructions, killing
only the affected registers
* kill only x registers after call_fun and apply instructions
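For reference, an Erlang function with the same shape as the Elixir example
above might look like the sketch below (the module and function names are
made up). Since a utf8 segment always binds an integer, the explicit
is_integer/1 test in the guard is the redundant check that can be removed.

```erlang
-module(utf8_guard_sketch).
-export([classify/1]).

%% X is bound by a utf8 segment, so it is always an integer; the
%% is_integer(X) test mirrors the guard that Elixir generates.
classify(<<X/utf8, _/binary>>) when is_integer(X), X >= 1, X =< 10 ->
    small;
classify(_) ->
    other.
```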
|
|
core_scan will now support and require atoms encoded in UTF-8.
|
|
Rewrite the instruction stream on tagged tuple tests.
A tagged tuple is a tuple of any arity with an atom as its first element,
typically a record, an ok-tuple, or an error-tuple.
from:
...
{test,is_tuple,Fail,[Src]}.
{test,test_arity,Fail,[Src,Sz]}.
...
{get_tuple_element,Src,0,Dst}.
...
{test,is_eq_exact,Fail,[Dst,Atom]}.
...
to:
...
{test,is_tagged_tuple,Fail,[Src,Sz,Atom]}.
...
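As a rough illustration, a single-clause match on a tagged tuple such as the
one below is the kind of source code that would be expected to give rise to
the is_tuple, test_arity, get_tuple_element, is_eq_exact sequence shown
above. The module and function names are invented, and the exact
instructions emitted depend on the compiler version.

```erlang
-module(tagged_sketch).
-export([unwrap/1]).

%% Matches a 2-tuple whose first element is the atom 'ok'.
unwrap({ok, Value}) ->
    Value.
```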
|
|
Code such as the following:
-record(x, {a}).
f(R, N0) ->
N = N0 / 100,
if element(1, R#x.a) =:= 0 ->
N
end.
would fail to compile with the following message:
m: function f/2+19:
Internal consistency check failed - please report this bug.
Instruction: {fmove,{fr,0},{x,1}}
Error: {uninitialized_reg,{fr,0}}:
This bug was introduced in 348b5e6bee2f.
Basically, the beam_type pass placed the fmove instruction in the
wrong place. Instructions that store to floating point registers and
instructions that read from floating point registers are supposed to
be in the same basic block.
Fix the problem by flushing all floating point instructions
before a call to the pseudo-BIF is_record/3, thus making sure that
the fmove instruction is placed in the correct block.
Here is an annotated listing of the relevant part of the .S
file (before the fix):
{test_heap,{alloc,[{words,0},{floats,1}]},2}.
{fconv,{x,1},{fr,0}}.
{fmove,{float,100.0},{fr,1}}.
fclearerror.
{bif,fdiv,{f,0},[{fr,0},{fr,1}],{fr,0}}.
{fcheckerror,{f,0}}.
%% The instruction {fmove,{fr,0},{x,1}} should have
%% been here.
%% Block of instructions expanded from a call to
%% the pseudo-BIF is_record/3. (Expanded in a later
%% compiler pass.)
{test,is_tuple,{f,3},[{x,0}]}.
{test,test_arity,{f,3},[{x,0},2]}.
{get_tuple_element,{x,0},0,{x,2}}.
{test,is_eq_exact,{f,3},[{x,2},{atom,x}]}.
{move,{atom,true},{x,2}}.
{jump,{f,4}}.
{label,3}.
{move,{atom,false},{x,2}}.
{label,4}.
%% End of expansion.
%% The fmove instruction that beam_validator complains
%% about.
{fmove,{fr,0},{x,1}}.
Reported-by: Richard Carlsson
|
|
Binary construction that mixes long literal strings with variables
will make Dialyzer slow. Example:
<<"long string (thousands of characters)",T/binary>>
The string literals in binary construction are translated to one binary
segment per character; all those segments will slow down Dialyzer.
We can speed up Dialyzer if we combine several characters (up to 256)
into a single segment in the binary. It will also slightly speed up the
compiler.
This optimization will make Core listing files with binary strings
harder to read, but they were not that easy to read before this
change.
ERL-308
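The combining step can be sketched as below. This is not the compiler's
code: segments/2 is a hypothetical helper, and representing each combined
segment as an {IntegerValue, SizeInBits} pair is an assumption made for
this example.

```erlang
-module(seg_sketch).
-export([segments/2]).

%% Split Str into chunks of at most Max characters and describe each
%% chunk as {IntegerValue, SizeInBits}, i.e. one segment per chunk
%% instead of one segment per character.
segments(Str, Max) when is_integer(Max), Max > 0 ->
    [to_segment(Chunk) || Chunk <- chunk(Str, Max)].

chunk([], _Max) ->
    [];
chunk(Str, Max) when length(Str) > Max ->
    {Chunk, Rest} = lists:split(Max, Str),
    [Chunk | chunk(Rest, Max)];
chunk(Str, _Max) ->
    [Str].

%% Assumes every character fits in one byte (0..255).
to_segment(Chars) ->
    {binary:decode_unsigned(list_to_binary(Chars)), 8 * length(Chars)}.
```

With a limit of 256, a string of a thousand characters would become four
segments rather than a thousand.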
|
|
This allows languages such as Elixir and LFE to attach
extra chunks to the .beam file without having to parse
the beam file after compilation.
This commit also cleans up the interface to beam_asm,
allowing chunks to be passed from the compiler without
a need to change the beam_asm API on every new chunk.
|
|
The new chunk stores atoms encoded in UTF-8.
beam_lib has also been modified to handle the new
'utf8_atoms' attribute while the 'atoms' attribute
may be a missing chunk from now on.
The binary_to_atom/2 BIF can now encode any utf8
binary with up to 255 characters.
The list_to_atom/1 BIF can now accept codepoints
higher than 255 with up to 255 characters (thanks
to Björn Gustavsson).
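A small sketch of the two BIFs with code points above 255, written with
explicit code points so that the example does not depend on source file
encoding. The module name is made up for this example.

```erlang
-module(utf8_atom_sketch).
-export([demo/0]).

demo() ->
    %% Three Cyrillic code points, all above 255.
    CodePoints = [1078, 1091, 1082],
    %% Build the same atom from a UTF-8 binary and from a code point list.
    FromBinary = binary_to_atom(<<1078/utf8, 1091/utf8, 1082/utf8>>, utf8),
    FromList = list_to_atom(CodePoints),
    FromBinary =:= FromList.
```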
|
|