Age | Commit message (Collapse) | Author |
|
Some change in the BEAM compiler resulted in the creation of basic
blocks that differed from those previously created by the compiler.
As a result, the lazy code motion pass of RTL crashed when compiling
some of the new code.
Crashes were privately reported by @richcarl.
|
|
Make hipe compile option verify_gcsafe the default
|
|
Consider the following function:
function({function,Name,Arity,CLabel,Is0}, Lc0) ->
try
%% Optimize the code for the function.
catch
Class:Error:Stack ->
io:format("Function: ~w/~w\n", [Name,Arity]),
erlang:raise(Class, Error, Stack)
end.
The stacktrace is retrieved, but it is only used in the call
to erlang:raise/3. There is no need to build a stacktrace
in this function. We can avoid the building if we introduce
an instruction called raw_raise/3 that works exactly like
the erlang:raise/3 BIF except that its third argument must
be a raw stacktrace.
|
|
which is has been since 3d21f793538927ae88f78504a11dd898e8ca1a7a
|
|
|
|
by preventing it from doing GC, which generated code relies on.
|
|
This commit adds a new syntax for retrieving the stacktrace
without calling erlang:get_stacktrace/0. That allow us to
deprecate erlang:get_stacktrace/0 and ultimately remove it.
The problem with erlang:get_stacktrace/0 is that it can keep huge
terms in a process for an indefinite time after an exception. The
stacktrace can be huge after a 'function_clause' exception or a failed
call to a BIF or operator, because the arguments for the call will be
included in the stacktrace. For example:
1> catch abs(lists:seq(1, 1000)).
{'EXIT',{badarg,[{erlang,abs,
[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20|...]],
[]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
{erl_eval,expr,5,[{file,"erl_eval.erl"},{line,431}]},
{shell,exprs,7,[{file,"shell.erl"},{line,687}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]}}
2> erlang:get_stacktrace().
[{erlang,abs,
[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24|...]],
[]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
{erl_eval,expr,5,[{file,"erl_eval.erl"},{line,431}]},
{shell,exprs,7,[{file,"shell.erl"},{line,687}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]
3>
We can extend the syntax for clauses in try/catch to optionally bind
the stacktrace to a variable.
Here is an example using the current syntax:
try
Expr
catch C:E ->
Stk = erlang:get_stacktrace(),
.
.
.
In the new syntax, it would look like:
try
Expr
catch
C:E:Stk ->
.
.
.
Only a variable (not a pattern) is allowed in the stacktrace position,
to discourage matching of the stacktrace. (Matching would also be
expensive, because the raw format of the stacktrace would have to be
converted to the cooked form before matching.)
Note that:
try
Expr
catch E ->
.
.
.
is a shorthand for:
try
Expr
catch throw:E ->
.
.
.
If the stacktrace is to be retrieved for a throw, the 'throw:'
prefix must be explicitly included:
try
Expr
catch throw:E:Stk ->
.
.
.
|
|
|
|
Literal tags are used by the VM as an alternative to reserving a large
virtual memory space in order to be able to quickly identify which terms
are literals. The use of literal tags harms performance, but is useful
to support systems where allocating a large amount of virtual memory is
not an option.
|
|
HiPE has had metadata for gc safety on it's temporaries for a while, but
it has never been enforced or even checked, so naturally several
gc-safety violations has slipped through.
A new pass, hipe_rtl_verify_gcsafe verifies gcsafety on optimised RTL
and is used when running the testsuite, and can be manually enabled with
+{hipe,[verify_gcsafe]}.
|
|
Since gcunsafe values are live over is_divisible calls (although only
the happy path, which never GCd), it should be a primop so there cannot
be any GCs.
|
|
During deletion, the killing of expressions was not considered.
|
|
The lazy code motion optimisation pass could violate its guarantees
eliminating partial redundancy by moving an expression to before a call
instruction.
|
|
by introducing new primop 'is_unicode'
with no exception (ab)use and no GC.
Replaces bs_validate_unicode which is kept for backward compat for now.
|
|
|
|
|
|
|
|
* Omit bounds check in more cases.
A test case that needs this change to omit bounds check is added.
* Improve code generation by reformulating bounds check to decrease
register pressure.
|
|
|
|
With this change, both the matches and does not match cases have
fastpaths that does not need to call primops.
|
|
|
|
|
|
|
|
test_two_fixnums would previously only check its right-hand argument for
immediates, but not it's left. Now, test_two_fixnums reduces to
test_fixnum if either argument is an immediate.
|
|
By changing mask_and_compare from and,sub to sub,and, x86 can use a
3-address LEA immediate add, saving a mov. The RISC backends should see
no change in sequence length.
We make test_(heap_|sub)binary use mask_and_compare so they will benefit
too.
|
|
The addsub sequence was suboptimal when one of the arguments was
immediate, because it became an immediate alu followed by an immediate
alub, and the optimisers would not combine them due to the risk of
altering the branch. However, in this case we know that such a rewrite
is safe, and do it directly in hipe_tagscheme:fixnum_addsub/5 instead.
|
|
This makes the fast case a fallthrough and the slow case a branch,
hopefully improving cache locality.
|
|
With the introduction of immediate adds encoded as 'LEA' on x86, it is
now possible to do a fixnum add in two instructions and one branch by
commuting the addition and reusing the result register as a temporary,
which makes the 'alub' a 2-address add, saving a move instruction.
|
|
|
|
branch and alub overlap in their use cases, but the backends rely on
knowing that the result is unused in their lowering of branch. By
extending alub so that the destination is optional, it can fully replace
branch.
This simplifies rtl by reducing code duplication and the number of
instructions.
Also, in the x86 and arm backends, we can now use 'test' and
{'tst','mvn','teq'} to lower some alubs without destinations. This is
particularly good for x86, as sequences such as 'is_boxed' type tests
now get shorter (both from not needing a mov to copy the variable, but
also from the fact that 'testb' encodes shorter than 'andq').
|
|
|
|
This fixes a HiPE bug reported on erlang-questions on 2/11/2016.
The BEAM to ICode tranaslation of the bs_match_string instruction,
written long ago for binaries (i.e., with byte-sized strings), tried
to do a `clever' translation of even bit-sized strings using a HiPE
primop that took a `Size' argument expressed in *bytes*.
ICode is not really the place to do such a thing, and moreover there
is really no reason for the HiPE primop not to take a Size argument
expressed in *bits* instead. This commit changes the `Size' argument
to be in bits, postpones the translation of the bs_match_string primop
Fixed in a pair-programming/debugging session with @margnus1.
until RTL and does a proper translation using bit-sized quantities there.
|
|
Did not work with purge and made worse by new purge strategy.
Did yield terrible performance when fun thing is created *before*
fun code is loaded. Like when receiving not yet loaded fun
from other node. The cached 'native_address' in ErlFunThing
will not be updated leading to mode switch and error_handler
being called for every call to the fun from native code.
|
|
|
|
Immediate arguments to get_word_integer/4 would lead to bad but
unreachable RTL being generated. We omit its generation by testing for
immediates and performing the logic at compile time.
|
|
|
|
* Rewrite matching statements in ?when_option macro to form that silences
dialyzer's unmatched_return warnings
* Treat compiler warnings as errors when compiling files in main
|
|
|
|
|
|
|
|
* margnus1/bs_unit_fix:
hipe: Fix signed compares of unsigned sizes
beam: Fix overflow bug in i_bs_add_jId
hipe: Add tests for bad bit syntax float sizes
Add a case testing the handling of guards involving binaries
Add some more binary syntax construction tests
hipe: Guard against enormous numbers in ranges
hipe: Fix constructing huge binaries
hipe: Fix binary constructions failing with badarith
Add missing corner-case to bs_construct_SUITE
hipe: Allow unsigned args in hipe_rtl_arith
hipe: test unit size match in bs_put_binary_all
hipe: test unit size match in bs_append
Fix hipe_rtl_binary_construct:floorlog2/1
OTP-13272
|
|
The code generated by the HiPE compiler for pattern matching with
UTF binaries was such that sometimes THE_NON_VALUE was stored in
the roots followed by the garbage collector. This was not an issue
for the vanilla native code compiler, but was problematic for the
ErLLVM back-end.
Fix the issue by not storing THE_NON_VALUE in the live roots. An
alternative fix would be to change the code of the garbage collector.
With this fix, there are no more (known) failing test cases for the
ErLLVM back-end (at least on x86_64 with LLVM 3.5, which is the
configuation regularly tested). Thanks to @margnus1 for the fix.
|
|
Also, some of the branches were testing sizes in bits against a constant
?MAX_BINSIZE, which was in bytes. The signed comparisons masked this
mistake. These branches have been removed since all sizes in bits that
fit in a machine word are valid binary sizes.
Finally, a test that reproduces the issue was added to bs_construct,
along with a test for one of the cases (bs_init<0>(...)) when the test
against ?MAX_BINSIZE must be changed to unsigned rather than removed.
|
|
Bugs were fixed in
hipe_rtl_binary_match:{first_part/3,make_size/3,set_high/1} in commit
5aea81c49, but it turns out these had been copy-pasted verbatim into
hipe_rtl_binary_construct, where they were causing further bugs. They
have now moved to hipe_rtl_binary, from where they are included by the
other two modules.
Furthermore, first_part/3 (reamed get_word_integer/3, since it loads
integers that fits into an unsigned word), and make_size/3 now accepts a
fourth argument to distinguish too large arguments (which should cause a
system_limit exception) from negative or non-integral arguments.
The use of first_part/3 (get_word_integer/3) from 5aea81c49 in
hipe_rtl_binary_construct now allows several binary construction
instructions to accept bignum sizes, as they were supposed to.
Additionally, calls to
hipe_rtl_binary_construct:check_and_untag_fixnum/3 were replaced with
get_word_integer/4 since all of them were also supposed to accept
sufficiently small bignums, but didn't, and check_and_untag_fixnum/3 was
essentially identical to first_part/3 anyway.
HiPE is now capable of passing bs_construct_SUITE completely unmodified.
|
|
hipe_rtl_arith is only used by hipe_rtl_ssa_const_prop, which applies it
to any RTL, including RTL where the intent is to do unsigned math. Since
signed and unsigned operations produce the same 2's complement result,
this change is harmless.
On 32-bit architectures it caused HiPE crashes when compiling code like
<<0:((1 bsl 32)-1)>>, because the size of the field is too large to fit
in a signed integer.
|
|
The unit size field was previously completely discarded when lowering
this instruction from BEAM to Icode.
This feature was previously missing and expressions such as <<0,
<<1:1>>/binary>> would succeed construction when compiled with HiPE.
|
|
This feature was previously missing and expressions such as
<<<<1:1>>/binary>> would succeed construction when compiled with HiPE.
A primop is_divisible is introduced to handle the case when the unit
size is not a power of two.
|
|
Relying on double-precision floating-point arithmetic to compute the
log2 of an integer up to 64 bits long leads to rounding errors.
|
|
* kostis/hipe-bs-match-huge-bin:
Fix matching with huge binaries
Compile without errors for exported variables
OTP-13092
|
|
copy_offset_int_big was assuming (Offset + Size - 1) (Tmp9 in the first
BB) would not underflow. It was also unconditionally reading and writing
the binary even when Size was zero, unlike copy_int_little, which is the
only other case of bs_put_integer that does not have a short-circuit on
Size = 0.
This was causing segfaults when constructing binaries starting with a
zero-length integer field, because a logical right shift was used to
compute an offset in bytes (which became 0x1fffffffffffffff) to read in
the binary.
Tests, taken from the emulator bs_construct_SUITE, were also added.
The complete credit for the report and the fix goes to Magnus Lång.
|