Age | Commit message (Collapse) | Author |
|
|
|
|
|
* Omit bounds check in more cases.
A test case that needs this change to omit bounds check is added.
* Improve code generation by reformulating bounds check to decrease
register pressure.
|
|
|
|
With this change, both the matches and does not match cases have
fastpaths that does not need to call primops.
|
|
|
|
|
|
|
|
test_two_fixnums would previously only check its right-hand argument for
immediates, but not it's left. Now, test_two_fixnums reduces to
test_fixnum if either argument is an immediate.
|
|
By changing mask_and_compare from and,sub to sub,and, x86 can use a
3-address LEA immediate add, saving a mov. The RISC backends should see
no change in sequence length.
We make test_(heap_|sub)binary use mask_and_compare so they will benefit
too.
|
|
The addsub sequence was suboptimal when one of the arguments was
immediate, because it became an immediate alu followed by an immediate
alub, and the optimisers would not combine them due to the risk of
altering the branch. However, in this case we know that such a rewrite
is safe, and do it directly in hipe_tagscheme:fixnum_addsub/5 instead.
|
|
This makes the fast case a fallthrough and the slow case a branch,
hopefully improving cache locality.
|
|
With the introduction of immediate adds encoded as 'LEA' on x86, it is
now possible to do a fixnum add in two instructions and one branch by
commuting the addition and reusing the result register as a temporary,
which makes the 'alub' a 2-address add, saving a move instruction.
|
|
|
|
branch and alub overlap in their use cases, but the backends rely on
knowing that the result is unused in their lowering of branch. By
extending alub so that the destination is optional, it can fully replace
branch.
This simplifies rtl by reducing code duplication and the number of
instructions.
Also, in the x86 and arm backends, we can now use 'test' and
{'tst','mvn','teq'} to lower some alubs without destinations. This is
particularly good for x86, as sequences such as 'is_boxed' type tests
now get shorter (both from not needing a mov to copy the variable, but
also from the fact that 'testb' encodes shorter than 'andq').
|
|
|
|
This fixes a HiPE bug reported on erlang-questions on 2/11/2016.
The BEAM to ICode tranaslation of the bs_match_string instruction,
written long ago for binaries (i.e., with byte-sized strings), tried
to do a `clever' translation of even bit-sized strings using a HiPE
primop that took a `Size' argument expressed in *bytes*.
ICode is not really the place to do such a thing, and moreover there
is really no reason for the HiPE primop not to take a Size argument
expressed in *bits* instead. This commit changes the `Size' argument
to be in bits, postpones the translation of the bs_match_string primop
Fixed in a pair-programming/debugging session with @margnus1.
until RTL and does a proper translation using bit-sized quantities there.
|
|
Did not work with purge and made worse by new purge strategy.
Did yield terrible performance when fun thing is created *before*
fun code is loaded. Like when receiving not yet loaded fun
from other node. The cached 'native_address' in ErlFunThing
will not be updated leading to mode switch and error_handler
being called for every call to the fun from native code.
|
|
|
|
Immediate arguments to get_word_integer/4 would lead to bad but
unreachable RTL being generated. We omit its generation by testing for
immediates and performing the logic at compile time.
|
|
|
|
* Rewrite matching statements in ?when_option macro to form that silences
dialyzer's unmatched_return warnings
* Treat compiler warnings as errors when compiling files in main
|
|
|
|
|
|
|
|
* margnus1/bs_unit_fix:
hipe: Fix signed compares of unsigned sizes
beam: Fix overflow bug in i_bs_add_jId
hipe: Add tests for bad bit syntax float sizes
Add a case testing the handling of guards involving binaries
Add some more binary syntax construction tests
hipe: Guard against enormous numbers in ranges
hipe: Fix constructing huge binaries
hipe: Fix binary constructions failing with badarith
Add missing corner-case to bs_construct_SUITE
hipe: Allow unsigned args in hipe_rtl_arith
hipe: test unit size match in bs_put_binary_all
hipe: test unit size match in bs_append
Fix hipe_rtl_binary_construct:floorlog2/1
OTP-13272
|
|
The code generated by the HiPE compiler for pattern matching with
UTF binaries was such that sometimes THE_NON_VALUE was stored in
the roots followed by the garbage collector. This was not an issue
for the vanilla native code compiler, but was problematic for the
ErLLVM back-end.
Fix the issue by not storing THE_NON_VALUE in the live roots. An
alternative fix would be to change the code of the garbage collector.
With this fix, there are no more (known) failing test cases for the
ErLLVM back-end (at least on x86_64 with LLVM 3.5, which is the
configuation regularly tested). Thanks to @margnus1 for the fix.
|
|
Also, some of the branches were testing sizes in bits against a constant
?MAX_BINSIZE, which was in bytes. The signed comparisons masked this
mistake. These branches have been removed since all sizes in bits that
fit in a machine word are valid binary sizes.
Finally, a test that reproduces the issue was added to bs_construct,
along with a test for one of the cases (bs_init<0>(...)) when the test
against ?MAX_BINSIZE must be changed to unsigned rather than removed.
|
|
Bugs were fixed in
hipe_rtl_binary_match:{first_part/3,make_size/3,set_high/1} in commit
5aea81c49, but it turns out these had been copy-pasted verbatim into
hipe_rtl_binary_construct, where they were causing further bugs. They
have now moved to hipe_rtl_binary, from where they are included by the
other two modules.
Furthermore, first_part/3 (reamed get_word_integer/3, since it loads
integers that fits into an unsigned word), and make_size/3 now accepts a
fourth argument to distinguish too large arguments (which should cause a
system_limit exception) from negative or non-integral arguments.
The use of first_part/3 (get_word_integer/3) from 5aea81c49 in
hipe_rtl_binary_construct now allows several binary construction
instructions to accept bignum sizes, as they were supposed to.
Additionally, calls to
hipe_rtl_binary_construct:check_and_untag_fixnum/3 were replaced with
get_word_integer/4 since all of them were also supposed to accept
sufficiently small bignums, but didn't, and check_and_untag_fixnum/3 was
essentially identical to first_part/3 anyway.
HiPE is now capable of passing bs_construct_SUITE completely unmodified.
|
|
hipe_rtl_arith is only used by hipe_rtl_ssa_const_prop, which applies it
to any RTL, including RTL where the intent is to do unsigned math. Since
signed and unsigned operations produce the same 2's complement result,
this change is harmless.
On 32-bit architectures it caused HiPE crashes when compiling code like
<<0:((1 bsl 32)-1)>>, because the size of the field is too large to fit
in a signed integer.
|
|
The unit size field was previously completely discarded when lowering
this instruction from BEAM to Icode.
This feature was previously missing and expressions such as <<0,
<<1:1>>/binary>> would succeed construction when compiled with HiPE.
|
|
This feature was previously missing and expressions such as
<<<<1:1>>/binary>> would succeed construction when compiled with HiPE.
A primop is_divisible is introduced to handle the case when the unit
size is not a power of two.
|
|
Relying on double-precision floating-point arithmetic to compute the
log2 of an integer up to 64 bits long leads to rounding errors.
|
|
* kostis/hipe-bs-match-huge-bin:
Fix matching with huge binaries
Compile without errors for exported variables
OTP-13092
|
|
copy_offset_int_big was assuming (Offset + Size - 1) (Tmp9 in the first
BB) would not underflow. It was also unconditionally reading and writing
the binary even when Size was zero, unlike copy_int_little, which is the
only other case of bs_put_integer that does not have a short-circuit on
Size = 0.
This was causing segfaults when constructing binaries starting with a
zero-length integer field, because a logical right shift was used to
compute an offset in bytes (which became 0x1fffffffffffffff) to read in
the binary.
Tests, taken from the emulator bs_construct_SUITE, were also added.
The complete credit for the report and the fix goes to Magnus Lång.
|
|
In certain cases of matching with very big binaries, the HiPE compiler
generated code that would fail the match, even in cases that the matching
was successful. The problem was more quite noticeable on 32-bit platforms
where certain integer quantities would be represented as bignums.
Brief summary of changes:
* gen_rtl({bs_skip_bits, ...}, ...) could not handle too large constants.
Previously the constants were truncated to word size.
* hipe_rtl_binary_match:make_size/3 erroneously assumed that the output
of first_part/3 would not overflow when multiplied by 8, which is no
longer true. To maintain full performance, the overflow test is only
performed when BitsVar was a bignum. Thus, the fast path is identical
to before.
* hipe_rtl_binary_match:set_high/2 was assuming that only bits below
bit 27 were ever set in arguments to bs_skip_bits, which is not only
false when the arguments are bignums, but also on 64-bit platforms.
The commit includes a test taken from the bs_match_bin_SUITE.
Most of the credit for finding these HiPE compiler errors and for
creating appropriate fixes for them should go to Magnus Lång.
|
|
|
|
OTP-12845
* bruce/change-license:
fix errors caused by changed line numbers
Change license text to APLv2
|
|
|
|
For quite some time now, this module generated a (quite harmless)
dialyzer warning. Comment out a clause which was actually unreachable.
While at it, do some small code refactorings here and there.
|
|
Only call emasculate_binary if ProcBin.flags is set,
which means it's a writable binary.
|
|
Seen symptom: Hipe compiled code with <<C/utf8, ...>> = Bin does sometimes
not match even though Bin contains a valid utf8 character. There might be
other possible binary matching symptoms, as the problem is not utf8
specific.
Problem: A writable binary was not "emasculated" when the matching started
(as it should) by the hipe compiled code.
Fix: Add a new primop emasculate_binary(Bin) that is called when
a matchstate is created.
ToDo: There are probably room for optimization. For example only call
emasculate_binary if ProcBin.flags is set.
|
|
hipe_rtl:phi_remove_pred/2 can produce a #move{} instruction with
floating-point temporaries as operands, even though such moves MUST
be #fmove{} instructions.
Added type checks to the #move{} and #fmove{} constructor and setter
functions to ensure that similar mishaps cannot happen again.
|
|
|
|
Namely, extend the HiPE tagging scheme info, properly handle the translation
of the (is_)map type test to Icode and RTL and support handling of the map()
type in the type system.
While at it, also performed some clean up of things that needed small fixes.
|
|
* ks/hipe-rtl-remove-constant/OTP-11822:
Remove RTL code that handled the (is_)constant guard
|
|
The (is_)constant/1 guard is removed from Erlang long ago and thus
there is no need to handle it in RTL.
While editing these files, also performed some minor cleanup.
|
|
Extend the 'rtl_var' definition to host liveness information needed for
a simple liveness analysis performed by the LLVM backend.
Also, uncomment some function definitions (mostly simple accessors) that
are already there and are useful for the translation of RTL code to LLVM
assembly.
Finally, extend the RTL 'call' instruction with the 'normalcontinuation'
field which is required for translating calls that are in the scope of
an exception handler. This extra field is required in order to point to
a new basic block which will hold the 'unwind label' of LLVM's invoke
instruction. While the 'unwind label' is semantically equivalent with
the 'failcontinuation', in LLVM this block must have a 'landingpad'
instruction. The problem arises by the fact that an RTL 'continuation'
block can also be accessible by other paths in the RTL CFG, and so
cannot be marked as a landing pad. To overcome this issue we create a
new block (the 'normalcontinuation') which is used as the 'unwind label'
of LLVM's invoke instruction and which will eventually transfer control
to the 'continuation' block.
|
|
|
|
|