Age | Commit message (Collapse) | Author |
|
Adds a new register allocator callback
Target:branch_preds(Instr, Context) which, for a control flow
instruction Instr, returns a list of tuples {Target, Probability} for
each label name Target that Instr may branch to. Probability is a float
between 0.0 and 1.0 and corresponds to the predicted probability that
control flow branches to the corresponding target. The probabilities may
sum to at most 1.0 (rounding errors aside). Note that a sum less than
1.0 is valid.
|
|
In addition to the temporary name rewriting that hipe_regalloc_prepass
does, range splitters also need to be able to insert move instructions,
as well as inserting new basic blocks in the control flow graph. The
following four callbacks are added for that purpose:
* Target:mk_move(Src, Dst, Context)
Returns a move instruction from the temporary (not just register
number) Src to Dst.
* Target:mk_goto(Label, Context)
Returns a unconditional control flow instruction that branches to the
label with name Label.
* Target:redirect_jmp(Instr, ToOld, ToNew, Context)
Modifies the control flow instruction Instr so that any control flow
that would go to a label with name ToOld instead goes to the label
with name ToNew.
* Target:new_label(Context)
Returns a fresh label name that does not belong to any existing block
in the current function, and is to be used to create a new basic
block in the control flow graph by calling Target:update_bb/4 with
this new name.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Also, use byte form for immediates up to 255, since there's no sign
extension in byte form.
HiPE seems to never generate negative test immediates currently, but we
should at least not output incorrect encodings.
|
|
|
|
It seems that most 3-address adds of temps can be move coalesced.
Therefore, we limit the behaviour added by 1567585dda8 to only affect
immediate adds.
Also, add conversion of immediate mov+sub to lea.
|
|
Although LEA is useful for three-address form adds, sometimes it is used
where a normal add would have sufficed (due to the addition being the
last use of one of the operands; but RTL lowering does not know that as
it does not have liveness information). As a workaround, we convert LEA
back to ADD when the destination is the same as one of the operands.
|
|
branch and alub overlap in their use cases, but the backends rely on
knowing that the result is unused in their lowering of branch. By
extending alub so that the destination is optional, it can fully replace
branch.
This simplifies rtl by reducing code duplication and the number of
instructions.
Also, in the x86 and arm backends, we can now use 'test' and
{'tst','mvn','teq'} to lower some alubs without destinations. This is
particularly good for x86, as sequences such as 'is_boxed' type tests
now get shorter (both from not needing a mov to copy the variable, but
also from the fact that 'testb' encodes shorter than 'andq').
|
|
|
|
|
|
This allows us to pass around the context data that
hipe_regalloc_prepass needs cleanly, without using process dictionary or
parameterised modules (like it was previous to this change).
|
|
This is sound because the liveness data structure only stores liveness
info at basic block boundaries, and the rewrites that happen in
TargetSpecific:check_and_rewrite/2 preserves all existing definitions
and uses, and all new liveness intervals, belonging to newly introduced
temporaries, are always local to a basic block, and thus do not show up
in the liveout or livein sets for the basic block.
|
|
These will not only be useful for hipe_regalloc_prepass, but also, after
the introduction of a mk_move/2 (or similar) callback, for the purpose
of range splitting.
Since the substitution needed to case over all the instructions, a new
module, hipe_x86_subst, was introduced to the x86 backend.
Due to differences in the 'jtab' field of a #jmp_switch{} between x86
and amd64, it regrettably needed to be duplicated to hipe_amd64_subst.
|
|
|
|
This is due to the improvements in hipe_temp_map, removing the need for
duplicated logic in the backends.
|
|
|
|
|
|
hipe_regalloc_prepass speeds up register allocation by spilling any temp
that is live over a call (which clobbers all register).
In order to detect these, a new function was added to the target
interface; defines_all_alloc/1, that takes an instruction and returns a
boolean.
|
|
This is primarily useful for heap allocations, as a two-address 'add'
can't be used to both copy the heap pointer to another register, and add
the tag.
|
|
For x86, additionally reuse liveness from float LSRA for the GP LSRA.
|
|
Most x86 passes were either linearise(pass(to_cfg(Code))) or trivially
rewritable to process a CFG. This saves a great deal of time and memory
churn when compiling large programs.
Now, there will only ever be a single Linear->CFG conversion, just after
lowering from RTL, and only ever a single CFG->Linear conversion, just
before the finalise pass. Both of these now happen in hipe_x86_main.
|
|
The x86 backend crashes if certain RTL optimisations were omitted,
preventing it from being usable at lower optimisation levels.
|
|
There is little point offering LSRA for x86 if we're still going to call
hipe_graph_coloring_regalloc for the floats. In particular, all
allocators except LSRA allocates an N^2 interference matrix, making them
unusable for really large functions.
|
|
|
|
|
|
|
|
and correct the name of another, erroneously spelt, option in the process.
|
|
|
|
Main problem:
A faulty HIPE_LITERAL_CRC was not detected by the loader.
Strangeness #1:
Dialyzer should ask the hipe compiler about the target checksum,
not an internal bif.
Strangeness #2:
The HIPE_SYSTEM_CRC checksum was based on the HIPE_LITERALS_CRC
checksum.
Solution:
New HIPE_ERTS_CHECKSUM which is an bxor of the two (now independent)
HIPE_LITERALS_CRC and HIPE_SYSTEM_CRC.
HIPE_LITERALS_CRC represents values that are assumed to stay constant
for different VM configurations of the same arch, and are therefor
hard coded into the hipe compiler.
HIPE_SYSTEM_CRC represents values that may differ between VM variants.
By default the hipe compiler asks the running VM for this checksum,
in order to create beam files for the same running VM.
The hipe compiler can be configured (with "make XCOMP=yes ...") to
create beam files for another VM variant, in which case HIPE_SYSTEM_CRC
is also hard coded.
ToDo:
Treat all erts properties the same. Either ask the running VM or hard
coded into hipe (if XCOMP=yes). This will simplify and reduce the risk
of dangerous mismatches. One concern might be the added overhead
from more frequent calls to hipe_bifs:get_rts_param.
|
|
|
|
RTL can produce an #fconv{} instruction with an immediate operand, but
the backends unconditionally access the operand as a temporary. This
results in broken representation in the backends and eventually they
crash.
|
|
All backends (e.g. arm, ppc, sparc, x86) share the same code for the following
functions:
* find_const/2,
* mk_data_relocs/2, and
* slim_sorted_exportmap/3
This commit moves those definitions (along with some helper functions) in
misc/hipe_pack_constants.erl and adds the appropriate specs. This is a
structural change; no change in semantics intented.
|
|
|
|
|
|
OTP-10106
OTP-10107
|
|
|
|
|
|
|
|
environment after a number of bugs are fixed and some features
are added in the documentation build process.
- The arity calculation is updated.
- The module prefix used in the function names for bif's are
removed in the generated links so the links will look like
http://www.erlang.org/doc/man/erlang.html#append_element-2
instead of
http://www.erlang.org/doc/man/erlang.html#erlang:append_element-2
- Enhanced the menu positioning in the html documentation when a
new page is loaded.
- A number of corrections in the generation of man pages (thanks
to Sergei Golovan)
- Moved some man pages to more apropriate sections, pages in
section 4 moved to 5 and pages in 6 moved to 7.
- The legal notice is taken from the xml book file so OTP's
build process can be used for non OTP applications.
|
|
|