Age | Commit message | Author |
|
to not be transformed to local calls.
code_SUITE:upgrade still fails if run with compile option
{hipe,to_llvm}
|
|
Did not work with purge, and was made worse by the new purge strategy.
Also yielded terrible performance when the fun thing is created *before*
the fun code is loaded, as when receiving a not-yet-loaded fun
from another node. The cached 'native_address' in ErlFunThing
is then never updated, leading to a mode switch and error_handler
being called for every call to the fun from native code.
|
|
by introducing hipe_bifs:commit_patch_load/1
that creates the HipeModule.
|
|
|
|
Just like the BEAM loader state (as returned by
erlang:prepare_loading/2), the HiPE loader state is contained in a magic
binary.
Eventually, we will separate HiPE loading into a prepare and a finalise
phase, like the BEAM loader, where the prepare phase will be implemented
by hipe_unified_loader and the finalise phase be implemented in C by
hipe_load.c and beam_load.c, making prepare side-effect free and
finalise atomic. The finalise phase will be exposed through the
erlang:finish_loading/1 API, just like the BEAM loader, as this will
allow HiPE and BEAM modules to be mixed in the same atomic "commit".
The usage of a loader state makes it easier to keep track of all
resources allocated during loading, and will not only make it easy to
prevent leaks when hipe_unified_loader crashes, but also pave the way
for proper, leak-free, unloading of HiPE modules.
|
|
and hipe_bifs:update_code_size
|
|
A step toward better integration of hipe load and purge
Highlights:
* code_server no longer needs to call hipe_unified_loader:post_beam_load/1
Instead new internal function hipe_redirect_to_module()
is called by loading BIFs to patch native call sites if needed.
* hipe_purge_module() is called by erts_internal:purge_module/2
to purge any native code.
* struct hipe_mfa_info redesigned and only used for exported
functions that are called from or implemented by native code.
A list of native call sites (struct hipe_ref) is kept for each hipe_mfa_info.
* struct hipe_sdesc used by hipe_find_mfa_from_ra()
to build native stack traces.
|
|
* rickard/time-unit/OTP-13831:
Replace usage of deprecated time units
|
|
=== OTP-19.1 ===
Changed Applications:
- asn1-4.0.4
- common_test-1.12.3
- compiler-7.0.2
- crypto-3.7.1
- debugger-4.2.1
- dialyzer-3.0.2
- diameter-1.12.1
- edoc-0.8
- erl_docgen-0.6
- erl_interface-3.9.1
- erts-8.1
- eunit-2.3.1
- gs-1.6.2
- hipe-3.15.2
- ic-4.4.2
- inets-6.3.3
- jinterface-1.7.1
- kernel-5.1
- mnesia-4.14.1
- observer-2.2.2
- odbc-2.11.3
- parsetools-2.1.3
- reltool-0.7.2
- runtime_tools-1.10.1
- sasl-3.0.1
- snmp-5.2.4
- ssh-4.3.2
- ssl-8.0.2
- stdlib-3.1
- syntax_tools-2.1
- tools-2.8.6
- wx-1.7.1
- xmerl-1.3.12
Unchanged Applications:
- cosEvent-2.2.1
- cosEventDomain-1.2.1
- cosFileTransfer-1.2.1
- cosNotification-1.2.2
- cosProperty-1.2.1
- cosTime-1.2.2
- cosTransactions-1.3.2
- eldap-1.2.2
- et-1.6
- megaco-3.18.1
- orber-3.8.2
- os_mon-2.4.1
- otp_mibs-1.1.1
- percept-0.9
- public_key-1.2
- typer-0.9.11
Conflicts:
OTP_VERSION
lib/gs/doc/src/notes.xml
lib/gs/vsn.mk
|
|
|
|
* maint:
erl_bif_types: Properly unopaque maps:merge/2 args
|
|
into maint
* margnus1/dialyzer/fix_maps_opaque/ERL-249/PR-1161/OTP-13878:
erl_bif_types: Properly unopaque maps:merge/2 args
|
|
* sverker/hipe-speedy-reg-alloc/PR-1159:
hipe: Refactor ra callbacks to accept context arg
hipe: Reuse liveness between regalloc iterations
hipe: Add ra_partitioned to o1 and up
hipe_regalloc_prepass: Change splitting heuristic
hipe: Make sure prepass temps are below SpillLimit
hipe_regalloc_prepass: Rename coloring collisions
hipe_ppc: Add code rewrite RA callbacks
hipe_sparc: Add code rewrite RA callbacks
hipe_arm: Add code rewrite RA callbacks
hipe_x86: Add code rewrite RA callbacks
hipe: Remove defun_to_cfg/1 RA callback
Add new sanity assertion to hipe_regalloc_prepass
Simplify hipe_x86_ra_finalise:conv_ra_maplet/3
hipe_x86: Simplify ra_postconditions is_mem_opnd
hipe_x86: Fix pseudo_tailcall prettyprinting
hipe_x86: Extra sanity assertions
hipe: clean up unnecessary catches
hipe: Remove temp reuse from call_fun
hipe: Add IG partitioning to hipe_regalloc_prepass
hipe: Add hipe_regalloc_prepass
|
|
erl_bif_types:type/5 was calling erl_types:map_pairwise_merge/3 directly
with its (potentially opaque) arguments, causing Dialyzer crashes.
Bug (ERL-249) reported and minimised test case provided by Felipe
Ripoll.
|
|
|
|
This allows us to pass around the context data that
hipe_regalloc_prepass needs cleanly, without using process dictionary or
parameterised modules (as was done before this change).
|
|
This is sound because the liveness data structure only stores liveness
info at basic block boundaries, and the rewrites that happen in
TargetSpecific:check_and_rewrite/2 preserve all existing definitions
and uses, and all new liveness intervals, belonging to newly introduced
temporaries, are always local to a basic block, and thus do not show up
in the liveout or livein sets for the basic block.
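
The argument above can be checked on a toy example. The following is a minimal Python sketch, not HiPE code: the CFG encoding, the `liveness` helper and the temp names are all hypothetical. It computes livein/liveout per basic block, then rewrites one block so that it uses a fresh temp `t` that is defined and last used inside that block, and compares the boundary sets.

```python
def liveness(cfg):
    """Backward liveness over a toy CFG (hypothetical encoding).

    cfg maps label -> (instructions, successors); each instruction is a
    (defs, uses) pair of temp-name lists.
    """
    live_in = {label: set() for label in cfg}
    live_out = {label: set() for label in cfg}
    changed = True
    while changed:
        changed = False
        for label, (instrs, succs) in cfg.items():
            out = set()
            for succ in succs:
                out |= live_in[succ]
            live = set(out)
            # Walk the block backwards: kill definitions, then add uses.
            for defs, uses in reversed(instrs):
                live = (live - set(defs)) | set(uses)
            if out != live_out[label] or live != live_in[label]:
                live_out[label], live_in[label] = out, live
                changed = True
    return live_in, live_out


before = {
    "entry": ([(["a"], []), (["b"], [])], ["body"]),
    "body": ([(["c"], ["a", "b"])], ["exit"]),
    "exit": ([([], ["c"])], []),
}

# Rewrite "body": route `a` through a fresh temp `t` whose whole live
# range is inside the block; a, b and c are still defined and used.
after = dict(before)
after["body"] = ([(["t"], ["a"]), (["c"], ["t", "b"])], ["exit"])

same_boundaries = liveness(before) == liveness(after)
```

Here `same_boundaries` is true: the new temp never reaches a block boundary, so the previously computed sets remain valid for the rewritten code.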
|
|
|
|
ra_partitioned significantly speeds up register allocation of larger
functions without affecting allocation quality negatively. This is the
final change needed to make o1 suitable for compiling really large
functions without choking.
|
|
The division into an initial pass that may introduce temps, and
following passes that must not, forces us to make the same heuristic
decision during each of these passes. Thus, the splitting heuristic
can't be based on the number of temporaries -- at least without
excluding temporaries above SpillLimit.
|
|
Implement ceil/1 and floor/1 as new guard BIFs (essentially part of the
Erlang language). They are guard BIFs because trunc/1 is a guard
BIF. It would be strange to have trunc/1 as a part of the language, but
not ceil/1 and floor/1.
|
|
If temps introduced by hipe_regalloc_prepass end up above SpillLimit,
the register allocators will not spill them. This constraint is
unnecessarily limiting the allocators and might theoretically lead to
unallocatable programs (more temps above SpillLimit alive at a time than
there are physical registers).
|
|
|
|
These will not only be useful for hipe_regalloc_prepass, but also, after
the introduction of a mk_move/2 (or similar) callback, for the purpose
of range splitting.
Since the substitution needed to case over all the instructions, a new
module, hipe_ppc_subst, was introduced to the ppc backend.
|
|
These will not only be useful for hipe_regalloc_prepass, but also, after
the introduction of a mk_move/2 (or similar) callback, for the purpose
of range splitting.
Since the substitution needed to case over all the instructions, a new
module, hipe_sparc_subst, was introduced to the sparc backend.
|
|
These will not only be useful for hipe_regalloc_prepass, but also, after
the introduction of a mk_move/2 (or similar) callback, for the purpose
of range splitting.
Since the substitution needed to case over all the instructions, a new
module, hipe_arm_subst, was introduced to the arm backend.
|
|
These will not only be useful for hipe_regalloc_prepass, but also, after
the introduction of a mk_move/2 (or similar) callback, for the purpose
of range splitting.
Since the substitution needed to case over all the instructions, a new
module, hipe_x86_subst, was introduced to the x86 backend.
Due to differences in the 'jtab' field of a #jmp_switch{} between x86
and amd64, it regrettably needed to be duplicated to hipe_amd64_subst.
|
|
Now that all backends do register allocation on a CFG directly and
define the defun_to_cfg/1 callback as the identity function, it can be
removed.
|
|
As the just_as_good_as assertion was loosened with the `NowRegs >=
CheckRegs` check, it no longer verified that hipe_regalloc_prepass had
not incorrectly labeled a temp as unallocatable. We add that behaviour
back.
|
|
|
|
This is due to the improvements in hipe_temp_map, removing the need for
duplicated logic in the backends.
|
|
|
|
|
|
|
|
|
|
|
|
hipe_regalloc_prepass speeds up register allocation by spilling any temp
that is live over a call (which clobbers all registers).
To detect these, a new function was added to the target
interface: defines_all_alloc/1, which takes an instruction and returns a
boolean.
|
|
* sverker/hipe-performance-o1/PR-1154:
hipe_sparc: Minimise CFG<->linear conversions
hipe_ppc: Minimise CFG<->linear conversions
hipe_arm: Minimise CFG<->linear conversions
hipe_x86: Use lea instead of move+add
hipe_arm: Improve peephole optimiser
hipe_arm: Be resilient to crappy RTL
hipe_ppc: Be resilient to crappy RTL
hipe_sparc: Be resilient to crappy RTL
hipe: Reuse liveness info for spillmin
hipe_x86: Minimise CFG<->linear conversions
hipe: Fix o0 and o1
hipe: Add o0 and o1 to tests
hipe_rtl_binary:get_word_integer/4: Handle imms
hipe_x86: Be resilient to crappy RTL
hipe_x86: LSRA for SSE2
|
|
|
|
* sverker/hipe-sparc-19/PR-1148:
Eliminate catch-all clause from two functions
Increase the time limit used by the test suite
|
|
* maint:
dialyzer: Increase time limit of suites
dialyzer: Remove a check that always fails
dialyzer: Optimize an opaque type case
|
|
Fix a mistake in commit 85f6fe3b.
Instead of using the declared opaque type, the form's type is used in
a case where the opaque type is turned into a non-opaque type. The
result is more general types (smaller Erlang terms) and faster
analyses.
|
|
Now, there will only ever be a single Linear->CFG conversion, just after
lowering from RTL, and only ever a single CFG->Linear conversion, just
before the finalise pass. Both of these now happen in hipe_sparc_main.
|
|
Now, there will only ever be a single Linear->CFG conversion, just after
lowering from RTL, and only ever a single CFG->Linear conversion, just
before the finalise pass. Both of these now happen in hipe_ppc_main.
|
|
Now, there will only ever be a single Linear->CFG conversion, just after
lowering from RTL, and only ever a single CFG->Linear conversion, just
before the finalise pass. Both of these now happen in hipe_arm_main.
|
|
This is primarily useful for heap allocations, as a two-address 'add'
can't be used to both copy the heap pointer to another register, and add
the tag.
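
A minimal Python sketch of such a rewrite, under stated assumptions: the tuple encoding, the `fuse_move_add` name and the register names are hypothetical, and the sketch ignores that 'add' sets condition flags while 'lea' does not, so it is only valid where the flags are dead.

```python
def fuse_move_add(instrs):
    """Fuse `mov dst, src; add dst, imm` into `lea dst, [src+imm]`.

    Instructions are hypothetical tuples: ("mov", dst, src),
    ("add", dst, imm) and ("lea", dst, (base, offset)).
    """
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if (nxt is not None
                and cur[0] == "mov" and nxt[0] == "add"
                and cur[1] == nxt[1]            # same destination
                and isinstance(nxt[2], int)):   # immediate addend
            out.append(("lea", cur[1], (cur[2], nxt[2])))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


# Copy a heap pointer and add a tag (the value 2 is an arbitrary example).
code = [("mov", "eax", "ebp"), ("add", "eax", 2), ("mov", "ecx", "eax")]
fused = fuse_move_add(code)
```

The fused sequence computes the tagged pointer in one instruction and leaves the source register untouched.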
|
|
|
|
The ARM backend crashed if certain RTL optimisations were omitted,
preventing it from being usable at lower optimisation levels.
One of the problems was caused by shift-by-immediate-zero, which wraps
to immediate-32 with some shiftops. TODO: Some place should be modified
to crash when these are generated, so that debugging further instances of
this gets easier in the future.
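
To make the wrap-around concrete, here is a small Python sketch, not part of HiPE. It assumes the classic ARM immediate-shift encoding, in which a 5-bit shift field of zero means a shift by 32 for LSR and ASR. The function names are hypothetical, and the second function only illustrates the kind of check the TODO asks for.

```python
WRAPPING_SHIFTOPS = ("lsr", "asr")  # assumption: ops where imm5 == 0 means 32


def effective_shift(op, imm5):
    """Shift amount actually performed for a 5-bit immediate field."""
    if not 0 <= imm5 <= 31:
        raise ValueError("imm5 out of range: %r" % (imm5,))
    if imm5 == 0 and op in WRAPPING_SHIFTOPS:
        return 32
    return imm5


def check_shift(op, amount):
    """Reject a shift-by-zero that would silently become a shift by 32."""
    if amount == 0 and op in WRAPPING_SHIFTOPS:
        raise ValueError("%s by immediate 0 would encode as %s #32" % (op, op))
    return op, amount
```

With such a check in the instruction constructor, a zero shift would fail at the point where it is generated rather than producing wrong code later.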
|
|
The PowerPC backend crashed if certain RTL optimisations were omitted,
preventing it from being usable at lower optimisation levels.
|
|
The SPARC backend crashed if certain RTL optimisations were omitted,
preventing it from being usable at lower optimisation levels.
|