author    Magnus Lång <[email protected]>    2017-03-16 16:39:26 +0100
committer Magnus Lång <[email protected]>    2017-03-16 20:49:42 +0100
commit    d1d26f4bf9da3cc5eab4e918df771d67fe9e6bb5 (patch)
tree      f1d261bb46fff0908a071de80646859e19cb5809 /lib/hipe/main/hipe.erl
parent    cf047293ecf6ea108a1e5a412743bfb5fe66e26f (diff)
hipe: Add range splitter range_split
hipe_range_split is a complex live range splitter, more sophisticated than hipe_restore_reuse, but still targeted specifically at temporaries forced onto the stack by being live over call instructions. hipe_range_split partitions the control flow graph at call instructions, like hipe_regalloc_prepass does. Splitting decisions are made on a per-partition and per-temporary basis. There are three different ways in which hipe_range_split may choose to split a temporary in a program partition:

* mode1: Spill the temp before calls, and restore it after them
* mode2: Spill the temp after definitions, restore it after calls
* mode3: Spill the temp after definitions, restore it before uses

To pick which of these should be used for each temp × partition pair, hipe_range_split uses a cost function. The cost is simply the sum of the costs of all expected stack accesses, and the cost of an individual stack access is based on the probability weight of the basic block it resides in. This biases the range splitter to attempt to move stack accesses from a function's hot path to its cold path. hipe_bb_weights is used to compute the probability weights.

mode3 is effectively the same as what hipe_restore_reuse does. Because of this, hipe_range_split reuses the analysis pass of hipe_restore_reuse in order to compute the minimal needed set of spills and restores. The reason mode3 was introduced into hipe_range_split, rather than simply composing hipe_range_split with hipe_restore_reuse (by running both), is that such a composition produced poor register allocation results due to insufficiently strong move coalescing in the register allocator.

The cost function heuristic has a couple of tuning knobs:

* {range_split_min_gain, Gain} (default: 1.1, range: [0.0, inf))
  The minimum proportional improvement that the cost of all stack
  accesses to a temp must display in order for that temp to be split.
* {range_split_mode1_fudge, Factor} (default: 1.1, range: [0.0, inf))
  Costs for mode1 are multiplied by this factor in order to discourage
  it when it provides only marginal benefits. The justification is that
  mode1 keeps temps live the longest, thus leading to higher register
  pressure.
* {range_split_weight_power, Factor} (default: 2, range: (0.0, inf))
  Adjusts how much effect the basic block weights have on the cost of a
  stack access. A stack access in a block with weight 1.0 has cost 1.0,
  while a stack access in a block with weight 0.01 has cost 1/Factor.

Additionally, the option range_split_weights chooses whether the basic block weights are used at all. (A sketch of a cost function consistent with this description follows below.)

In case the input is very big, hipe_range_split automatically falls back to hipe_restore_reuse only, in order to keep compile times under control. Note that this is not only because hipe_range_split itself is slow: the resulting program is also slower to register allocate, and is not as partitionable by hipe_regalloc_prepass. hipe_restore_reuse, on the other hand, does not affect the program's partitionability.

The hipe_range_split pass is controlled by a new option, ra_range_split. ra_range_split is added to o2, and ra_restore_reuse is disabled in o2.
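For illustration, here is a minimal sketch of a cost comparison consistent with the description above. The module and function names are hypothetical, not the actual hipe_range_split internals; the exponent log(Factor)/log(100) is simply the curve through the two data points stated above (weight 1.0 => cost 1.0, weight 0.01 => cost 1/Factor):

  %% Hypothetical sketch, not actual hipe_range_split code.
  -module(range_split_cost_sketch).
  -export([should_split/3]).

  %% Cost of one stack access in a block with the given probability
  %% weight; weight 1.0 costs 1.0 and weight 0.01 costs 1/Factor.
  access_cost(Weight, Factor) ->
    math:pow(Weight, math:log(Factor) / math:log(100)).

  %% Total cost of a placement: the sum over the weights of the blocks
  %% in which it would perform stack accesses.
  total_cost(BlockWeights, Factor) ->
    lists:sum([access_cost(W, Factor) || W <- BlockWeights]).

  %% Split only if the unsplit placement costs at least MinGain times
  %% the split placement (cf. range_split_min_gain). A mode1 placement
  %% would additionally have its cost multiplied by
  %% range_split_mode1_fudge before this comparison.
  should_split(UnsplitWeights, SplitWeights, Opts) ->
    Factor  = proplists:get_value(range_split_weight_power, Opts, 2),
    MinGain = proplists:get_value(range_split_min_gain, Opts, 1.1),
    total_cost(UnsplitWeights, Factor)
      >= MinGain * total_cost(SplitWeights, Factor).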
Diffstat (limited to 'lib/hipe/main/hipe.erl')
-rw-r--r--  lib/hipe/main/hipe.erl | 17
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/lib/hipe/main/hipe.erl b/lib/hipe/main/hipe.erl
index f3e7c0879e..19b4e8bfe2 100644
--- a/lib/hipe/main/hipe.erl
+++ b/lib/hipe/main/hipe.erl
@@ -1230,6 +1230,13 @@ option_text(regalloc) ->
" optimistic - another variant of a coalescing allocator";
option_text(remove_comments) ->
"Strip comments from intermediate code";
+option_text(ra_range_split) ->
+ "Split live ranges of temporaries live over call instructions\n"
+ "before performing register allocation.\n"
+ "Heuristically tries to move stack accesses to the cold path of function.\n"
+ "This range splitter is more sophisticated than 'ra_restore_reuse', but has\n"
+ "a significantly larger impact on compile time.\n"
+ "Should only be used with move coalescing register allocators.";
option_text(ra_restore_reuse) ->
"Split live ranges of temporaries such that straight-line\n"
"code will not need to contain multiple restores from the same stack\n"
@@ -1376,7 +1383,12 @@ opt_keys() ->
pp_rtl_linear,
ra_partitioned,
ra_prespill,
+ ra_range_split,
ra_restore_reuse,
+ range_split_min_gain,
+ range_split_mode1_fudge,
+ range_split_weight_power,
+ range_split_weights,
regalloc,
remove_comments,
rtl_ssa,
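The two {Key, Value} tuning knobs registered here are passed to the compiler together with the boolean options. For example (the module name and values are illustrative only):

  hipe:c(my_module, [o2, {range_split_min_gain, 1.5},
                         {range_split_weight_power, 3}]).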
@@ -1436,7 +1448,8 @@ o1_opts(TargetArch) ->
o2_opts(TargetArch) ->
Common = [icode_type, icode_call_elim, % icode_ssa_struct_reuse,
- rtl_lcm | (o1_opts(TargetArch) -- [rtl_ssapre])],
+ ra_range_split, range_split_weights, % XXX: Having defaults here is ugly
+ rtl_lcm | (o1_opts(TargetArch) -- [rtl_ssapre, ra_restore_reuse])],
case TargetArch of
T when T =:= amd64 orelse T =:= ppc64 -> % 64-bit targets
[icode_range | Common];
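With ra_range_split added to o2 and ra_restore_reuse removed from it, a plain hipe:c(Mod, [o2]) now uses the new splitter. The previous behaviour stays reachable through the negations added in the next hunk, e.g. (illustrative):

  hipe:c(my_module, [o2, no_ra_range_split, ra_restore_reuse]).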
@@ -1484,7 +1497,9 @@ opt_negations() ->
{no_pp_rtl_ssapre, pp_rtl_ssapre},
{no_ra_partitioned, ra_partitioned},
{no_ra_prespill, ra_prespill},
+ {no_ra_range_split, ra_range_split},
{no_ra_restore_reuse, ra_restore_reuse},
+ {no_range_split_weights, range_split_weights},
{no_remove_comments, remove_comments},
{no_rtl_ssa, rtl_ssa},
{no_rtl_ssa_const_prop, rtl_ssa_const_prop},
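Likewise, the new no_range_split_weights negation keeps the splitter but disables the hipe_bb_weights-based weighting, so every stack access costs the same in the heuristic, e.g. (illustrative):

  hipe:c(my_module, [o2, no_range_split_weights]).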