author | Björn Gustavsson <[email protected]> | 2015-07-07 10:45:38 +0200 |
---|---|---|
committer | Björn Gustavsson <[email protected]> | 2015-08-21 15:55:35 +0200 |
commit | c288ab87fd6cafe22ce46be551baa2e815b495b0 (patch) | |
tree | bda0b5f6646ae4b00ffca4df5ba9dc4a8e97f641 /HOWTO | |
parent | 5f431276f1044c673c2e434e003e2f1ffddab341 (diff) | |
Delay get_tuple_element instructions until they are needed
When matching tuples, the pattern matching compiler would generate
code that fetches every element of the tuple that will ultimately
be used *before* testing that, say, the first element is the
correct record tag. For example:
    is_tuple Fail {x,0}
    test_arity Fail {x,0} 3
    get_tuple_element {x,0} 0 {x,1}
    get_tuple_element {x,0} 1 {x,2}
    get_tuple_element {x,0} 2 {x,3}
    is_eq_exact Fail {x,1} some_tag
If {x,2} and {x,3} are not used at label Fail, we can re-arrange the
code like this:
    is_tuple Fail {x,0}
    test_arity Fail {x,0} 3
    get_tuple_element {x,0} 0 {x,1}
    is_eq_exact Fail {x,1} some_tag
    get_tuple_element {x,0} 1 {x,2}
    get_tuple_element {x,0} 2 {x,3}
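The re-arrangement above can be sketched as a small Python model. This is
only an illustration under stated assumptions, not the code of the actual
compiler pass: the tuple encoding of instructions, the function name
delay_get_tuple_elements and the live_at table (the registers assumed to be
used at each failure label) are all hypothetical.

```python
# Hypothetical encoding (not the compiler's own representation):
#   ("get_tuple_element", Src, Index, Dst)
#   ("test", Name, FailLabel, [Args])
# Registers are modelled as ("x", N) tuples.

def x(n):
    return ("x", n)

def delay_get_tuple_elements(instrs, live_at):
    """Sink get_tuple_element instructions below a following test when safe.

    live_at maps a failure label to the set of registers assumed to be
    used at that label.
    """
    out = []
    pending = []  # get_tuple_element instructions not yet emitted
    for ins in instrs:
        if ins[0] == "get_tuple_element":
            pending.append(ins)
            continue
        if ins[0] == "test" and pending:
            _, _, fail, args = ins
            # A fetch must stay above the test if the test reads its
            # destination or the destination is used at the failure label.
            blocked = set(args) | live_at[fail]
            # Only a trailing run of fetches is delayed, so the fetches
            # keep their relative order.
            split = len(pending)
            while split > 0 and pending[split - 1][3] not in blocked:
                split -= 1
            out.extend(pending[:split])
            out.append(ins)
            pending = pending[split:]
            continue
        # Any other instruction: emit the delayed fetches first.
        out.extend(pending)
        pending = []
        out.append(ins)
    out.extend(pending)
    return out

code = [
    ("test", "is_tuple", "Fail", [x(0)]),
    ("test", "test_arity", "Fail", [x(0), 3]),
    ("get_tuple_element", x(0), 0, x(1)),
    ("get_tuple_element", x(0), 1, x(2)),
    ("get_tuple_element", x(0), 2, x(3)),
    ("test", "is_eq_exact", "Fail", [x(1), ("atom", "some_tag")]),
]

# Assumption: only {x,0} is used at label Fail.
rearranged = delay_get_tuple_elements(code, {"Fail": {x(0)}})
```

In this model the fetch of element 0 stays above is_eq_exact because the
test reads {x,1}, while the other two fetches sink below it. If {x,2} were
assumed to be used at Fail, its fetch would stay above the test as well.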
Doing that may be beneficial in two ways.
If the branch is taken, we have eliminated the execution of two
unnecessary instructions.
Even if the branch is never or rarely taken, there is the possibility
for more optimizations following the is_eq_exact instruction.
For example, imagine that the code looks like this:
    get_tuple_element {x,0} 1 {x,2}
    get_tuple_element {x,0} 2 {x,3}
    move {x,2} {y,0}
    move {x,3} {y,1}
Assuming that {x,2} and {x,3} have no further uses in the code
that follows, that can be rewritten to:
    get_tuple_element {x,0} 1 {y,0}
    get_tuple_element {x,0} 2 {y,1}
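The move elimination above can be modelled in the same spirit. The sketch
below is a conservative illustration, not the logic of opt_blocks/1 in
beam_block: the tuple encoding, the helper names and the live_after set
(registers assumed to be used after the block) are hypothetical.

```python
# Hypothetical encoding (not the compiler's own representation):
#   ("get_tuple_element", Src, Index, Dst)
#   ("move", Src, Dst)

def reads(ins):
    # Both modelled instructions read exactly one register: their source.
    return {ins[1]}

def writes(ins):
    return {ins[3]} if ins[0] == "get_tuple_element" else {ins[2]}

def find_def(block, i, src, dst):
    """Index of the get_tuple_element that defines src for the move at i,
    or None if an instruction in between touches src or dst."""
    for j in range(i - 1, -1, -1):
        cand = block[j]
        if cand[0] == "get_tuple_element" and cand[3] == src:
            return j
        if {src, dst} & (reads(cand) | writes(cand)):
            return None
    return None

def fold_moves(block, live_after):
    """Retarget a fetch to the destination of a later move and drop the
    move, provided the fetched register has no further uses."""
    block = list(block)
    i = 0
    while i < len(block):
        ins = block[i]
        if ins[0] == "move":
            src, dst = ins[1], ins[2]
            j = find_def(block, i, src, dst)
            # Conservative: any later read of src blocks the rewrite.
            used_later = src in live_after or any(
                src in reads(b) for b in block[i + 1:])
            if j is not None and not used_later:
                op, tup, idx, _ = block[j]
                block[j] = (op, tup, idx, dst)
                del block[i]
                continue  # re-examine the instruction now at index i
        i += 1
    return block

X0, X2, X3 = ("x", 0), ("x", 2), ("x", 3)
Y0, Y1 = ("y", 0), ("y", 1)

block = [
    ("get_tuple_element", X0, 1, X2),
    ("get_tuple_element", X0, 2, X3),
    ("move", X2, Y0),
    ("move", X3, Y1),
]

# Assumption: only the y registers are used after the block.
folded = fold_moves(block, live_after={Y0, Y1})
```

With these assumptions both moves disappear and each fetch writes directly
to its y register. If {x,2} were in live_after, the first move would be
kept and only the second fetch would be retargeted.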
When should we perform this optimization?
At the very latest, it must be done before opt_blocks/1 in
beam_block which does the elimination of unnecessary moves.
Actually, we want to do the optimization before the blocks have
been established, since moving instructions out of one block
into another is cumbersome.
Therefore, we will do the optimization in a new pass that is
run before beam_block. A new pass will make debugging easier,
and beam_block already has a fair number of sub-passes.
Diffstat (limited to 'HOWTO')
0 files changed, 0 insertions, 0 deletions