aboutsummaryrefslogtreecommitdiffstats
path: root/erts/emulator/Makefile.in
diff options
context:
space:
mode:
authorBjörn Gustavsson <[email protected]>2019-03-07 10:00:39 +0100
committerBjörn Gustavsson <[email protected]>2019-03-09 09:31:08 +0100
commit2d2e78ad6e667655560a67e848153dbb218914f7 (patch)
treef14140ca46e4eeaacf2895d59a12772f6f1dac33 /erts/emulator/Makefile.in
parent3066a5bf51467d1f8f0b05cf7b7bab0ef6a17578 (diff)
downloadotp-2d2e78ad6e667655560a67e848153dbb218914f7.tar.gz
otp-2d2e78ad6e667655560a67e848153dbb218914f7.tar.bz2
otp-2d2e78ad6e667655560a67e848153dbb218914f7.zip
Optimize tail-recursive calls of BIFs
BEAM currently does not call BIFs at the end of a function in a tail-recursive way. That is, when calling a BIF at the end of a function, the BIF is first called, and then the stack frame is deallocated, and then control is transferred to the caller. If there is no stack frame when a BIF is called in the tail position, the loader will emit a sequence of three instructions: first an instruction that allocates a stack frame and saves the continuation pointer (`allocate`), then an instruction that calls the BIF (`call_bif`), and lastly an instruction that deallocates the stack frame and returns to the caller (`deallocate_return`). The old compiler would essentially allocate a stack frame for each clause in a function, so it would not be that common that a BIF was called in the tail position when there was no stack frame, so the three-instruction sequence was deemed acceptable. The new compiler only allocates stack frames when truly needed, so the three-instruction BIF call sequence has become much more common. This commit introduces a new `call_bif_only` instruction so that only one instruction will be needed when calling a BIF in the tail position when there is no stack frame. This instruction is also used when there is a stack frame to make it possible to deallocate the stack frame **before** calling the BIF, which may make a subsequent garbage collection at the end of the BIF call cheaper (copying less garbage). The one downside of this change is that the function that called the BIF will not be included in the stack backtrace (similar to how a tail-recursive call to an Erlang function will not be included in the backtrace). That was the quick summary of the commit. Here comes a detailed look at how BIF calls are translated by the loader. The first example is a function that calls `setelement/3` in the tail position: update_no_stackframe(X) -> setelement(5, X, new_value). Here is the BEAM code: {function, update_no_stackframe, 1, 12}. {label,11}. {line,[...]}. {func_info,{atom,t},{atom,update_no_stackframe},1}. {label,12}. {move,{x,0},{x,1}}. {move,{atom,new_value},{x,2}}. {move,{integer,5},{x,0}}. {line,[...]}. {call_ext_only,3,{extfunc,erlang,setelement,3}}. Because there is no stack frame, the `call_ext_only` instruction will be used to call `setelement/3`: {call_ext_only,3,{extfunc,erlang,setelement,3}}. The loader will transform this instruction to a three-instruction sequence: 0000000020BD8130: allocate_tt 0 3 0000000020BD8138: call_bif_e erlang:setelement/3 0000000020BD8148: deallocate_return_Q 0 Using the `call_bif_only` instruction introduced in this commit, only one instruction is needed: 000000005DC377F0: call_bif_only_e erlang:setelement/3 `call_bif_only` calls the BIF and returns to the caller. Now let's look at a function that already has a stack frame when `setelement/3` is called: update_with_stackframe(X) -> foobar(X), setelement(5, X, new_value). Here is the BEAM code: {function, update_with_stackframe, 1, 14}. {label,13}. {line,[...]}. {func_info,{atom,t},{atom,update_with_stackframe},1}. {label,14}. {allocate,1,1}. {move,{x,0},{y,0}}. {line,[...]}. {call,1,{f,16}}. {move,{y,0},{x,1}}. {move,{atom,new_value},{x,2}}. {move,{integer,5},{x,0}}. {line,[...]}. {call_ext_last,3,{extfunc,erlang,setelement,3},1}. Since there is a stack frame, the `call_ext_last` instruction will be used to deallocate the stack frame and call the function: {call_ext_last,3,{extfunc,erlang,setelement,3},1}. Before this commit, the loader would translate this instruction to: 0000000020BD81B8: call_bif_e erlang:setelement/3 0000000020BD81C8: deallocate_return_Q 1 That is, the BIF is called before deallocating the stack frame and returning to the calling function. After this commit, the loader will translate the `call_ext_last` like this: 000000005DC37868: deallocate_Q 1 000000005DC37870: call_bif_only_e erlang:setelement/3 There are still two instructions, but now the stack frame will be deallocated before calling the BIF, which could make the potential garbage collection after the BIF call slightly more efficient (copying less garbage). We could have introduced a `call_bif_last` instruction, but the code for calling a BIF is relatively large and there does not seem be a practical way to share the code between `call_bif` and `call_bif_only` (since the difference is at the end, after the BIF call). Therefore, we did not want to clone the BIF calling code yet another time to make a `call_bif_last` instruction.
Diffstat (limited to 'erts/emulator/Makefile.in')
0 files changed, 0 insertions, 0 deletions