path: root/erts/emulator
Age | Commit message | Author
2019-03-11 | erts: Refactor common things into traverse_context_t | Sverker Eriksson
and rename it from match_callbacks_t.
2019-03-09 | Optimize tail-recursive calls of BIFs | Björn Gustavsson
BEAM currently does not call BIFs at the end of a function in a tail-recursive way. That is, when calling a BIF at the end of a function, the BIF is first called, and then the stack frame is deallocated, and then control is transferred to the caller.

If there is no stack frame when a BIF is called in the tail position, the loader will emit a sequence of three instructions: first an instruction that allocates a stack frame and saves the continuation pointer (`allocate`), then an instruction that calls the BIF (`call_bif`), and lastly an instruction that deallocates the stack frame and returns to the caller (`deallocate_return`).

The old compiler would essentially allocate a stack frame for each clause in a function, so it would not be that common that a BIF was called in the tail position when there was no stack frame, so the three-instruction sequence was deemed acceptable.

The new compiler only allocates stack frames when truly needed, so the three-instruction BIF call sequence has become much more common.

This commit introduces a new `call_bif_only` instruction so that only one instruction will be needed when calling a BIF in the tail position when there is no stack frame. This instruction is also used when there is a stack frame to make it possible to deallocate the stack frame **before** calling the BIF, which may make a subsequent garbage collection at the end of the BIF call cheaper (copying less garbage).

The one downside of this change is that the function that called the BIF will not be included in the stack backtrace (similar to how a tail-recursive call to an Erlang function will not be included in the backtrace).

That was the quick summary of the commit. Here comes a detailed look at how BIF calls are translated by the loader.

The first example is a function that calls `setelement/3` in the tail position:

    update_no_stackframe(X) ->
        setelement(5, X, new_value).

Here is the BEAM code:

    {function, update_no_stackframe, 1, 12}.
      {label,11}.
        {line,[...]}.
        {func_info,{atom,t},{atom,update_no_stackframe},1}.
      {label,12}.
        {move,{x,0},{x,1}}.
        {move,{atom,new_value},{x,2}}.
        {move,{integer,5},{x,0}}.
        {line,[...]}.
        {call_ext_only,3,{extfunc,erlang,setelement,3}}.

Because there is no stack frame, the `call_ext_only` instruction will be used to call `setelement/3`:

    {call_ext_only,3,{extfunc,erlang,setelement,3}}.

The loader will transform this instruction to a three-instruction sequence:

    0000000020BD8130: allocate_tt 0 3
    0000000020BD8138: call_bif_e erlang:setelement/3
    0000000020BD8148: deallocate_return_Q 0

Using the `call_bif_only` instruction introduced in this commit, only one instruction is needed:

    000000005DC377F0: call_bif_only_e erlang:setelement/3

`call_bif_only` calls the BIF and returns to the caller.

Now let's look at a function that already has a stack frame when `setelement/3` is called:

    update_with_stackframe(X) ->
        foobar(X),
        setelement(5, X, new_value).

Here is the BEAM code:

    {function, update_with_stackframe, 1, 14}.
      {label,13}.
        {line,[...]}.
        {func_info,{atom,t},{atom,update_with_stackframe},1}.
      {label,14}.
        {allocate,1,1}.
        {move,{x,0},{y,0}}.
        {line,[...]}.
        {call,1,{f,16}}.
        {move,{y,0},{x,1}}.
        {move,{atom,new_value},{x,2}}.
        {move,{integer,5},{x,0}}.
        {line,[...]}.
        {call_ext_last,3,{extfunc,erlang,setelement,3},1}.

Since there is a stack frame, the `call_ext_last` instruction will be used to deallocate the stack frame and call the function:

    {call_ext_last,3,{extfunc,erlang,setelement,3},1}.

Before this commit, the loader would translate this instruction to:

    0000000020BD81B8: call_bif_e erlang:setelement/3
    0000000020BD81C8: deallocate_return_Q 1

That is, the BIF is called before deallocating the stack frame and returning to the calling function.

After this commit, the loader will translate the `call_ext_last` like this:

    000000005DC37868: deallocate_Q 1
    000000005DC37870: call_bif_only_e erlang:setelement/3

There are still two instructions, but now the stack frame will be deallocated before calling the BIF, which could make the potential garbage collection after the BIF call slightly more efficient (copying less garbage).

We could have introduced a `call_bif_last` instruction, but the code for calling a BIF is relatively large and there does not seem to be a practical way to share the code between `call_bif` and `call_bif_only` (since the difference is at the end, after the BIF call). Therefore, we did not want to clone the BIF calling code yet another time to make a `call_bif_last` instruction.
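The effect of the reordering can be illustrated with a toy model in C. This is not BEAM code: the `toy_*` and `tail_bif_*` names are hypothetical, and the only thing modelled is how many stack words are still live at the moment the BIF runs (and would therefore be visible to a garbage collection triggered by it). The sketch assumes that a frame with n Y registers occupies n + 1 words, the extra word being the continuation pointer.

```c
#include <stddef.h>

/* Toy process: only tracks how many stack words are live. */
typedef struct {
    size_t stack_words;        /* words currently on the stack */
    size_t words_seen_by_bif;  /* stack size observed when the BIF ran */
} toy_proc_t;

/* Assumption: a frame with n Y registers occupies n + 1 words
 * (the extra word holds the continuation pointer). */
void toy_allocate(toy_proc_t *p, size_t n)   { p->stack_words += n + 1; }
void toy_deallocate(toy_proc_t *p, size_t n) { p->stack_words -= n + 1; }

/* Stand-in for running a BIF: record the stack a GC would have to scan. */
void toy_call_bif(toy_proc_t *p) { p->words_seen_by_bif = p->stack_words; }

/* Before the commit: call_bif, then deallocate_return. */
void tail_bif_old(toy_proc_t *p, size_t frame)
{
    toy_call_bif(p);
    toy_deallocate(p, frame);
}

/* After the commit: deallocate, then call_bif_only. */
void tail_bif_new(toy_proc_t *p, size_t frame)
{
    toy_deallocate(p, frame);
    toy_call_bif(p);
}
```

In this model both orderings leave the stack in the same final state; the difference is only that the old ordering runs the BIF while the caller's frame is still live.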
2019-03-09 | Merge pull request #2176 from josevalim/jv-beam-load-message | Björn Gustavsson
Clarify beam_load error message on file/module mismatch
2019-03-08 | Merge pull request #2175 from jhogberg/john/erts/enif_term_type/OTP-15640 | John Högberg
erts: Add enif_term_type
2019-03-08 | Clarify beam_load error message on file/module mismatch | José Valim
This is particularly important on case-insensitive filesystems, where attempting to invoke a module with the wrong case leads to confusing error messages:

    1> erlpress_core:foo().
    beam/beam_load.c(1428): Error loading module 'erlpress_core':
      module name in object code is erlPress_core
    Loading of erlPress_core.beam failed: :badfile

This commit replaces "object code" with "BEAM file" and improves the readability of the message.
2019-03-08 | Merge branch 'sverker/master/ets-no-mbuf-trapping/OTP-15660' | Sverker Eriksson
* sverker/master/ets-no-mbuf-trapping/OTP-15660:
  erts: Remove ets traversal yielding if heap fragment
2019-03-08 | Merge pull request #2174 from bjorng/bjorn/tune-beam-2 | Björn Gustavsson
Tune BEAM instructions for the new compiler (part 2)
2019-03-07 | Merge branch 'sverker/maint/ets-no-mbuf-trapping/OTP-15660' | Sverker Eriksson
into sverker/master/ets-no-mbuf-trapping/OTP-15660
2019-03-07 | Merge branch 'sverker/ets-no-mbuf-trapping/OTP-15660' | Sverker Eriksson
into sverker/maint/ets-no-mbuf-trapping/OTP-15660
2019-03-07 | erts: Remove ets traversal yielding if heap fragment | Sverker Eriksson
Many heap fragments no longer make the GC slow. Even worse, we are not guaranteed that a yield will provoke a GC removing the fragments, which might lead to a one-yield-per-bucket scenario if the heap fragment(s) still remain after each yield.
2019-03-07 | Merge branch 'rickard/make-fixes-21/OTP-15657' into maint | Rickard Green
* rickard/make-fixes-21/OTP-15657:
  Remove own configured RM make variable
2019-03-07 | Merge branch 'rickard/make-fixes-22/OTP-15657' | Rickard Green
* rickard/make-fixes-22/OTP-15657:
  Remove own configured RM make variable
2019-03-07 | Slightly optimize binary construction | Björn Gustavsson
Use S operands instead of s operands for a slight speed increase and reduction in code size of process_main(). Use micro instructions for frequently executed instructions. While at it, use safe multiplication in gen_get_integer() in beam_load.c.
2019-03-07 | erts: Add enif_term_type | John Högberg
This helps avoid long sequences of enif_is_xxx in code that serializes terms (such as JSON encoders) by letting the user switch on the type.
2019-03-06 | Merge branch 'rickard/send-bump-reds/ERL-773/OTP-15513' | Rickard Green
* rickard/send-bump-reds/ERL-773/OTP-15513:
  Fix faulty assertion
  Bump reductions on send based on message size
2019-03-06 | Merge 'rickard/make-fixes-21/OTP-15657' into 'rickard/make-fixes-22/OTP-15657' | Rickard Green
* rickard/make-fixes-21/OTP-15657:
  Remove own configured RM make variable
2019-03-06 | Merge 'rickard/make-fixes-20/OTP-15657' into 'rickard/make-fixes-21/OTP-15657' | Rickard Green
* rickard/make-fixes-20/OTP-15657:
  Remove own configured RM make variable
2019-03-06 | Merge 'rickard/make-fixes-19/OTP-15657' into 'rickard/make-fixes-20/OTP-15657' | Rickard Green
* rickard/make-fixes-19/OTP-15657:
  Remove own configured RM make variable
2019-03-06 | Merge 'rickard/make-fixes-18/OTP-15657' into 'rickard/make-fixes-19/OTP-15657' | Rickard Green
* rickard/make-fixes-18/OTP-15657:
  Remove own configured RM make variable
2019-03-06 | Merge 'rickard/make-fixes-17/OTP-15657' into 'rickard/make-fixes-18/OTP-15657' | Rickard Green
* rickard/make-fixes-17/OTP-15657:
  Remove own configured RM make variable
2019-03-06 | Remove own configured RM make variable | Rickard Green
Instead rely on GNU make's pre-defined RM variable, which should equal 'rm -f'.
2019-03-06 | [socket] More if-def to make it "work" on windows | Micael Karlberg
2019-03-06 | Merge branch 'maint' | Rickard Green
* maint:
  kernel runtime dependency to erts
  erts: Add yield via timeout to inet read_packet
  erts: Don't increase buffer when sctp sndbuf is set
  erts: Only change inet buffer if not set
2019-03-06 | Merge branch 'lukas/erts/fix_inet_buffer_auto_adjust/OTP-15651/OTP-15652' into maint | Rickard Green
* lukas/erts/fix_inet_buffer_auto_adjust/OTP-15651/OTP-15652:
  kernel runtime dependency to erts
  erts: Add yield via timeout to inet read_packet
  erts: Don't increase buffer when sctp sndbuf is set
  erts: Only change inet buffer if not set
2019-03-06 | Slightly optimize is_eq and is_ne | Björn Gustavsson
2019-03-06 | Optimize the '*' operator when multiplying two small integers | Björn Gustavsson
2019-03-06 | Optimize multiplication in binary matching instructions | Björn Gustavsson
2019-03-06 | sys.h: Check for overflow checking arithmetic builtins | Björn Gustavsson
Let sys.h define HAVE_OVERFLOW_CHECK_BUILTINS if the compiler supports __builtin_mul_overflow() and the other overflow checking builtins. The test is intentionally made in sys.h and not as a configure test. On Windows, beam_emu.c is always compiled using gcc, but the other files are usually compiled with Microsoft's C compiler. With the test in the header file, HAVE_OVERFLOW_CHECK_BUILTINS will be defined when compiling beam_emu.c.
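A minimal sketch of what a header-level test of this kind can look like, and how code might use the resulting macro. The detection logic below is an assumption for illustration and is not copied from sys.h; `safe_mul` is a hypothetical helper name. When the builtin is unavailable (for example under a compiler without these GCC-style builtins), the sketch falls back to a division-based check.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumed detection logic (not the actual sys.h test): enable the
 * builtins when the compiler advertises them. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_mul_overflow)
#    define HAVE_OVERFLOW_CHECK_BUILTINS 1
#  endif
#elif defined(__GNUC__) && __GNUC__ >= 5
#  define HAVE_OVERFLOW_CHECK_BUILTINS 1
#endif

/* Multiply a and b. Returns true and stores the product in *res if it
 * fits in a long; returns false on overflow (*res is then unspecified). */
bool safe_mul(long a, long b, long *res)
{
#ifdef HAVE_OVERFLOW_CHECK_BUILTINS
    return !__builtin_mul_overflow(a, b, res);
#else
    /* Portable fallback: check with division before multiplying. */
    if (a > 0) {
        if (b > 0) { if (a > LONG_MAX / b) return false; }
        else       { if (b < LONG_MIN / a) return false; }
    } else {
        if (b > 0) { if (a < LONG_MIN / b) return false; }
        else       { if (a != 0 && b < LONG_MAX / a) return false; }
    }
    *res = a * b;
    return true;
#endif
}
```

Because the test lives in the header, each translation unit decides for itself based on the compiler that is actually compiling it, which is the property the commit message relies on for beam_emu.c.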
2019-03-06 | Eliminate unused i_bs_skip_bits_all2 instruction | Björn Gustavsson
Starting in OTP 19 (in commit 9504c0dd71d0), the compiler emits a test_unit instruction instead of a skip instruction at the end of a binary. We can do the same replacement in the loader to get rid of the i_bs_skip_bits_all2 instruction.
2019-03-06 | Optimize field size calculation on a 64-bit architecture | Björn Gustavsson
On a 64-bit architecture, the size of any binary that would fit in the memory must fit in a small, so we can fail immediately if the size term is not a small.
2019-03-06 | Reduce code size for binary matching instructions | Björn Gustavsson
The new compiler required adding support for Y registers for all binary matching instructions. That was (intentionally) done in a naive way that simply duplicated the entire body of each instruction. Now it's time to be less naive. Rewrite the binary matching instructions using micro instructions. Because some of the binary instructions are huge, that will significantly decrease the size of process_main(). When compiling with clang, a huge process_main() would mess up profile-guided optimization, resulting in a significant performance degradation. On my Mac, profile-guided optimization would decrease the estone benchmark by 100K estones (about 20 percent). This commit gives me back the lost estones.
2019-03-06 | Deoptimize obsoleted binary matching instructions | Björn Gustavsson
Mark the obsoleted instructions bs_start_match2, bs_save2, bs_restore2, and bs_context_to_binary as cold. Remove support of a Y operand for bs_save2 and bs_restore2.
2019-03-06 | Reclassify get_tuple_element with a Y destination as hot | Björn Gustavsson
get_tuple_element with a Y register has become more frequent with the new compiler.
2019-03-06 | Remove optimization that has become a pessimization | Björn Gustavsson
The compiler used to generate "move Literal y(Y)" instructions very rarely. Therefore, there was a transformation to avoid having a "move c y" instruction. With the new compiler, "move Literal y(Y)" instructions are relatively frequent, so we will need a "move c y" instruction.
2019-03-06 | Introduce move_window2 and remove move2_par_xyxy | Björn Gustavsson
2019-03-06 | Optimize hd/1 and tl/1 in guards | Björn Gustavsson
2019-03-06 | bif_instrs.tab: Don't hardcode length of instructions | Björn Gustavsson
2019-03-06 | [socket] Use (new) enif_compare_pids function | Micael Karlberg
Replace the own function for pid compare with the (new) enif_compare_pids function. OTP-15565
2019-03-06 | [socket] Commentary | Micael Karlberg
Add some more comments in order to increase readability. OTP-15565
2019-03-06 | [socket] Macro abuse of activate-next | Micael Karlberg
Implemented the activate_next function and added its "users" acceptor, writer and reader (macro abuse). After a request (accept, write or send) has either completed successfully or failed, another request should be activated. Previously only one attempt was made, which might leave the other (waiting) requestors hanging. Now we instead use an 'activate-next' function that pops the request (accept, write or read) queue until success or until it is empty, thereby making sure that no waiting process is left hanging. OTP-15565
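The pop-until-success-or-empty behaviour described above can be sketched as follows. This is an illustration under assumptions, not the socket NIF code: the types `request_t` and `request_queue_t`, the `can_activate` flag (standing in for whether activating a waiting requestor succeeds), and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiting request; can_activate stands in for whether
 * activating this requestor would succeed. */
typedef struct request {
    int id;
    bool can_activate;
    struct request *next;
} request_t;

/* Simple FIFO of waiting requests. */
typedef struct {
    request_t *first;
    request_t *last;
} request_queue_t;

/* Append a request at the tail of the queue. */
void queue_push(request_queue_t *q, request_t *r)
{
    r->next = NULL;
    if (q->last != NULL)
        q->last->next = r;
    else
        q->first = r;
    q->last = r;
}

/* Remove and return the head of the queue, or NULL if it is empty. */
request_t *queue_pop(request_queue_t *q)
{
    request_t *r = q->first;
    if (r != NULL) {
        q->first = r->next;
        if (q->first == NULL)
            q->last = NULL;
        r->next = NULL;
    }
    return r;
}

/* Keep popping until one request is activated or the queue is empty.
 * Returns the activated request, or NULL if none could be activated. */
request_t *activate_next(request_queue_t *q)
{
    request_t *r;
    while ((r = queue_pop(q)) != NULL) {
        if (r->can_activate)
            return r;
        /* Activation failed: drop this request and try the next waiter. */
    }
    return NULL;
}
```

A single-attempt version would stop after the first failed activation and leave the remaining waiters queued with nothing to wake them, which is the hang the commit message describes.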
2019-03-06 | beam_emu.c: Rename the confusing macro GetR() to GetSource() | Björn Gustavsson
2019-03-05 | [socket] Make use of official monitor to term function | Micael Karlberg
Remove own function to make monitor printable (was a hack) and make use of the new enif_make_monitor_term instead.
2019-03-04 | Merge branch 'bmk/20190304/openindiana_types' into bmk/20190301/cleanup_through_macro_abuse/OTP-15565 | Micael Karlberg
2019-03-04 | [socket] Fixed various type size and unused function warnings | Micael Karlberg
Fixed some type size warnings (SCTP related). E.g.: On Solaris 11 (OpenIndiana Hipster) long and int are both size 4, but the way the Sint32 definition works, it first "tests" for long size and if that is correct, that is chosen (which it is on Solaris 11). On Linux long is size 8, so Sint32 will be defined as int. ... On Solaris 11 the flags TCP_CONGESTION and SO_BINDTODEVICE do not exist, so the function(s) n[set|get]opt_str_opt are never used. So, in order to keep the compiler quiet, we add some if-def to exclude these functions in this case.
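The selection order described above can be illustrated with a small sketch. It is an assumption-laden stand-in, not the actual Sint32 definition: it uses `<limits.h>` instead of whatever size macros the real headers test, and the names `sint32_sketch_t`, `SINT32_SKETCH_IS_LONG` and `sint32_sketch_is_long` are hypothetical. The point is only that testing long first yields a long-based type wherever long is 32 bits wide, and an int-based type elsewhere, so the same source line can be type-clean on one platform and draw a warning on another.

```c
#include <limits.h>

/* Hypothetical selection order: long is tested before int. */
#if LONG_MAX == 2147483647L
typedef long sint32_sketch_t;
#define SINT32_SKETCH_IS_LONG 1
#elif INT_MAX == 2147483647
typedef int sint32_sketch_t;
#define SINT32_SKETCH_IS_LONG 0
#else
#error "no 32-bit signed integer type found"
#endif

/* Returns 1 when the 32-bit type is an alias for long, 0 when it is int. */
int sint32_sketch_is_long(void)
{
    return SINT32_SKETCH_IS_LONG;
}
```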
2019-03-04 | [socket] Some minor cleanup and comments | Micael Karlberg
2019-03-04 | [socket] Macro abuse for requestor queue func def | Micael Karlberg
The requestor (acceptor, writer and reader) functions are virtually identical, so to ensure quality and not having to write the exact same functions three times, we make use of some macro magic for their declaration. OTP-15565
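The general technique, generating several near-identical functions from one macro body with `##` token pasting, can be sketched like this. The types, the capacity constant, and the macro and function names below are hypothetical and chosen only for illustration; they are not taken from the socket NIF sources.

```c
#include <stddef.h>

#define QUEUE_CAP 8  /* hypothetical fixed capacity for the sketch */

/* Hypothetical queue of waiting pids (plain ints here). */
typedef struct {
    int items[QUEUE_CAP];
    size_t count;
} queue_t;

/* Hypothetical socket descriptor with one queue per requestor kind. */
typedef struct {
    queue_t acceptors;
    queue_t writers;
    queue_t readers;
} sock_t;

/* The list of requestor kinds: function-name prefix and queue field. */
#define REQUESTOR_PUSH_FUNCS                 \
    REQUESTOR_PUSH_DECL(acceptor, acceptors) \
    REQUESTOR_PUSH_DECL(writer,   writers)   \
    REQUESTOR_PUSH_DECL(reader,   readers)

/* One body, pasted into NAME_push via the ## operator.
 * Returns 1 on success, 0 if the queue is full. */
#define REQUESTOR_PUSH_DECL(NAME, FIELD)      \
    int NAME##_push(sock_t *s, int pid)       \
    {                                         \
        queue_t *q = &s->FIELD;               \
        if (q->count == QUEUE_CAP)            \
            return 0;                         \
        q->items[q->count++] = pid;           \
        return 1;                             \
    }
REQUESTOR_PUSH_FUNCS
#undef REQUESTOR_PUSH_DECL
```

This expands to `acceptor_push`, `writer_push` and `reader_push`, so a fix to the shared body reaches all three at once. The trade-off is that the generated functions do not appear literally in the source, which makes them harder to find with plain text search or to step through in a debugger.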
2019-03-04 | [net] Macro abuse for func def | Micael Karlberg
Some more macro abuse for nif API callback functions. OTP-15565
2019-03-04 | [socket] Macro abuse for more func def | Micael Karlberg
Some more macro abuse for nif API callback functions and the operator (acceptor, writer and reader) queue wrapper functions (search4pid, push, pop and unqueue). OTP-15565
2019-03-04 | [socket] Macro abuse for setopt(otp) and getopt(otp) | Micael Karlberg
2019-03-04 | [socket|net] Macro abuse | Micael Karlberg
Make use of macro concat magic to simplify declarations. OTP-15565