Age | Commit message (Collapse) | Author |
|
* rickard/dist_ctrl_get_data/OTP-15617:
Testing of the example gen_tcp_dist module
Add possibility to also get size of data from erlang:dist_ctrl_get_data()
|
|
|
|
Tune BEAM instructions for the new compiler (part 3)
|
|
* max-au/erts/dirty_scheduler_shutdown/PR-2172/OTP-15690:
erts: release dirty runqueue lock before entering endless loop when BEAM is shutting down
|
|
This patch fixes a hang that can occur while BEAM is shutting down. A dirty scheduler can take the dirty run queue lock and keep it once the system is shutting down. Meanwhile, a normal scheduler may decide to schedule a dirty job (for example, a major garbage collection that migrates the process into the dirty CPU queue) and then hang trying to take a lock that will never be released.
To fix the problem, either release the lock before entering the endless wait loop, or reverse the order in which schedulers are stopped. Either fix works on its own, and applying both works even better.
|
|
If a suspend/resume signal pair was sent to a process while it
was executing on a dirty scheduler, the resume counter on the
process got into an inconsistent state. This in turn could cause
the process to remain suspended indefinitely.
|
|
Introduce move_src_window[234] instructions for moving several
consecutively numbered Y registers to discontiguously numbered X
registers. This optimization is effective because the compiler has
sorted the `move` instructions in Y register order.
|
|
into sverker/master/enif_whereis_pid-dirty-dtor
|
|
|
|
to run user NIF code in a more known execution context.
Fixes problems like user calling enif_whereis_pid() in destructor
which may need to release process main lock in order to lock reg_tab.
|
|
|
|
|
|
|
|
move_dup is used very infrequently.
|
|
|
|
|
|
With the new compiler, a move to x(0) before a jump has become
less common. Change the move_jump instruction to take a
destination as well as a source.
|
|
|
|
Also support swap of Y registers.
|
|
It turns out that sequences such as the following are common:
    move x0 Y1
    move Y2 x0
|
|
It is relatively common to move something from a Y register to
an X register before trimming.
|
|
Apart from the refactoring, the instruction "put_list x c y" is replaced
with "put_list x n y".
|
|
|
|
Introduce the GENOP_NAME_ARITY() macro to avoid setting the wrong
arity for an instruction.
|
|
That will avoid showing garbage instructions that will never be
executed.
|
|
|
|
* sverker/ets-select-fixation-owner-change-bug/OTP-15672:
erts: Fix bug for yielding ets:replace
|
|
Found by valgrind:
Conditional jump or move depends on uninitialised value(s)
Suspected: ets_select_replace_1:3034 [erl_db.c]
Bug introduced by already merged parent commit
0d550c80d4f19cc432e7de056169695d436c02a0.
|
|
* sverker/ets-select-fixation-owner-change-bug/OTP-15672:
erts: Fix ets:select table fixation leak at owner change
erts: Refactor common things into traverse_context_t
stdlib: Clarify docs for ets:info(_, safe_fixed)
|
|
Optimize tail-recursive calls of BIFs
OTP-15674
|
|
Symptom:
ETS table remains fixed after an ets:select* call has finished.
Problem:
The decision to unfix table after a yielding ets:select*
is based on table ownership, but ownership might have changed
while ets:select* was yielding.
Solution:
Remember and pass along whether table was fixed
when the traversal started.
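The problem and solution above can be illustrated with a small toy model. This is only a sketch: the types and function names are hypothetical, and the rule in `policy_wants_fix()` (fix the table only when the caller is not the owner) is an assumption made for the illustration, not the actual rule in erl_db.c. What it shows is why re-evaluating an ownership-based decision at the end of a yielding traversal can leak a fixation, and why remembering the decision made at the start avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; these are not the ERTS data structures. */
typedef struct {
    int owner;      /* id of the current table owner */
    int fix_count;  /* number of outstanding fixations */
} toy_table;

typedef struct {
    toy_table *tb;
    int caller;
    bool did_fix;   /* decision remembered when the traversal started */
} toy_traverse_ctx;

/* Hypothetical policy: fix the table only if the caller is not the owner. */
bool policy_wants_fix(const toy_table *tb, int caller)
{
    return tb->owner != caller;
}

void traverse_start(toy_traverse_ctx *ctx, toy_table *tb, int caller)
{
    ctx->tb = tb;
    ctx->caller = caller;
    ctx->did_fix = policy_wants_fix(tb, caller);
    if (ctx->did_fix)
        tb->fix_count++;
}

/* Buggy variant: re-evaluates ownership, which may have changed
 * while the traversal was yielding. */
void traverse_finish_buggy(toy_traverse_ctx *ctx)
{
    if (policy_wants_fix(ctx->tb, ctx->caller))
        ctx->tb->fix_count--;
}

/* Fixed variant: uses the decision remembered at the start. */
void traverse_finish_fixed(toy_traverse_ctx *ctx)
{
    if (ctx->did_fix) {
        assert(ctx->tb->fix_count > 0);
        ctx->tb->fix_count--;
    }
}
```

In this model, if process 2 starts a traversal on a table owned by process 1 and becomes the owner before the traversal finishes, the buggy variant leaves `fix_count` at 1 while the fixed variant brings it back to 0.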
|
|
and rename it from match_callbacks_t.
|
|
BEAM currently does not call BIFs at the end of a function in a
tail-recursive way. That is, when calling a BIF at the end of a
function, the BIF is first called, and then the stack frame
is deallocated, and then control is transferred to the caller.
If there is no stack frame when a BIF is called in the tail position,
the loader will emit a sequence of three instructions: first an
instruction that allocates a stack frame and saves the continuation
pointer (`allocate`), then an instruction that calls the BIF
(`call_bif`), and lastly an instruction that deallocates the stack
frame and returns to the caller (`deallocate_return`).
The old compiler would essentially allocate a stack frame for each
clause in a function, so it would not be that common that a BIF was
called in the tail position when there was no stack frame, so the
three-instruction sequence was deemed acceptable.
The new compiler only allocates stack frames when truly needed, so
the three-instruction BIF call sequence has become much more common.
This commit introduces a new `call_bif_only` instruction so that only
one instruction will be needed when calling a BIF in the tail position
when there is no stack frame. This instruction is also used when there
is a stack frame to make it possible to deallocate the stack frame
**before** calling the BIF, which may make a subsequent garbage
collection at the end of the BIF call cheaper (copying less garbage).
The one downside of this change is that the function that called the
BIF will not be included in the stack backtrace (similar to how a
tail-recursive call to an Erlang function will not be included in the
backtrace).
That was the quick summary of the commit. Here comes a detailed look
at how BIF calls are translated by the loader. The first example is a
function that calls `setelement/3` in the tail position:
    update_no_stackframe(X) ->
        setelement(5, X, new_value).
Here is the BEAM code:
    {function, update_no_stackframe, 1, 12}.
    {label,11}.
      {line,[...]}.
      {func_info,{atom,t},{atom,update_no_stackframe},1}.
    {label,12}.
      {move,{x,0},{x,1}}.
      {move,{atom,new_value},{x,2}}.
      {move,{integer,5},{x,0}}.
      {line,[...]}.
      {call_ext_only,3,{extfunc,erlang,setelement,3}}.
Because there is no stack frame, the `call_ext_only` instruction will
be used to call `setelement/3`:
    {call_ext_only,3,{extfunc,erlang,setelement,3}}.
The loader will transform this instruction to a three-instruction
sequence:
    0000000020BD8130: allocate_tt 0 3
    0000000020BD8138: call_bif_e erlang:setelement/3
    0000000020BD8148: deallocate_return_Q 0
Using the `call_bif_only` instruction introduced in this commit,
only one instruction is needed:
    000000005DC377F0: call_bif_only_e erlang:setelement/3
`call_bif_only` calls the BIF and returns to the caller.
Now let's look at a function that already has a stack frame when
`setelement/3` is called:
    update_with_stackframe(X) ->
        foobar(X),
        setelement(5, X, new_value).
Here is the BEAM code:
    {function, update_with_stackframe, 1, 14}.
    {label,13}.
      {line,[...]}.
      {func_info,{atom,t},{atom,update_with_stackframe},1}.
    {label,14}.
      {allocate,1,1}.
      {move,{x,0},{y,0}}.
      {line,[...]}.
      {call,1,{f,16}}.
      {move,{y,0},{x,1}}.
      {move,{atom,new_value},{x,2}}.
      {move,{integer,5},{x,0}}.
      {line,[...]}.
      {call_ext_last,3,{extfunc,erlang,setelement,3},1}.
Since there is a stack frame, the `call_ext_last` instruction will be used
to deallocate the stack frame and call the function:
    {call_ext_last,3,{extfunc,erlang,setelement,3},1}.
Before this commit, the loader would translate this instruction to:
    0000000020BD81B8: call_bif_e erlang:setelement/3
    0000000020BD81C8: deallocate_return_Q 1
That is, the BIF is called before deallocating the stack frame and returning
to the calling function.
After this commit, the loader will translate the `call_ext_last` like this:
    000000005DC37868: deallocate_Q 1
    000000005DC37870: call_bif_only_e erlang:setelement/3
There are still two instructions, but now the stack frame will be
deallocated before calling the BIF, which could make the potential
garbage collection after the BIF call slightly more efficient (copying
less garbage).
We could have introduced a `call_bif_last` instruction, but the code
for calling a BIF is relatively large and there does not seem to be a
practical way to share the code between `call_bif` and `call_bif_only`
(since the difference is at the end, after the BIF call). Therefore,
we did not want to clone the BIF calling code yet another time to
make a `call_bif_last` instruction.
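The two translations described in this commit message can be summarized in a toy model. The sketch below is not the loader's actual transformation machinery; the enum, struct, and function names are hypothetical, and only the instruction sequences themselves are taken from the listings in the message. The `use_call_bif_only` flag selects between the behavior before and after this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Generic instructions emitted by the compiler for a BIF in tail position. */
typedef enum { GEN_CALL_EXT_ONLY, GEN_CALL_EXT_LAST } gen_op;

/* Specific instructions produced by the (toy) loader. */
typedef enum {
    OP_ALLOCATE,
    OP_CALL_BIF,
    OP_CALL_BIF_ONLY,
    OP_DEALLOCATE,
    OP_DEALLOCATE_RETURN
} spec_op;

typedef struct {
    spec_op op;
    int frame;   /* stack frame size operand; 0 when not applicable */
} spec_instr;

/* Writes the translated sequence to out (capacity of at least 3)
 * and returns the number of instructions written. */
size_t translate_bif_tail_call(gen_op op, int frame_size,
                               int use_call_bif_only, spec_instr out[3])
{
    size_t n = 0;
    assert(out != NULL);

    if (op == GEN_CALL_EXT_ONLY) {
        if (use_call_bif_only) {
            /* After the commit: a single instruction. */
            out[n++] = (spec_instr){OP_CALL_BIF_ONLY, 0};
        } else {
            /* Before: build a frame, call the BIF, tear the frame down. */
            out[n++] = (spec_instr){OP_ALLOCATE, 0};
            out[n++] = (spec_instr){OP_CALL_BIF, 0};
            out[n++] = (spec_instr){OP_DEALLOCATE_RETURN, 0};
        }
    } else { /* GEN_CALL_EXT_LAST: a stack frame already exists */
        if (use_call_bif_only) {
            /* After: deallocate first, then call the BIF and return. */
            out[n++] = (spec_instr){OP_DEALLOCATE, frame_size};
            out[n++] = (spec_instr){OP_CALL_BIF_ONLY, 0};
        } else {
            /* Before: call the BIF, then deallocate and return. */
            out[n++] = (spec_instr){OP_CALL_BIF, 0};
            out[n++] = (spec_instr){OP_DEALLOCATE_RETURN, frame_size};
        }
    }
    return n;
}
```

For `call_ext_only` the instruction count drops from three to one. For `call_ext_last` it stays at two, and only the order of the deallocation and the BIF call changes.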
|
|
Clarify beam_load error message on file/module mismatch
|
|
erts: Add enif_term_type
|
|
This is particularly important on case-insensitive file systems,
where attempting to invoke a module with the wrong case leads
to confusing error messages:
    1> erlpress_core:foo().
    beam/beam_load.c(1428): Error loading module 'erlpress_core':
      module name in object code is erlPress_core
    Loading of erlPress_core.beam failed: :badfile
This commit replaces "object code" with "BEAM file" in the message
and improves its readability.
|
|
* sverker/master/ets-no-mbuf-trapping/OTP-15660:
erts: Remove ets traversal yielding if heap fragment
|
|
Tune BEAM instructions for the new compiler (part 2)
|
|
into sverker/master/ets-no-mbuf-trapping/OTP-15660
|
|
into sverker/maint/ets-no-mbuf-trapping/OTP-15660
|
|
Having many heap fragments no longer makes the GC slow.
Even worse, we are not guaranteed that a yield will provoke a GC
that removes the fragments, which might lead to a one-yield-per-bucket
scenario if the heap fragments still remain after each yield.
|
|
Use S operands instead of s operands for a slight speed increase
and reduction in code size of process_main(). Use micro instructions
for frequently executed instructions.
While at it, use safe multiplication in gen_get_integer() in
beam_load.c.
|
|
This helps avoid long sequences of enif_is_xxx in code that
serializes terms (such as JSON encoders) by letting the user
switch on the type.
|
|
* rickard/send-bump-reds/ERL-773/OTP-15513:
Fix faulty assertion
Bump reductions on send based on message size
|
|
|
|
|
|
|
|
Let sys.h define HAVE_OVERFLOW_CHECK_BUILTINS if the compiler supports
__builtin_mul_overflow() and the other overflow checking builtins.
The test is intentionally made in sys.h and not as a configure test.
On Windows, beam_emu.c is always compiled using gcc, but the other
files are usually compiled with Microsoft's C compiler. With the
test in the header file, HAVE_OVERFLOW_CHECK_BUILTINS will be defined
when compiling beam_emu.c.
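A header-level test of this kind could look roughly like the sketch below. The preprocessor condition is an assumption (the exact test used in sys.h is not shown here), and `safe_mul_ulong()` is a hypothetical helper added only to show how code can fall back to a portable check when the macro is not defined, for example under Microsoft's C compiler.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed detection logic: prefer __has_builtin where the compiler
 * provides it, otherwise fall back to a GCC version check. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_mul_overflow)
#    define HAVE_OVERFLOW_CHECK_BUILTINS 1
#  endif
#elif defined(__GNUC__) && __GNUC__ >= 5
#  define HAVE_OVERFLOW_CHECK_BUILTINS 1
#endif

/* Multiplies a by b. Returns 1 and stores the product in *res on
 * success, or returns 0 if the product does not fit. */
int safe_mul_ulong(unsigned long a, unsigned long b, unsigned long *res)
{
    assert(res != NULL);
#ifdef HAVE_OVERFLOW_CHECK_BUILTINS
    /* The builtin returns nonzero when the multiplication overflowed. */
    return !__builtin_mul_overflow(a, b, res);
#else
    /* Portable fallback: check against the limit before multiplying. */
    if (a != 0 && b > ULONG_MAX / a)
        return 0;
    *res = a * b;
    return 1;
#endif
}
```

Because the decision is made by the preprocessor in each translation unit, the same header can enable the builtins for a file compiled with gcc and disable them for a file compiled with another compiler in the same build.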
|
|
Starting in OTP 19 (in commit 9504c0dd71d0), the compiler emits
a test_unit instruction instead of a skip instruction at the end
of a binary. We can do the same replacement in the loader to get
rid of the i_bs_skip_bits_all2 instruction.
|
|
On a 64-bit architecture, the size of any binary that would fit in
memory must fit in a small, so we can fail immediately if the size
term is not a small.
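The early-failure check can be sketched with a toy term representation. The tag layout below (a 4-bit tag in the low bits, with all bits set marking a small integer) and all `toy_` names are assumptions for illustration, not a statement about the actual ERTS term encoding. The rejection of negative sizes is likewise part of the sketch rather than something stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy tagging scheme (assumed): low 4 bits are the tag. */
#define TOY_TAG_BITS  4
#define TOY_TAG_MASK  ((uint64_t)0xF)
#define TOY_TAG_SMALL ((uint64_t)0xF)

/* Builds a toy small-integer term from a value that fits in 60 bits. */
uint64_t toy_make_small(int64_t v)
{
    return ((uint64_t)v << TOY_TAG_BITS) | TOY_TAG_SMALL;
}

int toy_is_small(uint64_t term)
{
    return (term & TOY_TAG_MASK) == TOY_TAG_SMALL;
}

/* Returns 1 and stores the size if term is a non-negative small.
 * Returns 0 at once for anything else, without inspecting bignums. */
int toy_get_binary_size(uint64_t term, uint64_t *size)
{
    assert(size != NULL);
    if (!toy_is_small(term))
        return 0;              /* not a small: no such binary can exist */
    if (term >> 63)
        return 0;              /* negative small: not a valid size */
    *size = term >> TOY_TAG_BITS;
    return 1;
}
```

With this shape, the size check needs only a tag test and a shift, and no code path has to convert a bignum.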
|