Age | Commit message | Author |
|
Optimize continuation pointer management
|
|
The BEAM instructions for calling a function don't save the
continuation pointer (return address) on the stack, but to a special
BEAM register called CP. It is the responsibility of the called
function to save CP to the stack frame before calling other functions.
In the earlier implementations of BEAM on Sparc, CP was located in a
CPU register. That meant that the continuation pointer was never
written to memory when calling simple functions that didn't call
other functions at all or ended in a tail-call to another function.
The modern BEAM no longer keeps CP in a CPU register. Instead, it is
kept in the `process` struct (in `p->cp`). That means the continuation
pointer must be written to memory on every call, and if the called
function calls other functions, it must read the continuation
pointer from `p->cp` and store it on the stack.
This commit eliminates the concept of the CP register and modifies
the call instructions to directly store the continuation pointer on
the stack. That makes allocation and trimming of stack frames slightly
faster. A more important benefit is simplification of code that handles
continuation pointers. Because all continuation pointers are now stored
on the stack, the special case of handling `p->cp` disappears.
Co-authored-by: John Högberg <[email protected]>
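The difference between the two schemes can be sketched with a toy model. This is only an illustration: `toy_proc`, `old_call`, `old_allocate_frame` and `new_call` are hypothetical names, not BEAM code, and the real stack frame layout is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names come from the BEAM sources. */
typedef struct {
    const void *stack[16]; /* toy stack, grows upward for simplicity */
    size_t sp;             /* number of used stack slots */
    const void *cp;        /* the CP "register" kept in the process struct */
    unsigned mem_writes;   /* memory writes spent on continuation pointers */
} toy_proc;

/* Old scheme: the call instruction writes the return address to p->cp ... */
static void old_call(toy_proc *p, const void *return_addr)
{
    p->cp = return_addr;
    p->mem_writes++;
}

/* ... and a callee that calls other functions copies it into its frame. */
static void old_allocate_frame(toy_proc *p)
{
    assert(p->sp < 16);
    p->stack[p->sp++] = p->cp;
    p->mem_writes++;
}

/* New scheme: the call instruction stores the return address on the stack. */
static void new_call(toy_proc *p, const void *return_addr)
{
    assert(p->sp < 16);
    p->stack[p->sp++] = return_addr;
    p->mem_writes++;
}
```

In this model a call to a function that itself calls other functions costs two writes under the old scheme and one under the new scheme, and the return address is always found in the same place.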
|
|
* maint:
erts: Scan heap fragments for off-heap binaries
|
|
* john/erts/process_info-binary-heap-fragments/OTP-15978:
erts: Scan heap fragments for off-heap binaries
|
|
|
|
* sverker/re-enable-big-creation/OTP-15603:
doc: Add links between dist flags and external tags (DTD updated)
erts: Fix docs for new pid,port,ref external tags
erts: Document new EPMD response ALIVE2_X_RESP
erl_interface: Support 32-bit creation local cnode
jinterface: Remove old encoding of pid,port,refs
epmd: Support 32-bit creation values in local node
erl_interface: Remove old encoding of pid,port,refs
erts: Remove old encoding of pids, ports and refs
erts: Make DFLAG_BIG_CREATION mandatory
|
|
'sverker/erts/process-info-reductions-idle-proc/ERL-964/OTP-15865' into maint
* sverker/erts/process-info-reductions-idle-proc/ERL-964/OTP-15865:
erts: Improve test of process_info(reductions)
Revert "erts: Force process_info(reductions) as signal"
|
|
This reverts commit 70dbf671a8196110d2aee2e7507afc2c2c75183f.
As the comment of 70dbf671a8 itself indicates, that "fix" is not really
necessary. It does, however, have the bad effect of always consuming reductions of the
process whose reduction count you want to know; that is, you can't measure the reduction
count without affecting it.
|
|
This reverts revert-commit d293c3ff700c1a0992a32dc3da9ae18964893c23.
|
|
* sverker/process_info-reductions-fix/OTP-15793:
erts: Force process_info(reductions) as signal
erts: Fix another bug in process_info(reductions)
|
|
|
|
* sverker/process_info-reductions-fix/OTP-15793:
erts: Force process_info(reductions) as signal
erts: Fix another bug in process_info(reductions)
|
|
Not 100% sure this is needed to get correct reductions
as the direct query is not done if process is RUNNING anyway.
|
|
* rickard/dist-system-limit/OTP-15708:
Fail when we cannot encode term in binary
|
|
This commit fixes an ETS test case that tests the decentralized memory
counter in tables of type ordered_set with the write_concurrency
option turned on. The test case assumed that the memory consumption of
the table would only grow monotonically when terms are
inserted. However, this was not the case when the emulator was
compiled in debug mode as random splits and joins of CA tree nodes
could happen. This commit fixes the test case by disabling random
splits and joins in the tested table.
|
|
* sverker/revert-big-creation:
Revert "erts: Make DFLAG_BIG_CREATION mandatory"
Revert "erts: Remove old encoding of pids, ports and refs"
Revert "erl_interface: Remove old encoding of pid,port,refs"
Revert "epmd: Support 32-bit creation values in local node"
Revert "jinterface: Remove old encoding of pid,port,refs"
Revert "erl_interface: Support 32-bit creation local cnode"
Revert "erts: Document new EPMD response ALIVE2_X_RESP"
|
|
Previously, all ETS tables used centralized counter variables to keep
track of the number of items stored and the amount of memory
consumed. These counters can cause scalability problems (especially on
big NUMA systems). This commit adds an implementation of a
decentralized counter and modifies the implementation of ETS so that
ETS tables of type ordered_set with write_concurrency enabled use the
decentralized counter. [Experiments][1] indicate that this change
substantially improves the scalability of ETS ordered_set tables with
write_concurrency enabled in scenarios with frequent `ets:insert/2`
and `ets:delete/2` calls.
The new counter is implemented in the module erts_flxctr
(`erts_flxctr.h` and `erts_flxctr.c`). The module has the suffix
flxctr as it contains the implementation of a flexible counter (i.e.,
counter instances can be configured to be either centralized or
decentralized). Counters that are configured to be centralized are
implemented with a single counter variable which is modified with
atomic operations. Decentralized counters are spread over several
cache lines (how many can be configured with the parameter
`+dcg`). The scheduler threads are mapped to cache lines so that there
is no single point of contention when decentralized counters are
updated. The thread progress functionality of the Erlang VM is
utilized to implement support for linearizable snapshots of
decentralized counters. The snapshot functionality is used by the
`ets:info/1` and `ets:info/2` functions.
[1]: http://winsh.me/ets_catree_benchmark/flxctr_res.html
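As a rough illustration of the decentralized idea, the sketch below spreads one counter over several cache-line-aligned slots. It is not the `erts_flxctr` implementation: `SLOT_COUNT`, `CACHE_LINE_SIZE`, `dec_counter` and the function names are assumptions made for the example, and the read is a plain sum rather than the linearizable snapshot that the commit builds on thread progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SLOT_COUNT 4        /* hypothetical; stands in for the +dcg setting */
#define CACHE_LINE_SIZE 64  /* assumed cache line size in bytes */

/* One slot per cache line, so updates to different slots do not contend. */
typedef struct {
    _Alignas(CACHE_LINE_SIZE) atomic_long value;
} counter_slot;

typedef struct {
    counter_slot slots[SLOT_COUNT];
} dec_counter;

static void dec_counter_init(dec_counter *c)
{
    for (size_t i = 0; i < SLOT_COUNT; i++)
        atomic_init(&c->slots[i].value, 0);
}

/* Each scheduler updates only the slot it is mapped to. */
static void dec_counter_add(dec_counter *c, size_t scheduler_id, long delta)
{
    counter_slot *slot = &c->slots[scheduler_id % SLOT_COUNT];
    atomic_fetch_add_explicit(&slot->value, delta, memory_order_relaxed);
}

/* Sums all slots. With concurrent writers this is NOT a linearizable
 * snapshot; it only shows where the total comes from. */
static long dec_counter_read(dec_counter *c)
{
    long sum = 0;
    for (size_t i = 0; i < SLOT_COUNT; i++)
        sum += atomic_load_explicit(&c->slots[i].value, memory_order_relaxed);
    return sum;
}
```

The trade-off is that updates become cheap and contention-free while reads must visit every slot, which fits counters that are updated on every insert and delete but read only occasionally.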
|
|
This reverts commit bd8f6106d44a58c261920eef72842bb3bc5a4968.
PLUS a little change in epmd_srv.c:750 ("4" -> "replylen")
that was part of e2cf4a8a4b03b9f430ba228276c3b2629159e832
by mistake.
|
|
|
|
Fail when we cannot encode term in binary instead of producing a
faulty result.
|
|
to avoid failed ERTS_DBG_CHK_REDS by clearing virtual_reds.
|
|
* sverker/enable-big-creation/OTP-15603:
epmd: Support 32-bit creation values in local node
erts: Robustify epmd reply function
erts: Reject decoded local refs with too large first word
erts: Fix bug in list_to_ref
erl_interface: Remove old encoding of pid,port,refs
erts: Remove old encoding of pids, ports and refs
erts: Make DFLAG_BIG_CREATION mandatory
|
|
* Increase distribution version from 5 to 6
* Introduce new ALIVE2_X_RESP with 32-bit creation
as reply to ALIVE2_REQ when sender dist version >= 6
* Still reply old ALIVE2_RESP with tiny creation 1..3
if sender dist version < 6.
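The reply selection described in the list above could look roughly like the following sketch. The function `build_alive2_reply` and its parameters are hypothetical and this is not the code in `epmd_srv.c`. The tag values 121 and 118 and the big-endian 2-byte and 4-byte creation fields follow the distribution protocol documentation as I understand it. How the tiny creation is chosen is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    ALIVE2_RESP   = 121, /* old reply: 2-byte creation */
    ALIVE2_X_RESP = 118  /* new reply: 4-byte creation */
};

/* Writes the reply into buf (at least 6 bytes) and returns its length.
 * Hypothetical helper: the caller supplies both creation values. */
static size_t build_alive2_reply(uint8_t *buf, unsigned dist_version,
                                 uint8_t result, uint32_t big_creation,
                                 uint16_t tiny_creation)
{
    size_t n = 0;

    if (dist_version >= 6) {
        buf[n++] = ALIVE2_X_RESP;
        buf[n++] = result;
        buf[n++] = (uint8_t)(big_creation >> 24);
        buf[n++] = (uint8_t)(big_creation >> 16);
        buf[n++] = (uint8_t)(big_creation >> 8);
        buf[n++] = (uint8_t)big_creation;
    } else {
        /* Older senders only understand the tiny creation range 1..3. */
        assert(tiny_creation >= 1 && tiny_creation <= 3);
        buf[n++] = ALIVE2_RESP;
        buf[n++] = result;
        buf[n++] = (uint8_t)(tiny_creation >> 8);
        buf[n++] = (uint8_t)tiny_creation;
    }
    return n;
}
```

Returning the length rather than assuming a fixed reply size matters here, because the two reply types differ in length.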
|
|
into sverker/master/enif_whereis_pid-dirty-dtor
|
|
to run user NIF code in a more known execution context.
Fixes problems like user calling enif_whereis_pid() in destructor
which may need to release process main lock in order to lock reg_tab.
|
|
into sverker/master/ets-no-mbuf-trapping/OTP-15660
|
|
into sverker/maint/ets-no-mbuf-trapping/OTP-15660
|
|
Many heap fragments no longer make the GC slow.
Even worse, we are not guaranteed that a yield will provoke a GC
removing the fragments, which might lead to a one-yield-per-bucket
scenario if the heap fragment(s) still remain after each yield.
|
|
All of the Red-Black Tree _yielding functions have been
updated to work with reductions returned by the called
function instead of yielding on each element.
|
|
|
|
|
|
|
|
If the main lock is not taken then any process running
on a dirty scheduler may cause all kinds of problems.
|
|
Conflicts:
erts/emulator/beam/bif.c
erts/preloaded/ebin/erlang.beam
erts/preloaded/ebin/erts_internal.beam
erts/preloaded/ebin/prim_file.beam
|
|
This flag allows logger and other components to set the
process to which log messages from ERTS are to be sent.
|
|
to more easily generate a routing tree for tests
without having to spend CPU to provoke actual repeated lock conflicts.
|
|
"(void)result" will silence the warning about an unused variable
and the compiler will optimize away such unused variables.
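A minimal sketch of the idiom, assuming the typical case where a result is only inspected by an assertion that disappears in release builds. `store_value` and `set_slot` are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Hypothetical helper whose return value is only checked in debug builds. */
static int store_value(int *slot, int value)
{
    *slot = value;
    return 1;
}

static void set_slot(int *slot, int value)
{
    int result = store_value(slot, value);

    (void)result;        /* counts as a use, so no unused-variable warning */
    assert(result == 1); /* expands to nothing when NDEBUG is defined */
}
```

The cast has no runtime effect, so it can replace preprocessor conditionals that exist only to hide the variable in release builds.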
|
|
which only existed in a patched version of valgrind (by pan)
no longer used.
Instead we use standard VALGRIND_PRINTF which will end up like this
if valgrind log format is XML and valgrind version >= 3.9:
<clientmsg>
<tid>7</tid>
<threadname>3_scheduler</threadname>
<text>Test case #20 ei_encode_SUITE:test_ei_encode_long/1
</text>
</clientmsg>
Note the extra trailing whitespace that may occur before </text>.
|
|
Two of them only affect valgrind builds
and the one for ERL_CRASH_DUMP_NICE seems benign.
Return value changed in c2d70945dce9cb09d5d7120d6e9ddf7faac8d230
old -> new
-1 -> 0 not found
0 -> 1 found ok
1 -> -1 found but too big
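The mapping in the table above can be written out as a small sketch. The enum and function names are hypothetical; the commit message does not name the function whose return value changed.

```c
#include <assert.h>

/* New convention from the table above; the names are hypothetical. */
enum lookup_result {
    LOOKUP_TOO_BIG   = -1,
    LOOKUP_NOT_FOUND = 0,
    LOOKUP_FOUND     = 1
};

/* Translates a return value from the old convention to the new one. */
static int old_to_new(int old_ret)
{
    switch (old_ret) {
    case -1: return LOOKUP_NOT_FOUND;
    case 0:  return LOOKUP_FOUND;
    default: return LOOKUP_TOO_BIG; /* old value 1 */
    }
}

/* Under the new convention success is 1, not 0 as it was before. */
static int found_ok(int new_ret)
{
    return new_ret == LOOKUP_FOUND;
}
```

A caller that still tests for 0 would now treat "not found" as success, which is the kind of mistake a change like this can leave behind.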
|
|
|
|
* sverker/ets-delete_all_objects-trap/OTP-15078:
erts: Rename untrapping db_free_*empty*_table
erts: Make ets:delete_all_objects yield on fixed table
erts: Optimize ets delete all in fixed table
erts: Refactor ets select iteration code
erts: Cleanup ets code
erts: Optimize ets hash object deallocactions
erts: Refactor pseudo deleted ets objects
erts: Make atomic ets:delete_all_objects yield
erts: Fix reduction bump for ets:delete/1
|
|
|
|
|
|
by using a cooperative strategy that will make
any process accessing the table execute delete_all_objects_continue
until the table is empty.
This is not an optimal solution as concurrent threads will still
block on the table lock, but at least thread progress is made.
|
|
* rickard/process_info/OTP-14966:
Fix scheduled process_info() 'status' request
Fix handling of process-info requests in receive
|
|
|
|
Improve memory instrumentation
OTP-15024
OTP-14961
|
|
This commit replaces the old memory instrumentation with a new
implementation that scans carriers instead of wrapping
erts_alloc/erts_free. The old implementation could not extract
information without halting the emulator, had considerable runtime
overhead, and the memory maps it produced were noisy and lacked
critical information.
Since the new implementation walks through existing data structures
there's no longer a need to start the emulator with special flags to
get information about carrier utilization/fragmentation. Memory
fragmentation is also easier to diagnose as it's presented on a
per-carrier basis which eliminates the need to account for "holes"
between mmap segments.
To help track allocations, each allocation can now be tagged with
what it is and who allocated it at the cost of one extra word per
allocation. This is controlled on a per-allocator basis with the
+M<S>atags option, and is enabled by default for binary_alloc and
driver_alloc (which is also used by NIFs).
|
|
|
|
Run debug VM or config with --enable-lock-checking.
Exercise VM and then run
erts_debug:lc_graph().
to create a file "lc_graph.<pid>" in current working directory.
|