Age | Commit message (Collapse) | Author |
|
* maint:
Fix etp-ets-tables
Fix node refc test for free processes hanging around
Enhanced node refc bookkeeping
Fix node container refc tests of ETS
Fix node refc test of external data
Node container refc test for persistent terms
Include persistent term storage in node/dist refc check
Fix node refc test for system message queue
|
|
* rickard/node-refc-tests-22:
Fix etp-ets-tables
Fix node refc test for free processes hanging around
Enhanced node refc bookkeeping
Fix node container refc tests of ETS
Fix node refc test of external data
Node container refc test for persistent terms
Include persistent term storage in node/dist refc check
Fix node refc test for system message queue
|
|
|
|
* poroh/erts/sched-stuck-fix/OTP-15941:
Infinite cycle fixed on try to change run queue (if it has already changed concurrently)
|
|
|
|
Infinite cycle fixed on try to change run queue (if it has already ch…
|
|
concurrently)
|
|
* kjell/stdlib/ets_ordered_set_slow_react/OTP-15906:
ETS ordered_set: Improvements to the CA tree implementation
|
|
This commit only affects the implementation of ETS `ordered_set`
tables with the `write_concurrency` option enabled. Such tables are
implemented with a data structure that is called the contention
adapting search tree (CA tree). This commit introduces the following
changes:
* This commit causes a join to be triggered in one randomly selected
base node in about one of 1000 read unlock calls for base node
locks. No such joins happened before this commit. Before this
commit, operations that only acquired looks in read-mode never
triggered any contention adaptation. Therefore, the CA tree could
get stuck in a sub-optimal state in certain scenarios. This could
happen, for example, when a CA tree is first populated with parallel
inserts (which will cause splits of base nodes) and then only
read-only operations are applied to the data structure. Benchmark
results from the
`ets_SUITE:lookup_catree_par_vs_seq_init_benchmark/0` benchmark
function (which is included in this commit) shows that this change
can improve the throughput of the CA tree in the scenario described
above.
* Read-only operations will now also increase values of statistics
counters when they detect that they need to wait for other
operations. Only write operation changed statistics counters before
this commit. This improves the statistics that the adaptation
heuristics is based on.
* Additionally, this commit adds an upper and lower limit to the
contention statistics variables in the base nodes. Such limits did
not exist before this commit. This should, for example, make the CA
tree more responsive to contention after long periods of low
contention.
|
|
This commit only affects the implementation of ETS `ordered_set`
tables with the `write_concurrency` option enabled. Such tables are
implemented with a data structure that is called the contention
adapting search tree (CA tree). This commit introduces the following
changes:
* This commit causes a join to be triggered in one randomly selected
base node in about one of 1000 read unlock calls for base node
locks. No such joins happened before this commit. Before this
commit, operations that only acquired looks in read-mode never
triggered any contention adaptation. Therefore, the CA tree could
get stuck in a sub-optimal state in certain scenarios. This could
happen, for example, when a CA tree is first populated with parallel
inserts (which will cause splits of base nodes) and then only
read-only operations are applied to the data structure. Benchmark
results from the
`ets_SUITE:lookup_catree_par_vs_seq_init_benchmark/0` benchmark
function (which is included in this commit) shows that this change
can improve the throughput of the CA tree in the scenario described
above.
* Read-only operations will now also increase values of statistics
counters when they detect that they need to wait for other
operations. Only write operation changed statistics counters before
this commit. This improves the statistics that the adaptation
heuristics is based on.
* Additionally, this commit adds an upper and lower limit to the
contention statistics variables in the base nodes. Such limits did
not exist before this commit. This should, for example, make the CA
tree more responsive to contention after long periods of low
contention.
|
|
|
|
* sverker/seq-trace-label-old-heap-bug/ERL-700/OTP-15849:
erts: Fix faulty spec for seq_trace:set_token/2
erts: Fix seq_trace:print/2 for arbitrary labels
erts: Fix bug in seq_trace:set_token(label,_)
|
|
If internal seq-trace tuple is on old heap
an incorrect ref from old to new heap was made.
|
|
jhogberg/john/erts/seq-trace-on-spawn/OTP-15232/ERL-700
Propagate seq_trace tokens to spawned processes
|
|
Previously, all ETS tables used centralized counter variables to keep
track of the number of items stored and the amount of memory
consumed. These counters can cause scalability problems (especially on
big NUMA systems). This commit adds an implementation of a
decentralized counter and modifies the implementation of ETS so that
ETS tables of type ordered_set with write_concurrency enabled use the
decentralized counter. [Experiments][1] indicate that this change
substantially improves the scalability of ETS ordered_set tables with
write_concurrency enabled in scenarios with frequent `ets:insert/2`
and `ets:delete/2` calls.
The new counter is implemented in the module erts_flxctr
(`erts_flxctr.h` and `erts_flxctr.c`). The module has the suffix
flxctr as it contains the implementation of a flexible counter (i.e.,
counter instances can be configured to be either centralized or
decentralized). Counters that are configured to be centralized are
implemented with a single counter variable which is modified with
atomic operations. Decentralized counters are spread over several
cache lines (how many can be configured with the parameter
`+dcg`). The scheduler threads are mapped to cache lines so that there
is no single point of contention when decentralized counters are
updated. The thread progress functionality of the Erlang VM is
utilized to implement support for linearizable snapshots of
decentralized counters. The snapshot functionality is used by the
`ets:info/1` and `ets:info/2` functions.
[1]: http://winsh.me/ets_catree_benchmark/flxctr_res.html
|
|
Setting the hole marker in debug on the entire heap is very slow
so instead we do it in erts_produce_heap.
|
|
Trace tokens can be lost when a process delegates message sending
to a child process, which is pretty surprising and limits the
usefulness of seq tracing. One example of this is gen_statem:call/4
which uses a child process to implement timeouts without the risk
of a late message arriving to the caller.
This commit attempts to remedy this by propagating the trace token
to spawned processes, and adds optional tracing for process
spawning as well.
|
|
into sverker/master/enif_whereis_pid-dirty-dtor
|
|
to run user NIF code in a more known execution context.
Fixes problems like user calling enif_whereis_pid() in destructor
which may need to release process main lock in order to lock reg_tab.
|
|
The reason in EXIT and DOWN may be arbitrarily large,
so we yield and allow other processes to execute while
encoding and sending the signals over the distribution.
|
|
All of the Red-Black Tree _yielding functions have been
updated to work with reductions returned by the called
function instead of yielding on each element.
|
|
* lukas/erts/scheduler-pollset-fixes/OTP-15538:
erts: Fix getting of poll events on linux >= 4.15.0
erts: Use reduction based polling for starved poll-set
erts: Fix pollset test cases
|
|
When the schedulers never go to sleep (and thus never polls)
it may be that the fds in schedulers poll-sets are never polled.
Before this commit, this was solved by starting a timer when an
overload was detected. This had issues as overloads were not always
detected in time. So this commit reverts to the pre OTP-21 behaviour
so keep a global counter makes that the poll is called when it should.
|
|
rickard/dirty_scheduler_collapse/maint-21/OTP-15509
* rickard/dirty_scheduler_collapse/OTP-15509:
Fix bug causing dirty scheduler sleeper list inconsistency
|
|
|
|
At start of the VM a poll-set that the schedulers
will check is created where fds that have triggered
many (at the moment, many means 10) times without
being deselected inbetween. In this scheduler specific
poll-set fds do not use ONESHOT, which means that the
number of syscalls goes down dramatically for such fds.
This pollset is introduced in order to handle fds that
are used by the erlang distribution and that never
change their state from {active, true}.
This pollset only handles ready_input events,
ready_output is still handled by the poll threads.
During overload, polling the scheduler poll-set is done
on a 10ms timer.
|
|
|
|
|
|
|
|
If no message/signal is sent (to same destination)
then monitor signal is flushed when process is scheduled out.
|
|
This commit replaces the old memory instrumentation with a new
implementation that scans carriers instead of wrapping
erts_alloc/erts_free. The old implementation could not extract
information without halting the emulator, had considerable runtime
overhead, and the memory maps it produced were noisy and lacked
critical information.
Since the new implementation walks through existing data structures
there's no longer a need to start the emulator with special flags to
get information about carrier utilization/fragmentation. Memory
fragmentation is also easier to diagnose as it's presented on a
per-carrier basis which eliminates the need to account for "holes"
between mmap segments.
To help track allocations, each allocation can now be tagged with
what it is and who allocated it at the cost of one extra word per
allocation. This is controlled on a per-allocator basis with the
+M<S>atags option, and is enabled by default for binary_alloc and
driver_alloc (which is also used by NIFs).
|
|
This may be of interest in crash dumps and allows the upcoming
allocation tagging feature to track allocations on a per-NIF basis.
Note that this is only updated when user code calls a NIF; it's not
altered when the emulator calls NIFs during code upgrades or
tracing.
|
|
* rickard/process_info/OTP-14966:
New process_info() implementation using signals
|
|
|
|
* john/erts/bwt-wt-dirty-schedulers/OTP-14959:
Add +sbwt/+swt analogues for dirty schedulers
|
|
Sharing these settings for all schedulers can degrade performance,
so it makes sense to be able to configure them separately.
This also changes the default busy-wait time to "short" for both
kinds of dirty schedulers.
|
|
OTP-14899
|
|
Communication between Erlang processes has conceptually always been
performed through asynchronous signaling. The runtime system
implementation has however previously preformed most operation
synchronously. In a system with only one true thread of execution, this
is not problematic (often the opposite). In a system with multiple threads
of execution (as current runtime system implementation with SMP support)
it becomes problematic. This since it often involves locking of structures
when updating them which in turn cause resource contention. Utilizing
true asynchronous communication often avoids these resource contention
issues.
The case that triggered this change was contention on the link lock due
to frequent updates of the monitor trees during communication with a
frequently used server. The signal order delivery guarantees of the
language makes it hard to change the implementation of only some signals
to use true asynchronous signaling. Therefore the implementations
of (almost) all signals have been changed.
Currently the following signals have been implemented as true
asynchronous signals:
- Message signals
- Exit signals
- Monitor signals
- Demonitor signals
- Monitor triggered signals (DOWN, CHANGE, etc)
- Link signals
- Unlink signals
- Group leader signals
All of the above already defined as asynchronous signals in the
language. The implementation of messages signals was quite
asynchronous to begin with, but had quite strict delivery constraints
due to the ordering guarantees of signals between a pair of processes.
The previously used message queue partitioned into two halves has been
replaced by a more general signal queue partitioned into three parts
that service all kinds of signals. More details regarding the signal
queue can be found in comments in the erl_proc_sig_queue.h file.
The monitor and link implementations have also been completely replaced
in order to fit the new asynchronous signaling implementation as good
as possible. More details regarding the new monitor and link
implementations can be found in the erl_monitor_link.h file.
|
|
* rickard/remove-approx-started/OTP-14975:
Remove process start time for crash dumps
|
|
Bug introduced in commit fbb10ebc4a37555c7ea7f99e14286d862993976a
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Otherwise on targets which have small data area with short addressing
like on PowerPC ther will be linking errors due to the mismatch of
declaration/usage and definition.
|
|
|
|
It is not longer relevant when using the poll thread
|
|
|