Age | Commit message (Collapse) | Author |
|
* rickard/dealloc/OTP-10162:
Improve the enqueue operation of delayed dealloc
Implement delayed aux work wake up
|
|
By using a delayed aux work wake up approach, a memory barrier
can be omitted in the delayed dealloc enqueue operation. The
amount of operations, on the potentially contended, wake up
structure is also reduced.
|
|
* rickard/thr-prgr-use/OTP-10116:
Fix faulty use of thread progress in handle_aux_work()
|
|
As an optimization old thread progress data was kept and used in
handle_aux_work() in erl_process.c. This could cause memory to be
deallocated at a later time than intended, which is quite harmless.
This has, however, now been fixed.
|
|
The hybrid heap emulator was last working in the non-SMP R11B
run-time system. When the constant pools were introduced in R12B,
the hybrid heap emulator was not updated to handle them.
At this point, the harm from reduced readability of the code is
greater than any potential usefulness of keeping the code.
|
|
* rickard/sched-busy-wait/OTP-10044:
Add switch controlling scheduler busy wait
Conflicts:
erts/emulator/beam/erl_process.c
erts/emulator/beam/erl_process.h
|
|
rickard/sched-wakeup-other-r15b01/OTP-10033
Conflicts:
erts/emulator/beam/erl_process.c
erts/vsn.mk
|
|
|
|
|
|
|
|
|
|
|
|
User tags in a dynamic trace enabled VM are spread throughout the system
in the same way as seq_trace tokens. This is used by the file module
and various other modules to get hold of the tag from the user process
without changing the protocol.
|
|
Add probes to the virtual machine, except (mostly) the efile_drv.c
driver and other file I/O-related source files.
|
|
|
|
* rickard/barriers/OTP-9922:
Reduce thread progress read operations in handle_aux_work()
Misc memory barrier fixes
|
|
|
|
- Document barrier semantics
- Introduce ddrb suffix on atomic ops
- Barrier macros for both non-SMP and SMP case
- Make the thread progress API a bit more intuitive
|
|
* jz/erts-remove-unused-var:
erts: Remove unused variable
OTP-9926
|
|
* dgud/sched-work-time/OTP-9858:
emulator: Document and test scheduler_wall_time
Implement statistics(scheduler_wall_time)
|
|
|
|
Remove unused variable erts_system_monitor_msg_queue_len
|
|
It is possible also in non-SMP case:
1. The process receives an exit signal and is set in status exiting
and inserted into the run queue.
2. The distribution port exits before the process has been selected
for execution and cannot remove the link half on the process since
it is in status exiting.
3. Process is selected for execution and when removing this link half
the distribution channel is gone!
|
|
* rickard/generic-thr-queue/OTP-9632:
Give elements of lock-free queues some time to be deallocated
Fix cleanup of elements in lock-free queues
|
|
|
|
* rickard/generic-thr-queue/OTP-9632:
Fix handle_async_ready_clean()
|
|
|
|
* rickard/rm-common-runq/OTP-9727:
Remove common run-queue in SMP case
Fix scheduler suspend bug
Conflicts:
erts/emulator/beam/erl_init.c
|
|
The common run-queue implementation is removed since it is unused,
untested, undocumented, unsupported, and only complicates the code.
A spinlock used by the run-queue management sometimes got heavily
contended. This code has now been rewritten, and the spinlock
has been removed.
|
|
Calls to erlang:system_flag(schedulers_online, N) and/or
erlang:system_flag(multi_scheduling, block|unblock) could cause
internal data used by this functionality to get into an inconsistent
state. When this happened various problems occurred. This bug was
quite hard to trigger, so hopefully no-one has been effected by it.
|
|
There is a static variable called 'processes_busy' in erl_process.c,
which will be incremented but never used.
To confuse things, there is also a global variable with the same
name, but it it is only defined and used if BM_COUNTERS is defined.
In erl_process.c, the global version of 'processes_busy' will be
seen and incremented if BM_COUNTERS is defined.
Remove the static version of 'processes_busy' in erl_process.c,
but keep the global version if BM_COUNTERS is defined.
Noticed-by: Jovi Zhang
|
|
* sverk/hipe-without-fpe/OTP-9724:
otp_build: Disable FPE by default on Linux
stdlib: Make sure qlc_SUITE:otp_6964 restores backtrace_depth
erts: Add test for inf/NaN intermediate float results
hipe,erts: Allow hipe without floating point exceptions
hipe: Fix bug in hipe_rtl_lcm:calc_killed_expr_bb
erts: Rename macros used by float instructions without FPE
|
|
* rickard/sched-compact-load/OTP-9695:
Add switch that can disable scheduler compaction of load
|
|
|
|
|
|
* rickard/generic-thr-queue/OTP-9632:
Use generic lock-free queue for async threads
Use generic lock-free queue for misc aux work
Implement generic lock-free queue
|
|
* rickard/thr-progress-block/OTP-9631:
Replace system block with thread progress block
|
|
* rickard/alloc-opt/OTP-7775:
Optimize memory allocation
Conflicts:
erts/aclocal.m4
erts/emulator/hipe/hipe_bif_list.m4
erts/preloaded/ebin/erl_prim_loader.beam
erts/preloaded/ebin/erlang.beam
erts/preloaded/ebin/init.beam
erts/preloaded/ebin/otp_ring0.beam
erts/preloaded/ebin/prim_file.beam
erts/preloaded/ebin/prim_inet.beam
erts/preloaded/ebin/prim_zip.beam
erts/preloaded/ebin/zlib.beam
|
|
Queues used for communication between async threads and scheduler threads
have been replaced with lock-free queues.
Drivers using the driver_async functionality are not automatically locked
to the system anymore, and can be unloaded as any dynamically linked in
driver.
Scheduling of ready async jobs is now also interleaved in between other
jobs. Previously all ready async jobs was performed at once.
|
|
|
|
The ERTS internal system block functionality has been replaced by
new functionality for blocking the system. The old system block
functionality had contention issues and complexity issues. The
new functionality piggy-backs on thread progress tracking functionality
needed by newly introduced lock-free synchronization in the runtime
system. When the functionality for blocking the system isn't used
there is more or less no overhead at all. This since the functionality
for tracking thread progress is there and needed anyway.
|
|
A number of memory allocation optimizations have been implemented. Most
optimizations reduce contention caused by synchronization between
threads during allocation and deallocation of memory. Most notably:
* Synchronization of memory management in scheduler specific allocator
instances has been rewritten to use lock-free synchronization.
* Synchronization of memory management in scheduler specific
pre-allocators has been rewritten to use lock-free synchronization.
* The 'mseg_alloc' memory segment allocator now use scheduler specific
instances instead of one instance. Apart from reducing contention
this also ensures that memory allocators always create memory
segments on the local NUMA node on a NUMA system.
|
|
Conflicts:
erts/aclocal.m4
erts/emulator/beam/erl_db.c
erts/emulator/sys/win32/sys.c
erts/include/internal/ethread_header_config.h.in
|
|
* dev:
[erl_docgen] Missing header level in PDF bookmarks menu
Fixed version, release notes and appup in prep for release.
Fix lost wakeup of scheduler when enqueuing auxiliary work
|
|
* rickard/aux-work-bug/OTP-9567:
Fix lost wakeup of scheduler when enqueuing auxiliary work
|
|
When auxiliary work was enqueued on a scheduler, the wakeup of
the scheduler in order to handle this work could be lost. Wakeups
in order to handle ordinary work were not effected by this bug.
The bug only effected runtime systems with SMP support as follows:
* Deallocation of some ETS data structures could be delayed.
* On Linux systems not using the NPTL thread library (typically
ancient systems with kernel versions prior to 2.6) and Windows
systems, the {Port, {exit_status, Status}} message from a
terminating port program could be delayed. That is, it only
effected port programs which had been started by passing
exit_status as an option to open_port/2.
|
|
* dev:
Make sure we have a run_queue
|
|
* In rare cases we could have no run_queue in schedule misc ops
|
|
In the half-word emulator, smp emulator, and non-smp emulator
the X register and float register arrays were allocated in
different ways.
Always allocate the registers and store the pointers to the
allocated register arrays in the scheduler data.
|
|
All uses of the old deprecated atomic API in the runtime system
have been replaced with the use of the new atomic API. In a lot of
places this change imply a relaxation of memory barriers used.
|