Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
|
|
Functionality for scheduling operations at thread progress later
has been introduced.
Deallocation of ETS table structures were previously done by scheduling
misc aux work. Deallocation of process structures (not released yet)
was also implemented this way. Instead of using the misc aux work
functionality these implementation now use the newly introduced
functionality for scheduling operations at thread progress later. By
using this new functionaliy we reduce the amount of memory
allocation/deallocation operations needed.
|
|
As an optimization old thread progress data was kept and used in
handle_aux_work() in erl_process.c. This could cause memory to be
deallocated at a later time than intended, which is quite harmless.
This has, however, now been fixed.
|
|
|
|
- Document barrier semantics
- Introduce ddrb suffix on atomic ops
- Barrier macros for both non-SMP and SMP case
- Make the thread progress API a bit more intuitive
|
|
The ERTS internal system block functionality has been replaced by
new functionality for blocking the system. The old system block
functionality had contention issues and complexity issues. The
new functionality piggy-backs on thread progress tracking functionality
needed by newly introduced lock-free synchronization in the runtime
system. When the functionality for blocking the system isn't used
there is more or less no overhead at all. This since the functionality
for tracking thread progress is there and needed anyway.
|
|
A number of memory allocation optimizations have been implemented. Most
optimizations reduce contention caused by synchronization between
threads during allocation and deallocation of memory. Most notably:
* Synchronization of memory management in scheduler specific allocator
instances has been rewritten to use lock-free synchronization.
* Synchronization of memory management in scheduler specific
pre-allocators has been rewritten to use lock-free synchronization.
* The 'mseg_alloc' memory segment allocator now use scheduler specific
instances instead of one instance. Apart from reducing contention
this also ensures that memory allocators always create memory
segments on the local NUMA node on a NUMA system.
|