Age | Commit message | Author |
|
* henrik/update-copyrightyear:
update copyright-year
|
|
* carrier_create
* carrier_destroy
* carrier_pool_put
* carrier_pool_get
|
Microstate accounting is a way to track which state the
different threads within ERTS are in. The main usage area
is to pinpoint performance bottlenecks by checking which
states the threads are in and then from there figuring out
why and where to optimize.
Since checking whether microstate accounting is on or off is
relatively expensive if done in a short loop, only a few of the
states are enabled by default and more states can be enabled
through configure.
I've done some benchmarking and the overhead with it turned off
is not noticeable and with it on it is a fraction of a percent.
If you enable the extra states, depending on the benchmark,
the overhead when turned off is about 1% and when turned on
somewhere between 5% and 15%.
OTP-12345
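The accounting idea described above can be sketched in C as follows. This is only an illustration, not the ERTS implementation: the state names, the `msacc` struct and all function names are hypothetical, and the caller is assumed to supply a monotonic timestamp in nanoseconds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state names; the commit message does not list the real ones. */
enum msacc_state {
    MSACC_OTHER,
    MSACC_EMULATOR,
    MSACC_GC,
    MSACC_SLEEP,
    MSACC_NSTATES
};

/* One instance per thread, so updates need no locking. */
struct msacc {
    int enabled;                      /* tested on every state change */
    enum msacc_state current;         /* state the thread is in right now */
    uint64_t entered_ns;              /* when `current` was entered */
    uint64_t total_ns[MSACC_NSTATES]; /* accumulated time per state */
};

static void msacc_init(struct msacc *acc)
{
    memset(acc, 0, sizeof *acc);      /* disabled, all counters zero */
}

/* Turn accounting on; the caller says which state the thread is in. */
static void msacc_enable(struct msacc *acc, enum msacc_state state,
                         uint64_t now_ns)
{
    assert(state < MSACC_NSTATES);
    acc->enabled = 1;
    acc->current = state;
    acc->entered_ns = now_ns;
}

/* Called at every instrumented state boundary. */
static void msacc_set_state(struct msacc *acc, enum msacc_state next,
                            uint64_t now_ns)
{
    assert(next < MSACC_NSTATES);
    if (!acc->enabled)
        return;  /* the on/off check whose cost matters in tight loops */
    acc->total_ns[acc->current] += now_ns - acc->entered_ns;
    acc->current = next;
    acc->entered_ns = now_ns;
}
```

Reading `total_ns` per thread then shows where the time went. Every extra instrumented boundary adds one more `enabled` check, which is the cost trade-off the message describes for the extra states.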
|
|
* The youngest generation of the heap can now consist of multiple
blocks. Heap fragments and message fragments are added to the
youngest generation when needed without triggering a GC. After
a GC the youngest generation is contained in one single block.
* The off_heap_message_queue process flag has been added. When
enabled all message data in the queue is kept off heap. When
a message is selected from the queue, the message fragment (or
heap fragment) containing the actual message is attached to the
youngest generation. Messages stored off heap are not part of GC.
|
rickard/aligned-sys_alloc-carriers_maint/OTP-11318
Conflicts:
erts/emulator/beam/erl_alloc.c
erts/emulator/beam/erl_alloc_util.c
erts/emulator/beam/erl_alloc_util.h
|
|
erts_sys_aligned_alloc() is currently implemented using posix_memalign if
it exists, or using _aligned_malloc on Windows.
If erts_sys_aligned_alloc() exists, allocators will create sys_alloc
carriers similar to how this was done pre-R16.
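A minimal sketch of the platform split described above. The real erts_sys_aligned_alloc() signature is not shown in this message, so `sketch_aligned_alloc` and `sketch_aligned_free` are hypothetical names and the argument order is an assumption.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>               /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical wrapper. `alignment` must be a power of two that is at
 * least sizeof(void *), which is what posix_memalign requires. */
static void *sketch_aligned_alloc(size_t alignment, size_t size)
{
    assert(alignment >= sizeof(void *));
    assert((alignment & (alignment - 1)) == 0);
#ifdef _WIN32
    return _aligned_malloc(size, alignment);  /* size first on Windows */
#else
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, size) != 0)
        return NULL;                          /* reports errors by return value */
    return ptr;
#endif
}

static void sketch_aligned_free(void *ptr)
{
#ifdef _WIN32
    _aligned_free(ptr);  /* memory from _aligned_malloc needs _aligned_free */
#else
    free(ptr);
#endif
}
```

Note that the two platforms differ in both the argument order and the matching free function, which is one reason to hide them behind a single wrapper.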
|
Some query functions in erl_alloc_util.c lock the allocator mutex
and then use erts_printf that in turn may call the sys allocator
through the wrappers. To avoid breaking the locking order, these
query functions first "pre-lock" all allocator wrappers.
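The pre-locking idea above can be illustrated with pthreads. This is a sketch under assumptions, not the erl_alloc_util.c code: it assumes the established order is wrapper locks before the allocator mutex, and the per-thread `wrappers_prelocked` flag, the wrapper count and all function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

#define N_WRAPPERS 2  /* hypothetical number of allocator wrappers */

static pthread_mutex_t wrapper_mtx[N_WRAPPERS] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};
static pthread_mutex_t allocator_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Assumed per-thread flag: set while this thread holds every wrapper lock. */
static _Thread_local int wrappers_prelocked;

/* Simulated sys allocator calls, counted per wrapper. */
static int sys_calls[N_WRAPPERS];

/* Stand-in for a sys allocator call routed through wrapper `i`,
 * for example one made from inside a printf-style function. */
static void wrapped_sys_call(int i)
{
    assert(i >= 0 && i < N_WRAPPERS);
    if (!wrappers_prelocked)
        pthread_mutex_lock(&wrapper_mtx[i]);
    sys_calls[i]++;
    if (!wrappers_prelocked)
        pthread_mutex_unlock(&wrapper_mtx[i]);
}

static void prelock_wrappers(void)
{
    for (int i = 0; i < N_WRAPPERS; i++)  /* always in the same order */
        pthread_mutex_lock(&wrapper_mtx[i]);
    wrappers_prelocked = 1;
}

static void unlock_wrappers(void)
{
    wrappers_prelocked = 0;
    for (int i = N_WRAPPERS - 1; i >= 0; i--)
        pthread_mutex_unlock(&wrapper_mtx[i]);
}

/* Query function: wrappers first, allocator mutex second. */
static int query_allocator(void)
{
    int seen;
    prelock_wrappers();
    pthread_mutex_lock(&allocator_mtx);
    wrapped_sys_call(0);  /* safe: wrapper 0 is already held by this thread */
    seen = sys_calls[0];
    pthread_mutex_unlock(&allocator_mtx);
    unlock_wrappers();
    return seen;
}
```

Without the pre-lock, the call made while holding `allocator_mtx` would take a wrapper lock after the allocator mutex, which is the inverted order the message wants to avoid.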
|
- Document barrier semantics
- Introduce ddrb suffix on atomic ops
- Barrier macros for both non-SMP and SMP case
- Make the thread progress API a bit more intuitive
|
|
Almost all uses of the 'long' datatype are removed from the VM and tests
Emulator test now runs w/o drivers crashing
Nasty abs bug fixed in VM as well as type errors in allocator debug functions
Still one allocator test that fails, domain knowledge is needed to fix that.
Fix type inconsistency in beam_load causing crashes
|
|
A number of memory allocation optimizations have been implemented. Most
optimizations reduce contention caused by synchronization between
threads during allocation and deallocation of memory. Most notably:
* Synchronization of memory management in scheduler specific allocator
instances has been rewritten to use lock-free synchronization.
* Synchronization of memory management in scheduler specific
pre-allocators has been rewritten to use lock-free synchronization.
* The 'mseg_alloc' memory segment allocator now uses scheduler specific
instances instead of one instance. Apart from reducing contention
this also ensures that memory allocators always create memory
segments on the local NUMA node on a NUMA system.
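To illustrate what lock-free hand-over of freed memory to a scheduler specific instance can look like, here is a small C11 sketch. It is not the algorithm used in ERTS; it swaps in a simple multi-producer stack that the owning thread drains in one step, and `freed_block`, `dealloc_queue`, `dq_push` and `dq_take_all` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* A freed block reuses its own memory as the list link. */
struct freed_block {
    struct freed_block *next;
};

/* One queue per allocator instance. */
struct dealloc_queue {
    _Atomic(struct freed_block *) head;
};

static void dq_init(struct dealloc_queue *q)
{
    atomic_init(&q->head, (struct freed_block *)NULL);
}

/* Any thread: hand a block back to the instance that owns it. */
static void dq_push(struct dealloc_queue *q, struct freed_block *blk)
{
    struct freed_block *old;
    assert(blk != NULL);
    old = atomic_load_explicit(&q->head, memory_order_relaxed);
    do {
        blk->next = old;  /* `old` is refreshed when the CAS fails */
    } while (!atomic_compare_exchange_weak_explicit(
                 &q->head, &old, blk,
                 memory_order_release, memory_order_relaxed));
}

/* Owner thread only: detach everything queued so far in one atomic step.
 * The result is a singly linked list, most recently pushed block first. */
static struct freed_block *dq_take_all(struct dealloc_queue *q)
{
    return atomic_exchange_explicit(&q->head, (struct freed_block *)NULL,
                                    memory_order_acquire);
}
```

Because the owner never pops single elements with a compare-and-swap, this particular shape avoids the ABA problem, and no thread ever waits on a mutex held by another.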
|
alloc_no of sbmbc_low_alloc was set to ERTS_ALC_A_STANDARD_LOW
|
|
* sverker/valgrind-new-suppressions:
Make halfword emulator with valgrind target allocate low memory
Add erts_alloc_permanent_cache_aligned to suppress valgrind
|
|
Ease the valgrind suppression of memory that is permanently
allocated and then aligned up to a cache line.
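A sketch of the "allocate permanently, then align up" pattern referred to above. The body of erts_alloc_permanent_cache_aligned is not shown here, so the function name, the use of malloc and the 64-byte line size below are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_CACHE_LINE_SIZE 64u  /* assumed; platform dependent in practice */

/* Hypothetical helper: returns cache line aligned memory that is never freed. */
static void *sketch_alloc_permanent_cache_aligned(size_t size)
{
    unsigned char *raw;
    uintptr_t addr, aligned;

    assert(size > 0);
    /* Over-allocate so an aligned address always fits inside the block. */
    raw = malloc(size + SKETCH_CACHE_LINE_SIZE - 1);
    if (raw == NULL)
        return NULL;

    addr = (uintptr_t)raw;
    aligned = (addr + SKETCH_CACHE_LINE_SIZE - 1)
              & ~(uintptr_t)(SKETCH_CACHE_LINE_SIZE - 1);

    /* Only this interior pointer survives; `raw` itself is dropped, which
     * is the kind of block a leak checker tends to report. */
    return raw + (aligned - addr);
}
```

Routing every such allocation through one function presumably means a single suppression entry can match them all by call stack.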
|
* pan/otp_8332_halfword:
Teach testcase in driver_suite the new prototype for driver_async
wx: Correct usage of driver callbacks from wx thread
Adapt the new (R13B04) Nif functionality to the halfword codebase
Support monitoring and demonitoring from driver threads
Fix further test-suite problems
Correct the VM to work for more test suites
Teach {wordsize,internal|external} to system_info/1
Make tracing and distribution work
Turn on instruction packing in the loader and virtual machine
Add the BeamInstr data type for loaded BEAM code
Fix the BEAM disassembler for the half-word emulator
Store pointers to heap data in 32-bit words
Add a custom mmap wrapper to force heaps into the lower address range
Fit all heap data into the 32-bit address range
|
|
Fix safe_mul in the loader, which caused failures in the bit
syntax test cases.
Fix yet another Uint in erl_alloc.h (ERTS_CACHE_LINE_SIZE) causing
segmentation fault when we have many schedulers (why only in that
situation?).
Clean up erl_mseg (remove old code for the Linux 32-bit mmap flag).
While at it, also remove compilation warnings.
|
Store Erlang terms in 32-bit entities on the heap, expanding the
pointers to 64-bit when needed. This works because all terms are stored
on addresses in the 32-bit address range (the 32 most significant bits
of pointers to term data are always 0).
Introduce a new datatype called UWord (along with its companion SWord),
which is an integer having the exact same size as the machine word
(a void *), but might be larger than Eterm/Uint.
Store code as machine words, as the instructions are pointers to
executable code which might reside outside the 32-bit address range.
Continuation pointers are stored on the 32-bit stack and hence must
point to addresses in the low range, which means that loaded beam code
must be placed in the low 32-bit address range (but, as said earlier,
the instructions themselves are full words).
No Erlang term data can be stored on C stacks (enforced by an
earlier commit).
This version gives a prompt, but test cases still fail (and dump core).
The loader (and emulator loop) has instruction packing disabled.
The main issues have been in rewriting the loader and the actual virtual
machine. Subsystems (like distribution) do not work yet.
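The pointer handling described in this message can be sketched as below. The type and function names (`HalfTerm`, `UWordSketch`, `compress_ptr`, `expand_ptr`) are hypothetical stand-ins, and term tagging is left out entirely; the sketch only shows why keeping all term data in the low 32-bit range makes the round trip lossless.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t  HalfTerm;     /* hypothetical 32-bit heap word */
typedef uintptr_t UWordSketch;  /* same size as a machine word (a void *) */

/* Store a pointer in a 32-bit heap word. Only valid for addresses in
 * the low 32-bit range, where the upper 32 bits are always 0. */
static HalfTerm compress_ptr(const void *p)
{
    UWordSketch w = (UWordSketch)p;
    assert(((uint64_t)w >> 32) == 0);  /* must live in the low range */
    return (HalfTerm)w;
}

/* Expand a 32-bit heap word back to a full machine pointer. */
static void *expand_ptr(HalfTerm h)
{
    return (void *)(UWordSketch)h;     /* zero-extends to a full word */
}
```

Code pointers, by contrast, may point anywhere in the 64-bit address space, which is why the message stores instructions as full machine words rather than as 32-bit entities.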
|