Age | Commit message (Collapse) | Author |
|
A number of memory allocation optimizations have been implemented. Most
optimizations reduce contention caused by synchronization between
threads during allocation and deallocation of memory. Most notably:
* Synchronization of memory management in scheduler specific allocator
instances has been rewritten to use lock-free synchronization.
* Synchronization of memory management in scheduler specific
pre-allocators has been rewritten to use lock-free synchronization.
* The 'mseg_alloc' memory segment allocator now use scheduler specific
instances instead of one instance. Apart from reducing contention
this also ensures that memory allocators always create memory
segments on the local NUMA node on a NUMA system.
|
|
Large parts of the ethread library have been rewritten. The
ethread library is an Erlang runtime system internal, portable
thread library used by the runtime system itself.
Most notable improvement is a reader optimized rwlock
implementation which dramatically improve the performance of
read-lock/read-unlock operations on multi processor systems by
avoiding ping-ponging of the rwlock cache lines. The reader
optimized rwlock implementation is used by miscellaneous
rwlocks in the runtime system that are known to be read-locked
frequently, and can be enabled on ETS tables by passing the
`{read_concurrency, true}' option upon table creation. See the
documentation of `ets:new/2' for more information.
The ethread library can now also use the libatomic_ops library
for atomic memory accesses. This makes it possible for the
Erlang runtime system to utilize optimized atomic operations
on more platforms than before. Use the
`--with-libatomic_ops=PATH' configure command line argument
when specifying where the libatomic_ops installation is
located. The libatomic_ops library can be downloaded from:
http://www.hpl.hp.com/research/linux/atomic_ops/
The changed API of the ethread library has also caused
modifications in the Erlang runtime system. Preparations for
the to come "delayed deallocation" feature has also been done
since it depends on the ethread library.
Note: When building for x86, the ethread library will now use
instructions that first appeared on the pentium 4 processor. If
you want the runtime system to be compatible with older
processors (back to 486) you need to pass the
`--enable-ethread-pre-pentium4-compatibility' configure command
line argument when configuring the system.
|
|
* pan/otp_8332_halfword:
Teach testcase in driver_suite the new prototype for driver_async
wx: Correct usage of driver callbacks from wx thread
Adopt the new (R13B04) Nif functionality to the halfword codebase
Support monitoring and demonitoring from driver threads
Fix further test-suite problems
Correct the VM to work for more test suites
Teach {wordsize,internal|external} to system_info/1
Make tracing and distribution work
Turn on instruction packing in the loader and virtual machine
Add the BeamInstr data type for loaded BEAM code
Fix the BEAM dissambler for the half-word emulator
Store pointers to heap data in 32-bit words
Add a custom mmap wrapper to force heaps into the lower address range
Fit all heap data into the 32-bit address range
|
|
Store Erlang terms in 32-bit entities on the heap, expanding the
pointers to 64-bit when needed. This works because all terms are stored
on addresses in the 32-bit address range (the 32 most significant bits
of pointers to term data are always 0).
Introduce a new datatype called UWord (along with its companion SWord),
which is an integer having the exact same size as the machine word
(a void *), but might be larger than Eterm/Uint.
Store code as machine words, as the instructions are pointers to
executable code which might reside outside the 32-bit address range.
Continuation pointers are stored on the 32-bit stack and hence must
point to addresses in the low range, which means that loaded beam code
much be placed in the low 32-bit address range (but, as said earlier,
the instructions themselves are full words).
No Erlang term data can be stored on C stacks (enforced by an
earlier commit).
This version gives a prompt, but test cases still fail (and dump core).
The loader (and emulator loop) has instruction packing disabled.
The main issues has been in rewriting loader and actual virtual
machine. Subsystems (like distribution) does not work yet.
|
|
|