Age | Commit message (Collapse) | Author |
|
* rickard/sched-busy-wait/OTP-10044:
Add switch controlling scheduler busy wait
Conflicts:
erts/emulator/beam/erl_process.c
erts/emulator/beam/erl_process.h
|
|
rickard/sched-wakeup-other-r15b01/OTP-10033
Conflicts:
erts/emulator/beam/erl_process.c
erts/vsn.mk
|
|
|
|
|
|
|
|
|
|
Conflicts:
erts/doc/src/erlang.xml
erts/emulator/beam/erl_process.c
erts/emulator/beam/erl_process.h
erts/emulator/test/bif_SUITE.erl
erts/preloaded/ebin/erlang.beam
erts/preloaded/src/erlang.erl
lib/hipe/cerl/erl_bif_types.erl
|
|
|
|
|
|
Rename lock_code_ix as seize_code_write_permission. Don't want to call
it a "lock" as it can be held between schedulings and different threads
and is not managed by lock checker.
Rename "activate" staging as "commit" staging. Why not be consistent
and use git terminology all the way.
|
|
Activation of staged code is scheduled for a later moment when
all schedulers have done a full memory barrier. This allow
them to read active code index while executing without any
memory barriers at all.
|
|
|
|
Staging is a better and more general name as does not necessary need
to involve code loading (can be deletion, tracing, etc).
|
|
As it can only be used at initialization for preloading
|
|
Still blocking code loading
|
|
Code loading still blocking
|
|
* rickard/rm-common-runq/OTP-9727:
Remove common run-queue in SMP case
Fix scheduler suspend bug
Conflicts:
erts/emulator/beam/erl_init.c
|
|
The common run-queue implementation is removed since it is unused,
untested, undocumented, unsupported, and only complicates the code.
A spinlock used by the run-queue management sometimes got heavily
contended. This code has now been rewritten, and the spinlock
has been removed.
|
|
* rickard/sched-compact-load/OTP-9695:
Add switch that can disable scheduler compaction of load
|
|
|
|
There is no reason to have erts_load_module() return integer values,
only to have the caller convert the values to atoms. Return the
appropriate atom directly from the place where the error is generated
instead. Return NIL if the module was successfully loaded.
|
|
The -l option used to print information about modules being loaded,
but now it prints very little information.
|
|
* rickard/multi-exit-bug/OTP-9705:
Make sure only one thread exits the runtime system
|
|
* rickard/generic-thr-queue/OTP-9632:
Use generic lock-free queue for async threads
Use generic lock-free queue for misc aux work
Implement generic lock-free queue
|
|
* rickard/thr-progress-block/OTP-9631:
Replace system block with thread progress block
|
|
* rickard/alloc-opt/OTP-7775:
Optimize memory allocation
Conflicts:
erts/aclocal.m4
erts/emulator/hipe/hipe_bif_list.m4
erts/preloaded/ebin/erl_prim_loader.beam
erts/preloaded/ebin/erlang.beam
erts/preloaded/ebin/init.beam
erts/preloaded/ebin/otp_ring0.beam
erts/preloaded/ebin/prim_file.beam
erts/preloaded/ebin/prim_inet.beam
erts/preloaded/ebin/prim_zip.beam
erts/preloaded/ebin/zlib.beam
|
|
The runtime system crashed if more than one thread tried to exit
the runtime system at the same time.
|
|
Queues used for communication between async threads and scheduler threads
have been replaced with lock-free queues.
Drivers using the driver_async functionality are not automatically locked
to the system anymore, and can be unloaded as any dynamically linked in
driver.
Scheduling of ready async jobs is now also interleaved in between other
jobs. Previously all ready async jobs was performed at once.
|
|
The implementation of an ERTS internal, generic, many to one, lock-free
queue for communication between threads. The many to one scenario is
very common in ERTS, so it can be used in a lot of places in the future.
Changing to this queue from a lock based queue, however, often requires
some redesigning. This since we have often used the lock of the queue
to protect other information too.
|
|
The ERTS internal system block functionality has been replaced by
new functionality for blocking the system. The old system block
functionality had contention issues and complexity issues. The
new functionality piggy-backs on thread progress tracking functionality
needed by newly introduced lock-free synchronization in the runtime
system. When the functionality for blocking the system isn't used
there is more or less no overhead at all. This since the functionality
for tracking thread progress is there and needed anyway.
|
|
A number of memory allocation optimizations have been implemented. Most
optimizations reduce contention caused by synchronization between
threads during allocation and deallocation of memory. Most notably:
* Synchronization of memory management in scheduler specific allocator
instances has been rewritten to use lock-free synchronization.
* Synchronization of memory management in scheduler specific
pre-allocators has been rewritten to use lock-free synchronization.
* The 'mseg_alloc' memory segment allocator now use scheduler specific
instances instead of one instance. Apart from reducing contention
this also ensures that memory allocators always create memory
segments on the local NUMA node on a NUMA system.
|
|
Conflicts:
erts/aclocal.m4
erts/emulator/beam/erl_db.c
erts/emulator/sys/win32/sys.c
erts/include/internal/ethread_header_config.h.in
|
|
|
|
* dev:
Move init of smp rw mutex from init to sys_args to make sure that it is initialized before the first erts_sys_getenv call
Move erts_sys_env_init() to erts_sys_pre_init()
Remove _DEBUG from start_erl.c
Spelling correction in erlsrv doc
Convert windows start_erl to take rootdir on command line
Add command start_disabled to erlsrv
Add global lock for erlsrv to avoid races
Change start order so that service_event gets initialized before it's used
Conflicts:
erts/emulator/sys/win32/sys.c
|
|
|
|
All uses of the old deprecated atomic API in the runtime system
have been replaced with the use of the new atomic API. In a lot of
places this change imply a relaxation of memory barriers used.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The compressed format is using a slighty modified variant of the extern format
(term_to_binary). To not worsen key lookup's too much, the top tuple itself
and the key element are not compressed. Table objects with only immediate
non-key elements will therefor not gain anything (but actually consume one
extra word for "alloc_size").
|
|
* rickard/cpu-groups/OTP-8861:
Generalize reader groups
Move cpu topology functionality into erl_cpu_topology.[ch]
Do not use more reader groups for schedulers than schedulers
Conflicts:
erts/emulator/beam/erl_init.c
|
|
|
|
When the runtime system had fewer schedulers than logical processors,
the system could get an unnecessarily large amount reader groups.
|
|
Id: OTP-8912
This patch creates a new family of flags with the "+z" prefix. It
further creates a new configuration option called "dbbl" (which is the
first letter of the name dist_buf_busy_limit). Example usage of this
flag would be "+zdbbl 1048576".
This patch creates an adjustable buffer limit for the amount of data
that may be buffered by the erlang distribution code (in dist.c
specifically). Before this patch, this hard-coded constant was used:
#define ERTS_DE_BUSY_LIMIT (128*1024)
When large binaries are transmitted between nodes (or simply a lot of
medium-sized binaries), it is very easy to hit the old 128KB limit.
Processes that use the erlang:system_monitor() BIF to monitor system
events can be spammed by {monitor, busy_dist_port, ...} message tuples
at rates of tens to even hundreds of messages/second.
A larger buffer limit will allow processes to buffer more outgoing
messages over the distribution. When the buffer limit has been
reached, sending processes will be suspended until the buffer size has
shrunk. The buffer limit is per distribution channel. A higher limit
will give lower latency and higher throughput at the expense of
higher memory usage.
A variation of this patch has been in commercial production use in at
least two companies that the author is aware of. Larger buffer values
can reduce the number of {monitor, busy_dist_port, ...} system
messages drastically, lower overall messaging latencies, and prevent
false timeouts and 'nodedown' messages in extremely busy Mnesia systems.
Test suite: there are two tests:
a. In erlexec_SUITE.erl to test basic set & get of the value
b. In distribution_SUITE.erl, to verify that setting +zdbbl very
low will actually change behavior.
|
|
The scheduler wakeup threshold is now possible to adjust at system boot.
For more information see the `+swt' command line argument of `erl'.
|
|
Large parts of the ethread library have been rewritten. The
ethread library is an Erlang runtime system internal, portable
thread library used by the runtime system itself.
Most notable improvement is a reader optimized rwlock
implementation which dramatically improve the performance of
read-lock/read-unlock operations on multi processor systems by
avoiding ping-ponging of the rwlock cache lines. The reader
optimized rwlock implementation is used by miscellaneous
rwlocks in the runtime system that are known to be read-locked
frequently, and can be enabled on ETS tables by passing the
`{read_concurrency, true}' option upon table creation. See the
documentation of `ets:new/2' for more information.
The ethread library can now also use the libatomic_ops library
for atomic memory accesses. This makes it possible for the
Erlang runtime system to utilize optimized atomic operations
on more platforms than before. Use the
`--with-libatomic_ops=PATH' configure command line argument
when specifying where the libatomic_ops installation is
located. The libatomic_ops library can be downloaded from:
http://www.hpl.hp.com/research/linux/atomic_ops/
The changed API of the ethread library has also caused
modifications in the Erlang runtime system. Preparations for
the to come "delayed deallocation" feature has also been done
since it depends on the ethread library.
Note: When building for x86, the ethread library will now use
instructions that first appeared on the pentium 4 processor. If
you want the runtime system to be compatible with older
processors (back to 486) you need to pass the
`--enable-ethread-pre-pentium4-compatibility' configure command
line argument when configuring the system.
|
|
|