Age | Commit message (Collapse) | Author |
|
|
|
Microstate accounting is a way to track which state the
different threads within ERTS are in. The main usage area
is to pin point performance bottlenecks by checking which
states the threads are in and then from there figuring out
why and where to optimize.
Since checking whether microstate accounting is on or off is
relatively expensive if done in a short loop only a few of the
states are enabled by default and more states can be enabled
through configure.
I've done some benchmarking and the overhead with it turned off
is not noticible and with it on it is a fraction of a percent.
If you enable the extra states, depending on the benchmark,
the ovehead when turned off is about 1% and when turned on
somewhere inbetween 5-15%.
OTP-12345
|
|
Lots of pthread platforms unnecessarily falled back on the pipe/select
solution. This since we tried to use the same monotonic clock source
for pthread_cond_timedwait() as used by OS monotonic time. This has
been fixed on most platforms by using another clock source.
Darwin can however not use pthread_cond_timedwait() with monotonic
clock source and has to use the pipe/select solution. On darwin we
now use select with _DARWIN_UNLIMITED_SELECT in order to be able to
handle a large amount of file descriptors.
|
|
|
|
* rickard/time_api/OTP-11997: (22 commits)
Update primary bootstrap
inets: Suppress deprecated warning on erlang:now/0
inets: Cleanup of multiple copies of functions Add inets_lib with common functions used by multiple modules
inets: Update comments
Suppress deprecated warning on erlang:now/0
Use new time API and be back-compatible in inets Remove unused functions and removed redundant test
asn1 test SUITE: Eliminate use of now/0
Disable deprecated warning on erlang:now/0 in diameter_lib
Use new time API and be back-compatible in ssh
Replace all calls to now/0 in CT with new time API functions
test_server: Replace usage of erlang:now() with usage of new API
Replace usage of erlang:now() with usage of new API
Replace usage of erlang:now() with usage of new API
Replace usage of erlang:now() with usage of new API
Replace usage of erlang:now() with usage of new API
otp_SUITE: Warn for calls to erlang:now/0
Replace usage of erlang:now() with usage of new API
Multiple timer wheels
Erlang based BIF timer implementation for scalability
Implement ethread events with timeout
...
Conflicts:
bootstrap/bin/start.boot
bootstrap/bin/start_clean.boot
bootstrap/lib/compiler/ebin/beam_asm.beam
bootstrap/lib/compiler/ebin/compile.beam
bootstrap/lib/kernel/ebin/auth.beam
bootstrap/lib/kernel/ebin/dist_util.beam
bootstrap/lib/kernel/ebin/global.beam
bootstrap/lib/kernel/ebin/hipe_unified_loader.beam
bootstrap/lib/kernel/ebin/inet_db.beam
bootstrap/lib/kernel/ebin/inet_dns.beam
bootstrap/lib/kernel/ebin/inet_res.beam
bootstrap/lib/kernel/ebin/os.beam
bootstrap/lib/kernel/ebin/pg2.beam
bootstrap/lib/stdlib/ebin/dets.beam
bootstrap/lib/stdlib/ebin/dets_utils.beam
bootstrap/lib/stdlib/ebin/erl_tar.beam
bootstrap/lib/stdlib/ebin/escript.beam
bootstrap/lib/stdlib/ebin/file_sorter.beam
bootstrap/lib/stdlib/ebin/otp_internal.beam
bootstrap/lib/stdlib/ebin/qlc.beam
bootstrap/lib/stdlib/ebin/random.beam
bootstrap/lib/stdlib/ebin/supervisor.beam
bootstrap/lib/stdlib/ebin/timer.beam
erts/aclocal.m4
erts/emulator/beam/bif.c
erts/emulator/beam/erl_bif_info.c
erts/emulator/beam/erl_db_hash.c
erts/emulator/beam/erl_init.c
erts/emulator/beam/erl_process.h
erts/emulator/beam/erl_thr_progress.c
erts/emulator/beam/utils.c
erts/emulator/sys/unix/sys.c
erts/preloaded/ebin/erlang.beam
erts/preloaded/ebin/erts_internal.beam
erts/preloaded/ebin/init.beam
erts/preloaded/src/erts_internal.erl
lib/common_test/test/ct_hooks_SUITE_data/cth/tests/empty_cth.erl
lib/diameter/src/base/diameter_lib.erl
lib/kernel/src/os.erl
lib/ssh/test/ssh_basic_SUITE.erl
system/doc/efficiency_guide/advanced.xml
|
|
|
|
* lukas/erts/crashdump_improvements/OTP-12377:
erts: Make main thread safe from pipe closed event
erts: Improve crash dumps
erts: Rename sys_sigset to sys_signal
erts: Introduce thread suspend functions
erts: Remove usage of QUANTIFY signal
erts: Add support for thread names
ets: Increase data available in crash dumps and ets:info
erts: Start compilation of beam_emu earlier
|
|
These functions allow any thread to suspend any other thread
immediately and then resume all threads. This is useful when
doing a crash dump in order to get a more accurate picture
of what state the system is in.
|
|
|
|
The 64-bit atomic ops API is implemented by
* native word size atomic ops on 64-bit architectures, and
* native double word size atomic ops on 32-bit architectures
when available. When native double word size atomic is not
available, the fallback using modification counters is
used.
|
|
This simplified debugging on OSE and also limits the number of ppdata
keys that are created when beam is restarted.
|
|
|
|
|
|
rickard/r16b/thread-queue-fix/OTP-10854
Conflicts:
erts/emulator/beam/erl_threads.h
|
|
|
|
|
|
|
|
|
|
- Document barrier semantics
- Introduce ddrb suffix on atomic ops
- Barrier macros for both non-SMP and SMP case
- Make the thread progress API a bit more intuitive
|
|
Windows native critical sections are now used internally in the
runtime system as mutex implementation. This since they perform
better under extreme contention than our own implementation.
|
|
A number of memory allocation optimizations have been implemented. Most
optimizations reduce contention caused by synchronization between
threads during allocation and deallocation of memory. Most notably:
* Synchronization of memory management in scheduler specific allocator
instances has been rewritten to use lock-free synchronization.
* Synchronization of memory management in scheduler specific
pre-allocators has been rewritten to use lock-free synchronization.
* The 'mseg_alloc' memory segment allocator now use scheduler specific
instances instead of one instance. Apart from reducing contention
this also ensures that memory allocators always create memory
segments on the local NUMA node on a NUMA system.
|
|
Conflicts:
erts/aclocal.m4
erts/emulator/beam/erl_db.c
erts/emulator/sys/win32/sys.c
erts/include/internal/ethread_header_config.h.in
|
|
All uses of the old deprecated atomic API in the runtime system
have been replaced with the use of the new atomic API. In a lot of
places this change imply a relaxation of memory barriers used.
|
|
The ethread atomics API now also provide double word size atomics.
Double word size atomics are implemented using native atomic
instructions on x86 (when the cmpxchg8b instruction is available)
and on x86_64 (when the cmpxchg16b instruction is available). On
other hardware where 32-bit atomics or word size atomics are
available, an optimized fallback is used; otherwise, a spinlock,
or a mutex based fallback is used.
The ethread library now performs runtime tests for presence of
hardware features, such as for example SSE2 instructions, instead
of requiring this to be determined at compile time.
There are now functions implementing each atomic operation with the
following implied memory barrier semantics: none, read, write,
acquire, release, and full. Some of the operation-barrier
combinations aren't especially useful. But instead of filtering
useful ones out, and potentially miss a useful one, we implement
them all.
A much smaller set of functionality for native atomics are required
to be implemented than before. More or less only cmpxchg and a
membar macro are required to be implemented for each atomic size.
Other functions will automatically be constructed from these. It is,
of course, often wise to implement more that this if possible from a
performance perspective.
|
|
|
|
* rickard/mtx-destroy-ebusy/OTP-9009:
Send warning instead of abort on EBUSY from pthread_mutex_destroy
|
|
Due to a bug in glibc the runtime system could abort
while trying to destroy a mutex. The runtime system
will now issue a warning instead of aborting.
|
|
|
|
|
|
|
|
|
|
|
|
Large parts of the ethread library have been rewritten. The
ethread library is an Erlang runtime system internal, portable
thread library used by the runtime system itself.
Most notable improvement is a reader optimized rwlock
implementation which dramatically improve the performance of
read-lock/read-unlock operations on multi processor systems by
avoiding ping-ponging of the rwlock cache lines. The reader
optimized rwlock implementation is used by miscellaneous
rwlocks in the runtime system that are known to be read-locked
frequently, and can be enabled on ETS tables by passing the
`{read_concurrency, true}' option upon table creation. See the
documentation of `ets:new/2' for more information.
The ethread library can now also use the libatomic_ops library
for atomic memory accesses. This makes it possible for the
Erlang runtime system to utilize optimized atomic operations
on more platforms than before. Use the
`--with-libatomic_ops=PATH' configure command line argument
when specifying where the libatomic_ops installation is
located. The libatomic_ops library can be downloaded from:
http://www.hpl.hp.com/research/linux/atomic_ops/
The changed API of the ethread library has also caused
modifications in the Erlang runtime system. Preparations for
the to come "delayed deallocation" feature has also been done
since it depends on the ethread library.
Note: When building for x86, the ethread library will now use
instructions that first appeared on the pentium 4 processor. If
you want the runtime system to be compatible with older
processors (back to 486) you need to pass the
`--enable-ethread-pre-pentium4-compatibility' configure command
line argument when configuring the system.
|
|
|
|
Missing memory barriers in erts_poll() could cause the runtime system to
hang indefinitely.
|
|
|