Age | Commit message | Author |
|
* rickard/gcc-atomics/OTP-12383:
Improve ethread atomics based on GCC builtins
Conflicts:
erts/aclocal.m4
|
|
* Use of __atomic builtins when available.
* Improved configure test that checks for missing memory
barrier in __sync_synchronize(). The old approach was to
verify known working gcc versions and check gcc version at
compile time. Besides not being very safe, the old approach
often unnecessarily caused usage of the very expensive
workaround.
* Introduced (no overhead) workaround for missing clobber in
__sync_synchronize() when using buggy LLVM implementation of
__sync_synchronize().
* Implement native memory barriers for ARM processors supporting
the DMB instruction.
* Use of volatile store on Alpha as atomic set operation if no
__atomic_store_n() is available (already used on x86/x86_64,
Sparc V9, PowerPC, and MIPS). Fallback used when not using
volatile store is typically very expensive.
* Use volatile load on Alpha and ARM as atomic read operation
if no __atomic_load_n() is available (already used on
x86/x86_64, Sparc V9, PowerPC, and MIPS). Fallback when not
using volatile load is typically very expensive.
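The load/store selection described in the list above can be sketched roughly as follows. This is a minimal illustration, not the ethread code: the `sketch_*` names and the `SKETCH_HAVE_ATOMIC_BUILTINS` macro are hypothetical, and the volatile branch assumes a target where aligned 32-bit loads and stores are single, indivisible accesses.

```c
#include <stdint.h>

/* Hypothetical feature switch: GCC and Clang predefine __ATOMIC_RELAXED
 * when the __atomic builtins are available. */
#if defined(__GNUC__) && defined(__ATOMIC_RELAXED)
#  define SKETCH_HAVE_ATOMIC_BUILTINS 1
#else
#  define SKETCH_HAVE_ATOMIC_BUILTINS 0
#endif

/* Atomic read without any implied memory barrier. */
static inline int32_t sketch_atomic_read(int32_t *var)
{
#if SKETCH_HAVE_ATOMIC_BUILTINS
    return __atomic_load_n(var, __ATOMIC_RELAXED);
#else
    /* Assumption: an aligned 32-bit volatile load is atomic here. */
    return *(volatile int32_t *) var;
#endif
}

/* Atomic set without any implied memory barrier. */
static inline void sketch_atomic_set(int32_t *var, int32_t val)
{
#if SKETCH_HAVE_ATOMIC_BUILTINS
    __atomic_store_n(var, val, __ATOMIC_RELAXED);
#else
    /* Assumption: an aligned 32-bit volatile store is atomic here. */
    *(volatile int32_t *) var = val;
#endif
}
```

Neither branch implies a memory barrier; any ordering guarantees would have to be added separately.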
|
|
These functions allow any thread to suspend any other thread
immediately and then resume all threads. This is useful when
doing a crash dump in order to get a more accurate picture
of what state the system is in.
|
|
|
|
Use AO_fetch_compare_and_swap*() when present
|
|
The commit adb5dc0090bc419e2c4c1250653badbddeb6263b (ETHR_FORCE_INLINE)
broke some platforms without adequate thread support.
|
|
* jjhoo/mingw_compile_fix_forceinline/OTP-11945:
Fix redefinition of ETHR_FORCE_INLINE
|
|
|
|
* jjhoo/mingw_compile_fix_forceinline/OTP-11945:
Do not use __always_inline__ attribute unless gcc vsn >= 3.1.1
Add ETHR_FORCE_INLINE define to hide compiler specific directives
|
|
|
|
|
|
Some win32 specific code does not compile with gcc (mingw-w64) since
'__forceinline' is not supported by gcc. This can be avoided by
defining a new macro ETHR_FORCE_INLINE similar to ETHR_INLINE.
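A macro of the kind described above might look roughly like this. It is a sketch under assumptions: `SKETCH_FORCE_INLINE` and `sketch_add_one` are hypothetical names, and the version test mirrors the "gcc vsn >= 3.1.1" condition mentioned in the related commit rather than reproducing the actual ethread definition.

```c
/* Pick a compiler-specific "always inline" spelling, falling back to
 * plain inline when none is known. SKETCH_FORCE_INLINE is hypothetical. */
#if defined(_MSC_VER)
#  define SKETCH_FORCE_INLINE __forceinline
#elif defined(__GNUC__) && \
      (__GNUC__ > 3 || \
       (__GNUC__ == 3 && (__GNUC_MINOR__ > 1 || \
                          (__GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ >= 1))))
   /* gcc >= 3.1.1, which includes mingw-w64 */
#  define SKETCH_FORCE_INLINE __inline__ __attribute__((__always_inline__))
#else
#  define SKETCH_FORCE_INLINE inline
#endif

/* Callers use the macro and never spell out a compiler directive. */
static SKETCH_FORCE_INLINE int sketch_add_one(int x)
{
    return x + 1;
}
```

Because mingw-w64 defines `__GNUC__` but not `_MSC_VER`, win32 code using such a macro would take the gcc branch instead of hitting `__forceinline`.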
|
|
* lukas/ose/master/OTP-11334: (71 commits)
erts: Fix unix efile assert
ose: Use -O2 when building
ose: Expand OSE docs
ose: Add dummy ttsl driver
ose: Cleanup cleanup of mutex selection defines
ose: Polish mmap configure checks
ose: Add ose specific x-compile flags
ose: Updating fd_driver and spawn_driver for OSE
ose: Updating event and signal API for OSE
ose: Cleanup of mutex selection defines
win32: Compile erl_log.exe
ose: Remove uneccesary define
ose: Fix ssl configure test for osx
erts: Fix sys_msg_dispatcher assert
ose: Fix broken doc links
ose: Thread priorities configurable from lmconf
ose: Yielding the cpu is done "the OSE" way
ose: Start using ppdata for tse key
ose: Do not use spinlocks on OSE
ose: Fix support for crypto
...
Conflicts:
lib/crypto/c_src/crypto.c
|
|
|
|
|
|
The pattern used for getting the priority from the lmconf
is based on the name of the process created. The pattern is:
ERTS_%%PROCESS_NAME%%_PRIO
with the %%PROCESS_NAME%% replaced by the prefix of the process
the priority applies to, e.g.:
ERTS_SCHEDULER_PRIO=24
applies to processes with name SCHEDULER_1, SCHEDULER_2 etc.
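As an illustration of the naming pattern above, the key for a given process could be derived as follows. The helper `sketch_prio_key` is hypothetical, and it assumes that the "prefix" is the process name with a trailing `_<digits>` suffix removed; the actual lookup code may differ.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Build "ERTS_<PREFIX>_PRIO" from a process name such as "SCHEDULER_1".
 * Returns 0 on success and -1 if the buffer is too small. */
static int sketch_prio_key(const char *proc_name, char *buf, size_t buf_size)
{
    size_t len = strlen(proc_name);
    size_t end = len;
    int n;

    /* Step back over any trailing digits. */
    while (end > 0 && isdigit((unsigned char) proc_name[end - 1]))
        end--;

    /* Drop the suffix only if the digits follow an underscore;
     * otherwise keep the whole name as the prefix. */
    if (end < len && end > 0 && proc_name[end - 1] == '_')
        end--;
    else
        end = len;

    n = snprintf(buf, buf_size, "ERTS_%.*s_PRIO", (int) end, proc_name);
    return (n >= 0 && (size_t) n < buf_size) ? 0 : -1;
}
```

With this sketch, both `SCHEDULER_1` and `SCHEDULER_2` map to the key `ERTS_SCHEDULER_PRIO`.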
|
|
|
|
|
|
This is because it is very easy to deadlock/livelock between
processes on OSE.
|
|
This simplifies debugging on OSE and also limits the number of ppdata
keys that are created when beam is restarted.
|
|
There is a system limit on the number of ppdata keys that are available
but that should not be reached, and ppdata is faster than using
get_envp.
|
|
This port has support for both non-smp and smp.
It contains a new way to do io checking in which erts_poll_wait
receives the payload of the polled entity. This has implications
for all linked-in drivers.
|
|
|
|
Some basic tests are already done in configure. This makes sure we
cover all cases by bailing out when compiling as well.
|
|
|
|
|
|
An attempt to speedup valgrind
|
|
|
|
A faulty #if 0 caused healthy gcc builtin atomic to be ignored.
|
|
Windows native critical sections are now used internally in the
runtime system as mutex implementation. This is because they perform
better under extreme contention than our own implementation.
|
|
The ethread atomics API now also provides double word size atomics.
Double word size atomics are implemented using native atomic
instructions on x86 (when the cmpxchg8b instruction is available)
and on x86_64 (when the cmpxchg16b instruction is available). On
other hardware where 32-bit atomics or word size atomics are
available, an optimized fallback is used; otherwise, a spinlock,
or a mutex based fallback is used.
The ethread library now performs runtime tests for presence of
hardware features, such as for example SSE2 instructions, instead
of requiring this to be determined at compile time.
There are now functions implementing each atomic operation with the
following implied memory barrier semantics: none, read, write,
acquire, release, and full. Some of the operation-barrier
combinations aren't especially useful. But instead of picking out
the useful ones, and potentially missing one, we implement
them all.
A much smaller set of functionality for native atomics is required
to be implemented than before. More or less only cmpxchg and a
membar macro are required to be implemented for each atomic size.
Other functions will automatically be constructed from these. It is,
of course, often wise to implement more than this if possible from a
performance perspective.
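To illustrate how other operations can be constructed from a compare-and-exchange primitive, here is a minimal sketch for a 32-bit atomic. All `sketch_*` names are hypothetical, and GCC's `__sync_val_compare_and_swap()` (a full barrier) stands in for the native cmpxchg that a port would supply; the real ethread macros and barrier variants are not reproduced here.

```c
#include <stdint.h>

/* Stand-in for the native primitive: stores new_val if *var equals
 * expected, and returns the value found before the operation. */
static inline int32_t sketch_cmpxchg_mb(int32_t *var, int32_t new_val,
                                        int32_t expected)
{
    return __sync_val_compare_and_swap(var, expected, new_val);
}

/* Derived operation: add incr and return the new value. */
static inline int32_t sketch_add_read_mb(int32_t *var, int32_t incr)
{
    int32_t seen = *(volatile int32_t *) var;
    for (;;) {
        int32_t expected = seen;
        /* Unsigned arithmetic avoids signed-overflow undefined behaviour. */
        int32_t wanted = (int32_t) ((uint32_t) expected + (uint32_t) incr);
        seen = sketch_cmpxchg_mb(var, wanted, expected);
        if (seen == expected)
            return wanted;   /* our update was installed */
        /* Someone else changed *var; retry with the value we saw. */
    }
}

/* Derived operation: store new_val and return the previous value. */
static inline int32_t sketch_xchg_mb(int32_t *var, int32_t new_val)
{
    int32_t seen = *(volatile int32_t *) var;
    for (;;) {
        int32_t expected = seen;
        seen = sketch_cmpxchg_mb(var, new_val, expected);
        if (seen == expected)
            return expected;
    }
}
```

A dedicated native instruction for add or exchange would typically be faster than such a retry loop, which is why implementing more than the minimum can pay off.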
|
|
|
|
|
|
|
|
|
|
|
|
The atomic memory operations interface used the 'long' type and assumed that
it was of the same size as 'void *'. This is true on most platforms, however,
not on Windows 64.
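The size mismatch can be made explicit with a pointer-sized integer type and a compile-time check. This is only a sketch of the idea: `sketch_aint_t` and the helpers are hypothetical names, and it assumes the platform provides `intptr_t` with the same size as `void *`, which the check below enforces.

```c
#include <stdint.h>

/* Hypothetical pointer-sized atomic integer type. On 64-bit Windows
 * 'long' is 32 bits while 'void *' is 64 bits, so 'long' cannot be used. */
typedef intptr_t sketch_aint_t;

/* Compile-time check: the array size becomes negative, and compilation
 * fails, if the integer type and 'void *' differ in size. */
typedef char sketch_aint_size_check[
    (sizeof(sketch_aint_t) == sizeof(void *)) ? 1 : -1];

static inline sketch_aint_t sketch_ptr_to_aint(void *p)
{
    return (sketch_aint_t) p;
}

static inline void *sketch_aint_to_ptr(sketch_aint_t v)
{
    return (void *) v;
}
```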
|
|
* ta/fix-ethread-void-return:
ethread: do not return from void ethr_atomic_set_relb
OTP-8944
|
|
|
|
Reported-by: Patrick Baggett <[email protected]>
|
|
Large parts of the ethread library have been rewritten. The
ethread library is an Erlang runtime system internal, portable
thread library used by the runtime system itself.
Most notable improvement is a reader optimized rwlock
implementation which dramatically improves the performance of
read-lock/read-unlock operations on multi processor systems by
avoiding ping-ponging of the rwlock cache lines. The reader
optimized rwlock implementation is used by miscellaneous
rwlocks in the runtime system that are known to be read-locked
frequently, and can be enabled on ETS tables by passing the
`{read_concurrency, true}' option upon table creation. See the
documentation of `ets:new/2' for more information.
The ethread library can now also use the libatomic_ops library
for atomic memory accesses. This makes it possible for the
Erlang runtime system to utilize optimized atomic operations
on more platforms than before. Use the
`--with-libatomic_ops=PATH' configure command line argument
when specifying where the libatomic_ops installation is
located. The libatomic_ops library can be downloaded from:
http://www.hpl.hp.com/research/linux/atomic_ops/
The changed API of the ethread library has also caused
modifications in the Erlang runtime system. Preparations for
the upcoming "delayed deallocation" feature have also been made
since it depends on the ethread library.
Note: When building for x86, the ethread library will now use
instructions that first appeared on the pentium 4 processor. If
you want the runtime system to be compatible with older
processors (back to 486) you need to pass the
`--enable-ethread-pre-pentium4-compatibility' configure command
line argument when configuring the system.
|
|
The number of spinlocks used when implementing atomic fall-backs when no
native atomic implementation is available has been increased from 16 to
1024.
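The idea behind a lock-table fallback can be sketched as follows: each atomic variable's address selects one of a fixed number of spinlocks, so a larger table means fewer unrelated variables contend for the same lock. Everything here is hypothetical, including the `sketch_*` names and the address hash. The spinlock itself uses GCC's `__sync` builtins purely to keep the sketch self-contained; a real fallback would use whatever lock primitive the platform offers.

```c
#include <stdint.h>

#define SKETCH_LOCK_COUNT 1024   /* power of two; matches the count above */

static volatile int sketch_locks[SKETCH_LOCK_COUNT];

/* Map an address to a lock slot. Assumed hash: drop the low bits that
 * are identical for all aligned words, then mask to the table size. */
static inline unsigned sketch_lock_index(const void *addr)
{
    return (unsigned) (((uintptr_t) addr >> 3) & (SKETCH_LOCK_COUNT - 1));
}

static inline void sketch_lock(volatile int *l)
{
    while (__sync_lock_test_and_set(l, 1)) {
        while (*l)
            ;   /* spin on plain reads until the lock looks free */
    }
}

static inline void sketch_unlock(volatile int *l)
{
    __sync_lock_release(l);
}

/* Fallback "atomic" add: correct only if every access to *var goes
 * through functions that take the same lock. */
static long sketch_fallback_add_read(long *var, long incr)
{
    volatile int *l = &sketch_locks[sketch_lock_index(var)];
    long res;

    sketch_lock(l);
    res = (*var += incr);
    sketch_unlock(l);
    return res;
}
```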
|
|
Support for using gcc's built-in functions for atomic memory access has
been added. This functionality will be used if available and no other
native atomic implementation in ERTS is available.
|
|
Missing memory barriers in erts_poll() could cause the runtime system to
hang indefinitely.
|
|
|