Since b29ecbd (OTP-10418, R15B03), Erlang no longer compiles with
old versions of GCC that do not have atomic ops builtins on platforms
where there is no native ethread implementation (e.g. ARM):
    In file included from ../include/internal/gcc/ethread.h:29,
                     from ../include/internal/ethread.h:354,
                     from beam/erl_threads.h:264,
                     from beam/erl_smp.h:27,
                     from beam/sys.h:413,
                     from hipe/hipe_mkliterals.c:29:
    ../include/internal/gcc/ethr_membar.h:49:4: error: #error "No __sync_val_compare_and_swap"
This patch adds a header guard in "gcc/ethread.h", as is present in
"libatomic_ops/ethread.h".
|
* sverk/win-64-pointer-fix:
erts: Correct term type for printf %T
erts: Correct internal printf integer type for win64
erts: Correct some printf type formatting
erts: Fix type bug in get_proc_affinity for windows
OTP-10887
Forgot this ticket for sverk/erlang_pid-revert:
OTP-10885
|
An attempt to speed up valgrind
|
|
|
|
A faulty #if 0 caused working gcc builtin atomics to be ignored.
|
|
- Document barrier semantics
- Introduce ddrb suffix on atomic ops
- Barrier macros for both non-SMP and SMP case
- Make the thread progress API a bit more intuitive
|
|
|
|
Removed symbolic links from repository.
|
|
Windows native critical sections are now used internally in the
runtime system as the mutex implementation, since they perform
better under extreme contention than our own implementation.
|
|
The ethread atomics API now also provides double word size atomics.
Double word size atomics are implemented using native atomic
instructions on x86 (when the cmpxchg8b instruction is available)
and on x86_64 (when the cmpxchg16b instruction is available). On
other hardware where 32-bit atomics or word size atomics are
available, an optimized fallback is used; otherwise, a spinlock,
or a mutex based fallback is used.
The ethread library now performs runtime tests for the presence of
hardware features, such as SSE2 instructions, instead of requiring
this to be determined at compile time.
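A runtime probe of this kind can be sketched as follows. This is not the ethread implementation: it assumes GCC or Clang on x86 and relies on the compiler builtin `__builtin_cpu_supports`. The function name `sketch_have_sse2` is hypothetical.

```c
/* Hypothetical helper, not part of ethread.
 * Returns 1 if the CPU reports SSE2, 0 if it does not,
 * and -1 if this build has no way to ask at runtime. */
int sketch_have_sse2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();   /* safe to call more than once */
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return -1;              /* no runtime probe in this sketch */
#endif
}
```

A caller would typically run such a probe once at startup and select an implementation based on the result, rather than fixing the choice when the library is compiled.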
There are now functions implementing each atomic operation with the
following implied memory barrier semantics: none, read, write,
acquire, release, and full. Some of the operation-barrier
combinations aren't especially useful. But instead of filtering
out the useful ones, and potentially missing one, we implement
them all.
A much smaller set of native atomics functionality is required
to be implemented than before. More or less only cmpxchg and a
membar macro are required to be implemented for each atomic size.
Other functions will automatically be constructed from these. It is,
of course, often wise to implement more than this if possible from a
performance perspective.
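As an illustration of how other operations can be constructed from cmpxchg alone, here is a minimal sketch. It is not the ethread code: all `sketch_` names are hypothetical, and the compare-and-swap primitive is assumed to come from the GCC/Clang `__sync_val_compare_and_swap` builtin, which implies a full barrier.

```c
#include <stdint.h>

/* The only "native" primitive in this sketch. Returns the value that
 * was found in *var; the swap happened if that equals 'expected'. */
static inline int32_t sketch_cmpxchg32(volatile int32_t *var,
                                       int32_t new_val, int32_t expected)
{
    return __sync_val_compare_and_swap(var, expected, new_val);
}

/* Add 'incr' and return the new value, built from a cmpxchg retry loop. */
int32_t sketch_add_return32(volatile int32_t *var, int32_t incr)
{
    int32_t seen = *var;
    for (;;) {
        int32_t expected = seen;
        int32_t desired = expected + incr;
        seen = sketch_cmpxchg32(var, desired, expected);
        if (seen == expected)
            return desired;      /* our swap won */
        /* another thread changed *var; retry with the value we saw */
    }
}

/* Store 'new_val' and return the previous value, built the same way. */
int32_t sketch_xchg32(volatile int32_t *var, int32_t new_val)
{
    int32_t seen = *var;
    for (;;) {
        int32_t expected = seen;
        seen = sketch_cmpxchg32(var, new_val, expected);
        if (seen == expected)
            return expected;
    }
}
```

A dedicated native add or exchange instruction usually avoids the retry loop, which is why implementing more than the minimum can pay off.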
|
|
|
|
* rickard/barriers/OTP-9281:
Silence warnings
Fix build with hipe on amd64
Reduce number of atomic ops
Use 32-bit atomic for port snapshot
Remove pointless erts_ports_alive variable
Ensure quick break
Ensure that all rehashing information are seen when done
Ensure that stack updates are seen when stack is released
Add needed barriers for write_concurrency tables
Homogenize memory barriers on atomics
|
|
Atomic operations with specified barriers have specified barrier semantics.
Set and read operations have undefined barrier semantics. All other atomic
operations implied full memory barriers, except when using the libatomic_ops
library and the tilera atomics api.
Some code in the runtime system assumed that all operations used (except for
set, read and specified) implied full memory barriers. The use of the
libatomic_ops library and the tilera atomics api has therefore been modified
to behave as the other implementations.
Some atomic operations with specified barrier semantics on sparc32 have also
been relaxed in this commit.
|
|
Conflicts:
erts/emulator/beam/erl_printf_term.c
|
The atomic memory operations interface used the 'long' type and assumed that
it was of the same size as 'void *'. This is true on most platforms, but
not on 64-bit Windows.
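The size assumption can be avoided with a pointer-sized integer type. The sketch below uses the standard `intptr_t`; the type and function names are hypothetical and are not the ones used in the runtime system.

```c
#include <stdint.h>

/* Hypothetical name: a signed integer type wide enough to hold a pointer.
 * On 64-bit Windows 'long' is 32 bits wide while 'void *' is 64 bits,
 * so 'long' cannot play this role there. */
typedef intptr_t sketch_sint_t;

/* Compile-time check: the array size becomes negative, and compilation
 * fails, if the integer type is narrower than a pointer. */
typedef char sketch_sint_holds_pointer[
    sizeof(sketch_sint_t) >= sizeof(void *) ? 1 : -1];

/* Convert a pointer to the integer type used by the atomics interface. */
sketch_sint_t sketch_ptr_to_int(void *p)
{
    return (sketch_sint_t) p;
}

/* Convert back; the round trip yields the original pointer. */
void *sketch_int_to_ptr(sketch_sint_t v)
{
    return (void *) v;
}
```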
|
|
* rickard/rwmutex-bug/OTP-8925:
Use correct argument types on rwlock_wake_set_flags()
|
|
|
|
* rickard/rwmutex-bug/OTP-8925:
Miscellaneous rwmutex bug fixes and improvements
Don't use more reader groups than schedulers
New test suite containing stress tests of the rwmutex implementation
Conflicts:
erts/emulator/beam/erl_init.c
|
|
The ERTS internal rwlock implementation could get
into an inconsistent state. This bug was triggered
very seldom, but could occur under heavy contention.
The bug was introduced in R14B (erts-5.8.1).
The bug was most likely to be triggered when using the
read_concurrency option on an ETS table that
was frequently accessed from multiple processes doing
lots of writes and reads. That is, in a situation where
you typically don't want to use the read_concurrency
option in the first place.
|
|
* ta/fix-ethread-void-return:
ethread: do not return from void ethr_atomic_set_relb
OTP-8944
|
|
|
|
Reported-by: Patrick Baggett <[email protected]>
|
|
* sv/ethread-atomic-mips:
add MIPS architecture to GCC ethread atomics support
|
|
|
|
Gcc for MIPS supports immediate atomic gets and sets, and also
supports a working __sync_synchronize() for gcc 4.2 and greater.
|
|
The CPU topology is now automatically detected on Windows
systems with fewer than 33 logical processors. The runtime system
will now, also on Windows, by default bind schedulers to logical
processors using the 'default_bind' bind type if the number of
schedulers is at least equal to the number of logical processors
configured, binding of schedulers is supported, and a CPU topology
is available at startup.
|
|
Calling erlang:system_info/1 with the new argument 'update_cpu_info'
will make the runtime system reread and update the internally stored
CPU information. For more information see the documentation of
erlang:system_info(update_cpu_info).
|
|
Large parts of the ethread library have been rewritten. The
ethread library is an Erlang runtime system internal, portable
thread library used by the runtime system itself.
The most notable improvement is a reader optimized rwlock
implementation which dramatically improves the performance of
read-lock/read-unlock operations on multi processor systems by
avoiding ping-ponging of the rwlock cache lines. The reader
optimized rwlock implementation is used by miscellaneous
rwlocks in the runtime system that are known to be read-locked
frequently, and can be enabled on ETS tables by passing the
`{read_concurrency, true}' option upon table creation. See the
documentation of `ets:new/2' for more information.
The ethread library can now also use the libatomic_ops library
for atomic memory accesses. This makes it possible for the
Erlang runtime system to utilize optimized atomic operations
on more platforms than before. Use the
`--with-libatomic_ops=PATH' configure command line argument
to specify where the libatomic_ops installation is
located. The libatomic_ops library can be downloaded from:
http://www.hpl.hp.com/research/linux/atomic_ops/
The changed API of the ethread library has also caused
modifications in the Erlang runtime system. Preparations for
the upcoming "delayed deallocation" feature have also been made,
since it depends on the ethread library.
Note: When building for x86, the ethread library will now use
instructions that first appeared on the pentium 4 processor. If
you want the runtime system to be compatible with older
processors (back to 486) you need to pass the
`--enable-ethread-pre-pentium4-compatibility' configure command
line argument when configuring the system.
|
|
Writer preferred pthread read/write locks have been enabled on Linux.
|
|
The number of spinlocks used when implementing atomic fall-backs when no
native atomic implementation is available has been increased from 16 to
1024.
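A spinlock-based fallback of this kind can be sketched as below. This is an illustration only, not the ERTS code: the address hash, the lock representation (C11 `atomic_int`), and all `sketch_` names are assumptions; only the table size of 1024 comes from the message above.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LOCKS 1024   /* table size from the commit message */

/* Static storage is zero-initialized: 0 = free, 1 = held. */
static atomic_int sketch_locks[SKETCH_LOCKS];

/* Map the address of a variable to one lock in the table. The shift
 * (an assumption) drops low bits that are equal for aligned words. */
size_t sketch_lock_index(const volatile void *addr)
{
    uintptr_t a = (uintptr_t) addr;
    return (size_t) ((a >> 3) % SKETCH_LOCKS);
}

static void sketch_lock(atomic_int *l)
{
    while (atomic_exchange_explicit(l, 1, memory_order_acquire) != 0)
        ;   /* spin until the previous holder stores 0 */
}

static void sketch_unlock(atomic_int *l)
{
    atomic_store_explicit(l, 0, memory_order_release);
}

/* Fallback "atomic" add: correct only if every access to *var
 * goes through functions that take the same lock. */
long sketch_fallback_add_return(volatile long *var, long incr)
{
    atomic_int *l = &sketch_locks[sketch_lock_index(var)];
    long res;

    sketch_lock(l);
    res = *var + incr;
    *var = res;
    sketch_unlock(l);
    return res;
}
```

With a larger table, two unrelated variables are less likely to hash to the same lock, which presumably is the motivation for raising the count.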
|
|
Support for using gcc's built-in functions for atomic memory access has
been added. This functionality will be used if available and no other
native atomic implementation in ERTS is available.
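For reference, the gcc built-ins in question are the `__sync_*` family. The wrappers below are a hedged sketch of how such built-ins can back an atomics interface; the wrapper names are hypothetical and do not reflect the actual ethread function names.

```c
#include <stdint.h>

#if !defined(__GNUC__)
#error "This sketch assumes a compiler that provides the GCC __sync builtins"
#endif

/* Compare-and-swap: returns the value found in *var; the swap took
 * place if that value equals 'expected'. Implies a full barrier. */
int32_t sketch_gcc_cmpxchg(volatile int32_t *var,
                           int32_t new_val, int32_t expected)
{
    return __sync_val_compare_and_swap(var, expected, new_val);
}

/* Atomic increment returning the new value. */
int32_t sketch_gcc_inc_return(volatile int32_t *var)
{
    return __sync_add_and_fetch(var, 1);
}

/* Full memory barrier. */
void sketch_gcc_full_barrier(void)
{
    __sync_synchronize();
}
```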
|
|
Missing memory barriers in erts_poll() could cause the runtime system to
hang indefinitely.
|
|
* pan/otp_8332_halfword:
Teach testcase in driver_suite the new prototype for driver_async
wx: Correct usage of driver callbacks from wx thread
Adopt the new (R13B04) Nif functionality to the halfword codebase
Support monitoring and demonitoring from driver threads
Fix further test-suite problems
Correct the VM to work for more test suites
Teach {wordsize,internal|external} to system_info/1
Make tracing and distribution work
Turn on instruction packing in the loader and virtual machine
Add the BeamInstr data type for loaded BEAM code
Fix the BEAM dissambler for the half-word emulator
Store pointers to heap data in 32-bit words
Add a custom mmap wrapper to force heaps into the lower address range
Fit all heap data into the 32-bit address range
|
|
Change erl_int_sizes_config to include HALFWORD_HEAP_EMULATOR,
which makes it possible for the NIFs to figure out the term size.
|
|
tile-cc 2.0.1.78377 when compiling the runtime system.
|
|
|