aboutsummaryrefslogtreecommitdiffstats
path: root/erts/include/internal/gcc/ethr_dw_atomic.h
AgeCommit message (Collapse)Author
2017-05-04Update copyright yearRaimo Niskanen
2017-02-14Fixed typos in ertsAndrew Dryga
2015-06-18Change license text to APLv2Bruce Yinhe
2015-01-14Improve ethread atomics based on GCC builtinsRickard Green
* Use of __atomic builtins when available. * Improved configure test that checks for missing memory barrier in __sync_synchronize(). The old approach was to verify known working gcc versions and check gcc version at compile time. Besides not being very safe, the old approach often unnecessarily caused usage of the very expensive workaround. * Introduced (no overhead) workaround for missing clobber in __sync_synchronize() when using buggy LLVM implementation of __sync_synchronize(). * Implement native memory barriers for ARM processors supporting the DMB instruction. * Use of volatile store on Alpha as atomic set operation if no __atomic_store_n() is available (already used on x86/x86_64 Sparc V9, PowerPC, and MIPS). Fallback used when not using volatile store is typically very expensive. * Use volatile load on Alpha and ARM as atomic read operation if no __atomic_load_n() is available (already used on x86/x86_64 Sparc V9, PowerPC, and MIPS). Fallback when not using volatile load is typically very expensive.
2011-06-14Improve ethread atomicsRickard Green
The ethread atomics API now also provide double word size atomics. Double word size atomics are implemented using native atomic instructions on x86 (when the cmpxchg8b instruction is available) and on x86_64 (when the cmpxchg16b instruction is available). On other hardware where 32-bit atomics or word size atomics are available, an optimized fallback is used; otherwise, a spinlock, or a mutex based fallback is used. The ethread library now performs runtime tests for presence of hardware features, such as for example SSE2 instructions, instead of requiring this to be determined at compile time. There are now functions implementing each atomic operation with the following implied memory barrier semantics: none, read, write, acquire, release, and full. Some of the operation-barrier combinations aren't especially useful. But instead of filtering useful ones out, and potentially miss a useful one, we implement them all. A much smaller set of functionality for native atomics are required to be implemented than before. More or less only cmpxchg and a membar macro are required to be implemented for each atomic size. Other functions will automatically be constructed from these. It is, of course, often wise to implement more that this if possible from a performance perspective.