aboutsummaryrefslogtreecommitdiffstats
path: root/erts/emulator/beam/erl_lock_check.c
AgeCommit message (Collapse)Author
2018-10-19erts: Remove dead tree merging codeSverker Eriksson
2018-10-03erts: Remove "dynamic" lock order supportSverker Eriksson
2018-10-03erts: Add lock order check for route nodesSverker Eriksson
Lock order is reverse tree depth, from leafs toward root. This solution may eventually fail if running too long as route nodes do not increase their 'lc_order' in a join operation when they move up in the tree. But who runs a VM with lock-checker for such a long time?
2018-10-03erts: Add lock order check of erl_db_catree base nodesSverker Eriksson
Lock order is key term order, so each base node needs its own key if lock check is enabled.
2018-10-03erts: Add Erlang term order to lock checkerSverker Eriksson
2018-09-05Add a more scalable ETS ordered_set implementationKjell Winblad
The current ETS ordered_set implementation can quickly become a scalability bottleneck on multicore machines when an application updates an ordered_set table from concurrent processes [1][2]. The current implementation is based on an AVL tree protected from concurrent writes by a single readers-writer lock. Furthermore, the current implementation has an optimization, called the stack optimization [3], that can improve the performance when only a single process accesses a table but can cause bad scalability even in read-only scenarios. It is possible to pass the option {write_concurrency, true} to ets:new/2 when creating an ETS table of type ordered_set but this option has no effect for tables of type ordered_set without this commit. The new ETS ordered_set implementation, added by this commit, is only activated when one passes the options ordered_set and {write_concurrency, true} to the ets:new/2 function. Thus, the previous ordered_set implementation (from here on called the default implementation) can still be used in applications that do not benefit from the new implementation. The benchmark results on the following web page show that the new implementation is many times faster than the old implementation in some scenarios and that the old implementation is still better than the new implementation in some scenarios. http://winsh.me/ets_catree_benchmark/ets_ca_tree_benchmark_results.html The new implementation is expected to scale better than the default implementation when concurrent processes use the following ETS operations to operate on a table: delete/2, delete_object/2, first/1, insert/2 (single object), insert_new/2 (single object), lookup/2, lookup_element/2, member/2, next/2, take/2 and update_element/3 (single object). Currently, the new implementation does not have scalable support for the other operations (e.g., select/2). However, when these operations are used infrequently, the new implantation may still scale better than the default implementation as the benchmark results at the URL above shows. Description of the New Implementation ---------------------------------- The new implementation is based on a data structure which is called the contention adapting search tree (CA tree for short). The following publication contains a detailed description of the CA tree: A Contention Adapting Approach to Concurrent Ordered Sets Journal of Parallel and Distributed Computing, 2018 Kjell Winblad and Konstantinos Sagonas https://doi.org/10.1016/j.jpdc.2017.11.007 http://www.it.uu.se/research/group/languages/software/ca_tree/catree_proofs.pdf A discussion of how the CA tree can be used as an ETS back-end can be found in another publication [1]. The CA tree is a data structure that dynamically changes its synchronization granularity based on detected contention. Internally, the CA tree uses instances of a sequential data structure to store items. The CA tree implementation contained in this commit uses the same AVL tree implementation as is used for the default ordered set implementation. This AVL tree implementation is reused so that much of the existing code to implement the ETS operations can be reused. Tests ----- The ETS tests in `lib/stdlib/test/ets_SUITE.erl` have been extended to also test the new ordered_set implementation. The function ets_SUITE:throughput_benchmark/0 has also been added to this file. This function can be used to measure and compare the performance of the different ETS table types and options. This function writes benchmark data to standard output that can be visualized by the HTML page `lib/stdlib/test/ets_SUITE_data/visualize_throughput.html`. [1] More Scalable Ordered Set for ETS Using Adaptation. In Thirteenth ACM SIGPLAN workshop on Erlang (2014). Kjell Winblad and Konstantinos Sagonas. https://doi.org/10.1145/2633448.2633455 http://www.it.uu.se/research/group/languages/software/ca_tree/erlang_paper.pdf [2] On the Scalability of the Erlang Term Storage In Twelfth ACM SIGPLAN workshop on Erlang (2013) Kjell Winblad, David Klaftenegger and Konstantinos Sagonas https://doi.org/10.1145/2505305.2505308 http://winsh.me/papers/erlang_workshop_2013.pdf [3] The stack optimization works by keeping one preallocated stack instance in every ordered_set table. This stack is updated so that it contains the search path in some read operations (e.g., ets:next/2). This makes it possible for a subsequent ets:next/2 to avoid traversing some nodes in some cases. Unfortunately, the preallocated stack needs to be flagged so that it is not updated concurrently by several threads which cause bad scalability.
2018-04-26erts: Fix memory leak in lock checker at thread exitSverker Eriksson
Leak introduced in 865ac3b740d9efa1a0583349929591c757300412.
2018-04-13erts: Make locked checker allocations thread specificSverker Eriksson
Also removed unused ERTS_LC_STATIC_ALLOC.
2018-04-13erts: Add erts_debug:lc_graph/0Sverker Eriksson
Run debug VM or config with --enable-lock-checking. Exercise VM and then run erts_debug:lc_graph(). to create a file "lc_graph.<pid>" in current working directory.
2018-04-13erts: Refactor erl_lock_check.cSverker Eriksson
with just lots of renaming, nothing else. erts_lc_locked_locks_t -> lc_thread_t create_locked_locks -> create_thread_data l_lcks -> thr l_lck -> ll And dropped erts_ prefix on some file scope types.
2018-04-12erts: Remove obsolete locks from lock checkerSverker Eriksson
2018-03-21Implementation of true asynchronous signaling between processesRickard Green
Communication between Erlang processes has conceptually always been performed through asynchronous signaling. The runtime system implementation has however previously preformed most operation synchronously. In a system with only one true thread of execution, this is not problematic (often the opposite). In a system with multiple threads of execution (as current runtime system implementation with SMP support) it becomes problematic. This since it often involves locking of structures when updating them which in turn cause resource contention. Utilizing true asynchronous communication often avoids these resource contention issues. The case that triggered this change was contention on the link lock due to frequent updates of the monitor trees during communication with a frequently used server. The signal order delivery guarantees of the language makes it hard to change the implementation of only some signals to use true asynchronous signaling. Therefore the implementations of (almost) all signals have been changed. Currently the following signals have been implemented as true asynchronous signals: - Message signals - Exit signals - Monitor signals - Demonitor signals - Monitor triggered signals (DOWN, CHANGE, etc) - Link signals - Unlink signals - Group leader signals All of the above already defined as asynchronous signals in the language. The implementation of messages signals was quite asynchronous to begin with, but had quite strict delivery constraints due to the ordering guarantees of signals between a pair of processes. The previously used message queue partitioned into two halves has been replaced by a more general signal queue partitioned into three parts that service all kinds of signals. More details regarding the signal queue can be found in comments in the erl_proc_sig_queue.h file. The monitor and link implementations have also been completely replaced in order to fit the new asynchronous signaling implementation as good as possible. More details regarding the new monitor and link implementations can be found in the erl_monitor_link.h file.
2018-03-09Always use sys_memcpy/cmp/etc instead of plain memcpy/cmp/etcJohn Högberg
2017-11-30Reimplement efile_drv as a dirty NIFJohn Högberg
This improves the latency of file operations as dirty schedulers are a bit more eager to run jobs than async threads, and use a single global queue rather than per-thread queues, eliminating the risk of a job stalling behind a long-running job on the same thread while other async threads sit idle. There's no such thing as a free lunch though; the lowered latency comes at the cost of increased busy-waiting which may have an adverse effect on some applications. This behavior can be tweaked with the +sbwt flag, but unfortunately it affects all types of schedulers and not just dirty ones. We plan to add type-specific flags at a later stage. sendfile has been moved to inet_drv to lessen the effect of a nasty race; the cooperation between inet_drv and efile has never been airtight and the socket dying at the wrong time (Regardless of reason) could result in fd aliasing. Moving it to the inet driver makes it impossible to trigger this by closing the socket in the middle of a sendfile operation, while still allowing it to be aborted -- something that can't be done if it stays in the file driver. The race still occurs if the controlling process dies in the short window between dispatching the sendfile operation and the dup(2) call in the driver, but it's much less likely to happen now. A proper fix is in the works. -- Notable functional differences: * The use_threads option for file:sendfile/5 no longer has any effect. * The file-specific DTrace probes have been removed. The same effect can be achieved with normal tracing together with the nif__entry/nif__return probes to track scheduling. -- OTP-14256
2017-11-30Change resource_monitors' lock orderJohn Högberg
If a NIF monitor fired while the resource was present in an ETS table, the lock checker would erroneously report a lock order violation. This has no effect outside of debug builds.
2017-09-15erts: Replace check_io spinlock with lock-less list insertionSverker Eriksson
2017-09-11erts: Remove possibility to disable dirty schedulersLukas Larsson
2017-09-05Merge branch 'maint'Rickard Green
* maint: Bug fixes of statistics(wall_clock) and statistics(runtime) Conflicts: erts/emulator/beam/erl_time_sup.c
2017-09-05Merge branch 'rickard/statistics-time-fixes/OTP-14597/ERL-465' into maintRickard Green
* rickard/statistics-time-fixes/OTP-14597/ERL-465: Bug fixes of statistics(wall_clock) and statistics(runtime) Conflicts: erts/emulator/beam/erl_time_sup.c
2017-09-04Bug fixes of statistics(wall_clock) and statistics(runtime)Rickard Green
2017-07-25Merge branch 'maint'John Högberg
2017-07-17erts: Remove ERTS_SMP and USE_THREAD definesLukas Larsson
This refactor was done using the unifdef tool like this: for file in $(find erts/ -name *.[ch]); do unifdef -t -f defile -o $file $file; done where defile contained: #define ERTS_SMP 1 #define USE_THREADS 1 #define DDLL_SMP 1 #define ERTS_HAVE_SMP_EMU 1 #define SMP 1 #define ERL_BITS_REENTRANT 1 #define ERTS_USE_ASYNC_READY_Q 1 #define FDBLOCK 1 #undef ERTS_POLL_NEED_ASYNC_INTERRUPT_SUPPORT #define ERTS_POLL_ASYNC_INTERRUPT_SUPPORT 0 #define ERTS_POLL_USE_WAKEUP_PIPE 1 #define ERTS_POLL_USE_UPDATE_REQUESTS_QUEUE 1 #undef ERTS_HAVE_PLAIN_EMU #undef ERTS_SIGNAL_STATE
2017-07-11Use erl_lock_flags in lock checkingJohn Högberg
2017-07-06Allow toggling lock counting at runtimeJohn Högberg
The implementation is still hidden behind ERTS_ENABLE_LOCK_COUNT, and all categories are still enabled by default, but the actual counting can be toggled at will. OTP-13170
2017-05-04Update copyright yearRaimo Niskanen
2017-03-22erts: Remove meta_main_tabSverker Eriksson
\o/ O / \ Also removed the body for CHECK_TABLES enabled by HARDDEBUG. Removed quite useless check for hash, but kept dead check code for tree.
2017-03-22erts: Replace meta_pid_to{_fixed}_tab with linked listsSverker Eriksson
from process psd through all owned/fixed tables. As meta_pid_to{_fixed}_tab maps to slot in meta_main_tab which is planned for destruction. In this commit we no longer seize table lock while freeing the table (free_table_cont) as it's not needed and makes the code a bit simpler. Any concurrent operation on the table will only access lock, owner and status and then bail out.
2017-02-20Merge branch 'master' into sverker/enif_selectSverker Eriksson
Conflicts: erts/emulator/beam/erl_binary.h erts/emulator/beam/erl_monitors.c erts/emulator/beam/erl_nif.c erts/emulator/beam/global.h erts/emulator/test/nif_SUITE_data/nif_SUITE.c
2017-02-09erts: Add enif_monitor_process and enif_demonitor_processSverker Eriksson
2017-02-06Implement magic referencesRickard Green
Magic references are *intentionally* indistinguishable from ordinary references for the Erlang software. Magic references do not change the language, and are intended as a pure runtime internal optimization. An ordinary reference is typically used as a key in some table. A magic reference has a direct pointer to a reference counted magic binary. This makes it possible to implement various things without having to do lookups in a table, but instead access the data directly. Besides very fast lookups this can also improve scalability by removing a potentially contended table. A couple of examples of planned future usage of magic references are ETS table identifiers, and BIF timer identifiers. Besides future optimizations using magic references it should also be possible to replace the exposed magic binary cludge with magic references. That is, magic binaries that are exposed as empty binaries to the Erlang software.
2017-01-12Merge branch 'maint'Rickard Green
* maint: Fix call time tracing with dirty schedulers
2017-01-11Fix call time tracing with dirty schedulersRickard Green
2017-01-02Merge branch 'maint'Rickard Green
* maint: Multi scheduling block bug-fixes Fix VM global GC info for dirty schedulers Leave dirty work in dirty run-queues on multi scheduling block Fix premature removal of process struct Fix crash due to GC of node entry on dirty scheduler
2016-12-28Fix VM global GC info for dirty schedulersRickard Green
2016-08-31Remove old purge strategyRickard Green
2016-08-29Fix purge of codeRickard Green
Ensure that we cannot get any dangling pointers into code that has been purged. This is done by a two phase purge. At first phase all fun entries pointing into the code to purge are marked for purge. All processes trying to call these funs will be suspended and by this we avoid getting new direct references into the code. When all processes has been checked, these processes are resumed. The new purge strategy now also completely ignore the existence of indirect references to the code (funs). If such exist, they will cause bad fun exceptions to the caller, but will not prevent a soft purge or cause a kill of a process having such live references during a hard purge. This since it is impossible to give any guarantees that no processes in the system have such indirect references. Even when the system is completely clean from such references, new ones can appear via distribution and/or disk.
2016-08-26Reclaim literal area after purge has completedRickard Green
2016-04-15erts: Implement tracer modulesLukas Larsson
Add the possibility to use modules as trace data receivers. The functions in the module have to be nifs as otherwise complex trace probes will be very hard to handle (complex means trace probes for ports for example). This commit changes the way that the ptab->tracer field works from always being an immediate, to now be NIL if no tracer is present or else be the tuple {TracerModule, TracerState} where TracerModule is an atom that is later used to lookup the appropriate tracer callbacks to call and TracerState is just passed to the tracer callback. The default process and port tracers have been rewritten to use the new API. This commit also changes the order which trace messages are delivered to the potential tracer process. Any enif_send done in a tracer module may be delayed indefinitely because of lock order issues. If a message is delayed any other trace message send from that process is also delayed so that order is preserved for each traced entity. This means that for some trace events (i.e. send/receive) the events may come in an unintuitive order (receive before send) to the trace receiver. Timestamps are taken when the trace message is generated so trace messages from differented processes may arrive with the timestamp out of order. Both the erlang:trace and seq_trace:set_system_tracer accept the new tracer module tracers and also the backwards compatible arguments. OTP-10267
2016-03-15update copyright-yearHenrik Nord
2016-02-02erts: Add microstate accountingLukas Larsson
Microstate accounting is a way to track which state the different threads within ERTS are in. The main usage area is to pin point performance bottlenecks by checking which states the threads are in and then from there figuring out why and where to optimize. Since checking whether microstate accounting is on or off is relatively expensive if done in a short loop only a few of the states are enabled by default and more states can be enabled through configure. I've done some benchmarking and the overhead with it turned off is not noticible and with it on it is a fraction of a percent. If you enable the extra states, depending on the benchmark, the ovehead when turned off is about 1% and when turned on somewhere inbetween 5-15%. OTP-12345
2015-12-15erts: Move os_pid to port hash to child setupLukas Larsson
Had to move the hashing because of a race that can otherwise happen where a new os_pid value was inserted into the hash before the previous value had been removed. Also replaced the protocol inbetween erts and child setup to be a binary protocol. This was done in order to deal with the varying size of Eterm.
2015-12-15erts: Create forker process for spawn driverLukas Larsson
Instead of forking from the beam process, we create a separate process in which all forks are done. This has several advantages: 1) performance: * don't have to close all fd's in the world * fork only has to copy stuff from a small process * work is done in a completely seperate process * a 3x performance increase has been measured, can be made even greater (10x) if we cache the environment in child setup 2) stability * the exec is done in another process than beam, which means that if the file that we exec to is on an nfs that is not available right now we will not block a scheduler until the nfs returns. 3) simplicity * don't have to deal with SIGCHLD in the erts Unfortunately, this solution also implies some badness. 1) There will always be a seperate process running together with beam on unix. This could be confusing and undesirable. 2) We have to transfer the entire environment to child_setup for each command. OTP-13088
2015-06-24erts: Remove HALFWORD_HEAP definitionBjörn-Egil Dahlberg
2015-06-18Change license text to APLv2Bruce Yinhe
2015-05-08Optimized timer implementationRickard Green
2015-03-20Introduce a new time APIRickard Green
The old time API is based on erlang:now/0. The major issue with erlang:now/0 is that it was intended to be used for so many unrelated things. This tied these unrelated operations together and unnecessarily caused performance, scalability as well as accuracy, and precision issues for operations that do not need to have such issues. The new API spreads different functionality over multiple functions in order to improve on this. The new API consists of a number of new BIFs: - erlang:convert_time_unit/3 - erlang:monotonic_time/0 - erlang:monotonic_time/1 - erlang:system_time/0 - erlang:system_time/1 - erlang:time_offset/0 - erlang:time_offset/1 - erlang:timestamp/0 - erlang:unique_integer/0 - erlang:unique_integer/1 - os:system_time/0 - os:system_time/1 and a number of extensions of existing BIFs: - erlang:monitor(time_offset, clock_service) - erlang:system_flag(time_offset, finalize) - erlang:system_info(os_monotonic_time_source) - erlang:system_info(time_offset) - erlang:system_info(time_warp_mode) - erlang:system_info(time_correction) - erlang:system_info(start_time) See the "Time and Time Correction in Erlang" chapter of the ERTS User's Guide for more information.
2014-08-22Ensure "runnable port" trace messages are not sent out of orderRickard Green
2014-08-06erts: Print error reason when malloc failsLukas Larsson
2014-03-28erts: Add etp-lc-dump and etp-ppc-stacktrace macroLukas Larsson
2014-02-24ose,erts: Specify name for tsd keysLukas Larsson
This simplified debugging on OSE and also limits the number of ppdata keys that are created when beam is restarted.