aboutsummaryrefslogtreecommitdiffstats
path: root/erts/emulator/beam/erl_db_tree.c
AgeCommit message (Collapse)Author
2019-04-10Decentralized counters for ETS ordered_set with write_concurrencyKjell Winblad
Previously, all ETS tables used centralized counter variables to keep track of the number of items stored and the amount of memory consumed. These counters can cause scalability problems (especially on big NUMA systems). This commit adds an implementation of a decentralized counter and modifies the implementation of ETS so that ETS tables of type ordered_set with write_concurrency enabled use the decentralized counter. [Experiments][1] indicate that this change substantially improves the scalability of ETS ordered_set tables with write_concurrency enabled in scenarios with frequent `ets:insert/2` and `ets:delete/2` calls. The new counter is implemented in the module erts_flxctr (`erts_flxctr.h` and `erts_flxctr.c`). The module has the suffix flxctr as it contains the implementation of a flexible counter (i.e., counter instances can be configured to be either centralized or decentralized). Counters that are configured to be centralized are implemented with a single counter variable which is modified with atomic operations. Decentralized counters are spread over several cache lines (how many can be configured with the parameter `+dcg`). The scheduler threads are mapped to cache lines so that there is no single point of contention when decentralized counters are updated. The thread progress functionality of the Erlang VM is utilized to implement support for linearizable snapshots of decentralized counters. The snapshot functionality is used by the `ets:info/1` and `ets:info/2` functions. [1]: http://winsh.me/ets_catree_benchmark/flxctr_res.html
2019-03-11erts: Fix ets:select table fixation leak at owner changeSverker Eriksson
Symtom: ETS table remains fixed after finished ets:select* call. Problem: The decision to unfix table after a yielding ets:select* is based on table ownership, but ownership might have changed while ets:select* was yielding. Solution: Remember and pass along whether table was fixed when the traversal started.
2019-03-07Merge branch 'sverker/maint/ets-no-mbuf-trapping/OTP-15660'Sverker Eriksson
into sverker/master/ets-no-mbuf-trapping/OTP-15660
2019-03-07Merge branch 'sverker/ets-no-mbuf-trapping/OTP-15660'Sverker Eriksson
into sverker/maint/ets-no-mbuf-trapping/OTP-15660
2019-03-07erts: Remove ets traversal yielding if heap fragmentSverker Eriksson
Many heap fragments do no longer make the GC slow. Even worse, we are not guaranteed that a yield will provoke a GC removing the fragments, which might lead to a one-yield-per-bucket scenario if the heap fragment(s) still remains after each yield.
2018-11-01erts: Tidy some ordered_set iteration codeSverker Eriksson
with variable name changes, some comments and moved catree_find_prev_from_pb_key_root after its twin.
2018-10-23erts: Fix slot bug in find_next/prevSverker Eriksson
2018-10-23erts: Do contention adaptions during (updating) iterationsSverker Eriksson
Once an iteration key has been found, never fall back to first/last key in next/prev tree as trees may split or join under our feet. I.e we must always use previous key when searching for the next key.
2018-10-23erts: Optimize find_next/prev_from_pb_keySverker Eriksson
to not have to backtrack up on the stack.
2018-10-19erts: Remove tree merging for ets:slotSverker Eriksson
Brute force solution will always iterate tree from slot 0 and forward. ToDo1: Yield. ToDo2: Maybe optimize by caching AVL tree size in each base node.
2018-10-19erts: Remove tree merging for ets:first,last,next,prevSverker Eriksson
2018-10-19erts: Remove tree merging for ets:select*Sverker Eriksson
2018-10-19erts: Add table type assertions for static stack accessSverker Eriksson
DbTableCATree has no static stack.
2018-10-19erts: Refactor ets:select* bound key lookupSverker Eriksson
Move lookup from analyze_pattern to callers.
2018-10-19erts: Refactor ets ordered_set match spec key boundnessSverker Eriksson
2018-10-09erts: Fix bug introduced in merge commitSverker Eriksson
f4f409ff28185b3308359ca5ca91921bc51f536f
2018-10-09Merge branch 'maint'Sverker Eriksson
# Conflicts: # erts/emulator/beam/erl_db_tree.c
2018-10-09Merge branch 'sverker/erts/ets-select_replace-bug/OTP-15346' into maintSverker Eriksson
* sverker/erts/ets-select_replace-bug/OTP-15346: erts: Fix bug in ets:select_replace for bound key
2018-10-09erts: Fix bug in ets:select_replace for bound keySverker Eriksson
which may cause following calls to ets:next or ets:prev to fail.
2018-09-05Add a more scalable ETS ordered_set implementationKjell Winblad
The current ETS ordered_set implementation can quickly become a scalability bottleneck on multicore machines when an application updates an ordered_set table from concurrent processes [1][2]. The current implementation is based on an AVL tree protected from concurrent writes by a single readers-writer lock. Furthermore, the current implementation has an optimization, called the stack optimization [3], that can improve the performance when only a single process accesses a table but can cause bad scalability even in read-only scenarios. It is possible to pass the option {write_concurrency, true} to ets:new/2 when creating an ETS table of type ordered_set but this option has no effect for tables of type ordered_set without this commit. The new ETS ordered_set implementation, added by this commit, is only activated when one passes the options ordered_set and {write_concurrency, true} to the ets:new/2 function. Thus, the previous ordered_set implementation (from here on called the default implementation) can still be used in applications that do not benefit from the new implementation. The benchmark results on the following web page show that the new implementation is many times faster than the old implementation in some scenarios and that the old implementation is still better than the new implementation in some scenarios. http://winsh.me/ets_catree_benchmark/ets_ca_tree_benchmark_results.html The new implementation is expected to scale better than the default implementation when concurrent processes use the following ETS operations to operate on a table: delete/2, delete_object/2, first/1, insert/2 (single object), insert_new/2 (single object), lookup/2, lookup_element/2, member/2, next/2, take/2 and update_element/3 (single object). Currently, the new implementation does not have scalable support for the other operations (e.g., select/2). However, when these operations are used infrequently, the new implantation may still scale better than the default implementation as the benchmark results at the URL above shows. Description of the New Implementation ---------------------------------- The new implementation is based on a data structure which is called the contention adapting search tree (CA tree for short). The following publication contains a detailed description of the CA tree: A Contention Adapting Approach to Concurrent Ordered Sets Journal of Parallel and Distributed Computing, 2018 Kjell Winblad and Konstantinos Sagonas https://doi.org/10.1016/j.jpdc.2017.11.007 http://www.it.uu.se/research/group/languages/software/ca_tree/catree_proofs.pdf A discussion of how the CA tree can be used as an ETS back-end can be found in another publication [1]. The CA tree is a data structure that dynamically changes its synchronization granularity based on detected contention. Internally, the CA tree uses instances of a sequential data structure to store items. The CA tree implementation contained in this commit uses the same AVL tree implementation as is used for the default ordered set implementation. This AVL tree implementation is reused so that much of the existing code to implement the ETS operations can be reused. Tests ----- The ETS tests in `lib/stdlib/test/ets_SUITE.erl` have been extended to also test the new ordered_set implementation. The function ets_SUITE:throughput_benchmark/0 has also been added to this file. This function can be used to measure and compare the performance of the different ETS table types and options. This function writes benchmark data to standard output that can be visualized by the HTML page `lib/stdlib/test/ets_SUITE_data/visualize_throughput.html`. [1] More Scalable Ordered Set for ETS Using Adaptation. In Thirteenth ACM SIGPLAN workshop on Erlang (2014). Kjell Winblad and Konstantinos Sagonas. https://doi.org/10.1145/2633448.2633455 http://www.it.uu.se/research/group/languages/software/ca_tree/erlang_paper.pdf [2] On the Scalability of the Erlang Term Storage In Twelfth ACM SIGPLAN workshop on Erlang (2013) Kjell Winblad, David Klaftenegger and Konstantinos Sagonas https://doi.org/10.1145/2505305.2505308 http://winsh.me/papers/erlang_workshop_2013.pdf [3] The stack optimization works by keeping one preallocated stack instance in every ordered_set table. This stack is updated so that it contains the search path in some read operations (e.g., ets:next/2). This makes it possible for a subsequent ets:next/2 to avoid traversing some nodes in some cases. Unfortunately, the preallocated stack needs to be flagged so that it is not updated concurrently by several threads which cause bad scalability.
2018-07-27Change "can not" into "cannot"Raimo Niskanen
I did not find any legitimate use of "can not", however skipped changing e.g RFCs archived in the source tree.
2018-06-18Update copyright yearHenrik Nord
2018-05-08erts: Rename untrapping db_free_*empty*_tableSverker Eriksson
as it's now only used for empty tables by ets:new/2.
2018-05-08erts: Cleanup ets codeSverker Eriksson
2018-05-08erts: Make atomic ets:delete_all_objects yieldSverker Eriksson
by using a cooperative strategy that will make any process accessing the table execute delelete_all_objects_continue until the table is empty. This is not an optimal solution as concurrent threads will still block on the table lock, but at least thread progress is made.
2017-09-11Merge branch 'maint'John Högberg
2017-09-07Replace ad-hoc MIN/MAX macros with common onesJohn Högberg
Besides being noisy, they were already defined by a global Unix- specific header, causing the Windows build to fail if one forgot to define them.
2017-07-17erts: Replace usage of all erts_smp prefixes to just ertsLukas Larsson
2017-05-04Update copyright yearRaimo Niskanen
2017-04-12erts: Introduce struct binary_internalsSverker Eriksson
to replace macro ERTS_INTERNAL_BINARY_FIELDS as header in Binary and friends.
2017-03-23Remove redundant variable initializationsGuilherme Andrade
2017-03-22Use ETS table id references on select_replaceGuilherme Andrade
2017-03-22erts: Optimize ets:select_replace to not use heapSverker Eriksson
for temporary matchspec results. ToDo: Would be even nicer if PAM could allocate and build the ETS objects without extra copy_struct needed.
2017-03-22Cleanup some unnecessary variable initializationSverker Eriksson
2017-03-22Reject unsafe matchspecs on ets:select_replace/2Guilherme Andrade
Preemptively fail operation with badarg if the replacement object might have a different key.
2017-03-22Use magic refs on revamped ETS codeGuilherme Andrade
2017-03-22Deduplicate select* code on ETS hash tablesGuilherme Andrade
Refactor existing solution into a common iteration loop parameterized using stateful callbacks.
2017-03-22ETS: Allow for matchspec-based replacementGuilherme Andrade
2017-03-22erts: Improve reduction count during table cleanupSverker Eriksson
2017-03-22erts: Remove meta_main_tabSverker Eriksson
\o/ O / \ Also removed the body for CHECK_TABLES enabled by HARDDEBUG. Removed quite useless check for hash, but kept dead check code for tree.
2017-03-22erts: Pass tid argument down to trapping functionsSverker Eriksson
to get rid of meta table lookup by integer (tb->common.id)
2017-02-06Use magic refs for compiled match specsRickard Green
2016-11-17erts: Refactor crash dumping with cbprintfSverker Eriksson
Instead of passing around a file descriptor use a function pointer to facilitate more advanced backend write logic such as size limitation or compression.
2016-03-15update copyright-yearHenrik Nord
2016-02-24Merge branch 'master' into sverk/master/halt-INT_MINSverker Eriksson
2016-02-24erts: Change erl_exit into erts_exitSverker Eriksson
This is mostly a pure refactoring. Except for the buggy cases when calling erlang:halt() with a positive integer in the range -(INT_MIN+2) to -INT_MIN that got confused with ERTS_ABORT_EXIT, ERTS_DUMP_EXIT and ERTS_INTR_EXIT. Outcome OLD erl_exit(n, ) NEW erts_exit(n, ) ------- ------------------- ------------------------------------------- exit(Status) n = -Status <= 0 n = Status >= 0 crashdump+abort n > 0, ignore n n = ERTS_ERROR_EXIT < 0 The outcome of the old ERTS_ABORT_EXIT, ERTS_INTR_EXIT and ERTS_DUMP_EXIT are the same as before (even though their values have changed).
2015-06-24erts: Remove halfword bases in ETSBjörn-Egil Dahlberg
2015-06-24erts: Remove halfword relative printfBjörn-Egil Dahlberg
2015-06-24erts: Remove halfword is_same bases macroBjörn-Egil Dahlberg
Keep is_same macro for readability but remove base pointers.
2015-06-24erts: Remove halfword object manipulationBjörn-Egil Dahlberg
* Remove macros size_object_rel, copy_struct_rel and copy_shallow_rel