Age | Commit message (Collapse) | Author |
|
Symtom:
ETS table remains fixed after finished ets:select* call.
Problem:
The decision to unfix table after a yielding ets:select*
is based on table ownership, but ownership might have changed
while ets:select* was yielding.
Solution:
Remember and pass along whether table was fixed
when the traversal started.
|
|
OTP-15610
|
|
All of the Red-Black Tree _yielding functions have been
updated to work with reductions returned by the called
function instead of yielding on each element.
|
|
to easier generate a routing tree for test
without having to spend cpu to provoke actual repeated lock conflicts.
|
|
{RouteNodes, BaseNodes, MaxRouteTreeDepth}
|
|
Update table memory stats before scheduling free
to not be dependent on deallocation order with main table struct.
|
|
The current ETS ordered_set implementation can quickly become a
scalability bottleneck on multicore machines when an application updates
an ordered_set table from concurrent processes [1][2]. The current
implementation is based on an AVL tree protected from concurrent writes
by a single readers-writer lock. Furthermore, the current implementation
has an optimization, called the stack optimization [3], that can improve
the performance when only a single process accesses a table but can
cause bad scalability even in read-only scenarios. It is possible to
pass the option {write_concurrency, true} to ets:new/2 when creating an
ETS table of type ordered_set but this option has no effect for tables
of type ordered_set without this commit. The new ETS ordered_set
implementation, added by this commit, is only activated when one passes
the options ordered_set and {write_concurrency, true} to the ets:new/2
function. Thus, the previous ordered_set implementation (from here on
called the default implementation) can still be used in applications
that do not benefit from the new implementation. The benchmark results
on the following web page show that the new implementation is many times
faster than the old implementation in some scenarios and that the old
implementation is still better than the new implementation in some
scenarios.
http://winsh.me/ets_catree_benchmark/ets_ca_tree_benchmark_results.html
The new implementation is expected to scale better than the default
implementation when concurrent processes use the following ETS
operations to operate on a table:
delete/2, delete_object/2, first/1, insert/2 (single object),
insert_new/2 (single object), lookup/2, lookup_element/2, member/2,
next/2, take/2 and update_element/3 (single object).
Currently, the new implementation does not have scalable support for the
other operations (e.g., select/2). However, when these operations are
used infrequently, the new implantation may still scale better than the
default implementation as the benchmark results at the URL above shows.
Description of the New Implementation
----------------------------------
The new implementation is based on a data structure which is called the
contention adapting search tree (CA tree for short). The following
publication contains a detailed description of the CA tree:
A Contention Adapting Approach to Concurrent Ordered Sets
Journal of Parallel and Distributed Computing, 2018
Kjell Winblad and Konstantinos Sagonas
https://doi.org/10.1016/j.jpdc.2017.11.007
http://www.it.uu.se/research/group/languages/software/ca_tree/catree_proofs.pdf
A discussion of how the CA tree can be used as an ETS back-end can be
found in another publication [1]. The CA tree is a data structure that
dynamically changes its synchronization granularity based on detected
contention. Internally, the CA tree uses instances of a sequential data
structure to store items. The CA tree implementation contained in this
commit uses the same AVL tree implementation as is used for the default
ordered set implementation. This AVL tree implementation is reused so
that much of the existing code to implement the ETS operations can be
reused.
Tests
-----
The ETS tests in `lib/stdlib/test/ets_SUITE.erl` have been extended to
also test the new ordered_set implementation. The function
ets_SUITE:throughput_benchmark/0 has also been added to this file. This
function can be used to measure and compare the performance of the
different ETS table types and options. This function writes benchmark
data to standard output that can be visualized by the HTML page
`lib/stdlib/test/ets_SUITE_data/visualize_throughput.html`.
[1]
More Scalable Ordered Set for ETS Using Adaptation.
In Thirteenth ACM SIGPLAN workshop on Erlang (2014).
Kjell Winblad and Konstantinos Sagonas.
https://doi.org/10.1145/2633448.2633455
http://www.it.uu.se/research/group/languages/software/ca_tree/erlang_paper.pdf
[2]
On the Scalability of the Erlang Term Storage
In Twelfth ACM SIGPLAN workshop on Erlang (2013)
Kjell Winblad, David Klaftenegger and Konstantinos Sagonas
https://doi.org/10.1145/2505305.2505308
http://winsh.me/papers/erlang_workshop_2013.pdf
[3]
The stack optimization works by keeping one preallocated stack instance
in every ordered_set table. This stack is updated so that it contains
the search path in some read operations (e.g., ets:next/2). This makes
it possible for a subsequent ets:next/2 to avoid traversing some nodes
in some cases. Unfortunately, the preallocated stack needs to be flagged
so that it is not updated concurrently by several threads which cause
bad scalability.
|
|
A lot of erts internal messages used behind APIs to create
non-blocking calls, e.g. port_command, would cause the seq_trace
token to be cleared from the caller when it should not.
This commit fixes that and adds asserts that makes sure
that all messages sent have to correct token set.
Fixes: ERL-602
|
|
|
|
* sverker/ets-auto-unfix-delete-race/OTP-15109:
erts: Fix race between ets table deletion and auto-unfix
|
|
Bug exists since ets-refs were introduced in 20.0
0d6dc895744c34c9c52fd42f4801a8a941864ae3.
Problem:
1. Process A fixates table T.
2. Process B starts deleting table T (either by ets:delete or exit)
and does tid_clear().
3. Process A exits and does proc_cleanup_fixed_table()
and get NULL from btid2tab() and deallocates DbFixation.
4. Process B continues deleting table in free_fixations_locked()
and finds the deallocated DbFixation in the fixing_procs tree.
Solution:
Wait with tid_clear() until after free_fixations_locked()
has traversed the fixing_procs tree.
|
|
by expanding the default size of the hash table
and increase number of locks.
|
|
|
|
and let compiler determine string lengths.
These were actually wrong in erl_db.c:
count_trap\0
replace_tra
select_tra
|
|
|
|
of same named table.
If other process does ets:delete
before ets:new has completely finished and done save_owned_table
then ets:delete might do delete_owned_table and deref wild pointers
in tb->common.owned.
|
|
* sverker/ets-delete_all_objects-trap/OTP-15078:
erts: Rename untrapping db_free_*empty*_table
erts: Make ets:delete_all_objects yield on fixed table
erts: Optimize ets delete all in fixed table
erts: Refactor ets select iteration code
erts: Cleanup ets code
erts: Optimize ets hash object deallocactions
erts: Refactor pseudo deleted ets objects
erts: Make atomic ets:delete_all_objects yield
erts: Fix reduction bump for ets:delete/1
|
|
as it's now only used for empty tables by ets:new/2.
|
|
|
|
by using a cooperative strategy that will make
any process accessing the table execute delelete_all_objects_continue
until the table is empty.
This is not an optimal solution as concurrent threads will still
block on the table lock, but at least thread progress is made.
|
|
If no message/signal is sent (to same destination)
then monitor signal is flushed when process is scheduled out.
|
|
|
|
and not the name. For more sane named table semantics.
Applies to both select/1 continuation and trap context.
|
|
|
|
|
|
|
|
* sverker/ets-fix-assert-fix:
erts: Fix faulty ASSERT of table fixation counter
|
|
Cannot read fix->counter safely without table lock.
|
|
|
|
This refactor was done using the unifdef tool like this:
for file in $(find erts/ -name *.[ch]); do unifdef -t -f defile -o $file $file; done
where defile contained:
#define ERTS_SMP 1
#define USE_THREADS 1
#define DDLL_SMP 1
#define ERTS_HAVE_SMP_EMU 1
#define SMP 1
#define ERL_BITS_REENTRANT 1
#define ERTS_USE_ASYNC_READY_Q 1
#define FDBLOCK 1
#undef ERTS_POLL_NEED_ASYNC_INTERRUPT_SUPPORT
#define ERTS_POLL_ASYNC_INTERRUPT_SUPPORT 0
#define ERTS_POLL_USE_WAKEUP_PIPE 1
#define ERTS_POLL_USE_UPDATE_REQUESTS_QUEUE 1
#undef ERTS_HAVE_PLAIN_EMU
#undef ERTS_SIGNAL_STATE
|
|
The implementation is still hidden behind ERTS_ENABLE_LOCK_COUNT, and
all categories are still enabled by default, but the actual counting can be
toggled at will.
OTP-13170
|
|
|
|
to replace macro ERTS_INTERNAL_BINARY_FIELDS
as header in Binary and friends.
|
|
|
|
|
|
The existing implementation presented both
semantic inconsistencies and performance issues.
|
|
|
|
|
|
|
|
'the_name' keeps name of all tables.
'type' & DB_NAMED_TABLE mark tables as named.
table ref id is built from magic bin when needed.
|
|
\o/
O
/ \
Also removed the body for CHECK_TABLES enabled by HARDDEBUG.
Removed quite useless check for hash,
but kept dead check code for tree.
|
|
to get rid of meta table lookup by integer (tb->common.id)
|
|
|
|
from process psd through all owned/fixed tables.
As meta_pid_to{_fixed}_tab maps to slot in meta_main_tab
which is planned for destruction.
In this commit we no longer seize table lock while
freeing the table (free_table_cont) as it's not needed
and makes the code a bit simpler. Any concurrent operation
on the table will only access lock, owner and status
and then bail out.
|
|
|
|
|
|
|
|
|
|
* maint:
Atomic reference count of binaries also in non-SMP
Conflicts:
erts/emulator/beam/erl_fun.c
|
|
NIF resources was not handled in a thread-safe manner in the runtime
system without SMP support.
As a consequence of this fix, the following driver functions are now
thread-safe also in the runtime system without SMP support:
- driver_free_binary()
- driver_realloc_binary()
- driver_binary_get_refc()
- driver_binary_inc_refc()
- driver_binary_dec_refc()
|