Age | Commit message (Collapse) | Author |
|
|
|
* rickard/ds-sched-suspend:
Improved scheduler suspend functionality
|
|
- The calling process is now suspended while synchronizing
scheduler suspends via erlang:system_flag(schedulers_online, _)
and erlang:system_flag(multi_scheduling, _), instead of blocking
the scheduler thread in the BIF call waiting for the operation
to synchronize. Besides releasing the scheduler for other work
(or immediate suspend) it also makes it possible to abort the
operation by killing the process.
- erlang:system_flag(schedulers_online, _) now only wait for normal
schedulers to complete before it returns. This since it may take
a very long time before all dirty schedulers suspends.
- erlang:system_flag(multi_scheduling, block_normal|unblock_normal)
which only operate on normal schedulers has been introduced. This
since there are use cases where suspend of dirty schedulers are
not of interest (hipe loader).
- erlang:system_flag(multi_scheduling, block) still blocks all
dirty schedulers as well as all normal schedulers except one since
it is hard to redefine what multi scheduling block means.
- The three operations:
- changing amount of schedulers online
- blocking/unblocking normal multi scheduling
- blocking/unblocking full multi scheduling
can now be done in parallel. This is important since otherwise
a full multi scheduling block would potentially delay the other
operations for a very long time.
|
|
|
|
No point in checking tmp_alloc instance 0
as any non-scheduler thread could race us.
|
|
|
|
|
|
|
|
The BIFs prepare_loading/2 and finish_loading/1 have been
designed to allow fast loading in parallel of many modules.
Because of the complications with on_load functions,
the initial implementation of finish_loading/1 only allowed
a single element in the list of prepared modules.
finish_loading/1 does not suspend other processes, but it must wait
for all schedulers to pass a write barrier ("thread progress"). The
time for all schedulers to pass the write barrier is highly variable,
depending on what kind of code they are executing. Therefore, allowing
finish_loading/1 to finish the loading for more than one module before
passing the write barrier could potentially be much faster than
calling finish_loading/1 multiple times.
The test case many/1 run on my computer shows that with "heavy load",
finish loading of 100 modules in parallel is almost 50 times faster
than loading them sequentially. With "light load", the gain is still
almost 10 times.
Here follows an actual sample of the output from the test case on
my computer (an 2012 iMac):
Light load
==========
Sequential: 22361 µs
Parallel: 2586 µs
Ratio: 9
Heavy load
==========
Sequential: 254512 µs
Parallel: 5246 µs
Ratio: 49
|
|
* sverk/fix-list-length-int/OTP-13288:
erts: Fix error cases in enif_get_list_length
erts: Use Sint instead of int for list lengths
|
|
* bjorn/remove-test_server/OTP-12705:
Remove test_server as a standalone application
Erlang mode for Emacs: Include ct.hrl instead test_server.hrl
Remove out-commented references to the test_server applications
Makefiles: Remove test_server from include path and code path
Eliminate use of test_server.hrl and test_server_line.hrl
|
|
|
|
Since no test suites includede test_server.hrl, there is no need
to have test_server in the include path or code path.
|
|
As a first step to removing the test_server application as
as its own separate application, change the inclusion of
test_server.hrl to an inclusion of ct.hrl and remove the
inclusion of test_server_line.hrl.
|
|
* rickard/ds-fixes:
Fix unique_SUITE for dirty schedulers
Add dirty scheduler process termination test
Ensure that work is done on the correct type of schedulers
Conflicts:
erts/emulator/beam/erl_process.c
erts/emulator/beam/erl_process.h
erts/emulator/beam/erl_process_dump.c
|
|
|
|
In scheduler_SUITE add a new test that runs a single dirty I/O
scheduler and launches a number of dirty I/O NIF calls that each sleep
for 3 seconds. Given the single scheduler, the first of these will run
while the rest queue up. Then start killing these processes, and
verify they call exit correctly.
|
|
ResA may have been GC'd after its last use.
|
|
* sverk/thread-unsafe-alloc:
erts: Fix faulty assert for non-smp
erts: Add checks for thread safe allocation
|
|
false if improper list
false if length > UINT_MAX
|
|
* kvakvs/erts/list_to_integer/OTP-13293:
Better list_to_integer
Moved do_list_to_integer from bif.c to big.c
|
|
* lukas/erts/msacc:
Update preloaded modules
erts: Make msacc alloctor type thread safe
Silence compiler
erts: Fix msacc testcase on some windowses
erts: Add power saving cpu feature tests and use them
erts: Refactor perf counter internal interface
erts: Add rdtscp instruction check
erts: Fix hrtime for windows
erts: use correct function for perf counter on non-x86
erts: Fix msacc win32 debug compile error
erts: Add microstate accounting
erts, kernel: Add os:perf_counter function
erts: Add ERTS_WRITE_UNLIKELY
|
|
* maint:
Use nano second time unit in tracing
|
|
* rickard/monotonic-time-improvements/OTP-13222:
Use nano second time unit in tracing
|
|
|
|
Conflicts:
erts/emulator/beam/beam_emu.c
|
|
Now tries to use whole width of signed long (Sint) and this halves amount of
multiplications needed to parse long integers. New code is 2-3 times faster
than the old code for large inputs (tens and hundreds of digits), behavior
should not change for small inputs.
Test ran 10k times with GC forced between attempts.
Was (R17):
720 el base 10: 0.14682 sec; base 16: 0.192722 sec; base 36: 0.337118 sec.
2800 el base 10: 1.794133 sec; base 16: 2.735106 sec; base 36: 4.761108 sec.
6500 el base 10: 9.316434 sec; base 16: 14.109469 sec; base 36: 25.319263 sec.
Now (R19 Dev)
720 el base 10: 0.10265 sec; base 16: 0.10851 sec; base 36: 0.160478 sec.
2800 el base 10: 1.002793 sec; base 16: 1.360649 sec; base 36: 2.174309 sec.
6500 el base 10: 4.722197 sec; base 16: 6.60522 sec; base 36: 10.552795 sec.
Added test for corner cases and sign bit corruption. Replaced macros with
inline and hid it inside C file to not pollute global namespace
Old bug in #define LG2_LOOKUP: Replaced with inline function and table
recalculated for all bases 2 to 36 (was 2 to 64)
|
|
* margnus1/bs_unit_fix:
hipe: Fix signed compares of unsigned sizes
beam: Fix overflow bug in i_bs_add_jId
hipe: Add tests for bad bit syntax float sizes
Add a case testing the handling of guards involving binaries
Add some more binary syntax construction tests
hipe: Guard against enormous numbers in ranges
hipe: Fix constructing huge binaries
hipe: Fix binary constructions failing with badarith
Add missing corner-case to bs_construct_SUITE
hipe: Allow unsigned args in hipe_rtl_arith
hipe: test unit size match in bs_put_binary_all
hipe: test unit size match in bs_append
Fix hipe_rtl_binary_construct:floorlog2/1
OTP-13272
|
|
* maint:
Fix testcase
|
|
* rickard/rq-len/OTP-13201:
Fix testcase
|
|
|
|
Microstate accounting is a way to track which state the
different threads within ERTS are in. The main usage area
is to pin point performance bottlenecks by checking which
states the threads are in and then from there figuring out
why and where to optimize.
Since checking whether microstate accounting is on or off is
relatively expensive if done in a short loop only a few of the
states are enabled by default and more states can be enabled
through configure.
I've done some benchmarking and the overhead with it turned off
is not noticible and with it on it is a fraction of a percent.
If you enable the extra states, depending on the benchmark,
the ovehead when turned off is about 1% and when turned on
somewhere inbetween 5-15%.
OTP-12345
|
|
* maint:
Introduce time management in native APIs
Introduce time warp safe replacement for safe_fixed option
Introduce time warp safe trace timestamp formats
Conflicts:
erts/emulator/beam/erl_bif_trace.c
erts/emulator/beam/erl_driver.h
erts/emulator/beam/erl_nif.h
erts/emulator/beam/erl_trace.c
erts/preloaded/ebin/erlang.beam
|
|
* rickard/monotonic-time-improvements/OTP-13222:
Introduce time management in native APIs
Introduce time warp safe replacement for safe_fixed option
Introduce time warp safe trace timestamp formats
|
|
* lukas/erts/gc_info/OTP-13265:
erts: Add garbage_collection_info to process_info/2
Conflicts:
erts/emulator/beam/erl_bif_info.c
|
|
* maint:
Fix HL timer hard debug implementation
Fix stack alignment problem in ethread test on arm
Skip time_SUITE:timestamp on timewarp test
|
|
* rickard/test-fix:
Fix HL timer hard debug implementation
Fix stack alignment problem in ethread test on arm
Skip time_SUITE:timestamp on timewarp test
|
|
|
|
Assert thread unsafe allocator is only created on non-smp
and only called by the main thread.
Removed test of unsafe allocator in custom thread.
|
|
|
|
New timestamp options for trace, sequential trace, and
system profile:
- monotonic_timestamp
- strict_monotonic_timestamp
|
|
|
|
* sverk/armata-memset-bug:
erts: Workaround memset bug in test case
|
|
memset seen to fail with values larger than 255
on (armata) 32-bit ARM Debian
with EGLIBC 2.13-38+rpi2+deb7u8
and gcc 4.6.3-14+rpi1.
|
|
* sverk/float_SUITE-arith:
erts: Add new test cases to float_SUITE
|
|
* maint:
Light weight statistics of run queue lengths
Conflicts:
erts/preloaded/ebin/erlang.beam
|
|
- statistics(total_run_queue_lengths)
- statistics(run_queue_lengths)
- statistics(total_active_tasks)
- statistics(active_tasks)
Conflicts:
erts/emulator/beam/erl_process.c
|
|
|
|
* lukas/erts/forker: (28 commits)
erts: Never abort in the forked child
erts: Mend ASSERT makro for erl_child_setup
erts: Allow enomem failures in port_SUITE
erts: iter_port sleep longer on freebsd
erts: Allow one dangling fd if there is a gethost port
erts: Only use forker StackAck on freebsd
erts: It is not possible to exit the forker driver
erts: Add forker StartAck for port start flowcontrol
erts: Fix large open_port arg segfault for win32
erts: Fix memory leak at async open port
kernel: Remove cmd server for unix os:cmd
erts: Add testcase for huge port environment
erts: Move os_pid to port hash to child setup
erts: Handle all EINTR and EAGAIN cases in child setup
erts: Make child_setup work with large environments
erts: Fix forker driver ifdefs for win32
erts: Fix uds socket handling for os x
erts: Fix dereferencing of unaligned integer for sparc
erts: Flatten too long io vectors in uds write
erts: Add fd count test for spawn_driver
...
Conflicts:
erts/emulator/beam/erl_node_tables.c
erts/preloaded/src/erts_internal.erl
|
|
With the new forker implementation also enomem can be returned
as an indicator that it is not possible to create more ports.
|