Age | Commit message | Author |
|
|
|
One little (unsigned long) left behind.
|
|
* sverk/master/halt-INT_MIN:
erts: Make erlang:halt() accept bignums as Status
erts: Change erl_exit into erts_exit
kernel: Remove calls to erl_exit
|
|
OTP-13251
* sverk/halt-INT_MIN:
erts: Make erlang:halt() accept bignums as Status
erts: Change erl_exit into erts_exit
kernel: Remove calls to erl_exit
|
|
* bjorn/multiple-load/OTP-13111:
code: Add functions that can load multiple modules
Refactor post_beam_load handling
Simplify and robustify code_server:all_loaded/1
Update preloaded modules
Add erl_prim_loader:get_modules/3
Add has_prepared_code_on_load/1 BIF
Allow erlang:finish_loading/1 to load more than one module
beam_load.c: Add a function to check for an on_load function
|
|
Conflicts:
erts/emulator/beam/erl_alloc.types
erts/emulator/beam/erl_bif_info.c
erts/emulator/beam/erl_process.c
erts/preloaded/ebin/erts_internal.beam
|
|
|
|
|
|
The BIFs prepare_loading/2 and finish_loading/1 have been
designed to allow fast parallel loading of many modules.
Because of the complications with on_load functions,
the initial implementation of finish_loading/1 only allowed
a single element in the list of prepared modules.
finish_loading/1 does not suspend other processes, but it must wait
for all schedulers to pass a write barrier ("thread progress"). The
time for all schedulers to pass the write barrier is highly variable,
depending on what kind of code they are executing. Therefore, allowing
finish_loading/1 to finish the loading for more than one module before
passing the write barrier could potentially be much faster than
calling finish_loading/1 multiple times.
Running the test case many/1 on my computer shows that with "heavy load",
finishing the loading of 100 modules in parallel is almost 50 times faster
than loading them sequentially. With "light load", the gain is still
almost 10 times.
Here follows an actual sample of the output from the test case on
my computer (a 2012 iMac):
Light load
==========
Sequential: 22361 µs
Parallel: 2586 µs
Ratio: 9
Heavy load
==========
Sequential: 254512 µs
Parallel: 5246 µs
Ratio: 49
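As a rough illustration (not part of the commit, names and error handling
simplified): the kind of caller that benefits prepares all modules first and
then commits them with a single finish_loading/1 call, so the thread-progress
barrier is paid only once.
    %% Hedged sketch: load a list of {Module, BeamBinary} pairs with one
    %% pass of "thread progress"; assumes all modules load successfully.
    load_all(Beams) ->
        Prepared = [erlang:prepare_loading(Mod, Bin) || {Mod, Bin} <- Beams],
        ok = erlang:finish_loading(Prepared).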
|
|
|
|
Just mask away the high bits to get a more tolerant erlang:halt
that behaves the same on 32-bit and 64-bit architectures.
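A hedged illustration of what the more tolerant behavior means at the Erlang
level (illustrative value, not from the commit): a status wider than a machine
word no longer fails, its high bits are simply masked away, so 32-bit and
64-bit emulators agree.
    %% Status wider than 32 bits: accepted, high bits masked away.
    erlang:halt((1 bsl 40) bor 3).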
|
|
This is mostly a pure refactoring, except for the buggy cases when
calling erlang:halt() with a positive integer in the range
-(INT_MIN+2) to -INT_MIN, which got confused with ERTS_ABORT_EXIT,
ERTS_DUMP_EXIT and ERTS_INTR_EXIT.
Outcome            OLD erl_exit(n, )    NEW erts_exit(n, )
-------            -----------------    ------------------
exit(Status)       n = -Status <= 0     n = Status >= 0
crashdump+abort    n > 0, ignore n      n = ERTS_ERROR_EXIT < 0
The outcomes of the old ERTS_ABORT_EXIT, ERTS_INTR_EXIT and
ERTS_DUMP_EXIT are the same as before (even though their values have
changed).
|
|
* maint:
Do not wait for main lock when looking up process not running
|
|
|
|
|
|
* sverk/proc-lock-check-fix:
erts: Fix lock checker for process locks
|
|
We will need a way to check whether a prepared BEAM module has
an on_load function.
|
|
* rickard/rq-state-bug/OTP-13298:
Fix bug causing run-queue mask to become inconsistent
|
|
Do lock order check *before* trying to seize lock... duh!
|
|
* sverk/fix-list-length-int/OTP-13288:
erts: Fix error cases in enif_get_list_length
erts: Use Sint instead of int for list lengths
|
|
* jv/erts/optimize-cmp:
Unify comparison macros in erl_utils.h
Avoid erts_cmp jump in atom, int and float comparisons
|
|
This commit implements erts_internal:system_check(schedulers) with the
intent of providing a basic responsiveness check of the schedulers.
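A hedged usage sketch (the ok return value is an assumption; this is an
internal API and may change): the call blocks until every scheduler has
responded, so a timely return indicates responsive schedulers.
    %% Ask every scheduler to respond before returning.
    check_schedulers() ->
        ok = erts_internal:system_check(schedulers).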
|
|
This removes the duplication in having both `cmp_eq`
and `CMP_EQ` and normalizes their name to uppercase.
|
|
Given the function definition below:
check(X) when X >= 0, X =< 20 -> true.
@nox originally noticed that the lt and ge
guard tests were performing slower than they should.
Further investigation revealed that most of the cost
was in jumping to the erts_cmp function. This patch
brings the operations already inlined in erts_cmp
into the emulator, removing the jump cost.
After applying these changes, the time for invoking the check/1
function defined above 30000 times with different
values from 0 to 20 has fallen from 367us to 213us
(measured as the average of 3 runs). This is a
considerable improvement over Erlang 18, which takes
556us on average.
Floats have also dropped their time from 1126us
(on Erlang 18) to 613us.
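For context, a minimal sketch of that kind of measurement (not the original
benchmark; the timer:tc/1 helper and the value distribution are assumptions):
    %% Time 30000 calls to check/1 with values cycling through 0..20;
    %% returns the elapsed time in microseconds.
    bench() ->
        Values = [X rem 21 || X <- lists:seq(0, 29999)],
        {Micros, _} = timer:tc(fun() -> [check(V) || V <- Values] end),
        Micros.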
|
|
* rickard/ds-fixes:
Fix unique_SUITE for dirty schedulers
Add dirty scheduler process termination test
Ensure that work is done on the correct type of schedulers
Conflicts:
erts/emulator/beam/erl_process.c
erts/emulator/beam/erl_process.h
erts/emulator/beam/erl_process_dump.c
|
|
Only the actual call to the dirty nif is allowed to execute on
dirty schedulers. The dirty nif is not allowed to execute on
normal schedulers if dirty schedulers are available.
Arrival of exit signals and system tasks, while a process was
scheduled for execution on a dirty scheduler, could mess up the
process's internal state.
Preparation for dirty system tasks has been made, but is currently
unused.
|
|
* sverk/thread-unsafe-alloc:
erts: Fix faulty assert for non-smp
erts: Add checks for thread safe allocation
|
|
|
|
* sverk/hipe-line-table-bug/master/OTP-13282:
erts: Fix bug concerning line information for hipe modules
|
|
* sverk/hipe-line-table-bug/OTP-13282:
erts: Fix bug concerning line information for hipe modules
|
|
* sverk/proc-exiting-timer-race/OTP-13245:
erts: Fix race between receive timeout and exit signal
|
|
Return false if the list is improper.
Return false if the length > UINT_MAX.
|
|
This avoids potential integer arithmetic overflow for very large lists.
|
|
|
|
* kvakvs/erts/list_to_integer/OTP-13293:
Better list_to_integer
Moved do_list_to_integer from bif.c to big.c
|
|
* maint:
erts: When erts_alloc fails, the emulator no longer aborts
|
|
* lukas/erts/enomem_no_abort/OTP-13292:
erts: When erts_alloc fails, the emulator no longer aborts
|
|
* lukas/erts/msacc:
Update preloaded modules
erts: Make msacc alloctor type thread safe
Silence compiler
erts: Fix msacc testcase on some windowses
erts: Add power saving cpu feature tests and use them
erts: Refactor perf counter internal interface
erts: Add rdtscp instruction check
erts: Fix hrtime for windows
erts: use correct function for perf counter on non-x86
erts: Fix msacc win32 debug compile error
erts: Add microstate accounting
erts, kernel: Add os:perf_counter function
erts: Add ERTS_WRITE_UNLIKELY
|
|
* maint:
Use nano second time unit in tracing
|
|
* rickard/monotonic-time-improvements/OTP-13222:
Use nano second time unit in tracing
|
|
|
|
Conflicts:
erts/emulator/beam/beam_emu.c
|
|
Now tries to use the whole width of a signed long (Sint), which halves the
number of multiplications needed to parse long integers. The new code is
2-3 times faster than the old code for large inputs (tens and hundreds of
digits); behavior should not change for small inputs.
The test was run 10k times with GC forced between attempts.
Was (R17):
720 el base 10: 0.14682 sec; base 16: 0.192722 sec; base 36: 0.337118 sec.
2800 el base 10: 1.794133 sec; base 16: 2.735106 sec; base 36: 4.761108 sec.
6500 el base 10: 9.316434 sec; base 16: 14.109469 sec; base 36: 25.319263 sec.
Now (R19 Dev)
720 el base 10: 0.10265 sec; base 16: 0.10851 sec; base 36: 0.160478 sec.
2800 el base 10: 1.002793 sec; base 16: 1.360649 sec; base 36: 2.174309 sec.
6500 el base 10: 4.722197 sec; base 16: 6.60522 sec; base 36: 10.552795 sec.
Added tests for corner cases and sign bit corruption. Replaced macros with
inline functions and hid them inside the C file to not pollute the global
namespace.
Fixed an old bug in #define LG2_LOOKUP: replaced it with an inline function
and recalculated the table for all bases 2 to 36 (was 2 to 64).
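A hedged sketch of such a measurement (digit choice, repetition count and the
timer:tc/1 helper are assumptions, not the original test):
    %% Average time in microseconds to convert an N-digit string in Base,
    %% forcing a GC between attempts.
    bench_l2i(N, Base, Reps) ->
        Digits = lists:duplicate(N, $7),
        Times = [begin
                     erlang:garbage_collect(),
                     {T, _} = timer:tc(fun() -> list_to_integer(Digits, Base) end),
                     T
                 end || _ <- lists:seq(1, Reps)],
        lists:sum(Times) / Reps.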
|
|
* margnus1/bs_unit_fix:
hipe: Fix signed compares of unsigned sizes
beam: Fix overflow bug in i_bs_add_jId
hipe: Add tests for bad bit syntax float sizes
Add a case testing the handling of guards involving binaries
Add some more binary syntax construction tests
hipe: Guard against enormous numbers in ranges
hipe: Fix constructing huge binaries
hipe: Fix binary constructions failing with badarith
Add missing corner-case to bs_construct_SUITE
hipe: Allow unsigned args in hipe_rtl_arith
hipe: test unit size match in bs_put_binary_all
hipe: test unit size match in bs_append
Fix hipe_rtl_binary_construct:floorlog2/1
OTP-13272
|
|
LONG_LIVED is not thread safe on non-smp and
can only be used by schedulers.
|
|
|
|
The perf counter is now part of the function pointer interface,
and the function returns the value instead of writing it
to a memory buffer.
|
|
|
|
Microstate accounting is a way to track which state the
different threads within ERTS are in. The main usage area
is to pinpoint performance bottlenecks by checking which
states the threads are in and then from there figuring out
why and where to optimize.
Since checking whether microstate accounting is on or off is
relatively expensive if done in a short loop, only a few of the
states are enabled by default, and more states can be enabled
through configure.
I've done some benchmarking: the overhead with it turned off
is not noticeable, and with it on it is a fraction of a percent.
If you enable the extra states, depending on the benchmark,
the overhead when turned off is about 1% and when turned on
somewhere between 5% and 15%.
OTP-12345
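A hedged sketch of driving the feature from Erlang via the system_flag and
statistics API (names as documented for later OTP releases; the workload fun
is a placeholder):
    %% Reset counters, enable accounting around a workload, then fetch the
    %% per-thread state/time counters.
    msacc_sample(WorkFun) ->
        erlang:system_flag(microstate_accounting, reset),
        erlang:system_flag(microstate_accounting, true),
        WorkFun(),
        erlang:system_flag(microstate_accounting, false),
        erlang:statistics(microstate_accounting).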
|
|
The perf_counter is a very cheap, high-resolution timer
that can be used to timestamp system events. It does not have
monotonicity guarantees, but should on most OSes expose a
monotonic time.
A special instruction has been created for this counter to further
speed up fetching it.
OTP-12908
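A hedged usage sketch (unit names follow the documented os:perf_counter/1 and
erlang:convert_time_unit/3 API; conversion overhead is ignored):
    %% Timestamp an event with the cheap counter and convert the delta to
    %% microseconds afterwards.
    measure(Fun) ->
        T0 = os:perf_counter(),
        Fun(),
        T1 = os:perf_counter(),
        erlang:convert_time_unit(T1 - T0, perf_counter, microsecond).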
|