Age | Commit message (Collapse) | Author |
|
shutting down
This patch fixes a problem happening when BEAM is shutting down. It is possible for a dirty scheduler to take the lock, and keep it, when the system is shutting down. It may also happen that a normal scheduler decides to schedule some dirty job (example is major garbage collection that results in migrating the process into dirty CPU queue), and hangs trying to take the lock that will never be released.
To fix the problem, either release the lock before entering endless wait loop, or reverse the order in which schedulers are stopped. Either fix works, and, of course, it works even better to apply both.
|
|
into maint-20
* john/erts/spectre-configure-flag-otp_20/OTP-15430/ERIERL-237:
Allow disabling retpoline in interpreter loop
Add a ./configure flag for spectre mitigation
|
|
* rickard/dirty_scheduler_collapse/OTP-15509:
Fix bug causing dirty scheduler sleeper list inconsistency
|
|
|
|
* sverker/big-band-bug/ERL-804/OTP-15487:
erts: Fix bug in 'band' of two negative numbers, one big
|
|
Similar bug as for bxor fixed by abc4fd372d476821448dfb9
Ex:
1> io:format("~.16B\n", [-16#1110000000000000000 band (-1)]).
-1120000000000000000
Wrong result for
(-X bsl WS) band -Y.
where
X is any positive integer
WS is erlang:system_info(wordsize)*8*N where N is 1 or larger
Y is any positive integer smaller than (1 bsl WS)
Fix:
The subtraction of 1 (for 2-complement conversion)
must be carried along all the way to the last words.
|
|
john/erts/spectre-configure-flag-otp_20/OTP-15430/ERIERL-237
* john/erts/spectre-configure-flag/OTP-15430/ERIERL-237:
Allow disabling retpoline in interpreter loop
Add a ./configure flag for spectre mitigation
|
|
We only do this when the user has explicitly told us it's okay to
partially disable mitigation (spectre-mitigation=incomplete). The
macro is inert if it isn't.
|
|
* john/erts/OTP-20.3.8/minusminus_trapping/OTP-15371:
Optimize operator '--' and yield on large inputs
|
|
The removal set now uses a red-black tree instead of an array on
large inputs, decreasing runtime complexity from `n*n` to
`n*log(n)`. It will also exit early when there are no more items
left in the removal set, drastically improving performance and
memory use when the items to be removed are present near the head
of the list.
This got a lot more complicated than before as the overhead of
always using a red-black tree was unacceptable when either of the
inputs were small, but this compromise has okay-to-decent
performance regardless of input size.
Co-authored-by: Dmytro Lytovchenko <[email protected]>
|
|
* rickard/internal_ref_cmp/OTP-15399/ERL-751:
Fix erts_internal_ref_number_cmp()
|
|
|
|
* sverker/erts/ets-select_replace-bug/OTP-15346:
erts: Fix bug in ets:select_replace for bound key
|
|
which may cause following calls to ets:next or ets:prev to fail.
|
|
* dotsimon/ref_ordering_bug/OTP-15225:
Fixed #Ref ordering bug
Test #Ref ordering in lists and ets
|
|
* rickard/full-cache-nif-env/OTP-15223/ERL-695:
Fix caching of NIF environment when executing dirty
|
|
|
|
|
|
|
|
* sverker/crash-dump-crash-literals/OTP-15181:
erts: Fix bug in crash dump generation
|
|
* john/erts/inet-drv-race/OTP-15158/ERL-654:
Fix a race condition when generating async operation ids
|
|
Symptom: emulator core dumps during crash dump generation.
Problem:
erts_dump_lit_areas did not grow correctly
to always be equal or larger than number of loaded modules.
The comment about twice the size to include both curr and old
did not seem right. The beam_ranges structure contains *all* loaded
module instances until they are removed when purged.
|
|
into maint-20
* john/erts/fix-process-schedule-after-free/OTP-15067/ERL-573:
Don't enqueue system tasks if target process is in fail_state
Fix erroneous schedule of freed/exiting processes
Fix deadlock in run queue evacuation
Fix memory leak of processes that died in the run queue
|
|
The counter used for generating async operation ids was a plain int
shared between all ports, which was incorrect but mostly worked
fine since the ids only had to be unique on a per-port basis.
However, some compilers (notably GCC 8.1.1) generated code that
assumed that this value didn't change between reads. Using a
shortened version of enq_async_w_tmo as an example:
int id = async_ref++;
op->id = id; //A
return id; //B
In GCC 7 and earlier, `async_ref` would be read once and assigned
to `id` before being incremented, which kept the values at A and B
consistent. In GCC 8, `async_ref` was read when assigned at A and
read again at B, and then incremented, which made them inconsistent
if we raced with another port.
This commit fixes the issue by removing `async_ref` altogether and
replacing it with a per-port counter which makes it impossible to
race with someone else.
|
|
The fail state wasn't re-checked in the state change loop; only
the FREE state was checked. In addition to that, we would leave
the task in the queue when bailing out which could lead to a
double-free.
This commit backports active_sys_enqueue from master to make it
easier to merge onwards.
|
|
When scheduled out, the process was never checked for the FREE state
before rescheduling, which meant that a system task could sneak in
and cause a double-free later on.
|
|
* sverker/ets-auto-unfix-delete-race/OTP-15109:
erts: Fix race between ets table deletion and auto-unfix
|
|
* sverker/system-profile-bug/OTP-15085:
erts: Fix bug in system_profile
|
|
* sverker/enif_binary_to_term-bug/OTP-15080:
erts: Fix bug in enif_binary_to_term for immediates
|
|
Problem:
1. Process A fixates table T.
2. Process B starts deleting table T (either by ets:delete or exit)
and does tid_clear().
3. Process A exits and does proc_cleanup_fixed_table()
and get NULL from btid2tab() and deallocates DbFixation.
4. Process B continues deleting table in free_fixations_locked()
and finds the deallocated DbFixation in the fixing_procs tree.
Solution:
Wait with tid_clear() until after free_fixations_locked()
has traversed the fixing_procs tree.
|
|
|
|
|
|
seen to cause redundant {profile,_,active,_,_} messages
when process is terminating.
|
|
Symptom: Heap corruption
Expanded test case to provoke this bug
and test some more term types.
|
|
* sverker/erts/more-crash-dump-info/OTP-14820:
erts,observer: Add port-suspended pids to crash dump
erts,observer: Add port states and flags to crash dump
erts,observer: Add dirty schedulers to crash dump
observer: Refactor get_schedulerinfo1
erts,observer: Add more port info to crash dump
erts: Cleanup dump_process_info()
erts: Include failing garbing process in crash dump
erts: Remove unused args to collect_live_heap_frags
erts: Add binary vheap sizes to crash dump
|
|
* lukas/erts/dirty_trace_clean_fix/OTP-14938:
erts: Delay cleanup of removed tracer on dirty scheds
|
|
It is not simple to do the correct de-allocation on
a dirty schedulers, so we just delay it until this
process runs on a normal scheduler.
|
|
|
|
|
|
|
|
|
|
by testing F_SENSITIVE only once.
|
|
Exclude garbing processes, EXCEPT if run by crash dumping thread
in which case we assume the heap is healthy
without any move markers yet/left.
Switched order between (allocating) setup_rootset()
and (move marking) collect_live_heap_frags().
|
|
|
|
|
|
|
|
|
|
When supplied without an enclosing list, bitstrings were returned
as-is instead of badarging.
|
|
|
|
A binary is a binary as long as its size in bits is evenly divisible
by 8, regardless of whether it has a bit offset or not.
|