Age | Commit message (Collapse) | Author |
|
* rickard/running-trace-fix/ERL-713/OTP-15269:
Fix missing 'in' trace events during 'running' trace
|
|
'in' trace events could be lost when a process had to be
rescheduled on another scheduler type (normal <-> dirty).
|
|
* bjorn/compiler/ssa:
Travis CI: Run the SSA linter in the Linux64SmokeTest build
Remove retired compiler passes
Introduce a new SSA-based intermediate format
hipe_beam_to_icode: Correct translation of get_map_elements
beam_dead: Remove shortcut of binary matching instruction
beam_bs: Remove optimizations that are easier done on SSA format
Don't run unsafe compiler passes
Simplify optimizations by introducing is_nil late
beam_utils: Make is_tagged_tuple a pure test
beam_except: Enhance recognition of function_clause exceptions
beam_validator: Infer the types of copies in a smarter way
beam_validator: Improve merge of cons and literal list
beam_validator: Strengthen validation of func_info
beam_validator: Allow get_tuple_element before dsetelement
beam_validator: Don't transfer state to labels that can't be reached
beam_validator: Improve type analysis for tuples
beam_validator: Be more careful when updating try/catch state
beam_trim: Handle an empty list of instructions
v3_core: Number argument variables in ascending order
Teach binary instructions to use Y registers as destination
OTP-14894
|
|
Do not allocate good and bad shifts for single byte lookups
|
|
* max-au/dist_msg_too_long:
Cleanup unused dist output buf immediately instead of at GC
Throw 'system_limit' when distribution message size exceed INT_MAX instead of crashing emulator with 'Absurdly large distribution data buffer'
|
|
|
|
* maint:
Fix incoming suspend monitor down
|
|
* rickard/fix-suspend-monitor-down/OTP-15237/ERL-704:
Fix incoming suspend monitor down
|
|
An incoming suspend monitor down wasn't handled correct when the
local monitor half had been removed with an emulator crash as result.
|
|
The new code generator will use Y registers as a destination for
binary construction and matching instructions. v3_codegen would
always first store terms in an X register and it would be the
responsibility of the optimization passes to optimize the extra
moves.
|
|
The single byte lookups always rely on `memchr` and
never really use the good and bad shifts arrays.
|
|
Optimize binary match from 10% up to 70x
|
|
* rickard/full-cache-nif-env/OTP-15223/ERL-695:
Fix caching of NIF environment when executing dirty
# Conflicts:
# erts/emulator/beam/erl_nif.c
|
|
* dotsimon/ref_ordering_bug/OTP-15225:
Fixed #Ref ordering bug
Test #Ref ordering in lists and ets
|
|
* maint:
Fix caching of NIF environment when executing dirty
|
|
* rickard/full-cache-nif-env/OTP-15223/ERL-695:
Fix caching of NIF environment when executing dirty
|
|
|
|
|
|
* maint:
Fixed #Ref ordering bug
Test #Ref ordering in lists and ets
|
|
|
|
The idea is to use memchr on the first lookup for
binary:match/2 and also after every match on binary:matches/2.
We only use memchr in case of matches because benchmarks
showed that using memchr even when we had false positives
could negatively affect performance.
This speeds up binary matching and binary splitting by 4x
in some cases and by 70x in other scenarios (when the last
character in the needle does not occur in the subject).
The reason to use memchr is that it is highly specialized
in most modern operating systems, often defaulting to
SIMD operations.
The implementation uses the reduction count to figure out
how many bytes should be read with memchr. We could increase
those numbers but they do not seem to make a large difference.
|
|
Do not allocate a new map when the value is the same
|
|
|
|
A lot of erts internal messages used behind APIs to create
non-blocking calls, e.g. port_command, would cause the seq_trace
token to be cleared from the caller when it should not.
This commit fixes that and adds asserts that makes sure
that all messages sent have to correct token set.
Fixes: ERL-602
|
|
|
|
RaimoNiskanen/raimo/can_not-should-mostly-be-cannot
OTP-14282
'can not' should mostly be 'cannot'
|
|
|
|
I did not find any legitimate use of "can not", however skipped
changing e.g RFCs archived in the source tree.
|
|
After this whitespace modification there should be no "can not"s
separated by a newline in the entire OTP repository, so to find
them all a simple git grep will do just fine.
|
|
|
|
|
|
This patch optimizes map operations to not allocate new maps
when the key is being replaced by the exact same value in memory.
Imagine this very common idiom:
Map#{key := compute_new_value(Value, Condition)}
where:
compute_new_x(X, true) -> X + 1;
compute_new_x(X, false) -> X;
In many cases, we are not changing the value in `Key`, however
the code prior to this patch would still allocate a new array
for the map values. This optimization changes this.
The cost of optimization is minimum, as in the worst case scenario
it only adds a pointer comparison and boolean check. The major benefit
is reducing the GC pressure by avoiding allocating data.
Next we list the operations we have changed alongside the benchmark
results. The benchmarks basically create a map and perform the same
operations, roughly 20000 times, once replacing the key with the same
value, and another with a different value.
* Map#{Key := Value}
For a map with 4 keys, replacing the fourth key 20000 times went from
718us to 539us.
For a map with 8 keys, replacing the fourth key 20000 times went from
976us to 555us.
* maps:update/3
For a map with 4 keys, replacing the fourth key 20000 times went from
673us to 575us.
For a map with 8 keys, replacing the fourth key 20000 times went from
827us to 585us.
* maps:put/3
For a map with 4 keys, replacing the fourth key 20000 times went from
763us to 553us.
For a map with 8 keys, replacing the fourth key 20000 times went from
788us to 561us.
Note that we have ported some optimizations found in maps:update/3
to maps:put/3 while creating this patch.
|
|
to deselect read and/or writes without stop callback.
|
|
maps:new/0 is no longer a BIF
|
|
|
|
* sverker/erl_interface/valgrind/OTP-15171:
erl_interface: Fix bug in ei_*receive_msg* functions
erl_interface: Seal test case memory leaks
erl_interface: Initialize erl_errno to zero
erts: Remove use of VALGRIND_PRINTF_XML
erl_interface: Add valgrind ability for test port programs
erts: Fix benign bug in cerl for valgrind
erts: Fix buggy calls to erts_sys_explicit_8bit_getenv
|
|
which only existed in a patched version of valgrind (by pan)
no longer used.
Instead we use standard VALGRIND_PRINTF which will end up like this
if valgrind log format is XML and valgrind version >= 3.9:
<clientmsg>
<tid>7</tid>
<threadname>3_scheduler</threadname>
<text>Test case #20 ei_encode_SUITE:test_ei_encode_long/1
</text>
</clientmsg>
Note the extra trailing whitespace that may occure before </text>.
|
|
* john/erts/fix-literal-map-elements/OTP-15184:
Fix a rare crash when matching on literal maps
|
|
Optimise creation of anonymous functions
|
|
Implementing it in Erlang allows taking advantage of the literal pool
optimisation, this means the function implemented in Erlang does no
allocations, while the BIF had to allocate new map each time it was
called. Benchmarks show the function is also slightly faster now.
|
|
This introduces a similar optimisation for normal funs
to what was introduced for external funs in #1725.
It is possible to allocate the fun as a literal, if it does not capture
the environment (i.e. it does not close over any variables).
Unfortunately it's not possible to do this in the compiler due to
problems with representation of such functions in the `.beam` files.
Fortunately, we can do this in the loader.
Simple evaluation shows that functions that don't capture the
enviornment consistute over 60% of all funs in the source code of
Erlang/OTP itself.
The only downside is that we lose a meningful value in the `pid` field
of the fun. The goal of this field, beyond debugging, was to be
able to identify the original node of a function. To be able to still do
this, the functions that are created in the loader are assigned the init
pid as the creator.
To solve issues with staryp, initially set the `erts_init_process_id`
to `ERTS_INVALID_PID` and skip the described optimisation if the value
is still uninitialised.
|
|
|
|
* sverker/crash-dump-crash-literals/OTP-15181:
erts: Fix bug in crash dump generation
|
|
* maint:
Updated OTP version
Update release notes
Update version numbers
Fix trace_info/2
Provide build support for standalone corba repo
Fix release notes for OTP-21.0.2
Move to a dirty scheduler even when we have pending system tasks
|
|
* maint-21:
Updated OTP version
Update release notes
Update version numbers
Fix trace_info/2
Provide build support for standalone corba repo
Fix release notes for OTP-21.0.2
Move to a dirty scheduler even when we have pending system tasks
|
|
When matching on a literal map, the map is placed into the general
scratch register first. This is fine in isolation, but when the
key to be matched was in a Y register it would also be placed in
the scratch register, overwriting the map and crashing the
emulator.
|
|
* john/erts/fix-dirty-reschedule-bug/OTP-15154:
Move to a dirty scheduler even when we have pending system tasks
|
|
|
|
* john/erts/adjust-fix-alloc-sizes:
Adjust fix_alloc sizes to guarantee they fit a dd block
|
|
Symptom: emulator core dumps during crash dump generation.
Problem:
erts_dump_lit_areas did not grow correctly
to always be equal or larger than number of loaded modules.
The comment about twice the size to include both curr and old
did not seem right. The beam_ranges structure contains *all* loaded
module instances until they are removed when purged.
|