Age | Commit message (Collapse) | Author |
|
* sverk/hipe_match_wbin/OTP-12667:
erts: Add debug assertions for match state sanity
hipe: Add test for matching of writable binary
erts,hipe: Optimize away calls to emasculate_binary
erts,hipe: Fix bug in binary matching of writable binary
Conflicts:
erts/emulator/hipe/hipe_bif0.c
|
|
sys_write_buf allocator type is used from async-threads and needs
to be lockable. In the SMP case the temporary allocator is lockable but
not in the Non-SMP case.
To remedy this the binary-allocator is used for the Non-SMP case,
which is lockable.
|
|
|
|
|
|
* mikpe/hipe-unbreak-arity-4-bifs:
erts/hipe: unbreak arity 4 BIFs
|
|
* egil/opt-instructions/OTP-12690:
erts: Specialize minus and plus instruction
erts: Add move2 specialization for common move patterns
erts: Specialize rem instruction for common case
erts: Specialize band instruction for common case
erts: Batch loads and stores for move_window
erts: Fix loader increment from minus instruction
erts: Add move window instruction
erts: Add instruction move3 for xy and xx
erts: Specialize compare instructions
kernel: Add instruction_count helper to erts_debug
|
|
OTP-12685
|
|
Seen on SSL application where substraction with x registers were prevalent:
* i_minus specialization on x registers
* i_plus specialization on x registers
|
|
Common pattern seen in SSL:
move y x | move r x -> move2
move r x | move y x -> move2
Common pattern seen in SSL and Compiler:
move x r | move x x -> move2
|
|
* egil/opt-float-cmp:
erts: Brute force float comparisons as well
|
|
* i_rem specialization on x registers
|
|
* i_band specialization on x registers and constants
|
|
May lessen load/store latency.
|
|
A type error caused the optimization to never kick in.
|
|
Move an entire region of x registers to the stack.
This reduces the dispatch pressure of move instructions.
Also introduce a move2 specialization for some common move patterns:
move r y | move x y -> move2 : As above, moving regions to the stack
move x r | move x y -> move2 : A seemingly common pattern
|
|
|
|
* i_is_lt for r, x registers and constants
* i_is_ge for x registers and constants
* i_is_exact_eq for r and x registers
|
|
This fixes arity 4 BIF support in HiPE, following its introduction
on master (OTP 18) via the nox/ets-update_counter-4 merge.
- define standard_bif_interface_4, nbif_4_gc_after_bif, and
nbif_4_simple_exception on ARM: unbreaks the build on ARM
- remove bogus stack re-alignment from standard_bif_interface_4
on AMD64: for 4-arg BIFs the stack is already aligned, and the
code would misalign the C stack which violates the ABI and may
cause alignment faults in vectorized code
- the nbif_4_simple_exception OPD name on PPC64 was incorrectly
using the nbif_3_simple_exception OPD name: this would have
caused a multiple definition error in the assembler or linker
In addition there are a few cleanups:
- fix standard_bif_interface_N comment on x86
- fix standard_bif_interface_4 comment on SPARC
- separate nbif_N_simple_exception blocks by empty lines on PPC,
like on ARM, to clearly show which things belong together
- fix standard_bif_interface_N comment on ARM
- fix standard_bif_interface_4 on AMD64 to match the indentation
and spacing conventions of the rest of that file
|
|
* sverk/pr632/prevent-illegal-nif-terms/OTP-12655:
erts: Reject non-finite float terms in erl_drv_output_term
erts: Remove old docs about experimental NIF versions.
erts: Add enif_has_pending_exception
erts: Clearify erl_nif documentation about badarg exception
erts: Fix compile warning in enif_make_double
erts: Fix divide by zero compile error in nif_SUITE.c
erts: Fix isfinite for windows
Ensure NIF term creation disallows illegal values
|
|
Increases float comparison speed by ~120%
|
|
* sverk/etp-map:
erts: Add map support to gdb etp command
erts: Add etp_the_non_value
|
|
Only call emasculate_binary if ProcBin.flags is set,
which means it's a writable binary.
|
|
Seen symptom: Hipe compiled code with <<C/utf8, ...>> = Bin does sometimes
not match even though Bin contains a valid utf8 character. There might be
other possible binary matching symptoms, as the problem is not utf8
specific.
Problem: A writable binary was not "emasculated" when the matching started
(as it should) by the hipe compiled code.
Fix: Add a new primop emasculate_binary(Bin) that is called when
a matchstate is created.
ToDo: There are probably room for optimization. For example only call
emasculate_binary if ProcBin.flags is set.
|
|
|
|
* egil/cmp-immediate-optimization/OTP-12663:
erts: Optimize comparison operator for frequent immediates
|
|
* Assertion is only removed because we are in icount mode.
|
|
* We use compile directive icount instead
|
|
* egil/maps-refactor:
erts: Use make_small for size terms on flat maps
Conflicts:
erts/emulator/beam/erl_bif_guard.c
|
|
* bjorn/maps:
Document the new {badmap,Term} and {badkey,Key} exceptions
Raise more descriptive error messages for failed map operations
erl_term.h: Add is_not_map() macro
Tigthen code for the i_get_map_elements/3 instruction
Pre-compute hash values for the general get_map_elements instruction
Teach the loader to pre-compute the hash value for single-key lookups
Optimize use of i_get_map_element/4
beam_emu: Slightly optimize update_map_{assoc,exact}
v3_codegen: Don't sort map keys in map creation/update
beam_validator: No longer require strict literal term order
Sort maps keys in the loader
De-optimize the has_map_fields instructions
erts/map_SUITE.erl: Add a test case that tests has_map_fields
Fully evaluate is_map/1 for literals at load-time
map_SUITE: Add tests of is_map/1 with literal maps
Run a clone of map_SUITE without optimizations
Remove the fail label operand of the new_map instruction
Correct transformation of put_map_assoc to new_map
Remove support for put_map_exact without a source map
|
|
* egil/refactor-message-queue-probes:
erts: Refactor dtrace call probes
erts: Refactor erts_queue_message
|
|
* sverk/nlmills-pr653/efile-union-pwritev/OTP-12653:
erts: Cleanup code in invoke_pwritev
Use the correct union member inside efile_drv
|
|
|
|
|
|
|
|
|
|
Add macro erts_isfinite as an OS independent way to check
if a float is finite.
|
|
* small integers
* atoms
|
|
According to EEP-43 for maps, a 'badmap' exception should be
generated when an attempt is made to update non-map term such as:
<<>>#{a=>42}
That was not implemented in the OTP 17.
José Valim suggested that we should take the opportunity to
improve the errors coming from map operations:
http://erlang.org/pipermail/erlang-questions/2015-February/083588.html
This commit implement better errors from map operations similar
to his suggestion.
When a map update operation (Map#{...}) or a BIF that expects a map
is given a non-map term, the exception will be:
{badmap,Term}
This kind of exception is similar to the {badfun,Term} exception
from operations that expect a fun.
When a map operation requires a key that is not present in a map,
the following exception will be raised:
{badkey,Key}
José Valim suggested that the exception should be
{badkey,Key,Map}. We decided not to do that because the map
could potentially be huge and cause problems if the error
propagated through links to other processes.
For BIFs, it could be argued that the exceptions could be simply
'badmap' and 'badkey', because the bad map and bad key can be found in
the argument list for the BIF in the stack backtrace. However, for the
map update operation (Map#{...}), the bad map or bad key will not be
included in the stack backtrace, so that information must be included
in the exception reason itself. For consistency, the BIFs should raise
the same exceptions as update operation.
If more than one key is missing, it is undefined which of
keys that will be reported in the {badkey,Key} exception.
|
|
|
|
* egil/fix-maps-match_spec-return/OTP-12656:
erts: Fix building of Map result from match_specs
|
|
|
|
|
|
For consistency with other data types, add the is_not_map() macro.
|
|
|
|
See the previous commit for justification and use cases.
|
|
Let the loader pre-compute the hash value when a single, literal key
is matched as in:
#{<<"some_key">>:=V} = Map
In my measurements, this optimization resulted in a 30 percent
speedup for short binary keys.
Unfortunately, this optimizization makes no difference for small
maps with less than 32 keys, since the hash value is not used.
Still, there are the following use cases:
* A map used instead of a record with more than 32 entries. I have
seen some applications with huge records.
* Lookup in JSON dictionaries represented as maps.
The hash value will only be used when the map is a hash map
(currently, that means at least 32 entries).
|
|
In the i_get_map_element/4 instruction, for literal keys other than
atoms, the key would be put into x[0] instead of used directly in the
instruction. The reason is that the original implementation of maps
only supported atom keys.
|
|
In the update loop for big maps, the E variable is restored for
each turn of the loop. It only needs to be restored if a garbage
collection has been performed.
Also add a new test case that attempts to force several garbage
collections while updating a map, to help us find bugs with
incorrect restoration of the E variable after a garbage collection.
|
|
The map instructions require that the keys in the instructions
are sorted (for flatmaps). But that is an implementation detail
that should not exposed outside of the BEAM virtual machine.
Therefore, make the sorting of the keys the responsibility of
the loader and not the compiler.
Also note that the sort order for maps with numeric keys or keys
with numeric components has changed in OTP 18. That means that
code compiled for OTP 17 that operated on maps with map keys
might not work in OTP 18 without the sorting in the loader
(although it is unlikely to be an issue in practice).
|
|
The has_map_fields instruction is infrequently used. Thus there
is no need to have the fastest possible implementation; it is
better to have an implementation that reduces the code size in
the already big process_main() function.
We can transform has_map_fields to a get_map_elements instruction,
targeting the same unused x[0] register for all keys. That
instruction will only be marginally slower than existing
implementation.
|