Age | Commit message | Author |
|
Instructions that produce more than one result complicate
optimizations. get_list/3 is one of two instructions that
produce multiple results (get_map_elements/3 is the other).
Introduce the get_hd/2 and get_tl/2 instructions
that return the head and tail of a cons cell, respectively,
and use them internally in all optimization passes.
For efficiency, we still want to use get_list/3 if both
head and tail are used, so we will translate matching pairs
of get_hd and get_tl back to get_list instructions.
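The translation back to get_list can be pictured with the following minimal sketch. It is not the compiler's actual pass: the module name, the function name and the tuple shapes used for the instructions are assumptions made for illustration, and the sketch only merges adjacent pairs whose three registers are all distinct, which is a deliberately conservative condition.

```erlang
-module(get_list_sketch).
-export([combine/1]).

%% Hypothetical, simplified pass: merge an adjacent get_hd/get_tl pair
%% that reads the same source register into one get_list instruction.
%% Instructions are modelled as {get_hd,Src,Dst}, {get_tl,Src,Dst} and
%% {get_list,Src,Hd,Tl} (assumed shapes for this illustration).
combine([{get_hd,Src,Hd},{get_tl,Src,Tl}|Is])
  when Src =/= Hd, Src =/= Tl, Hd =/= Tl ->
    [{get_list,Src,Hd,Tl}|combine(Is)];
combine([{get_tl,Src,Tl},{get_hd,Src,Hd}|Is])
  when Src =/= Hd, Src =/= Tl, Hd =/= Tl ->
    [{get_list,Src,Hd,Tl}|combine(Is)];
combine([I|Is]) ->
    %% Anything else is left untouched.
    [I|combine(Is)];
combine([]) ->
    [].
```

A lone get_hd or get_tl passes through unchanged, so code that uses only the head or only the tail keeps the single-result instruction.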
|
|
Do local common sub expression elimination (CSE)
|
|
Do some minor optimizations of binary matching
|
|
* ingela/DTLS-supported:
ssl: Fix typo
dtls: Add DTLS handling to utility functions
ssl: Document enhancement
ssl: Document DTLS
|
|
|
|
* dgud/wx/fix-driver-usage:
wx: open_port doesn't allow 0 terminated strings anymore
|
|
Optimize away unnecessary test_unit instructions that verify that
binaries are byte-aligned. In a tight loop, eliminating an
instruction can give a small but measurable improvement in
execution time.
|
|
Separate the simplification of instructions from the updating of the
type database.
|
|
Conflicts:
lib/stdlib/src/gen_statem.erl
|
|
* raimo/stdlib/optimize-gen_statem:
Optimize plain call response time
Correct typo in design principles for gen_statem
|
|
|
|
* ingela/ssl/record-version-check/OTP-14892:
ssl: Add record version sanity check
|
|
|
|
|
|
|
|
Fix rounding bug in float_to_list/2
|
|
Extend an existing optimization in beam_dead to avoid
creating a match context when matching an empty binary.
|
|
Eliminate repeated evaluation of guard BIFs and building of cons cells
in blocks. This optimization is applicable in more places than might be
expected, because code generation for binaries and records can generate
common subexpressions not visible in the original source code.
For example, consider this function:
    make_binary(Term) ->
        Bin = term_to_binary(Term),
        Size = byte_size(Bin),
        <<Size:32,Bin/binary>>.
The compiler inserts a call to byte_size/1 to calculate the size of
the binary being built:
{function, make_binary, 1, 2}.
{label,1}.
{line,...}.
{func_info,{atom,t},{atom,make_binary},1}.
{label,2}.
{allocate,0,1}.
{line,...}.
{call_ext,1,{extfunc,erlang,term_to_binary,1}}.
{line,...}.
{gc_bif,byte_size,{f,0},1,[{x,0}],{x,1}}. %Present in original code.
{line,...}.
{gc_bif,byte_size,{f,0},2,[{x,0}],{x,2}}. %Inserted by compiler.
{bs_add,{f,0},[{x,2},{integer,4},1],{x,2}}.
{bs_init2,{f,0},{x,2},0,2,{field_flags,[]},{x,2}}.
{bs_put_integer,{f,0},{integer,32},1,{field_flags,[unsigned,big]},{x,1}}.
{bs_put_binary,{f,0},{atom,all},8,{field_flags,[unsigned,big]},{x,0}}.
{move,{x,2},{x,0}}.
{deallocate,0}.
return.
Common subexpression elimination (CSE) eliminates the second call to
byte_size/1:
{function, make_binary, 1, 2}.
{label,1}.
{line,...}.
{func_info,{atom,t},{atom,make_binary},1}.
{label,2}.
{allocate,0,1}.
{line,...}.
{call_ext,1,{extfunc,erlang,term_to_binary,1}}.
{line,...}.
{gc_bif,byte_size,{f,0},1,[{x,0}],{x,1}}.
{move,{x,1},{x,2}}.
{bs_add,{f,0},[{x,2},{integer,4},1],{x,2}}.
{bs_init2,{f,0},{x,2},0,2,{field_flags,[]},{x,2}}.
{bs_put_integer,{f,0},{integer,32},1,{field_flags,[unsigned,big]},{x,1}}.
{bs_put_binary,{f,0},{atom,all},8,{field_flags,[unsigned,big]},{x,0}}.
{move,{x,2},{x,0}}.
{deallocate,0}.
return.
Note: A possible future optimization would be to include binary
construction instructions in blocks. If that is done, the
{move,{x,1},{x,2}} instruction could also be eliminated.
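As a rough illustration of local CSE within a block, here is a minimal sketch. It is not the code from this commit: the module name and the {bif,Name,Args,Dst} instruction shape are simplifications assumed for the example, and every BIF is assumed to be side-effect free. The sketch remembers which register holds the result of each BIF call, replaces a repeated call with a move, and forgets an expression as soon as its result register or one of its operands is overwritten.

```erlang
-module(cse_sketch).
-export([cse/1]).

%% Hypothetical model: {bif,Name,Args,Dst} computes a pure BIF into Dst.
%% The map goes from {Name,Args} to the register holding the result.
cse(Is) ->
    cse(Is, #{}).

cse([{bif,Name,Args,Dst}=I|Is], Es0) ->
    Key = {Name,Args},
    case Es0 of
        #{Key := Src} ->
            %% Already computed in this block: copy the earlier result.
            Es = kill(Dst, Es0),
            [{move,Src,Dst}|cse(Is, Es)];
        #{} ->
            Es1 = kill(Dst, Es0),
            Es = case lists:member(Dst, Args) of
                     true -> Es1;              %Operand overwritten; not reusable.
                     false -> Es1#{Key => Dst}
                 end,
            [I|cse(Is, Es)]
    end;
cse([I|Is], _Es) ->
    %% Unknown instruction: conservatively forget everything.
    [I|cse(Is, #{})];
cse([], _Es) ->
    [].

%% Forget every expression that lives in Reg or that uses Reg as an operand.
kill(Reg, Es) ->
    maps:filter(fun({_,Args}, Src) ->
                        Src =/= Reg andalso not lists:member(Reg, Args)
                end, Es).
```

Applied to the two byte_size calls from the listing above (written in the simplified shape), the second call becomes {move,{x,1},{x,2}}, which mirrors the optimized listing.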
|
|
Don't build a stacktrace if it's only passed to erlang:raise/3
|
|
* raimo/stdlib/gen-bench-fsm-vs-statem:
Dodge divide by zero
Introduce gen_statem vs gen_fsm benchmark
Remove test suite warning
|
|
The following sequence in a block:
{move,{x,1},{x,2}}.
{move,{x,2},{x,2}}.
would be incorrectly rewritten to:
{move,{x,2},{x,2}}.
(Which in turn would be optimized away a little bit later.)
|
|
When attempting to eliminate the move/2 instruction in the following
code:
{bif,self,{f,0},[],{x,0}}.
{move,{x,0},{x,1}}.
.
.
.
{put_tuple,2,{x,1}}.
{put,{atom,ok}}.
{put,{x,0}}.
beam_block would produce the following unsafe code:
{bif,self,{f,0},[],{x,1}}.
.
.
.
{put_tuple,2,{x,1}}.
{put,{atom,ok}}.
{put,{x,1}}.
It is unsafe because the tuple is self-referential.
The following code:
{put_list,{y,6},nil,{x,4}}.
{move,{x,4},{x,5}}.
{put_list,{y,1},{x,5},{x,5}}.
.
.
.
{put_tuple,2,{x,6}}.
{put,{x,4}}.
{put,{x,5}}.
would be incorrectly transformed to:
{put_list,{y,6},nil,{x,5}}.
{put_list,{y,1},{x,5},{x,5}}.
.
.
.
{put_tuple,2,{x,6}}.
{put,{x,5}}.
{put,{x,5}}.
(Both elements in the built tuple get the same value.)
|
|
* maint:
kernel: Correct contracts and a bug in group_history
stdlib: Correct contracts
dialyzer: Optimize handling of a lot of warnings
Conflicts:
lib/kernel/src/erl_boot_server.erl
|
|
* hasse/kernel-stdlib/fix_contracts/OTP-14889:
kernel: Correct contracts and a bug in group_history
stdlib: Correct contracts
dialyzer: Optimize handling of a lot of warnings
|
|
|
|
Make sure that there is the correct number of put/1 instructions
following put_tuple/2. Also make it illegal to reference the
register for the tuple being built in a put/1 instruction.
That is, beam_validator will now issue a diagnostic for the
following code:
{put_tuple,1,{x,0}}.
{put,{x,0}}.
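The two rules can be sketched as a small stand-alone checker. This is only an illustration and not beam_validator's implementation: the module name, the function names and the error terms are hypothetical, and the instruction stream is reduced to a flat list of tuples.

```erlang
-module(put_tuple_check).
-export([check/1]).

%% Hypothetical checker: every {put_tuple,Arity,Dst} must be followed by
%% exactly Arity {put,Src} instructions, none of which may read Dst.
check([{put_tuple,Arity,Dst}|Is0]) ->
    case take_puts(Arity, Dst, Is0) of
        {ok,Is} -> check(Is);
        {error,_}=Error -> Error
    end;
check([{put,_}|_]) ->
    %% A put/1 without a tuple under construction (for example, one too many).
    {error,not_building_a_tuple};
check([_|Is]) ->
    check(Is);
check([]) ->
    ok.

take_puts(0, _Dst, Is) ->
    {ok,Is};
take_puts(_N, Dst, [{put,Dst}|_]) ->
    %% The tuple being built is referenced by one of its own elements.
    {error,{tuple_in_progress,Dst}};
take_puts(N, Dst, [{put,_}|Is]) ->
    take_puts(N - 1, Dst, Is);
take_puts(N, _Dst, _Is) ->
    {error,{missing_puts,N}}.
```

With this sketch, the two-instruction sequence shown above is rejected with {error,{tuple_in_progress,{x,0}}}.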
|
|
|
|
|
|
If the number of warnings is huge, the '--'/2 operator is slow.
|
|
Consider the following function:
    function({function,Name,Arity,CLabel,Is0}, Lc0) ->
        try
            %% Optimize the code for the function.
        catch
            Class:Error:Stack ->
                io:format("Function: ~w/~w\n", [Name,Arity]),
                erlang:raise(Class, Error, Stack)
        end.
The stacktrace is retrieved, but it is only used in the call
to erlang:raise/3. There is no need to build a stacktrace
in this function. We can avoid the building if we introduce
an instruction called raw_raise/3 that works exactly like
the erlang:raise/3 BIF except that its third argument must
be a raw stacktrace.
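For reference, a self-contained version of the same pattern might look like the sketch below. The module and function names are hypothetical, and the Class:Error:Stack syntax requires OTP 21 or later. Stack is bound in the catch clause but is only handed on to erlang:raise/3, which is the situation described above.

```erlang
-module(reraise_sketch).
-export([with_context/2]).

%% Hypothetical helper: run Fun, and if it fails, print some context
%% before re-raising the original exception with its stacktrace.
with_context(Label, Fun) when is_function(Fun, 0) ->
    try
        Fun()
    catch
        Class:Error:Stack ->
            io:format("Failed in: ~p\n", [Label]),
            %% Stack is used nowhere else in this function.
            erlang:raise(Class, Error, Stack)
    end.
```

A call such as reraise_sketch:with_context(demo, fun() -> error(boom) end) prints the label and then fails with the original error:boom exception.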
|
|
* bjorn/erts/beam_debug:
beam_debug: Fix printing of f operand for catch_yf
beam_debug: Print out strings for bs_match_string/bs_put_string
beam_debug: Print the MFA in the i_make_fun/2 instruction
|
|
* maint:
ErLLVM: Preserve precise BEAM tailcall semantics
observer: Fix change accum
Remove double calls
observer: Don't crash for late messages
observer: Optimize tv tab for many tables
|
|
ErLLVM: Preserve precise BEAM tailcall semantics
OTP-14886
|
|
* dgud/observer/opt-tv-tab/OTP-14856:
observer: Fix change accum
Remove double calls
observer: Don't crash for late messages
observer: Optimize tv tab for many tables
|
|
Fix getenv usage.
Also remove the code that sets the path; the driver interface does that automatically.
|
|
* ingela/ssl/no-chacha-default-for-now/ERL-538/OTP-14882:
ssl: Remove chacha ciphers from default for now
|
|
We have discovered interoperability problems, ERL-538, that we
believe need to be solved in crypto.
|
|
* ingela/ssl/remove-3des-from-default/OTP-14768:
ssl: Remove 3DES cipher suites from default
|
|
The BEAM compiler chooses not to perform tailcall optimisations for some
calls in tail position, for example to some built-in functions. However,
when the ErLLVM HiPE backend is used, LLVM may choose to perform
tailcall optimisation on these calls, breaking the expected semantics.
To preserve the precise semantics exhibited by BEAM, the 'notail'
marker, present in LLVM since version 3.8, is added to call instructions
that BEAM has not turned into tail calls, which inhibits LLVM from
performing tail-call optimisation in turn.
|
|
* maint:
dialyzer: Fix bsl/2 bug
|
|
* hasse/dialyzer/fix_bsl:
dialyzer: Fix bsl/2 bug
|
|
|
|
|
|
|
|
sys_core_bsm: Rearrange arguments to enable delayed sub binary creation
|
|
* bjorn/erts/optimize-utf8/OTP-14774:
Optimize matching of a 'utf8' segment in the binary syntax
|
|
Matching out an 8-bit integer is faster than matching out
a utf8-encoded code point, even if the value of the code
point is less than 128. The reason is that matching out
an 8-bit integer is specially optimized to avoid a function
call. Do a similar optimization for matching out a utf8
segment.
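The two kinds of match being compared look as follows at the source level. This is only an illustration of the patterns (the module and function names are made up); it says nothing about how the emulator implements either match.

```erlang
-module(utf8_match_sketch).
-export([first_byte/1, first_code_point/1]).

%% Match out the first byte as an 8-bit integer.
first_byte(<<Byte:8,_/binary>>) ->
    Byte.

%% Match out the first code point of a utf8-encoded binary.
first_code_point(<<CodePoint/utf8,_/binary>>) ->
    CodePoint.
```

For a code point below 128 both functions return the same value, for example $a for <<"abc">>. They differ for multi-byte sequences: given <<195,165>> (the utf8 encoding of U+00E5), first_byte/1 returns 195 and first_code_point/1 returns 229.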
|
|
|
|
|
|
|