Age | Commit message (Collapse) | Author |
|
|
|
* hb/stdlib/update_gb_sets_type:
stdlib: Substitute set() for gb_sets:set() in gb_sets
|
|
|
|
* egil/eprof-totality/OTP-12681:
tools: Add printout of total number of calls and time in eprof
|
|
* egil/core-on-heart-tmo/OTP-12613:
kernel: Document heart environment HEART_KILL_SIGNAL
erts: Enable different abort signal from heart
|
|
* egil/opt-float-cmp:
erts: Brute force float comparisons as well
|
|
|
|
* ia/ssh/improve_docs:
ssh: Move code example to Users Guide
ssh: Keep dependency info in only one place
ssh: Add links
ssh: Align to alphabetic order
ssh: Change wording to become accurate
ssh: Remove extra whitespace
ssh: Corrected information about error and event logging
ssh: Remove legacy statement
ssh: Technically correct description
Editorial updates
|
|
|
|
Some examples had encountered the space eater.
|
|
|
|
|
|
Conflicts:
OTP_VERSION
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Also added some links
|
|
|
|
|
|
SSH application
|
|
* sverk/pr632/prevent-illegal-nif-terms/OTP-12655:
erts: Reject non-finite float terms in erl_drv_output_term
erts: Remove old docs about experimental NIF versions.
erts: Add enif_has_pending_exception
erts: Clearify erl_nif documentation about badarg exception
erts: Fix compile warning in enif_make_double
erts: Fix divide by zero compile error in nif_SUITE.c
erts: Fix isfinite for windows
Ensure NIF term creation disallows illegal values
|
|
Increases float comparison speed by ~120%
|
|
|
|
|
|
* hans/ssh/banner_grabbing/OTP-12659:
ssh: added id_string option for server and client
|
|
* hans/inets/banner_grabbing/OTP-12661:
inets: Add value 'none' in server_tokens config
|
|
|
|
* bjorn/compiler/eprof:
v3_life: Optimize updating of the variable data base
beam_jump: Replace use of lists:dropwhile/2 with a custom function
beam_asm: Eliminate unnecessary use of iolist_to_binary/1
beam_bsm: Optimize btb_index()
beam_type: Eliminate redundant calls to checkerror_1/2
erl_expand_records: Simplify handling of call_ext instructions
beam_utils: Optimize index_labels_1/2
beam_block: Optimize matching of binary literals
Move rewriting of bs_match from beam_clean to beam_z
v3_codegen: Reduce cost for fixing up bs_match_string instructions
v3_codegen: Optimize "turning" of y registers
v3_kernel: Optimize subst_vsub/3
orddict: Eliminate unnecessary consing in store/3 and others
compile: Add the {eprof,Pass} option for easy eprof running
compile: Eliminate unnecessary wrappers for compiler passes
Add z_SUITE to validate loaded code
test suite: Always place .core files in data directories
test suites: Unload modules compiled from .core or .S
compilation_SUITE: Unload tested modules using the code server
|
|
Updating of the variable data base takes most of the time.
|
|
The use of lists:dropwhile/2 is noticeable in the eprof results.
|
|
|
|
lists:dropwhile/2 and the fun in btb_index_1/2 shows up in the
top 10 list of eprof. Replace dropwhile with a special-purpose
function for a tiny increase in speed.
|
|
Profiling shows that the excution time for checkerror_1/2 could
be be near the top even for modules without any floating point
operations.
It turns out that the complexity of simplify_float_1/4 is quadratic.
checkerror/1 is called with the growing accumulator for each
iteration. checkerror/1 will traverse the entire accumulated list
*unless* some floating point operations are used.
We can avoid this situation if we only call checkerror/1 when there
are live floating point registers. We can also avoid calling flush/3
if there are no live floating point registers.
|
|
The erl_expand_records module have inherited code from sys_pre_expand.
We can simplify the code for handling the call_ext instruction to
make the code clearer and a smidge faster.
|
|
The execution time for beam_utils:index_labels_1/2 is among
the longest in the beam_bool, beam_bsm, beam_receive, and
beam_trim compiler passes. Therefore it is worthwhile to do
the minor optimization of replacing a call to lists:dropwhile/2
with a special-purpose drop_labels function.
|
|
When matching a binary literal as in:
<<"abc">> = Bin
the compiler will produce a sequence of three instructions
(some details in the instructions removed for simplicity):
bs_start_match2 Fail BinReg CtxtReg
bs_match_string Fail CtxtReg "abc"
bs_test_tail2 Fail CtxtReg 0
The sequence can be replaced with:
is_eq_exact Fail BinReg "abc"
|
|
The actual bs_match_string instruction has four operands:
bs_match_string {f,Lbl} Ctxt NumBits {string,ListOfBytes}
However, v3_codegen emits a more compact representation where
the bits to match are packaged in a bitstring:
bs_match_string {f,Lbl} Ctxt Bitstring
Currently, beam_clean:clean_labels/1 will rewrite the compact
representation to the final representation. That is unfortunate
since clean_labels/1 is called by beam_dead, which means that
the less compact representation will be introduced long before
it is actually needed by beam_asm. It will also complicate any
optimizations that we might want to do.
Move the rewriting of bs_match_string from beam_clean:clean_labels/1
to the beam_z pass, which is the last pass executed before
beam_validator and beam_asm.
|
|
Commit b76588fb5a introduced an optimization of the compile time of
huge functions with many bs_match_string instructions. The
optimization is done in two passes. The first pass coalesces adjacent
bs_match_string instructions. To avoid copying bitstrings multiple
times, the bitstrings in the instructions are combined in to a (deep)
list. The second pass goes through all instructions in the function
and combines the list of bitstrings to a single bitstring in all
bs_match_string instructions.
The second pass (fix_bs_match_string) is run on all instructions in
each function, even if there are no bs_match_instructions in the
function. While fix_bs_match_string is not a bottleneck (it is a
linear pass), its execution time is noticeable when profiling some
modules.
Move the execution of the second pass to the select_binary()
function so that it will only be executed for instructions that
do binary matching. Also take the opportunity to optimize away
uses of bs_restore2 that occour directly after a bs_save2. That
optimimization is currently done in beam_block, but it can be
done essentially for free in the same pass that fixes up
bs_match_string instructions.
|
|
Profiling shows that the execution time for "turning" y registers
is noticeable for some modules (e.g. S1AP-PDU-Contents from the
asn1 test suite). We can reduce the impact on running time by
special-casing important instructions. In particular, there is
no need to look for y registers in the list argument for a
select_val instruction.
|
|
Profiling shows that subst_vsub/3 dominates the running time. It
is therefore worthwhile optimizing it.
|
|
As a minor optimization, eliminate unnecessary cons operations
in store/3, append/3, append_list/3, update/4, and update_counter/3.
|
|
To run eprof for a compiler pass:
erlc +'{eprof,beam_asm}' file.erl
The name of the compiler pass is the name as printed when
'time' option is used. It is usually, but not always, the module
name for the compiler pass.
|
|
Several compiler passes have unnecessary wrapper functions that
can be easily eliminated.
|
|
If we want to have test cases that run eprof, we must make sure that
there are no modules loaded that don't have a working module_info/1
function, since eprof calls module_info(functions) to retrieve the
list of functions in the module. Some test cases load modules compiled
from Core Erlang that don't have any module_info/1 functions, so
we will need make sure that all such modules have been unloaded.
Add z_SUITE:loaded/1 to run after all other test cases to verify that
all modules that the code server consider loaded are indeed loaded and
all have working module_info/0,1 functions.
|
|
For tidiness, always place .core files in data directories.
|
|
The .core or .S files that are compiled in the test cases
may lack module_info/0,1 functions, which will cause problems if
we (for example) try to run eprof later. To avoid that problem,
unload each module directly after testing it.
|
|
* hb/hipe/opaque_bugfix/OTP-12666:
hipe: Fix a bug in the handling of opaque types
|