Age | Commit message | Author |
|
Using maps as the type information container speeds up files like
cow_http_hd.erl by ~500ms. Previously, ~60% of the time was spent in
orddict:store/3.
|
|
A sets implementation based on maps.
|
|
Small speed increase for large files.
|
|
This reverts commit e09dd66dc4d89c62ddfd8c19791f9678d5d787c6.
|
|
* nox/compiler/parse_transform-undef/OTP-12723:
  Properly report unknown parse transforms
|
|
* bjorn/compiler/misc:
  test_lib: Simplify uniq/0
  beam_dict: Correct comparison in opcode/2
  beam_utils: Re-use the local helper function drop_labels/1
  beam_asm: Speed up encoding of large numbers
  compilation_SUITE: Speed up the self_compile test cases
  beam_listing: Optimize writing of .S files
  v3_core, v3_codegen: Eliminate old-style catches
  cerl_inline: Replace old-style 'catch' with 'try'...'catch'
  sys_core_fold: Suppress warnings better
  beam_utils: Teach check_liveness/3 to understand get_map_elements
  Teach beam_trim to handle map instructions
  beam_utils: Be less conservative about liveness for exit instructions
  beam_validator: Stop validating the 'aligned' flag for binaries
  beam_validator: Clean up updating of types for y register
  beam_validator: Remove support for removed BIF fault/1,2
  beam_validator: Correct merging of states
  beam_validator: Correct merging of y registers
  sys_pre_expand: Remove unused fields in #expand{} record
|
|
Simplify the uniq/0 function by using erlang:unique_integer/1.
|
|
The intention of the comparison is to avoid unnecessary updates of the
#asm{} record, so the comparison should use ">=" instead of ">". With
the ">" comparison, typically every line instruction would cause the
#asm{} record to be updated.
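The effect of the two comparisons can be illustrated with a minimal sketch. The module and function names below are hypothetical and the #asm{} record is reduced to a single tracked maximum; this is not the code in beam_dict, only a model of the ">" versus ">=" guard.

```erlang
-module(opcode_sketch).
-export([updates/2]).

%% Count how often the stored maximum would be rewritten while
%% scanning a list of opcode numbers.
%% Cmp is gt (the ">" guard) or ge (the ">=" guard).
updates(Cmp, Ops) ->
    updates(Cmp, Ops, 0, 0).

updates(_Cmp, [], _Highest, N) ->
    N;
updates(gt, [Op|Ops], Highest, N) when Highest > Op ->
    updates(gt, Ops, Highest, N);
updates(ge, [Op|Ops], Highest, N) when Highest >= Op ->
    updates(ge, Ops, Highest, N);
updates(Cmp, [Op|Ops], _Highest, N) ->
    %% Guard failed: store Op as the new maximum and count the update.
    updates(Cmp, Ops, Op, N + 1).
```

With Ops = [153,153,153,153] (153 is only a stand-in for a repeated opcode such as that of a line instruction), updates(gt, Ops) returns 4, while updates(ge, Ops) returns 1.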
|
|
In 8470558, the drop_labels/1 function was added to beam_utils
as a minor optimization. Since the function is already available,
we might as well use it in index_label/3 too.
|
|
The misc_SUITE:integer_encoding/1 test case is annoyingly slow.
Rewrite the encoding of integers in beam_asm to use the
binary:encode_unsigned/1 BIF.
Also tweak the test case itself. Scale down the maximum
size of the numbers being generated, but also add tests of
numbers around power-of-two boundaries (which are the numbers
most likely to expose bugs in the encoding).
|
|
It is not necessary to compile the compiler three times. After the
second compilation, we compare the generated .beam files with the
.beam files that were used when compiling them. Doing one more
round will not find more bugs.
While we are at it, remove the ?line macros and the unused make_current/1
function.
|
|
The test suites generate listing files, so we can slightly speed
up running of the test suites (especially when running 'cover') by
optimizing writing of .S files.
|
|
Using 'try'...'catch' simplifies the code and improves coverage
because we don't have to re-throw accidentally caught errors.
|
|
86fbd6d76d strengthened type optimization in lets. As a result of
the stronger optimizations, special care had to be taken to
suppress false warnings.
It turns out that false warnings can still slip through. Slapping
on a 'compiler_generated' annotation at the top-level of a
complex term such as #c_tuple{} may not suppress all warnings.
We will need to go deeper into the term to eliminate all warnings.
|
|
Understanding get_map_elements improves the stack trimming done
by beam_trim.
|
|
beam_utils used to be overly conservative about liveness for
exit instructions such as:
    call_ext erlang:exit/1
beam_utils would consider all y registers to be used, to avoid
overwriting a catch or try tag. That does not seem to be a real
risk.
However, we miss opportunities for stack trimming if we consider
y registers used by an exit instruction.
|
|
The run-time system stopped paying attention to the 'aligned' flag in bit
syntax construction and matching when bitstrings were introduced in
the language.
The beam_asm compiler pass will crash if the 'aligned' flag is given
in bit syntax instructions.
beam_validator still validates the 'aligned' flag. Before
912fea0b712a (which removed the possibility to validate existing
BEAM files), the 'aligned' flag could actually be encountered
when validating a BEAM file.
Since the validation of 'aligned' no longer serves any useful
purpose, remove the validation code.
|
|
set_type_y/3 is far too complicated. Note that we don't need to check
the #st.numy field, because we will detect the error anyway because
the information for the y register will be missing in the #st.y
gb_tree.
There is also a clause that would never match because of a spelling
error (the first "n" was missing in "uninitialized"). That clause
is not needed because the default clause will do fine.
Furthermore, we can break out the special case for handling catch_end
and similar instructions into a new function.
|
|
* bjorn/use-monotonic-time:
  supervisor: Correct restart handling
  test_server: Use erlang:monotonic_time/0
  compile: Teach 'time' option to show three significant decimals
  timer: Use monotonic_time/0 in tc/1,2,3
|
|
The fault/1,2 BIF was removed a long time ago.
|
|
When merging two states, the following fields should be merged
between the states: #st.x, #st.y, #st.numy, #st.ct. Everything
else should be set to the default values in a new state.
|
|
When merging y registers, only the y registers that are found in
both states should be retained.
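The rule for y registers can be sketched as an intersection of two register tables. The sketch below is an illustration only: it uses maps instead of the gb_tree held in #st.y, and the module name, merge_y/2, and the join/2 rule (equal types are kept, differing types fall back to the atom term) are assumptions rather than the beam_validator code.

```erlang
-module(merge_y_sketch).
-export([merge_y/2]).

%% Keep only the y registers that are present in both states.
%% YsA and YsB map a register number to a (hypothetical) type atom.
merge_y(YsA, YsB) ->
    maps:fold(
      fun(Reg, TypeA, Acc) ->
              case maps:find(Reg, YsB) of
                  {ok, TypeB} -> maps:put(Reg, join(TypeA, TypeB), Acc);
                  error -> Acc          % only in one state: dropped
              end
      end, #{}, YsA).

%% Hypothetical join of two types: identical types are kept,
%% anything else is widened to 'term'.
join(T, T) -> T;
join(_, _) -> term.
```

For example, merging #{0 => integer, 1 => atom} with #{1 => atom, 3 => float} keeps only register 1, because registers 0 and 3 each occur in just one of the states.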
|
|
The compile, bitdefault, and bittypes records are not really used
in the #expand{} record.
|
|
Updating of the variable data base takes most of the time.
|
|
The use of lists:dropwhile/2 is noticeable in the eprof results.
|
|
lists:dropwhile/2 and the fun in btb_index_1/2 show up in the
top 10 list of eprof. Replace dropwhile with a special-purpose
function for a tiny increase in speed.
|
|
Profiling shows that the execution time for checkerror_1/2 could
be near the top even for modules without any floating point
operations.
It turns out that the complexity of simplify_float_1/4 is quadratic.
checkerror/1 is called with the growing accumulator for each
iteration. checkerror/1 will traverse the entire accumulated list
*unless* some floating point operations are used.
We can avoid this situation if we only call checkerror/1 when there
are live floating point registers. We can also avoid calling flush/3
if there are no live floating point registers.
|
|
The execution time for beam_utils:index_labels_1/2 is among
the longest in the beam_bool, beam_bsm, beam_receive, and
beam_trim compiler passes. Therefore it is worthwhile to do
the minor optimization of replacing a call to lists:dropwhile/2
with a special-purpose drop_labels function.
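The kind of replacement described here might look as follows. This is a sketch under assumptions: the module name and the {label,L} tuple shape are chosen for illustration, and the body of the real drop_labels function in beam_utils may differ.

```erlang
-module(drop_labels_sketch).
-export([drop_with_dropwhile/1, drop_labels/1]).

%% Generic version: every element goes through a fun call.
drop_with_dropwhile(Is) ->
    lists:dropwhile(fun({label,_}) -> true;
                       (_) -> false
                    end, Is).

%% Special-purpose version: plain pattern matching on the list
%% head, with no fun to create or call.
drop_labels([{label,_}|Is]) -> drop_labels(Is);
drop_labels(Is) -> Is.
```

Both functions return the same result for the same input; the second only avoids the fun call made for each leading element.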
|
|
When matching a binary literal as in:
    <<"abc">> = Bin
the compiler will produce a sequence of three instructions
(some details in the instructions removed for simplicity):
    bs_start_match2 Fail BinReg CtxtReg
    bs_match_string Fail CtxtReg "abc"
    bs_test_tail2 Fail CtxtReg 0
The sequence can be replaced with:
    is_eq_exact Fail BinReg "abc"
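A rewrite of this shape can be sketched as a peephole pass over an instruction list. The tuple shapes below follow the simplified instructions shown in the message, not the real BEAM assembly terms, and the module name is hypothetical. A real pass would presumably also have to make sure that the match context is not used after the sequence, which this sketch does not check.

```erlang
-module(bin_literal_sketch).
-export([opt/1]).

%% Replace the three-instruction sequence with one comparison.
%% Repeating Fail and Ctx in the pattern requires all three
%% instructions to share the same failure label and context.
opt([{bs_start_match2,Fail,Bin,Ctx},
     {bs_match_string,Fail,Ctx,Str},
     {bs_test_tail2,Fail,Ctx,0} | Is]) ->
    [{is_eq_exact,Fail,Bin,Str} | opt(Is)];
opt([I|Is]) ->
    [I | opt(Is)];
opt([]) ->
    [].
```

The last instruction must test for a tail of 0 bits, because only then does the sequence require the binary to be exactly the literal.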
|
|
The actual bs_match_string instruction has four operands:
    bs_match_string {f,Lbl} Ctxt NumBits {string,ListOfBytes}
However, v3_codegen emits a more compact representation where
the bits to match are packaged in a bitstring:
    bs_match_string {f,Lbl} Ctxt Bitstring
Currently, beam_clean:clean_labels/1 will rewrite the compact
representation to the final representation. That is unfortunate
since clean_labels/1 is called by beam_dead, which means that
the less compact representation will be introduced long before
it is actually needed by beam_asm. It will also complicate any
optimizations that we might want to do.
Move the rewriting of bs_match_string from beam_clean:clean_labels/1
to the beam_z pass, which is the last pass executed before
beam_validator and beam_asm.
|
|
Commit b76588fb5a introduced an optimization of the compile time of
huge functions with many bs_match_string instructions. The
optimization is done in two passes. The first pass coalesces adjacent
bs_match_string instructions. To avoid copying bitstrings multiple
times, the bitstrings in the instructions are combined into a (deep)
list. The second pass goes through all instructions in the function
and combines the list of bitstrings to a single bitstring in all
bs_match_string instructions.
The second pass (fix_bs_match_string) is run on all instructions in
each function, even if there are no bs_match_string instructions in the
function. While fix_bs_match_string is not a bottleneck (it is a
linear pass), its execution time is noticeable when profiling some
modules.
Move the execution of the second pass to the select_binary()
function so that it will only be executed for instructions that
do binary matching. Also take the opportunity to optimize away
uses of bs_restore2 that occur directly after a bs_save2. That
optimization is currently done in beam_block, but it can be
done essentially for free in the same pass that fixes up
bs_match_string instructions.
|
|
Profiling shows that the execution time for "turning" y registers
is noticeable for some modules (e.g. S1AP-PDU-Contents from the
asn1 test suite). We can reduce the impact on running time by
special-casing important instructions. In particular, there is
no need to look for y registers in the list argument for a
select_val instruction.
|
|
Profiling shows that subst_vsub/3 dominates the running time. It
is therefore worthwhile optimizing it.
|