Age | Commit message (Collapse) | Author |
|
Updating of the variable data base takes most of the time.
|
|
The use of lists:dropwhile/2 is noticeable in the eprof results.
|
|
|
|
lists:dropwhile/2 and the fun in btb_index_1/2 show up in the
top 10 list of eprof. Replace dropwhile with a special-purpose
function for a tiny increase in speed.
|
|
Profiling shows that the execution time for checkerror_1/2 could
be near the top even for modules without any floating point
operations.
It turns out that the complexity of simplify_float_1/4 is quadratic.
checkerror/1 is called with the growing accumulator for each
iteration. checkerror/1 will traverse the entire accumulated list
*unless* some floating point operations are used.
We can avoid this situation if we only call checkerror/1 when there
are live floating point registers. We can also avoid calling flush/3
if there are no live floating point registers.
|
|
The execution time for beam_utils:index_labels_1/2 is among
the longest in the beam_bool, beam_bsm, beam_receive, and
beam_trim compiler passes. Therefore it is worthwhile to do
the minor optimization of replacing a call to lists:dropwhile/2
with a special-purpose drop_labels function.
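As a rough illustration of the replacement described above (a sketch only; the module name and the with_dropwhile/1 wrapper are hypothetical, and the real code in beam_utils may differ):

```erlang
-module(drop_labels_sketch).
-export([with_dropwhile/1, drop_labels/1]).

%% Generic version: a fun is called for every element that is tested.
with_dropwhile(Is) ->
    lists:dropwhile(fun({label,_}) -> true;
                       (_) -> false
                    end, Is).

%% Special-purpose version: plain pattern matching in the function
%% head, so no fun call is needed for each element.
drop_labels([{label,_}|Is]) -> drop_labels(Is);
drop_labels(Is) -> Is.
```

Both functions return the instruction list with all leading {label,_} tuples removed.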
|
|
When matching a binary literal as in:
    <<"abc">> = Bin
the compiler will produce a sequence of three instructions
(some details in the instructions removed for simplicity):
    bs_start_match2 Fail BinReg CtxtReg
    bs_match_string Fail CtxtReg "abc"
    bs_test_tail2 Fail CtxtReg 0
The sequence can be replaced with:
    is_eq_exact Fail BinReg "abc"
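At the source level, the rewrite relies on the fact that matching a whole binary against a literal gives the same answer as an exact equality test. A small sketch of that equivalence (the module and function names are hypothetical, chosen for illustration):

```erlang
-module(bin_match_sketch).
-export([by_match/1, by_compare/1]).

%% Succeeds only if Bin is exactly the three bytes "abc".
by_match(Bin) ->
    case Bin of
        <<"abc">> -> true;
        _ -> false
    end.

%% The same test expressed as an exact comparison.
by_compare(Bin) ->
    Bin =:= <<"abc">>.
```

The two functions agree for every input, including longer binaries and terms that are not binaries.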
|
|
The actual bs_match_string instruction has four operands:
    bs_match_string {f,Lbl} Ctxt NumBits {string,ListOfBytes}
However, v3_codegen emits a more compact representation where
the bits to match are packaged in a bitstring:
    bs_match_string {f,Lbl} Ctxt Bitstring
Currently, beam_clean:clean_labels/1 will rewrite the compact
representation to the final representation. That is unfortunate
since clean_labels/1 is called by beam_dead, which means that
the less compact representation will be introduced long before
it is actually needed by beam_asm. It will also complicate any
optimizations that we might want to do.
Move the rewriting of bs_match_string from beam_clean:clean_labels/1
to the beam_z pass, which is the last pass executed before
beam_validator and beam_asm.
|
|
Commit b76588fb5a introduced an optimization of the compile time of
huge functions with many bs_match_string instructions. The
optimization is done in two passes. The first pass coalesces adjacent
bs_match_string instructions. To avoid copying bitstrings multiple
times, the bitstrings in the instructions are combined into a (deep)
list. The second pass goes through all instructions in the function
and combines the list of bitstrings to a single bitstring in all
bs_match_string instructions.
The second pass (fix_bs_match_string) is run on all instructions in
each function, even if there are no bs_match_string instructions in the
function. While fix_bs_match_string is not a bottleneck (it is a
linear pass), its execution time is noticeable when profiling some
modules.
Move the execution of the second pass to the select_binary()
function so that it will only be executed for instructions that
do binary matching. Also take the opportunity to optimize away
uses of bs_restore2 that occur directly after a bs_save2. That
optimization is currently done in beam_block, but it can be
done essentially for free in the same pass that fixes up
bs_match_string instructions.
|
|
Profiling shows that the execution time for "turning" y registers
is noticeable for some modules (e.g. S1AP-PDU-Contents from the
asn1 test suite). We can reduce the impact on running time by
special-casing important instructions. In particular, there is
no need to look for y registers in the list argument for a
select_val instruction.
|
|
Profiling shows that subst_vsub/3 dominates the running time. It
is therefore worthwhile optimizing it.
|
|
To run eprof for a compiler pass:
    erlc +'{eprof,beam_asm}' file.erl
The name of the compiler pass is the name as printed when the
'time' option is used. It is usually, but not always, the module
name for the compiler pass.
|
|
Several compiler passes have unnecessary wrapper functions that
can be easily eliminated.
|
|
If we want to have test cases that run eprof, we must make sure that
there are no modules loaded that don't have a working module_info/1
function, since eprof calls module_info(functions) to retrieve the
list of functions in the module. Some test cases load modules compiled
from Core Erlang that don't have any module_info/1 functions, so
we will need to make sure that all such modules have been unloaded.
Add z_SUITE:loaded/1 to run after all other test cases to verify that
all modules that the code server considers loaded are indeed loaded and
all have working module_info/0,1 functions.
|
|
For tidiness, always place .core files in data directories.
|
|
The .core or .S files that are compiled in the test cases
may lack module_info/0,1 functions, which will cause problems if
we (for example) try to run eprof later. To avoid that problem,
unload each module directly after testing it.
|
|
Don't unload modules using BIFs; use the code server to ensure
that code:all_loaded/0 only lists code that is actually loaded.
|
|
According to EEP-43 for maps, a 'badmap' exception should be
generated when an attempt is made to update a non-map term such as:
    <<>>#{a=>42}
That was not implemented in OTP 17.
José Valim suggested that we should take the opportunity to
improve the errors coming from map operations:
http://erlang.org/pipermail/erlang-questions/2015-February/083588.html
This commit implements better errors from map operations similar
to his suggestion.
When a map update operation (Map#{...}) or a BIF that expects a map
is given a non-map term, the exception will be:
    {badmap,Term}
This kind of exception is similar to the {badfun,Term} exception
from operations that expect a fun.
When a map operation requires a key that is not present in a map,
the following exception will be raised:
    {badkey,Key}
José Valim suggested that the exception should be
{badkey,Key,Map}. We decided not to do that because the map
could potentially be huge and cause problems if the error
propagated through links to other processes.
For BIFs, it could be argued that the exceptions could be simply
'badmap' and 'badkey', because the bad map and bad key can be found in
the argument list for the BIF in the stack backtrace. However, for the
map update operation (Map#{...}), the bad map or bad key will not be
included in the stack backtrace, so that information must be included
in the exception reason itself. For consistency, the BIFs should raise
the same exceptions as the update operation.
If more than one key is missing, it is undefined which of the
keys will be reported in the {badkey,Key} exception.
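A minimal sketch of how the two exception reasons can be observed, assuming a runtime that includes this change (OTP 18 or later); the module and function names are hypothetical:

```erlang
-module(map_errors_sketch).
-export([update/2, classify/1]).

%% Update key 'a', which must already exist (the := operator).
update(Map, Value) ->
    Map#{a := Value}.

%% Run a fun and report the map-related error reason, if any.
classify(Fun) ->
    try Fun() of
        Result -> {ok,Result}
    catch
        error:{badmap,Term} -> {badmap,Term};
        error:{badkey,Key} -> {badkey,Key}
    end.
```

For example, classify(fun() -> update(<<>>, 42) end) is expected to give {badmap,<<>>}, while passing #{b => 1} instead of <<>> is expected to give {badkey,a}.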
|
|
It is no longer necessary to sort the keys, since the loader
does the sorting.
|
|
The BEAM loader will now sort keys for maps during loading, so
beam_validator should not require the keys to be in any particular
order. However, we must still ensure that literal keys are unique
(which was implicitly guaranteed by the strict ordering requirement).
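One way to check uniqueness without relying on order is sketched below. This is an assumption-laden illustration, not the check that beam_validator actually performs; the module and function names are hypothetical:

```erlang
-module(unique_keys_sketch).
-export([all_unique/1]).

%% Keys are unique if putting them all into a map loses none of them.
%% Maps compare keys exactly, so 1 and 1.0 count as different keys.
all_unique(Keys) ->
    Map = maps:from_list([{K,[]} || K <- Keys]),
    maps:size(Map) =:= length(Keys).
```

The check accepts keys in any order, which is the property needed once the loader does the sorting.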
|
|
To be sure that the compiler and BEAM virtual machine correctly
handle literal maps, we must test them.
|
|
=== OTP-17.5 ===
Changed Applications:
- asn1-3.0.4
- common_test-1.10
- compiler-5.0.4
- crypto-3.5
- debugger-4.0.3
- dialyzer-2.7.4
- diameter-1.9
- eldap-1.1.1
- erts-6.4
- hipe-3.11.3
- inets-5.10.6
- kernel-3.2
- mnesia-4.12.5
- observer-2.0.4
- os_mon-2.3.1
- public_key-0.23
- runtime_tools-1.8.16
- ssh-3.2
- ssl-6.0
- stdlib-2.4
- syntax_tools-1.6.18
- test_server-3.8
- tools-2.7.2
- wx-1.3.3
Unchanged Applications:
- cosEvent-2.1.15
- cosEventDomain-1.1.14
- cosFileTransfer-1.1.16
- cosNotification-1.1.21
- cosProperty-1.1.17
- cosTime-1.1.14
- cosTransactions-1.2.14
- edoc-0.7.16
- erl_docgen-0.3.7
- erl_interface-3.7.20
- et-1.5
- eunit-2.2.9
- gs-1.5.16
- ic-4.3.6
- jinterface-1.5.12
- megaco-3.17.3
- odbc-2.10.22
- orber-3.7.1
- ose-1.0.2
- otp_mibs-1.0.10
- parsetools-2.0.12
- percept-0.8.10
- reltool-0.6.6
- sasl-2.4.1
- snmp-5.1.1
- typer-0.9.8
- webtool-0.8.10
- xmerl-1.3.7
Conflicts:
OTP_VERSION
erts/vsn.mk
lib/ssl/vsn.mk
|
|
|
|
|
|
|
|
* bjorn/doc:
cerl_trees: Fix incorrect EDoc reference to the cerl module
cerl: Correct incorrect EDoc references
|
|
|
|
* rickard/time_api/OTP-11997: (22 commits)
Update primary bootstrap
inets: Suppress deprecated warning on erlang:now/0
inets: Cleanup of multiple copies of functions Add inets_lib with common functions used by multiple modules
inets: Update comments
Suppress deprecated warning on erlang:now/0
Use new time API and be back-compatible in inets Remove unused functions and removed redundant test
asn1 test SUITE: Eliminate use of now/0
Disable deprecated warning on erlang:now/0 in diameter_lib
Use new time API and be back-compatible in ssh
Replace all calls to now/0 in CT with new time API functions
test_server: Replace usage of erlang:now() with usage of new API
Replace usage of erlang:now() with usage of new API
Replace usage of erlang:now() with usage of new API
Replace usage of erlang:now() with usage of new API
Replace usage of erlang:now() with usage of new API
otp_SUITE: Warn for calls to erlang:now/0
Replace usage of erlang:now() with usage of new API
Multiple timer wheels
Erlang based BIF timer implementation for scalability
Implement ethread events with timeout
...
Conflicts:
bootstrap/bin/start.boot
bootstrap/bin/start_clean.boot
bootstrap/lib/compiler/ebin/beam_asm.beam
bootstrap/lib/compiler/ebin/compile.beam
bootstrap/lib/kernel/ebin/auth.beam
bootstrap/lib/kernel/ebin/dist_util.beam
bootstrap/lib/kernel/ebin/global.beam
bootstrap/lib/kernel/ebin/hipe_unified_loader.beam
bootstrap/lib/kernel/ebin/inet_db.beam
bootstrap/lib/kernel/ebin/inet_dns.beam
bootstrap/lib/kernel/ebin/inet_res.beam
bootstrap/lib/kernel/ebin/os.beam
bootstrap/lib/kernel/ebin/pg2.beam
bootstrap/lib/stdlib/ebin/dets.beam
bootstrap/lib/stdlib/ebin/dets_utils.beam
bootstrap/lib/stdlib/ebin/erl_tar.beam
bootstrap/lib/stdlib/ebin/escript.beam
bootstrap/lib/stdlib/ebin/file_sorter.beam
bootstrap/lib/stdlib/ebin/otp_internal.beam
bootstrap/lib/stdlib/ebin/qlc.beam
bootstrap/lib/stdlib/ebin/random.beam
bootstrap/lib/stdlib/ebin/supervisor.beam
bootstrap/lib/stdlib/ebin/timer.beam
erts/aclocal.m4
erts/emulator/beam/bif.c
erts/emulator/beam/erl_bif_info.c
erts/emulator/beam/erl_db_hash.c
erts/emulator/beam/erl_init.c
erts/emulator/beam/erl_process.h
erts/emulator/beam/erl_thr_progress.c
erts/emulator/beam/utils.c
erts/emulator/sys/unix/sys.c
erts/preloaded/ebin/erlang.beam
erts/preloaded/ebin/erts_internal.beam
erts/preloaded/ebin/init.beam
erts/preloaded/src/erts_internal.erl
lib/common_test/test/ct_hooks_SUITE_data/cth/tests/empty_cth.erl
lib/diameter/src/base/diameter_lib.erl
lib/kernel/src/os.erl
lib/ssh/test/ssh_basic_SUITE.erl
system/doc/efficiency_guide/advanced.xml
|
|
|
|
Change c_nil/1 to c_nil/0.
Change c_try/3 to c_try/5.
Change c_var_name/1 to var_name/1.
|
|
* egil/maps/hamt/OTP-12585: (113 commits)
erts: Fix bug in ESTACK and WSTACK
kernel: Add spec for erts_debug:map_info/1
mnesia: Update mnesia tests to reflect new ETS hash
erts: Ensure maps uses _rel functions in halfword
erts: Do not treat errors as fatal in erl_printf_term
erts: Update preloaded erts_internal.beam
erts: Add map decomposition wrappers
erts: Ensure halfword has correct temp-heap for maps
hipe: Handle separate hashmap tag correctly
erts: Fix map bug in dec_term for 32-bit debug VM
stdlib: Update qlc tests to reflect new ETS hash
stdlib: Remove obsolete hashmap references in io_lib
erts: Enhance maps ordering tests
hipe: Fix maps sort order testcase
erts: Remove unused variable in crashdump creation
erts: Fix typo in copy_struct for halfword emulator
erts: Restrict GCC intrinsics by compiler version
erts: Fix windows bug in hashmap_info
erts: Fix typo in make_hash2 for 32-bit arch
Fix beam_load assert
...
Conflicts:
erts/emulator/beam/bif.tab
|
|
where key 1 is less than key 1.0
|
|
For unclear reasons, there are two functions in v3_life that are
almost identical: literal/2 and literal2/2. literal/2 is used
for expressions and literal2/2 for patterns.
It turns out that literal2/2 can do everything that literal/2 can
do, except that it transforms maps differently.
If we adjust v3_codegen to accept the same format of maps in
expressions and patterns, we can rename literal2/2 to literal/2
and use it for expressions and patterns.
|
|
v3_codegen puts the compilation options in the process dictionary,
but never uses them.
|
|
Inlining the core_parse module is slow (the inline pass alone
takes more than 6 seconds on my computer) and has no benefit.
|
|
Duplicated variables as aliases in patterns, such as:
    f({_,_}=Dup=Dup) -> ...
will work, but produce sub-optimal code similar to:
    f({_,_}=Dup=NewVar) when Dup =:= NewVar -> ...
with one extra guard test for each duplicated variable.
Rewrite pat_alias/2 to eliminate all duplicated variables. While
we are at it, also simplify handling of tuples, conses, and literals
by using the data functions in the cerl module.
|
|
|
|
|
|
|
|
initialized_regs/2 did not handle allocating instructions;
instead treating them as any other 'set' instruction.
The consequences could be one or both of the following:
Going past the allocating instruction (looking at more instructions)
would mean that initialized_regs/2 could return registers that were
not actually initialized. That could mean that MustBeKilled in
ensure_opt_safe/6 could contain too few registers, and that the code
that followed tried to use an uninitialized register. The
beam_validator should have detected that problem.
Not taking into account the number of live registers in the
allocating instruction could mean that some registers
were not found to be initialized, which could mean that
MustBeKilled would contain too many registers. That would
mean a missed optimization.
|
|
For a long time, there has been an optimization for:
    {V1,V2,...} = case Expr of Pat -> ... {Val1,Val2,...}; ... end
that avoids building the tuples. The construct looks like this
in Core Erlang:
    let <V> = case X of
                  Pattern -> {Y,Z}
              end
    in case V of
           {A,B} -> A+B
       end
The current optimization will try to replace the second 'case'
with a 'let':
    let <A,B> = case X of
                    Pattern -> <Y,Z>
                end
    in A+B
Simple variations of the construct would prevent the optimizations;
for example this one:
    let <V> = case X of
                  Pattern -> {'ok',Val}
              end
    in case V of
           {ok,Val} -> Val
       end
The problem is that the optimization tries to do too much. By
making the optimization do less and have it depend on other
optimizations to finish the job, it will become more powerful.
Thus we can rewrite the code like this:
    let <V1,V2> = case X of
                      Pattern -> <'ok',Val>
                  end
    in let <V> = {V1,V2}
       in case V of
              {ok,Val} -> Val
          end
Note that the second case is unchanged. The other optimizations in the
sys_core_fold module will optimize the second 'case' and eliminate the
building of the tuple.
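For reference, an Erlang-level function of the shape that this optimization targets might look like the sketch below (the module name, function name, and patterns are hypothetical). Each clause of the 'case' builds a tuple that is immediately matched apart again:

```erlang
-module(tuple_case_sketch).
-export([sum/1]).

%% The tuples built in the clauses exist only to be matched against
%% {A,B}, so the compiler can avoid building them.
sum(X) ->
    {A,B} = case X of
                {pair,Y,Z} -> {Y,Z};
                [Y,Z] -> {Y,Z}
            end,
    A + B.
```

The result is the same whether or not the tuples are built; only the generated code differs.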
|
|
Optimize away 'not' in sys_core_fold instead of in beam_block
and beam_dead, as we can do a better job in sys_core_fold.
I modified the test suite temporarily to never turn off Core Erlang
modifications and looked at the coverage. With the new optimizations
active in sys_core_fold, the code in beam_block and beam_dead did not
find a single 'not' that it could optimize. That proves that the new
optimization is at least as good as the old one. Manually, I could
also verify that the new optimization would optimize some variations
of 'not' that the old one would not handle.
|
|
More aggressive optimizations that we plan to introduce could cause
spurious compiler warnings.
|
|
The 'try' ... 'catch' is problematic. Firstly, if no optimization
is possible, an exception will always be thrown. Secondly, bugs
in the code will go unnoticed.
|
|
'=:=' is a cheaper operation than '==', so we should always
use '=:=' if the result will be the same as if '==' were used.
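The two operators differ only in how they treat integers and floats, which the following sketch shows (the module and function names are hypothetical):

```erlang
-module(eq_sketch).
-export([compare/2]).

%% Return the results of both comparison operators for two terms.
compare(A, B) ->
    {A == B, A =:= B}.
```

Here compare(1, 1.0) gives {true,false}, because '==' converts between integers and floats while '=:=' does not. compare(ok, ok) gives {true,true}: when one operand is known not to contain a number, such as an atom, the two operators always agree, so '=:=' can safely be used.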
|
|
|
|
Make sure that we extract all possible type information when
optimizing a 'let' construct.
Since the stronger optimization may generate false warnings, we also
need to take special care to suppress false warnings.
|
|
The put_map_assoc and put_map_exact instructions in the run-time
system will support that the target register is the same as one of
the source registers. Teach the code generator to take advantage
of that.
The disadvantage of not reusing registers when possible is that the
garbage collector may retain dead terms longer than necessary.
|
|
|
|
get_ianno/1 would retrieve either a bare annotation or an
annotation wrapped in an #a{} record. In both cases, it would
return a wrapped annotation.
We can replace the calls to get_ianno/1 with calls to get_anno/1,
because the argument is always an #iclause{} and all iclause records
are always initialized with a wrapped annotation.
|