aboutsummaryrefslogtreecommitdiffstats
path: root/lib/compiler/src/compiler.app.src
AgeCommit message (Collapse)Author
2019-02-19Do the destructive setelement optimization in SSABjörn Gustavsson
The expansion of record field updates, when more than one field is updated, but not a majority of the fields, will create a sequence of calls to `erlang:setelement(Index, Value, Tuple)` where Tuple in the first call is the original record tuple, and in the subsequent calls Tuple is the result of the previous call. Furthermore, all Index values are constant positive integers, and the first call to `setelement` will have the greatest index. Thus all the following calls do not actually need to test at run-time whether Tuple has type tuple, nor that the index is within the tuple bounds. Since OTP R7, the `sys_core_dsetel` pass, run as the very last Core Erlang pass, has optimized this sequence of `setelement` calls to use a special destructive version of `setelement` (called `set_tuple_element`) for all but the very first `setelement` in the sequence. It turns out that the presence of the `set_tuple_element` in SSA code is awkward and can prevent or complicate type analysis and aggressive optimizations. Therefore, this commit removes the `sys_core_dsetel` pass and reimplements it for SSA code. The optimization will be done in the `beam_ssa_pre_codegen` pass (that is, just before code generation and after running all other SSA code optimization passes). In most cases, the resulting BEAM code is identical to previous code. For a few modules, the BEAM code is actually slightly better, with smaller stack frames.
2019-01-16Move optimizations of bs_put* instruction to beam_ssa_optBjörn Gustavsson
Do the optimizations of bs_put* instructions in beam_ssa_opt and remove the beam_bs pass. This can lead to a slight improvement of compilation times.
2018-11-28Share the code for semantically equivalent blocksBjörn Gustavsson
Share code for semantically equivalent blocks referred to to by `br` and `switch` instructions. A similar optimization is done in `beam_jump`, but doing it here as well is beneficial as it may enable other optimizations. Also, if there are many semantically equivalent clauses, this optimization can substanstially decrease compilation times.
2018-10-03Optimize named funs and fun-wrapped macrosJohn Högberg
If a fun is defined locally and only used for calls, it can be replaced with direct calls to the relevant function. This greatly speeds up "named functions" (which rely on make_fun to recreate themselves) and macros that wrap their body in a fun.
2018-09-28Rewrite BSM optimizations in the new SSA-based intermediate formatJohn Högberg
This commit improves the bit-syntax match optimization pass, leveraging the new SSA intermediate format to perform much more aggressive optimizations. Some highlights: * Watch contexts can be reused even after being passed to a function or being used in a try block. * Sub-binaries are no longer eagerly extracted, making it far easier to keep "happy paths" free from binary creation. * Trivial wrapper functions no longer disable context reuse.
2018-09-17Remove the beam_dead and beam_split passesBjörn Gustavsson
Most of the optimizations in beam_dead have been superseded by the optimizations in beam_ssa_dead. The forward/1 pass of beam_dead has been moved to beam_jump. The beam_split pass splits blocks that contain instructions with non-zero labels. Because there are no optimizations left that optimize instructions within blocks, beam_block never needs to put such instructions into blocks in the first place. beam_split also moved 'move' instructions out block to help beam_dead. That is no longer necessary since beam_dead no longer exists.
2018-09-17Add beam_ssa_dead.erlBjörn Gustavsson
Add beam_ssa_dead to perform the main optimizations done by beam_dead: * Shortcut branches that jump to another block with a branch. If it can be seen that the second branch will always branch to a specific block, replace the target of the first branch. * Combined nested sequences of '=:=' tests and switch instructions operating on the same variable to a single switch. Diffing the compiler output, it seems that beam_ssa_dead finds many more opportunities for optimizations than beam_dead, although it does not find all opportunities that beam_dead does. In total, beam_ssa_dead is such improvement over beam_dead that there is no reason to keep beam_dead as well as beam_ssa_dead. Note that beam_ssa_dead does not attempt to optimize away redundant bs_context_binary instructions, because that instruction will be superseded by new instructions in the near future.
2018-08-24Introduce a new SSA-based intermediate formatBjörn Gustavsson
v3_codegen is replaced by three new passes: * beam_kernel_to_ssa which translates the Kernel Erlang format to a new SSA-based intermediate format. * beam_ssa_pre_codegen which prepares the SSA-based format for code generation, including register allocation. Registers are allocated using the linear scan algorithm. * beam_ssa_codegen which generates BEAM assembly code from the SSA-based format. It easier and more effective to optimize the SSA-based format before X and Y registers have been assigned. The current optimization passes constantly have to make sure no "holes" in the X register assignments are created (that is, that no X register becomes undefined that an allocation instruction depends on). This commit also introduces the following optimizations: * Replacing of tuple matching of records with the is_tagged_tuple instruction. (Replacing beam_record.) * Sinking of get_tuple_element instructions to just before the first use of the extracted values. As well as potentially avoiding extracting tuple elements when they are not actually used on all executions paths, this optimization could also reduce the number values that will need to be stored in Y registers. (Similar to beam_reorder, but more effective.) * Live optimizations, removing the definition of a variable that is not subsequently used (provided that the operation has no side effects), as well strength reduction of binary matching by replacing the extraction of value from a binary with a skip instruction. (Used to be done by beam_block, beam_utils, and v3_codegen.) * Removal of redundant bs_restore2 instructions. (Formerly done by beam_bs.) * Type-based optimizations across branches. More effective than the old beam_type pass that only did type-based optimizations in basic blocks. * Optimization of floating point instructions. (Formerly done by beam_type.) * Optimization of receive statements to introduce recv_mark and recv_set instructions. More effective with far fewer restrictions on what instructions are allowed between creating the reference and entering the receive statement. * Common subexpression elimination. (Formerly done by beam_block.)
2018-08-17Don't run unsafe compiler passesBjörn Gustavsson
As a preparation for replacing v3_codegen with a new code generator, remove unsafe optimization passes. Especially the older compiler passes have implicit assumptions about how the code is generated. Remove the optimizations in beam_block (keep the code that creates blocks) because they are unsafe. beam_block also calls beam_utils:live_opt/1, which is unsafe. Remove beam_type because it calls beam_utils:live_opt/1, and also because it recalculates the number of heaps words and number of live registers in allocation instructions, thus potentially hiding bugs in other passes. Remove beam_receive because it is unsafe. Remove beam_record because it is the only remaining user of beam_utils:anno_defs/1. Remove beam_reorder because it makes much more sense to run it as an early SSA-based optimization pass. Remove the now unused functions in beam_utils: anno_def/1 delete_annos/1 is_killed_block/2 live_opt/1 usage/3 Note that the following test cases will fail because of the removed optimizations: compile_SUITE:optimized_guards/1 compile_SUITE:bc_options/1 receive_SUITE:ref_opt/1
2017-10-27Eliminate the v3_life passBjörn Gustavsson
The v3_life pass does not do enough to be worth being its own pass. Essentially it does two things: * Calculates life-time information starting from the annotations that v3_kernel provides. That part can be moved into v3_codegen. * Rewrites the Kernel Erlang records to similar plain tuples (for example, #k_cons{hd=Hd,tl=Tl} is rewritten to {cons,Hd,Tl}). That rewriting is not needed and can be eliminated.
2017-07-06Introduce a new core pass called sys_core_aliasJosé Valim
The goal of this pass is to find values that are built from patterns and generate aliases for those values to remove pressure from the GC. For example, this code: example({ok, Val}) -> {ok, Val}. shall become: example({ok, Val} = Tuple) -> Tuple. Currently this pass aliases tuple and cons nodes made of literals, variables and other cons. The tuple/cons may appear anywhere in the pattern and it will be aliased if used later on. Notice a tuple/cons made only of literals is not aliased as it may be part of the literal pool.
2017-06-14Update copyright yearHans Nilsson
2017-06-07Fix unsafe bit syntax matching optimizationBjörn Gustavsson
As part of sys_core_fold, variables involved in bit syntax matching would be annotated when it would be safe for a later pass to do the delayed sub-binary creation optimization. An implicit assumption regarding the annotation was that the code must not be further optimized. That assumption was broken in 05130e48555891, which introduced a fixpoint iteration (applying the optimizations until there were no more changes). That means that a variable could be annotated as safe for reusing the match context in one iteration, but a later iteration could rewrite the code in a way that would make the optimization unsafe. One way to fix this would be to clear all reuse_for_context annotations before each iteration. But that would be wasteful. Instead I chose to fix the problem by moving out the annotation code to a separate pass (sys_core_bsm) that is run later after all major optimizations of Core Erlang has been done.
2017-03-24compiler: Add is_tagged_tuple instructionBjörn-Egil Dahlberg
Rewrite the instruction stream on tagged tuple tests. Tagged tuples means a tuple of any arity with an atom as its first element. Typically records, ok-tuples and error-tuples. from: ... {test,is_tuple,Fail,[Src]}. {test,test_arity,Fail,[Src,Sz]}. ... {get_tuple_element,Src,0,Dst}. ... {test,is_eq_exact,Fail,[Dst,Atom]}. ... to: ... {test,is_tagged_tuple,Fail,[Src,Sz,Atom]}. ...
2016-11-18Remove beam_boolBjörn Gustavsson
The guard optimizations in v3_kernel has removed the need for beam_bool.
2016-09-29Merge branch 'rickard/time-unit/OTP-13831'Rickard Green
* rickard/time-unit/OTP-13831: Replace usage of deprecated time units
2016-09-01Remove sys_pre_expandBjörn Gustavsson
The previous commits have made sys_pre_expand superfluous. Since sys_pre_expand is undocumented and unsupported it can be removed in a major release without prior deprecation. Also remove code in erl_parse that handles abstract code that has passed through sys_pre_expand. We considered deprecating sys_pre_expand just in case, but decided against it for the following reasons: - Anyone brave and knowledgeable enough to use sys_pre_expand should be able to cope with sys_pre_expand being removed. - If we kept it, but didn't test it anywhere in OTP, it could potentially stop working. So we would probably have to add some test cases.
2016-08-25Replace usage of deprecated time unitsRickard Green
2016-03-15update copyright-yearHenrik Nord
2015-09-28Move out bit syntax optimizations from beam_blockBjörn Gustavsson
In the future we might want to add more bit syntax optimizations, but beam_block is already sufficiently complicated. Therefore, move the bit syntax optimizations out of beam_block into a separate compiler pass called beam_bs.
2015-08-21Delay get_tuple_element instructions until they are neededBjörn Gustavsson
When matching tuples, the pattern matching compiler would generate code that would fetch all elements of the tuple that will ultimately be used, *before* testing that (for example) the first element is the correct record tag. For example: is_tuple Fail {x,0} test_arity Fail {x,0} 3 get_tuple_element {x,0} 0 {x,1} get_tuple_element {x,0} 1 {x,2} get_tuple_element {x,0} 2 {x,3} is_eq_exact Fail {x,1} some_tag If {x,2} and {x,3} are not used at label Fail, we can re-arrange the code like this: is_tuple Fail {x,0} test_arity Fail {x,0} 3 get_tuple_element {x,0} 0 {x,1} is_eq_exact Fail {x,1} some_tag get_tuple_element {x,0} 1 {x,2} get_tuple_element {x,0} 2 {x,3} Doing that may be beneficial in two ways. If the branch is taken, we have eliminated the execution of two unnecessary instructions. Even if the branch is never or rarely taken, there is the possibility for more optimizations following the is_eq_exact instructions. For example, imagine that the code looks like this: get_tuple_element {x,0} 1 {x,2} get_tuple_element {x,0} 2 {x,3} move {x,2} {y,0} move {x,3} {y,1} Assuming that {x,2} and {x,3} have no further uses in the code that follows, that can be rewritten to: get_tuple_element {x,0} 1 {y,0} get_tuple_element {x,0} 2 {y,1} When should we perform this optimization? At the very latest, it must be done before opt_blocks/1 in beam_block which does the elimination of unnecessary moves. Actually, we want do the optimization before the blocks have been established, since moving instructions out of one block into another is cumbersome. Therefore, we will do the optimization in a new pass that is run before beam_block. A new pass will make debugging easier, and beam_block already has a fair number of sub passes.
2015-06-18Change license text to APLv2Bruce Yinhe
2015-05-26Merge branch 'egil/opt-compile-time/OTP-12774'Björn-Egil Dahlberg
* egil/opt-compile-time/OTP-12774: stdlib: Relax erl_anno_SUITE:is_anno/1 test Update primary bootstrap compiler: Use Maps as type information compiler: Use Maps instead of dict in beam_jump compiler: Use cerl_sets instead of gb_sets in beam_type compiler: Use Maps instead of gb_trees in beam_dead compiler: Use cerl_sets instead of gb_sets in beam_jump compiler: Use cerl_sets instead of sets in v3_kernel compiler: Use cerl_sets instead of gb_sets in sys_core_fold compiler: Add cerl_sets module compiler: Scope uses gb_sets not gb_trees beam_dict: Use Maps to map function name indices beam_dict: Use Maps to map line indices beam_dict: Use Maps to map atom indices v3_codegen: Use Maps to map local functions v3_life: Refactor variable db compiler: Use lc instead of map/1 in v3_codegen stdlib: Optimize erl_anno:is_string/1 Conflicts: bootstrap/lib/kernel/ebin/inet_dns.beam bootstrap/lib/stdlib/ebin/erl_anno.beam bootstrap/lib/stdlib/ebin/erl_lint.beam
2015-05-21compiler: Add cerl_sets moduleBjörn-Egil Dahlberg
A sets implementation based on maps.
2015-05-08Update runtime depencies for the compiler applicationBjörn Gustavsson
2015-03-20Merge branch 'rickard/time_api/OTP-11997'Rickard Green
* rickard/time_api/OTP-11997: (22 commits) Update primary bootstrap inets: Suppress deprecated warning on erlang:now/0 inets: Cleanup of multiple copies of functions Add inets_lib with common functions used by multiple modules inets: Update comments Suppress deprecated warning on erlang:now/0 Use new time API and be back-compatible in inets Remove unused functions and removed redundant test asn1 test SUITE: Eliminate use of now/0 Disable deprecated warning on erlang:now/0 in diameter_lib Use new time API and be back-compatible in ssh Replace all calls to now/0 in CT with new time API functions test_server: Replace usage of erlang:now() with usage of new API Replace usage of erlang:now() with usage of new API Replace usage of erlang:now() with usage of new API Replace usage of erlang:now() with usage of new API Replace usage of erlang:now() with usage of new API otp_SUITE: Warn for calls to erlang:now/0 Replace usage of erlang:now() with usage of new API Multiple timer wheels Erlang based BIF timer implementation for scalability Implement ethread events with timeout ... Conflicts: bootstrap/bin/start.boot bootstrap/bin/start_clean.boot bootstrap/lib/compiler/ebin/beam_asm.beam bootstrap/lib/compiler/ebin/compile.beam bootstrap/lib/kernel/ebin/auth.beam bootstrap/lib/kernel/ebin/dist_util.beam bootstrap/lib/kernel/ebin/global.beam bootstrap/lib/kernel/ebin/hipe_unified_loader.beam bootstrap/lib/kernel/ebin/inet_db.beam bootstrap/lib/kernel/ebin/inet_dns.beam bootstrap/lib/kernel/ebin/inet_res.beam bootstrap/lib/kernel/ebin/os.beam bootstrap/lib/kernel/ebin/pg2.beam bootstrap/lib/stdlib/ebin/dets.beam bootstrap/lib/stdlib/ebin/dets_utils.beam bootstrap/lib/stdlib/ebin/erl_tar.beam bootstrap/lib/stdlib/ebin/escript.beam bootstrap/lib/stdlib/ebin/file_sorter.beam bootstrap/lib/stdlib/ebin/otp_internal.beam bootstrap/lib/stdlib/ebin/qlc.beam bootstrap/lib/stdlib/ebin/random.beam bootstrap/lib/stdlib/ebin/supervisor.beam bootstrap/lib/stdlib/ebin/timer.beam erts/aclocal.m4 erts/emulator/beam/bif.c erts/emulator/beam/erl_bif_info.c erts/emulator/beam/erl_db_hash.c erts/emulator/beam/erl_init.c erts/emulator/beam/erl_process.h erts/emulator/beam/erl_thr_progress.c erts/emulator/beam/utils.c erts/emulator/sys/unix/sys.c erts/preloaded/ebin/erlang.beam erts/preloaded/ebin/erts_internal.beam erts/preloaded/ebin/init.beam erts/preloaded/src/erts_internal.erl lib/common_test/test/ct_hooks_SUITE_data/cth/tests/empty_cth.erl lib/diameter/src/base/diameter_lib.erl lib/kernel/src/os.erl lib/ssh/test/ssh_basic_SUITE.erl system/doc/efficiency_guide/advanced.xml
2015-03-20Replace usage of erlang:now() with usage of new APIRickard Green
2015-02-12Break out inlining of 'lists' functions to a new moduleBjörn Gustavsson
The code for inlining high-order functions from the lists module is quite annoying when you try to navigate the sys_core_fold module. Break out the code into its own module.
2014-03-20Introduce runtime_dependencies in .app filesRickard Green
Most dependencies introduced are exactly the dependencies to other applications found by xref. That is, there might be real dependencies missing. There might also be pure debug dependencies listed that probably should be removed. Each application has to be manually inspected in order to ensure that all real dependencies are listed. All dependencies introduced are to application versions used in OTP 17.0. This since the previously used version scheme wasn't designed for this, and in order to minimize the work of introducing the dependencies.
2013-01-25Update copyright yearsBjörn-Egil Dahlberg
2013-01-18Remove support for parameterized modulesBjörn Gustavsson
2012-10-09Introduce the mandatory beam_a and beam_z passesBjörn Gustavsson
Introduce the mandary beam_a pass that will be run directly after code generation, and the mandatory beam_z pass that will be run just before beam_asm. Since these passes surround the optimizations, beam_a can (for example) do instruction renaming to simplify the optimization passes and beam_z can undo those renamings.
2012-01-04Add the beam_except pass to optimize exceptionsBjörn Gustavsson
In order to save space, rewrite suitable calls to erlang:error/{1,2} to special BEAM instructions. This code is probably longer than the code taken out of v3_life and v3_codegen in the previous commit, but it is much easier to understand and maintain since the BEAM assembler format is better understood than the v3_life format.
2011-12-09Update copyright yearsBjörn-Egil Dahlberg
2011-12-06Teach the compiler the 'no_dead' optionBjörn Gustavsson
To facilitate debugging of compiler bugs, teach the compiler the 'no_dead' option. Since the beam_dead pass used to do the necessary splitting of basic blocks to expose all labels, we must move that splitting into a separate pass that is always run.
2010-05-12Merge branch 'bg/opt-receive' into devErlang/OTP
* bg/opt-receive: Test that gen_server:call/2,3 are fast even with a huge message queue erts: Add tests for the receive optimization Update primary bootstrap erts: Implement recv_mark/1 and recv_set/1 for real compiler tests: Cover the error handling code in beam_receive compiler test: Test optimization of receive statements Optimize selective receives in the presence of a large message queue Introduce the new recv_mark/1 and recv_mark/1 instructions Compile tests that communicate with R12 nodes with the r12 option Move p_run/2 to test_lib gen: Inline wait_resp_mon/2 to help the compiler optimize OTP-8623 bg/opt-receive reveive statements that can only read out a newly created reference are now specially optimized so that it will execute in constant time regardless of the number of messages in the receive queue for the process. That optimization will benefit calls to gen_server:call(). (See gen:do_call/4 for an example of a receive statement that will be optimized.)
2010-05-11Optimize selective receives in the presence of a large message queueBjörn Gustavsson
If a gen_server process has many messages in its message queue and calls another gen_server process, the selective receive in gen_server:call() will have to go through the entire message queue. Have the compiler generate the new mark_recv/1 and mark_recv/1 instructions that can avoid going through the entire message queue.
2009-11-20The R13B03 release.OTP_R13B03Erlang/OTP