aboutsummaryrefslogtreecommitdiffstats
path: root/lib/compiler/src/v3_codegen.erl
AgeCommit message (Collapse)Author
2016-11-18v3_kernel: Generate optimized code for guardsBjörn Gustavsson
The compiler produces poor code for complex guard expressions with andalso/orelse. Here is an example from the filename module: -define(IS_DRIVELETTER(Letter),(((Letter >= $A) andalso (Letter =< $Z)) orelse ((Letter >= $a) andalso (Letter =< $z)))). skip_prefix(Name, false) -> Name; skip_prefix([L, DrvSep|Name], DrvSep) when ?IS_DRIVELETTER(L) -> Name; skip_prefix(Name, _) -> Name. beam_bool fails to simplify the code for the guard, leaving several 'bif' instructions: {function, skip_prefix, 2, 49}. {label,48}. {line,[{location,"filename.erl",187}]}. {func_info,{atom,filename},{atom,skip_prefix},2}. {label,49}. {test,is_ne_exact,{f,52},[{x,1},{atom,false}]}. {test,is_nonempty_list,{f,52},[{x,0}]}. {get_list,{x,0},{x,2},{x,3}}. {test,is_nonempty_list,{f,52},[{x,3}]}. {get_list,{x,3},{x,4},{x,5}}. {bif,'=:=',{f,52},[{x,1},{x,4}],{x,6}}. {test,is_ge,{f,50},[{x,2},{integer,65}]}. {bif,'=<',{f,52},[{x,2},{integer,90}],{x,7}}. {test,is_eq_exact,{f,51},[{x,7},{atom,false}]}. {test,is_ge,{f,50},[{x,2},{integer,97}]}. {bif,'=<',{f,52},[{x,2},{integer,122}],{x,7}}. {jump,{f,51}}. {label,50}. {move,{atom,false},{x,7}}. {label,51}. {bif,'=:=',{f,52},[{x,7},{atom,true}],{x,7}}. {test,is_eq_exact,{f,52},[{x,6},{atom,true}]}. {test,is_eq_exact,{f,52},[{x,7},{atom,true}]}. {move,{x,5},{x,0}}. return. {label,52}. return. We can add optimizations of guard tests to v3_kernel to achive a better result: {function, skip_prefix, 2, 49}. {label,48}. {line,[{location,"filename.erl",187}]}. {func_info,{atom,filename},{atom,skip_prefix},2}. {label,49}. {test,is_ne_exact,{f,51},[{x,1},{atom,false}]}. {test,is_nonempty_list,{f,51},[{x,0}]}. {get_list,{x,0},{x,2},{x,3}}. {test,is_nonempty_list,{f,51},[{x,3}]}. {get_list,{x,3},{x,4},{x,5}}. {test,is_eq_exact,{f,51},[{x,1},{x,4}]}. {test,is_ge,{f,51},[{x,2},{integer,65}]}. {test,is_lt,{f,50},[{integer,90},{x,2}]}. {test,is_ge,{f,51},[{x,2},{integer,97}]}. {test,is_ge,{f,51},[{integer,122},{x,2}]}. {label,50}. {move,{x,5},{x,0}}. return. {label,51}. return. Looking at the STDLIB application, there were 112 lines of BIF calls in guards that beam_bool failed to convert to test instructions. This commit eliminates all those BIF calls. Here is how I counted the instructions: $ PATH=$ERL_TOP/bin:$PATH erlc -I ../include -I ../../kernel/include -S *.erl $ grep "bif,'[=<>]" *.S | grep -v f,0 dets.S: {bif,'=:=',{f,547},[{x,4},{atom,read_write}],{x,4}}. dets.S: {bif,'=:=',{f,547},[{x,5},{atom,saved}],{x,5}}. dets.S: {bif,'=:=',{f,589},[{x,5},{atom,read}],{x,5}}. . . . $ grep "bif,'[=<>]" *.S | grep -v f,0 | wc 112 224 6765 $
2016-09-21Simplify handling of internal BIFsBjörn Gustavsson
Do a simpler translation of internal BIFs. While we are it, also remove the dummy values of Index and Uniq from the make_fun internal operation.
2016-06-02Eliminate crash for map updates in guardsBjörn Gustavsson
beam_validator would complain that x(1) is uninitialized in a test_heap instruction when attempting to compile the following code with sys_core_fold turned off: foo(M) when not (M#{true := 0}); [M] -> ok. Simplified, the generated BEAM assembly code looked like this: test is_map BadMap x(0) put_map_exact Fail x(0) => x(1) ... jump BooleanStuff BadMap: move ok => x(1) jump Fail BooleanStuff: ... move Boolean => x(2) jump Build Fail: move false => x(2) Build: test_heap 2 3 %% x(0), x(1), x(2) must be live. ... That is, if put_map_exact failed, control would transfer to the label Fail without initializing x(1). Fix that by making sure that x(1) is initilized even if put_map_exact fails: test is_map BadMap x(0) put_map_exact BadLbl x(0) => x(1) ... jump OkLbl BadLbl: move ok => x(1) jump Fail OkLbl: jump BooleanStuff BadMap: move ok => x(1) jump Fail BooleanStuff: ... move Boolean => x(2) jump Build Fail: move false => x(2) Build: test_heap 2 3 %% x(0), x(1), x(2) must be live. ... Note that this situation is rare, and that other optimization passes (beam_dead and beam_jump in particular) will clean up this mess.
2016-05-25v3_codegen: Don't confuse beam_validatorBjörn Gustavsson
Generate code that not only is safe, but can easily be seen by beam_validator to be safe.
2016-05-25v3_codegen: Correct code generation for an error/1 call in a guardBjörn Gustavsson
Sometimes v3_codegen would generate unsafe code when there was a call to error/1 in a guard.
2016-03-15update copyright-yearHenrik Nord
2016-01-11Merge branch 'maint'Björn Gustavsson
* maint: Eliminate crash in v3_codegen
2016-01-11Eliminate crash in v3_codegenBjörn Gustavsson
The following code would crash v3_codegen: order(From) -> catch if From#{[] => sufficient} -> saint end. Before explaining the crash, first some background on the stack frame and the Y registers. Certain instructions, most notably the 'call' instructions, clobber all X registers. Before any such instruction, all X registers that have values that will be used after the call must be saved to Y registers (i.e. to the stack frame). adjust_stack/4 will be called when X registers must be saved. There is also another situation when X registers must be saved, namely within a 'catch' if we are about to execute any instruction that may cause an exception. Examples of such instructions are some guard BIFs (such as length/1) and construction of binaries or maps. Within a 'catch', X registers must be be saved because if an exception is thrown and catched all X registers will be destroyed. The same adjust_stack/4 function will be called for those instructions, but only if they occur within a 'catch'. There is actually one more complication. If there is code in a guard within a catch, the X registers should not be saved, because the code in a guard never clobbers any X registers that were alive before the guard code was entered. v3_codegen is written with the implicit assumption that code in guards never cause anything to be saved to Y registers. The code for building maps and binaries would incorrectly save X registers within a guard inside a 'catch'. For construction of binaries, that would mean that a useless but harmelss 'move' instruction was generated. But for construction of maps, the saving of the Y register would not be harmless. There would be a crash when attempting to merge #sr{} records. #sr{} records keeps track of the contents of X and Y registers. When two separate code paths are joined (e.g. at the end of 'case' statement), the register descriptors must be reconciled. Basically, the register descriptors for both paths must be identical. The #sr{} record for one path must not claim that {y,0} contains a certain value, while another path claims that {y,0} is dead. Thus, the crash occurs in sr_merge/2 when failing to reconcile the Y registers. To fix this bug this bug we will introduce a new function called maybe_adjust_stack/5. It will save X registers on the stack only if the code is inside a catch but not inside a guard. We will change all existing code to use this new function when appropriate. Reported-by: Thomas Arts
2015-09-28v3_codegen: Optimize matching of the final size-less binary segmentBjörn Gustavsson
Consider the following function: f(Bin, Bool) -> case Bin of <<Val:16/binary,_/binary>> when Bool -> Val end. Simplified, the generated code looks like: bs_start_match2 Fail Live Bin => Bin bs_get_integer2 Fail Live Bin size=Sz unit=1 => Val bs_skip_bits2 Fail Bin size=all unit=8 is_eq_exact Fail Bool true The code generator will replace the bs_skip_bits2 instruction with a bs_test_unit instruction if it can be clearly seen that the context register will not be used again. In this case, it is not obvious without looking at the code at the Fail label. However, it turns out that bs_test_unit instruction is always safe beacuse of the way v3_kernel compiles pattern matching. It doesn't matter whether the match context will be used again. If it will be used again, the position in it will *not* be used. Instead, a bs_restore2 instruction will restore one of the saved instructions.
2015-06-18Change license text to APLv2Bruce Yinhe
2015-05-21v3_codegen: Use Maps to map local functionsBjörn-Egil Dahlberg
2015-05-21compiler: Use lc instead of map/1 in v3_codegenBjörn-Egil Dahlberg
Small speed increase for large files.
2015-04-29v3_core, v3_codegen: Eliminate old-style catchesBjörn Gustavsson
2015-04-22v3_codegen: Reduce cost for fixing up bs_match_string instructionsBjörn Gustavsson
Commit b76588fb5a introduced an optimization of the compile time of huge functions with many bs_match_string instructions. The optimization is done in two passes. The first pass coalesces adjacent bs_match_string instructions. To avoid copying bitstrings multiple times, the bitstrings in the instructions are combined in to a (deep) list. The second pass goes through all instructions in the function and combines the list of bitstrings to a single bitstring in all bs_match_string instructions. The second pass (fix_bs_match_string) is run on all instructions in each function, even if there are no bs_match_instructions in the function. While fix_bs_match_string is not a bottleneck (it is a linear pass), its execution time is noticeable when profiling some modules. Move the execution of the second pass to the select_binary() function so that it will only be executed for instructions that do binary matching. Also take the opportunity to optimize away uses of bs_restore2 that occour directly after a bs_save2. That optimimization is currently done in beam_block, but it can be done essentially for free in the same pass that fixes up bs_match_string instructions.
2015-04-22v3_codegen: Optimize "turning" of y registersBjörn Gustavsson
Profiling shows that the execution time for "turning" y registers is noticeable for some modules (e.g. S1AP-PDU-Contents from the asn1 test suite). We can reduce the impact on running time by special-casing important instructions. In particular, there is no need to look for y registers in the list argument for a select_val instruction.
2015-04-13v3_codegen: Don't sort map keys in map creation/updateBjörn Gustavsson
It is no longer necessary to sort the keys, since the loader does the sorting.
2015-03-11v3_life: Combine literal/2 and literal2/2Björn Gustavsson
For unclear reasons, there are two functions in v3_life that are almost identical: literal/2 and literal2/2. literal/2 is used for expressions and literal2/2 for patterns. It turns out that literal2/2 can do everything that literal/2 can do, except that it transforms maps differently. If we adjust v3_codegen to accept the same format of maps in expressions and patterns, we can rename literal2/2 to literal/2 and use it for expressions and patterns.
2015-03-09v3_codegen: Don't save options in the process dictionaryBjörn Gustavsson
v3_codegen puts the compilation in the process dictionary, but never uses them.
2015-03-09v3_codegen: Teach the put_map_* instructions to reuse source registersBjörn Gustavsson
The put_map_assoc and put_map_exact instructions in the run-time system will support that the target register is the same as one of the source registers. Teach the code generator to take advantage of that. The disadvantages of not reusing register when possible is that the garbage collector may retain dead terms longer than necessary.
2014-10-01compiler: Fix harmless need_heap error for MapsBjörn-Egil Dahlberg
Need heap for maps is zero and fall through is also zero.
2014-08-26compiler: Use variables in Map beam assmeblerBjörn-Egil Dahlberg
2014-07-03compiler: Maps are always patterns never values in matchingBjörn-Egil Dahlberg
Removed dead code.
2014-03-24compiler: Remove redudant code in v3_codegenBjörn-Egil Dahlberg
2014-03-24compiler: Remove redundant clause in v3_codegenBjörn-Egil Dahlberg
Multiple 'nil' cannot happen on instruction level. Map keys are always unique with literals.
2014-03-04Properly sort map pairs in v3_codegenAnthony Ramine
Literal nil values aren't tagged tuple but the bare atom nil. The function lists:sort/2 expects the passed function to return true if the first element is less than or equal to the second, not strictly less than. The original base clause is changed accordingly. Reported-by: Ulf Norell
2014-02-13compiler: Change map instructions for fetching valuesBjörn-Egil Dahlberg
* Combine multiple get values with one instruction * Combine multiple check keys with one instruction
2014-02-07compiler: Fix codegen multiple updates for MapsBjörn-Egil Dahlberg
This fixes an error on multiple updates optimization for map pairs. The error was introduced with moving to term order in Maps. This also fixes an error where register life time was lost for values and could result in erroneuos values being emitted in for map pairs. Simplified v3_codegen by moving multiple update optimizations to v3_kernel.
2014-01-29compiler: Fix term order compiler for mapsBjörn-Egil Dahlberg
2014-01-28compiler: Implement different instructions for => and :=Björn Gustavsson
2014-01-28Pass the map pair operators through to the v3_codegen passBjörn Gustavsson
2014-01-28compiler: Fix sorted keys in put_map instructionBjörn-Egil Dahlberg
All pairs in a Map needs to be in strict ascending key order.
2014-01-28compiler: Fix no basic blocks for mapsBjörn-Egil Dahlberg
2014-01-28compiler: Fix stack register reorderingBjörn-Egil Dahlberg
Can now handle {list [reg()]} elements in instructions.
2014-01-28Implement support for maps in the compilerBjörn Gustavsson
To make it possible to build the entire OTP system, also define dummys for the instructions in ops.tab.
2014-01-16compiler: Correct line number in exception from binary constructionBjörn Gustavsson
Reported-by: Stanislav Seletskiy
2012-11-12Merge branch 'maint'Björn Gustavsson
* maint: Fix compiler crash for binary matching and a complicated guard
2012-11-06Fix compiler crash for binary matching and a complicated guardBjörn Gustavsson
The compiler would crash when attempting to compile a function head that did binary matching and had a complex expression using 'andalso' and 'not'. Noticed-by: José Valim
2012-10-09v3_codegen: Combine adjacent bs_match_string instructionsBjörn Gustavsson
In modules with huge functions with many bs_match_string instructions, we can speed up the compilation by combining adjacent bs_match_strings instruction in v3_codegen (as opposed to in beam_block where we used to do it). For instance, on my computer the v3_codegen became more than twice as fast when compiling the re_testoutput1_split_test module in the STDLIB test suites.
2012-10-09v3_codegen: Clarify code generation for binary constructionBjörn Gustavsson
Add some comments to make the code generation for binary construction somewhat clearer. Also get rid of the cg_bo_newregs/2 function that serves no useful purpose. (It was probably intended to undo the effect of cg_live/2, but note that the value passed to cg_bo_newregs/2 has not passed through cg_live/2.)
2012-10-09v3_codegen: Clean up comments and remove out-commented codeBjörn Gustavsson
2012-10-09v3_codegen: Don't duplicate the functionality of cg_reg_arg/2Björn Gustavsson
2012-03-30Update copyright yearsBjörn-Egil Dahlberg
2012-03-14v3_core: Don't put negative line numbers in annotationsBjörn Gustavsson
In Core Erlang and later passes, compiler-generated code can be indicated in two different ways: by negative line numbers and by a 'compiler_generated' annotation. Simplify the code and improve coverage by turning negative line numbers positive and adding a 'compiler_generated' annotation in the v3_core pass. That means that Core Erlang and latter passes do not have deal with negative line numbers.
2012-01-11v3_codegen: Eliminate the special case of 'put' without destinationBjörn Gustavsson
If we let v3_kernel make sure that a 'put' operation always has a destination register, the special case in v3_codegen is not needed.
2012-01-04Eliminate the match_fail primop in v3_kernel and later passesBjörn Gustavsson
In the v3_life pass, it is assumed that a 'match_fail' primop only occur at the top-level and at the end of a function. But this code: do_split_cases(A) -> case A of x -> Z = dummy1; _ -> Z = dummy2, a=b end, Z. will be optimized by sys_core_fold to the following code: 'split_cases'/1 = fun (_cor0) -> let <_cor7,Z> = case _cor0 of <'x'> when 'true' -> < 'dummy1','dummy1' > <_cor6> when 'true' -> %% Here follows a 'match_fail' primop inside %% multiple return values: < primop 'match_fail'({'badmatch','b'}),'dummy2' > end in Z moving the 'match_fail' primop into a "values" construction. In the future, we would like to get rid of the v3_life pass (it is there for historical reasons), so in the mean-time we prefer to not add more code to it by generalizing the handling of 'match_fail'. Since the 'match_fail' primop can be simulated by erlang:error/{1,2}, the simplest solution is to translate 'match_fail' to a call to erlang:error/{1,2} in v3_kernel and remove the handling of 'match_fail' in v3_life and v3_codegen. It is tempting to get rid of 'match_fail' also in the Core Erlang format, but there are two issues: - Removing the support for 'match_fail' completely may break tools that generate Core Erlang code. We should not do that in a minor release. - There is no easy way to generate a 'function_clause' exception that will remain correct if it will be inlined into another function. (Calling "erlang:error(function_clause, Args)" is fine only if it is not inlined into another function.) A good solution probably involves introducing new instructions, which is better done in a major release. Noticed-by: Håkan Matsson Minimized-test-case-by: Erik Søe Sørensen
2011-08-16compiler: Generate line instructionsBjörn Gustavsson
2011-03-11Update copyright yearsBjörn-Egil Dahlberg
2011-02-07v3_codegen: Use the latest instance of StBjörn Gustavsson
By accident a previous instance of St is used, which is harmless in this case, but leads to worse quality of the generated code.
2010-09-23Fix compiler crash when constructing zero-size binary segmentsBjörn Gustavsson
Code such as foo(A) -> <<A:0>>. would cause a compiler crash.
2010-03-26compiler: Suppress bs_context_to_binary/1 for a literal operandBjörn Gustavsson
The inliner can cause illegal uses of the bs_context_to_binary/1 instruction, such as: bs_context_to_binary {literal,"string"} or bs_context_to_binary {integer,1} Remove the bs_context_to_binary/1 instruction if the operand is not a register (it is clearly not needed).