otp.git - Mirror of Erlang/OTP repository.

Age	Commit message	Author
2018-03-02	beam_block: Fix unsafe sinking of get_tuple_element/3	Björn Gustavsson
	In the following code:

	    {get_tuple_element,{x,0},0,{x,1}}.
	    {put_tuple,2,{x,1}}.
	    {put,{atom,badmap}}.
	    {put,{x,0}}.
	    {move,{x,1},{x,0}}.

	beam_block would move the get_tuple_element/3 instruction and eliminate the move/2 instruction:

	    {put_tuple,2,{x,1}}.
	    {put,{atom,badmap}}.
	    {put,{x,0}}.
	    {get_tuple_element,{x,0},0,{x,0}}.

	That is not correct, since the result of the tuple building in {x,1} is now ignored.
2018-02-14	beam_block: Combine blocks when running beam_block the second time	Björn Gustavsson
	1a029efd1ad47f started to run the beam_block pass a second time, but it did not attempt to combine adjacent blocks. Combining adjacent blocks leads to many more opportunities for optimizations. After doing some diffing in generated code, it turns out that there is no benefit for beam_split to split out line instructions from blocks. It seems that the only reason it was done was to slightly simplify the implementation of the no_line_info option in beam_clean.
2018-02-14	Disable CSE for floating point operations	Björn Gustavsson
	As a preparation for combining blocks before running beam_block for the second time, disable CSE for floating point operations because it will generate invalid code.
2018-02-12	Fix unsafe use of 'allocate' where 'allocate_zero' should be used	Björn Gustavsson
	The more aggressive optimizations of 'allocate_zero' introduced in cb6fc15c35c7e could produce unsafe code such as the following:

	    {allocate,0,1}.
	    {bif,element,{f,0},[{integer,1},{x,0}],{x,0}}.

	The code is not safe because if element/2 fails, the runtime system may scan the stack and find garbage that looks like a catch tag, and would most probably crash.

	Fix the problem by making beam_utils:is_killed/3 more conservative when asked whether a Y register will be killed.

	Also fix an unsafe move upwards of an allocation instruction in beam_block.
2018-02-01	Merge pull request #1701 from bjorng/bjorn/get_hd_tl	Björn Gustavsson
	Eliminate get_list/3 internally in the compiler
2018-01-30	Fix incorrect handling of floating point instructions	Björn Gustavsson
	1a029efd1ad47f started to run the beam_block pass a second time. Since it is run after introduction of the optimized floating point instructions, it must handle those instructions correctly.

	In particular, it must be careful when hoisting allocation instructions. For example, the following code:

	    {test_heap,{alloc,[{words,0},{floats,1}]},5}.
	    . . .
	    {fmove,{fr,2},{x,0}}.
	    {allocate_zero,1,4}.

	must not be rewritten to:

	    {test_heap,{alloc,[{words,0},{floats,1}]},5}.
	    . . .
	    {allocate_zero,1,4}.
	    {fmove,{fr,2},{x,0}}.

	because beam_validator will not consider it safe. (The code may actually be safe depending on what the code between the two allocation instructions does.)

	https://bugs.erlang.org/browse/ERL-555
2018-01-26	Eliminate get_list/3 internally in the compiler	Björn Gustavsson
	Instructions that produce more than one result complicate optimizations. get_list/3 is one of two instructions that produce multiple results (get_map_elements/3 is the other). Introduce the get_hd/2 and get_tl/2 instructions that return the head and tail of a cons cell, respectively, and use it internally in all optimization passes. For efficiency, we still want to use get_list/3 if both head and tail are used, so we will translate matching pairs of get_hd and get_tl back to get_list instructions.
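The split and the later re-pairing described above can be sketched as follows. This is a minimal illustration that works on plain instruction tuples, not the compiler's actual implementation; the module and function names (get_list_sketch, split/1, combine/1) are hypothetical, and the tuple shapes {get_list,Src,Hd,Tl}, {get_hd,Src,Dst} and {get_tl,Src,Dst} are assumed from the instruction arities named in the commit message.

```erlang
-module(get_list_sketch).
-export([split/1, combine/1]).

%% Replace every get_list/3 with a get_hd/2 and a get_tl/2.
%% When the head destination is also the source register, the tail
%% is fetched first so that the source is not overwritten too early.
split([{get_list,Src,Src,Tl}|Is]) ->
    [{get_tl,Src,Tl},{get_hd,Src,Src}|split(Is)];
split([{get_list,Src,Hd,Tl}|Is]) ->
    [{get_hd,Src,Hd},{get_tl,Src,Tl}|split(Is)];
split([I|Is]) ->
    [I|split(Is)];
split([]) ->
    [].

%% Merge adjacent get_hd/get_tl pairs that read the same source back
%% into a single get_list/3. A pair is merged only when the first
%% instruction does not overwrite the source read by the second one.
combine([{get_hd,Src,Hd},{get_tl,Src,Tl}|Is])
  when Hd =/= Src, Hd =/= Tl ->
    [{get_list,Src,Hd,Tl}|combine(Is)];
combine([{get_tl,Src,Tl},{get_hd,Src,Hd}|Is])
  when Tl =/= Src, Hd =/= Tl ->
    [{get_list,Src,Hd,Tl}|combine(Is)];
combine([I|Is]) ->
    [I|combine(Is)];
combine([]) ->
    [].
```

In this sketch, a lone get_hd or get_tl (where only one of head and tail is used) is left untouched by combine/1, which is the case the single-result instructions are meant to serve.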
2018-01-24	Apply common subexpression elimination in blocks	Björn Gustavsson
	Eliminate repeated evaluation of guard BIFs and building of cons cells in blocks. This optimization is applicable in more places than might be expected, because code generation for binaries and records can generate common sub expressions not visible in the original source code.

	For example, consider this function:

	    make_binary(Term) ->
	        Bin = term_to_binary(Term),
	        Size = byte_size(Bin),
	        <<Size:32,Bin/binary>>.

	The compiler inserts a call to byte_size/1 to calculate the size of the binary being built:

	    {function, make_binary, 1, 2}.
	      {label,1}.
	        {line,...}.
	        {func_info,{atom,t},{atom,make_binary},1}.
	      {label,2}.
	        {allocate,0,1}.
	        {line,...}.
	        {call_ext,1,{extfunc,erlang,term_to_binary,1}}.
	        {line,...}.
	        {gc_bif,byte_size,{f,0},1,[{x,0}],{x,1}}. %Present in original code.
	        {line,...}.
	        {gc_bif,byte_size,{f,0},2,[{x,0}],{x,2}}. %Inserted by compiler.
	        {bs_add,{f,0},[{x,2},{integer,4},1],{x,2}}.
	        {bs_init2,{f,0},{x,2},0,2,{field_flags,[]},{x,2}}.
	        {bs_put_integer,{f,0},{integer,32},1,{field_flags,[unsigned,big]},{x,1}}.
	        {bs_put_binary,{f,0},{atom,all},8,{field_flags,[unsigned,big]},{x,0}}.
	        {move,{x,2},{x,0}}.
	        {deallocate,0}.
	        return.

	Common sub expression elimination (CSE) eliminates the second call to byte_size/1:

	    {function, make_binary, 1, 2}.
	      {label,1}.
	        {line,...}.
	        {func_info,{atom,t},{atom,make_binary},1}.
	      {label,2}.
	        {allocate,0,1}.
	        {line,...}.
	        {call_ext,1,{extfunc,erlang,term_to_binary,1}}.
	        {line,...}.
	        {gc_bif,byte_size,{f,0},1,[{x,0}],{x,1}}.
	        {move,{x,1},{x,2}}.
	        {bs_add,{f,0},[{x,2},{integer,4},1],{x,2}}.
	        {bs_init2,{f,0},{x,2},0,2,{field_flags,[]},{x,2}}.
	        {bs_put_integer,{f,0},{integer,32},1,{field_flags,[unsigned,big]},{x,1}}.
	        {bs_put_binary,{f,0},{atom,all},8,{field_flags,[unsigned,big]},{x,0}}.
	        {move,{x,2},{x,0}}.
	        {deallocate,0}.
	        return.

	Note: A possible future optimization would be to include binary construction instructions in blocks. If that is done, the {move,{x,1},{x,2}} instruction could also be eliminated.
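The idea of replacing a repeated BIF evaluation with a register copy can be sketched as below. This is only an illustration under simplifying assumptions: the instruction forms {bif,Name,Args,Dst} and {move,Src,Dst} are hypothetical stand-ins rather than the compiler's internal block format, every BIF is assumed to be pure, and any instruction the sketch does not recognize conservatively discards everything it has learned. The module name cse_sketch is likewise made up.

```erlang
-module(cse_sketch).
-export([cse/1]).

%% Hypothetical, simplified instruction forms (assumptions):
%%   {bif,Name,Args,Dst}   pure BIF call whose result is stored in Dst
%%   {move,Src,Dst}        register copy
cse(Is) ->
    cse(Is, []).

%% Known is a list of {{Name,Args},Reg} entries, each meaning that
%% Reg currently holds the value of Name applied to Args.
cse([{bif,Name,Args,Dst}=I|Is], Known0) ->
    Key = {Name,Args},
    %% Writing Dst invalidates everything that depends on Dst.
    Known1 = kill(Dst, Known0),
    case lists:keyfind(Key, 1, Known0) of
        {Key,Reg} ->
            %% Already computed: copy the old result instead.
            [{move,Reg,Dst}|cse(Is, Known1)];
        false ->
            Known = case lists:member(Dst, Args) of
                        true -> Known1;   %An argument was overwritten.
                        false -> [{Key,Dst}|Known1]
                    end,
            [I|cse(Is, Known)]
    end;
cse([{move,_,Dst}=I|Is], Known) ->
    [I|cse(Is, kill(Dst, Known))];
cse([I|Is], _Known) ->
    %% Unknown instruction: forget everything to stay safe.
    [I|cse(Is, [])];
cse([], _Known) ->
    [].

%% Drop entries whose result register or arguments mention Reg.
kill(Reg, Known) ->
    [E || {{_,Args},R}=E <- Known,
          R =/= Reg,
          not lists:member(Reg, Args)].
```

Applied to two consecutive byte_size evaluations of {x,0}, the sketch keeps the first and turns the second into {move,{x,1},{x,2}}, mirroring the shape of the listing in the commit message.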
2018-01-24	Correct bug in beam_block:opt/2	Björn Gustavsson
	The following sequence in a block:

	    {move,{x,1},{x,2}}.
	    {move,{x,2},{x,2}}.

	would be incorrectly rewritten to:

	    {move,{x,2},{x,2}}.

	(Which in turn would be optimized away a little bit later.)
2018-01-24	Correct unsafe optimizations in beam_block	Björn Gustavsson
	When attempting to eliminate the move/2 instruction in the following code:

	    {bif,self,{f,0},[],{x,0}}.
	    {move,{x,0},{x,1}}.
	    . . .
	    {put_tuple,2,{x,1}}.
	    {put,{atom,ok}}.
	    {put,{x,0}}.

	beam_block would produce the following unsafe code:

	    {bif,self,{f,0},[],{x,1}}.
	    . . .
	    {put_tuple,2,{x,1}}.
	    {put,{atom,ok}}.
	    {put,{x,1}}.

	It is unsafe because the tuple is self-referential.

	The following code:

	    {put_list,{y,6},nil,{x,4}}.
	    {move,{x,4},{x,5}}.
	    {put_list,{y,1},{x,5},{x,5}}.
	    . . .
	    {put_tuple,2,{x,6}}.
	    {put,{x,4}}.
	    {put,{x,5}}.

	would be incorrectly transformed to:

	    {put_list,{y,6},nil,{x,5}}.
	    {put_list,{y,1},{x,5},{x,5}}.
	    . . .
	    {put_tuple,2,{x,6}}.
	    {put,{x,5}}.
	    {put,{x,5}}.

	(Both elements in the built tuple get the same value.)
2018-01-12	Merge pull request #1680 from bjorng/bjorn/compiler/beam_block	Björn Gustavsson
	Run beam_block a second time
2018-01-11	Run beam_block again after other optimizations have been run	Björn Gustavsson
	Running beam_block again after the other optimizations have run will give it more opportunities for optimizations. In particular, more allocate_zero/2 instructions can be turned into allocate/2 instructions, and more get_tuple_element/3 instructions can store the retrieved value into the correct register at once. Out of a sample of about 700 modules in OTP, 64 modules were improved by this commit.
2018-01-11	beam_block: Reorder element/2 calls in guards	Björn Gustavsson
	In a guard, reorder two consecutive calls to the element/2 BIF that access the same tuple and have the same failure label so that the highest index is fetched first. That will allow the second element/2 to be replaced with the slightly cheaper get_tuple_element/3 instruction.
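A minimal sketch of the reordering, written against plain {bif,element,...} tuples rather than the compiler's internal block representation. The module name is hypothetical, and the extra guard conditions (distinct destinations, neither destination overwriting the tuple register, a non-zero failure label standing for "inside a guard") are conservative assumptions of this sketch, not conditions quoted from the commit.

```erlang
-module(element_reorder_sketch).
-export([reorder/1]).

%% Swap two consecutive element/2 calls on the same tuple with the
%% same failure label so that the higher index is fetched first.
%% Once the higher index has been read successfully, the tuple is
%% known to be large enough for the lower index.
reorder([{bif,element,{f,Fail},[{integer,Pos1},Tuple],Dst1}=E1,
         {bif,element,{f,Fail},[{integer,Pos2},Tuple],Dst2}=E2|Is])
  when Fail =/= 0, Pos1 < Pos2,
       Dst1 =/= Dst2, Dst1 =/= Tuple, Dst2 =/= Tuple ->
    [E2,E1|reorder(Is)];
reorder([I|Is]) ->
    [I|reorder(Is)];
reorder([]) ->
    [].
```

The sketch stops at the swap. Rewriting the second call into get_tuple_element/3 would be a separate step that relies on the size information established by the first call.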
2018-01-11	Refactor '%live' and '%def' annotations	Björn Gustavsson
	The annotations in the optimizing passes currently look like this:

	    {'%live',NumRegistersUsed,RegistersUsedBitmap}
	    {'%def',RegistersDefinedBitmap}

	(NumRegistersUsed is no longer used.)

	When I attempted to extend some optimizations, I found that I had to add additional clauses to tolerate/handle both types of annotations. That problem would only get worse if any more annotations are added in the future.

	To simplify annotation handling, this commit wraps both types of annotations in a {'%anno',_} tuple:

	    {'%anno',{used,RegistersUsedBitmap}}
	    {'%anno',{def,RegistersDefinedBitmap}}

	The '%live' annotation has been renamed to 'used' to make it somewhat clearer what it means, and the unused NumRegistersUsed part of the old annotation has been removed.

	Alternatives considered: My first attempt was to wrap the annotation in a 'set' tuple so that there would only be 'set' tuples in a block. For example:

	    {set,[],[],{anno,{live,RegistersUsedBitmap}}}

	It was not as convenient as expected. Annotations often need to be handled differently from other instructions in a block. When they are wrapped in a 'set' tuple, they can very easily be handled incorrectly or passed on to the next pass. That causes subtle errors or worse code, and it can be difficult to debug. Therefore, my conclusion is that annotations should be distinct from other instructions, to make it obvious when one has failed to handle an annotation.
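The benefit of the common wrapper can be illustrated with a small sketch. The tuple shapes are taken from the commit message; the module and function names (anno_sketch, upgrade/1, is_anno/1, strip/1) are hypothetical and exist only for this example.

```erlang
-module(anno_sketch).
-export([upgrade/1, is_anno/1, strip/1]).

%% Convert an old-style annotation to the wrapped form. The unused
%% NumRegistersUsed field of '%live' is dropped.
upgrade({'%live',_NumRegistersUsed,UsedBitmap}) ->
    {'%anno',{used,UsedBitmap}};
upgrade({'%def',DefBitmap}) ->
    {'%anno',{def,DefBitmap}};
upgrade(I) ->
    I.

%% With the wrapper, a single clause recognizes every annotation,
%% including kinds that might be added later.
is_anno({'%anno',_}) -> true;
is_anno(_) -> false.

%% Remove all annotations from an instruction list.
strip(Is) ->
    [I || I <- Is, not is_anno(I)].
```

A pass that does not care about annotations needs only the one is_anno/1 clause, instead of one clause per annotation kind.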
2018-01-10	beam_block: Improve optimization of allocate_zero instructions	Björn Gustavsson
	Turn more allocate_zero instructions into allocate instructions.
2017-12-18	Merge pull request #1658 from bjorng/bjorn/compiler/delay-stackframe	Björn Gustavsson
	Delay creation of stack frames
2017-12-15	beam_block: Improve moving of allocations	Björn Gustavsson
	Use annotations added by beam_utils:anno_defs/1 to move more allocations upwards in the instruction stream. That in turn allows us to optimize away more 'move' instructions.
2017-12-08	Use the new syntax for retrieving stack traces	Björn Gustavsson

2017-11-23	Place move S x0 instructions at the end of blocks	Michał Muskała
	The loader has a lot of fused instructions that include move S x0. Placing them at the end of blocks makes it possible to take advantage of this optimization more frequently.
2017-01-12	Add specs for the beam_*:module/2 functions	Björn Gustavsson

2016-10-05	beam_block: Avoid unsafe inclusion of get_map_elements in blocks	Björn Gustavsson
	c2035ebb8b restricted the get_map_elements instruction so that it could only occur at the beginning of a block. It turns out that including it anywhere in a block is unsafe. Therefore, never put get_map_elements instructions in blocks. (Also remove the beam_utils:join_even/2 function since it is no longer used.) ERL-266
2016-08-05	beam_block: Fix potentially unsafe optimization in move_allocates/1	Björn Gustavsson
	beam_block has an optimization that only is safe when it is applied immediately after code generation. That is pointed out in a comment: NOTE: Moving allocation instructions is only safe because it is done immediately after code generation so that we KNOW that if {x,X} is initialized, all x registers with lower numbers are also initialized. That assumption may not be true after other optimizations, such as the beam_utils:live_opt/1 optimization. The new beam_reorder pass added in OTP 19 runs before beam_block. Therefore, the optimization is potentially unsafe. The optimization is also unsafe if compilation is started from assembly code in a .S file. Rewrite the optimization to make it safe. See the newly added comment for details. ERL-202
2016-06-01	beam_block: Eliminate crash in beam_utils	Björn Gustavsson
	Somewhat simplified, beam_block would rewrite the target for the first instruction in this code sequence:

	    move x(0) => y(1)
	    gc_bif '+' 1 x(0) => y(0)
	    move y(1) => x(1)
	    move nil => x(0)
	    call 2 local_function/2

	The resulting code would be:

	    move x(0) => x(1)  %% Changed target.
	    gc_bif '+' 1 x(0) => y(0)
	    move x(1) => y(1)  %% Operands swapped (see 02d6135813).
	    move nil => x(0)
	    call 2 local_function/2

	The resulting code is not safe because x(1) will be killed by the gc_bif instruction.

	7a47b20c3a cleaned up move optimizations and would reject the optimization if the target was an X register and an allocating instruction was found. To avoid this bug, the optimization must be rejected even if the target is a Y register.
2016-04-13	Merge branch 'henrik/update-copyrightyear'	Henrik Nord
	* henrik/update-copyrightyear: update copyright-year
2016-04-08	Remove unreachable code after 'raise' instructions	Björn Gustavsson
	Remove the unreachable instructions after a 'raise' instruction (e.g. a 'jump' or 'deallocate', 'return') to decrease code size.
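A minimal sketch of the clean-up: everything between a 'raise' and the next label is dropped, since control can only re-enter the instruction stream at a label. It is an assumption of this sketch that 'raise' appears as a {bif,raise,Fail,Args,Dst} tuple; the module and function names are hypothetical.

```erlang
-module(raise_sketch).
-export([drop_unreachable/1]).

%% Assumed instruction shape: {bif,raise,Fail,Args,Dst}.
%% Instructions following it are unreachable up to the next label.
drop_unreachable([{bif,raise,_,_,_}=I|Is]) ->
    [I|drop_unreachable(skip_to_label(Is))];
drop_unreachable([I|Is]) ->
    [I|drop_unreachable(Is)];
drop_unreachable([]) ->
    [].

%% Discard instructions until a label (or the end) is reached.
skip_to_label([{label,_}|_]=Is) -> Is;
skip_to_label([_|Is]) -> skip_to_label(Is);
skip_to_label([]) -> [].
```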
2016-03-15	update copyright-year	Henrik Nord

2016-03-10	beam_block: Eliminate unsafe optimization	Björn Gustavsson
	Consider this code:

	    %% Start of block
	    get_tuple_element Tuple 0 Element
	    get_map_elements Fail Map [Key => Dest]
	    . . .
	    move Element UltimateDest
	    %% End of block

	    Fail:
	    %% Code that uses Element.

	beam_block (more precisely, opt_tuple_element/1) would incorrectly transform the code to this:

	    %% Start of block
	    get_map_elements Fail Map [Key => Dest]
	    . . .
	    get_tuple_element Tuple 0 UltimateDest
	    %% End of block

	    Fail:
	    %% Code that uses Element.

	That is, the code at label Fail would use register Element, which is either uninitialized or contains the wrong value.

	We could fix this problem by always keeping label information at hand when optimizing blocks so that we could check the code at the failure label for get_map_elements. That would require changes to beam_block and beam_utils. We might consider doing that in the future if it turns out to be worth it.

	For now, I have decided that I want to keep the simplicity of blocks (allowing them to be optimized without keeping label information). That could be achieved by not including get_map_elements in blocks. Another way, which I have chosen, is to only allow get_map_elements as the first instruction in the block.

	For background on the bug: c288ab8 introduced the beam_reorder pass and 5f431276 introduced opt_tuple_element() in beam_block.
2015-09-28	beam_type: Improve optimizations by keeping track of booleans	Björn Gustavsson
	There is an optimization in beam_block to simplify a select_val on a known boolean value. We can implement this optimization in a cleaner way in beam_type and it will also be applicable in more situations. (When I added the optimization to beam_type without removing the optimization from beam_block, the optimization was applied 66 times.)
2015-09-28	Move out bit syntax optimizations from beam_block	Björn Gustavsson
	In the future we might want to add more bit syntax optimizations, but beam_block is already sufficiently complicated. Therefore, move the bit syntax optimizations out of beam_block into a separate compiler pass called beam_bs.
2015-09-21	Regain full coverage of beam_block	Björn Gustavsson
	d0784035ab fixed a problem with register corruption. Because of that, opt_moves/2 will never be asked to optimize instructions with more than two destination registers. Therefore, to regain full coverage of beam_block, remove the final clause in opt_moves/2.
2015-09-07	Merge branch 'maint'	Björn-Egil Dahlberg

2015-09-04	compiler: Fix get_map_elements register corruption	Björn-Egil Dahlberg
	Instruction get_map_elements might destroy target registers when the fail-label is taken. Only seen for patterns with two, and only two, target registers. Specifically: we copy one register, and then jump.

	    foo(A,#{a := V1, b := V2}) -> ...
	    foo(A,#{b := V}) -> ...

	call foo(value, #{a=>whops, c=>42}).

	corresponding assembler:

	    {test,is_map,{f,5},[{x,1}]}.
	    {get_map_elements,{f,7},{x,1},{list,[{atom,a},{x,1},{atom,b},{x,2}]}}.
	    %% if 'a' exists but not 'b' {x,1} is overwritten, jump {f,7}
	    {move,{integer,1},{x,0}}.
	    {call_only,3,{f,10}}.
	    {label,7}.
	    {get_map_elements,{f,8},{x,1},{list,[{atom,b},{x,2}]}}.
	    %% {x,1} (src) is read with a corrupt value
	    {move,{x,0},{x,1}}.
	    {move,{integer,2},{x,0}}.
	    {call_only,3,{f,10}}.

	The fix is to remove 'opt_moves' pass for get_map_elements instruction in the case of two or more destinations.

	Reported-by: Valery Tikhonov
2015-08-21	beam_validator: Don't allow x(1023) to be used	Björn Gustavsson
	In 45f469ca0890, the BEAM loader started to use x(1023) as scratch register for some instructions. Therefore we should not allow x(1023) to be used in code emitted by the compiler.
2015-08-21	Put 'try' in blocks to optimize allocation instructions	Björn Gustavsson
	Put 'try' instructions inside blocks to improve the optimization of allocation instructions. Currently, the compiler only looks at initialization of y registers inside blocks when determining which y registers will be "naturally" initialized.
2015-08-21	Optimize get_tuple_element instructions by moving them forward	Björn Gustavsson

2015-08-21	beam_block: Improve the move optimizations	Björn Gustavsson
	Here is an example of a move instruction that could not be optimized away because the {x,2} register was not killed:

	    get_tuple_element Reg Pos {x,2}
	    . . .
	    move {x,2} {y,0}
	    put_list {x,2} nil Any

	We can do the optimization if we replace all occurrences of the {x,2} register as a source with {y,0}:

	    get_tuple_element Reg Pos {y,0}
	    . . .
	    put_list {y,0} nil Dst
2015-08-21	beam_block: Clean up optimization of move optimizations	Björn Gustavsson
	The 'move' optimization was relatively clean until GC BIFs were introduced. Instead of re-thinking the implementation, the existing code was fixed and patched. The current code unsuccessfully attempts to eliminate 'move' instructions across GC BIF and allocation instructions. We can simplify the code if we give up as soon as we encounter any instruction that allocates.
2015-08-21	beam_block: Eliminate redundant wasteful call to opt/1	Björn Gustavsson
	opt_alloc/1 makes a redundant call to opt/1. It is redundant because the opt/1 function has already been applied to the instruction sequence prior to calling opt_alloc/1.
2015-06-18	Change license text to APLv2	Bruce Yinhe

2015-04-22	v3_codegen: Reduce cost for fixing up bs_match_string instructions	Björn Gustavsson
	Commit b76588fb5a introduced an optimization of the compile time of huge functions with many bs_match_string instructions. The optimization is done in two passes. The first pass coalesces adjacent bs_match_string instructions. To avoid copying bitstrings multiple times, the bitstrings in the instructions are combined into a (deep) list. The second pass goes through all instructions in the function and combines the list of bitstrings to a single bitstring in all bs_match_string instructions.

	The second pass (fix_bs_match_string) is run on all instructions in each function, even if there are no bs_match_string instructions in the function. While fix_bs_match_string is not a bottleneck (it is a linear pass), its execution time is noticeable when profiling some modules.

	Move the execution of the second pass to the select_binary() function so that it will only be executed for instructions that do binary matching.

	Also take the opportunity to optimize away uses of bs_restore2 that occur directly after a bs_save2. That optimization is currently done in beam_block, but it can be done essentially for free in the same pass that fixes up bs_match_string instructions.
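The two passes described above can be sketched as follows. The instruction shape {bs_match_string,Fail,Ctx,Bits} is a simplified, hypothetical stand-in for the real instruction, and the module and function names are made up for the example. The sketch relies only on the standard list_to_bitstring/1 BIF, which accepts a deep list of bitstrings.

```erlang
-module(bs_match_string_sketch).
-export([coalesce/1, fix/1]).

%% Pass 1: merge adjacent bs_match_string instructions that share
%% the same failure label and match context. The bitstrings are
%% collected in a deep list, so nothing is copied at this stage.
coalesce([{bs_match_string,Fail,Ctx,Bs1},
          {bs_match_string,Fail,Ctx,Bs2}|Is]) ->
    coalesce([{bs_match_string,Fail,Ctx,[Bs1,Bs2]}|Is]);
coalesce([I|Is]) ->
    [I|coalesce(Is)];
coalesce([]) ->
    [].

%% Pass 2: flatten each deep list into a single bitstring, so every
%% byte is copied exactly once.
fix(Is) ->
    [fix_instr(I) || I <- Is].

fix_instr({bs_match_string,Fail,Ctx,Bs}) when is_list(Bs) ->
    {bs_match_string,Fail,Ctx,list_to_bitstring(Bs)};
fix_instr(I) ->
    I.
```

Deferring the flattening is what keeps the first pass cheap: appending N bitstrings one at a time would copy the growing prefix over and over, whereas the deep list is flattened once at the end.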
2015-03-09	sys_core_fold: Improve optimization of 'not'	Björn Gustavsson
	Optimize away 'not' in sys_core_fold instead of in beam_block and beam_dead, as we can do a better job in sys_core_fold. I modified the test suite temporarily to never turn off Core Erlang modifications and looked at the coverage. With the new optimizations active in sys_core_fold, the code in beam_block and beam_dead did not find a single 'not' that it could optimize. That proves that the new optimization is at least as good as the old one. Manually, I could also verify that the new optimization would optimize some variations of 'not' that the old one would not handle.
2015-03-09	Introduce '%live' annotations with a complete register map	Björn Gustavsson
	As a preparation for fixing a bug, introduce a complete register map in the '%live' annotations.
2015-01-12	compiler: Rename util function to adhere to name policy	Björn-Egil Dahlberg
	* beam_utils:joineven/1 -> beam_utils:join_even/1
	* beam_utils:split_even/1 -> beam_utils:split_even/1
2014-08-26	compiler: Use variables in Map beam assembler	Björn-Egil Dahlberg

2014-02-13	compiler: Change map instructions for fetching values	Björn-Egil Dahlberg
	* Combine multiple get values with one instruction
	* Combine multiple check keys with one instruction
2014-01-28	compiler: Fix get_map_element bug with allocate	Björn-Egil Dahlberg
	The instruction get_map_element has a fail label, so you may not use the instruction within an allocate/deallocate block.
2014-01-28	compiler: Implement different instructions for => and :=	Björn Gustavsson

2014-01-28	Implement support for maps in the compiler	Björn Gustavsson
	To make it possible to build the entire OTP system, also define dummies for the instructions in ops.tab.
2013-12-13	Collect all optimised allocate instructions in beam_block	Anthony Ramine
	Any init instruction following an allocate is put in the Inits list of the corresponding alloc tuple.
2013-12-13	Properly collect allocate_zero instructions in beam_block	Anthony Ramine
	If an allocate_zero instruction is fed to beam_block and the beam_type pass is not used afterwards (e.g. with erlc +no_topt), the 'no_opt' atom will be rejected by beam_flatten.