aboutsummaryrefslogtreecommitdiffstats
path: root/erts/emulator/utils/beam_makeops
AgeCommit message (Collapse)Author
2017-09-14Merge branch 'bjorn/erts/relative-jumps'Björn Gustavsson
* bjorn/erts/relative-jumps: Pack failure labels in i_select_val2 and i_select_tuple_arity2 Optimize i_select_tuple_arity2 and is_select_lins Rewrite select_val_bins so that its labels can be packed Pack sequences of trailing 'f' operands Implement packing of 'f' and 'j' Make sure that mask literals are 64 bits Use relative failure labels Add information about offset to common group start position Remove JUMP_OFFSET Refactor instructions to support relative jumps Introduce a new trace_jump/1 instruction for tracing Avoid using $Src more than once
2017-09-14Implement packing of 'f' and 'j'Björn Gustavsson
2017-09-14Make sure that mask literals are 64 bitsBjörn Gustavsson
Use the "ull" suffix for the mask literals instead of "ul" to ensure that the literals are 64 bits also on Windows.
2017-09-14Add information about offset to common group start positionBjörn Gustavsson
2017-09-13Add built-in macros $ARG_POSITION() and $IS_PACKED()Björn Gustavsson
2017-09-11Check the right side of a transformation betterBjörn Gustavsson
The right side of a transformation must be either a single call to a transformation function OR a list of instructions. Mixing them like this is not supported: some_instruction A B C => gen_something(A) | other B C Unfortunately, beam_makeops would silenty ignore anything after the function call, basically handling it in the same way as: some_instruction A B C => gen_something(A) Add a sanity check to reject such mixed right-hand sides.
2017-09-04Try to avoid generating unecessary do/while wrappersBjörn Gustavsson
Make the generated code easier to read.
2017-08-31Introduce '%warm' and beam_warm.hBjörn Gustavsson
The bit syntax instructions are mixed among other instructions in beam_hot.h and beam_cold.h. Introduce a new hotness level called '%warm' with is associated file beam_warm.h. Mark all bit syntax instructions as '%warm'.
2017-08-23beam_makeops: Stop using the Arg() macroBjörn Gustavsson
Generated code uses 'I' explicitly in other places, so it can as well use 'I' when accessing the operands for instructions.
2017-08-23Eliminate the beam_instrs.h fileBjörn Gustavsson
The beam_instrs.h file serves no useful purpose. Put the instructions in beam_hot.h instead.
2017-08-23Add the 'S' type for a register sourceBjörn Gustavsson
The type 'd' could be used both for destination registers and source register. Restrict the 'd' type to only be used for destinations, and introduce the new 'S' type to be used when a source must be a register.
2017-08-23Pack cold instructions tooBjörn Gustavsson
Cold instructions used to be cooler (less frequently executed), so it did not seem worthwhile to pack their operands. Now bit syntax instructions are included among the cold instructions, and they are frequently used.
2017-08-23Pack instructions using 'q', 'c', and 's'Björn Gustavsson
Update the pack engine to safely push literal operands to the pack stack and to safely pop them back to another code address. That will allow packing of more instructions.
2017-08-23beam_makeops: Rewrite the packer, fixing several bugsBjörn Gustavsson
The packer had several bugs and limitations. For instance, on a 32-bit Erlang virtual machine it would gladly pack three 't' values into one word even though it would be not safe. The rewritten version will be more careful how much it packs into each word. It will also be able to do packing for more instructions.
2017-08-23beam_makeops: Introduce the new type 'W' (machine word)Björn Gustavsson
As a preparation for potentially improving packing in the future, we will need to make sure that packable types have a defined maximum size. The packer algorithm assumes that two 'I' operands can be packed into one 64-bit word, but there are instructions that use an 'I' operand to store a pointer. It only works because those instructions are not packed for other reasons. Introduce the 'W' type and use it for operands that don't fit in 32 bits.
2017-08-23beam_makeops: Remove the unused aliases 'N' and 'U'Björn Gustavsson
I don't remember what they were used for, but they are certainly no longer used.
2017-08-23beam_makeops: Add an additional sanity checkBjörn Gustavsson
If a type has a size in %arg_size, it should also have a defined pattern in %bit_type.
2017-08-23beam_makeops: Prevent truncation when packing 'I' valuesBjörn Gustavsson
BEAM_WIDE_MASK covered the 16 right-most bits, instead of the 32 right-most bits. This bug will bite us when we'll do more packing in the future. This bug has been harmless in the past. It has been used in test_heap and allocate instructions for the number of heap words needed. It would be theoretically possible to construct a program that would need 65536 or more heap words, but it is hard to imagine a practical use for such a program. (The program would have to build a tuple or list with at least one variable and the rest of the elements being literals.)
2017-08-22beam_makeops: Remove unused subroutine save_c_codeBjörn Gustavsson
2017-08-08beam_makeops: Pretty-print the generated codeBjörn Gustavsson
2017-08-08beam_makeops: Define ARCH_32 and ARCH_64Björn Gustavsson
2017-08-08Introduce micro instructionsBjörn Gustavsson
beam_makeops will place all micro instructions in a block and generate goto instructions from one micro instruction to the next. It will also add adjustments of 'I' if necessary (if the micro instructions have different length).
2017-08-08Simplify specifying implementation of instructionsBjörn Gustavsson
Eliminate the need to write pre-processor macros for each instruction. Instead allow the implementation of instruction to be written in C directly in the .tab files. Rewrite all existing macros in this way and remove the %macro directive.
2017-06-14Update copyright yearHans Nilsson
2017-05-18Allow multiple types per argument for specific instructionsBjörn Gustavsson
Inroduce syntactic sugar so that we can write: get_list xy xy xy instead of: get_list x x x get_list x x y get_list x y x get_list x y y get_list y x x get_list y x y get_list y y x get_list y y y
2017-05-18Modernize subroutine calls by removing '&'Björn Gustavsson
In Perl 5, '&' on direct subroutine calls are optional.
2017-05-18Eliminate the -gen_dest macro flagBjörn Gustavsson
Instructions that take a 'd' argument needs a -gen_dest flag in their macros. For example: %macro:put_list PutList -pack -gen_dest put_list s s d -gen_dest was needed when x(0) was stored in a register, since it is not possible to take the address of a register. Now that x(0) is stored in memory and we can take the address, we can eliminate gen_dest.
2016-06-22beam_makeops: Save some memory by making loader tables 'const'Björn Gustavsson
Before: $ size bin/x86_64-unknown-linux-gnu/beam.smp text data bss dec hex filename 3080982 188369 158472 3427823 344def bin/x86_64-unknown-linux-gnu/beam.smp After: $ size bin/x86_64-unknown-linux-gnu/beam.smp text data bss dec hex filename 3164694 104657 158472 3427823 344def bin/x86_64-unknown-linux-gnu/beam.smp
2016-06-22beam_makeops: Separate static information from countersBjörn Gustavsson
The counters are only used in the special 'icount' emulator. We will save some memory by including the counters in the OpEntry. It will also make it possible to make opc 'const'.
2016-04-18Merge branch 'bjorn/erts/beam_load'Björn Gustavsson
* bjorn/erts/beam_load: Optimize get_tuple_element instructions that target Y registers Mend beam_SUITE:packed_registers/1 Correct unpacking of 3 operands on 32-bit archictectures Eliminate misleading #ifdef ARCH_64 in beam_opcodes.h beam_debug: Correct masking when unpacking packed operands
2016-04-14Correct unpacking of 3 operands on 32-bit archictecturesBjörn Gustavsson
0a4750f91c83 optimized unpacking by removing a mask operation when unpacking three packed operands. Unfortunately, that optimization is only safe on 64-bit architectures. Here is what happens on 32-bit architectures. The operands to be packed are 10-bit register numbers that have been turned into byte offsets: aaaaaaaaaa00 bbbbbbbbbb00 cccccccccc00 They can be packed into a single word like this: 30 20 10 0 | | | | aa aaaaaaaabb bbbbbbbbcc cccccccc00 If we call the packed word P, the original operands can be extracted like this: C = P band 2#111111111100 B = (P bsr 10) band 2#111111111100 A = (P bsr 20) band 2#111111111100 The bug was that A was extracted without the masking: A = P bsr 20 That would give A the value: aaaaaaaaaaaabb That would only be safe if the two most significant bits in B were zeroes.
2016-04-14Eliminate misleading #ifdef ARCH_64 in beam_opcodes.hBjörn Gustavsson
There is a '#ifdef ARCH_64' beam_opcodes.h, which might make you think that files generated by beam_makeops will work for both 32-bit and 64-bit architectures. They will not. beam_makeops will generate different code depending on its -wordsize option.
2016-04-13Merge branch 'henrik/update-copyrightyear'Henrik Nord
* henrik/update-copyrightyear: update copyright-year
2016-04-07Remove unused variables after code generationBjörn Gustavsson
The removal of instructions on the left side of a transformation is done while generating the code for the left side. Postpone removal of unused variables to a later, separate passes to allow more variables to be eliminated after the optimizations passes introduced in the previous commits.
2016-04-07Avoid rebuilding unchanged instructionsBjörn Gustavsson
In transformations such as: move S X0=x==0 | line Loc | call_ext Ar Func => \ line Loc | move S X0 | call_ext Ar Func we can avoid rebuilding the last instruction in the sequence by introducing a 'keep' instruction. Currently, there are only 13 transformations that are hit by this optimization, but most of them are frequently used.
2016-04-07Introduce a 'rename' instructionBjörn Gustavsson
Introduce a 'rename' instruction that can be used to optimize simple renaming with unchanged operands such as: get_tuple_element Reg P Dst => i_get_tuple_element Reg P Dst By allowing it to lower the arity of instruction, transformations such as the following can be handled: trim N Remaining => i_trim N All in all, currently 67 transformations can be optimized in this way, including some commonly used ones.
2016-04-07Simplify window management for the transformation engineBjörn Gustavsson
Generic instructions have a min_window field. Its purpose is to avoid calling transform_engine() when there are too few instructions in the current "transformation window" for a transformation to succeed. Currently it does not do much good since the window size will be decremented by one before being used. The reason for the subtraction is probably that in some circumstances in the past, the loader could read past the end of the BEAM module while attempting to fetch instructions to increase the window size. Therefore, it would not be safe to just remove the subtraction by one. The simplest and safest solution seems to always ensure that there are always at least TWO instructions when calling transform_engine(). That will be safe, as long as a BEAM module is always finished with an int_code_end/0 that is not involved in any transformation.
2016-04-07Eliminate allocation of variables in transform_engine()Björn Gustavsson
When an instruction with a variable number operands (such as select_val) is seen of the left side of a transformation, the 'next_arg' instruction will allocate a buffer to fit all variables and all operands will be copied into the buffer. Very often, the 'commit' instruction will never be reached because of a test or predicate failing or because of a short window; in that case, the variable buffer will be deallocated. Note that originally there were only few instructions with a variable number of operands, but now common operations such as tuple building also have a variable number of operands. To avoid those frequent allocations and deallocations, modify the 'next_arg' instruction to only save a pointer to the first of the "rest" arguments. Also move the deallocation of the instructions on the left side from the 'commit' instruction to the 'end' instruction to ensure that 'store_rest_args' will still work.
2016-03-15update copyright-yearHenrik Nord
2015-07-06Improve unpacking performance on x86_64Björn Gustavsson
When unpacking operands on 64-bit CPUs, use a smarter mask to help the compiler optimize the code. It turns out that on x86_64, if we use the mask 0xFFFFUL (selecting the 16 least significant bits), the compiler can combine a move and a mask operation into the single insruction 'movzwl', which will eliminate one instruction.
2015-07-06Eliminate prefetch for conditional instructionsBjörn Gustavsson
Not pre-fetching in conditional instructions (instructions that use -fail_action) seems to improve performance slightly. The reason for that is that conditional instructions may jump to the failure label, wasting the pre-fetched instruction. Another reason is that that many conditional instructions do a function call, and the register pressure is therefore high. Avoiding the pre-fetch may reduce the register pressure and pontentially result in more efficient code.
2015-07-03Teach beam_makeops to pack operands for move3 and move_windowBjörn Gustavsson
It is currently only possible to pack up to 4 operands. However, the move_window4 instrucion has 5 operands and move_window5 and move3 instrucations have 6 operands. Teach beam_makeops to pack instructions with 5 or 6 operands. Also rewrite the move_window instructions in beam_emu.c to macros to allow their operands to get packed.
2015-07-03beam_makeops: Eliminate unnecessary masking when packing 3 operandsBjörn Gustavsson
When packing 3 operands into one word, there would be an unnecessary mask operation when extracting the last operand.
2015-07-03Make the 'r' operand type optionalBjörn Gustavsson
The 'r' type is now mandatory. That means in order to handle both of the following instructions: move x(0) y(7) move x(1) y(7) we would need to define two specific operations in ops.tab: move r y move x y We want to make 'r' operands optional. That is, if we have only this specific instruction: move x y it will match both of the following instructions: move x(0) y(7) move x(1) y(7) Make 'r' optional allows us to save code space when we don't want to make handling of x(0) a special case, but we can still use 'r' to optimize commonly used instructions.
2015-07-03Allow X and Y registers to be overloaded with any literalBjörn Gustavsson
Consider the try_case_end instruction: try_case_end s The 's' operand type means that the operand can either be a literal of one of the types atom, integer, or empty list, or a register. That worked well before R12. In R12 additional types of literals where introduced. Because of way the overloading was done, an 's' operand cannot handle the new types of literals. Therefore, code such as the following is necessary in ops.tab to avoid giving an 's' operand a literal: try_case_end Literal=q => move Literal x | try_case_end x While this work, it is error-prone in that it is easy to forget to add that kind of rule. It would also be complicated in case we wanted to introduce a new kind of addition operator such as: i_plus jssd Since there are two 's' operands, two scratch registers and two 'move' instructions would be needed. Therefore, we'll need to find a smarter way to find tag register operands. We will overload the pid and port tags for X and Y register, respectively. That works because pids and port are immediate values (fit in one word), and there are no literals for pids and ports.
2015-07-03Change the meaning of 'x' in a transformationBjörn Gustavsson
The purpose of this series of commits is to improve code generation for the Clang compiler. As a first step we want to change the meaning of 'x' in a transformation such as: operation Literal=q => move Literal x | operation x Currently, a plain 'x' means reg[0] or x(0), which is the first element in the X register array. That element is distinct from r(0) which is a variable in process_main(). Therefore, since r(0) and x(0) are currently distinct it is fine to use x(0) as a scratch register. However, in the next commit we will eliminate the separate variable for storing the contents of X register zero (thus, x(0) and r(0) will point to the same location in the X register array). Therefore, we must use another scratch register in transformation. Redefine a plain 'x' in a transformation to mean x(1023). Also define SCRATCH_X_REG so that we can refer to the register by name from C code.
2015-07-03beam_makeops: Eliminate crash because of unsafe packingBjörn Gustavsson
Consider an hypothetical instruction: do_something x x c The loader would crash if we tried to load an instance of the instruction with the last operand referencing a literal: {do_something,{x,0},{x,1},{literal,{a,b,c}}} Teach beam_makeops to turn off packing for such unsafe instructions.
2015-06-24erts: Remove HALFWORD_HEAP definitionBjörn-Egil Dahlberg
2015-06-18Change license text to APLv2Bruce Yinhe
2014-01-28BEAM loader: Support preservation of extra operand in transformsBjörn Gustavsson
It was not possible to preserve extra arguments in transformations. The following (hypothetical) example will now work: some_op Lit=c SizeArg Rest=* => move Lit x | some_op x SizeArg Rest