aboutsummaryrefslogtreecommitdiffstats
path: root/erts/emulator/beam/ops.tab
AgeCommit message (Collapse)Author
2018-01-26Eliminate get_list/3 internally in the compilerBjörn Gustavsson
Instructions that produce more than one result complicate optimizations. get_list/3 is one of two instructions that produce multiple results (get_map_elements/3 is the other). Introduce the get_hd/2 and get_tl/2 instructions that return the head and tail of a cons cell, respectively, and use it internally in all optimization passes. For efficiency, we still want to use get_list/3 if both head and tail are used, so we will translate matching pairs of get_hd and get_tl back to get_list instructions.
2018-01-22Don't build a stacktrace if it's only passed to erlang:raise/3Björn Gustavsson
Consider the following function: function({function,Name,Arity,CLabel,Is0}, Lc0) -> try %% Optimize the code for the function. catch Class:Error:Stack -> io:format("Function: ~w/~w\n", [Name,Arity]), erlang:raise(Class, Error, Stack) end. The stacktrace is retrieved, but it is only used in the call to erlang:raise/3. There is no need to build a stacktrace in this function. We can avoid the building if we introduce an instruction called raw_raise/3 that works exactly like the erlang:raise/3 BIF except that its third argument must be a raw stacktrace.
2017-12-08Merge pull request #1634 from bjorng/bjorn/get_stacktrace-syntax/OTP-14692Björn Gustavsson
Add syntax in try/catch to retrieve the stacktrace directly
2017-11-30Add syntax in try/catch to retrieve the stacktrace directlyBjörn Gustavsson
This commit adds a new syntax for retrieving the stacktrace without calling erlang:get_stacktrace/0. That allow us to deprecate erlang:get_stacktrace/0 and ultimately remove it. The problem with erlang:get_stacktrace/0 is that it can keep huge terms in a process for an indefinite time after an exception. The stacktrace can be huge after a 'function_clause' exception or a failed call to a BIF or operator, because the arguments for the call will be included in the stacktrace. For example: 1> catch abs(lists:seq(1, 1000)). {'EXIT',{badarg,[{erlang,abs, [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20|...]], []}, {erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]}, {erl_eval,expr,5,[{file,"erl_eval.erl"},{line,431}]}, {shell,exprs,7,[{file,"shell.erl"},{line,687}]}, {shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]}, {shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]}} 2> erlang:get_stacktrace(). [{erlang,abs, [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22, 23,24|...]], []}, {erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]}, {erl_eval,expr,5,[{file,"erl_eval.erl"},{line,431}]}, {shell,exprs,7,[{file,"shell.erl"},{line,687}]}, {shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]}, {shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}] 3> We can extend the syntax for clauses in try/catch to optionally bind the stacktrace to a variable. Here is an example using the current syntax: try Expr catch C:E -> Stk = erlang:get_stacktrace(), . . . In the new syntax, it would look like: try Expr catch C:E:Stk -> . . . Only a variable (not a pattern) is allowed in the stacktrace position, to discourage matching of the stacktrace. (Matching would also be expensive, because the raw format of the stacktrace would have to be converted to the cooked form before matching.) Note that: try Expr catch E -> . . . is a shorthand for: try Expr catch throw:E -> . . . If the stacktrace is to be retrieved for a throw, the 'throw:' prefix must be explicitly included: try Expr catch throw:E:Stk -> . . .
2017-11-30Stop trying to maximize the use of x(0)Björn Gustavsson
X register 0 used to be mapped to a hardware register, and therefore faster than the other registers. Because of that, the compiler tried to use x(0) as much as possible as a temporary register. That was changed a few releases ago. X register 0 is now placed in the array of all X registers and has no special speed advantage compared to the other registers. Remove the code in the compiler that attempts to use x(0) as much as possible. As a result, the following type of instruction will be much less frequent: {put_list,Src,{x,0},{x,0}} Instead, the following type of instruction will be more frequent: {put_list,Src,{x,X},{x,X}} (Where X is an arbitrary X register.) Update the runtime system to specialize that kind of put_list instruction.
2017-11-14Fix broken receive optimizationBjörn Gustavsson
When a ref is created before performing a receive that will only receive message containing that ref, there is a compiler optimization to avoid scanning messages that can't possible contain the newly created ref. Magnus Lång pointed out that the implementation of the optimization is flawed. Exceptions or recursive calls could cause the receive operation to scan the receive queue from a position beyond the expected message (that is, the message containing the ref would never be matched out). See the receive_opt_exception/1 and receive_opt_recursion/1 test cases in receive_SUITE. It turns out that we can simplify the implementation of the optimization while fixing the bug (suggested by Magnus Lång). We actually don't need the c_p->msg.mark field. It is enough to have c_p->msg.saved_pos; if it is non-zero, it is a valid position in the message qeueue. All we need to do is to ensure that we clear c_p->msg.saved_pos when a receive is exited normally or abnormally. We can clear c_p->msg.saved_pos in JOIN_MESSAGE(), since it is called both when leaving a receive because a message matched and because there was a timeout and the 'after' clause was executed. In addition, we need to clear c_p->msg.saved_pos when an exception is caught. https://bugs.erlang.org/browse/ERL-511
2017-11-07Strengthen tests of definition of specific instructionsBjörn Gustavsson
Don't allow defining an specific operation more than once with the exact same operands. Don't allow a specific operation to be defined with different arities.
2017-10-09Slightly speed up try/catchBjörn Gustavsson
The try_end and try_case instructions are implemented the same way (try_case is translated to try_end by the loader). We can do better than that. We know that try_case will only be executed when an exception has been caught. Therefore, we know that x(0) is the non-value and that x(1) through x(3) need to be shifted down to x(0) through x(2). There is no need to test x(0) before shifting down. try_end does not need the register shifting code at all.
2017-10-05Introduce a syntax for marking operands as "optional use"Björn Gustavsson
Introduce a syntax to mark an operand that is not always used when an instrution is executed. Example of such operands are the fail label for is_nil or the number of live registers for an allocate instruction. Use a question mark to annotate optional use: is_nil f? xy allocate t t?
2017-10-05ops.tab: Slightly optimize badmatch on a Y registerBjörn Gustavsson
2017-10-01Change operand from 'P' to 'Q' for i_apply_last and i_apply_fun_lastBjörn Gustavsson
All other instructions that increment the stack pointer takes a 'Q' operand.
2017-09-14Merge branch 'bjorn/erts/relative-jumps'Björn Gustavsson
* bjorn/erts/relative-jumps: Pack failure labels in i_select_val2 and i_select_tuple_arity2 Optimize i_select_tuple_arity2 and is_select_lins Rewrite select_val_bins so that its labels can be packed Pack sequences of trailing 'f' operands Implement packing of 'f' and 'j' Make sure that mask literals are 64 bits Use relative failure labels Add information about offset to common group start position Remove JUMP_OFFSET Refactor instructions to support relative jumps Introduce a new trace_jump/1 instruction for tracing Avoid using $Src more than once
2017-09-14Pack failure labels in i_select_val2 and i_select_tuple_arity2Björn Gustavsson
2017-09-13Merge branch 'bjorn/erts/improve-beam-ops'Björn Gustavsson
* bjorn/erts/improve-beam-ops: Add built-in macros $ARG_POSITION() and $IS_PACKED() Use 't' instead of 'I' bit syntax operands Optimize operand type for match context in i_bs_get_integer Change operand type from 's' to 'S' for a few instructions Use the correct name of the parameter
2017-09-13Merge pull request #1544 from michalmuskala/eq-optBjörn Gustavsson
Optimise equality comparisons
2017-09-11Use 't' instead of 'I' bit syntax operandsBjörn Gustavsson
The number of live registers, unit and flags, and the number of slots all fit comortably in 16 bits. This change will give more opportunities for packing.
2017-09-11Optimize operand type for match context in i_bs_get_integerBjörn Gustavsson
The match context is always in an X register. Change the 's' operand to 'x'. While we are it, also change the operands for Live and FlagsAndUnit to 't' (they will both fit comfortably in 16 bits).
2017-09-11Change operand type from 's' to 'S' for a few instructionsBjörn Gustavsson
'S' is slightly more efficient. Using 'S' may also enable more packing.
2017-09-11Introduce a new trace_jump/1 instruction for tracingBjörn Gustavsson
As a preparation for introducing relative jumps, introduce "trace_jump W" that can be used for tracing. This instruction will continue to have an absolute address for the jump target. (Note: This instruction is never created during loading; it is only created in stubs when tracing is active.)
2017-09-08Optimise equality comparisonsMichał Muskała
* In both loader and compiler, make sure constants are always the second operand - many passes of the compiler assume that's always the case. * In loader rewrite is_eq_exact with same arguments to skip the instruction and with different constants move one to an x register to maintain the properly outlined above. * The same (but in reverse) is done with the is_ne_exact, where we rewrite to an unconditional jump or add a move to an x register. * All of the above allow to replace is_eq_exact_fss with is_eq_exact_fyy and is_ne_exact_fss with is_ne_exact_fSS as those are the only possibilities left.
2017-09-04Annotate try_case_end as coldBjörn Gustavsson
try_case_end is an excepting-generating instruction that is infrequently executed.
2017-08-31ops.tab: Mark infrequently used instructions as %coldBjörn Gustavsson
Instructions that used to be implemented in beam_emu.c were not marked as cold as it would make no difference.
2017-08-31Introduce '%warm' and beam_warm.hBjörn Gustavsson
The bit syntax instructions are mixed among other instructions in beam_hot.h and beam_cold.h. Introduce a new hotness level called '%warm' with is associated file beam_warm.h. Mark all bit syntax instructions as '%warm'.
2017-08-23Add the 'S' type for a register sourceBjörn Gustavsson
The type 'd' could be used both for destination registers and source register. Restrict the 'd' type to only be used for destinations, and introduce the new 'S' type to be used when a source must be a register.
2017-08-23Introduce more packable typesBjörn Gustavsson
The 'I' type can be replaced with 't' if we know that the value will fit in 16 bits. The 'P' type can be replaced with 'Q' if it is used for deallocating a stack frame.
2017-08-23Make map update instruction functions indepedent of instruction formatBjörn Gustavsson
Having the helper functions for map update knowing all the details of operands for the instruction will make it difficult to make improvements such as better packing.
2017-08-23beam_makeops: Introduce the new type 'W' (machine word)Björn Gustavsson
As a preparation for potentially improving packing in the future, we will need to make sure that packable types have a defined maximum size. The packer algorithm assumes that two 'I' operands can be packed into one 64-bit word, but there are instructions that use an 'I' operand to store a pointer. It only works because those instructions are not packed for other reasons. Introduce the 'W' type and use it for operands that don't fit in 32 bits.
2017-08-23Use the wait_timeout_{un}locked_int instructionsBjörn Gustavsson
The transformations were incorrect.
2017-08-15Slightly optimize updating of mapsBjörn Gustavsson
The instruction put_map_assoc/5 (used for updating a map) has a failure operand, but it can't actually fail provided that its "map" argument is a map. The following code: M#{key=>value}. will be compiled to: {test,is_map,{f,3},[{x,0}]}. {line,[...]}. {put_map_assoc,{f,0},{x,0},{x,0},1,{list,[{atom,key},{atom,value}]}}. return. {label,3}. %% Code that produces a 'badmap' exception follows. Because of the is_map instruction, {x,0} always contains a map when the put_map_assoc instruction is executed. Therefore we can remove the failure operand. That will save one word, and also eliminate two tests at run-time. The only problem is that the compiler in OTP 17 did not emit a is_map instruction before the put_map_assoc instruction. Therefore, we must add an instruction that tests for a map if the code was compiled with the OTP 17 compiler. Unfortunately, there is no safe and relatively easy way to known that the OTP 17 compiler was used, so we will check whether a compiler before OTP 20 was used. OTP 20 introduced a new chunk type for atoms, which is trivial to check.
2017-08-11Break out most instructions from beam_emu.cBjörn Gustavsson
2017-08-08Simplify specifying implementation of instructionsBjörn Gustavsson
Eliminate the need to write pre-processor macros for each instruction. Instead allow the implementation of instruction to be written in C directly in the .tab files. Rewrite all existing macros in this way and remove the %macro directive.
2017-07-06Introduce new_small_map_lit opMichal Muskala
Take advantage of the fact that small maps have a tuple for keys. When new map is constructed and all keys are literals, we can construct the entire keys tuple as a literal. This should reduce the memory of maps created with literal keys almost by half, since they all can share the same keys tuple.
2017-06-28Reduce C code size for bit syntax instructionsBjörn Gustavsson
Bit syntax instructions never store their result in a Y register. Therefore, change the bit syntax instructions to use 'x' as the destination instead of 'd'. That will simplify the code that stores the result, and will be a slight reduction in code size and execution time.
2017-06-28ops.tab: Use combined operandsBjörn Gustavsson
2017-06-26Eliminate potential unsafe use of general destinationBjörn Gustavsson
a3407eaa2104d6 eliminated the -gen_dest flag for macros in ops.tab. It turns out that the new implementation (taking the address of the X or X destination register) is unsafe if the destination is a Y register and there can be a GC. The problem is that the address to the Y register will change if there is a GC. Fortunately, the few instructions in OTP 20 that have a general destinations are safe. The put_list_ssd instruction never does a GC. The bit syntax instructions that may do a GC will always store the result to an X register. To be completely sure, rewrite the destination register from 'd' to 'x' for the bit syntax instructions. That means that a bit syntax instruction with a Y register destionation will abort the loading if it is encountered.
2017-06-14Update copyright yearHans Nilsson
2017-05-18Allow multiple types per argument for specific instructionsBjörn Gustavsson
Inroduce syntactic sugar so that we can write: get_list xy xy xy instead of: get_list x x x get_list x x y get_list x y x get_list x y y get_list y x x get_list y x y get_list y y x get_list y y y
2017-05-18Eliminate the -gen_dest macro flagBjörn Gustavsson
Instructions that take a 'd' argument needs a -gen_dest flag in their macros. For example: %macro:put_list PutList -pack -gen_dest put_list s s d -gen_dest was needed when x(0) was stored in a register, since it is not possible to take the address of a register. Now that x(0) is stored in memory and we can take the address, we can eliminate gen_dest.
2017-03-24compiler: Add is_tagged_tuple instructionBjörn-Egil Dahlberg
Rewrite the instruction stream on tagged tuple tests. Tagged tuples means a tuple of any arity with an atom as its first element. Typically records, ok-tuples and error-tuples. from: ... {test,is_tuple,Fail,[Src]}. {test,test_arity,Fail,[Src,Sz]}. ... {get_tuple_element,Src,0,Dst}. ... {test,is_eq_exact,Fail,[Dst,Atom]}. ... to: ... {test,is_tagged_tuple,Fail,[Src,Sz,Atom]}. ...
2017-01-12Perform potentially long GC on dirty schedulers if availableRickard Green
2016-09-06Merge branch 'kvakvs/erts/gc_minor_option/OTP-11695'Lukas Larsson
* kvakvs/erts/gc_minor_option/OTP-11695: erts: Fix req_system_task gc typespec Fix process_SUITE system_task_blast and no_priority_inversion2 Option to erlang:garbage_collect to request minor (generational) GC Conflicts: erts/emulator/beam/erl_process.c erts/preloaded/src/erts_internal.erl
2016-08-31Remove old purge strategyRickard Green
2016-08-12Option to erlang:garbage_collect to request minor (generational) GCDmytro Lytovchenko
Note: Minor GC option is a hint, and GC may still decide to run fullsweep. Test case for major and minor gc on self Test case for major and minor gs on some other process + async gc test check docs fix
2016-04-18Merge branch 'bjorn/compiler/misc-opt'Björn Gustavsson
* bjorn/compiler/misc-opt: v3_kernel: Construct literal lists properly Use the register map in %live in beam_utils:is_killed_block/2 Teach beam_utils to check liveness for put_map instructions beam_peep: Help out beam_jump
2016-04-18Merge branch 'bjorn/erts/beam_load'Björn Gustavsson
* bjorn/erts/beam_load: Optimize get_tuple_element instructions that target Y registers Mend beam_SUITE:packed_registers/1 Correct unpacking of 3 operands on 32-bit archictectures Eliminate misleading #ifdef ARCH_64 in beam_opcodes.h beam_debug: Correct masking when unpacking packed operands
2016-04-14v3_kernel: Construct literal lists properlyBjörn Gustavsson
Use cerl:make_list/1 instead of a home-made make_list/1 to ensure that literal lists are constructed as literals. In a future release, we would like to forbid in the loader construction of literal lists using instructions like: put_list {atom,a} [] Dst The proper way is: move {literal,[a]} {x,0} Also update the comment about "put_list Const [] Dst" in ops.tab.
2016-04-14Optimize get_tuple_element instructions that target Y registersBjörn Gustavsson
Several improvements in the compiler (e.g. c288ab87fd6) has lead to an Y register being the target for get_tuple_element instructions. Therefore, introduce i_get_tuple_element2y that combines two consecutive get_tuple_element instructions that target Y registers.
2016-04-13Merge branch 'henrik/update-copyrightyear'Henrik Nord
* henrik/update-copyrightyear: update copyright-year
2016-04-08Simplify the raise instruction to reduce code sizeBjörn Gustavsson
The raise/2 instruction is almost always used like this: raise x(2) x(1) Therefore, we can translate it to an internal i_raise/0 instruction that uses x(2) x(1) as its implicit operands. We will also remove the backward compatibility with R10-0. It is unlikely that anyone still is using BEAM files compiled with the R10-0 compiler, especially since most of those modules cannot be loaded. The loader will refuse to load any module that uses the old non-GCIng arithmetic instructions or the non-GCing versions of length/1 or size/1. Doing these changes will reduce both the size of the loaded BEAM code and size of the code in process_main().
2016-04-07Eliminate unnecessary renaming of bs_put_utf16/3Björn Gustavsson
There is no reason to rename bs_put_utf16/3. (We rename instructions if we'll need to change the operands or if we will need to avoid an endless transformation loop. Neither of these reasons apply to bs_put_utf16/3.)