aboutsummaryrefslogtreecommitdiffstats
path: root/erts/emulator/beam/beam_emu.c
AgeCommit message (Collapse)Author
2015-12-09Merge branch 'egil/pd-opt-get/OTP-13167'Björn-Egil Dahlberg
* egil/pd-opt-get/OTP-13167: erts: Add i_get_hash instruction erts: Use internal hash for process dictionaries
2015-12-08Merge branch 'rickard/ohmq-fixup/OTP-13047'Rickard Green
* rickard/ohmq-fixup/OTP-13047: Replace off_heap_message_queue option with message_queue_data option Always use literal_alloc Distinguish between GC disabled by BIFs and other disabled GC Fix process_info(_, off_heap_message_queue) Off heap message queue test suite Remove unused variable Fix memory leaks
2015-12-08Distinguish between GC disabled by BIFs and other disabled GCRickard Green
Processes remember heap fragments that are known to be fully live due to creation in a just called BIF that yields in the live_hf_end field. This field must not be used if we have not disabled GC in a BIF. F_DELAY_GC has been introduced in order to distinguish between to two different scenarios. - F_DISABLE_GC should *only* be used by BIFs. This when the BIF needs to yield while preventig a GC. - F_DELAY_GC should only be used when GC is temporarily disabled while the process is scheduled. A process must not be scheduled out while F_DELAY_GC is set.
2015-12-07erts: Add i_get_hash instructionBjörn-Egil Dahlberg
Calculate hashvalue in load-time for constant process dictionary gets.
2015-11-19Refactor have seq_trace token testBjörn-Egil Dahlberg
2015-11-17Remove DTrace Harddebug clutterBjörn-Egil Dahlberg
2015-11-13Merge branch 'rickard/gc-bump-reds/OTP-13097'Rickard Green
* rickard/gc-bump-reds/OTP-13097: Bump reductions on GC
2015-11-13Merge branch 'rickard/gc-after-bif-cond/OTP-13098'Rickard Green
* rickard/gc-after-bif-cond/OTP-13098: Use the same conditions when triggering GC after BIF
2015-11-12Merge branch 'rickard/ohmq/OTP-13047'Rickard Green
* rickard/ohmq/OTP-13047: Fragmented young heap generation and off_heap_message_queue option Refactor GC Introduce literal tag Conflicts: erts/doc/src/erlang.xml erts/emulator/beam/erl_gc.c
2015-11-12Merge branch 'sverk/literal-memory-range'Rickard Green
* sverk/literal-memory-range: erts: Refactor line table in loaded beam code erts: Refactor header of loaded beam code fix check_process_code for separate literal area erts: Add support for fast erts_is_literal() erts: Refactor erl_mmap to allow several mapper instances erts: Add new allocator LITERAL erts: Fix strangeness in treatment of MSEG_ALIGN_BITS erts: Cleanup main carrier creation erts: Remove unused erts_have_erts_mmap erts: Refactor config test for posix_memalign
2015-11-12Bump reductions on GCRickard Green
2015-11-12Use the same conditions when triggering GC after BIFRickard Green
2015-11-12Fragmented young heap generation and off_heap_message_queue optionRickard Green
* The youngest generation of the heap can now consist of multiple blocks. Heap fragments and message fragments are added to the youngest generation when needed without triggering a GC. After a GC the youngest generation is contained in one single block. * The off_heap_message_queue process flag has been added. When enabled all message data in the queue is kept off heap. When a message is selected from the queue, the message fragment (or heap fragment) containing the actual message is attached to the youngest generation. Messages stored off heap is not part of GC.
2015-11-12erts: Refactor header of loaded beam codeSverker Eriksson
to use a real C struct instead of array.
2015-11-12Introduce literal tagRickard Green
2015-10-09Teach erlang:is_builtin/3 that erlang:apply/3 is built-inBjörn Gustavsson
erlang:is_builtin(erlang, apply, 3) returns 'false'. That seems to be an oversight in the implementation of erlang:is_builtin/3 rather than a conscious design decision. Part of apply/3 is implemented in C (as a special instruction), and part of it in Erlang (only used if apply/3 is used recursively). That makes apply/3 special compared to all other BIFs. From the viewpoint of the user, apply/3 is a built-in function, since it cannot possibly be implemented in pure Erlang. Noticed-by: Stavros Aronis
2015-07-07Merge branch 'maint'Zandra Hird
2015-07-07Merge branch 'maint-17' into maintZandra Hird
Conflicts: OTP_VERSION erts/doc/src/notes.xml erts/vsn.mk lib/runtime_tools/doc/src/notes.xml lib/runtime_tools/vsn.mk otp_versions.table
2015-07-06Teach non-smp VM how to deal with trace port crashRickard Green
2015-07-06Slightly tweak the peformance for get_listBjörn Gustavsson
Fetch the head and tail parts to temporary variables before writing them to their destinations. That should allow the CPU to perform the moves in parallel, which might improve performance.
2015-07-06Speed up list matchingBjörn Gustavsson
The combination is_non_empty_list followed by get_list is extremly common (but not in estone_SUITE, which is why it has not been noticed before). Therefore it is worthwile to introduce a combined instruction.
2015-07-06Eliminate the variable temp_bits at the top scope of process_main()Björn Gustavsson
2015-07-03Teach beam_makeops to pack operands for move3 and move_windowBjörn Gustavsson
It is currently only possible to pack up to 4 operands. However, the move_window4 instrucion has 5 operands and move_window5 and move3 instrucations have 6 operands. Teach beam_makeops to pack instructions with 5 or 6 operands. Also rewrite the move_window instructions in beam_emu.c to macros to allow their operands to get packed.
2015-07-03Use a cheaper tag scheme for 'd' operandsBjörn Gustavsson
Since 'd' operands can only either an X register or an Y register, we only need a single bit to distinguish them. Furthermore, we can pre-multiply the register number with the word size to speed up address calculation.
2015-07-03Introduce swap_temp/3 and swap/2Björn Gustavsson
Sequences of three move instructionst that effectively swap the contents of two registers are fairly common. We can replace them with a swap_temp/3 instruction. The third operand is the temporary register to be used for swapping, since the temporary register may actually be used. If swap_temp/3 instruction is followed by a call, the temporary register will often (but not always) be killed by the call. If it is killed, we can replace the swap_temp/3 instruction with a slightly cheaper swap/2 instruction.
2015-07-03Introduce specialized versions of move2Björn Gustavsson
Currently, move2/2 does the two moves sequentially to ensure that the instruction will always work correctly. We can do better than that. If the two move instructions have any registers in common, we can introduce simpler and slightly more efficient instructions to handle those cases: move_shift/3 move_dup/3 For the remaining cases when the the move instructions have no common registers, the move2/4 instruction can perform the moves in parallel which is probably slightly more efficient. For clarity's sake, we will remain the instruction to move2_par/4.
2015-07-03Add back frequently used x(0) instructionsBjörn Gustavsson
2015-07-03Rewrite the hipe_mode_switch instructionsBjörn Gustavsson
The 'cmd' variable that were shared by several hipe_mode_switch instructions would cause clang to produce sub-optimal code, probably because it considered the instructions as part of of loop that needed to be optimized. What would was that 'cmd' would be assigned to the ESI register (lower 32 bits of the RSI register). It would use ESI for other purposes in instructions, but at the end of every instruction it would set ESI to 1 just in case the next instruction happened to be hipe_trap_return. This can be seen clearly if this commit is omitted and the define HIPE_MODE_SWITCH_CMD_RETURN in hipe/hipe_mode_switch.h is changed from 1 to some other number such as 42. You will see that 42 is assigned to ESI at the end of every instruction. Eliminate this problem by elimininating the shared 'cmd' variable.
2015-07-03Remove the last use of tmp_arg1Björn Gustavsson
2015-07-03Eliminate use of tmp_arg1 and tmp_arg2 in bit syntaxBjörn Gustavsson
2015-07-03Remove the i_fetch instructionBjörn Gustavsson
2015-07-03Eliminate use of i_fetch for bit syntax instructionsBjörn Gustavsson
2015-07-03Eliminate the use of i_fetch for BIF instructionsBjörn Gustavsson
2015-07-03Eliminate the use of i_fetch for relational operatorsBjörn Gustavsson
2015-07-03Eliminate the use of i_fetch in arithmetic instructionsBjörn Gustavsson
The i_fetch instruction fetches two operands and places them in the tmp_arg1 and tmp_arg2 variables. The next instruction (such as i_plus) does not have to handle different types of operands, but can get get them simply from the tmp_arg* variables. Thus, i_fetch was introduced as a way to temper a potentail combinatorial explosion. Unfortunately, clang will generate terrible code because of the tmp_arg1 and tmp_arg2 variables being live across multiple instructions. Note that Clang has no way to predict the control flow from one instruction to another. Clang must assume that any instruction can jump to any other instruction. Somehow GCC manages to cope with this situation much better. Therefore, to improve the quality of the code generated by clang, we must eliminate all uses of the tmp_arg1 and tmp_arg2 variables. This commit eliminates the use of i_fetch in combination with the arithmetic and logical instructions. While we are touching the code for the bsr and bsl instructions, also move the tmp_big[] array from top scope of process main into the block that encloses the bsr and bsl instructions.
2015-07-03Make the 'r' operand type optionalBjörn Gustavsson
The 'r' type is now mandatory. That means in order to handle both of the following instructions: move x(0) y(7) move x(1) y(7) we would need to define two specific operations in ops.tab: move r y move x y We want to make 'r' operands optional. That is, if we have only this specific instruction: move x y it will match both of the following instructions: move x(0) y(7) move x(1) y(7) Make 'r' optional allows us to save code space when we don't want to make handling of x(0) a special case, but we can still use 'r' to optimize commonly used instructions.
2015-07-03Allow X and Y registers to be overloaded with any literalBjörn Gustavsson
Consider the try_case_end instruction: try_case_end s The 's' operand type means that the operand can either be a literal of one of the types atom, integer, or empty list, or a register. That worked well before R12. In R12 additional types of literals where introduced. Because of way the overloading was done, an 's' operand cannot handle the new types of literals. Therefore, code such as the following is necessary in ops.tab to avoid giving an 's' operand a literal: try_case_end Literal=q => move Literal x | try_case_end x While this work, it is error-prone in that it is easy to forget to add that kind of rule. It would also be complicated in case we wanted to introduce a new kind of addition operator such as: i_plus jssd Since there are two 's' operands, two scratch registers and two 'move' instructions would be needed. Therefore, we'll need to find a smarter way to find tag register operands. We will overload the pid and port tags for X and Y register, respectively. That works because pids and port are immediate values (fit in one word), and there are no literals for pids and ports.
2015-07-03Eliminate R_REG_DEFBjörn Gustavsson
2015-07-03Store r(0) and x(0) in the same locationBjörn Gustavsson
As part of improving code generation for clang, we want to eliminate the special variable that stores the content of X register zero most of the time. In a future, that will allow us to eliminate the special case of handling r(0) for most instructions, thus reducing the code size and allow other simplifcations. Therefore, in this commit, eliminate the variable that is used to store r(0) and make r(0) as synonym for x(0). I have chosen to keep the r(0) define to keep the size of the diff managable.
2015-07-03beam_emu.c: Remove unused MoveGenDest macroBjörn Gustavsson
2015-07-01erts: Remove halfword !HEAP_ON_C_STACKBjörn-Egil Dahlberg
2015-06-24erts: Remove halfword pointer compressionBjörn-Egil Dahlberg
* Removed COMPRESS_POINTER and EXPAND_POINTER
2015-06-24erts: Remove HALFWORD_HEAP definitionBjörn-Egil Dahlberg
2015-06-18Change license text to APLv2Bruce Yinhe
2015-06-15erts: Remove hashmap probabilistic heap overestimationSverker Eriksson
by adding a dynamic heap factory. "binary_to_term" is now a hybrid solution with both a call to decoded_size() to calculate needed heap space AND possible dynamic allocation of more heap space if needed for big maps. The heap size returned from decoded_size() is guaranteed to be sufficient for all term heap data except for hashmap nodes. All hashmap nodes are created at the end of dec_term() by invoking the heap factory interface that may allocate more heap space on process heap or in fragments. With this commit it is no longer guaranteed that a message is confined to only one heap fragment.
2015-05-11Send format and args on process exit to error_loggerJosé Valim
Previously, the emulator would generate a whole string with values and call the error_logger passing "~s~n". This commit changes it to a format string containing ~p with the respective values as arguments.
2015-05-08Merge branch 'rickard/timer-optimization/OTP-12650'Rickard Green
* rickard/timer-optimization/OTP-12650: Optimized timer implementation Reusable red-black tree implementation Conflicts: erts/emulator/beam/erl_bif_timer.c
2015-05-08Optimized timer implementationRickard Green
2015-04-28Merge branch 'egil/opt-instructions/OTP-12690'Björn-Egil Dahlberg
* egil/opt-instructions/OTP-12690: erts: Specialize minus and plus instruction erts: Add move2 specialization for common move patterns erts: Specialize rem instruction for common case erts: Specialize band instruction for common case erts: Batch loads and stores for move_window erts: Fix loader increment from minus instruction erts: Add move window instruction erts: Add instruction move3 for xy and xx erts: Specialize compare instructions kernel: Add instruction_count helper to erts_debug
2015-04-27erts: Specialize minus and plus instructionBjörn-Egil Dahlberg
Seen on SSL application where substraction with x registers were prevalent: * i_minus specialization on x registers * i_plus specialization on x registers