otp.git - Mirror of Erlang/OTP repository.

Age	Commit message	Author
2015-07-06	Speed up list matching	Björn Gustavsson
	The combination is_non_empty_list followed by get_list is extremely common (but not in estone_SUITE, which is why it has not been noticed before). Therefore it is worthwhile to introduce a combined instruction.
2015-07-06	Eliminate prefetch for conditional instructions	Björn Gustavsson
	Not pre-fetching in conditional instructions (instructions that use -fail_action) seems to improve performance slightly. The reason is that conditional instructions may jump to the failure label, wasting the pre-fetched instruction. Another reason is that many conditional instructions do a function call, and the register pressure is therefore high. Avoiding the pre-fetch may reduce the register pressure and potentially result in more efficient code.
2015-07-03	Teach beam_makeops to pack operands for move3 and move_window	Björn Gustavsson
	It is currently only possible to pack up to 4 operands. However, the move_window4 instruction has 5 operands, and the move_window5 and move3 instructions have 6 operands. Teach beam_makeops to pack instructions with 5 or 6 operands. Also rewrite the move_window instructions in beam_emu.c as macros to allow their operands to be packed.
2015-07-03	Ensure that the move_call_ext_{last,only} instructions are used	Björn Gustavsson
	Update transformations to ensure that the move_call_ext_last and move_call_ext_only instructions are used.
2015-07-03	Introduce swap_temp/3 and swap/2	Björn Gustavsson
	Sequences of three move instructions that effectively swap the contents of two registers are fairly common. We can replace them with a swap_temp/3 instruction. The third operand is the temporary register to be used for swapping, since the temporary register may actually be used. If the swap_temp/3 instruction is followed by a call, the temporary register will often (but not always) be killed by the call. If it is killed, we can replace the swap_temp/3 instruction with a slightly cheaper swap/2 instruction.
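The rewrite described in this message can be modelled as a small peephole pass. The sketch below is only an illustration in Python, not the loader's implementation: the tuple encoding of instructions, the helper names fuse_swaps and drop_dead_temp, and the liveness callback are all assumptions made for the example.

```python
# Illustrative model only. Instructions are tuples such as ("move", src, dst);
# this encoding and the helper names are assumptions, not BEAM loader formats.

def fuse_swaps(code):
    """Replace 'move A T; move B A; move T B' with ('swap_temp', A, B, T)."""
    out = []
    i = 0
    while i < len(code):
        window = code[i:i + 3]
        if len(window) == 3 and all(ins[0] == "move" and len(ins) == 3 for ins in window):
            (_, a, t), (_, b, a2), (_, t2, b2) = window
            if a == a2 and t == t2 and b == b2 and len({a, b, t}) == 3:
                out.append(("swap_temp", a, b, t))
                i += 3
                continue
        out.append(code[i])
        i += 1
    return out


def drop_dead_temp(code, killed_by_call):
    """Turn swap_temp/3 into swap/2 when the following call kills the temporary.

    killed_by_call(call_instruction, register) is a caller-supplied predicate;
    how liveness is really determined is outside the scope of this sketch.
    """
    out = []
    for idx, ins in enumerate(code):
        nxt = code[idx + 1] if idx + 1 < len(code) else None
        if (ins[0] == "swap_temp" and nxt is not None
                and nxt[0] == "call" and killed_by_call(nxt, ins[3])):
            out.append(("swap", ins[1], ins[2]))
        else:
            out.append(ins)
    return out
```

When the predicate reports that the temporary is still live, the swap_temp/3 form is kept, which matches the "often (but not always)" caveat in the message.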
2015-07-03	Introduce specialized versions of move2	Björn Gustavsson
	Currently, move2/2 does the two moves sequentially to ensure that the instruction will always work correctly. We can do better than that. If the two move instructions have any registers in common, we can introduce simpler and slightly more efficient instructions to handle those cases: move_shift/3 and move_dup/3. For the remaining cases, when the move instructions have no common registers, the move2/4 instruction can perform the moves in parallel, which is probably slightly more efficient. For clarity's sake, we will rename the instruction to move2_par/4.
2015-07-03	Add back frequently used x(0) instructions	Björn Gustavsson

2015-07-03	Remove the last use of tmp_arg1	Björn Gustavsson

2015-07-03	Remove the i_fetch instruction	Björn Gustavsson

2015-07-03	Eliminate use of i_fetch for bit syntax instructions	Björn Gustavsson

2015-07-03	Eliminate the use of i_fetch for BIF instructions	Björn Gustavsson

2015-07-03	Eliminate the use of i_fetch for relational operators	Björn Gustavsson

2015-07-03	Eliminate the use of i_fetch in arithmetic instructions	Björn Gustavsson
	The i_fetch instruction fetches two operands and places them in the tmp_arg1 and tmp_arg2 variables. The next instruction (such as i_plus) does not have to handle different types of operands, but can simply get them from the tmp_arg* variables. Thus, i_fetch was introduced as a way to temper a potential combinatorial explosion. Unfortunately, Clang will generate terrible code because of the tmp_arg1 and tmp_arg2 variables being live across multiple instructions. Note that Clang has no way to predict the control flow from one instruction to another. Clang must assume that any instruction can jump to any other instruction. Somehow GCC manages to cope with this situation much better. Therefore, to improve the quality of the code generated by Clang, we must eliminate all uses of the tmp_arg1 and tmp_arg2 variables. This commit eliminates the use of i_fetch in combination with the arithmetic and logical instructions. While we are touching the code for the bsr and bsl instructions, also move the tmp_big[] array from the top scope of process_main() into the block that encloses the bsr and bsl instructions.
2015-07-03	Make the 'r' operand type optional	Björn Gustavsson
	The 'r' type is currently mandatory. That means that in order to handle both of the following instructions: move x(0) y(7) and move x(1) y(7), we would need to define two specific operations in ops.tab: move r y and move x y. We want to make 'r' operands optional. That is, if we have only this specific instruction: move x y, it will match both of the following instructions: move x(0) y(7) and move x(1) y(7). Making 'r' optional allows us to save code space when we don't want to make handling of x(0) a special case, but we can still use 'r' to optimize commonly used instructions.
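The matching rule in this message can be modelled as "prefer an 'r' variant for x(0), otherwise fall back to 'x'". The Python sketch below is an assumption-laden illustration, not how beam_makeops or the loader is written: the (kind, index) operand encoding, the set-based table of specific operations, and the function names are all invented for the example.

```python
# Illustrative model only. Operands are (kind, index) pairs such as ("x", 0);
# specific_ops is a set of (name, type_tuple) entries standing in for ops.tab.

def operand_types(operand):
    """Return candidate type letters for an operand, most specific first."""
    kind, index = operand
    if kind == "x" and index == 0:
        return ["r", "x"]  # x(0) prefers 'r' but may fall back to 'x'
    return [kind]


def select_specific(name, operands, specific_ops):
    """Pick the first specific operation whose operand types match."""
    def candidates(ops):
        if not ops:
            yield ()
            return
        for type_letter in operand_types(ops[0]):
            for rest in candidates(ops[1:]):
                yield (type_letter,) + rest

    for types in candidates(operands):
        if (name, types) in specific_ops:
            return (name, types)
    return None
```

With only ("move", ("x", "y")) in the table, both move x(0) y(7) and move x(1) y(7) resolve to the same entry; adding ("move", ("r", "y")) makes x(0) pick the specialized variant.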
2015-07-03	Allow X and Y registers to be overloaded with any literal	Björn Gustavsson
	Consider the try_case_end instruction: try_case_end s. The 's' operand type means that the operand can either be a literal of one of the types atom, integer, or empty list, or a register. That worked well before R12. In R12 additional types of literals were introduced. Because of the way the overloading was done, an 's' operand cannot handle the new types of literals. Therefore, code such as the following is necessary in ops.tab to avoid giving an 's' operand a literal: try_case_end Literal=q => move Literal x | try_case_end x. While this works, it is error-prone in that it is easy to forget to add that kind of rule. It would also be complicated in case we wanted to introduce a new kind of addition operator such as: i_plus jssd. Since there are two 's' operands, two scratch registers and two 'move' instructions would be needed. Therefore, we'll need to find a smarter way to tag register operands. We will overload the pid and port tags for X and Y registers, respectively. That works because pids and ports are immediate values (fit in one word), and there are no literals for pids and ports.
2015-06-18	Change license text to APLv2	Bruce Yinhe

2015-04-27	erts: Specialize minus and plus instruction	Björn-Egil Dahlberg
	Seen in the SSL application, where subtraction with x registers was prevalent: * i_minus specialization on x registers * i_plus specialization on x registers
2015-04-27	erts: Add move2 specialization for common move patterns	Björn-Egil Dahlberg
	Common pattern seen in SSL: move y x | move r x -> move2; move r x | move y x -> move2. Common pattern seen in SSL and Compiler: move x r | move x x -> move2.
2015-04-24	erts: Specialize rem instruction for common case	Björn-Egil Dahlberg
	* i_rem specialization on x registers
2015-04-24	erts: Specialize band instruction for common case	Björn-Egil Dahlberg
	* i_band specialization on x registers and constants
2015-04-23	erts: Add move window instruction	Björn-Egil Dahlberg
	Move an entire region of x registers to the stack. This reduces the dispatch pressure of move instructions. Also introduce a move2 specialization for some common move patterns: move r y | move x y -> move2 (as above, moving regions to the stack); move x r | move x y -> move2 (a seemingly common pattern).
2015-04-23	erts: Add instruction move3 for xy and xx	Björn-Egil Dahlberg

2015-04-23	erts: Specialize compare instructions	Björn-Egil Dahlberg
	* i_is_lt for r, x registers and constants * i_is_ge for x registers and constants * i_is_exact_eq for r and x registers
2015-04-13	Pre-compute hash values for the general get_map_elements instruction	Björn Gustavsson
	See the previous commit for justification and use cases.
2015-04-13	Teach the loader to pre-compute the hash value for single-key lookups	Björn Gustavsson
	Let the loader pre-compute the hash value when a single, literal key is matched as in: #{<<"some_key">>:=V} = Map. In my measurements, this optimization resulted in a 30 percent speedup for short binary keys. Unfortunately, this optimization makes no difference for small maps with fewer than 32 keys, since the hash value is not used. Still, there are the following use cases: * A map used instead of a record with more than 32 entries. I have seen some applications with huge records. * Lookup in JSON dictionaries represented as maps. The hash value will only be used when the map is a hash map (currently, that means at least 32 entries).
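The idea of hashing a literal key once at load time, and using that hash only for large maps, can be sketched as follows. This Python model is an illustration under stated assumptions: the instruction name get_map_element_with_hash, the ("flat" | "hash", data) map encoding, and the use of Python's built-in hash() are stand-ins and do not reflect the emulator's data structures. The size threshold of 32 is taken from the message.

```python
# Illustrative model only; names and encodings below are hypothetical.

HASHMAP_MIN_SIZE = 32  # per the message: hash maps currently start at 32 entries


def make_map(pairs):
    """Build a small 'flat' map or a large 'hash' map from (key, value) pairs."""
    if len(pairs) < HASHMAP_MIN_SIZE:
        return ("flat", sorted(pairs))
    buckets = {}
    for key, value in pairs:
        buckets.setdefault(hash(key), []).append((key, value))
    return ("hash", buckets)


def load_get_map_element(key):
    """Loader step: hash the literal key once and store it in the instruction."""
    return ("get_map_element_with_hash", key, hash(key))


def get_map_element(instr, erl_map):
    """Run-time step: the stored hash is only consulted for hash maps."""
    _, key, key_hash = instr
    kind, data = erl_map
    if kind == "flat":
        for k, v in data:  # linear scan; the pre-computed hash is unused
            if k == key:
                return v
        return None
    for k, v in data.get(key_hash, []):  # no hashing needed at run time
        if k == key:
            return v
    return None
```

The flat-map branch shows why the optimization cannot help small maps: the lookup never touches the stored hash value.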
2015-04-13	Optimize use of i_get_map_element/4	Björn Gustavsson
	In the i_get_map_element/4 instruction, for literal keys other than atoms, the key would be put into x[0] instead of used directly in the instruction. The reason is that the original implementation of maps only supported atom keys.
2015-04-13	Sort maps keys in the loader	Björn Gustavsson
	The map instructions require that the keys in the instructions are sorted (for flatmaps). But that is an implementation detail that should not be exposed outside of the BEAM virtual machine. Therefore, make the sorting of the keys the responsibility of the loader and not the compiler. Also note that the sort order for maps with numeric keys or keys with numeric components has changed in OTP 18. That means that code compiled for OTP 17 that operated on maps with map keys might not work in OTP 18 without the sorting in the loader (although it is unlikely to be an issue in practice).
2015-04-13	De-optimize the has_map_fields instructions	Björn Gustavsson
	The has_map_fields instruction is infrequently used. Thus there is no need to have the fastest possible implementation; it is better to have an implementation that reduces the code size in the already big process_main() function. We can transform has_map_fields to a get_map_elements instruction, targeting the same unused x[0] register for all keys. That instruction will only be marginally slower than the existing implementation.
2015-04-13	Fully evaluate is_map/1 for literals at load-time	Björn Gustavsson
	The compiler will only emit is_map/1 instructions with literal argument if optimization is turned off. Therefore, the only reason for this commit is cleanliness.
2015-04-13	Remove the fail label operand of the new_map instruction	Björn Gustavsson
	The new_map instruction cannot fail, and thus needs no fail label.
2015-04-13	Correct transformation of put_map_assoc to new_map	Björn Gustavsson
	A put_map_assoc instruction with an empty source map should be converted to a simpler new_map instruction. The transformation didn't happen because an empty source map is no longer represented as a NIL term (as it was in the beginning before map literals were implemented).
2015-04-13	Remove support for put_map_exact without a source map	Björn Gustavsson
	Using the exact operator (':=') is only allowed when an existing map is being updated. Thus the following causes a compilation error: #{k:=v} Therefore there is no need to support the put_map_exact instruction without a source map.
2014-12-05	erts: Use linear search for small select_val arrays	Björn-Egil Dahlberg
	When searching for a key in an array, we use linear search in arrays of up to 10 elements. Selecting on tuple arity will always use linear search. Instead of using two different instructions, we assume that the number of different tuple arities being selected on is always small.
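The size-based choice between the two search strategies can be sketched in Python as below. This is only a model of the idea: the function signature, the parallel keys/labels lists, and treating the limit of 10 as inclusive are assumptions for illustration, and binary search via bisect stands in for whatever the emulator uses on larger arrays.

```python
# Illustrative model only; the emulator's select_val is written in C.
from bisect import bisect_left

LINEAR_LIMIT = 10  # threshold named in the commit message, assumed inclusive


def select_val(value, keys, labels, fail_label):
    """Return the jump label for value, or fail_label if there is no match.

    keys must be sorted and labels[i] is the target for keys[i].
    """
    if len(keys) <= LINEAR_LIMIT:
        for key, label in zip(keys, labels):  # small array: plain linear scan
            if key == value:
                return label
        return fail_label
    i = bisect_left(keys, value)  # larger array: binary search
    if i < len(keys) and keys[i] == value:
        return labels[i]
    return fail_label
```

For short arrays the linear scan avoids the branching overhead of a binary search, which is the motivation the message gives for the cutoff.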
2014-03-17	erts: Handle literals in is_map/1	Björn-Egil Dahlberg

2014-02-21	erts: Maps src instructions can't be literals	Björn-Egil Dahlberg
	Move src to a register if it is a literal.
2014-02-19	erts: Introduce new instructions for combined key fetches	Björn-Egil Dahlberg

2014-02-05	erts: Fix Maps for beam_load	Björn-Egil Dahlberg
	Map source may be anything, not only registers.
2014-01-28	compiler: Implement different instructions for => and :=	Björn Gustavsson

2014-01-28	erts: Maps beam-instruction definitions	Björn Gustavsson

2014-01-28	Implement support for maps in the compiler	Björn Gustavsson
	To make it possible to build the entire OTP system, also define dummies for the instructions in ops.tab.
2013-11-18	Execution of system tasks in context of another process	Rickard Green
	A process requesting a system task to be executed in the context of another process will be notified by a message when the task has executed. This message will have the form: {RequestType, RequestId, Pid, Result}. A process requesting a system task to be executed can set a priority on the system task. The requester typically sets the same priority on the task as its own process priority, thereby avoiding priority inversion. A request for execution of a system task is made by calling the statically linked-in NIF erts_internal:request_system_task(Pid, Prio, Request). This is an undocumented ERTS internal function that should remain so. It should only be called from BIF implementations. Currently defined system tasks are: * garbage_collect * check_process_code Further system tasks can and will be implemented in the future. The erlang:garbage_collect/[1,2] and erlang:check_process_code/[2,3] BIFs are now implemented using system tasks. Both the 'garbage_collect' and the 'check_process_code' operations perform or may perform garbage collections. By doing these via the system task functionality, all garbage collect operations in the system will be performed solely in the context of the process being garbage collected. This makes it possible to later implement functionality for disabling garbage collection of a process over context switches. Newly introduced BIFs: * erlang:garbage_collect/2 - The new second argument is an option list. Introduced option: * {async, RequestId} - making it possible for users to issue asynchronous garbage collect requests. * erlang:check_process_code/3 - The new third argument is an option list. Introduced options: * {async, RequestId} - making it possible for users to issue asynchronous check process code requests. * {allow_gc, boolean()} - making it possible to issue requests that aren't allowed to garbage collect (the operation will abort if a garbage collection is needed). These options have been introduced as a preparation for parallelization of check_process_code operations when the code_server is about to purge a module.
2013-02-22	Update copyright years	Björn-Egil Dahlberg

2013-02-07	BEAM loader: Handle element(Pos, not_a_tuple)	Björn Gustavsson
	The loader failed to load non-optimized BEAM code generated from: element(2, not_a_tuple) Commit ece4c17d2288a3161c995 introduced such code into core_fold_SUITE, leading to core_fold_no_opt_SUITE and core_fold_post_opt_SUITE failing to load.
2012-06-25	Don't go to single-scheduler mode when managing breakpoints	Björn Gustavsson
	Calls to erlang:set_trace_pattern/3 will no longer block all other schedulers. We will still go to single-scheduler mode when new code is loaded for a module that is traced, or when loading code when there is a default trace pattern set. That is not impossible to fix, but that requires much closer cooperation between tracing BIFs and the loader BIFs.
2012-06-25	Change the data structures for breakpoints	Björn Gustavsson
	Change the data structures for breakpoints to make it possible (in a future commit) to manage breakpoints without taking down the system to single-scheduling mode. The current "breakpoint wheel" data structure (a circular, double-linked list of breakpoints) was invented before the SMP emulator. To support it in the SMP emulator, there is essentially one breakpoint wheel per scheduler. As more breakpoint types have been added, the implementation has become messy and hard to understand and maintain. Therefore, the time for a rewrite has come. Use one struct to hold all breakpoint data for a breakpoint in a function. Use a flag field to indicate which types of break actions are enabled.
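The "one struct plus a flag field" design can be illustrated with a small Python model. Everything named here is hypothetical: the BreakAction members, the Breakpoint fields, and the methods are chosen for the example and are not taken from the emulator's C definitions.

```python
# Illustrative model only; flag names and fields are assumptions.
from dataclasses import dataclass
from enum import IntFlag


class BreakAction(IntFlag):
    LOCAL_TRACE = 1
    META_TRACE = 2
    CALL_COUNT = 4
    CALL_TIME = 8


@dataclass
class Breakpoint:
    """All breakpoint data for one function, with enabled actions as bit flags."""
    flags: BreakAction = BreakAction(0)
    call_count: int = 0

    def enable(self, action):
        self.flags |= action

    def disable(self, action):
        self.flags &= ~action

    def is_enabled(self, action):
        return bool(self.flags & action)
```

Keeping every action in one record means that enabling or disabling a break type is a single flag update rather than an insertion into or removal from a per-type list.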
2012-03-30	Merge branch 'maint'	Björn-Egil Dahlberg

2012-03-30	Update copyright years	Björn-Egil Dahlberg

2012-03-22	Merge branch 'maint'	Patrik Nyblom
	Conflicts: erts/emulator/beam/beam_emu.c erts/emulator/beam/bif.tab erts/preloaded/ebin/prim_file.beam lib/hipe/cerl/erl_bif_types.erl
2012-03-22	Rename dyntrace BIFs to more suiting names	Patrik Nyblom

2012-03-22	If VM probes are not enabled, short-circuit calls to probe BIFs	Björn Gustavsson