Age | Commit message (Collapse) | Author |
|
The is_eq_exact/3 and is_ne_exact/3 instructions are commonly used
with one immediate or literal operand.
Introduce three new specialized instructions:
i_is_eq_exact_literal/3
i_is_ne_exact_immed/3
i_is_ne_exact_literal/3
The i_is_ne_exact_literal/3 instruction is not very frequently
used, but its existence is justified because we removed in a
a previous commit the special instruction for matching bignums
and we now use i_is_ne_exact_literal/3 instead.
For consistency, rename the existing is_eq_immed/3 instruction to
is_eq_exact_immed/3.
While at it, remove the optimization of an is_eq/3 instruction
with an immediate operand because that optimization is already
done by the compiler.
|
|
Introduce a new i_increment/4 to optimize the addition of
a register and a small integer. This instruction saves two
instruction words compared to the standard instructions
(an i_fetch/2 instruction followed by a i_plus/3 instruction)
and will also be slightly faster.
|
|
The new instruction will save one word (because no size operand
is needed), and is slightly faster.
Handle select_tuple_arity in the same way.
|
|
|
|
Since the literal (constant) pool was introduced in R12, the
BEAM compiler will never generate a "put_list Const [] Dst"
instruction (it will instead generate a "move [Const] Dst"
instruction).
|
|
The move_sd specific instruction is no longer used since there
are specific move instructions covering all possible permutations
of operands. Also eliminate the move_cy instruction because it
is almost never generated by the compiler.
|
|
Create separate instructions for each register type. The "badmatch x(0)"
and "case_end x(0)" (which are very common) will only require a single
word each, compared to two words when GetArg1() is used.
|
|
Use separate instructions for each register type.
|
|
Use separate instructions for each register type.
|
|
|
|
Instead of having one i_select_val_sfI instruction that uses
the GetArg1() macro to fetch the controlling expression, use
three separate instructions for each of the register types.
That will save one word when selecting on the {x,0} register.
It should also be slightly faster since a conditional branch
is eliminated.
Although it seems that the BEAM compiler will never generate
a constant controlling expression (even with optimizations
turned off), we still make sure that they will work by
evaluating the select_val instruction at load time.
Handle the select_tuple_arity instruction in the same way.
|
|
The tmp_arg1 and tmp_arg2 variables are intended for transferring
values from the fetch/2 instructions to instructions such as
i_plus/3. In many places, however, tmp_arg1 and tmp_arg2 are used
as general temporary variables within a single instruction.
Improve the code generation by replacing sloppy use of tmp_arg1
and tmp_arg2 with block-local variables. In most cases, that will
allow the temporary values to be kept in registers.
|
|
By default, GCC will inline calls to helper functions. Since
process_main() is already huge, there is no reason to inline
the helper functions (and some of them are used very seldom).
|
|
There were two separate functions (call_error_handler() and
call_breakpoint_handler()) that were identical except for
the name of the function in the error_handler module being
called. Generalize call_error_handler() by adding a function
name argument so that it can be used for both purposes.
Also let the call_error_handler() return the new program
counter instead of passing it in c_p->i. That slightly decrease
the code size at the call site.
There is also no need to use the Dispatch() macro to yet again
decrease the reduction counter, because that has just been done by
the call instruction that caused the execution of the
call_error_handler or i_debug_breakpoint instruction.
|
|
The compiler does not generate select_val instructions that only
selects one value, but the loader may previously have created such
an instruction when it splitted a select_val instruction that
selected on bignums.
|
|
Combine the put_tuple/2 and all following put/1 instructions
to one i_put_tuple/2 instruction. In general, that will reduce
the number of instruction words by 50 percent.
Measurements seem to indicate that the speed is about the same.
|
|
|
|
|
|
In many (not all) cases, the value for the 'I' type will
fit into 32 bits.
|
|
|
|
|
|
|
|
Introduce a new 'Q' type, similar to 'P' except that it
can be packed.
|
|
In the 32-bit BEAM emulator, it is only possible to pack
3 register operands into one word. Therefore, the move2
instruction (that has 4 operands) needs two words for its
operands.
Take advantage of the larger wordsize in the 64-bit emulator
and pack up to 4 operands into a single word.
|
|
In the transformation engine in the loader, an is_eq/1 instruction
is currently always preceded by an is_type/1 instruction. Therefore,
save a word and slight amount of time by combining those
instructions into an is_type_eq/2 instruction.
|
|
The i_jump_on_val_zero/3 and i_select_tuple_arity/3 instructions
were not disassembled correctly.
|
|
|
|
It would only really work in simple case like:
select_val S=q Fail=f Size=u Rest=* => ...
where all operands for a single instruction where bound to
variables, and not for more complicated cases such as:
i_put_tuple Dst Arity Puts=* | put PutSrc => ...
|
|
There was a version of the BEAM loader and emulator that
had two versions of the fmove/2 instruction, one version
that allocated heap space internally and a newer version that
assumed that a previous test_heap/2 instruction had already
allocated the heap space.
Though the allocating fmove/2 instruction is no longer
supported, some vestiges of it still remains.
|
|
erts_debug:instructions/0 is useful for finding which specific
instructions that are not used at all.
|
|
* pan/unicode-filenames/OTP-8887: (27 commits)
Test and correct filelib and filename
Add documentation to erlang.xml and slight correction to unicode_usage.xml
Add section about Unicode file names to stdlib users guide
Correct bug in file_name_SUITE making it fail on Unix instead of Windows7
Add documentation about raw filenames and Unicode file name translation mode
Make filelib not crash on re codepoints beyond 255 in re when filename is raw
Mend on_load_embedded testcase which did not handle windows links
Correct testcase regarding windows versions supporting soft links.
Teach filelib to use re in unicode mode when filenames are not raw
Treat soft links on Windows correctly in file_name_SUITE
Adapt new soft and hard link routines on Windos to Unicode
Corrected testcases broken by unicode filenames
Update preloaded prim_file
Teach prim_file not to accept atoms and not to throw exceptions
Adapt inet_drv to Visual Studio 2008
Teach spawn_executable about Unicode
Convert filenames read on MacOSX to canonical form
Teach file to accept codepoints beyond 255.
Add testcases
Correct shell utilities to handle unicode and possibly binaries
...
|
|
* rickard/load-balance-wrap/OTP-8950:
Prevent wrapping of values used during load balancing
|
|
Some integer values used during load balancing could
under rare circumstances wrap causing a load unbalance
between schedulers.
|
|
|
|
Also corrected compressed files on Windows
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The compressed format is using a slighty modified variant of the extern format
(term_to_binary). To not worsen key lookup's too much, the top tuple itself
and the key element are not compressed. Table objects with only immediate
non-key elements will therefor not gain anything (but actually consume one
extra word for "alloc_size").
|
|
* rickard/cpu-groups/OTP-8861:
Generalize reader groups
Move cpu topology functionality into erl_cpu_topology.[ch]
Do not use more reader groups for schedulers than schedulers
Conflicts:
erts/emulator/beam/erl_init.c
|
|
Reader groups have been generalized to cpu groups which can be
used for implementing reader groups, but also for implementing
other functionality in the future.
|
|
|
|
When the runtime system had fewer schedulers than logical processors,
the system could get an unnecessarily large amount reader groups.
|
|
* rickard/sys_schedule_debug:
Verify that no outstanding I/O exist when checking for I/O in debug build
|
|
|