Age | Commit message | Author |
|
|
|
|
|
|
|
Introduce HAllocX to allocate heap fragments with a larger capacity
than requested and by that reduce the number of fragments allocated.
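
A minimal sketch of the over-allocation idea, not the emulator's actual code. The names `Frag`, `frag_alloc_x`, `frag_count`, and the `extra` parameter are hypothetical and exist only for this illustration; the real HAllocX interface may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical heap fragment with room for more than one request. */
typedef struct Frag {
    size_t used;            /* words handed out so far */
    size_t capacity;        /* total words in this fragment */
    struct Frag *next;      /* previously allocated fragment */
    unsigned long words[];  /* storage (flexible array member) */
} Frag;

Frag *frags = NULL;   /* most recent fragment first */
int frag_count = 0;   /* number of fragments allocated */

/* Return `need` words. When a new fragment has to be created, give it
 * `extra` additional words so later requests can be served from it. */
unsigned long *frag_alloc_x(size_t need, size_t extra)
{
    if (frags == NULL || frags->capacity - frags->used < need) {
        Frag *f = malloc(sizeof(Frag) + (need + extra) * sizeof(unsigned long));
        assert(f != NULL);
        f->used = 0;
        f->capacity = need + extra;
        f->next = frags;
        frags = f;
        frag_count++;
    }
    unsigned long *p = &frags->words[frags->used];
    frags->used += need;
    return p;
}
```

With `extra` set to 12, three consecutive 4-word requests in this sketch are served from a single fragment instead of three.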
|
|
|
|
|
|
|
|
Did not properly handle the case when TryMeElse restarted
with the next match clause.
|
|
|
|
Faulty use of term on C-stack in heap_dump()
|
|
|
|
|
|
|
|
|
|
|
|
In the halfword emulator, make ETS use a variant of the internal term
format that uses relative offsets instead of absolute pointers. This
will allow storage in high memory (>4G). Preprocessor macros (like
list_val_rel(TERM,BASE)) are used to keep the normal (fullword) emulator
almost completely unchanged while still reusing most of the code.
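
A minimal sketch of the relative-offset technique, under simplifying assumptions. The type `HalfTerm`, the tag values, and the functions `make_list_rel` and `list_val_rel_sketch` are hypothetical; only the name list_val_rel(TERM,BASE) comes from the commit message, and the real macro may be defined differently.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t HalfTerm;   /* hypothetical 32-bit term word */

#define TAG_MASK 0x3u        /* assumed: low two bits hold the tag */
#define TAG_LIST 0x1u        /* assumed tag value for a list cell */

/* Encode a pointer to a cons cell as a tagged byte offset from base. */
HalfTerm make_list_rel(const HalfTerm *cell, const HalfTerm *base)
{
    uintptr_t off = (uintptr_t)((const char *)cell - (const char *)base);
    assert((off & TAG_MASK) == 0 && off <= UINT32_MAX);
    return (HalfTerm)off | TAG_LIST;
}

/* Decode: strip the tag and add the offset back onto base. */
HalfTerm *list_val_rel_sketch(HalfTerm term, HalfTerm *base)
{
    assert((term & TAG_MASK) == TAG_LIST);
    return (HalfTerm *)((char *)base + (term & ~TAG_MASK));
}
```

Because the stored word is an offset rather than an address, the base itself can live anywhere in a 64-bit address space while each term still fits in 32 bits.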
|
|
* bjorn/beam-loader/OTP-9030: (43 commits)
c: Reduce memory footprint
erl_posix_msg: Reduce memory footprint
Introduce a few more variations of the move instructions
Combine a move + jump sequence into the move_jump instruction
Optimize and clean-up the exact equality/non-equality instructions
Optimize addition of a small integer to a variable
Introduce a special instruction for select_val with two values
Introduce a few more specialized put_list instructions
Eliminate the "put_list c n Dst" instructions
Eliminate the specific move_sd instruction
Eliminate use of GetArg1() in the badmatch and case_end instructions
Eliminate use of GetArg2() in the i_element instruction
Eliminate use of GetArg1() in the fast_element instruction
Eliminate use of GetArg1() in the jump_on_val* instructions
Eliminate use of GetArg1() in the select_val instruction
beam_emu: Eliminate sloppy use of tmp_arg1 and tmp_arg2
beam_emu: Don't inline helper functions into process_main()
beam_emu: Clean up calling of the error_handler module
Simplify a select_val instruction that selects only one value
Optimize creation of tuples
...
|
|
Frequency counts show that
move Const x(1)
move Const x(2)
are very common.
|
|
That will save one word and a small amount of time for
each occurrence.
|
|
The is_eq_exact/3 and is_ne_exact/3 instructions are commonly used
with one immediate or literal operand.
Introduce three new specialized instructions:
i_is_eq_exact_literal/3
i_is_ne_exact_immed/3
i_is_ne_exact_literal/3
The i_is_ne_exact_literal/3 instruction is not very frequently
used, but its existence is justified because a previous commit
removed the special instruction for matching bignums
and we now use i_is_ne_exact_literal/3 instead.
For consistency, rename the existing is_eq_immed/3 instruction to
is_eq_exact_immed/3.
While at it, remove the optimization of an is_eq/3 instruction
with an immediate operand because that optimization is already
done by the compiler.
|
|
Introduce a new i_increment/4 to optimize the addition of
a register and a small integer. This instruction saves two
instruction words compared to the standard instructions
(an i_fetch/2 instruction followed by an i_plus/3 instruction)
and will also be slightly faster.
|
|
The new instruction will save one word (because no size operand
is needed), and is slightly faster.
Handle select_tuple_arity in the same way.
|
|
|
|
Since the literal (constant) pool was introduced in R12, the
BEAM compiler will never generate a "put_list Const [] Dst"
instruction (it will instead generate a "move [Const] Dst"
instruction).
|
|
The move_sd specific instruction is no longer used since there
are specific move instructions covering all possible permutations
of operands. Also eliminate the move_cy instruction because it
is almost never generated by the compiler.
|
|
Create separate instructions for each register type. The "badmatch x(0)"
and "case_end x(0)" (which are very common) will only require a single
word each, compared to two words when GetArg1() is used.
|
|
Use separate instructions for each register type.
|
|
Use separate instructions for each register type.
|
|
|
|
Instead of having one i_select_val_sfI instruction that uses
the GetArg1() macro to fetch the controlling expression, use
three separate instructions for each of the register types.
That will save one word when selecting on the {x,0} register.
It should also be slightly faster since a conditional branch
is eliminated.
Although it seems that the BEAM compiler will never generate
a constant controlling expression (even with optimizations
turned off), we still make sure that they will work by
evaluating the select_val instruction at load time.
Handle the select_tuple_arity instruction in the same way.
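
The difference between a generic operand fetch and per-register-type instructions can be sketched as follows. This is an illustration only: `Regs`, the `REG_*` constants, and the `fetch_*` functions are hypothetical and do not reproduce the GetArg1() macro.

```c
#include <assert.h>

enum { REG_R0, REG_X, REG_Y };   /* hypothetical register kinds */

typedef struct {
    long x[8];   /* x registers; x[0] stands in for {x,0} */
    long y[8];   /* stack slots */
} Regs;

/* Generic form: an operand says where the value lives, and a branch
 * on that operand runs every time the instruction executes. */
long fetch_generic(const Regs *r, int kind, int index)
{
    switch (kind) {
    case REG_R0: return r->x[0];
    case REG_X:  return r->x[index];
    default:     assert(kind == REG_Y); return r->y[index];
    }
}

/* Specialized forms: the opcode itself implies the register kind, so
 * no branch is needed, and the {x,0} variant needs no operand at all. */
long fetch_r(const Regs *r)            { return r->x[0]; }
long fetch_x(const Regs *r, int index) { return r->x[index]; }
long fetch_y(const Regs *r, int index) { return r->y[index]; }
```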
|
|
The tmp_arg1 and tmp_arg2 variables are intended for transferring
values from the fetch/2 instructions to instructions such as
i_plus/3. In many places, however, tmp_arg1 and tmp_arg2 are used
as general temporary variables within a single instruction.
Improve the code generation by replacing sloppy use of tmp_arg1
and tmp_arg2 with block-local variables. In most cases, that will
allow the temporary values to be kept in registers.
|
|
By default, GCC will inline calls to helper functions. Since
process_main() is already huge, there is no reason to inline
the helper functions (and some of them are used very seldom).
|
|
There were two separate functions (call_error_handler() and
call_breakpoint_handler()) that were identical except for
the name of the function in the error_handler module being
called. Generalize call_error_handler() by adding a function
name argument so that it can be used for both purposes.
Also let call_error_handler() return the new program
counter instead of passing it in c_p->i. That slightly decreases
the code size at the call site.
There is also no need to use the Dispatch() macro to yet again
decrease the reduction counter, because that has just been done by
the call instruction that caused the execution of the
call_error_handler or i_debug_breakpoint instruction.
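
The shape of this refactoring can be sketched as below. Everything here is a stand-in: the `Pc` type, the `handler_table` contents, the handler names used as keys, and `call_handler` are hypothetical and only illustrate passing the function name as an argument and returning the new program counter.

```c
#include <assert.h>
#include <string.h>

typedef int Pc;   /* stand-in for a program counter */

/* Hypothetical table of entry points in a handler module. */
struct Entry { const char *name; Pc pc; };
const struct Entry handler_table[] = {
    { "undefined_function", 100 },
    { "breakpoint",         200 },
};

/* One function serves both callers: the handler's function name is a
 * parameter, and the new program counter is the return value rather
 * than a field written as a side effect. Returns -1 if not found. */
Pc call_handler(const char *func_name)
{
    assert(func_name != NULL);
    for (size_t i = 0; i < sizeof handler_table / sizeof handler_table[0]; i++) {
        if (strcmp(handler_table[i].name, func_name) == 0)
            return handler_table[i].pc;
    }
    return -1;
}
```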
|
|
The compiler does not generate select_val instructions that only
select one value, but the loader may previously have created such
an instruction when it split a select_val instruction that
selected on bignums.
|
|
Combine the put_tuple/2 and all following put/1 instructions
into one i_put_tuple/2 instruction. In general, that will reduce
the number of instruction words by 50 percent.
Measurements seem to indicate that the speed is about the same.
|
|
|
|
|
|
|
|
In many (not all) cases, the value for the 'I' type will
fit into 32 bits.
|
|
We don't want the packable types listed in two places.
|
|
|
|
|
|
|
|
|
|
Introduce a new 'Q' type, similar to 'P' except that it
can be packed.
|
|
In the 32-bit BEAM emulator, it is only possible to pack
3 register operands into one word. Therefore, the move2
instruction (that has 4 operands) needs two words for its
operands.
Take advantage of the larger wordsize in the 64-bit emulator
and pack up to 4 operands into a single word.
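
A minimal sketch of packing four operands into one 64-bit word. The 16-bit field width and the names `FIELD_BITS`, `pack4`, and `unpack` are assumptions made for this illustration; the field layout actually generated for the emulator may differ.

```c
#include <assert.h>
#include <stdint.h>

#define FIELD_BITS 16u   /* assumed width of each packed operand */
#define FIELD_MASK ((UINT64_C(1) << FIELD_BITS) - 1)

/* Pack four small operands into one 64-bit instruction word. */
uint64_t pack4(uint16_t a, uint16_t b, uint16_t c, uint16_t d)
{
    return (uint64_t)a
         | ((uint64_t)b << FIELD_BITS)
         | ((uint64_t)c << (2 * FIELD_BITS))
         | ((uint64_t)d << (3 * FIELD_BITS));
}

/* Extract operand number n (0..3) by shifting and masking. */
uint16_t unpack(uint64_t word, unsigned n)
{
    assert(n < 4);
    return (uint16_t)((word >> (n * FIELD_BITS)) & FIELD_MASK);
}
```

A 32-bit word cannot hold four fields of this width, which is why a four-operand instruction such as move2 needs a second word there.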
|
|
Giving the beam_makeops script access to the external word
size (=the size of instruction words) will allow it to pack
more operands into a word for the 64-bit emulator.
|
|
In the transformation engine in the loader, an is_eq/1 instruction
is currently always preceded by an is_type/1 instruction. Therefore,
save a word and a slight amount of time by combining those
instructions into an is_type_eq/2 instruction.
|
|
|
|
The i_jump_on_val_zero/3 and i_select_tuple_arity/3 instructions
were not disassembled correctly.
|