Age | Commit message | Author |
|
* bjorn/erts/pack-combined:
Pack combined instructions
beam_makeops: Refactor code generation
Correct disassembly of select instructions
|
|
* lukas/erts/remove-dirty-scheduler-defines/OTP-14613:
erts: Remove possibility to disable dirty schedulers
|
|
|
|
The refactoring will simplify packing of combined instructions.
|
|
* bjorn/erts/relative-jumps:
Pack failure labels in i_select_val2 and i_select_tuple_arity2
Optimize i_select_tuple_arity2 and is_select_lins
Rewrite select_val_bins so that its labels can be packed
Pack sequences of trailing 'f' operands
Implement packing of 'f' and 'j'
Make sure that mask literals are 64 bits
Use relative failure labels
Add information about offset to common group start position
Remove JUMP_OFFSET
Refactor instructions to support relative jumps
Introduce a new trace_jump/1 instruction for tracing
Avoid using $Src more than once
|
|
|
|
Use the "ull" suffix for the mask literals instead of "ul"
to ensure that the literals are 64 bits also on Windows.
|
|
|
|
|
|
|
|
The right side of a transformation must be either a single call
to a transformation function OR a list of instructions. Mixing
them like this is not supported:
some_instruction A B C => gen_something(A) | other B C
Unfortunately, beam_makeops would silently ignore anything after the
function call, basically handling it in the same way as:
some_instruction A B C => gen_something(A)
Add a sanity check to reject such mixed right-hand sides.
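The kind of check described above can be sketched as follows. This is only an illustration in Python: beam_makeops itself is a Perl script, and the helper name check_rhs and the regular expression are assumptions made for this example, not the actual implementation.

```python
import re

# Hypothetical helper; the real check in beam_makeops (Perl) may differ.
# A "call" is something that looks like name(args).
CALL_RE = re.compile(r"^\w+\(.*\)$")

def check_rhs(rhs):
    """Classify a right-hand side as 'call' or 'instructions'.

    Raises ValueError when a transformation function call is mixed
    with other instructions, instead of silently ignoring the rest.
    """
    parts = [p.strip() for p in rhs.split("|")]
    calls = [p for p in parts if CALL_RE.match(p)]
    if calls and len(parts) > 1:
        raise ValueError("mixed right-hand side not supported: %r" % rhs)
    return "call" if calls else "instructions"
```

With this sketch, `check_rhs("gen_something(A)")` and `check_rhs("line Loc | move S X0")` are accepted, while `check_rhs("gen_something(A) | other B C")` raises an error.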
|
|
Make the generated code easier to read.
|
|
The bit syntax instructions are mixed among other instructions
in beam_hot.h and beam_cold.h.
Introduce a new hotness level called '%warm' with its associated
file beam_warm.h. Mark all bit syntax instructions as '%warm'.
|
|
Generated code uses 'I' explicitly in other places, so it
might as well use 'I' when accessing the operands for instructions.
|
|
The beam_instrs.h file serves no useful purpose. Put the
instructions in beam_hot.h instead.
|
|
The type 'd' could be used both for destination registers and
source registers.
Restrict the 'd' type to only be used for destinations, and
introduce the new 'S' type to be used when a source must be
a register.
|
|
Cold instructions used to be cooler (less frequently executed),
so it did not seem worthwhile to pack their operands. Now bit
syntax instructions are included among the cold instructions,
and they are frequently used.
|
|
Update the pack engine to safely push literal operands to the pack
stack and to safely pop them back to another code address. That
will allow packing of more instructions.
|
|
The packer had several bugs and limitations. For instance, on
a 32-bit Erlang virtual machine it would gladly pack three
't' values into one word even though it would not be safe.
The rewritten version will be more careful about how much it packs
into each word. It will also be able to do packing for more
instructions.
|
|
As a preparation for potentially improving packing in the future,
we will need to make sure that packable types have a defined maximum
size.
The packer algorithm assumes that two 'I' operands can be packed
into one 64-bit word, but there are instructions that use an 'I'
operand to store a pointer. It only works because those instructions
are not packed for other reasons.
Introduce the 'W' type and use it for operands that don't fit in
32 bits.
|
|
I don't remember what they were used for, but they are certainly
no longer used.
|
|
If a type has a size in %arg_size, it should also have
a defined pattern in %bit_type.
|
|
BEAM_WIDE_MASK covered the 16 right-most bits, instead of the 32
right-most bits. This bug will bite us when we do more packing in
the future.
This bug has been harmless in the past. It has been used in
test_heap and allocate instructions for the number of heap words
needed. It would be theoretically possible to construct a program
that would need 65536 or more heap words, but it is hard to imagine
a practical use for such a program. (The program would have to build
a tuple or list with at least one variable and the rest of the elements
being literals.)
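The effect of a mask that is too narrow can be illustrated with a small Python sketch. The constant names NARROW_MASK and WIDE_MASK and the helper unpack_low are made up for this example; only the 16-bit versus 32-bit widths come from the text above.

```python
# Illustrative constants, not the actual definitions in the emulator.
NARROW_MASK = 0xFFFF        # covers only the 16 right-most bits (the bug)
WIDE_MASK = 0xFFFFFFFF      # covers the 32 right-most bits (the intent)

def unpack_low(word, mask):
    """Extract the operand stored in the low bits of a packed word."""
    return word & mask

heap_words = 65536          # smallest value that no longer fits in 16 bits

# With the narrow mask the heap need is silently truncated to zero.
truncated = unpack_low(heap_words, NARROW_MASK)
# With the wide mask the value survives intact.
intact = unpack_low(heap_words, WIDE_MASK)
```

Values up to 65535 are unaffected by the narrow mask, which is consistent with the bug having been harmless in practice.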
|
|
|
|
|
|
|
|
beam_makeops will place all micro instructions in a block and generate
goto instructions from one micro instruction to the next. It will also
add adjustments of 'I' if necessary (if the micro instructions have
different length).
|
|
Eliminate the need to write pre-processor macros for each instruction.
Instead allow the implementation of an instruction to be written in
C directly in the .tab files. Rewrite all existing macros in this
way and remove the %macro directive.
|
|
|
|
Introduce syntactic sugar so that we can write:
get_list xy xy xy
instead of:
get_list x x x
get_list x x y
get_list x y x
get_list x y y
get_list y x x
get_list y x y
get_list y y x
get_list y y y
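The expansion is a Cartesian product over the type letters of each operand. A minimal Python sketch, assuming each operand is written as a string of single-letter types (the function name expand is hypothetical; beam_makeops does this in Perl):

```python
from itertools import product

def expand(line):
    """Expand e.g. 'get_list xy xy xy' into one line per type combination.

    Each operand is treated as a set of single-letter types; the result
    lists every combination in left-to-right order.
    """
    name, *operands = line.split()
    return [" ".join([name, *combo]) for combo in product(*operands)]

variants = expand("get_list xy xy xy")
```

Here `variants` holds the eight specific instructions listed above, in the same order, and an operand with a single type letter simply stays as it is.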
|
|
In Perl 5, '&' on direct subroutine calls is optional.
|
|
Instructions that take a 'd' argument need a -gen_dest flag in their
macros. For example:
%macro:put_list PutList -pack -gen_dest
put_list s s d
-gen_dest was needed when x(0) was stored in a register, since it is
not possible to take the address of a register. Now that x(0) is stored
in memory and we can take the address, we can eliminate -gen_dest.
|
|
|
|
26b59dfe67 introduced the new 'AtU8' chunk to support
Unicode atoms.
make_preload strips the pre-loaded BEAM files so that they
only contain essential chunks. It expects to find the old
'Atom' chunk.
Teach make_preload to read the new 'AtU8' chunk instead of the old
chunk. Also produce a nice error message if someone by mistake
compiles the pre-loaded modules with an OTP 19 compiler.
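As a rough illustration of the chunk lookup, here is a Python sketch that walks the IFF-style container of a BEAM file ("FOR1", a 32-bit size, "BEAM", then chunks padded to four bytes) and prefers 'AtU8'. The function names and the error text are assumptions for this example; make_preload itself is a separate script and may be structured differently.

```python
import struct

def beam_chunks(data):
    """Return {chunk_id: payload} for the bytes of a BEAM file."""
    if data[:4] != b"FOR1" or data[8:12] != b"BEAM":
        raise ValueError("not a BEAM file")
    chunks, pos = {}, 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack(">4sI", data[pos:pos + 8])
        chunks[chunk_id.decode("latin-1")] = data[pos + 8:pos + 8 + size]
        pos += 8 + ((size + 3) & ~3)    # payloads are padded to 4 bytes
    return chunks

def atom_chunk(data):
    """Return the 'AtU8' payload, or explain why it is missing."""
    chunks = beam_chunks(data)
    if "AtU8" in chunks:
        return chunks["AtU8"]
    if "Atom" in chunks:
        # Hypothetical wording; the real message may differ.
        raise ValueError("only the old 'Atom' chunk was found; was the "
                         "module compiled with an OTP 19 compiler?")
    raise ValueError("no atom chunk found")
```

Checking for the old 'Atom' chunk separately is what makes a specific error message possible instead of a generic "chunk missing" failure.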
|
|
|
|
|
|
Broken on master by f0f4e72c8ec5c08993ff.
|
|
* bjorn/gc-bifs:
compiler: Eliminate num_bif_SUITE
erl_internal: Eliminate duplication of guard tests
beam_debug: Improve the disassembly of gc_bif instructions
Simplify creation of new GC BIFs
make_tables: Remove broken automatic BIF aliasing
|
|
Add the BIF type "gcbif" in bif.tab for defining GC BIFs. That will
eliminate some of the hand-written administrative code for handling
GC BIFs, saving the developer's time.
|
|
Before:
$ size bin/x86_64-unknown-linux-gnu/beam.smp
   text     data      bss      dec      hex  filename
3080982   188369   158472  3427823   344def  bin/x86_64-unknown-linux-gnu/beam.smp
After:
$ size bin/x86_64-unknown-linux-gnu/beam.smp
   text     data      bss      dec      hex  filename
3164694   104657   158472  3427823   344def  bin/x86_64-unknown-linux-gnu/beam.smp
|
|
The counters are only used in the special 'icount' emulator.
We will save some memory by including the counters in the
OpEntry. It will also make it possible to make opc 'const'.
|
|
The make_tables script still contains a broken implementation
of a mechanism to automatically give a BIF an additional name
(it was used, for example, for erlang:'++'/2 and erlang:append/2).
When that feature broke, it was worked around by adding
additional entries to bif.tab. There is therefore no reason
to mend the feature.
|
|
Mark the preloaded code 'const' to allow the compiler to put it into
the 'text' segment instead of into the 'data' segment. Since the
'text' segment is shared among all instances of the Erlang virtual
machine, this change could potentially reduce memory consumption
(slightly).
Before the change:
$ size bin/x86_64-unknown-linux-gnu/beam.smp
   text     data      bss      dec      hex  filename
2920246   352273   158472  3430991   345a4f  bin/x86_64-unknown-linux-gnu/beam.smp
After the change:
$ size bin/x86_64-unknown-linux-gnu/beam.smp
   text     data      bss      dec      hex  filename
3081046   191473   158472  3430991   345a4f  bin/x86_64-unknown-linux-gnu/beam.smp
Roughly speaking, this change cuts the size of the data segment in half.
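The numbers above can be cross-checked with a few lines of Python. The dictionary layout and variable names are just for this example; the figures are copied from the size output.

```python
# Segment sizes in bytes, copied from the `size` output above.
before = {"text": 2920246, "data": 352273, "bss": 158472}
after = {"text": 3081046, "data": 191473, "bss": 158472}

def total(sizes):
    """Sum of all segments, i.e. the 'dec' column."""
    return sum(sizes.values())

moved = before["data"] - after["data"]       # bytes that left 'data'
grown = after["text"] - before["text"]       # bytes that arrived in 'text'
ratio = after["data"] / before["data"]       # remaining share of 'data'
```

The total is unchanged, the bytes removed from 'data' match the bytes added to 'text' (160800), and the data segment ends up at a bit more than half of its previous size.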
|
|
* bjorn/erts/beam_load:
Optimize get_tuple_element instructions that target Y registers
Mend beam_SUITE:packed_registers/1
Correct unpacking of 3 operands on 32-bit architectures
Eliminate misleading #ifdef ARCH_64 in beam_opcodes.h
beam_debug: Correct masking when unpacking packed operands
|
|
Add the possibility to use modules as trace data receivers. The functions
in the module have to be NIFs as otherwise complex trace probes will be
very hard to handle (complex means trace probes for ports for example).
This commit changes the way that the ptab->tracer field works from always
being an immediate, to now be NIL if no tracer is present or else be
the tuple {TracerModule, TracerState} where TracerModule is an atom that
is later used to lookup the appropriate tracer callbacks to call and
TracerState is just passed to the tracer callback. The default process and
port tracers have been rewritten to use the new API.
This commit also changes the order in which trace messages are delivered to the
potential tracer process. Any enif_send done in a tracer module may be delayed
indefinitely because of lock order issues. If a message is delayed, any other
trace message sent from that process is also delayed so that order is preserved
for each traced entity. This means that for some trace events (e.g. send/receive)
the events may come in an unintuitive order (receive before send) to the
trace receiver. Timestamps are taken when the trace message is generated, so
trace messages from different processes may arrive with their timestamps
out of order.
Both erlang:trace and seq_trace:set_system_tracer accept the new tracer
module tracers and also the backwards compatible arguments.
OTP-10267
|
|
0a4750f91c83 optimized unpacking by removing a mask operation
when unpacking three packed operands. Unfortunately, that optimization
is only safe on 64-bit architectures.
Here is what happens on 32-bit architectures.
The operands to be packed are 10-bit register numbers that have been
turned into byte offsets:
aaaaaaaaaa00
bbbbbbbbbb00
cccccccccc00
They can be packed into a single word like this:
30         20         10          0
 |          |          |          |
aa aaaaaaaabb bbbbbbbbcc cccccccc00
If we call the packed word P, the original operands can be
extracted like this:
C = P band 2#111111111100
B = (P bsr 10) band 2#111111111100
A = (P bsr 20) band 2#111111111100
The bug was that A was extracted without the masking:
A = P bsr 20
That would give A the value:
aaaaaaaaaabb
That would only be safe if the two most significant bits in B
were zeroes.
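The packing and both unpacking variants can be reproduced in a short Python sketch. The function names are invented for this example and the code only models the 32-bit word arithmetic described above, not the actual generated C code.

```python
MASK = 0b111111111100   # a 10-bit register number turned into a byte offset

def pack3(a, b, c):
    """Pack three byte offsets into one 32-bit word as shown above."""
    return ((a << 20) | (b << 10) | c) & 0xFFFFFFFF

def unpack3(p):
    """Correct unpacking: every operand is masked."""
    return (p >> 20) & MASK, (p >> 10) & MASK, p & MASK

def unpack3_buggy(p):
    """Unpacking without the mask on A, as in the broken optimization."""
    return p >> 20, (p >> 10) & MASK, p & MASK

# Register numbers 5, 1023 and 7 as byte offsets (register number * 4).
a, b, c = 5 * 4, 1023 * 4, 7 * 4
p = pack3(a, b, c)
good = unpack3(p)
bad = unpack3_buggy(p)
```

Because the two most significant bits of B are set here, `bad[0]` picks them up as its two lowest bits and no longer equals `a`, while `good` returns the original operands. With a small B (for example register 1) both variants agree, which is why the bug could go unnoticed.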
|
|
There is a '#ifdef ARCH_64' in beam_opcodes.h, which might make you
think that files generated by beam_makeops will work for both
32-bit and 64-bit architectures. They will not. beam_makeops will
generate different code depending on its -wordsize option.
|
|
* henrik/update-copyrightyear:
update copyright-year
|
|
The removal of instructions on the left side of a transformation
is done while generating the code for the left side.
Postpone removal of unused variables to a later, separate pass to
allow more variables to be eliminated after the optimization
passes introduced in the previous commits.
|
|
In transformations such as:
move S X0=x==0 | line Loc | call_ext Ar Func => \
line Loc | move S X0 | call_ext Ar Func
we can avoid rebuilding the last instruction in the sequence
by introducing a 'keep' instruction.
Currently, there are only 13 transformations that are hit by
this optimization, but most of them are frequently used.
|