Age | Commit message (Collapse) | Author |
|
* bjorn/erts/beam_load:
Optimize get_tuple_element instructions that target Y registers
Mend beam_SUITE:packed_registers/1
Correct unpacking of 3 operands on 32-bit archictectures
Eliminate misleading #ifdef ARCH_64 in beam_opcodes.h
beam_debug: Correct masking when unpacking packed operands
|
|
Add the possibility to use modules as trace data receivers. The functions
in the module have to be nifs as otherwise complex trace probes will be
very hard to handle (complex means trace probes for ports for example).
This commit changes the way that the ptab->tracer field works from always
being an immediate, to now be NIL if no tracer is present or else be
the tuple {TracerModule, TracerState} where TracerModule is an atom that
is later used to lookup the appropriate tracer callbacks to call and
TracerState is just passed to the tracer callback. The default process and
port tracers have been rewritten to use the new API.
This commit also changes the order which trace messages are delivered to the
potential tracer process. Any enif_send done in a tracer module may be delayed
indefinitely because of lock order issues. If a message is delayed any other
trace message send from that process is also delayed so that order is preserved
for each traced entity. This means that for some trace events (i.e. send/receive)
the events may come in an unintuitive order (receive before send) to the
trace receiver. Timestamps are taken when the trace message is generated so
trace messages from differented processes may arrive with the timestamp
out of order.
Both the erlang:trace and seq_trace:set_system_tracer accept the new tracer
module tracers and also the backwards compatible arguments.
OTP-10267
|
|
0a4750f91c83 optimized unpacking by removing a mask operation
when unpacking three packed operands. Unfortunately, that optimization
is only safe on 64-bit architectures.
Here is what happens on 32-bit architectures.
The operands to be packed are 10-bit register numbers that have been
turned into byte offsets:
aaaaaaaaaa00
bbbbbbbbbb00
cccccccccc00
They can be packed into a single word like this:
30 20 10 0
| | | |
aa aaaaaaaabb bbbbbbbbcc cccccccc00
If we call the packed word P, the original operands can be
extracted like this:
C = P band 2#111111111100
B = (P bsr 10) band 2#111111111100
A = (P bsr 20) band 2#111111111100
The bug was that A was extracted without the masking:
A = P bsr 20
That would give A the value:
aaaaaaaaaaaabb
That would only be safe if the two most significant bits in B
were zeroes.
|
|
There is a '#ifdef ARCH_64' beam_opcodes.h, which might make you
think that files generated by beam_makeops will work for both
32-bit and 64-bit architectures. They will not. beam_makeops will
generate different code depending on its -wordsize option.
|
|
* henrik/update-copyrightyear:
update copyright-year
|
|
The removal of instructions on the left side of a transformation
is done while generating the code for the left side.
Postpone removal of unused variables to a later, separate passes to
allow more variables to be eliminated after the optimizations
passes introduced in the previous commits.
|
|
In transformations such as:
move S X0=x==0 | line Loc | call_ext Ar Func => \
line Loc | move S X0 | call_ext Ar Func
we can avoid rebuilding the last instruction in the sequence
by introducing a 'keep' instruction.
Currently, there are only 13 transformations that are hit by
this optimization, but most of them are frequently used.
|
|
Introduce a 'rename' instruction that can be used to optimize
simple renaming with unchanged operands such as:
get_tuple_element Reg P Dst => i_get_tuple_element Reg P Dst
By allowing it to lower the arity of instruction, transformations
such as the following can be handled:
trim N Remaining => i_trim N
All in all, currently 67 transformations can be optimized in this
way, including some commonly used ones.
|
|
Generic instructions have a min_window field. Its purpose is to
avoid calling transform_engine() when there are too few instructions
in the current "transformation window" for a transformation to
succeed.
Currently it does not do much good since the window size will be
decremented by one before being used. The reason for the subtraction
is probably that in some circumstances in the past, the loader could
read past the end of the BEAM module while attempting to fetch
instructions to increase the window size. Therefore, it would not
be safe to just remove the subtraction by one.
The simplest and safest solution seems to always ensure that there
are always at least TWO instructions when calling transform_engine().
That will be safe, as long as a BEAM module is always finished with
an int_code_end/0 that is not involved in any transformation.
|
|
When an instruction with a variable number operands (such as
select_val) is seen of the left side of a transformation, the
'next_arg' instruction will allocate a buffer to fit all variables and
all operands will be copied into the buffer. Very often, the 'commit'
instruction will never be reached because of a test or predicate
failing or because of a short window; in that case, the variable
buffer will be deallocated.
Note that originally there were only few instructions with a variable
number of operands, but now common operations such as tuple building
also have a variable number of operands.
To avoid those frequent allocations and deallocations, modify the
'next_arg' instruction to only save a pointer to the first of the
"rest" arguments. Also move the deallocation of the instructions
on the left side from the 'commit' instruction to the 'end'
instruction to ensure that 'store_rest_args' will still work.
|
|
|
|
* maint:
Add configure switch --disable-saved-compile-time
Fix ethread events with timeout
Improve choice of clock sources at build time
|
|
|
|
When unpacking operands on 64-bit CPUs, use a smarter mask to
help the compiler optimize the code.
It turns out that on x86_64, if we use the mask 0xFFFFUL (selecting
the 16 least significant bits), the compiler can combine a move and
a mask operation into the single insruction 'movzwl', which will
eliminate one instruction.
|
|
Not pre-fetching in conditional instructions (instructions that use
-fail_action) seems to improve performance slightly.
The reason for that is that conditional instructions may jump to the
failure label, wasting the pre-fetched instruction. Another reason
is that that many conditional instructions do a function call, and
the register pressure is therefore high. Avoiding the pre-fetch
may reduce the register pressure and pontentially result in more
efficient code.
|
|
It is currently only possible to pack up to 4 operands. However,
the move_window4 instrucion has 5 operands and move_window5 and
move3 instrucations have 6 operands.
Teach beam_makeops to pack instructions with 5 or 6 operands.
Also rewrite the move_window instructions in beam_emu.c to macros
to allow their operands to get packed.
|
|
When packing 3 operands into one word, there would be an unnecessary
mask operation when extracting the last operand.
|
|
The 'r' type is now mandatory. That means in order to handle
both of the following instructions:
move x(0) y(7)
move x(1) y(7)
we would need to define two specific operations in ops.tab:
move r y
move x y
We want to make 'r' operands optional. That is, if we have
only this specific instruction:
move x y
it will match both of the following instructions:
move x(0) y(7)
move x(1) y(7)
Make 'r' optional allows us to save code space when we don't
want to make handling of x(0) a special case, but we can still
use 'r' to optimize commonly used instructions.
|
|
Consider the try_case_end instruction:
try_case_end s
The 's' operand type means that the operand can either be a
literal of one of the types atom, integer, or empty list, or
a register. That worked well before R12. In R12 additional
types of literals where introduced. Because of way the
overloading was done, an 's' operand cannot handle the
new types of literals. Therefore, code such as the following
is necessary in ops.tab to avoid giving an 's' operand a
literal:
try_case_end Literal=q => move Literal x | try_case_end x
While this work, it is error-prone in that it is easy to
forget to add that kind of rule. It would also be complicated
in case we wanted to introduce a new kind of addition operator
such as:
i_plus jssd
Since there are two 's' operands, two scratch registers and
two 'move' instructions would be needed.
Therefore, we'll need to find a smarter way to find tag
register operands. We will overload the pid and port tags
for X and Y register, respectively. That works because pids
and port are immediate values (fit in one word), and there
are no literals for pids and ports.
|
|
The purpose of this series of commits is to improve code generation
for the Clang compiler.
As a first step we want to change the meaning of 'x' in a
transformation such as:
operation Literal=q => move Literal x | operation x
Currently, a plain 'x' means reg[0] or x(0), which is the first
element in the X register array. That element is distinct from
r(0) which is a variable in process_main(). Therefore, since r(0)
and x(0) are currently distinct it is fine to use x(0) as a
scratch register.
However, in the next commit we will eliminate the separate variable
for storing the contents of X register zero (thus, x(0) and r(0)
will point to the same location in the X register array). Therefore,
we must use another scratch register in transformation. Redefine
a plain 'x' in a transformation to mean x(1023). Also define
SCRATCH_X_REG so that we can refer to the register by name from
C code.
|
|
Consider an hypothetical instruction:
do_something x x c
The loader would crash if we tried to load an instance of the
instruction with the last operand referencing a literal:
{do_something,{x,0},{x,1},{literal,{a,b,c}}}
Teach beam_makeops to turn off packing for such unsafe instructions.
|
|
|
|
|
|
This has to be done in order to consistently generate the same
file so that we do not get rebuilds all the time.
|
|
|
|
|
|
It was not possible to preserve extra arguments in transformations.
The following (hypothetical) example will now work:
some_op Lit=c SizeArg Rest=* => move Lit x | some_op x SizeArg Rest
|
|
|
|
|
|
|
|
None of the OTP linked-in driver are supported
|
|
Both crypto and asn1 are supported.
|
|
|
|
|
|
The sha will only be included if there is no tag
starting with OTP_R* associated with the sha. This
is because we do not want the sha to show on offical
releases.
|
|
|
|
|
|
Perl 5.16.1 (and perhaps other versions) issues the following
warning:
defined(@array) is deprecated at utils/beam_makeops line 1714.
(Maybe you should just omit the defined()?)
for the following line:
$prev_last = pop(@{$gen_transform{$key}})
if defined @{$gen_transform{$key}}; # LINE 1714
The documentation for "defined" says that its use on hashes and
arrays is deprecated and that it may stop working in a future
release.
Simply removing "defined" (as suggested by the warning message)
will not work, as there will be an error when trying to use an
undefined value as an array reference:
Can't use an undefined value as an ARRAY reference at
utils/beam_makeops line 1714.
What we must do is to check whether $gen_transform{$key} is
defined before trying to use it as an array reference.
Noticed-by: Tuncer Ayaz
|
|
Conflicts:
lib/diameter/autoconf/vxworks/sed.general
xcomp/README.md
|
|
|
|
* maint:
make_preload: Don't fail if Perl's default file encoding is UTF-8
|
|
Setting Perl's default encoding for files to UTF-8, for example
like this:
PERL_UNICODE=DS make
would crash the build with a message similar to:
form size 1413 greater than size 1237 of module at
utils/make_preload line 175, <FILE> chunk 1.
Tell Perl to interpret the data in BEAM files as binary by
using the binmode() function. The binmode() function existed
before Unicode support was added to Perl, which means that
make_preload should work even in old versions of Perl.
Noticed-by: Aaron Harnly
|
|
Calls to erlang:set_trace_pattern/3 will no longer block all
other schedulers.
We will still go to single-scheduler mode when new code is loaded
for a module that is traced, or when loading code when there is a
default trace pattern set. That is not impossible to fix, but that
requires much closer cooperation between tracing BIFs and the loader
BIFs.
|
|
|
|
|
|
|
|
Still does not run, just compiles.
|
|
The current calling convention for BIFs makes it necessary to
handle each arity specially, since each argument for the BIF
also becomes an argument for the C function implementing the BIF,
which makes it hard to allow BIFs with any number of arguments.
Change the calling convention for BIFs, so that BIF arguments are
passed in an array to the C function implementing the BIF.
|
|
We already avoid outputting a comment terminator ("*/") inside
a comment to avoid causing a syntax error. Also avoid outputting
the start of a comment ("/*") to avoid causing a compiler warning.
Noticed-by: Tuncer Ayaz
|
|
|