Age | Commit message | Author |
|
|
|
on 32-bit, as the granularity of the literal bit vector
is super-alignment.
|
|
* lukas/erts/tracing/fix-spawned-lc-error/OTP-10267:
erts: Fix lock order bug when only child is procs traced
|
|
* bjorn/compiler/misc-opt:
v3_kernel: Construct literal lists properly
Use the register map in %live in beam_utils:is_killed_block/2
Teach beam_utils to check liveness for put_map instructions
beam_peep: Help out beam_jump
|
|
* bjorn/erts/beam_load:
Optimize get_tuple_element instructions that target Y registers
Mend beam_SUITE:packed_registers/1
Correct unpacking of 3 operands on 32-bit architectures
Eliminate misleading #ifdef ARCH_64 in beam_opcodes.h
beam_debug: Correct masking when unpacking packed operands
|
|
that uses its own super carrier (erts_exec_mmapper)
to guarantee low-addressed and executable memory (PROT_EXEC).
Currently only used on x86_64, which needs low memory
for HiPE/AMD64's small code model.
By initializing erts_exec_mmapper early we secure
its low memory area before erts_literal_mmapper might
steal it.
|
|
to prepare for hipe native code allocation.
|
|
Make the callbacks more general to be usable for any allocator
that uses its own ErtsMemMapper.
|
|
Reduce main carrier size
and number of free descriptors.
|
|
This is needed because otherwise messages from system_profile
are not guaranteed to arrive before the trace is delivered.
|
|
Without an off_heap message queue, the GC of the tracer could take
very long and thus delay the code:purge, which in turn would allow
the trace generator to generate many more messages, which would
make the tracer GC take even longer, and so on. The test case
could take as long as 10 seconds and use lots of memory.
|
|
Any heap fragment created during a nif call to a tracer nif
should be freed immediately in order for the GC not to treat
it as live data.
|
|
OTP-13497
This trace event is triggered when a process is created, and it is
generated from the process that is created.
|
|
Rickards said that this was ok
|
|
We have the main lock on rp->p, so why not?
|
|
provoked by nif_SUITE:nif_binary_to_term.
If we fail to decode an immediate (an unsafe atom, for example) with
a dummy factory, then hp and factory->hp will both be uninitialized,
and valgrind will complain about comparing them.
|
|
Tracing to port in non-smp now creates port tasks
instead of calling directly to the port, so the fake
schedule events don't exist any more.
|
|
erts_block/unblock_fpe should only be called at entry to/exit from
native user code.
|
|
This commit completes the tracing for processes so that
all messages sent by a process (via nifs or otherwise) will
be traced.
The commit also adds tracing of all types of events from ports.
When enabling tracing using erlang:trace, the 'all' flag now also
enables tracing on all ports.
OTP-13496
|
|
Add the possibility to use modules as trace data receivers. The functions
in the module have to be nifs, as otherwise complex trace probes (for
example trace probes for ports) will be very hard to handle.
This commit changes the way that the ptab->tracer field works: instead of
always being an immediate, it is now NIL if no tracer is present, or else
the tuple {TracerModule, TracerState}, where TracerModule is an atom that
is later used to look up the appropriate tracer callbacks to call and
TracerState is just passed to the tracer callback. The default process and
port tracers have been rewritten to use the new API.
This commit also changes the order in which trace messages are delivered to the
potential tracer process. Any enif_send done in a tracer module may be delayed
indefinitely because of lock order issues. If a message is delayed, any other
trace message sent from that process is also delayed, so that order is preserved
for each traced entity. This means that for some trace events (e.g. send/receive)
the events may come in an unintuitive order (receive before send) to the
trace receiver. Timestamps are taken when the trace message is generated, so
trace messages from different processes may arrive with their timestamps
out of order.
Both erlang:trace and seq_trace:set_system_tracer accept the new tracer
module tracers as well as the backwards compatible arguments.
OTP-10267
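The per-entity ordering rule described above can be modelled roughly as follows. This is a minimal Python sketch, not the ERTS implementation: the class name TraceSender, the try_send callback and the flush step are all hypothetical names introduced for illustration. It only shows why a delayed message holds back later messages from the same traced entity, while other entities keep delivering.

```python
from collections import defaultdict, deque


class TraceSender:
    """Hypothetical model: trace delivery that preserves order per traced entity."""

    def __init__(self, try_send):
        # try_send(entity, msg) returns False when the send has to be delayed
        # (standing in for a send that cannot proceed due to lock order).
        self.try_send = try_send
        self.delayed = defaultdict(deque)  # traced entity -> queued messages

    def send(self, entity, msg):
        queue = self.delayed[entity]
        # If anything from this entity is already delayed, queue behind it
        # so that messages from one entity are never reordered.
        if queue or not self.try_send(entity, msg):
            queue.append(msg)

    def flush(self, entity):
        # Retry delayed messages in their original order; stop at the first
        # one that still cannot be sent.
        queue = self.delayed[entity]
        while queue and self.try_send(entity, queue[0]):
            queue.popleft()
```

In this model a 'send' event from one process can be stuck in its queue while the matching 'receive' event from another process is delivered immediately, which is the unintuitive order the commit message warns about.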
|
|
These are convenience functions for calling nifs from erts
|
|
* lukas/erts/enif_send_null_env/OTP-13495:
erts: Add enif_send with NULL as msg env
|
|
* egil/erts/fix-erlang-system_profile/ERL-126/OTP-13494:
erts: Enhance system_profile tests
erts: Don't use function location when process is terminating
|
|
Use cerl:make_list/1 instead of a home-made make_list/1 to ensure that
literal lists are constructed as literals. In a future release, we
would like the loader to forbid construction of literal lists using
instructions like:
put_list {atom,a} [] Dst
The proper way is:
move {literal,[a]} {x,0}
Also update the comment about "put_list Const [] Dst" in ops.tab.
|
|
Several improvements in the compiler (e.g. c288ab87fd6) have
led to a Y register being the target for get_tuple_element
instructions. Therefore, introduce i_get_tuple_element2y
that combines two consecutive get_tuple_element instructions
that target Y registers.
|
|
packed_registers/1 may have actually tested put_list/3 instructions
with high register numbers at some time. Currently, the compiler
will generate code that only uses low register numbers.
Totally rewrite the test case. It is difficult to arrange for
put_list/3 to use three high register numbers, so we will use
get_list/3 instructions with high register numbers.
|
|
0a4750f91c83 optimized unpacking by removing a mask operation
when unpacking three packed operands. Unfortunately, that optimization
is only safe on 64-bit architectures.
Here is what happens on 32-bit architectures.
The operands to be packed are 10-bit register numbers that have been
turned into byte offsets:
aaaaaaaaaa00
bbbbbbbbbb00
cccccccccc00
They can be packed into a single word like this:
30         20         10          0
 |          |          |          |
aa aaaaaaaabb bbbbbbbbcc cccccccc00
If we call the packed word P, the original operands can be
extracted like this:
C = P band 2#111111111100
B = (P bsr 10) band 2#111111111100
A = (P bsr 20) band 2#111111111100
The bug was that A was extracted without the masking:
A = P bsr 20
That would give A the value:
aaaaaaaaaabb
That would only be safe if the two most significant bits in B
were zeroes.
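The arithmetic above can be checked with a small Python sketch. The function names pack3, unpack3 and unpack3_unmasked are hypothetical and are not taken from the loader; the sketch only mirrors the shifts and masks given in the commit message and models the packed word as 32 bits.

```python
MASK = 0b111111111100  # 10-bit register number scaled to a byte offset


def pack3(a, b, c):
    """Pack three byte offsets (multiples of 4, below 4096) into a 32-bit word."""
    return ((a << 20) | (b << 10) | c) & 0xFFFFFFFF


def unpack3(p):
    """Correct unpacking: every operand is masked."""
    c = p & MASK
    b = (p >> 10) & MASK
    a = (p >> 20) & MASK
    return a, b, c


def unpack3_unmasked(p):
    """The faulty variant: A is extracted without the mask."""
    c = p & MASK
    b = (p >> 10) & MASK
    a = p >> 20  # the two top bits of B leak into the low bits of A
    return a, b, c


# Register numbers 1, 1023 and 5, turned into byte offsets.
word = pack3(1 * 4, 1023 * 4, 5 * 4)
print(unpack3(word))           # all three offsets recovered
print(unpack3_unmasked(word))  # A is corrupted by B's high bits
```

With a low register number for B (for example 5), both variants return the same result, which is consistent with the rewritten test case needing high register numbers to expose the problem.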
|
|
There is a '#ifdef ARCH_64' in beam_opcodes.h, which might make you
think that files generated by beam_makeops will work for both
32-bit and 64-bit architectures. They will not. beam_makeops will
generate different code depending on its -wordsize option.
|
|
* egil/erts/opt-list_append/OTP-13487:
erts: Optimize '++' operator
|
|
It's not just ok to throw badarg, it MUST throw badarg.
|
|
* henrik/update-copyrightyear:
update copyright-year
|
|
* bjorn/raise:
Remove unreachable code after 'raise' instructions
Simplify the raise instruction to reduce code size
|
|
This also optimizes the BIF lists:append/2.
Use one pass to check for properness and to copy the LHS list.
If the LHS turns out not to be a proper list, bail and reset htop.
If we run out of heap, allocate a heap fragment and calculate
the remaining length as normal, thus checking for properness,
and then continue copying.
Measurements show this to be ~50% faster.
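The single-pass idea can be sketched in Python with explicit cons cells. This is an illustration only: the names Cons, NIL, append, from_py and to_py are hypothetical, ValueError stands in for badarg, and the heap-fragment fallback is not modelled because Python has no process heap to run out of.

```python
NIL = None  # stands in for the empty list []


class Cons:
    """A list cell with a head and a tail, as in a linked list."""
    __slots__ = ("head", "tail")

    def __init__(self, head, tail):
        self.head = head
        self.tail = tail


def append(lhs, rhs):
    """Copy lhs in one pass, checking properness while copying, then link to rhs."""
    if lhs is NIL:
        return rhs
    if not isinstance(lhs, Cons):
        raise ValueError("badarg")
    first = last = Cons(lhs.head, NIL)
    cur = lhs.tail
    while isinstance(cur, Cons):
        cell = Cons(cur.head, NIL)
        last.tail = cell
        last = cell
        cur = cur.tail
    if cur is not NIL:
        # Improper list: the partial copy is dropped (the commit resets htop).
        raise ValueError("badarg")
    last.tail = rhs  # rhs is shared, not copied
    return first


def from_py(items):
    """Build a proper cons list from a Python list."""
    result = NIL
    for item in reversed(items):
        result = Cons(item, result)
    return result


def to_py(lst):
    """Convert a proper cons list back to a Python list."""
    out = []
    while isinstance(lst, Cons):
        out.append(lst.head)
        lst = lst.tail
    return out
```

A two-pass version would first walk the LHS to compute its length and verify properness, and only then copy. The sketch does both in the same loop and pays for it only in the improper-list case, where the copied cells are wasted.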
|
|