Age | Commit message (Collapse) | Author |
|
* bjorn/erts/beam_load:
Optimize get_tuple_element instructions that target Y registers
Mend beam_SUITE:packed_registers/1
Correct unpacking of 3 operands on 32-bit archictectures
Eliminate misleading #ifdef ARCH_64 in beam_opcodes.h
beam_debug: Correct masking when unpacking packed operands
|
|
|
|
|
|
The TMPBUF option is no longer needed due to is_literal test and NONE
was only used for initial debugging. So we remove the entire option.
|
|
|
|
This commit is just for debugging purposes, will probably be reverted.
It comes with a the erts_debug:copy_shared/1 BIF. If SHCOPY_DISABLE
is defined, SHCOPY starts disabled and is dynamically enabled the first
time that the BIF is called.
|
|
|
|
* sverk/literal-memory-range:
erts: Refactor line table in loaded beam code
erts: Refactor header of loaded beam code
fix check_process_code for separate literal area
erts: Add support for fast erts_is_literal()
erts: Refactor erl_mmap to allow several mapper instances
erts: Add new allocator LITERAL
erts: Fix strangeness in treatment of MSEG_ALIGN_BITS
erts: Cleanup main carrier creation
erts: Remove unused erts_have_erts_mmap
erts: Refactor config test for posix_memalign
|
|
to use a real C struct instead of array.
|
|
|
|
|
|
* maint:
Fix crash when disassembling modules with BIFs
|
|
In a debug-compiled emulator, running erts_debug:df(io) would
trigger an assertion failure:
1> erts_debug:df(io).
beam/beam_debug.c:301:erts_debug_disassemble_1() Assertion failed: (((funcinfo[0]) & 0x3F) == ((0x0 << 4) | ((0x2 << 2) | 0x3)))
Aborted (core dumped)
It turns out that the assertion is wrong. It should have been
updated in 64ccd8c9b7a7 which made it possible to have stubs for
BIFs in the BEAM code for a module. The faulty assertion was only
found when when 16317f73f79265 added a smoke test of the BEAM
disassembler.
|
|
Since 'd' operands can only either an X register or an Y register,
we only need a single bit to distinguish them. Furthermore, we can
pre-multiply the register number with the word size to speed up
address calculation.
|
|
The 'r' type is now mandatory. That means in order to handle
both of the following instructions:
move x(0) y(7)
move x(1) y(7)
we would need to define two specific operations in ops.tab:
move r y
move x y
We want to make 'r' operands optional. That is, if we have
only this specific instruction:
move x y
it will match both of the following instructions:
move x(0) y(7)
move x(1) y(7)
Make 'r' optional allows us to save code space when we don't
want to make handling of x(0) a special case, but we can still
use 'r' to optimize commonly used instructions.
|
|
Consider the try_case_end instruction:
try_case_end s
The 's' operand type means that the operand can either be a
literal of one of the types atom, integer, or empty list, or
a register. That worked well before R12. In R12 additional
types of literals where introduced. Because of way the
overloading was done, an 's' operand cannot handle the
new types of literals. Therefore, code such as the following
is necessary in ops.tab to avoid giving an 's' operand a
literal:
try_case_end Literal=q => move Literal x | try_case_end x
While this work, it is error-prone in that it is easy to
forget to add that kind of rule. It would also be complicated
in case we wanted to introduce a new kind of addition operator
such as:
i_plus jssd
Since there are two 's' operands, two scratch registers and
two 'move' instructions would be needed.
Therefore, we'll need to find a smarter way to find tag
register operands. We will overload the pid and port tags
for X and Y register, respectively. That works because pids
and port are immediate values (fit in one word), and there
are no literals for pids and ports.
|
|
|
|
The emulator would crash.
|
|
|
|
Sentinels in select_tuple_arity instructions are not proper
tuple arities and thus cannot be checked in debug.
Print them as small integers instead.
|
|
The has_map_fields instruction is infrequently used. Thus there
is no need to have the fastest possible implementation; it is
better to have an implementation that reduces the code size in
the already big process_main() function.
We can transform has_map_fields to a get_map_elements instruction,
targeting the same unused x[0] register for all keys. That
instruction will only be marginally slower than existing
implementation.
|
|
The new_map instruction cannot fail, and thus needs no fail label.
|
|
For searching a key in an array we use linear search in arrays
up to 10 elements.
Selecting on tuple arity will always use linear search.
Instead of using two different instructions we assume selecting on
different tuple arities are always few in numbers.
|
|
Now handles map instructions correctly.
|
|
|
|
Calls to erlang:set_trace_pattern/3 will no longer block all
other schedulers.
We will still go to single-scheduler mode when new code is loaded
for a module that is traced, or when loading code when there is a
default trace pattern set. That is not impossible to fix, but that
requires much closer cooperation between tracing BIFs and the loader
BIFs.
|
|
To allow us to manage breakpoints without going to single-scheduler
mode, we will need to update the breakpoints in several stages with
memory barriers in between each stage. Prepare for that by splitting
up the breakpoint setting and clearing functions into several smaller
functions.
|
|
Rename lock_code_ix as seize_code_write_permission. Don't want to call
it a "lock" as it can be held between schedulings and different threads
and is not managed by lock checker.
Rename "activate" staging as "commit" staging. Why not be consistent
and use git terminology all the way.
|
|
We want to avoid the race when trace settings are done in the time gap
while a code stager process is waiting for thread process before
commiting and releasing code_ix lock.
|
|
|
|
|
|
Still blocking code loading
|
|
|
|
* rickard/thr-progress-block/OTP-9631:
Replace system block with thread progress block
|
|
The ERTS internal system block functionality has been replaced by
new functionality for blocking the system. The old system block
functionality had contention issues and complexity issues. The
new functionality piggy-backs on thread progress tracking functionality
needed by newly introduced lock-free synchronization in the runtime
system. When the functionality for blocking the system isn't used
there is more or less no overhead at all. This since the functionality
for tracking thread progress is there and needed anyway.
|
|
As a preparation for changing the calling convention for
BIFs, make sure that all BIFs use the macros. Also, eliminate
all calls from one BIF to another, since that also breaks
the calling convention abstraction.
|
|
Existing %bp to print pointer size integers does not work in halfword
emulator to print Eterm size integers.
|
|
|
|
The new_binary() function takes a size argument that is an
int. In the 64-bit emulator (sizeof(int) == 4, sizeof(Uint) == 8),
any sizes >= 0x8000000 become 0xffffffff80000000 and above and
triggers a memory allocation failure.
Change the type of the size argument to Uint, and change any
callers that cast the argument to an int.
Correction-by: Jon Meredith
|
|
|
|
Instead of having one i_select_val_sfI instruction that uses
the GetArg1() macro to fetch the controlling expression, use
three separate instructions for each of the register types.
That will save one word when selecting on the {x,0} register.
It should also be slightly faster since a conditional branch
is eliminated.
Although it seems that the BEAM compiler will never generate
a constant controlling expression (even with optimizations
turned off), we still make sure that they will work by
evaluating the select_val instruction at load time.
Handle the select_tuple_arity instruction in the same way.
|
|
Combine the put_tuple/2 and all following put/1 instructions
to one i_put_tuple/2 instruction. In general, that will reduce
the number of instruction words by 50 percent.
Measurements seem to indicate that the speed is about the same.
|
|
In many (not all) cases, the value for the 'I' type will
fit into 32 bits.
|
|
|
|
Introduce a new 'Q' type, similar to 'P' except that it
can be packed.
|
|
The i_jump_on_val_zero/3 and i_select_tuple_arity/3 instructions
were not disassembled correctly.
|
|
|
|
erts_debug:instructions/0 is useful for finding which specific
instructions that are not used at all.
|
|
As the overhead counter got larger and never really was needed in ets objects,
I removed them.
A few stray comments of XXX:PaN type from halfword dev removed in the process.
|
|
* pan/otp_8332_halfword:
Teach testcase in driver_suite the new prototype for driver_async
wx: Correct usage of driver callbacks from wx thread
Adopt the new (R13B04) Nif functionality to the halfword codebase
Support monitoring and demonitoring from driver threads
Fix further test-suite problems
Correct the VM to work for more test suites
Teach {wordsize,internal|external} to system_info/1
Make tracing and distribution work
Turn on instruction packing in the loader and virtual machine
Add the BeamInstr data type for loaded BEAM code
Fix the BEAM dissambler for the half-word emulator
Store pointers to heap data in 32-bit words
Add a custom mmap wrapper to force heaps into the lower address range
Fit all heap data into the 32-bit address range
|