* maint:
Revert "Merge branch 'rickard/null-char-filenames/ERL-370/OTP-14543' into maint"
|
|
This reverts commit 0717a2194e863f3a78595184ccc5637697f03353, reversing
changes made to 71a40658a0cef8b3e25df3a8e48a72d0563a89bf.
|
|
Conflicts:
erts/emulator/beam/beam_bp.c
erts/emulator/beam/erl_process.c
|
|
* lukas/erts/tracing/recv_exit_signal_deadlock/OTP-14678:
erts: Fix lock order when recv tracing trapped exit signal
|
|
* lukas/erts/fix_caller_trace_for_apply_bifs/OTP-14677:
erts: Fix caller trace for apply bifs
|
|
|
|
Bifs that are called through the export entry
using i_call_last could have their cp set to
return_trace, just like any other call. So we
have to unwind the trace stack to get the correct
cp. Not doing this creates a lot of issues for
fprof.
|
|
OTP-14327
OTP-14340
* bjorn/erts/pack-with-opcode/OTP-14325:
Pack operands for combined instructions into the instruction word
beam_makeops: Use named arguments for the code generation functions
Optimize packing for "optional use" operands
beam_makeops: Print the instruction name for fatal packing errors
Introduce a syntax for marking operands as "optional use"
beam_makeops: Refactor parsing of specific instructions
Optimize instruction prefetch
Pack operands into the instruction word
Use 32-bits pointers to C code
Move LD flags for hipe from Makefile.in to configure.in
beam_disasm: Correct printing of y registers
ops.tab: Slightly optimize badmatch on a Y register
macros.tab: Fix assertion in SET_I_REL()
|
|
Introduce a syntax to mark an operand that is not always used when
an instruction is executed. Examples of such operands are the fail
label for is_nil or the number of live registers for an
allocate instruction.
Use a question mark to annotate optional use:
is_nil f? xy
allocate t t?
|
|
|
|
On 64-bit machines where the C code is always at an address below 4 GB,
pack one or more operands into the instruction word.
|
|
On a 64-bit machine, we only need 32 bits to store a pointer to
the C code that implements a BEAM instruction. Refactor the code
to only use the lower 32 bits of each instruction word, and take
care to preserve the high 32 bits.
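The split described above can be illustrated with a minimal sketch. It assumes a 64-bit instruction word whose low half holds the address of the C code and whose high half is free for packed operands. All helper names are hypothetical and are not the macros used in erts.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t BeamInstr; /* one 64-bit instruction word */

/* All helper names below are hypothetical, chosen for this sketch. */

/* Store a 32-bit handler address in the low half, keeping the high half. */
BeamInstr set_handler(BeamInstr word, uint32_t handler)
{
    return (word & 0xFFFFFFFF00000000ULL) | (BeamInstr) handler;
}

/* Store a packed operand in the high half, keeping the low half. */
BeamInstr set_packed_operand(BeamInstr word, uint32_t operand)
{
    return (word & 0x00000000FFFFFFFFULL) | ((BeamInstr) operand << 32);
}

uint32_t get_handler(BeamInstr word)
{
    return (uint32_t) (word & 0xFFFFFFFFULL);
}

uint32_t get_packed_operand(BeamInstr word)
{
    return (uint32_t) (word >> 32);
}

/* Updating the handler must not disturb the packed operand. */
void check_handler_update_keeps_operand(void)
{
    BeamInstr w = set_packed_operand(0, 7);
    w = set_handler(w, 0x00401000u);
    assert(get_packed_operand(w) == 7);
    assert(get_handler(w) == 0x00401000u);
}
```

Masking before OR-ing is what preserves the other half of the word. A plain assignment of the handler address would wipe out any operand already packed into the high 32 bits.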
|
|
|
|
|
|
9a50a5d5fc1 changed the update of I, but forgot to update
the preceding assertion.
|
|
|
|
|
|
We can't just leave it in queue with dist_ext=NULL.
Two symptoms seen:
1. 'receive' trying to deref dist_ext as NULL.
2. GC thinks it's a term and puts THE_NON_VALUE in the root set.
|
|
* bjorn/erts/improve-beam-ops:
Move out variables from the head of combined instructions
Change operand from 'P' to 'Q' for i_apply_last and i_apply_fun_last
Add CHECK_ALIGNED() for testing storage destinations
instrs.tab: Add missing -no_next directives
beam_load.c: Generalize the 'P' operator in the packing engine
Break out most of the initialization from process_main()
Eliminate the OpCode() macro
Eliminate unnecessary and inconsistent casts
Refactor macros for accessing Beam instructions
beam_emu: Make order of macros consistent
beam_SUITE: Strengthen test of packed registers
|
|
Point out the correct line number in stack traces
|
|
* maint:
Don't allow null in filenames
|
|
* lukas/erts/poll-thread/OTP-14346: (25 commits)
erts: Trigger ready events when erts_io_control fails
erts: enif_select steal test
kernel: Rewrite gen_udp_SUITE:read_packet tc
erts: disable kernel-poll on OS X vsn < 16
erts: Fix msacc testcase with new poll-thread
erts: Add testcases to test IOp and IOt options
erts: get_internal_state(check_io_debug) now prints to error_logger
erts: Remove eager check io
erts: Move all I/O polling to a separate thread
erts: Fix smp_select testcase to use ERL_DRV_USE
erts: Fix msacc unmanaged state counter
erts: Optimize port_task quick allocator
erts: Add ERTS_THR_PREF_QUICK_ALLOC_IMPL
erts: Update suspend of scheduler to handle multiple pollsets
erts: Add multiple poll sets
erts: Some code cleanup for gdb to work better
erts: temp_alloc can no longer be disabled
erts: Refactor check_io to use one static struct
erts: Replace check_io spinlock with lock-less list insertion
erts: Add number of enif_select's to check_io_debug
...
|
|
|
|
It is no longer relevant when using the poll thread.
|
|
|
|
OTP-14652
|
|
for non scheduler threads by using ERTS_THR_PREF_QUICK_ALLOC_IMPL.
|
|
usable from any (managed?) thread.
|
|
|
|
|
|
|
|
temp_alloc is used in such a way that if it ever results
in a malloc/free sequence it will slow down the system
a lot. So it will no longer be possible to disable it and
it will not be disabled when using +Mea min.
OTP-14651
|
|
Move out from the head the variables that are only used in the execute
phase.
|
|
All other instructions that increment the stack pointer take a 'Q'
operand.
|
|
Add the CHECK_ALIGNED() macro that can be used for testing that
the storage destination is word-aligned.
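A minimal sketch of such a check, assuming that "word-aligned" means the address is a multiple of the machine word size. The macro body, the Eterm stand-in type, and the helper functions are assumptions made for illustration; the definition in erts may differ.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t Eterm; /* stand-in for a machine word */

/* Returns 1 if the address is a multiple of the word size. */
int is_word_aligned(const void *p)
{
    return ((uintptr_t) p % sizeof(Eterm)) == 0;
}

/* Hypothetical definition; the real macro in erts may differ. */
#define CHECK_ALIGNED(Dst) assert(is_word_aligned(&(Dst)))

/* Store a word, asserting in debug builds that the destination is aligned. */
void store_word(Eterm *dst, Eterm value)
{
    CHECK_ALIGNED(*dst);
    *dst = value;
}
```

Because the check is built on assert(), it costs nothing when NDEBUG is defined.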
|
|
|
|
In the 'P' operator, don't assume that a packed target label ('f'
or 'j') is always the leftmost argument. Instead, transfer the
patch position from the accumulator to the stack.
|
|
process_main() is already too big.
|
|
Introduce the IsOpCode() macro that can be used to compare
instructions.
|
|
Consider the types in the code below:
BeamInstr* I;
.
.
.
BeamInstr* next;
next = (BeamInstr *) *I;
Goto(next);
This is illogical. If 'I' points to a BeamInstr, then 'next' should
be a BeamInstr, not a pointer to a BeamInstr. The Goto() macro does
not require a pointer, because it will cast its argument to a void*
anyway.
Therefore, this code example can be simplified to:
BeamInstr* I;
.
.
.
BeamInstr next;
next = *I;
Goto(next);
Similarly, we can remove the casts in the macros when NO_JUMP_TABLE
is defined.
|
|
The BeamOp() macro in erl_vm.h is clumsy to use. All users
cast the return value to BeamInstr.
Define new macros that are easier to use. In the future,
we might want to pack an operand into the same word as
the pointer to the instruction, so we will define two macros.
BeamIsOpCode() is used to rewrite code like this:
if (Instr == (BeamInstr) BeamOp(op_i_func_info_IaaI)) {
    ...
}
to:
if (BeamIsOpCode(Instr, op_i_func_info_IaaI)) {
    ...
}
BeamOpCodeAddr(op_apply_bif) is used when we need the address
for an instruction.
Also eliminate the global variables em_* in beam_emu.c.
They are not really needed. Use the BeamOpCodeAddr() macro
instead.
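How the two macros relate can be sketched as follows. This is a simplified illustration that assumes the instruction word holds nothing but the handler address. The opcode numbers, the handler_stub array, and the macro bodies are hypothetical and are not copied from erl_vm.h.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t BeamInstr;

/* Hypothetical opcode numbers and handler table for this sketch. */
enum { op_i_func_info_IaaI, op_apply_bif, NUM_OPS };
static char handler_stub[NUM_OPS]; /* stand-ins for instruction code */
static void *beam_ops[NUM_OPS] = { &handler_stub[0], &handler_stub[1] };

/* Assumed definitions: address lookup and comparison in one place. */
#define BeamOpCodeAddr(OpCode) ((BeamInstr) beam_ops[(OpCode)])
#define BeamIsOpCode(InstrWord, OpCode) ((InstrWord) == BeamOpCodeAddr(OpCode))

/* Callers no longer need to cast anything themselves. */
int is_func_info(BeamInstr instr)
{
    return BeamIsOpCode(instr, op_i_func_info_IaaI);
}

void check_opcode_macros(void)
{
    BeamInstr instr = BeamOpCodeAddr(op_apply_bif);
    assert(BeamIsOpCode(instr, op_apply_bif));
    assert(!BeamIsOpCode(instr, op_i_func_info_IaaI));
}
```

Keeping the comparison inside one macro means that a later change to the word layout, such as packing an operand next to the address, only needs to touch the macro and not every caller.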
|
|
The inconsistent order has annoyed me for a long time.
While at it, also remove the unnecessary definition of LabelAddr() if
NO_JUMP_TABLE is defined.
|
|
Sometimes the line number in a stack trace could be wrong,
for example for this code:
t() ->
    Res = id(x),      %<== Wrong line number.
    Res + 1.
id(I) -> I.
The line number pointed out in the stack trace would be the
line before the line where the exception occurred.
The reason is the way the increment instruction
is implemented:
OpCase(i_increment_rWtd):
    {
        increment_reg_val = r(0);
    }
    I -= 1;
    goto increment__execute;
OpCase(i_increment_xWtd):
    {
        increment_reg_val = xb(I[1]);
    }
    goto increment__execute;
increment__execute:
    /* Common code for increment */
    .
    .
    .
(The implementation in OTP 20 is similar, but hand-coded directly
in beam_emu.c instead of generated.)
The instruction i_increment_rWtd decrements the instruction pointer (I)
before jumping to the common code. That means that I points *before*
the 'increment' instruction. If there is a 'line' instruction directly
before the 'increment' instruction (as there is in this example), the
instruction pointer will point before that line. Thus the previous line
will be picked up instead.
To eliminate this bug, we must never decrement the instruction pointer.
Instead, we can increment the other (longer) instructions in the
same group of combined instructions:
OpCase(i_increment_rWtd):
    {
        increment_reg_val = r(0);
    }
    goto increment__execute;
OpCase(i_increment_xWtd):
    {
        increment_reg_val = xb(I[1]);
    }
    I += 1;
    goto increment__execute;
increment__execute:
    /* Common code for increment */
    .
    .
    .
Also fix a bug that was only a potential bug when ddaed7774eb0a
introduced relative jumps, but is now a real bug. See the added
comment for SET_I_REL() in macros.tab.
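The effect of a decremented instruction pointer on line lookup can be shown with a small model. It assumes a table that maps the start position of each source line's code to its line number, with lookup returning the last entry at or before the given position. The table layout, the positions, and the function names are hypothetical; this is not the lookup code in erts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical line table entry: code position where a line starts. */
typedef struct {
    int start; /* position of the first instruction word of the line */
    int line;  /* source line number */
} LineEntry;

/* Return the line of the last entry whose start is <= pc.
 * The table is assumed to be sorted by ascending start position. */
int lookup_line(const LineEntry *table, size_t n, int pc)
{
    int line = 0;
    for (size_t i = 0; i < n; i++) {
        if (table[i].start <= pc) {
            line = table[i].line;
        }
    }
    return line;
}

/* 'increment' is the first instruction of line 3, at position 4. */
void check_decrement_reports_previous_line(void)
{
    const LineEntry table[] = { {0, 2}, {4, 3} };
    int increment_pc = 4;

    /* Unmodified I: the correct line is found. */
    assert(lookup_line(table, 2, increment_pc) == 3);
    /* I -= 1 moves I before the line boundary: the previous line wins. */
    assert(lookup_line(table, 2, increment_pc - 1) == 2);
}
```

In this model, moving the pointer forward within the instruction can never cross back over a line boundary, which is why incrementing in the longer variants is the safe direction.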
|
|
For a long time, there have been two macros, IS_SSMALL() and
MY_IS_SSMALL(), that do exactly the same thing.
There should only be one, and it should be called IS_SSMALL().
However, we must decide which implementation to use. When
MY_IS_SSMALL() was introduced a long time ago, it was the most
efficient. In a modern C compiler, there might not be any
difference.
To find out, I used the following small C program to examine
the code generation:
#include <stdio.h>
typedef unsigned int Uint32;
typedef unsigned long Uint64;
typedef long Sint;
#define SWORD_CONSTANT(Const) Const##L
#define SMALL_BITS (64-4)
#define MAX_SMALL ((SWORD_CONSTANT(1) << (SMALL_BITS-1))-1)
#define MIN_SMALL (-(SWORD_CONSTANT(1) << (SMALL_BITS-1)))
#define MY_IS_SSMALL32(x) (((Uint32) ((((x)) >> (SMALL_BITS-1)) + 1)) < 2)
#define MY_IS_SSMALL64(x) (((Uint64) ((((x)) >> (SMALL_BITS-1)) + 1)) < 2)
#define MY_IS_SSMALL(x) (sizeof(x) == sizeof(Uint32) ? MY_IS_SSMALL32(x) : MY_IS_SSMALL64(x))
#define IS_SSMALL(x) (((x) >= MIN_SMALL) && ((x) <= MAX_SMALL))
void original(Sint n)
{
    if (IS_SSMALL(n)) {
        printf("yes\n");
    }
}
void enhanced(Sint n)
{
    if (MY_IS_SSMALL(n)) {
        printf("yes\n");
    }
}
gcc 7.2 produced the following code for the original() function:
.LC0:
.string "yes"
original(long):
movabs rax, 576460752303423488
add rdi, rax
movabs rax, 1152921504606846975
cmp rdi, rax
jbe .L4
rep ret
.L4:
mov edi, OFFSET FLAT:.LC0
jmp puts
clang 5.0.0 produced the following code which is slightly better:
original(long):
movabs rax, 576460752303423488
add rax, rdi
shr rax, 60
jne .LBB0_1
mov edi, .Lstr
jmp puts # TAILCALL
.LBB0_1:
ret
.Lstr:
.asciz "yes"
However, in the context of beam_emu.c, clang could produce
code similar to what gcc produced.
gcc 7.2 produced the following code when MY_IS_SSMALL() was used:
.LC0:
.string "yes"
enhanced(long):
sar rdi, 59
add rdi, 1
cmp rdi, 1
jbe .L4
rep ret
.L4:
mov edi, OFFSET FLAT:.LC0
jmp puts
clang produced similar code.
This code seems to be the cheapest. There are four instructions, and
there is no loading of huge integer constants.
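Before choosing one implementation, it is worth confirming that the two macros really agree. The sketch below reuses the definitions from the test program above and compares them at the boundaries of the small range. It assumes an LP64 platform (64-bit long) and an arithmetic right shift for negative values, which is implementation-defined in C. The function names are made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef unsigned int Uint32;
typedef unsigned long Uint64;
typedef long Sint;

/* Assumption: long is 64 bits, as in the original test program. */
_Static_assert(sizeof(Sint) == 8, "this sketch assumes a 64-bit long");

#define SWORD_CONSTANT(Const) Const##L
#define SMALL_BITS (64-4)
#define MAX_SMALL ((SWORD_CONSTANT(1) << (SMALL_BITS-1))-1)
#define MIN_SMALL (-(SWORD_CONSTANT(1) << (SMALL_BITS-1)))
#define MY_IS_SSMALL32(x) (((Uint32) ((((x)) >> (SMALL_BITS-1)) + 1)) < 2)
#define MY_IS_SSMALL64(x) (((Uint64) ((((x)) >> (SMALL_BITS-1)) + 1)) < 2)
#define MY_IS_SSMALL(x) (sizeof(x) == sizeof(Uint32) ? MY_IS_SSMALL32(x) : MY_IS_SSMALL64(x))
#define IS_SSMALL(x) (((x) >= MIN_SMALL) && ((x) <= MAX_SMALL))

/* Range-comparison variant. */
int is_ssmall_range(Sint n) { return IS_SSMALL(n) ? 1 : 0; }

/* Shift-based variant. */
int is_ssmall_shift(Sint n) { return MY_IS_SSMALL(n) ? 1 : 0; }

/* Returns 1 if both variants agree on every boundary sample. */
int macros_agree(void)
{
    const Sint samples[] = {
        0, 1, -1,
        MAX_SMALL, MAX_SMALL + 1,
        MIN_SMALL, MIN_SMALL - 1,
        LONG_MAX, LONG_MIN
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (is_ssmall_range(samples[i]) != is_ssmall_shift(samples[i])) {
            return 0;
        }
    }
    return 1;
}
```

A boundary check like this does not prove equivalence for every input, but it covers the values where an off-by-one in the shift count would show up.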
|
|
|
|
* lukas/erts/fix_threads_error_printout:
erts: Print the error reason when threads fail to start
|
|
|
|
The byte_offset of sub-binaries wasn't taken into account for
ProcBins, subtly ruining the results. The test suite didn't catch
it since it didn't check for sub-binaries in particular, and only
checked for equality between variations -- not whether the output
was equal to the input.
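The class of bug described above can be illustrated with a generic sketch: a sub-binary is a view into a larger buffer, so any read must start at the base pointer plus the byte offset. The SubBinary struct and the function names are hypothetical and do not mirror the erts data structures. The check compares the output with the expected input bytes, which is the kind of test the message says was missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view into a larger, shared byte buffer. */
typedef struct {
    const unsigned char *base; /* start of the underlying buffer */
    size_t byte_offset;        /* where this view begins */
    size_t size;               /* number of bytes in the view */
} SubBinary;

/* Copy the bytes of the view into out; returns the number copied.
 * Reading from sb->base alone would silently return the wrong bytes. */
size_t sub_binary_copy(const SubBinary *sb, unsigned char *out, size_t out_size)
{
    size_t n = sb->size < out_size ? sb->size : out_size;
    memcpy(out, sb->base + sb->byte_offset, n);
    return n;
}

/* Compare the output against the input, not just against another variant. */
void check_offset_is_honoured(void)
{
    const unsigned char data[] = "hello world";
    SubBinary sb = { data, 6, 5 };
    unsigned char out[8] = { 0 };

    assert(sub_binary_copy(&sb, out, sizeof out) == 5);
    assert(memcmp(out, "world", 5) == 0);
}
```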
|
|
|
|
|