Age | Commit message (Collapse) | Author |
|
|
|
This reverts commit f2b332186a for beam_emu.c only, to enable
an upgrade path for existing beam code compiled under OTP 20 with
parameterized modules.
|
|
into maint
* john/erts/spectre-configure-flag-otp_20/OTP-15430/ERIERL-237:
Allow disabling retpoline in interpreter loop
Add a ./configure flag for spectre mitigation
|
|
john/erts/spectre-configure-flag-otp_20/OTP-15430/ERIERL-237
* john/erts/spectre-configure-flag/OTP-15430/ERIERL-237:
Allow disabling retpoline in interpreter loop
Add a ./configure flag for spectre mitigation
|
|
We only do this when the user has explicitly told us it's okay to
partially disable mitigation (spectre-mitigation=incomplete). The
macro is inert if it isn't.
|
|
|
|
|
|
Communication between Erlang processes has conceptually always been
performed through asynchronous signaling. The runtime system
implementation has however previously preformed most operation
synchronously. In a system with only one true thread of execution, this
is not problematic (often the opposite). In a system with multiple threads
of execution (as current runtime system implementation with SMP support)
it becomes problematic. This since it often involves locking of structures
when updating them which in turn cause resource contention. Utilizing
true asynchronous communication often avoids these resource contention
issues.
The case that triggered this change was contention on the link lock due
to frequent updates of the monitor trees during communication with a
frequently used server. The signal order delivery guarantees of the
language makes it hard to change the implementation of only some signals
to use true asynchronous signaling. Therefore the implementations
of (almost) all signals have been changed.
Currently the following signals have been implemented as true
asynchronous signals:
- Message signals
- Exit signals
- Monitor signals
- Demonitor signals
- Monitor triggered signals (DOWN, CHANGE, etc)
- Link signals
- Unlink signals
- Group leader signals
All of the above already defined as asynchronous signals in the
language. The implementation of messages signals was quite
asynchronous to begin with, but had quite strict delivery constraints
due to the ordering guarantees of signals between a pair of processes.
The previously used message queue partitioned into two halves has been
replaced by a more general signal queue partitioned into three parts
that service all kinds of signals. More details regarding the signal
queue can be found in comments in the erl_proc_sig_queue.h file.
The monitor and link implementations have also been completely replaced
in order to fit the new asynchronous signaling implementation as good
as possible. More details regarding the new monitor and link
implementations can be found in the erl_monitor_link.h file.
|
|
|
|
When a ref is created before performing a receive that will only
receive message containing that ref, there is a compiler optimization
to avoid scanning messages that can't possible contain the newly
created ref.
Magnus Lång pointed out that the implementation of the optimization
is flawed. Exceptions or recursive calls could cause the receive
operation to scan the receive queue from a position beyond the expected
message (that is, the message containing the ref would never be
matched out). See the receive_opt_exception/1 and receive_opt_recursion/1
test cases in receive_SUITE.
It turns out that we can simplify the implementation of the
optimization while fixing the bug (suggested by Magnus Lång). We
actually don't need the c_p->msg.mark field. It is enough to have
c_p->msg.saved_pos; if it is non-zero, it is a valid position in the
message qeueue. All we need to do is to ensure that we clear
c_p->msg.saved_pos when a receive is exited normally or abnormally.
We can clear c_p->msg.saved_pos in JOIN_MESSAGE(), since it is called
both when leaving a receive because a message matched and because there
was a timeout and the 'after' clause was executed. In addition, we
need to clear c_p->msg.saved_pos when an exception is caught.
https://bugs.erlang.org/browse/ERL-511
|
|
In beam_emu, use "erts_gc_" for any function that does garbage
collection, as preparation for adding more sanity checks.
|
|
erlang:is_builtin(erlang, M, F) returns false for apply/2 and
yield/0. The documentation for erlang:is_builtin/3 says that it
returns true for BIFs that are implemented in C. apply/2 and
yield/0 are implemented in C (as BEAM instructions), and
therefore the correct return value is true.
Also see a similar argument that was made for apply/3 in the past:
http://erlang.org/pipermail/erlang-bugs/2015-October/005101.html
https://bugs.erlang.org/browse/ERL-500
|
|
|
|
On 64-bit machines where the C code is always at address below 4Gb,
pack one or more operands into the instruction word.
|
|
On a 64-bit machine, we only need 32 bits to store a pointer to
the C code that implements a BEAM instruction. Refactor the code
to only use the lower 32 bits of each instruction word, and take
care to preserve the high 32 bits.
|
|
Add the CHECK_ALIGNED() macro that can be used for testing that
the storage destination is word-aligned.
|
|
process_main() is already too big.
|
|
Introduce the IsOpCode() macro that can be used to compare
instructions.
|
|
Consider the types in the code below:
BeamInstr* I;
.
.
.
BeamInstr* next;
next = (BeamInstr *) *I;
Goto(next);
This is illogical. If 'I' points to a BeamInstr, then 'next' should
be a BeamInstr, not a pointer to a BeamInstr. The Goto() macros does
not require a pointer, because it will cast its argument to a void*
anyway.
Therefore, this code example can be simplified to:
BeamInstr* I;
.
.
.
BeamInstr next;
next = *I;
Goto(next);
Similarly, we can remove the casts in the macros when NO_JUMP_TABLE
is defined.
|
|
The BeamOp() macro in erl_vm.h is clumsy to use. All users
cast the return value to BeamInstr.
Define new macros that are easier to use. In the future,
we might want to pack an operand into the same word as
the pointer to the instruction, so we will define two macros.
BeamIsOpCode() is used to rewrite code like this:
if (Instr == (BeamInstr) BeamOp(op_i_func_info_IaaI) {
...
}
to:
if (BeamIsOpCode(Instr, op_i_func_info_IaaI)) {
...
}
BeamOpCodeAddr(op_apply_bif) is used when we need the address
for an instruction.
Also elimiminate the global variables em_* in beam_emu.c.
They are not really needed. Use the BeamOpCodeAddr() macro
instead.
|
|
The inconsistent order has annoyed me for a long time.
While at it, also remove the unecessary definition of LabelAddr() if
NO_JUMP_TABLE is defined.
|
|
* lukas/erts/remove-dirty-scheduler-defines/OTP-14613:
erts: Remove possibility to disable dirty schedulers
|
|
|
|
|
|
|
|
Besides being noisy, they were already defined by a global Unix-
specific header, causing the Windows build to fail if one forgot to
define them.
|
|
|
|
We don't need to pass x(0), x(1), and x(2) because they
can already be found in the register array.
|
|
We don't need to pass x(0), x(1), and x(2) because they
can already be found in the register array.
|
|
Keep frequently used variables in machine registers.
|
|
The bit syntax instructions are mixed among other instructions
in beam_hot.h and beam_cold.h.
Introduce a new hotness level called '%warm' with is associated
file beam_warm.h. Mark all bit syntax instructions as '%warm'.
|
|
The beam_instrs.h file serves no useful purpose. Put the
instructions in beam_hot.h instead.
|
|
The type 'd' could be used both for destination registers and
source register.
Restrict the 'd' type to only be used for destinations, and
introduce the new 'S' type to be used when a source must be
a register.
|
|
The packer had several bugs and limitations. For instance, on
a 32-bit Erlang virtual machine it would gladly pack three
't' values into one word even though it would be not safe.
The rewritten version will be more careful how much it packs
into each word. It will also be able to do packing for more
instructions.
|
|
Having the helper functions for map update knowing all the details
of operands for the instruction will make it difficult to make
improvements such as better packing.
|
|
|
|
The instruction put_map_assoc/5 (used for updating a map) has a
failure operand, but it can't actually fail provided that its "map"
argument is a map.
The following code:
M#{key=>value}.
will be compiled to:
{test,is_map,{f,3},[{x,0}]}.
{line,[...]}.
{put_map_assoc,{f,0},{x,0},{x,0},1,{list,[{atom,key},{atom,value}]}}.
return.
{label,3}.
%% Code that produces a 'badmap' exception follows.
Because of the is_map instruction, {x,0} always contains a map when
the put_map_assoc instruction is executed. Therefore we can remove
the failure operand. That will save one word, and also eliminate
two tests at run-time.
The only problem is that the compiler in OTP 17 did not emit a
is_map instruction before the put_map_assoc instruction. Therefore,
we must add an instruction that tests for a map if the code was
compiled with the OTP 17 compiler.
Unfortunately, there is no safe and relatively easy way to known that
the OTP 17 compiler was used, so we will check whether a compiler
before OTP 20 was used. OTP 20 introduced a new chunk type for atoms,
which is trivial to check.
|
|
|
|
Eliminate the need to write pre-processor macros for each instruction.
Instead allow the implementation of instruction to be written in
C directly in the .tab files. Rewrite all existing macros in this
way and remove the %macro directive.
|
|
|
|
This refactor was done using the unifdef tool like this:
for file in $(find erts/ -name *.[ch]); do unifdef -t -f defile -o $file $file; done
where defile contained:
#define ERTS_SMP 1
#define USE_THREADS 1
#define DDLL_SMP 1
#define ERTS_HAVE_SMP_EMU 1
#define SMP 1
#define ERL_BITS_REENTRANT 1
#define ERTS_USE_ASYNC_READY_Q 1
#define FDBLOCK 1
#undef ERTS_POLL_NEED_ASYNC_INTERRUPT_SUPPORT
#define ERTS_POLL_ASYNC_INTERRUPT_SUPPORT 0
#define ERTS_POLL_USE_WAKEUP_PIPE 1
#define ERTS_POLL_USE_UPDATE_REQUESTS_QUEUE 1
#undef ERTS_HAVE_PLAIN_EMU
#undef ERTS_SIGNAL_STATE
|
|
|
|
Add stacktrace entries to BIF calls from emulator
|
|
The goal of this change is to improve debugging of
emulator calls. For example, the following code
rem(1, y)
will error with atom `badarith` when y is 0 and the
stacktrace has no entry for `erlang:rem/2`, making
such cases very hard to debug. This patch makes it
so the stacktrace includes `erlang:rem(1, 0)`.
The following emulator BIFs have been changed:
* band/2
* bnot/1
* bor/2
* bsl/2
* bsr/2
* bxor/2
* div/2
* element/2
* int_div/2
* rem/2
* sminus/2
* splus/2
* stimes/2
|
|
Introduce new_map_lit operation in the loader
OTP-14502
|
|
Take advantage of the fact that small maps have a tuple for keys.
When new map is constructed and all keys are literals, we can construct
the entire keys tuple as a literal.
This should reduce the memory of maps created with literal keys almost by half,
since they all can share the same keys tuple.
|
|
Make tuple calls opt-in
OTP-14497
|
|
Tuple calls is the ability to invoke a function on a tuple
as first argument:
1> Var = dict:new().
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}
2> Var:size().
0
This behaviour is considered by most to be undesired and confusing,
especially when it comes to errors. For example, imagine you invoke
"Mod:new()" where a Mod is an atom and you accidentally pass {ok, dict}.
It raises:
{undef,[{ok,new,[{ok,dict}],[]},...]}
As it attempts to invoke ok:new/1, which is really hard to debug
as there is no call to new/1 on the source code.
Furthemore, this behaviour is implemented at the VM level, which
imposes such semantics on all languages running on BEAM.
Since we cannot remove the behaviour above, this proposal makes the
behaviour opt-in with a compiler flag:
-compile(tuple_calls).
This means that, if a codebase relies on this functionality, they
can keep compatibility by adding configuring their build tool to
always use the 'tuple_calls' flag or explicitly on each module.
As long as the compile attribute above is listed, the codebase will
work on old and new Erlang versions alike. The only downside of the
current implementation is that modules compiled on OTP 20 that rely
on 'tuple_calls' will have to be recompiled to run with 'tuple_calls'
on OTP 21+.
|
|
|
|
Bit syntax instructions never store their result in a Y register.
Therefore, change the bit syntax instructions to use 'x' as the
destination instead of 'd'. That will simplify the code that stores
the result, and will be a slight reduction in code size and execution
time.
|