Age | Commit message (Collapse) | Author |
|
|
|
Communication between Erlang processes has conceptually always been
performed through asynchronous signaling. The runtime system
implementation has however previously preformed most operation
synchronously. In a system with only one true thread of execution, this
is not problematic (often the opposite). In a system with multiple threads
of execution (as current runtime system implementation with SMP support)
it becomes problematic. This since it often involves locking of structures
when updating them which in turn cause resource contention. Utilizing
true asynchronous communication often avoids these resource contention
issues.
The case that triggered this change was contention on the link lock due
to frequent updates of the monitor trees during communication with a
frequently used server. The signal order delivery guarantees of the
language makes it hard to change the implementation of only some signals
to use true asynchronous signaling. Therefore the implementations
of (almost) all signals have been changed.
Currently the following signals have been implemented as true
asynchronous signals:
- Message signals
- Exit signals
- Monitor signals
- Demonitor signals
- Monitor triggered signals (DOWN, CHANGE, etc)
- Link signals
- Unlink signals
- Group leader signals
All of the above already defined as asynchronous signals in the
language. The implementation of messages signals was quite
asynchronous to begin with, but had quite strict delivery constraints
due to the ordering guarantees of signals between a pair of processes.
The previously used message queue partitioned into two halves has been
replaced by a more general signal queue partitioned into three parts
that service all kinds of signals. More details regarding the signal
queue can be found in comments in the erl_proc_sig_queue.h file.
The monitor and link implementations have also been completely replaced
in order to fit the new asynchronous signaling implementation as good
as possible. More details regarding the new monitor and link
implementations can be found in the erl_monitor_link.h file.
|
|
* john/erts/tuple-arityval-fixes/OTP-14963/ERL-577:
Make doc entry for maximum tuple size reflect reality
Assert that sz <= MAX_ARITYVAL in make_arityval(sz)
|
|
|
|
Literal tags are used by the VM as an alternative to reserving a large
virtual memory space in order to be able to quickly identify which terms
are literals. The use of literal tags harms performance, but is useful
to support systems where allocating a large amount of virtual memory is
not an option.
|
|
|
|
For a long time, there has been the two macros IS_SSMALL() and
MY_IS_SSMALL() that do exactly the same thing.
There should only be one, and it should be called IS_SSMALL().
However, we must decide which implementation to use. When
MY_IS_SSMALL() was introduced a long time ago, it was the most
efficient. In a modern C compiler, there might not be any
difference.
To find out, I used the following small C program to examine
the code generation:
#include <stdio.h>
typedef unsigned int Uint32;
typedef unsigned long Uint64;
typedef long Sint;
#define SWORD_CONSTANT(Const) Const##L
#define SMALL_BITS (64-4)
#define MAX_SMALL ((SWORD_CONSTANT(1) << (SMALL_BITS-1))-1)
#define MIN_SMALL (-(SWORD_CONSTANT(1) << (SMALL_BITS-1)))
#define MY_IS_SSMALL32(x) (((Uint32) ((((x)) >> (SMALL_BITS-1)) + 1)) < 2)
#define MY_IS_SSMALL64(x) (((Uint64) ((((x)) >> (SMALL_BITS-1)) + 1)) < 2)
#define MY_IS_SSMALL(x) (sizeof(x) == sizeof(Uint32) ? MY_IS_SSMALL32(x) : MY_IS_SSMALL64(x))
#define IS_SSMALL(x) (((x) >= MIN_SMALL) && ((x) <= MAX_SMALL))
void original(Sint n)
{
if (IS_SSMALL(n)) {
printf("yes\n");
}
}
void enhanced(Sint n)
{
if (MY_IS_SSMALL(n)) {
printf("yes\n");
}
}
gcc 7.2 produced the following code for the original() function:
.LC0:
.string "yes"
original(long):
movabs rax, 576460752303423488
add rdi, rax
movabs rax, 1152921504606846975
cmp rdi, rax
jbe .L4
rep ret
.L4:
mov edi, OFFSET FLAT:.LC0
jmp puts
clang 5.0.0 produced the following code which is slightly better:
original(long):
movabs rax, 576460752303423488
add rax, rdi
shr rax, 60
jne .LBB0_1
mov edi, .Lstr
jmp puts # TAILCALL
.LBB0_1:
ret
.Lstr:
.asciz "yes"
However, in the context of beam_emu.c, clang could produce
similar to what gcc produced.
gcc 7.2 produced the following code when MY_IS_SSMALL() was used:
.LC0:
.string "yes"
enhanced(long):
sar rdi, 59
add rdi, 1
cmp rdi, 1
jbe .L4
rep ret
.L4:
mov edi, OFFSET FLAT:.LC0
jmp puts
clang produced similar code.
This code seems to be the cheapest. There are four instructions, and
there is no loading of huge integer constants.
|
|
|
|
|
|
Magic references are *intentionally* indistinguishable from ordinary
references for the Erlang software. Magic references do not change
the language, and are intended as a pure runtime internal optimization.
An ordinary reference is typically used as a key in some table. A
magic reference has a direct pointer to a reference counted magic
binary. This makes it possible to implement various things without
having to do lookups in a table, but instead access the data directly.
Besides very fast lookups this can also improve scalability by
removing a potentially contended table. A couple of examples of
planned future usage of magic references are ETS table identifiers,
and BIF timer identifiers.
Besides future optimizations using magic references it should also
be possible to replace the exposed magic binary cludge with magic
references. That is, magic binaries that are exposed as empty
binaries to the Erlang software.
|
|
* henrik/update-copyrightyear:
update copyright-year
|
|
from future nodes.
|
|
The tag_val_def function was called and multiple switch statements
had to be traversed in term.c, and then a big switch in the calling
code to branch on the term types. By inlining the switches are
merged by the compiler and a lot fewer branches have to be taken.
Benchmarks show that this increases performance of enc_term by as
much as 10%.
|
|
|
|
* rickard/ohmq/OTP-13047:
Fragmented young heap generation and off_heap_message_queue option
Refactor GC
Introduce literal tag
Conflicts:
erts/doc/src/erlang.xml
erts/emulator/beam/erl_gc.c
|
|
|
|
|
|
Conflicts:
erts/emulator/beam/erl_printf_term.c
erts/emulator/beam/erl_term.c
erts/emulator/beam/utils.c
|
|
If the process stack contained a match state
the print function would crash the vm as it was not
recognized by tag_val_def().
Add new MATCHSTATE_DEF returned by tag_val_def().
All other callers either ignore it or has a default
clause to handle invalid terms.
|
|
Consider the try_case_end instruction:
try_case_end s
The 's' operand type means that the operand can either be a
literal of one of the types atom, integer, or empty list, or
a register. That worked well before R12. In R12 additional
types of literals where introduced. Because of way the
overloading was done, an 's' operand cannot handle the
new types of literals. Therefore, code such as the following
is necessary in ops.tab to avoid giving an 's' operand a
literal:
try_case_end Literal=q => move Literal x | try_case_end x
While this work, it is error-prone in that it is easy to
forget to add that kind of rule. It would also be complicated
in case we wanted to introduce a new kind of addition operator
such as:
i_plus jssd
Since there are two 's' operands, two scratch registers and
two 'move' instructions would be needed.
Therefore, we'll need to find a smarter way to find tag
register operands. We will overload the pid and port tags
for X and Y register, respectively. That works because pids
and port are immediate values (fit in one word), and there
are no literals for pids and ports.
|
|
|
|
|
|
|
|
Keep is_same macro for readability but remove base pointers.
|
|
* Removed COMPRESS_POINTER and EXPAND_POINTER
|
|
|
|
|
|
|
|
|
|
For consistency with other data types, add the is_not_map() macro.
|
|
|
|
|
|
as shorthand for is_flatmap || is_hashmap
|
|
|
|
MAP_DEF and HASHMAP_DEF must have adjacent values
|
|
|
|
|
|
Called with a signed int 'sz' argument on 64 bit
would cause sign extension 'sz' was larger than 33554431.
|
|
Conflicts:
erts/emulator/Makefile.in
erts/emulator/beam/bif.tab
erts/emulator/beam/erl_gc.c
erts/emulator/beam/erl_gc.h
erts/emulator/beam/erl_printf_term.c
erts/emulator/beam/erl_term.c
erts/emulator/beam/erl_term.h
|
|
to not mistake a map for an external term (pid, port or ref).
|
|
The subtag chosen for maps breaks crucial assumptions in other parts of
the system, in particular the native code generation scheme used in the
HiPE compiler. Therefore it is better to use 'the other' unused subtag.
The main change here is the use of 0xF subtag for maps instead of 0xB.
The rest of the differences are changes to and additions of comments.
One more comment from my part.
I noticed that the file contains the following two lines:
The comment of the second line is most certainly a copy and paste error
and should be appropriately fixed by the OTP developers. More importantly,
it would be great if that subtag (0x7) turned out to be unused and could
be used for maps instead of the 0xF one. This might turn out very handy
in the future. I can elaborate on this, if needed.
|
|
|
|
for the temporary conversion from float to big.
Preparation for coming bugfix of 'big_buf' array size.
|
|
|
|
* egil/enforce-tuple-specification-size/OTP-10633:
erts: Use memcpy instead of while in setelement/3
test: Refactor away ?line macro in tuple_SUITE
erts: Enforce tuple max size on BIFs
erts: Define max tuple size to 24 bits
|
|
|
|
|
|
Erlang specification 4.7.3 defines max tuple size to 65535 elements
It is now defined to 16777215 elements (24 bits)
|
|
Can still not setup -a, but cerl works.
|
|
Still does not run, just compiles.
|