Age | Commit message | Author |
|
Id: OTP-8912
This patch creates a new family of flags with the "+z" prefix. It
further creates a new configuration option called "dbbl" (formed from
the initial letters of the name dist_buf_busy_limit). Example usage of this
flag would be "+zdbbl 1048576".
This patch creates an adjustable buffer limit for the amount of data
that may be buffered by the erlang distribution code (in dist.c
specifically). Before this patch, this hard-coded constant was used:
#define ERTS_DE_BUSY_LIMIT (128*1024)
When large binaries are transmitted between nodes (or simply a lot of
medium-sized binaries), it is very easy to hit the old 128KB limit.
Processes that use the erlang:system_monitor() BIF to monitor system
events can be spammed by {monitor, busy_dist_port, ...} message tuples
at rates of tens to even hundreds of messages/second.
A larger buffer limit will allow processes to buffer more outgoing
messages over the distribution. When the buffer limit has been
reached, sending processes will be suspended until the buffer size has
shrunk. The buffer limit is per distribution channel. A higher limit
will give lower latency and higher throughput at the expense of
higher memory usage.
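The busy-limit behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in dist.c: the `dist_channel` struct, its fields, and the `channel_*` helpers are hypothetical names invented for the example, and the real implementation may check the limit at a different point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Former hard-coded limit quoted in the commit message. */
#define OLD_BUSY_LIMIT (128*1024)

/* Hypothetical per-channel state; field names are illustrative only. */
typedef struct {
    size_t queued;      /* bytes currently buffered for this channel */
    size_t busy_limit;  /* configurable limit (the role played by +zdbbl) */
} dist_channel;

/* A channel is "busy" once its buffered data reaches the limit. */
static bool channel_is_busy(const dist_channel *ch)
{
    assert(ch != NULL);
    return ch->queued >= ch->busy_limit;
}

/* Buffer nbytes; returns true if the sender may keep running,
 * false if it should be suspended until the buffer shrinks. */
static bool channel_enqueue(dist_channel *ch, size_t nbytes)
{
    assert(ch != NULL);
    ch->queued += nbytes;
    return !channel_is_busy(ch);
}

/* Account for nbytes leaving the buffer; returns true if
 * suspended senders could be resumed. */
static bool channel_drain(dist_channel *ch, size_t nbytes)
{
    assert(ch != NULL);
    ch->queued = (nbytes > ch->queued) ? 0 : ch->queued - nbytes;
    return !channel_is_busy(ch);
}
```

With the old 128KB limit, two 100KB messages are enough to mark the channel busy; with a larger per-channel limit the same traffic is buffered without suspending the sender, at the cost of the extra memory.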
A variation of this patch has been in commercial production use in at
least two companies that the author is aware of. Larger buffer values
can reduce the number of {monitor, busy_dist_port, ...} system
messages drastically, lower overall messaging latencies, and prevent
false timeouts and 'nodedown' messages in extremely busy Mnesia systems.
Test suite: there are two tests:
a. In erlexec_SUITE.erl to test basic set & get of the value
b. In distribution_SUITE.erl, to verify that setting +zdbbl very
low will actually change behavior.
|
|
* rickard/halfword-bug:
Fix newly introduced halfword emulator bugs
|
|
|
|
* mp/fix-hipe-write:
fix 64-bit writes to 32-bit struct field in HiPE runtime
OTP-8877
|
|
Make sure that an update to erts/emulator/tools/make_tables will
force all generated files to be re-generated.
|
|
It seems to work (at least on a little-endian architecture)
by sheer luck.
|
|
In the HiPE part of the runtime system's Process struct
there is a state field which is 32 bits wide even on 64-bit
machines.
There is a single instruction in the HiPE AMD64 runtime
where this field is incorrectly written with a 64-bit store.
Luckily the extraneous 32 bits are written as zeros to 4
bytes of tail-padding at the end of the struct, so nothing
should have broken because of this.
The same bug exists in the HiPE PowerPC64 runtime (in
development), but on the big-endian PPC64 the effect is
to write the actual value to the tail-padding and zero
to the struct field, which potentially breaks TRAPs from
BIFs (depending on BIF arities and how many parameter
registers the runtime has been configured to use).
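The endianness effect described above can be reproduced portably by simulating the store on a byte buffer. This is only a sketch: the 8-byte buffer stands in for the 32-bit field plus its 4 bytes of tail-padding, and all function names are hypothetical, not taken from the HiPE runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write v as 8 bytes in the requested byte order. */
static void store64(unsigned char *buf, uint64_t v, int big_endian)
{
    assert(buf != NULL);
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? (56 - 8 * i) : (8 * i);
        buf[i] = (unsigned char)(v >> shift);
    }
}

/* Read 4 bytes in the requested byte order. */
static uint32_t load32(const unsigned char *buf, int big_endian)
{
    uint32_t r = 0;
    assert(buf != NULL);
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? (24 - 8 * i) : (8 * i);
        r |= (uint32_t)buf[i] << shift;
    }
    return r;
}

/* Simulate a 64-bit store aimed at a 32-bit field that is followed by
 * 4 bytes of padding. Returns what the field holds afterwards and
 * reports what ended up in the padding. */
static uint32_t field_after_oversized_store(uint32_t value, int big_endian,
                                            uint32_t *padding_out)
{
    unsigned char mem[8];
    memset(mem, 0xAA, sizeof mem);            /* pre-existing junk */
    store64(mem, (uint64_t)value, big_endian);
    if (padding_out != NULL)
        *padding_out = load32(mem + 4, big_endian);
    return load32(mem, big_endian);
}
```

In the little-endian case the field receives the value and the padding receives zeros; in the big-endian case the two are swapped, which matches the PPC64 symptom described in the message.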
Thanks to Paul Guyot for noticing the oversized write on AMD64.
|
|
In a70159b33f20a26b2674d7cf777617c5f0261a5c, the _VOID_ macro
was eliminated, but one use of it inside an "#ifdef DEBUG"
was forgotten.
|
|
* rickard/timer-wheel/OTP-8835:
Use mutex instead of rwlock
|
|
|
|
* pan/binary-bif-valgrind-leak/OTP-8823:
Teach erl_bif_binary not to leak memory by doing malloc(0)
|
|
Use mutex instead of rwlock since the read lock is more or less
unused and it can be quite contended.
|
|
* bjorn/http-packet-error/OTP-8831:
Make gen_tcp:recv/2 consistent with ssl:recv/2
|
|
* rickard/rwmtx-spin/OTP-8819:
Fix deadlock in reader optimized rwlock implementation
Remove unused variables
Increase spincount with many schedulers
Re-enable spin wait on ethreads rwlocks
|
|
When the HTTP packet mode has been enabled for a socket,
the ssl and gen_tcp modules have different error indications
when there is an error while parsing the HTTP header:
ssl:recv(SSLSocket, 0) -> {ok, {http_error, _Str}}
gen_tcp:recv(Socket, 0) -> {error, {http_error, _Str}}
We have decided to change gen_tcp:recv/2 to behave the same
way as ssl:recv/2. That means that there will always be
an ok tuple if data could be successfully read from the socket,
and an error tuple if there was a read error at the socket level.
|
|
Spin wait on most ethread rwlocks used by the runtime system was
unintentionally disabled during development. Spin wait has now been enabled
again. This bug appeared in commit 59ee2a593090e7d53c97ceba63cbd300d1b9657e,
i.e., it has not been seen in any released versions.
|
|
|
|
* pg/fix-hipe-crash-in-gc_after_bif:
Fix call to erts_gc_after_bif_call in hipe glue
|
|
* mk/net-dragonfly-bsd-patches:
Remove unused variables
Use proper install method
Add support for DragonFly BSD
Add support for NetBSD
|
|
* ms/inet-bug-fixes:
inet: support retrieving MAC address on BSD
inet: fix getservbyname buffer overflow
inet: fix ifr_name buffer overflow
inet: null terminate ifr_name buffer
OTP-8816
|
|
R12B-0 changed the signature of erts_gc_after_bif_call and it now
takes 4 parameters instead of 2 in R11B-5. Yet, the glue code was not
updated accordingly. As a result, the function erts_gc_after_bif_call
was called with garbage and would randomly cause a crash later in the
garbage collector code.
The fix consists in passing NULL and 0 for the third and fourth
parameters, since there is no term to add to rootset, recovering the
behaviour of R11B-5
(see otp_src_R11B-5/erts/emulator/beam/erl_gc.c, line 314).
(Includes assembly language fixes and code style improvements
suggested by Mikael Pettersson.)
|
|
The make_term_n function in nif_SUITE.c created resources that never
got released, creating valgrind memcheck Definitely Lost warnings.
|
|
It is now possible to adjust the scheduler wakeup threshold at system boot.
For more information see the `+swt' command line argument of `erl'.
|
|
Lower the scheduler wakeup threshold since schedulers aren't spuriously
woken as before (since commit 59ee2a593090e7d53c97ceba63cbd300d1b9657e).
|
|
On systems supporting getaddrinfo(), support looking up the MAC
address from inet:ifget/2. The results have the same quirks as with
Linux: if the MAC address is longer than 6 bytes (e.g., fw0 under
Mac OS X), the address is truncated; if the interface does not have
a MAC address (e.g., lo0), an address consisting of 0's is returned.
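The truncate-or-zero behaviour described above amounts to something like the following sketch. The helper `copy_mac` and the `MAC_LEN` constant are hypothetical names for illustration; the actual driver code is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6  /* length of the address returned to the caller */

/* Copy a link-layer address into a fixed 6-byte buffer:
 * longer addresses are truncated, missing ones yield all zeros. */
static void copy_mac(unsigned char dst[MAC_LEN],
                     const unsigned char *src, size_t src_len)
{
    assert(dst != NULL);
    memset(dst, 0, MAC_LEN);
    if (src != NULL && src_len > 0)
        memcpy(dst, src, src_len < MAC_LEN ? src_len : MAC_LEN);
}
```

An 8-byte address (as on a FireWire interface) comes back as its first 6 bytes, and an interface without a link-layer address comes back as six zero bytes.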
|
|
Added erlang:system_info(build_type) which makes it
easier to choose drivers, NIF libraries, etc. based
on the build type of the runtime system.
|
|
* rickard/cpu-info-testcase/OTP-8765:
Fix crash when calling erlang:system_info(update_cpu_info)
Add testcase for erlang:system_info(update_cpu_info)
|
|
* rani/sctp-sndrcvinfo/OTP-8795:
Fix xfer_active close expection for Solaris behaviour
Keep default #sctp_sndrcvinfo{} fields on gen_sctp:send/4
Fill in sinfo_assoc_id in struct sctp_sndrcvinfo for getopt()
Conflicts:
lib/kernel/test/gen_sctp_SUITE.erl
|
|
* rani/sctp-linger-bugfix/OTP-8726:
Fix SCTP linger option
|
|
* pg/fix-segfault-on-crash_dump-with-hipe:
Fix segmentation fault when dumping the crash log with hipe enabled and natively compiled modules
OTP-8801
|
|
* mp/fix-hipe-on_load_crash:
fix native code crash when calling unloaded module with on_load function
OTP-8799
|
|
* mp/robustify-hipe_bifs_get_hrvtime:
robustify hipe_bifs:get_hrvtime/0
OTP-8798
|
|
Calling erlang:system_info(update_cpu_info) on platforms where no
CPU topology was found could result in a crash if other CPU
information had changed. This bug was introduced in the 'dev'
branch before R14B (commit 1b273b618002d65159453fdfb9520a9476e4423a).
That is, the bug has never been seen in a released runtime system.
|
|
|
|
* rickard/cpu-info-unbind/8765:
Fix erroneous error reports about unbind failure
|
|
On platforms where binding of schedulers is not supported, numerous error
reports of the form "Scheduler <N> failed to unbind from cpu -1: enotsup"
were erroneously issued. This bug was introduced in the 'dev' branch
before R14B (commit 1b273b618002d65159453fdfb9520a9476e4423a). That is,
the bug has never been seen in a released runtime system.
Reported-By: Tuncer Ayaz
|
|
The assoc_id field was uninitialized causing random answers.
|
|
inet:setopts(S, [{linger,{true,2}}]) returned {error,einval} for
SCTP sockets. The inet_drv had a bug when checking the option size.
|
|
* pan/ets_binary_overhead/OTP-8762:
Remove binary overhead counter from ets objects
|
|
* sverker/NIF-64bit-integers/OTP-8746:
Make windows 64bit types be declared more consistently
Teach Windows about the int64 functions
NIF doc official support note
NIF 64-bit integer support
|
|
* egil/R14A/binary-gc-wrap/OTP-8730:
Increase vheap counter to Uint64
Fix wrapping in next vheap calculation
|
|
* pan/local_univ_time_bsd/OTP-8580:
Teach erl_time_sup to handle timezones w/o DST on FreeBSD as on other platforms
|
|
* pan/list_to_float/OTP-7178:
Teach Unix sys_float.c to ignore underflow in list_to_float and return 0.0
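One way to ignore underflow while still rejecting overflow is to inspect the result of strtod() when it reports ERANGE. The sketch below is an assumption about the general approach, not the code in sys_float.c; `parse_float` is a hypothetical helper.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse s as a double. Underflow is not an error and yields 0.0;
 * overflow and malformed input are rejected. */
static bool parse_float(const char *s, double *out)
{
    char *end;
    double v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtod(s, &end);
    if (end == s || *end != '\0')
        return false;                 /* nothing parsed, or trailing junk */
    if (errno == ERANGE) {
        if (v == HUGE_VAL || v == -HUGE_VAL)
            return false;             /* overflow stays an error */
        v = 0.0;                      /* underflow: return zero */
    }
    *out = v;
    return true;
}
```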
|
|
* rickard/cpu-info/OTP-8765:
Initialize environment functionality after thread lib
Fix faulty assertions
Implement automatic detection of CPU topology on Windows
Make it possible to reread and update detected CPU information
|
|
An assertion failed due to the thread library not being
initialized when initializing an rwmutex. This was however
harmless.
|
|
natively compiled modules
When loading a module, code area is allocated and header fields
code[MI_ATTR_SIZE] as well as code[MI_COMPILE_SIZE] are not
cleared. They are only set later when freeze_code is called, if the
module has attributes and compilation info, which should always be the
case. When loading a native module (as a stub), code is allocated as
well (to contain the stub functions), and code[MI_ATTR_SIZE] as well
as code[MI_COMPILE_SIZE] are not cleared either. Yet, freeze_code will
not be called (since there is no threaded code to freeze for native
modules), and as a result, these header fields are never set. They can
contain any garbage.
Later on, when writing a crash dump, the attributes and compilation
info are dumped, using these particular header fields. If the size is
garbage, the dump attribute function will iterate until it segfaults.
The fix consists in clearing code[MI_ATTR_SIZE] and
code[MI_COMPILE_SIZE] in both cases (threaded code and native
code). Even if non-native modules should contain code and attributes
and therefore the values code[MI_ATTR_SIZE] and code[MI_COMPILE_SIZE]
should be set by freeze_code, it seems cleaner and easier to maintain
to clear the whole header in the "initialize code area"
section. As a result, crash dump will not segfault. Instead, native
modules will have an empty attributes and compilation info section in
the crash dump.
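The idea of the fix, clearing the header when the code area is allocated so that a later dump never sees garbage sizes, can be sketched as follows. The index names `HDR_ATTR_SIZE` and `HDR_COMPILE_SIZE` and both functions are hypothetical stand-ins; the real loader layout and index values are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header layout, for illustration only. */
enum { HDR_ATTR_SIZE = 0, HDR_COMPILE_SIZE = 1, HDR_WORDS = 2 };

/* Allocate a code area and clear the whole header up front, so the
 * size fields are zero even if nothing fills them in later. */
static unsigned long *alloc_code_area(size_t body_words)
{
    unsigned long *code = malloc((HDR_WORDS + body_words) * sizeof *code);
    if (code != NULL)
        memset(code, 0, HDR_WORDS * sizeof *code);
    return code;
}

/* Stand-in for the dump routine: it trusts the size field and visits
 * that many bytes. Returns the number of bytes visited. */
static size_t dump_attributes(const unsigned long *code)
{
    size_t visited = 0;
    assert(code != NULL);
    for (unsigned long i = 0; i < code[HDR_ATTR_SIZE]; i++)
        visited++;
    return visited;
}
```

With the header cleared, a stub that never has its sizes set simply produces an empty section instead of walking off the end of the allocation.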
|
|
As reported in erlang-bugs, the following sequence of events crashes the VM:
1. Module M1 is loaded and in native mode.
2. Module M2 is not loaded, in emulated mode, and has an on_load function.
3. M1 calls some function in M2. This works.
4. M1 again calls some function in M2. This segfaults.
The reason for the crash is that when the beam loader fixes up export
entries after a successful on_load function call, it erroneously clears
the ->code[3] field in that module's export entries. This is redundant
(no code in beam relies on ->code[3] being NULL), inconsistent with
modules without on_load functions (there ->code[3] remains a valid beam
instruction after the module is loaded), and breaks native code which needs
the old ->address value in an export entry to remain valid after a module
load step (before the load ->address points to ->code[3], after the load
->address points to the real code but uses of the old ->address value
remain so ->code[3] must remain valid).
Thus the fix for the crash is to simply not clear ->code[3].
This patch fixes R14A and should also fix R13B04.
(There does exist a performance bug in this area, but it is unrelated
to the on_load feature so will be fixed separately.)
|
|
The HiPE runtime system has a hipe_bifs:get_hrvtime/0 BIF which
mimics the non-standard gethrvtime() C API. It's possible to
configure the implementation to use the "perfctr" Linux kernel
extension for performance-monitoring counters, in which case
get_hrvtime has very high precision and low overhead. Otherwise
it uses the same code as statistics(runtime).
This patch changes the get_hrvtime implementation to do a runtime
check to see if perfctr is available, and to use the fallback code
rather than returning a dummy value if perfctr is unavailable,
which is common.
The current dummy value return is a bug. It messes up the API
and either breaks callers (they get badarg when trying to compute
on the value) or forces them to implement checks and fallbacks
themselves. Timing code in HiPE's test suites and benchmarks
is known to be affected.
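The runtime check with fallback described above follows a common pattern, sketched here under stated assumptions: `perfctr_available` and `perfctr_read_ms` are hypothetical placeholders (this sketch always reports the counter as unavailable), and the standard clock() function stands in for the fallback CPU-time source.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical probe; a real one would ask the kernel extension. */
static bool perfctr_available(void)
{
    return false;
}

/* Hypothetical high-precision read; never reached in this sketch. */
static double perfctr_read_ms(void)
{
    return 0.0;
}

/* Fallback: process CPU time from the C library, in milliseconds. */
static double fallback_cpu_time_ms(void)
{
    clock_t c = clock();
    if (c == (clock_t)-1)
        return 0.0;               /* CPU time not available */
    return 1000.0 * (double)c / (double)CLOCKS_PER_SEC;
}

/* Always returns a usable number: the precise counter when present,
 * otherwise the fallback, never a dummy value. */
static double get_hrvtime_ms(void)
{
    double t = perfctr_available() ? perfctr_read_ms()
                                   : fallback_cpu_time_ms();
    assert(t >= 0.0);
    return t;
}
```

Callers can then do arithmetic on the result unconditionally, which is the property the dummy return value broke.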
|
|
* rickard/erts-poll-race/OTP-8773:
Fix race in erts_poll()
|
|
A race condition in erts_poll() could cause
delay of poll for I/O.
|