aboutsummaryrefslogtreecommitdiffstats
path: root/erts/emulator/drivers/common
AgeCommit message (Collapse)Author
2018-10-17Merge branch 'igor/tcp-nopush-ERL-698/OTP-15357' into maintJohn Högberg
* igor/tcp-nopush-ERL-698/OTP-15357: "cork" tcp socket around file:sendfile Add nopush TCP socket option
2018-10-12Merge branch 'maint-21' into maintRickard Green
* maint-21: Updated OTP version Prepare release erts: Fix UNC path handling on Windows erts: Fix a compiler warning eldap: Fix race at socket close Fix bug for sockopt pktoptions on BSD erts: Fix memory leak on file read errors
2018-10-11Add nopush TCP socket optionIgor Slepchin
This translates to TCP_CORK on Linux and TCP_NOPUSH on BSD. In effect, this acts as super-Nagle: no partial TCP segments are sent out until this option is turned off. Once turned off, all accumulated unsent data is sent out immediately. The latter is *not* the case on OSX, hence the implementation ignores "nopush" on OSX to reduce confusion.
2018-10-02Implement {netns,NS} option for inet:getifaddrs/1Raimo Niskanen
Also implement the same option for the legacy undocumented functions inet:getif/1,getiflist/1,ifget/2,ifset/2. The arity 1 functions had before this change got signatures that took a socket port that was used to do the needed syscall, so now the signature was extended to also take an option list with the only supported option {netns,Namespace}. The Socket argument variant remains unsupported. For inet:getifaddrs/1 the documentation file was changed to old style function name definition so be able to hide the Socket argument variant that is visible in the type spec. The arity 2 functions had got an option list as second argument. This list had to be partitioned into one list for the namespace option(s) and the other for the rest. The namespace option list was then fed to the already existing namespace support for socket opening, which places the socket in a namespace and hence made all these functions that in inet_drv.c used getsockopt() work without change. The functions that used getifaddrs() in inet_drv.c had to be changed in inet_drv.c to swap namespaces around the getifaddrs() syscall. This functionality was separated into a new function call_getifaddrs().
2018-10-01Fix bug for sockopt pktoptions on BSDRaimo Niskanen
The macros for the BSD style option names had accidentally wound up outside the option parsing loop, causing unclear behaviour and Valgrind errors.
2018-09-19Fix endianness bug for CMSG parsingRaimo Niskanen
2018-09-11Fix term buffer overflow bugRaimo Niskanen
2018-09-04Implement socket option recvtos and friendsRaimo Niskanen
Implement socket options recvtclass, recvtos, recvttl and pktoptions. Document the implemented socket options, new types and message formats. The options recvtclass, recvtos and recvttl are boolean options that when activated (true) for a socket will cause ancillary data to be received through recvmsg(). That is for packet oriented sockets (UDP and SCTP). The required options for this feature were recvtclass and recvtos, and recvttl was only added to test that the ancillary data parsing handled multiple data items in one message correctly. These options does not work on Windows since ancillary data is not handled by the Winsock2 API. For stream sockets (TCP) there is no clear connection between a received packet and what is returned when reading data from the socket, so recvmsg() is not useful. It is possible to get the same ancillary data through a getsockopt() call with the IPv6 socket option IPV6_PKTOPTIONS, on Linux named IPV6_2292PKTOPTIONS after the now obsoleted RFC where it originated. (unfortunately RFC 3542 that obsoletes it explicitly undefines this way to get packet ancillary data from a stream socket) Linux also has got a way to get packet ancillary data for IPv4 TCP sockets through a getsockopt() call with IP_PKTOPTIONS, which appears to be Linux specific. This implementation uses a flag field in the inet_drv.c socket internal data that records if any setsockopt() call with recvtclass, recvtos or recvttl (IPV6_RECVTCLASS, IP_RECVTOS or IP_RECVTTL) has been activated. If so recvmsg() is used instead of recvfrom(). Ancillary data is delivered to the application by a new return tuple format from gen_udp:recv/2,3 containing a list of ancillary data tuples [{tclass,TCLASS} | {tos,TOS} | {ttl,TTL}], as returned by recvmsg(). For a socket in active mode a new message format, containing the ancillary data list, delivers the data in the same way. For gen_sctp the ancillary data is delivered in the same way, except that the gen_sctp return tuple format already contained an ancillary data list so there are just more possible elements when using these socket options. Note that the active mode message format has got an extra tuple level for the ancillary data compared to what is now implemented gen_udp. The gen_sctp active mode format was considered to be the odd one - now all tuples containing ancillary data are flat, except for gen_sctp active mode. Note that testing has not shown that Linux SCTP sockets deliver any ancillary data for these socket options, so it is probably not implemented yet. Remains to be seen what FreeBSD does... For gen_tcp inet:getopts([pktoptions]) will deliver the latest received ancillary data for any activated socket option recvtclass, recvtos or recvttl, on platforms where IP_PKTOPTIONS is defined for an IPv4 socket, or where IPV6_PKTOPTIONS or IPV6_2292PKTOPTIONS is defined for an IPv6 socket. It will be delivered as a list of ancillary data items in the same way as for gen_udp (and gen_sctp). On some platforms, e.g the BSD:s, when you activate IP_RECVTOS you get ancillary data tagged IP_RECVTOS with the TOS value, but on Linux you get ancillary data tagged IP_TOS with the TOS value. Linux follows the style of RFC 2292, and the BSD:s use an older notion. For RFC 2292 that defines the IP_PKTOPTIONS socket option it is more logical to tag the items with the tag that is the item's, than with the tag that defines that you want the item. Therefore this implementation translates all BSD style ancillary data tags to the corresponding Linux style data tags, so the application will only see the tags 'tclass', 'tos' and 'ttl' on all platforms.
2018-07-30erts: Limit the automatic max buffer for UDP to 2^16Lukas Larsson
There is no reason to have a larger buffer than this as the recvmsg call will never return more data. OTP-15206
2018-07-16erts: Free udp buffer when getting EAGAINLukas Larsson
Only SCPT should keep the recv buffer when going into select. If UDP does it, it will result in many more memory allocations than there should be which can be very detrimental to performance.
2018-06-29Merge branch 'john/erts/inet-drv-race/OTP-15158/ERL-654' into maint-21Erlang/OTP
* john/erts/inet-drv-race/OTP-15158/ERL-654: Fix a race condition when generating async operation ids
2018-06-28Fix a race condition when generating async operation idsJohn Högberg
The counter used for generating async operation ids was a plain int shared between all ports, which was incorrect but mostly worked fine since the ids only had to be unique on a per-port basis. However, some compilers (notably GCC 8.1.1) generated code that assumed that this value didn't change between reads. Using a shortened version of enq_async_w_tmo as an example: int id = async_ref++; op->id = id; //A return id; //B In GCC 7 and earlier, `async_ref` would be read once and assigned to `id` before being incremented, which kept the values at A and B consistent. In GCC 8, `async_ref` was read when assigned at A and read again at B, and then incremented, which made them inconsistent if we raced with another port. This commit fixes the issue by removing `async_ref` altogether and replacing it with a per-port counter which makes it impossible to race with someone else.
2018-06-18Update copyright yearHenrik Nord
2018-03-29Merge branch 'lukas/erts/tcp_send_return_closed/OTP-15001'Lukas Larsson
* lukas/erts/tcp_send_return_closed/OTP-15001: erts: tcp send should return {error,closed}
2017-11-30Reimplement efile_drv as a dirty NIFJohn Högberg
This improves the latency of file operations as dirty schedulers are a bit more eager to run jobs than async threads, and use a single global queue rather than per-thread queues, eliminating the risk of a job stalling behind a long-running job on the same thread while other async threads sit idle. There's no such thing as a free lunch though; the lowered latency comes at the cost of increased busy-waiting which may have an adverse effect on some applications. This behavior can be tweaked with the +sbwt flag, but unfortunately it affects all types of schedulers and not just dirty ones. We plan to add type-specific flags at a later stage. sendfile has been moved to inet_drv to lessen the effect of a nasty race; the cooperation between inet_drv and efile has never been airtight and the socket dying at the wrong time (Regardless of reason) could result in fd aliasing. Moving it to the inet driver makes it impossible to trigger this by closing the socket in the middle of a sendfile operation, while still allowing it to be aborted -- something that can't be done if it stays in the file driver. The race still occurs if the controlling process dies in the short window between dispatching the sendfile operation and the dup(2) call in the driver, but it's much less likely to happen now. A proper fix is in the works. -- Notable functional differences: * The use_threads option for file:sendfile/5 no longer has any effect. * The file-specific DTrace probes have been removed. The same effect can be achieved with normal tracing together with the nif__entry/nif__return probes to track scheduling. -- OTP-14256
2017-11-01Merge pull request #1592 from falkevik/sctp_return_only_connect_errorsRaimo Niskanen
OTP-13760 SCTP connect could return error even if the connect is ongoing
2017-10-30erts: tcp send should return {error,closed}Lukas Larsson
In the scenario where a gen_tcp:recv/2 detected an error, the next gen_tcp:send should get a closed error and not a enotconn error as was the case before.
2017-10-25sctp: change connect to only return connect errorsJonas Falkevik
gen_sctp:connect_init/4 could return error even though the connect are in progress. I.e. connect on socket returned "in progress". The error can come from packet_inet_output() which is checking the socket for errors. The issue here is that from erlang you don't think the connect is still ongoing. But you will receive a sctp_assoc_change message later on.
2017-10-17Merge branch 'maint'Sverker Eriksson
2017-10-17Merge branch 'sverker/dist-send-noreply-opt/OTP-14689' into maintSverker Eriksson
* sverker/dist-send-noreply-opt/OTP-14689: erts: Improve distribution send operations
2017-09-27Merge branch 'kvakvs/zero-size-read_file/ERL-327/PR-1524/OTP-14637'Lukas Larsson
* kvakvs/zero-size-read_file/ERL-327/PR-1524/OTP-14637: erts: On zero-size files attempt to read until EOF
2017-09-27erts: On zero-size files attempt to read until EOFDmytro Lytovchenko
2017-09-20erts: Improve distribution send operationsSverker Eriksson
by ignoring reply earlier.
2017-09-06Merge branch 'maint' into john/erts/merge-zlib-and-vector-qJohn Högberg
2017-09-05Replace the zlib driver with a NIFJohn Högberg
All operations will now yield appropriately, allowing them to be used freely in concurrent applications. This commit also deprecates the functions listed below, although they won't raise deprecation warnings until OTP 21: zlib:adler32 zlib:crc32 zlib:inflateChunk zlib:getBufSize zlib:setBufSize The behavior of throwing an error when a dictionary is required for decompression has also been deprecated.
2017-07-27Merge branch 'maint'John Högberg
* maint: Updated OTP version Update release notes Update version numbers Fix doc for the 'quiet' option; it defaults to false asn1: Fix missing quotes of external encoding call Add a dedicated close function for TCP ports to prevent issues like ERL-430/448 Close TCP ports properly on send timeout erts: Add missing release note
2017-07-27Merge branch 'maint-20' into maintJohn Högberg
* maint-20: Updated OTP version Update release notes Update version numbers Fix doc for the 'quiet' option; it defaults to false asn1: Fix missing quotes of external encoding call Add a dedicated close function for TCP ports to prevent issues like ERL-430/448 Close TCP ports properly on send timeout erts: Add missing release note
2017-07-26Merge branch 'john/erts/fix-tcp-send-timeout/OTP-14509/ERL-448' into maint-20Erlang/OTP
* john/erts/fix-tcp-send-timeout/OTP-14509/ERL-448: Add a dedicated close function for TCP ports to prevent issues like ERL-430/448 Close TCP ports properly on send timeout
2017-07-17erts: Remove ERTS_SMP and USE_THREAD definesLukas Larsson
This refactor was done using the unifdef tool like this: for file in $(find erts/ -name *.[ch]); do unifdef -t -f defile -o $file $file; done where defile contained: #define ERTS_SMP 1 #define USE_THREADS 1 #define DDLL_SMP 1 #define ERTS_HAVE_SMP_EMU 1 #define SMP 1 #define ERL_BITS_REENTRANT 1 #define ERTS_USE_ASYNC_READY_Q 1 #define FDBLOCK 1 #undef ERTS_POLL_NEED_ASYNC_INTERRUPT_SUPPORT #define ERTS_POLL_ASYNC_INTERRUPT_SUPPORT 0 #define ERTS_POLL_USE_WAKEUP_PIPE 1 #define ERTS_POLL_USE_UPDATE_REQUESTS_QUEUE 1 #undef ERTS_HAVE_PLAIN_EMU #undef ERTS_SIGNAL_STATE
2017-07-10Add a dedicated close function for TCP ports to prevent issues like ERL-430/448John Högberg
2017-07-10Close TCP ports properly on send timeoutJohn Högberg
2017-07-06Merge branch 'john/erts/runtime-lcnt' into maintJohn Högberg
* john/erts/runtime-lcnt: Document rt_mask and add warnings about copy_save Add an emulator test suite for lock counting Break erts_debug:lock_counters/1 into separate BIFs Allow toggling lock counting at runtime Move lock flags to a common header Enable register_SUITE for lcnt builds Enable lcnt smoke test on all builds that have lcnt enabled Make lock counter info independent of the locks being counted OTP-14412 OTP-13170 OTP-14413
2017-07-06Allow toggling lock counting at runtimeJohn Högberg
The implementation is still hidden behind ERTS_ENABLE_LOCK_COUNT, and all categories are still enabled by default, but the actual counting can be toggled at will. OTP-13170
2017-06-30Merge branch 'john/erts/fix-port-leak/OTP-13939/ERL-193' into maint-20Erlang/OTP
* john/erts/fix-port-leak/OTP-13939/ERL-193: Add a testcase for OTP-13939/ERL-193 Mark socket disconnected on tcp_send_or_shutdown_error # Conflicts: # lib/kernel/test/gen_tcp_misc_SUITE.erl
2017-06-27Merge branch 'maint-19' into maintJohn Högberg
* maint-19: Updated OTP version Update release notes Update version numbers Fix statistics(wall_clock) and statistics(runtime) implementation fixup! erts: Cleanup dropped port tasks correctly erts: Add tests to detect port close race Add a testcase for OTP-13939/ERL-193 erts: Cleanup dropped port tasks correctly Mark socket disconnected on tcp_send_or_shutdown_error
2017-06-26Merge branch 'john/erts/fix-port-leak/OTP-13939/ERL-193' into maint-19Erlang/OTP
* john/erts/fix-port-leak/OTP-13939/ERL-193: Add a testcase for OTP-13939/ERL-193 Mark socket disconnected on tcp_send_or_shutdown_error # Conflicts: # lib/kernel/test/gen_tcp_misc_SUITE.erl
2017-06-14Mark socket disconnected on tcp_send_or_shutdown_errorkvakvs
The socket left lingering due to {exit_on_close, false} will accept writes in a confusing way, returning either enotconn or blocking. This fix allows socket to know that it has been closed recently, and new writes won't pass.
2017-06-08Merge branch 'maint'Rickard Green
* maint: Updated OTP version Update release notes Update version numbers erts: Fix so that 81b628 (sigterm=kill) works Updated OTP version Prepare release Unconditionally clear IO buffers on send/shutdown errors Conflicts: OTP_VERSION erts/emulator/sys/unix/sys.c erts/vsn.mk
2017-06-01Unconditionally clear IO buffers on send/shutdown errorsJohn Högberg
This fixes a bug where a send/shutdown error on an active-mode socket results in the port never being properly closed.
2017-05-19Do not zero terminate Linux abstract addressesRaimo Niskanen
2017-05-04Update copyright yearRaimo Niskanen
2017-04-20implement SO_BINDTODEVICE for inet protocolsAndreas Schultz
bind to device is needed to properly support VRF-Lite under Linux (see [1] for details). [1]: https://www.kernel.org/doc/Documentation/networking/vrf.txt
2017-03-30Merge branch 'goeldeepak/erts/fix_inet_gethost_long/ERL-61/PR-1345/OTP-14310'Lukas Larsson
* goeldeepak/erts/fix_inet_gethost_long/ERL-61/PR-1345/OTP-14310: This patch fixes the issue in which erlang fails to start if the hostname is 64 characters on a linux system.
2017-03-22This patch fixes the issue in which erlang fails to startDeepak Goel
if the hostname is 64 characters on a linux system.
2017-02-14Fixed typos in ertsAndrew Dryga
2017-01-17erts: Fix zlib crash on mac for inflateGetDictionarySverker Eriksson
Work around broken build system by checking actual zlib version in runtime.
2016-12-19erts: Tidy up in efile_drv.cSverker Eriksson
ERL_DRV_USE_NO_CALLBACK only meaningful when deselecting.
2016-09-15Merge branch 'maint'Raimo Niskanen
2016-09-14Tune 'tclass' semanticsRaimo Niskanen
2016-09-12Implement IPV6_TCLASSRaimo Niskanen