aboutsummaryrefslogtreecommitdiffstats
path: root/erts/emulator/drivers/unix/unix_efile.c
AgeCommit message (Collapse)Author
2017-11-30Reimplement efile_drv as a dirty NIFJohn Högberg
This improves the latency of file operations as dirty schedulers are a bit more eager to run jobs than async threads, and use a single global queue rather than per-thread queues, eliminating the risk of a job stalling behind a long-running job on the same thread while other async threads sit idle. There's no such thing as a free lunch though; the lowered latency comes at the cost of increased busy-waiting which may have an adverse effect on some applications. This behavior can be tweaked with the +sbwt flag, but unfortunately it affects all types of schedulers and not just dirty ones. We plan to add type-specific flags at a later stage. sendfile has been moved to inet_drv to lessen the effect of a nasty race; the cooperation between inet_drv and efile has never been airtight and the socket dying at the wrong time (Regardless of reason) could result in fd aliasing. Moving it to the inet driver makes it impossible to trigger this by closing the socket in the middle of a sendfile operation, while still allowing it to be aborted -- something that can't be done if it stays in the file driver. The race still occurs if the controlling process dies in the short window between dispatching the sendfile operation and the dup(2) call in the driver, but it's much less likely to happen now. A proper fix is in the works. -- Notable functional differences: * The use_threads option for file:sendfile/5 no longer has any effect. * The file-specific DTrace probes have been removed. The same effect can be achieved with normal tracing together with the nif__entry/nif__return probes to track scheduling. -- OTP-14256
2017-11-17Merge branch 'john/erts/fix-close-eintr/OTP-14775' into maint-20Erlang/OTP
* john/erts/fix-close-eintr/OTP-14775: Remove invalid EINTR loop around close(2)
2017-11-13Remove invalid EINTR loop around close(2)John Högberg
Retrying close(2) on anything other than HP-UX is likely to close something entirely different. POSIX says that the state of the file descriptor is unspecified, and Linux/BSD guarantee that it's closed on return.
2017-06-07erts: Fix sendfile closeduring scenario on sunosLukas Larsson
On Solaris, giving a too long sfv_len results in an EINVAL error, but data is still transmitted and len is correctly. So we translate this to a success with that amount of data sent. This may hide some other errors that causes EINVAL, but it is the best we can do for now.
2017-03-29Close FD after trying to open a directoryRaimo Niskanen
2017-02-07Use fstat if it exists in efile_openfileRaimo Niskanen
2016-04-13Merge branch 'henrik/update-copyrightyear'Henrik Nord
* henrik/update-copyrightyear: update copyright-year
2016-04-07Merge branch 'bjorn/erts/huge-file-fix/OTP-13461'Björn Gustavsson
* bjorn/erts/huge-file-fix/OTP-13461: Handle multi-giga byte writes to files
2016-04-04Handle multi-giga byte writes to filesBjörn Gustavsson
Test cases that write 4Gb to a file at once would fail on OS X and FreeBSD. By running a simple test program on OS X (El Capitan 10.11.4/Darwin 15.4.0), I found that writev() can handle more than 4Gb of data, while write() only can handle less than 2Gb. (Note that efile_drv.c will use write() if there is only one element in the io vector, and writev() if there is more than one.) It is tempting to attempt to piggy-back on the existing mechanism for segmenting write operations in efile_drv.c, but because of the complex code I find it too dangerous, both from a correctness and performance perspective. Instead do the change in unix_efile.c, which is considerably simpler.
2016-03-31Refactor time_t in efile_drvBjörn-Egil Dahlberg
2016-03-15update copyright-yearHenrik Nord
2016-03-15erts: Use fcntl(fd, F_FULLFSYNC) instead of fdatasync on Mac OSXBjörn-Egil Dahlberg
The syscall fdatasync does not work as intended on Mac OSX. Both the function fsync and fdatasync now uses fcntl(fd, F_FULLFSYNC) on Mac OSX.
2016-01-27Merge branch 'theom/freebsd-sendfile-patch-2/OTP-13271' into maintLukas Larsson
* theom/freebsd-sendfile-patch-2/OTP-13271: erts: Fix sendfile:ing of large files on FreeBSD
2016-01-27erts: Fix sendfile:ing of large files on FreeBSDJP
If the file was larger than the OS send buffer the call would fail before this patch.
2015-08-27Fix ethread events with timeoutRickard Green
Lots of pthread platforms unnecessarily falled back on the pipe/select solution. This since we tried to use the same monotonic clock source for pthread_cond_timedwait() as used by OS monotonic time. This has been fixed on most platforms by using another clock source. Darwin can however not use pthread_cond_timedwait() with monotonic clock source and has to use the pipe/select solution. On darwin we now use select with _DARWIN_UNLIMITED_SELECT in order to be able to handle a large amount of file descriptors.
2015-06-18Change license text to APLv2Bruce Yinhe
2014-05-14Fix efile_openfile() to handle stat() failureMikael Pettersson
If the initial stat() fails then efile_openfile() will still proceed to open() the file. If that succeeds and the caller passed a non-NULL pSize, then it will copy bogus data from the statbuf into *pSize. This has been observed to cause file:read_file/1 to return truncated file data with no error indication. The use case involved a large file system mounted via NFS, with some directories containing large number of files, and NFS mount options that allow the NFS client to return EIO if the NFS server does not respond quickly enough. Depending on the caching state of the client and server machines, a few stat() calls (fewer than 1 per 10 million) would take long enough to trigger EIO errors, but subsequent open() calls would succeed, and read_file/1 would return truncated data. This sequence of events has been observed via "strace" on beam.smp. Signed-off-by: Mikael Pettersson <[email protected]>
2014-02-24erts: Fix unix efile assertLukas Larsson
If writev return an error (eg ENOSPC) we do not want to abort here but instead propagate upwards into erlang.
2014-02-24ose: efile driver updates.Jonas Karlsson
2013-11-15Add sync option to file:open/2Joseph Blomstedt
The sync option adds the POSIX O_SYNC flag to the open system call on platforms that support the flag or its equivalent, e.g., FILE_FLAG_WRITE_THROUGH on Windows. For platforms that don't support it, file:open/2 returns {error, enotsup} if the sync option is passed in. The semantics of O_SYNC are platform-specific. For example, not all platforms guarantee that all file metadata are written to the disk along with the file data when the flag is in effect. This issue is noted in the documentation this commit adds for the sync option. Add a test for the sync option. Note however that the underlying OS semantics for O_SYNC can't be tested automatically in any practical way, so the test assumes the OS does the right thing with the flag when present. For manual verification, dtruss on OS X and strace on Linux were both run against beam processes to watch calls to open(), and file:open/2 was called in Erlang shells to open files for writing, both with and without the sync option. Both the dtruss output and the strace output showed that the O_SYNC flag was present in the open() calls when sync was specified and was clear when sync was not specified.
2013-06-12Update copyright yearsBjörn-Egil Dahlberg
2013-03-04erts: Fix void * arithmeticBjörn-Egil Dahlberg
2013-01-09Add file:allocate/3 operationFilipe David Manana
This operation allows pre-allocation of space for files. It succeeds only on systems that support such operation. The POSIX standard defines the optional system call posix_fallocate() to implement this feature. However, some systems implement more specific functions to accomplish the same operation. On Linux, if the more specific function fallocate() is implemented, it is used instead of posix_fallocate(), falling back to posix_fallocate() if the fallocate() call failed (it's only supported for the ext4, ocfs2, xfs and btrfs file systems at the moment). On Mac OS X it uses the specific fcntl() operation F_PREALLOCATE, falling back to posix_fallocate() if it's available (at the moment Mac OS X doesn't provide posix_fallocate()). On any other UNIX system, it uses posix_fallocate() if it's available. Any other system not providing this system call or any function to pre-allocate space for files, this operation always fails with the ENOTSUP POSIX error.
2012-11-02Fix oracle solaris bug in sendfileLukas Larsson
When using values of sfv_len and sfv_off which are larger than the file in question, sendfilev can sometimes return -1 and send data. It seems to be only Oracle SunOS which this happens on.
2012-07-19erts: Remove VxWorks from driversBjörn-Egil Dahlberg
2012-05-09Correct formating in exit error messagesMichael Santos
Ensure displayed sizes are not negative.
2012-03-30Update copyright yearsBjörn-Egil Dahlberg
2012-03-20Fix bug when sending long files using selectLukas Larsson
The return value from efile_sendfile was not consistent inbetween platforms. The API should now be working as it was intended. OTP-9994
2012-01-23Merge branch 'jz/error-logic-efile_sendfile' into maintHenrik Nord
* jz/error-logic-efile_sendfile: erts: minor fix for unnecessary condition OTP-9872
2012-01-23Merge branch 'jz/sendfile_chunk_size' into maintHenrik Nord
* jz/sendfile_chunk_size: erts: change SENDFILE_CHUNK_SIZE from signed to unsigned Conflicts: erts/emulator/drivers/unix/unix_efile.c OTP-9872
2011-12-20erts: rewrite efile_writev to handle partial writes correctlyRaimo Niskanen
2011-12-09Update copyright yearsBjörn-Egil Dahlberg
2011-12-08Merge branch 'ta/sendfile/OTP-9240'Lukas Larsson
* ta/sendfile/OTP-9240: Do not use async threads on DARWIN Fix cleanup when sendfile process crashes Return {error,closed} from sendfile if closed Do not use SFV_NOWAIT as it does not exist on all solaris Clarify some code comments Make solaris use sendfilev
2011-12-08Fix compiler warning in unix_efile.cBjörn-Egil Dahlberg
2011-12-08unix_efile: Zero is a valid number in utimeBjörn-Egil Dahlberg
Both mtime and atime were incorrectly checked for zero
2011-12-07Do not use SFV_NOWAIT as it does not exist on all solarisLukas Larsson
2011-12-06erts: minor fix for unnecessary conditionJovi Zhang
In "while (retval != -1 && retval == SENDFILE_CHUNK_SIZE)", "retval != -1" is pointless.
2011-12-06erts: change SENDFILE_CHUNK_SIZE from signed to unsignedJovi Zhang
It's reasonable to use UL in SENDFILE_CHUNK_SIZE
2011-12-05Clarify some code commentsLukas Larsson
Thanks Tuncer Ayaz
2011-12-05Make solaris use sendfilevLukas Larsson
sendfilev is a richer API which allows us to do non blocking TCP on solaris. The normal sendfile API seems to have some issue with non blocking sockets and the return value of sendfile.
2011-12-02Use epoch seconds instead of datetime()Björn-Egil Dahlberg
First stage in utc-time for prim_file.
2011-12-02Remove header/trailer supportLukas Larsson
Since the API for headers/trailers seem to be very awkward to work with when using non-blocking io the feature is dropped for now. See unix_efile.c for more details.
2011-12-01Preliminary work on header/trailerLukas Larsson
Have to figure out how to represent progress in header writing when using non-blocking, not sure how to do this.
2011-12-01Remove debug printoutsLukas Larsson
2011-12-01Set chunk size to 3 GBLukas Larsson
It is not possible to use the maximum size_t/off_t for the chunks as that causes sendfile to return einval. 3GB seems to work on all *nix platforms.
2011-12-01Add ifdef's for HAVE_SENDFILELukas Larsson
2011-12-01Fix freebsd support for sendfileLukas Larsson
2011-12-01Change nbytes to 64 bitLukas Larsson
2011-12-01Fix offset calculation for darwinLukas Larsson
2011-12-01Implement sendfile when there are no async threadsLukas Larsson
When there are no async threads sendfile will use the ready_output select on the socket fd to know when to send data. The file_desc will also be put in the sending sendfile_state which buffers all other commands to that file until the sendfile is done.