author | Paul Schoenfelder <[email protected]> | 2017-01-31 17:40:34 -0600 |
---|---|---|
committer | Paul Schoenfelder <[email protected]> | 2017-02-16 08:55:15 -0500 |
commit | aa0c4b0df7cdc750450906aff4e8c81627d80605 (patch) | |
tree | 1dddc195225011bb7fefd1094bc6852629b44f21 /lib/stdlib/doc | |
parent | cce3120dd0021c5ab5bf8d5b4088e7364f678dda (diff) | |
Update erl_tar to support PAX format, etc.
This commit introduces the following key changes:
- Support for reading tar archives in formats currently in common use,
such as v7, STAR, USTAR, PAX, and GNU tar's extensions to the
STAR/USTAR format.
- Support for writing PAX archives, only when necessary, using USTAR
when possible for greater portability.
These changes result in lifting of some prior restrictions:
- Support for reading archives produced by modern tar implementations
when other restrictions described below are present.
- Support for filenames which exceed 100 bytes in length, or paths which
exceed 255 bytes (see USTAR format specification for more details on
this restriction).
- Support for filenames of arbitrary length.
- Support for Unicode metadata. The previous behaviour of erl_tar
violated the spec by writing Unicode-encoded data to fields which are
defined to be 7-bit ASCII. Although this worked when erl_tar was used at
both source and destination, it may not have worked with other tar
utilities. This implementation now conforms to the spec.
- Support for uid/gid values which cannot be converted to octal
integers.
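As an illustration of the lifted restrictions, here is a minimal sketch (not part of the commit) that archives an in-memory binary under a path longer than 255 bytes and reads it back. The module name `pax_demo`, the archive file name, and the path components are hypothetical. It assumes an OTP release that includes this change, where `erl_tar` is expected to fall back to PAX headers for such a path.

```erlang
-module(pax_demo).
-export([run/0]).

%% Archive one in-memory binary under a very long path, then read it back.
run() ->
    Archive = "pax_demo.tar",                       % hypothetical file name
    %% 15 components of 19 bytes each plus separators: well over 255 bytes.
    Dirs = lists:duplicate(15, "a_directory_name_xx"),
    LongName = filename:join(Dirs ++ ["file.txt"]),
    true = length(LongName) > 255,
    %% {NameInArchive, Binary} stores the binary; no source file is needed.
    ok = erl_tar:create(Archive, [{LongName, <<"hello">>}]),
    %% List the entry names, then extract the contents into memory.
    {ok, Names} = erl_tar:table(Archive),
    {ok, Files} = erl_tar:extract(Archive, [memory]),
    ok = file:delete(Archive),
    {LongName, Names, Files}.
```

If the archive round-trips as described in the commit message, `Names` should contain `LongName` unchanged, and `Files` should hold the matching `{Name, Binary}` pair.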
Diffstat (limited to 'lib/stdlib/doc')
-rw-r--r-- | lib/stdlib/doc/src/erl_tar.xml | 72 |
1 files changed, 38 insertions, 34 deletions
diff --git a/lib/stdlib/doc/src/erl_tar.xml b/lib/stdlib/doc/src/erl_tar.xml
index 24e7b64b9e..f28d8b425b 100644
--- a/lib/stdlib/doc/src/erl_tar.xml
+++ b/lib/stdlib/doc/src/erl_tar.xml
@@ -37,12 +37,13 @@
   </modulesummary>
   <description>
     <p>This module archives and extract files to and from
-      a tar file. This module supports the <c>ustar</c> format
-      (IEEE Std 1003.1 and ISO/IEC 9945-1). All modern <c>tar</c>
-      programs (including GNU tar) can read this format. To ensure that
-      that GNU tar produces a tar file that <c>erl_tar</c> can read,
-      specify option <c>--format=ustar</c> to GNU tar.</p>
-
+      a tar file. This module supports reading most common tar formats,
+      namely v7, STAR, USTAR, and PAX, as well as some of GNU tar's extensions
+      to the USTAR format (sparse files most notably). It produces tar archives
+      in USTAR format, unless the files being archived require PAX format due to
+      restrictions in USTAR (such as unicode metadata, filename length, and more).
+      As such, <c>erl_tar</c> supports tar archives produced by most all modern
+      tar utilities, and produces tarballs which should be similarly portable.</p>
     <p>By convention, the name of a tar file is to end in "<c>.tar</c>".
       To abide to the convention, add "<c>.tar</c>" to the name.</p>
 
@@ -83,6 +84,8 @@
       <p>If <seealso marker="kernel:file#native_name_encoding/0">
         <c>file:native_name_encoding/0</c></seealso>
         returns <c>latin1</c>, no translation of path names is done.</p>
+
+      <p>Unicode metadata stored in PAX headers is preserved</p>
     </section>
 
     <section>
@@ -104,21 +107,20 @@
       <title>Limitations</title>
       <list type="bulleted">
         <item>
-          <p>For maximum compatibility, it is safe to archive files with names
-            up to 100 characters in length. Such tar files can generally be
-            extracted by any <c>tar</c> program.</p>
-        </item>
-        <item>
-          <p>For filenames exceeding 100 characters in length, the resulting tar
-            file can only be correctly extracted by a POSIX-compatible <c>tar</c>
-            program (such as Solaris <c>tar</c> or a modern GNU <c>tar</c>).</p>
-        </item>
-        <item>
-          <p>Files with longer names than 256 bytes cannot be stored.</p>
+          <p>If you must remain compatible with the USTAR tar format, you must ensure file paths being
+            stored are less than 255 bytes in total, with a maximum filename component
+            length of 100 bytes. USTAR uses a header field (prefix) in addition to the name field, and
+            splits file paths longer than 100 bytes into two parts. This split is done on a directory boundary,
+            and is done in such a way to make the best use of the space available in those two fields, but in practice
+            this will often mean that you have less than 255 bytes for a path. <c>erl_tar</c> will
+            automatically upgrade the format to PAX to handle longer filenames, so this is only an issue if you
+            need to extract the archive with an older implementation of <c>erl_tar</c> or <c>tar</c> which does
+            not support PAX. In this case, the PAX headers will be extracted as regular files, and you will need to
+            apply them manually.</p>
         </item>
         <item>
-          <p>The file name a symbolic link points is always limited
-            to 100 characters.</p>
+          <p>Like the above, if you must remain USTAR compatible, you must also ensure than paths for
+            symbolic/hard links are no more than 100 bytes, otherwise PAX headers will be used.</p>
         </item>
       </list>
     </section>
@@ -129,7 +131,9 @@
       <fsummary>Add a file to an open tar file.</fsummary>
       <type>
         <v>TarDescriptor = term()</v>
-        <v>Filename = filename()</v>
+        <v>FilenameOrBin = filename()|binary()</v>
+        <v>NameInArchive = filename()</v>
+        <v>Filename = filename()|{NameInArchive,FilenameOrBin}</v>
         <v>Options = [Option]</v>
         <v>Option = dereference|verbose|{chunks,ChunkSize}</v>
         <v>ChunkSize = positive_integer()</v>
@@ -139,6 +143,9 @@
       <desc>
         <p>Adds a file to a tar file that has been opened for writing by
           <seealso marker="#open/2"><c>open/1</c></seealso>.</p>
+        <p><c>NameInArchive</c> is the name under which the file becomes
+          stored in the tar file. The file gets this name when it is
+          extracted from the tar file.</p>
         <p>Options:</p>
         <taglist>
           <tag><c>dereference</c></tag>
@@ -183,9 +190,6 @@
           <seealso marker="#open/2"><c>open/2</c></seealso>. This function
           accepts the same options as
           <seealso marker="#add/3"><c>add/3</c></seealso>.</p>
-        <p><c>NameInArchive</c> is the name under which the file becomes
-          stored in the tar file. The file gets this name when it is
-          extracted from the tar file.</p>
       </desc>
     </func>
 
@@ -206,8 +210,8 @@
       <fsummary>Create a tar archive.</fsummary>
       <type>
         <v>Name = filename()</v>
-        <v>FileList = [Filename|{NameInArchive, binary()},{NameInArchive,
-          Filename}]</v>
+        <v>FileList = [Filename|{NameInArchive, FilenameOrBin}]</v>
+        <v>FilenameOrBin = filename()|binary()</v>
         <v>Filename = filename()</v>
         <v>NameInArchive = filename()</v>
         <v>RetValue = ok|{error,{Name,Reason}}</v>
@@ -225,8 +229,8 @@
       <fsummary>Create a tar archive with options.</fsummary>
       <type>
         <v>Name = filename()</v>
-        <v>FileList = [Filename|{NameInArchive, binary()},{NameInArchive,
-          Filename}]</v>
+        <v>FileList = [Filename|{NameInArchive, FilenameOrBin}]</v>
+        <v>FilenameOrBin = filename()|binary()</v>
         <v>Filename = filename()</v>
         <v>NameInArchive = filename()</v>
         <v>OptionList = [Option]</v>
@@ -275,7 +279,8 @@
       <name>extract(Name) -> RetValue</name>
       <fsummary>Extract all files from a tar file.</fsummary>
       <type>
-        <v>Name = filename()</v>
+        <v>Name = filename() | {binary,binary()} | {file,Fd}</v>
+        <v>Fd = file_descriptor()</v>
         <v>RetValue = ok|{error,{Name,Reason}}</v>
         <v>Reason = term()</v>
       </type>
@@ -294,8 +299,7 @@
       <name>extract(Name, OptionList)</name>
       <fsummary>Extract files from a tar file.</fsummary>
       <type>
-        <v>Name = filename() | {binary,Binary} | {file,Fd}</v>
-        <v>Binary = binary()</v>
+        <v>Name = filename() | {binary,binary()} | {file,Fd}</v>
         <v>Fd = file_descriptor()</v>
         <v>OptionList = [Option]</v>
         <v>Option = {cwd,Cwd}|{files,FileList}|keep_old_files|verbose|memory</v>
@@ -521,7 +525,7 @@ erl_tar:close(TarDesc)</code>
       <name>table(Name) -> RetValue</name>
       <fsummary>Retrieve the name of all files in a tar file.</fsummary>
       <type>
-        <v>Name = filename()</v>
+        <v>Name = filename()|{binary,binary()}|{file,file_descriptor()}</v>
         <v>RetValue = {ok,[string()]}|{error,{Name,Reason}}</v>
         <v>Reason = term()</v>
       </type>
@@ -535,7 +539,7 @@ erl_tar:close(TarDesc)</code>
       <fsummary>Retrieve name and information of all files in a tar file.
       </fsummary>
       <type>
-        <v>Name = filename()</v>
+        <v>Name = filename()|{binary,binary()}|{file,file_descriptor()}</v>
       </type>
       <desc>
         <p>Retrieves the names of all files in the tar file <c>Name</c>.</p>
@@ -546,7 +550,7 @@ erl_tar:close(TarDesc)</code>
       <name>t(Name)</name>
       <fsummary>Print the name of each file in a tar file.</fsummary>
       <type>
-        <v>Name = filename()</v>
+        <v>Name = filename()|{binary,binary()}|{file,file_descriptor()}</v>
       </type>
       <desc>
         <p>Prints the names of all files in the tar file <c>Name</c> to the
@@ -559,7 +563,7 @@ erl_tar:close(TarDesc)</code>
       <fsummary>Print name and information for each file in a tar file.
       </fsummary>
       <type>
-        <v>Name = filename()</v>
+        <v>Name = filename()|{binary,binary()}|{file,file_descriptor()}</v>
       </type>
       <desc>
         <p>Prints names and information about all files in the tar file
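The type changes documented in this diff can be exercised roughly as follows. This is a sketch, not part of the commit: it assumes the documented types match the implementation in an OTP release containing this change, and the module name `bin_tar_demo` and the file names are hypothetical.

```erlang
-module(bin_tar_demo).
-export([run/0]).

%% Write an archive with add/3, then read it back from an in-memory binary.
run() ->
    Archive = "bin_tar_demo.tar",                   % hypothetical file name
    {ok, Tar} = erl_tar:open(Archive, [write]),
    %% add/3 with {NameInArchive, FilenameOrBin}: the binary is the file body.
    ok = erl_tar:add(Tar, {"hello.txt", <<"hello">>}, []),
    ok = erl_tar:close(Tar),
    %% Load the whole archive into memory and address it as {binary, Bin}.
    {ok, Bin} = file:read_file(Archive),
    ok = file:delete(Archive),
    {ok, Names} = erl_tar:table({binary, Bin}),
    {ok, Files} = erl_tar:extract({binary, Bin}, [memory]),
    {Names, Files}.
```

Passing `{binary, Bin}` avoids a temporary file when the archive has already been received over the network or read from another store. The `memory` option makes `extract/2` return the entries as `{Name, Binary}` pairs rather than writing them to disk.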