diff options
author | Rickard Green <[email protected]> | 2017-10-12 17:05:02 +0200 |
---|---|---|
committer | Rickard Green <[email protected]> | 2017-10-12 17:05:02 +0200 |
commit | 5dc3cc971a7ce8d5384019a9a6f14a9ab7790163 (patch) | |
tree | b7e4befb7b193bdbf466a58e9de379edad5cdbf1 /lib/kernel/doc | |
parent | 35d9fcedf1cc450ca36d95e9b8263fbc3e3a9b57 (diff) | |
parent | bb0b43eae854125688f3143e53c8974cafed4ad2 (diff) | |
download | otp-5dc3cc971a7ce8d5384019a9a6f14a9ab7790163.tar.gz otp-5dc3cc971a7ce8d5384019a9a6f14a9ab7790163.tar.bz2 otp-5dc3cc971a7ce8d5384019a9a6f14a9ab7790163.zip |
Merge branch 'rickard/null-chars/ERL-370/OTP-14543'
* rickard/null-chars/ERL-370/OTP-14543:
Don't allow null chars in various strings
Conflicts:
erts/emulator/beam/erl_alloc.types
erts/preloaded/ebin/erlang.beam
Diffstat (limited to 'lib/kernel/doc')
-rw-r--r-- | lib/kernel/doc/src/file.xml | 43 | ||||
-rw-r--r-- | lib/kernel/doc/src/os.xml | 109 |
2 files changed, 146 insertions, 6 deletions
diff --git a/lib/kernel/doc/src/file.xml b/lib/kernel/doc/src/file.xml index b674b3ca93..2ab35b9b05 100644 --- a/lib/kernel/doc/src/file.xml +++ b/lib/kernel/doc/src/file.xml @@ -41,7 +41,7 @@ <p>Regarding filename encoding, the Erlang VM can operate in two modes. The current mode can be queried using function - <seealso marker="#native_name_encoding"><c>native_name_encoding/0</c></seealso>. + <seealso marker="#native_name_encoding/0"><c>native_name_encoding/0</c></seealso>. It returns <c>latin1</c> or <c>utf8</c>.</p> <p>In <c>latin1</c> mode, the Erlang VM does not change the @@ -59,7 +59,7 @@ terminal supports UTF-8, otherwise <c>latin1</c>. The default can be overridden using <c>+fnl</c> (to force <c>latin1</c> mode) or <c>+fnu</c> (to force <c>utf8</c> mode) when starting - <seealso marker="erts:erl"><c>erts:erl</c></seealso>.</p> + <seealso marker="erts:erl"><c>erl</c></seealso>.</p> <p>On operating systems with transparent naming, files can be inconsistently named, for example, some files are encoded in UTF-8 while @@ -81,6 +81,23 @@ <p>See also section <seealso marker="stdlib:unicode_usage#notes-about-raw-filenames">Notes About Raw Filenames</seealso> in the STDLIB User's Guide.</p> + <note><p> + File operations used to accept filenames containing + null characters (integer value zero). This caused + the name to be truncated and in some cases arguments + to primitive operations to be mixed up. Filenames + containing null characters inside the filename + are now <em>rejected</em> and will cause primitive + file operations fail. + </p></note> + <warning><p> + Currently null characters at the end of the filename + will be accepted by primitive file operations. Such + filenames are however still documented as invalid. The + implementation will also change in the future and + reject such filenames. + </p></warning> + </description> <datatypes> @@ -96,9 +113,21 @@ </datatype> <datatype> <name name="filename"/> + <desc> + <p> + See also the documentation of the + <seealso marker="#type-name_all"><c>name_all()</c></seealso> type. + </p> + </desc> </datatype> <datatype> <name name="filename_all"/> + <desc> + <p> + See also the documentation of the + <seealso marker="#type-name_all"><c>name_all()</c></seealso> type. + </p> + </desc> </datatype> <datatype> <name name="io_device"/> @@ -112,21 +141,23 @@ <name name="name"/> <desc> <p>If VM is in Unicode filename mode, <c>string()</c> and <c>char()</c> - are allowed to be > 255. + are allowed to be > 255. See also the documentation of the + <seealso marker="#type-name_all"><c>name_all()</c></seealso> type. </p> </desc> </datatype> <datatype> <name name="name_all"/> <desc> - <p>If VM is in Unicode filename mode, <c>string()</c> and <c>char()</c> + <p>If VM is in Unicode filename mode, characters are allowed to be > 255. <c><anno>RawFilename</anno></c> is a filename not subject to Unicode translation, meaning that it can contain characters not conforming to the Unicode encoding expected from the file system (that is, non-UTF-8 characters although the VM is started - in Unicode filename mode). + in Unicode filename mode). Null characters (integer value zero) + are <em>not</em> allowed in filenames (not even at the end). </p> </desc> </datatype> @@ -1825,7 +1856,7 @@ f.txt: {person, "kalle", 25}. <p>The functions in the module <c>file</c> usually treat binaries as raw filenames, that is, they are passed "as is" even when the encoding of the binary does not agree with - <seealso marker="#native_name_encoding"><c>native_name_encoding()</c></seealso>. + <seealso marker="#native_name_encoding/0"><c>native_name_encoding()</c></seealso>. However, this function expects binaries to be encoded according to the value returned by <c>native_name_encoding()</c>.</p> <p>Typical error reasons are:</p> diff --git a/lib/kernel/doc/src/os.xml b/lib/kernel/doc/src/os.xml index 0e9add4161..0a08e2c78a 100644 --- a/lib/kernel/doc/src/os.xml +++ b/lib/kernel/doc/src/os.xml @@ -36,8 +36,99 @@ only run on a specific platform. On the other hand, with careful use, these functions can be of help in enabling a program to run on most platforms.</p> + + <note> + <p> + File operations used to accept filenames containing + null characters (integer value zero). This caused + the name to be truncated and in some cases arguments + to primitive operations to be mixed up. Filenames + containing null characters inside the filename + are now <em>rejected</em> and will cause primitive + file operations to fail. + </p> + <p> + Also environment variable operations used to accept + names and values of environment variables containing + null characters (integer value zero). This caused + operations to silently produce erroneous results. + Environment variable names and values containing + null characters inside the name or value are now + <em>rejected</em> and will cause environment variable + operations to fail. + </p> + </note> + <warning> + <p> + Currently null characters at the end of filenames, + environment variable names and values will be accepted + by the primitive operations. Such filenames, environment + variable names and values are however still documented as + invalid. The implementation will also change in the + future and reject such filenames, environment variable + names and values. + </p> + </warning> </description> + <datatypes> + <datatype> + <name name="env_var_name"/> + <desc> + <p>A string containing valid characters on the specific + OS for environment variable names using + <seealso marker="file#native_name_encoding/0"><c>file:native_name_encoding()</c></seealso> + encoding. Note that specifically null characters (integer + value zero) and <c>$=</c> characters are not allowed. + However, note that not all invalid characters necessarily + will cause the primitiv operations to fail, but may instead + produce invalid results. + </p> + </desc> + </datatype> + <datatype> + <name name="env_var_value"/> + <desc> + <p>A string containing valid characters on the specific + OS for environment variable values using + <seealso marker="file#native_name_encoding/0"><c>file:native_name_encoding()</c></seealso> + encoding. Note that specifically null characters (integer + value zero) are not allowed. However, note that not all + invalid characters necessarily will cause the primitiv + operations to fail, but may instead produce invalid results. + </p> + </desc> + </datatype> + <datatype> + <name name="env_var_name_value"/> + <desc> + <p> + Assuming that environment variables has been correctly + set, a strings containing valid characters on the specific + OS for environment variable names and values using + <seealso marker="file#native_name_encoding/0"><c>file:native_name_encoding()</c></seealso> + encoding. The first <c>$=</c> characters appearing in + the string separates environment variable name (on the + left) from environment variable value (on the right). + </p> + </desc> + </datatype> + <datatype> + <name name="command_input"/> + <desc> + <p>All characters needs to be valid characters on the + specific OS using + <seealso marker="file#native_name_encoding/0"><c>file:native_name_encoding()</c></seealso> + encoding. Note that specifically null characters (integer + value zero) are not allowed. However, note that not all + invalid characters not necessarily will cause + <seealso marker="#cmd/1"><c>os:cmd/1</c></seealso> + to fail, but may instead produce invalid results. + </p> + </desc> + </datatype> + </datatypes> + <funcs> <func> <name name="cmd" arity="1"/> @@ -49,6 +140,15 @@ result as a string. This function is a replacement of the previous function <c>unix:cmd/1</c>; they are equivalent on a Unix platform.</p> + <warning><p>Previous implementation used to allow all characters + as long as they were integer values greater than or equal to zero. + This sometimes lead to unwanted results since null characters + (integer value zero) often are interpreted as string termination. + Current implementation still accepts null characters at the end + of <c><anno>Command</anno></c> even though the documentation + states that no null characters are allowed. This will however + be changed in the future so that no null characters at all will + be accepted.</p></warning> <p><em>Examples:</em></p> <code type="none"> LsOut = os:cmd("ls"), % on unix platform @@ -152,6 +252,15 @@ DirOut = os:cmd("dir"), % on Win32 platform</code> <p>On Unix platforms, the environment is set using UTF-8 encoding if Unicode filename translation is in effect. On Windows, the environment is set using wide character interfaces.</p> + <note> + <p> + <c><anno>VarName</anno></c> is not allowed to contain + an <c>$=</c> character. Previous implementations used + to just let the <c>$=</c> character through which + silently caused erroneous results. Current implementation + will instead throw a <c>badarg</c> exception. + </p> + </note> </desc> </func> |