aboutsummaryrefslogtreecommitdiffstats
path: root/lib/kernel/doc
diff options
context:
space:
mode:
authorRickard Green <[email protected]>2017-10-12 17:05:02 +0200
committerRickard Green <[email protected]>2017-10-12 17:05:02 +0200
commit5dc3cc971a7ce8d5384019a9a6f14a9ab7790163 (patch)
treeb7e4befb7b193bdbf466a58e9de379edad5cdbf1 /lib/kernel/doc
parent35d9fcedf1cc450ca36d95e9b8263fbc3e3a9b57 (diff)
parentbb0b43eae854125688f3143e53c8974cafed4ad2 (diff)
downloadotp-5dc3cc971a7ce8d5384019a9a6f14a9ab7790163.tar.gz
otp-5dc3cc971a7ce8d5384019a9a6f14a9ab7790163.tar.bz2
otp-5dc3cc971a7ce8d5384019a9a6f14a9ab7790163.zip
Merge branch 'rickard/null-chars/ERL-370/OTP-14543'
* rickard/null-chars/ERL-370/OTP-14543: Don't allow null chars in various strings Conflicts: erts/emulator/beam/erl_alloc.types erts/preloaded/ebin/erlang.beam
Diffstat (limited to 'lib/kernel/doc')
-rw-r--r--lib/kernel/doc/src/file.xml43
-rw-r--r--lib/kernel/doc/src/os.xml109
2 files changed, 146 insertions, 6 deletions
diff --git a/lib/kernel/doc/src/file.xml b/lib/kernel/doc/src/file.xml
index b674b3ca93..2ab35b9b05 100644
--- a/lib/kernel/doc/src/file.xml
+++ b/lib/kernel/doc/src/file.xml
@@ -41,7 +41,7 @@
<p>Regarding filename encoding, the Erlang VM can operate in
two modes. The current mode can be queried using function
- <seealso marker="#native_name_encoding"><c>native_name_encoding/0</c></seealso>.
+ <seealso marker="#native_name_encoding/0"><c>native_name_encoding/0</c></seealso>.
It returns <c>latin1</c> or <c>utf8</c>.</p>
<p>In <c>latin1</c> mode, the Erlang VM does not change the
@@ -59,7 +59,7 @@
terminal supports UTF-8, otherwise <c>latin1</c>. The default can
be overridden using <c>+fnl</c> (to force <c>latin1</c> mode)
or <c>+fnu</c> (to force <c>utf8</c> mode) when starting
- <seealso marker="erts:erl"><c>erts:erl</c></seealso>.</p>
+ <seealso marker="erts:erl"><c>erl</c></seealso>.</p>
<p>On operating systems with transparent naming, files can be
inconsistently named, for example, some files are encoded in UTF-8 while
@@ -81,6 +81,23 @@
<p>See also section <seealso marker="stdlib:unicode_usage#notes-about-raw-filenames">Notes About Raw Filenames</seealso> in the STDLIB User's Guide.</p>
+ <note><p>
+ File operations used to accept filenames containing
+ null characters (integer value zero). This caused
+ the name to be truncated and in some cases arguments
+ to primitive operations to be mixed up. Filenames
+ containing null characters inside the filename
+ are now <em>rejected</em> and will cause primitive
+ file operations fail.
+ </p></note>
+ <warning><p>
+ Currently null characters at the end of the filename
+ will be accepted by primitive file operations. Such
+ filenames are however still documented as invalid. The
+ implementation will also change in the future and
+ reject such filenames.
+ </p></warning>
+
</description>
<datatypes>
@@ -96,9 +113,21 @@
</datatype>
<datatype>
<name name="filename"/>
+ <desc>
+ <p>
+ See also the documentation of the
+ <seealso marker="#type-name_all"><c>name_all()</c></seealso> type.
+ </p>
+ </desc>
</datatype>
<datatype>
<name name="filename_all"/>
+ <desc>
+ <p>
+ See also the documentation of the
+ <seealso marker="#type-name_all"><c>name_all()</c></seealso> type.
+ </p>
+ </desc>
</datatype>
<datatype>
<name name="io_device"/>
@@ -112,21 +141,23 @@
<name name="name"/>
<desc>
<p>If VM is in Unicode filename mode, <c>string()</c> and <c>char()</c>
- are allowed to be &gt; 255.
+ are allowed to be &gt; 255. See also the documentation of the
+ <seealso marker="#type-name_all"><c>name_all()</c></seealso> type.
</p>
</desc>
</datatype>
<datatype>
<name name="name_all"/>
<desc>
- <p>If VM is in Unicode filename mode, <c>string()</c> and <c>char()</c>
+ <p>If VM is in Unicode filename mode, characters
are allowed to be &gt; 255.
<c><anno>RawFilename</anno></c> is a filename not subject to
Unicode translation,
meaning that it can contain characters not conforming to
the Unicode encoding expected from the file system
(that is, non-UTF-8 characters although the VM is started
- in Unicode filename mode).
+ in Unicode filename mode). Null characters (integer value zero)
+ are <em>not</em> allowed in filenames (not even at the end).
</p>
</desc>
</datatype>
@@ -1825,7 +1856,7 @@ f.txt: {person, "kalle", 25}.
<p>The functions in the module <c>file</c> usually treat binaries
as raw filenames, that is, they are passed "as is" even when the
encoding of the binary does not agree with
- <seealso marker="#native_name_encoding"><c>native_name_encoding()</c></seealso>.
+ <seealso marker="#native_name_encoding/0"><c>native_name_encoding()</c></seealso>.
However, this function expects binaries to be encoded according to the
value returned by <c>native_name_encoding()</c>.</p>
<p>Typical error reasons are:</p>
diff --git a/lib/kernel/doc/src/os.xml b/lib/kernel/doc/src/os.xml
index 0e9add4161..0a08e2c78a 100644
--- a/lib/kernel/doc/src/os.xml
+++ b/lib/kernel/doc/src/os.xml
@@ -36,8 +36,99 @@
only run on a specific platform. On the other hand, with careful
use, these functions can be of help in enabling a program to run on
most platforms.</p>
+
+ <note>
+ <p>
+ File operations used to accept filenames containing
+ null characters (integer value zero). This caused
+ the name to be truncated and in some cases arguments
+ to primitive operations to be mixed up. Filenames
+ containing null characters inside the filename
+ are now <em>rejected</em> and will cause primitive
+ file operations to fail.
+ </p>
+ <p>
+ Also environment variable operations used to accept
+ names and values of environment variables containing
+ null characters (integer value zero). This caused
+ operations to silently produce erroneous results.
+ Environment variable names and values containing
+ null characters inside the name or value are now
+ <em>rejected</em> and will cause environment variable
+ operations to fail.
+ </p>
+ </note>
+ <warning>
+ <p>
+ Currently null characters at the end of filenames,
+ environment variable names and values will be accepted
+ by the primitive operations. Such filenames, environment
+ variable names and values are however still documented as
+ invalid. The implementation will also change in the
+ future and reject such filenames, environment variable
+ names and values.
+ </p>
+ </warning>
</description>
+ <datatypes>
+ <datatype>
+ <name name="env_var_name"/>
+ <desc>
+ <p>A string containing valid characters on the specific
+ OS for environment variable names using
+ <seealso marker="file#native_name_encoding/0"><c>file:native_name_encoding()</c></seealso>
+ encoding. Note that specifically null characters (integer
+ value zero) and <c>$=</c> characters are not allowed.
+ However, note that not all invalid characters necessarily
+ will cause the primitiv operations to fail, but may instead
+ produce invalid results.
+ </p>
+ </desc>
+ </datatype>
+ <datatype>
+ <name name="env_var_value"/>
+ <desc>
+ <p>A string containing valid characters on the specific
+ OS for environment variable values using
+ <seealso marker="file#native_name_encoding/0"><c>file:native_name_encoding()</c></seealso>
+ encoding. Note that specifically null characters (integer
+ value zero) are not allowed. However, note that not all
+ invalid characters necessarily will cause the primitiv
+ operations to fail, but may instead produce invalid results.
+ </p>
+ </desc>
+ </datatype>
+ <datatype>
+ <name name="env_var_name_value"/>
+ <desc>
+ <p>
+ Assuming that environment variables has been correctly
+ set, a strings containing valid characters on the specific
+ OS for environment variable names and values using
+ <seealso marker="file#native_name_encoding/0"><c>file:native_name_encoding()</c></seealso>
+ encoding. The first <c>$=</c> characters appearing in
+ the string separates environment variable name (on the
+ left) from environment variable value (on the right).
+ </p>
+ </desc>
+ </datatype>
+ <datatype>
+ <name name="command_input"/>
+ <desc>
+ <p>All characters needs to be valid characters on the
+ specific OS using
+ <seealso marker="file#native_name_encoding/0"><c>file:native_name_encoding()</c></seealso>
+ encoding. Note that specifically null characters (integer
+ value zero) are not allowed. However, note that not all
+ invalid characters not necessarily will cause
+ <seealso marker="#cmd/1"><c>os:cmd/1</c></seealso>
+ to fail, but may instead produce invalid results.
+ </p>
+ </desc>
+ </datatype>
+ </datatypes>
+
<funcs>
<func>
<name name="cmd" arity="1"/>
@@ -49,6 +140,15 @@
result as a string. This function is a replacement of
the previous function <c>unix:cmd/1</c>; they are equivalent on a
Unix platform.</p>
+ <warning><p>Previous implementation used to allow all characters
+ as long as they were integer values greater than or equal to zero.
+ This sometimes lead to unwanted results since null characters
+ (integer value zero) often are interpreted as string termination.
+ Current implementation still accepts null characters at the end
+ of <c><anno>Command</anno></c> even though the documentation
+ states that no null characters are allowed. This will however
+ be changed in the future so that no null characters at all will
+ be accepted.</p></warning>
<p><em>Examples:</em></p>
<code type="none">
LsOut = os:cmd("ls"), % on unix platform
@@ -152,6 +252,15 @@ DirOut = os:cmd("dir"), % on Win32 platform</code>
<p>On Unix platforms, the environment is set using UTF-8 encoding
if Unicode filename translation is in effect. On Windows, the
environment is set using wide character interfaces.</p>
+ <note>
+ <p>
+ <c><anno>VarName</anno></c> is not allowed to contain
+ an <c>$=</c> character. Previous implementations used
+ to just let the <c>$=</c> character through which
+ silently caused erroneous results. Current implementation
+ will instead throw a <c>badarg</c> exception.
+ </p>
+ </note>
</desc>
</func>