aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--erts/doc/src/erlang.xml46
-rw-r--r--lib/stdlib/doc/src/unicode_usage.xml2
2 files changed, 44 insertions, 4 deletions
diff --git a/erts/doc/src/erlang.xml b/erts/doc/src/erlang.xml
index 638f7eef10..78d58a1e56 100644
--- a/erts/doc/src/erlang.xml
+++ b/erts/doc/src/erlang.xml
@@ -2781,14 +2781,17 @@ os_prompt%</pre>
<name>open_port(PortName, PortSettings) -> port()</name>
<fsummary>Open a port</fsummary>
<type>
- <v>PortName = {spawn, Command} | {spawn_driver, Command} | {spawn_executable, Command} | {fd, In, Out}</v>
+ <v>PortName = {spawn, Command} | {spawn_driver, Command} | {spawn_executable, FileName} | {fd, In, Out}</v>
<v>&nbsp;Command = string()</v>
+ <v>&nbsp;FileName = [ FileNameChar ] | binary()</v>
+ <v>&nbsp;FileNameChar = int() (1..255 or any Unicode codepoint, see description)</v>
<v>&nbsp;In = Out = int()</v>
<v>PortSettings = [Opt]</v>
- <v>&nbsp;Opt = {packet, N} | stream | {line, L} | {cd, Dir} | {env, Env} | {args, [ string() ]} | {arg0, string()} | exit_status | use_stdio | nouse_stdio | stderr_to_stdout | in | out | binary | eof</v>
+ <v>&nbsp;Opt = {packet, N} | stream | {line, L} | {cd, Dir} | {env, Env} | {args, [ ArgString ]} | {arg0, ArgString} | exit_status | use_stdio | nouse_stdio | stderr_to_stdout | in | out | binary | eof</v>
<v>&nbsp;&nbsp;N = 1 | 2 | 4</v>
<v>&nbsp;&nbsp;L = int()</v>
<v>&nbsp;&nbsp;Dir = string()</v>
+ <v>&nbsp;&nbsp;ArgString = [ FileNameChar ] | binary()</v>
<v>&nbsp;&nbsp;Env = [{Name, Val}]</v>
<v>&nbsp;&nbsp;&nbsp;Name = string()</v>
<v>&nbsp;&nbsp;&nbsp;Val = string() | false</v>
@@ -2851,7 +2854,26 @@ os_prompt%</pre>
executed, the appropriate command interpreter will
implicitly be invoked, but there will still be no
command argument expansion or implicit PATH search.</p>
-
+
+ <p>The name of the executable as well as the arguments
+ given in <c>args</c> and <c>arg0</c> is subject to
+ Unicode file name translation if the system is running
+ in Unicode file name mode. To avoid
+ translation or force i.e. UTF-8, supply the executable
+ and/or arguments as a binary in the correct
+ encoding. See the <seealso
+ marker="kernel:file">file</seealso> module, the
+ <seealso marker="kernel:file#native_name_encoding/0">
+ file:native_name_encoding/0</seealso> function and the
+ <seealso marker="stdlib:unicode_usage">stdlib users guide
+ </seealso> for details.</p>
+
+ <note>The characters in the name (if given as a list)
+ can only be &gt; 255 if the Erlang VM is started in
+ Unicode file name translation mode, otherwise the name
+ of the executable is limited to the ISO-latin-1
+ character set.</note>
+
<p>If the <c>Command</c> cannot be run, an error
exception, with the posix error code as the reason, is
raised. The error reason may differ between operating
@@ -2954,6 +2976,21 @@ os_prompt%</pre>
should not be given in this list. The proper executable name will
automatically be used as argv[0] where applicable.</p>
+ <p>When the Erlang VM is running in Unicode file name
+ mode, the arguments can contain any Unicode characters and
+ will be translated into whatever is appropriate on the
+ underlying OS, which means UTF-8 for all platforms except
+ Windows, which has other (more transparent) ways of
+ dealing with Unicode arguments to programs. To avoid
+ Unicode translation of arguments, they can be supplied as
+ binaries in whatever encoding is deemed appropriate.</p>
+
+ <note>The characters in the arguments (if given as a
+ list of characters) can only be &gt; 255 if the Erlang
+ VM is started in Unicode file name mode,
+ otherwise the arguments are limited to the
+ ISO-latin-1 character set.</note>
+
<p>If one, for any reason, wants to explicitly set the
program name in the argument vector, the <c>arg0</c>
option can be used.</p>
@@ -2969,6 +3006,9 @@ os_prompt%</pre>
responds to this is highly system dependent and no specific
effect is guaranteed.</p>
+ <p>The unicode file name translation rules of the
+ <c>args</c> option apply to this option as well.</p>
+
</item>
<tag><c>exit_status</c></tag>
diff --git a/lib/stdlib/doc/src/unicode_usage.xml b/lib/stdlib/doc/src/unicode_usage.xml
index df8e6c6b47..c02ea3cbcb 100644
--- a/lib/stdlib/doc/src/unicode_usage.xml
+++ b/lib/stdlib/doc/src/unicode_usage.xml
@@ -189,7 +189,7 @@ Eshell V5.7 (abort with ^G)
<p>For most systems, turning on Unicode file name translation is no problem even if it uses transparent file naming. Very few systems have mixed file name encodings. A consistent UTF-8 named system will work perfectly in Unicode file name mode. It is still however considered experimental in R14B01. Unicode file name translation is turned on with the <c>+fnu</c> switch to the <c>erl</c> program. If the VM is started in Unicode file name translation mode, <c>file:native_name_encoding/0</c> will return the atom <c>utf8</c>.</p>
-<p>In Unicode file name mode, file names given to the BIF <c>open_port/2</c> with the option <c>{spawn_executable,...}</c> are also interpreted as Unicode. So is the parameter list given in the <c>argv</c> option available when using <c>spawn_executable</c>. The UTF-8 translation of arguments can be avoided using binaries, see the discussion about raw file names below.</p>
+<p>In Unicode file name mode, file names given to the BIF <c>open_port/2</c> with the option <c>{spawn_executable,...}</c> are also interpreted as Unicode. So is the parameter list given in the <c>args</c> option available when using <c>spawn_executable</c>. The UTF-8 translation of arguments can be avoided using binaries, see the discussion about raw file names below.</p>
<p>It is worth noting that the file <c>encoding</c> options given when opening a file has nothing to do with the file <em>name</em> encoding convention. You can very well open files containing UTF-8 but having file names in ISO-latin-1 or vice versa.</p>