diff options
author | Patrik Nyblom <[email protected]> | 2010-12-03 15:38:14 +0100 |
---|---|---|
committer | Patrik Nyblom <[email protected]> | 2010-12-03 15:38:14 +0100 |
commit | 73b53fe24ebc595a81c4ee023bcfef5a71e59db4 (patch) | |
tree | ed75b935318dbef275d62ab141a5f59b5b36bce0 /erts/doc | |
parent | 52b0fbbc4daa4e22fcad6fe39dd8f8cec8bb8d1b (diff) | |
parent | 159d79b6657e5bb4f3accb6a0d74b74f06da4ba2 (diff) | |
download | otp-73b53fe24ebc595a81c4ee023bcfef5a71e59db4.tar.gz otp-73b53fe24ebc595a81c4ee023bcfef5a71e59db4.tar.bz2 otp-73b53fe24ebc595a81c4ee023bcfef5a71e59db4.zip |
Merge branch 'pan/unicode-filenames/OTP-8887' into dev
* pan/unicode-filenames/OTP-8887: (27 commits)
Test and correct filelib and filename
Add documentation to erlang.xml and slight correction to unicode_usage.xml
Add section about Unicode file names to stdlib users guide
Correct bug in file_name_SUITE making it fail on Unix instead of Windows7
Add documentation about raw filenames and Unicode file name translation mode
Make filelib not crash on re codepoints beyond 255 in re when filename is raw
Mend on_load_embedded testcase which did not handle windows links
Correct testcase regarding windows versions supporting soft links.
Teach filelib to use re in unicode mode when filenames are not raw
Treat soft links on Windows correctly in file_name_SUITE
Adapt new soft and hard link routines on Windos to Unicode
Corrected testcases broken by unicode filenames
Update preloaded prim_file
Teach prim_file not to accept atoms and not to throw exceptions
Adapt inet_drv to Visual Studio 2008
Teach spawn_executable about Unicode
Convert filenames read on MacOSX to canonical form
Teach file to accept codepoints beyond 255.
Add testcases
Correct shell utilities to handle unicode and possibly binaries
...
Diffstat (limited to 'erts/doc')
-rw-r--r-- | erts/doc/src/erl.xml | 13 | ||||
-rw-r--r-- | erts/doc/src/erlang.xml | 46 |
2 files changed, 56 insertions, 3 deletions
diff --git a/erts/doc/src/erl.xml b/erts/doc/src/erl.xml index 9224d73b6f..77bd952d41 100644 --- a/erts/doc/src/erl.xml +++ b/erts/doc/src/erl.xml @@ -550,6 +550,19 @@ <p>Force the <c>compressed</c> option on all ETS tables. Only intended for test and evaluation.</p> </item> + <tag><c><![CDATA[+fnl]]></c></tag> + <item> + <p>The VM works with file names as if they are encoded using the ISO-latin-1 encoding, disallowing Unicode characters with codepoints beyond 255. This is default on operating systems that have transparent file naming, i.e. all Unixes except MacOSX.</p> + </item> + <tag><c><![CDATA[+fnu]]></c></tag> + <item> + <p>The VM works with file names as if they are encoded using UTF-8 (or some other system specific Unicode encoding). This is the default on operating systems that enforce Unicode encoding, i.e. Windows and MacOSX.</p> + <p>By enabling Unicode file name translation on systems where this is not default, you open up to the possibility that some file names can not be interpreted by the VM and therefore will be returned to the program as raw binaries. The option is therefore considered experimental.</p> + </item> + <tag><c><![CDATA[+fna]]></c></tag> + <item> + <p>Selection between <c>+fnl</c> and <c>+fnu</c> is done based on the current locale settings in the OS, meaning that if you have set your terminal for UTF-8 encoding, the filesystem is expected to use the same encoding for filenames (use with care).</p> + </item> <tag><c><![CDATA[+hms Size]]></c></tag> <item> <p>Sets the default heap size of processes to the size diff --git a/erts/doc/src/erlang.xml b/erts/doc/src/erlang.xml index 638f7eef10..78d58a1e56 100644 --- a/erts/doc/src/erlang.xml +++ b/erts/doc/src/erlang.xml @@ -2781,14 +2781,17 @@ os_prompt%</pre> <name>open_port(PortName, PortSettings) -> port()</name> <fsummary>Open a port</fsummary> <type> - <v>PortName = {spawn, Command} | {spawn_driver, Command} | {spawn_executable, Command} | {fd, In, Out}</v> + <v>PortName = {spawn, Command} | {spawn_driver, Command} | {spawn_executable, FileName} | {fd, In, Out}</v> <v> Command = string()</v> + <v> FileName = [ FileNameChar ] | binary()</v> + <v> FileNameChar = int() (1..255 or any Unicode codepoint, see description)</v> <v> In = Out = int()</v> <v>PortSettings = [Opt]</v> - <v> Opt = {packet, N} | stream | {line, L} | {cd, Dir} | {env, Env} | {args, [ string() ]} | {arg0, string()} | exit_status | use_stdio | nouse_stdio | stderr_to_stdout | in | out | binary | eof</v> + <v> Opt = {packet, N} | stream | {line, L} | {cd, Dir} | {env, Env} | {args, [ ArgString ]} | {arg0, ArgString} | exit_status | use_stdio | nouse_stdio | stderr_to_stdout | in | out | binary | eof</v> <v> N = 1 | 2 | 4</v> <v> L = int()</v> <v> Dir = string()</v> + <v> ArgString = [ FileNameChar ] | binary()</v> <v> Env = [{Name, Val}]</v> <v> Name = string()</v> <v> Val = string() | false</v> @@ -2851,7 +2854,26 @@ os_prompt%</pre> executed, the appropriate command interpreter will implicitly be invoked, but there will still be no command argument expansion or implicit PATH search.</p> - + + <p>The name of the executable as well as the arguments + given in <c>args</c> and <c>arg0</c> is subject to + Unicode file name translation if the system is running + in Unicode file name mode. To avoid + translation or force i.e. UTF-8, supply the executable + and/or arguments as a binary in the correct + encoding. See the <seealso + marker="kernel:file">file</seealso> module, the + <seealso marker="kernel:file#native_name_encoding/0"> + file:native_name_encoding/0</seealso> function and the + <seealso marker="stdlib:unicode_usage">stdlib users guide + </seealso> for details.</p> + + <note>The characters in the name (if given as a list) + can only be > 255 if the Erlang VM is started in + Unicode file name translation mode, otherwise the name + of the executable is limited to the ISO-latin-1 + character set.</note> + <p>If the <c>Command</c> cannot be run, an error exception, with the posix error code as the reason, is raised. The error reason may differ between operating @@ -2954,6 +2976,21 @@ os_prompt%</pre> should not be given in this list. The proper executable name will automatically be used as argv[0] where applicable.</p> + <p>When the Erlang VM is running in Unicode file name + mode, the arguments can contain any Unicode characters and + will be translated into whatever is appropriate on the + underlying OS, which means UTF-8 for all platforms except + Windows, which has other (more transparent) ways of + dealing with Unicode arguments to programs. To avoid + Unicode translation of arguments, they can be supplied as + binaries in whatever encoding is deemed appropriate.</p> + + <note>The characters in the arguments (if given as a + list of characters) can only be > 255 if the Erlang + VM is started in Unicode file name mode, + otherwise the arguments are limited to the + ISO-latin-1 character set.</note> + <p>If one, for any reason, wants to explicitly set the program name in the argument vector, the <c>arg0</c> option can be used.</p> @@ -2969,6 +3006,9 @@ os_prompt%</pre> responds to this is highly system dependent and no specific effect is guaranteed.</p> + <p>The unicode file name translation rules of the + <c>args</c> option apply to this option as well.</p> + </item> <tag><c>exit_status</c></tag> |