Diffstat (limited to 'lib/stdlib/doc')
-rw-r--r-- lib/stdlib/doc/src/unicode_usage.xml | 35
1 files changed, 19 insertions, 16 deletions
diff --git a/lib/stdlib/doc/src/unicode_usage.xml b/lib/stdlib/doc/src/unicode_usage.xml
index 33cd70e0b7..ee7dd128f1 100644
--- a/lib/stdlib/doc/src/unicode_usage.xml
+++ b/lib/stdlib/doc/src/unicode_usage.xml
@@ -52,8 +52,8 @@
   for UTF-8 and more support for Unicode character sets in the
   I/O-system.</p>
 
-  <p>In R17, the encoding default for Erlang source files will be
-  switched to UTF-8 and in R18 Erlang will support atoms in the full
+  <p>In 17.0, the encoding default for Erlang source files was
+  switched to UTF-8 and in 18.0 Erlang will support atoms in the full
   Unicode range, meaning full Unicode function and module
   names</p>
 
@@ -290,7 +290,7 @@
     <item>Having the source code in UTF-8 also allows you to write
     string literals containing Unicode characters with code points >
     255, although atoms, module names and function names will be
-    restricted to the ISO-Latin-1 range until the R18 release. Binary
+    restricted to the ISO-Latin-1 range until the 18.0 release. Binary
     literals where you use the <c>/utf8</c> type, can also be
     expressed using Unicode characters > 255. Having module names
     using characters other than 7-bit ASCII can cause trouble on
@@ -385,7 +385,7 @@ external_charlist() = maybe_improper_list(char() |
   using characters from the ISO-latin-1 character set and atoms are
   restricted to the same ISO-latin-1 range. These restrictions in the
   language are of course independent of the encoding of the source
-  file. Erlang/OTP R18 is expected to handle functions named in
+  file. Erlang/OTP 18.0 is expected to handle functions named in
   Unicode as well as Unicode atoms.</p>
   <section>
     <title>Bit-syntax</title>
@@ -662,11 +662,14 @@ Eshell V5.10.1  (abort with ^G)
       containing characters having code points between 128 and 255 may
       be named either as plain ISO-latin-1 or using UTF-8 encoding. As
       no consistency is enforced, the Erlang VM can do no consistent
-      translation of all file names. If the VM would automatically
-      select encoding based on heuristics, one could get unexpected
-      behavior on these systems. By default, Erlang starts in "latin1"
-      file name mode on such systems, meaning bytewise encoding in file
-      names. This allows for list representation of all file names in
+      translation of all file names.</p>
+
+      <p>By default on such systems, Erlang starts in <c>utf8</c> file
+      name mode if the terminal supports UTF-8, otherwise in
+      <c>latin1</c> mode.</p>
+
+      <p>In the <c>latin1</c> mode, file names are bytewise endcoded.
+      This allows for list representation of all file names in
       the system, but, for example, a file named "Östersund.txt", will
       appear in <c>file:list_dir/1</c> as either "Östersund.txt" (if
       the file name was encoded in bytewise ISO-Latin-1 by the program
@@ -752,7 +755,7 @@ Eshell V5.10.1  (abort with ^G)
 
 <section>
   <title>Notes About Raw File Names</title>
-
+  <marker id="notes-about-raw-filenames"/>
   <p>Raw file names were introduced together with Unicode file name
   support in erts-5.8.2 (OTP R14B01). The reason "raw file
   names" was introduced in the system was to be able to
@@ -1014,7 +1017,8 @@ ok
       allowed. This setting should correspond to the actual terminal
       you are using.</p>
       <p>The environment can also affect file name interpretation, if
-      Erlang is started with the <c>+fna</c> flag.</p>
+      Erlang is started with the <c>+fna</c> flag (which is default from
+      Erlang/OTP 17.0).</p>
       <p>You can check the setting of this by calling
       <c>io:getopts()</c>, which will give you an option list
       containing <c>{encoding,unicode}</c> or
@@ -1046,8 +1050,7 @@ ok
       > 255.</p>
       <p><c>+fnl</c> means bytewise interpretation of file names, which
       was the usual way to represent ISO-Latin-1 file names before
-      UTF-8 file naming got widespread. This is the default on all
-      Unix-like operating systems except MacOS X.</p>
+      UTF-8 file naming got widespread.</p>
       <p><c>+fnu</c> means that file names are encoded in UTF-8, which
       is nowadays the common scheme (although not enforced).</p>
       <p><c>+fna</c> means that you automatically select between
@@ -1055,8 +1058,8 @@ ok
       <c>LC_CTYPE</c> environment variables. This is optimistic
       heuristics indeed, nothing enforces a user to have a terminal
       with the same encoding as the file system, but usually, this is
-      the case. This might be the default behavior in a future
-      release.</p>
+      the case.  This is the default on all Unix-like operating
+      systems except MacOS X.</p>
 
       <p>The file name translation mode can be read with the
       <c>file:native_name_encoding/0</c> function, which returns
@@ -1068,7 +1071,7 @@ ok
       <p>This function returns the default encoding for Erlang source
       files (if no encoding comment is present) in the currently
       running release. For R16 this returns <c>latin1</c> (meaning
-      bytewise encoding). In R17 and forward it is expected to return
+      bytewise encoding). In 17.0 and forward it returns
       <c>utf8</c>.</p>
       <p>The encoding of each file can be specified using comments as
       described in
