Diffstat (limited to 'lib/stdlib/doc/src')
-rw-r--r-- lib/stdlib/doc/src/unicode_usage.xml 35
1 files changed, 19 insertions, 16 deletions
diff --git a/lib/stdlib/doc/src/unicode_usage.xml b/lib/stdlib/doc/src/unicode_usage.xml
index 33cd70e0b7..ee7dd128f1 100644
--- a/lib/stdlib/doc/src/unicode_usage.xml
+++ b/lib/stdlib/doc/src/unicode_usage.xml
@@ -52,8 +52,8 @@
for UTF-8 and more support for Unicode character sets in the
I/O-system.</p>
- <p>In R17, the encoding default for Erlang source files will be
- switched to UTF-8 and in R18 Erlang will support atoms in the full
+ <p>In 17.0, the encoding default for Erlang source files was
+ switched to UTF-8 and in 18.0 Erlang will support atoms in the full
Unicode range, meaning full Unicode function and module
names</p>
@@ -290,7 +290,7 @@
<item>Having the source code in UTF-8 also allows you to write
string literals containing Unicode characters with code points &gt;
255, although atoms, module names and function names will be
- restricted to the ISO-Latin-1 range until the R18 release. Binary
+ restricted to the ISO-Latin-1 range until the 18.0 release. Binary
literals where you use the <c>/utf8</c> type, can also be
expressed using Unicode characters &gt; 255. Having module names
using characters other than 7-bit ASCII can cause trouble on
@@ -385,7 +385,7 @@ external_charlist() = maybe_improper_list(char() |
using characters from the ISO-latin-1 character set and atoms are
restricted to the same ISO-latin-1 range. These restrictions in the
language are of course independent of the encoding of the source
- file. Erlang/OTP R18 is expected to handle functions named in
+ file. Erlang/OTP 18.0 is expected to handle functions named in
Unicode as well as Unicode atoms.</p>
<section>
<title>Bit-syntax</title>
@@ -662,11 +662,14 @@ Eshell V5.10.1 (abort with ^G)
containing characters having code points between 128 and 255 may
be named either as plain ISO-latin-1 or using UTF-8 encoding. As
no consistency is enforced, the Erlang VM can do no consistent
- translation of all file names. If the VM would automatically
- select encoding based on heuristics, one could get unexpected
- behavior on these systems. By default, Erlang starts in "latin1"
- file name mode on such systems, meaning bytewise encoding in file
- names. This allows for list representation of all file names in
+ translation of all file names.</p>
+
+ <p>By default on such systems, Erlang starts in <c>utf8</c> file
+ name mode if the terminal supports UTF-8, otherwise in
+ <c>latin1</c> mode.</p>
+
+ <p>In the <c>latin1</c> mode, file names are bytewise encoded.
+ This allows for list representation of all file names in
the system, but, for example, a file named "Östersund.txt", will
appear in <c>file:list_dir/1</c> as either "Östersund.txt" (if
the file name was encoded in bytewise ISO-Latin-1 by the program
@@ -752,7 +755,7 @@ Eshell V5.10.1 (abort with ^G)
<section>
<title>Notes About Raw File Names</title>
-
+ <marker id="notes-about-raw-filenames"/>
<p>Raw file names were introduced together with Unicode file name
support in erts-5.8.2 (OTP R14B01). The reason &quot;raw file
names&quot; was introduced in the system was to be able to
@@ -1014,7 +1017,8 @@ ok
allowed. This setting should correspond to the actual terminal
you are using.</p>
<p>The environment can also affect file name interpretation, if
- Erlang is started with the <c>+fna</c> flag.</p>
+ Erlang is started with the <c>+fna</c> flag (which is default from
+ Erlang/OTP 17.0).</p>
<p>You can check the setting of this by calling
<c>io:getopts()</c>, which will give you an option list
containing <c>{encoding,unicode}</c> or
@@ -1046,8 +1050,7 @@ ok
&gt; 255.</p>
<p><c>+fnl</c> means bytewise interpretation of file names, which
was the usual way to represent ISO-Latin-1 file names before
- UTF-8 file naming got widespread. This is the default on all
- Unix-like operating systems except MacOS X.</p>
+ UTF-8 file naming got widespread.</p>
<p><c>+fnu</c> means that file names are encoded in UTF-8, which
is nowadays the common scheme (although not enforced).</p>
<p><c>+fna</c> means that you automatically select between
@@ -1055,8 +1058,8 @@ ok
<c>LC_CTYPE</c> environment variables. This is optimistic
heuristics indeed, nothing enforces a user to have a terminal
with the same encoding as the file system, but usually, this is
- the case. This might be the default behavior in a future
- release.</p>
+ the case. This is the default on all Unix-like operating
+ systems except MacOS X.</p>
<p>The file name translation mode can be read with the
<c>file:native_name_encoding/0</c> function, which returns
@@ -1068,7 +1071,7 @@ ok
<p>This function returns the default encoding for Erlang source
files (if no encoding comment is present) in the currently
running release. For R16 this returns <c>latin1</c> (meaning
- bytewise encoding). In R17 and forward it is expected to return
+ bytewise encoding). In 17.0 and forward it returns
<c>utf8</c>.</p>
<p>The encoding of each file can be specified using comments as
described in