Diffstat (limited to 'lib/stdlib')
-rw-r--r-- | lib/stdlib/doc/src/unicode_usage.xml | 44 |
1 files changed, 22 insertions, 22 deletions
diff --git a/lib/stdlib/doc/src/unicode_usage.xml b/lib/stdlib/doc/src/unicode_usage.xml
index ee7dd128f1..75505d7d84 100644
--- a/lib/stdlib/doc/src/unicode_usage.xml
+++ b/lib/stdlib/doc/src/unicode_usage.xml
@@ -5,7 +5,7 @@
  <header> <copyright> <year>1999</year>
- <year>2013</year>
+ <year>2014</year>
  <holder>Ericsson AB. All Rights Reserved.</holder> </copyright> <legalnotice>
@@ -41,10 +41,10 @@
  future.</p> <p>The functionality described in EEP10 was implemented in Erlang/OTP
- as of R13A, but that was by no means the end of it. In R14B01 support
+ R13A, but that was by no means the end of it. In Erlang/OTP R14B01 support
  for Unicode file names was added, although it was in no way complete and was by default disabled on platforms where no guarantee was given
- for the file name encoding. With R16A came support for UTF-8 encoded
+ for the file name encoding. With Erlang/OTP R16A came support for UTF-8 encoded
  source code, among with enhancements to many of the applications to support both Unicode encoded file names as well as support for UTF-8 encoded files in several circumstances. Most notable is the support
@@ -52,8 +52,8 @@
  for UTF-8 and more support for Unicode character sets in the I/O-system.</p>
- <p>In 17.0, the encoding default for Erlang source files was
- switched to UTF-8 and in 18.0 Erlang will support atoms in the full
+ <p>In Erlang/OTP 17.0, the encoding default for Erlang source files was
+ switched to UTF-8 and in Erlang/OTP 18.0 Erlang will support atoms in the full
  Unicode range, meaning full Unicode function and module names</p>
@@ -222,7 +222,7 @@
  <tag>Representation</tag> <item>To handle Unicode characters in Erlang, we have to have a common representation both in lists and binaries. The EEP (10) and
- the subsequent initial implementation in R13A settled a standard
+ the subsequent initial implementation in Erlang/OTP R13A settled a standard
  representation of Unicode characters in Erlang.</item> <tag>Manipulation</tag> <item>The Unicode characters need to be processed by the Erlang
@@ -274,9 +274,9 @@
  (<c>+fnu</c>) on platforms where this is not the default.</item> <tag>Source code encoding</tag> <item>When it comes to the Erlang source code, there is support
- for the UTF-8 encoding and bytewise encoding. The default in R16B
- is bytewise (or latin1) encoding. You can control the encoding by
- a comment like:
+ for the UTF-8 encoding and bytewise encoding. The default in
+ Erlang/OTP R16B was bytewise (or latin1) encoding; in Erlang/OTP 17.0
+ it was changed to UTF-8. You can control the encoding by a comment like:
  <code> %% -*- coding: utf-8 -*- </code>
@@ -290,7 +290,7 @@
  <item>Having the source code in UTF-8 also allows you to write string literals containing Unicode characters with code points > 255, although atoms, module names and function names will be
- restricted to the ISO-Latin-1 range until the 18.0 release. Binary
+ restricted to the ISO-Latin-1 range until the Erlang/OTP 18.0 release. Binary
  literals where you use the <c>/utf8</c> type, can also be expressed using Unicode characters > 255. Having module names using characters other than 7-bit ASCII can cause trouble on
@@ -304,7 +304,7 @@
  <section> <title>Standard Unicode Representation</title> <p>In Erlang, strings are actually lists of integers. A string was
- up until R13 defined to be encoded in the ISO-latin-1 (ISO8859-1)
+ up until Erlang/OTP R13 defined to be encoded in the ISO-latin-1 (ISO8859-1)
  character set, which is, code point by code point, a sub-range of the Unicode character set.</p> <p>The standard list encoding for strings was therefore easily
@@ -321,7 +321,7 @@
  encoding has to be decided upon and the string should be converted to a binary in the preferred encoding using <c>unicode:characters_to_binary/{1,2,3}</c>. Strings are not
- generally lists of bytes, as they were before R13. They are lists of
+ generally lists of bytes, as they were before Erlang/OTP R13. They are lists of
  characters. Characters are not generally bytes, they are Unicode code points.</p>
@@ -447,8 +447,8 @@ Bin4 = <<"Hello"/utf16>>,</code>
  probably will not appreciate). Another way is to keep it backwards compatible so that only the ISO-Latin-1 character set is used to detect a string. A third way would be to let the user decide
- exactly what Unicode ranges are to be viewed as characters. In
- R16B you can select either the whole Unicode range or the
+ exactly what Unicode ranges are to be viewed as characters. Since
+ Erlang/OTP R16B you can select either the whole Unicode range or the
  ISO-Latin-1 range by supplying the startup flag <c>+pc </c><i>Range</i>, where <i>Range</i> is either <c>latin1</c> or <c>unicode</c>. For backwards compatibility, the default is
@@ -685,7 +685,7 @@ Eshell V5.10.1 (abort with ^G)
  </item> </taglist>
- <p>The Unicode file naming support was introduced with OTP release
+ <p>The Unicode file naming support was introduced with Erlang/OTP
  R14B01. A VM operating in Unicode file name translation mode can work with files having names in any language or character set (as long as it is supported by the underlying OS and file system). The
@@ -709,7 +709,7 @@ Eshell V5.10.1 (abort with ^G)
  problem even if it uses transparent file naming. Very few systems have mixed file name encodings. A consistent UTF-8 named system will work perfectly in Unicode file name mode. It was still however
- considered experimental in R14B01 and is still not the default on
+ considered experimental in Erlang/OTP R14B01 and is still not the default on
  such systems. Unicode file name translation is turned on with the <c>+fnu</c> switch to the On Linux, a VM started without explicitly stating the file name translation mode will default to <c>latin1</c>
@@ -757,7 +757,7 @@ Eshell V5.10.1 (abort with ^G)
  <title>Notes About Raw File Names</title> <marker id="notes-about-raw-filenames"/> <p>Raw file names were introduced together with Unicode file name
- support in erts-5.8.2 (OTP R14B01). The reason "raw file
+ support in erts-5.8.2 (Erlang/OTP R14B01). The reason "raw file
  names" was introduced in the system was to be able to consistently represent file names given in different encodings on the same system. Having the VM automatically translate a file name
@@ -798,10 +798,10 @@ Eshell V5.10.1 (abort with ^G)
  the argument as a binary.</p> <p>To force Unicode file name translation mode on systems where this
- is not the default was considered experimental in OTP R14B01 due to
+ is not the default was considered experimental in Erlang/OTP R14B01 due to
  the fact that the initial implementation did not ignore wrongly encoded file names, so that raw file names could spread unexpectedly
- throughout the system. Beginning with R16B, the wrongly encoded file
+ throughout the system. Beginning with Erlang/OTP R16B, the wrongly encoded file
  names are only retrieved by special functions (e.g. <c>file:list_dir_all/1</c>), so the impact on existing code is much lower, why it is now supported. Unicode file name translation
@@ -1032,7 +1032,7 @@ ok
  <c>io</c>/<c>io_lib:format</c> with the <c>"~tp"</c> and <c>~tP</c> formatting instructions, as described above.</p> <p>You can check this option by calling io:printable_range/0,
- which in R16B will return <c>unicode</c> or <c>latin1</c>. To be
+ which will return <c>unicode</c> or <c>latin1</c>. To be
  compatible with future (expected) extensions to the settings, one should rather use <c>io_lib:printable_list/1</c> to check if a list is printable according to the setting. That function will
@@ -1070,8 +1070,8 @@ ok
  <item> <p>This function returns the default encoding for Erlang source files (if no encoding comment is present) in the currently
- running release. For R16 this returns <c>latin1</c> (meaning
- bytewise encoding). In 17.0 and forward it returns
+ running release. In Erlang/OTP R16B <c>latin1</c> was returned (meaning
+ bytewise encoding). In Erlang/OTP 17.0 and forward it returns
  <c>utf8</c>.</p> <p>The encoding of each file can be specified using comments as described in |