aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorHans Bolinder <[email protected]>2014-03-17 12:05:18 +0100
committerHans Bolinder <[email protected]>2014-03-17 12:19:03 +0100
commit6be3e658d9999274d5c0c702bf799b90a110c745 (patch)
treef5768b4cf40ca1f02a80b2b5f7ee5fd94bf6be96
parent237264bc018b0cc17afeac5d3f6030073f314f9d (diff)
downloadotp-6be3e658d9999274d5c0c702bf799b90a110c745.tar.gz
otp-6be3e658d9999274d5c0c702bf799b90a110c745.tar.bz2
otp-6be3e658d9999274d5c0c702bf799b90a110c745.zip
Clarify the reference manual regarding source file encoding
-rw-r--r--lib/stdlib/doc/src/epp.xml8
-rw-r--r--system/doc/reference_manual/character_set.xml132
-rw-r--r--system/doc/reference_manual/introduction.xml85
-rw-r--r--system/doc/reference_manual/part.xml3
4 files changed, 140 insertions, 88 deletions
diff --git a/lib/stdlib/doc/src/epp.xml b/lib/stdlib/doc/src/epp.xml
index cf33530395..8cc977681f 100644
--- a/lib/stdlib/doc/src/epp.xml
+++ b/lib/stdlib/doc/src/epp.xml
@@ -4,7 +4,7 @@
<erlref>
<header>
<copyright>
- <year>1996</year><year>2013</year>
+ <year>1996</year><year>2014</year>
<holder>Ericsson AB. All Rights Reserved.</holder>
</copyright>
<legalnotice>
@@ -46,8 +46,10 @@
valid encodings are <c>Latin-1</c> and <c>UTF-8</c> where the
case of the characters can be chosen freely. Examples:</p>
<pre>
-%% coding: utf-8
-%% For this file we have chosen encoding = Latin-1
+%% coding: utf-8</pre>
+ <pre>
+%% For this file we have chosen encoding = Latin-1</pre>
+ <pre>
%% -*- coding: latin-1 -*-</pre>
</description>
<datatypes>
diff --git a/system/doc/reference_manual/character_set.xml b/system/doc/reference_manual/character_set.xml
new file mode 100644
index 0000000000..884898eb34
--- /dev/null
+++ b/system/doc/reference_manual/character_set.xml
@@ -0,0 +1,132 @@
+<?xml version="1.0" encoding="utf-8" ?>
+<!DOCTYPE chapter SYSTEM "chapter.dtd">
+
+<chapter>
+ <header>
+ <copyright>
+ <year>2014</year><year>2014</year>
+ <holder>Ericsson AB. All Rights Reserved.</holder>
+ </copyright>
+ <legalnotice>
+ The contents of this file are subject to the Erlang Public License,
+ Version 1.1, (the "License"); you may not use this file except in
+ compliance with the License. You should have received a copy of the
+ Erlang Public License along with this software. If not, it can be
+ retrieved online at http://www.erlang.org/.
+
+ Software distributed under the License is distributed on an "AS IS"
+ basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
+ the License for the specific language governing rights and limitations
+ under the License.
+
+ </legalnotice>
+
+ <title>Character Set and Source File Encoding</title>
+ <prepared></prepared>
+ <docno></docno>
+ <date></date>
+ <rev></rev>
+ <file>character_set.xml</file>
+ </header>
+
+ <section>
+ <title>Character Set</title>
+ <p>In Erlang 4.8/OTP R5A the syntax of Erlang tokens was extended to
+ allow the use of the full ISO-8859-1 (Latin-1) character set. This
+ is noticeable in the following ways:</p>
+ <list type="bulleted">
+ <item>
+ <p>All the Latin-1 printable characters can be used and are
+ shown without the escape backslash convention.</p>
+ </item>
+ <item>
+ <p>Atoms and variables can use all Latin-1 letters.</p>
+ </item>
+ </list>
+ <table>
+ <row>
+ <cell align="left" valign="middle"><em>Octal</em></cell>
+ <cell align="left" valign="middle"><em>Decimal</em></cell>
+ <cell align="left" valign="middle">&nbsp;</cell>
+ <cell align="left" valign="middle"><em>Class</em></cell>
+ </row>
+ <row>
+ <cell align="left" valign="middle">200 - 237</cell>
+ <cell align="left" valign="middle">128 - 159</cell>
+ <cell align="left" valign="middle">&nbsp;</cell>
+ <cell align="left" valign="middle">Control characters</cell>
+ </row>
+ <row>
+ <cell align="left" valign="middle">240 - 277</cell>
+ <cell align="left" valign="middle">160 - 191</cell>
+ <cell align="right" valign="middle">- &iquest;</cell>
+ <cell align="left" valign="middle">Punctuation characters</cell>
+ </row>
+ <row>
+ <cell align="left" valign="middle">300 - 326</cell>
+ <cell align="left" valign="middle">192 - 214</cell>
+ <cell align="center" valign="middle">&Agrave; - &Ouml;</cell>
+ <cell align="left" valign="middle">Uppercase letters</cell>
+ </row>
+ <row>
+ <cell align="center" valign="middle">327</cell>
+ <cell align="center" valign="middle">215</cell>
+ <cell align="center" valign="middle">&times;</cell>
+ <cell align="left" valign="middle">Punctuation character</cell>
+ </row>
+ <row>
+ <cell align="left" valign="middle">330 - 336</cell>
+ <cell align="left" valign="middle">216 - 222</cell>
+ <cell align="center" valign="middle">&Oslash; - &THORN;</cell>
+ <cell align="left" valign="middle">Uppercase letters</cell>
+ </row>
+ <row>
+ <cell align="left" valign="middle">337 - 366</cell>
+ <cell align="left" valign="middle">223 - 246</cell>
+ <cell align="center" valign="middle">&szlig; - &ouml;</cell>
+ <cell align="left" valign="middle">Lowercase letters</cell>
+ </row>
+ <row>
+ <cell align="center" valign="middle">367</cell>
+ <cell align="center" valign="middle">247</cell>
+ <cell align="center" valign="middle">&divide;</cell>
+ <cell align="left" valign="middle">Punctuation character</cell>
+ </row>
+ <row>
+ <cell align="left" valign="middle">370 - 377</cell>
+ <cell align="left" valign="middle">248 - 255</cell>
+ <cell align="center" valign="middle">&oslash; - &yuml;</cell>
+ <cell align="left" valign="middle">Lowercase letters</cell>
+ </row>
+ <tcaption>Character Classes.</tcaption>
+ </table>
+ <p>In Erlang/OTP R16B the syntax of Erlang tokens was extended to
+ handle Unicode. To begin with the support is limited to
+ strings, but Erlang/OTP 18 is expected to handle Unicode atoms
+ as well. More about the usage of Unicode in Erlang source files
+ can be found in <seealso
+ marker="stdlib:unicode_usage#unicode_in_erlang">STDLIB's User's
+ Guide</seealso>.</p>
+ </section>
+ <section>
+ <title>Source File Encoding</title>
+ <p>The Erlang source file <marker
+ id="encoding">encoding</marker> is selected by a
+ comment in one of the first two lines of the source file. The
+ first string that matches the regular expression
+ <c>coding\s*[:=]\s*([-a-zA-Z0-9])+</c> selects the encoding. If
+ the matching string is not a valid encoding it is ignored. The
+ valid encodings are <c>Latin-1</c> and <c>UTF-8</c> where the
+ case of the characters can be chosen freely.</p>
+ <p>The following example selects UTF-8 as default encoding:</p>
+ <pre>
+%% coding: utf-8</pre>
+ <p>Two more examples, both selecting Latin-1 as default encoding:</p>
+ <pre>
+%% For this file we have chosen encoding = Latin-1</pre>
+ <pre>
+%% -*- coding: latin-1 -*-</pre>
+ <p>The default encoding for Erlang source files was changed from
+ Latin-1 to UTF-8 in Erlang OTP 17.0.</p>
+ </section>
+</chapter>
diff --git a/system/doc/reference_manual/introduction.xml b/system/doc/reference_manual/introduction.xml
index aa42967625..36bec17825 100644
--- a/system/doc/reference_manual/introduction.xml
+++ b/system/doc/reference_manual/introduction.xml
@@ -4,7 +4,7 @@
<chapter>
<header>
<copyright>
- <year>2003</year><year>2013</year>
+ <year>2003</year><year>2014</year>
<holder>Ericsson AB. All Rights Reserved.</holder>
</copyright>
<legalnotice>
@@ -79,88 +79,5 @@
when xor</p>
</section>
- <section>
- <title>Character Set</title>
- <p>In Erlang 4.8/OTP R5A the syntax of Erlang tokens was extended to
- allow the use of the full ISO-8859-1 (Latin-1) character set. This
- is noticeable in the following ways:</p>
- <list type="bulleted">
- <item>
- <p>All the Latin-1 printable characters can be used and are
- shown without the escape backslash convention.</p>
- </item>
- <item>
- <p>Atoms and variables can use all Latin-1 letters.</p>
- </item>
- </list>
- <table>
- <row>
- <cell align="left" valign="middle"><em>Octal</em></cell>
- <cell align="left" valign="middle"><em>Decimal</em></cell>
- <cell align="left" valign="middle">&nbsp;</cell>
- <cell align="left" valign="middle"><em>Class</em></cell>
- </row>
- <row>
- <cell align="left" valign="middle">200 - 237</cell>
- <cell align="left" valign="middle">128 - 159</cell>
- <cell align="left" valign="middle">&nbsp;</cell>
- <cell align="left" valign="middle">Control characters</cell>
- </row>
- <row>
- <cell align="left" valign="middle">240 - 277</cell>
- <cell align="left" valign="middle">160 - 191</cell>
- <cell align="right" valign="middle">- &iquest;</cell>
- <cell align="left" valign="middle">Punctuation characters</cell>
- </row>
- <row>
- <cell align="left" valign="middle">300 - 326</cell>
- <cell align="left" valign="middle">192 - 214</cell>
- <cell align="center" valign="middle">&Agrave; - &Ouml;</cell>
- <cell align="left" valign="middle">Uppercase letters</cell>
- </row>
- <row>
- <cell align="center" valign="middle">327</cell>
- <cell align="center" valign="middle">215</cell>
- <cell align="center" valign="middle">&times;</cell>
- <cell align="left" valign="middle">Punctuation character</cell>
- </row>
- <row>
- <cell align="left" valign="middle">330 - 336</cell>
- <cell align="left" valign="middle">216 - 222</cell>
- <cell align="center" valign="middle">&Oslash; - &THORN;</cell>
- <cell align="left" valign="middle">Uppercase letters</cell>
- </row>
- <row>
- <cell align="left" valign="middle">337 - 366</cell>
- <cell align="left" valign="middle">223 - 246</cell>
- <cell align="center" valign="middle">&szlig; - &ouml;</cell>
- <cell align="left" valign="middle">Lowercase letters</cell>
- </row>
- <row>
- <cell align="center" valign="middle">367</cell>
- <cell align="center" valign="middle">247</cell>
- <cell align="center" valign="middle">&divide;</cell>
- <cell align="left" valign="middle">Punctuation character</cell>
- </row>
- <row>
- <cell align="left" valign="middle">370 - 377</cell>
- <cell align="left" valign="middle">248 - 255</cell>
- <cell align="center" valign="middle">&oslash; - &yuml;</cell>
- <cell align="left" valign="middle">Lowercase letters</cell>
- </row>
- <tcaption>Character Classes.</tcaption>
- </table>
- <p>In Erlang/OTP R16 the syntax of Erlang tokens was extended to
- handle Unicode. To begin with the support is limited to strings,
- but Erlang/OTP R18 is expected to handle Unicode atoms as well.
- More about the usage of Unicode in Erlang source files can be
- found in <seealso
- marker="stdlib:unicode_usage#unicode_in_erlang">STDLIB's User'S
- Guide</seealso>. The default encoding for Erlang source files
- is still Latin-1, but in Erlang/OTP R17 the default encoding
- will be UTF-8. The details on how to state the encoding of an
- Erlang source file can be found in <seealso
- marker="stdlib:epp#encoding">epp(3)</seealso>.</p>
- </section>
</chapter>
diff --git a/system/doc/reference_manual/part.xml b/system/doc/reference_manual/part.xml
index b4f114c268..ee8f3dd7eb 100644
--- a/system/doc/reference_manual/part.xml
+++ b/system/doc/reference_manual/part.xml
@@ -4,7 +4,7 @@
<part xmlns:xi="http://www.w3.org/2001/XInclude">
<header>
<copyright>
- <year>2003</year><year>2013</year>
+ <year>2003</year><year>2014</year>
<holder>Ericsson AB. All Rights Reserved.</holder>
</copyright>
<legalnotice>
@@ -28,6 +28,7 @@
<rev></rev>
</header>
<xi:include href="introduction.xml"/>
+ <xi:include href="character_set.xml"/>
<xi:include href="data_types.xml"/>
<xi:include href="patterns.xml"/>
<xi:include href="modules.xml"/>