diff options
author | Hans Bolinder <[email protected]> | 2014-03-17 12:05:18 +0100 |
---|---|---|
committer | Hans Bolinder <[email protected]> | 2014-03-17 12:19:03 +0100 |
commit | 6be3e658d9999274d5c0c702bf799b90a110c745 (patch) | |
tree | f5768b4cf40ca1f02a80b2b5f7ee5fd94bf6be96 | |
parent | 237264bc018b0cc17afeac5d3f6030073f314f9d (diff) | |
download | otp-6be3e658d9999274d5c0c702bf799b90a110c745.tar.gz otp-6be3e658d9999274d5c0c702bf799b90a110c745.tar.bz2 otp-6be3e658d9999274d5c0c702bf799b90a110c745.zip |
Clarify the reference manual regarding source file encoding
-rw-r--r-- | lib/stdlib/doc/src/epp.xml | 8 | ||||
-rw-r--r-- | system/doc/reference_manual/character_set.xml | 132 | ||||
-rw-r--r-- | system/doc/reference_manual/introduction.xml | 85 | ||||
-rw-r--r-- | system/doc/reference_manual/part.xml | 3 |
4 files changed, 140 insertions, 88 deletions
diff --git a/lib/stdlib/doc/src/epp.xml b/lib/stdlib/doc/src/epp.xml index cf33530395..8cc977681f 100644 --- a/lib/stdlib/doc/src/epp.xml +++ b/lib/stdlib/doc/src/epp.xml @@ -4,7 +4,7 @@ <erlref> <header> <copyright> - <year>1996</year><year>2013</year> + <year>1996</year><year>2014</year> <holder>Ericsson AB. All Rights Reserved.</holder> </copyright> <legalnotice> @@ -46,8 +46,10 @@ valid encodings are <c>Latin-1</c> and <c>UTF-8</c> where the case of the characters can be chosen freely. Examples:</p> <pre> -%% coding: utf-8 -%% For this file we have chosen encoding = Latin-1 +%% coding: utf-8</pre> + <pre> +%% For this file we have chosen encoding = Latin-1</pre> + <pre> %% -*- coding: latin-1 -*-</pre> </description> <datatypes> diff --git a/system/doc/reference_manual/character_set.xml b/system/doc/reference_manual/character_set.xml new file mode 100644 index 0000000000..884898eb34 --- /dev/null +++ b/system/doc/reference_manual/character_set.xml @@ -0,0 +1,132 @@ +<?xml version="1.0" encoding="utf-8" ?> +<!DOCTYPE chapter SYSTEM "chapter.dtd"> + +<chapter> + <header> + <copyright> + <year>2014</year><year>2014</year> + <holder>Ericsson AB. All Rights Reserved.</holder> + </copyright> + <legalnotice> + The contents of this file are subject to the Erlang Public License, + Version 1.1, (the "License"); you may not use this file except in + compliance with the License. You should have received a copy of the + Erlang Public License along with this software. If not, it can be + retrieved online at http://www.erlang.org/. + + Software distributed under the License is distributed on an "AS IS" + basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See + the License for the specific language governing rights and limitations + under the License. + + </legalnotice> + + <title>Character Set and Source File Encoding</title> + <prepared></prepared> + <docno></docno> + <date></date> + <rev></rev> + <file>character_set.xml</file> + </header> + + <section> + <title>Character Set</title> + <p>In Erlang 4.8/OTP R5A the syntax of Erlang tokens was extended to + allow the use of the full ISO-8859-1 (Latin-1) character set. This + is noticeable in the following ways:</p> + <list type="bulleted"> + <item> + <p>All the Latin-1 printable characters can be used and are + shown without the escape backslash convention.</p> + </item> + <item> + <p>Atoms and variables can use all Latin-1 letters.</p> + </item> + </list> + <table> + <row> + <cell align="left" valign="middle"><em>Octal</em></cell> + <cell align="left" valign="middle"><em>Decimal</em></cell> + <cell align="left" valign="middle"> </cell> + <cell align="left" valign="middle"><em>Class</em></cell> + </row> + <row> + <cell align="left" valign="middle">200 - 237</cell> + <cell align="left" valign="middle">128 - 159</cell> + <cell align="left" valign="middle"> </cell> + <cell align="left" valign="middle">Control characters</cell> + </row> + <row> + <cell align="left" valign="middle">240 - 277</cell> + <cell align="left" valign="middle">160 - 191</cell> + <cell align="right" valign="middle">- ¿</cell> + <cell align="left" valign="middle">Punctuation characters</cell> + </row> + <row> + <cell align="left" valign="middle">300 - 326</cell> + <cell align="left" valign="middle">192 - 214</cell> + <cell align="center" valign="middle">À - Ö</cell> + <cell align="left" valign="middle">Uppercase letters</cell> + </row> + <row> + <cell align="center" valign="middle">327</cell> + <cell align="center" valign="middle">215</cell> + <cell align="center" valign="middle">×</cell> + <cell align="left" valign="middle">Punctuation character</cell> + </row> + <row> + <cell align="left" valign="middle">330 - 336</cell> + <cell align="left" valign="middle">216 - 222</cell> + <cell align="center" valign="middle">Ø - Þ</cell> + <cell align="left" valign="middle">Uppercase letters</cell> + </row> + <row> + <cell align="left" valign="middle">337 - 366</cell> + <cell align="left" valign="middle">223 - 246</cell> + <cell align="center" valign="middle">ß - ö</cell> + <cell align="left" valign="middle">Lowercase letters</cell> + </row> + <row> + <cell align="center" valign="middle">367</cell> + <cell align="center" valign="middle">247</cell> + <cell align="center" valign="middle">÷</cell> + <cell align="left" valign="middle">Punctuation character</cell> + </row> + <row> + <cell align="left" valign="middle">370 - 377</cell> + <cell align="left" valign="middle">248 - 255</cell> + <cell align="center" valign="middle">ø - ÿ</cell> + <cell align="left" valign="middle">Lowercase letters</cell> + </row> + <tcaption>Character Classes.</tcaption> + </table> + <p>In Erlang/OTP R16B the syntax of Erlang tokens was extended to + handle Unicode. To begin with the support is limited to + strings, but Erlang/OTP 18 is expected to handle Unicode atoms + as well. More about the usage of Unicode in Erlang source files + can be found in <seealso + marker="stdlib:unicode_usage#unicode_in_erlang">STDLIB's User's + Guide</seealso>.</p> + </section> + <section> + <title>Source File Encoding</title> + <p>The Erlang source file <marker + id="encoding">encoding</marker> is selected by a + comment in one of the first two lines of the source file. The + first string that matches the regular expression + <c>coding\s*[:=]\s*([-a-zA-Z0-9])+</c> selects the encoding. If + the matching string is not a valid encoding it is ignored. The + valid encodings are <c>Latin-1</c> and <c>UTF-8</c> where the + case of the characters can be chosen freely.</p> + <p>The following example selects UTF-8 as default encoding:</p> + <pre> +%% coding: utf-8</pre> + <p>Two more examples, both selecting Latin-1 as default encoding:</p> + <pre> +%% For this file we have chosen encoding = Latin-1</pre> + <pre> +%% -*- coding: latin-1 -*-</pre> + <p>The default encoding for Erlang source files was changed from + Latin-1 to UTF-8 in Erlang OTP 17.0.</p> + </section> +</chapter> diff --git a/system/doc/reference_manual/introduction.xml b/system/doc/reference_manual/introduction.xml index aa42967625..36bec17825 100644 --- a/system/doc/reference_manual/introduction.xml +++ b/system/doc/reference_manual/introduction.xml @@ -4,7 +4,7 @@ <chapter> <header> <copyright> - <year>2003</year><year>2013</year> + <year>2003</year><year>2014</year> <holder>Ericsson AB. All Rights Reserved.</holder> </copyright> <legalnotice> @@ -79,88 +79,5 @@ when xor</p> </section> - <section> - <title>Character Set</title> - <p>In Erlang 4.8/OTP R5A the syntax of Erlang tokens was extended to - allow the use of the full ISO-8859-1 (Latin-1) character set. This - is noticeable in the following ways:</p> - <list type="bulleted"> - <item> - <p>All the Latin-1 printable characters can be used and are - shown without the escape backslash convention.</p> - </item> - <item> - <p>Atoms and variables can use all Latin-1 letters.</p> - </item> - </list> - <table> - <row> - <cell align="left" valign="middle"><em>Octal</em></cell> - <cell align="left" valign="middle"><em>Decimal</em></cell> - <cell align="left" valign="middle"> </cell> - <cell align="left" valign="middle"><em>Class</em></cell> - </row> - <row> - <cell align="left" valign="middle">200 - 237</cell> - <cell align="left" valign="middle">128 - 159</cell> - <cell align="left" valign="middle"> </cell> - <cell align="left" valign="middle">Control characters</cell> - </row> - <row> - <cell align="left" valign="middle">240 - 277</cell> - <cell align="left" valign="middle">160 - 191</cell> - <cell align="right" valign="middle">- ¿</cell> - <cell align="left" valign="middle">Punctuation characters</cell> - </row> - <row> - <cell align="left" valign="middle">300 - 326</cell> - <cell align="left" valign="middle">192 - 214</cell> - <cell align="center" valign="middle">À - Ö</cell> - <cell align="left" valign="middle">Uppercase letters</cell> - </row> - <row> - <cell align="center" valign="middle">327</cell> - <cell align="center" valign="middle">215</cell> - <cell align="center" valign="middle">×</cell> - <cell align="left" valign="middle">Punctuation character</cell> - </row> - <row> - <cell align="left" valign="middle">330 - 336</cell> - <cell align="left" valign="middle">216 - 222</cell> - <cell align="center" valign="middle">Ø - Þ</cell> - <cell align="left" valign="middle">Uppercase letters</cell> - </row> - <row> - <cell align="left" valign="middle">337 - 366</cell> - <cell align="left" valign="middle">223 - 246</cell> - <cell align="center" valign="middle">ß - ö</cell> - <cell align="left" valign="middle">Lowercase letters</cell> - </row> - <row> - <cell align="center" valign="middle">367</cell> - <cell align="center" valign="middle">247</cell> - <cell align="center" valign="middle">÷</cell> - <cell align="left" valign="middle">Punctuation character</cell> - </row> - <row> - <cell align="left" valign="middle">370 - 377</cell> - <cell align="left" valign="middle">248 - 255</cell> - <cell align="center" valign="middle">ø - ÿ</cell> - <cell align="left" valign="middle">Lowercase letters</cell> - </row> - <tcaption>Character Classes.</tcaption> - </table> - <p>In Erlang/OTP R16 the syntax of Erlang tokens was extended to - handle Unicode. To begin with the support is limited to strings, - but Erlang/OTP R18 is expected to handle Unicode atoms as well. - More about the usage of Unicode in Erlang source files can be - found in <seealso - marker="stdlib:unicode_usage#unicode_in_erlang">STDLIB's User'S - Guide</seealso>. The default encoding for Erlang source files - is still Latin-1, but in Erlang/OTP R17 the default encoding - will be UTF-8. The details on how to state the encoding of an - Erlang source file can be found in <seealso - marker="stdlib:epp#encoding">epp(3)</seealso>.</p> - </section> </chapter> diff --git a/system/doc/reference_manual/part.xml b/system/doc/reference_manual/part.xml index b4f114c268..ee8f3dd7eb 100644 --- a/system/doc/reference_manual/part.xml +++ b/system/doc/reference_manual/part.xml @@ -4,7 +4,7 @@ <part xmlns:xi="http://www.w3.org/2001/XInclude"> <header> <copyright> - <year>2003</year><year>2013</year> + <year>2003</year><year>2014</year> <holder>Ericsson AB. All Rights Reserved.</holder> </copyright> <legalnotice> @@ -28,6 +28,7 @@ <rev></rev> </header> <xi:include href="introduction.xml"/> + <xi:include href="character_set.xml"/> <xi:include href="data_types.xml"/> <xi:include href="patterns.xml"/> <xi:include href="modules.xml"/> |