aboutsummaryrefslogtreecommitdiffstats
path: root/lib/stdlib/doc/src/erl_scan.xml
diff options
context:
space:
mode:
Diffstat (limited to 'lib/stdlib/doc/src/erl_scan.xml')
-rw-r--r--lib/stdlib/doc/src/erl_scan.xml470
1 files changed, 189 insertions, 281 deletions
diff --git a/lib/stdlib/doc/src/erl_scan.xml b/lib/stdlib/doc/src/erl_scan.xml
index 54240dea19..137ccd3416 100644
--- a/lib/stdlib/doc/src/erl_scan.xml
+++ b/lib/stdlib/doc/src/erl_scan.xml
@@ -1,23 +1,24 @@
-<?xml version="1.0" encoding="latin1" ?>
+<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE erlref SYSTEM "erlref.dtd">
<erlref>
<header>
<copyright>
- <year>1996</year><year>2011</year>
+ <year>1996</year><year>2016</year>
<holder>Ericsson AB. All Rights Reserved.</holder>
</copyright>
<legalnotice>
- The contents of this file are subject to the Erlang Public License,
- Version 1.1, (the "License"); you may not use this file except in
- compliance with the License. You should have received a copy of the
- Erlang Public License along with this software. If not, it can be
- retrieved online at http://www.erlang.org/.
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
- Software distributed under the License is distributed on an "AS IS"
- basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
- the License for the specific language governing rights and limitations
- under the License.
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
</legalnotice>
@@ -27,51 +28,28 @@
<docno>1</docno>
<approved>Bjarne D&auml;cker</approved>
<checked></checked>
- <date>97-01-24</date>
+ <date>1997-01-24</date>
<rev>B</rev>
- <file>erl_scan.sgml</file>
+ <file>erl_scan.xml</file>
</header>
<module>erl_scan</module>
- <modulesummary>The Erlang Token Scanner</modulesummary>
+ <modulesummary>The Erlang token scanner.</modulesummary>
<description>
- <p>This module contains functions for tokenizing characters into
+ <p>This module contains functions for tokenizing (scanning) characters into
Erlang tokens.</p>
</description>
+
<datatypes>
<datatype>
- <name name="attribute_info"></name>
- </datatype>
- <datatype>
- <name name="attributes"></name>
- </datatype>
- <datatype>
- <name name="attributes_data"></name>
- </datatype>
- <datatype>
<name name="category"></name>
</datatype>
<datatype>
- <name name="column"></name>
- </datatype>
- <datatype>
<name name="error_description"></name>
</datatype>
<datatype>
<name name="error_info"></name>
</datatype>
<datatype>
- <name name="info_line"></name>
- </datatype>
- <datatype>
- <name name="info_location"></name>
- </datatype>
- <datatype>
- <name name="line"></name>
- </datatype>
- <datatype>
- <name name="location"></name>
- </datatype>
- <datatype>
<name name="option"></name>
</datatype>
<datatype>
@@ -87,32 +65,102 @@
<name name="token"></name>
</datatype>
<datatype>
- <name name="token_info"></name>
- </datatype>
- <datatype>
<name name="tokens"></name>
</datatype>
<datatype>
<name name="tokens_result"></name>
</datatype>
</datatypes>
+
<funcs>
<func>
+ <name name="category" arity="1"/>
+ <fsummary>Return the category.</fsummary>
+ <desc>
+ <p>Returns the category of <c><anno>Token</anno></c>.</p>
+ </desc>
+ </func>
+
+ <func>
+ <name name="column" arity="1"/>
+ <fsummary>Return the column.</fsummary>
+ <desc>
+ <p>Returns the column of <c><anno>Token</anno></c>'s
+ collection of annotations.</p>
+ </desc>
+ </func>
+
+ <func>
+ <name name="end_location" arity="1"/>
+ <fsummary>Return the end location of the text.</fsummary>
+ <desc>
+ <p>Returns the end location of the text of
+ <c><anno>Token</anno></c>'s collection of annotations. If
+ there is no text, <c>undefined</c> is returned.</p>
+ </desc>
+ </func>
+
+ <func>
+ <name name="format_error" arity="1"/>
+ <fsummary>Format an error descriptor.</fsummary>
+ <desc>
+ <p>Uses an <c><anno>ErrorDescriptor</anno></c> and returns a string
+ that describes the error or warning. This function is usually
+ called implicitly when an <c>ErrorInfo</c> structure is
+ processed (see section
+ <seealso marker="#errorinfo">Error Information</seealso>).</p>
+ </desc>
+ </func>
+
+ <func>
+ <name name="line" arity="1"/>
+ <fsummary>Return the line.</fsummary>
+ <desc>
+ <p>Returns the line of <c><anno>Token</anno></c>'s collection
+ of annotations.</p>
+ </desc>
+ </func>
+
+ <func>
+ <name name="location" arity="1"/>
+ <fsummary>Return the location.</fsummary>
+ <desc>
+ <p>Returns the location of <c><anno>Token</anno></c>'s
+ collection of annotations.</p>
+ </desc>
+ </func>
+
+ <func>
+ <name name="reserved_word" arity="1"/>
+ <fsummary>Test for a reserved word.</fsummary>
+ <desc>
+ <p>Returns <c>true</c> if <c><anno>Atom</anno></c> is an
+ Erlang reserved word, otherwise <c>false</c>.</p>
+ </desc>
+ </func>
+
+ <func>
<name name="string" arity="1"/>
<name name="string" arity="2"/>
<name name="string" arity="3"/>
- <fsummary>Scan a string and return the Erlang tokens</fsummary>
+ <fsummary>Scan a string and return the Erlang tokens.</fsummary>
<desc>
<p>Takes the list of characters <c><anno>String</anno></c> and tries to
- scan (tokenize) them. Returns <c>{ok, <anno>Tokens</anno>,
- <anno>EndLocation</anno>}</c>,
- where <c><anno>Tokens</anno></c> are the Erlang tokens from
- <c><anno>String</anno></c>. <c><anno>EndLocation</anno></c>
- is the first location after the last token.</p>
- <p><c>{error, <anno>ErrorInfo</anno>, <anno>ErrorLocation</anno>}</c>
- is returned if an error occurs.
- <c><anno>ErrorLocation</anno></c> is the first location after
- the erroneous token.</p>
+ scan (tokenize) them. Returns one of the following:</p>
+ <taglist>
+ <tag><c>{ok, <anno>Tokens</anno>, <anno>EndLocation</anno>}</c></tag>
+ <item>
+ <p><c><anno>Tokens</anno></c> are the Erlang tokens from
+ <c><anno>String</anno></c>. <c><anno>EndLocation</anno></c>
+ is the first location after the last token.</p>
+ </item>
+ <tag><c>{error, <anno>ErrorInfo</anno>,
+ <anno>ErrorLocation</anno>}</c></tag>
+ <item>
+ <p>An error occurred. <c><anno>ErrorLocation</anno></c> is the
+ first location after the erroneous token.</p>
+ </item>
+ </taglist>
<p><c>string(<anno>String</anno>)</c> is equivalent to
<c>string(<anno>String</anno>, 1)</c>, and
<c>string(<anno>String</anno>,
@@ -120,79 +168,103 @@
<c>string(<anno>String</anno>,
<anno>StartLocation</anno>, [])</c>.</p>
<p><c><anno>StartLocation</anno></c> indicates the initial location
- when scanning starts. If <c><anno>StartLocation</anno></c> is a line
- <c>attributes()</c> as well as <c><anno>EndLocation</anno></c> and
- <c><anno>ErrorLocation</anno></c> will be lines. If
- <c><anno>StartLocation</anno></c> is a pair of a line and a column
- <c>attributes()</c> takes the form of an opaque compound
+ when scanning starts. If <c><anno>StartLocation</anno></c> is a line,
+ <c>Anno</c>, <c><anno>EndLocation</anno></c>, and
+ <c><anno>ErrorLocation</anno></c> are lines. If
+ <c><anno>StartLocation</anno></c> is a pair of a line and a column,
+ <c>Anno</c> takes the form of an opaque compound
data type, and <c><anno>EndLocation</anno></c> and
<c><anno>ErrorLocation</anno></c>
- will be pairs of a line and a column. The <em>token
- attributes</em> contain information about the column and the
+ are pairs of a line and a column. The <em>token
+ annotations</em> contain information about the column and the
line where the token begins, as well as the text of the
- token (if the <c>text</c> option is given), all of which can
- be accessed by calling <seealso
- marker="#token_info/1">token_info/1,2</seealso> or <seealso
- marker="#attributes_info/1">attributes_info/1,2</seealso>.</p>
+ token (if option <c>text</c> is specified), all of which can
+ be accessed by calling
+ <seealso marker="#column/1"><c>column/1</c></seealso>,
+ <seealso marker="#line/1"><c>line/1</c></seealso>,
+ <seealso marker="#location/1"><c>location/1</c></seealso>, and
+ <seealso marker="#text/1"><c>text/1</c></seealso>.</p>
<p>A <em>token</em> is a tuple containing information about
- syntactic category, the token attributes, and the actual
- terminal symbol. For punctuation characters (e.g. <c>;</c>,
+ syntactic category, the token annotations, and the
+ terminal symbol. For punctuation characters (such as <c>;</c> and
<c>|</c>) and reserved words, the category and the symbol
coincide, and the token is represented by a two-tuple.
- Three-tuples have one of the following forms: <c>{atom,
- Info, atom()}</c>,
- <c>{char, Info, integer()}</c>, <c>{comment, Info,
- string()}</c>, <c>{float, Info, float()}</c>, <c>{integer,
- Info, integer()}</c>, <c>{var, Info, atom()}</c>,
- and <c>{white_space, Info, string()}</c>.</p>
- <p>The valid options are:</p>
+ Three-tuples have one of the following forms:</p>
+ <list type="bulleted">
+ <item><c>{atom, Anno, atom()}</c></item>
+ <item><c>{char, Anno, char()}</c></item>
+ <item><c>{comment, Anno, string()}</c></item>
+ <item><c>{float, Anno, float()}</c></item>
+ <item><c>{integer, Anno, integer()}</c></item>
+ <item><c>{var, Anno, atom()}</c></item>
+ <item><c>{white_space, Anno, string()}</c></item>
+ </list>
+ <p>Valid options:</p>
<taglist>
- <tag><c>{reserved_word_fun, reserved_word_fun()}</c></tag>
- <item><p>A callback function that is called when the scanner
- has found an unquoted atom. If the function returns
- <c>true</c>, the unquoted atom itself will be the category
- of the token; if the function returns <c>false</c>,
- <c>atom</c> will be the category of the unquoted atom.</p>
- </item>
- <tag><c>return_comments</c></tag>
- <item><p>Return comment tokens.</p>
- </item>
- <tag><c>return_white_spaces</c></tag>
- <item><p>Return white space tokens. By convention, if there is
- a newline character, it is always the first character of the
- text (there cannot be more than one newline in a white space
- token).</p>
- </item>
- <tag><c>return</c></tag>
- <item><p>Short for <c>[return_comments, return_white_spaces]</c>.</p>
- </item>
- <tag><c>text</c></tag>
- <item><p>Include the token's text in the token attributes. The
- text is the part of the input corresponding to the token.</p>
- </item>
+ <tag><c>{reserved_word_fun, reserved_word_fun()}</c></tag>
+ <item><p>A callback function that is called when the scanner
+ has found an unquoted atom. If the function returns
+ <c>true</c>, the unquoted atom itself becomes the category
+ of the token. If the function returns <c>false</c>,
+ <c>atom</c> becomes the category of the unquoted atom.</p>
+ </item>
+ <tag><c>return_comments</c></tag>
+ <item><p>Return comment tokens.</p>
+ </item>
+ <tag><c>return_white_spaces</c></tag>
+ <item><p>Return white space tokens. By convention, a newline
+ character, if present, is always the first character of the
+ text (there cannot be more than one newline in a white space
+ token).</p>
+ </item>
+ <tag><c>return</c></tag>
+ <item><p>Short for <c>[return_comments, return_white_spaces]</c>.</p>
+ </item>
+ <tag><c>text</c></tag>
+ <item><p>Include the token text in the token annotation. The
+ text is the part of the input corresponding to the token.</p>
+ </item>
</taglist>
</desc>
</func>
+
+ <func>
+ <name name="symbol" arity="1"/>
+ <fsummary>Return the symbol.</fsummary>
+ <desc>
+ <p>Returns the symbol of <c><anno>Token</anno></c>.</p>
+ </desc>
+ </func>
+
+ <func>
+ <name name="text" arity="1"/>
+ <fsummary>Return the text.</fsummary>
+ <desc>
+ <p>Returns the text of <c><anno>Token</anno></c>'s collection
+ of annotations. If there is no text, <c>undefined</c> is
+ returned.</p>
+ </desc>
+ </func>
+
<func>
<name name="tokens" arity="3"/>
<name name="tokens" arity="4"/>
+ <fsummary>Re-entrant scanner.</fsummary>
<type name="char_spec"/>
<type name="return_cont"/>
- <type_desc name="return_cont">An opaque continuation</type_desc>
- <fsummary>Re-entrant scanner</fsummary>
+ <type_desc name="return_cont">An opaque continuation.</type_desc>
<desc>
- <p>This is the re-entrant scanner which scans characters until
- a <em>dot</em> ('.' followed by a white space) or
- <c>eof</c> has been reached. It returns:</p>
+ <p>This is the re-entrant scanner, which scans characters until
+ either a <em>dot</em> ('.' followed by a white space) or
+ <c>eof</c> is reached. It returns:</p>
<taglist>
<tag><c>{done, <anno>Result</anno>, <anno>LeftOverChars</anno>}</c>
</tag>
<item>
- <p>This return indicates that there is sufficient input
+ <p>Indicates that there is sufficient input
data to get a result. <c><anno>Result</anno></c> is:</p>
<taglist>
- <tag><c>{ok, Tokens, EndLocation}</c>
- </tag>
+ <tag><c>{ok, Tokens, EndLocation}</c></tag>
<item>
<p>The scanning was successful. <c>Tokens</c>
is the list of tokens including <em>dot</em>.</p>
@@ -201,8 +273,7 @@
<item>
<p>End of file was encountered before any more tokens.</p>
</item>
- <tag><c>{error, ErrorInfo, EndLocation}</c>
- </tag>
+ <tag><c>{error, ErrorInfo, EndLocation}</c></tag>
<item>
<p>An error occurred. <c><anno>LeftOverChars</anno></c>
is the remaining characters of the input data,
@@ -218,190 +289,26 @@
</item>
</taglist>
<p>The <c><anno>CharSpec</anno></c> <c>eof</c> signals end of file.
- <c><anno>LeftOverChars</anno></c> will then take the value <c>eof</c>
+ <c><anno>LeftOverChars</anno></c> then takes the value <c>eof</c>
as well.</p>
<p><c>tokens(<anno>Continuation</anno>, <anno>CharSpec</anno>,
<anno>StartLocation</anno>)</c> is equivalent to
<c>tokens(<anno>Continuation</anno>, <anno>CharSpec</anno>,
<anno>StartLocation</anno>, [])</c>.</p>
- <p>See <seealso marker="#string/3">string/3</seealso> for a
- description of the various options.</p>
- </desc>
- </func>
- <func>
- <name name="reserved_word" arity="1"/>
- <fsummary>Test for a reserved word</fsummary>
- <desc>
- <p>Returns <c>true</c> if <c><anno>Atom</anno></c> is an Erlang
- reserved word, otherwise <c>false</c>.</p>
- </desc>
- </func>
- <func>
- <name name="token_info" arity="1"/>
- <fsummary>Return information about a token</fsummary>
- <desc>
- <p>Returns a list containing information about the token
- <c><anno>Token</anno></c>. The order of the
- <c><anno>TokenInfoTuple</anno></c>s is not
- defined. See <seealso
- marker="#token_info/2">token_info/2</seealso> for
- information about specific
- <c><anno>TokenInfoTuple</anno></c>s.</p>
- <p>Note that if <c>token_info(Token, TokenItem)</c> returns
- <c>undefined</c> for some <c>TokenItem</c>, the
- item is not included in <c><anno>TokenInfo</anno></c>.</p>
- </desc>
- </func>
- <func>
- <name name="token_info" arity="2" clause_i="1"/>
- <name name="token_info" arity="2" clause_i="2"/>
- <type name="token_item"/>
- <type name="attribute_item"/>
- <fsummary>Return information about a token</fsummary>
- <desc>
- <p>Returns a list containing information about the token
- <c><anno>Token</anno></c>. If one single
- <c><anno>TokenItem</anno></c> is given the returned value is
- the corresponding
- <c>TokenInfoTuple</c>, or <c>undefined</c> if the
- <c>TokenItem</c> has no value. If a list of
- <c><anno>TokenItem</anno></c>s is given the result is a list of
- <c><anno>TokenInfoTuple</anno></c>. The
- <c><anno>TokenInfoTuple</anno></c>s will
- appear with the corresponding <c><anno>TokenItem</anno></c>s in
- the same order as the <c><anno>TokenItem</anno></c>s
- appear in the list of <c>TokenItem</c>s.
- <c><anno>TokenItem</anno></c>s with no value are not included
- in the list of <c><anno>TokenInfoTuple</anno></c>.</p>
- <p>The following <c><anno>TokenInfoTuple</anno></c>s with corresponding
- <c><anno>TokenItem</anno></c>s are valid:</p>
- <taglist>
- <tag><c>{category, <seealso marker="#type-category">
- category()</seealso>}</c></tag>
- <item><p>The category of the token.</p>
- </item>
- <tag><c>{column, <seealso marker="#type-column">
- column()</seealso>}</c></tag>
- <item><p>The column where the token begins.</p>
- </item>
- <tag><c>{length, integer() > 0}</c></tag>
- <item><p>The length of the token's text.</p>
- </item>
- <tag><c>{line, <seealso marker="#type-line">
- line()</seealso>}</c></tag>
- <item><p>The line where the token begins.</p>
- </item>
- <tag><c>{location, <seealso marker="#type-location">
- location()</seealso>}</c></tag>
- <item><p>The line and column where the token begins, or
- just the line if the column unknown.</p>
- </item>
- <tag><c>{symbol, <seealso marker="#type-symbol">
- symbol()</seealso>}</c></tag>
- <item><p>The token's symbol.</p>
- </item>
- <tag><c>{text, string()}</c></tag>
- <item><p>The token's text.</p>
- </item>
- </taglist>
- </desc>
- </func>
- <func>
- <name name="attributes_info" arity="1"/>
- <fsummary>Return information about token attributes</fsummary>
- <desc>
- <p>Returns a list containing information about the token
- attributes <c><anno>Attributes</anno></c>. The order of the
- <c><anno>AttributeInfoTuple</anno></c>s is not defined.
- See <seealso
- marker="#attributes_info/2">attributes_info/2</seealso> for
- information about specific
- <c><anno>AttributeInfoTuple</anno></c>s.</p>
- <p>Note that if <c>attributes_info(Token, AttributeItem)</c>
- returns <c>undefined</c> for some <c>AttributeItem</c> in
- the list above, the item is not included in
- <c><anno>AttributesInfo</anno></c>.</p>
- </desc>
- </func>
- <func>
- <name name="attributes_info" arity="2" clause_i="1"/>
- <name name="attributes_info" arity="2" clause_i="2"/>
- <fsummary>Return information about a token attributes</fsummary>
- <type name="attribute_item"/>
- <desc>
- <p>Returns a list containing information about the token
- attributes <c><anno>Attributes</anno></c>. If one single
- <c><anno>AttributeItem</anno></c> is given the returned value is the
- corresponding <c><anno>AttributeInfoTuple</anno></c>,
- or <c>undefined</c> if the <c><anno>AttributeItem</anno></c>
- has no value. If a list of <c><anno>AttributeItem</anno></c>
- is given the result is a list of
- <c><anno>AttributeInfoTuple</anno></c>.
- The <c><anno>AttributeInfoTuple</anno></c>s
- will appear with the corresponding <c><anno>AttributeItem</anno></c>s
- in the same order as the <c><anno>AttributeItem</anno></c>s
- appear in the list of <c><anno>AttributeItem</anno></c>s.
- <c><anno>AttributeItem</anno></c>s with no
- value are not included in the list of
- <c><anno>AttributeInfoTuple</anno></c>.</p>
- <p>The following <c><anno>AttributeInfoTuple</anno></c>s with
- corresponding <c><anno>AttributeItem</anno></c>s are valid:</p>
- <taglist>
- <tag><c>{column, <seealso marker="#type-column">
- column()</seealso>}</c></tag>
- <item><p>The column where the token begins.</p>
- </item>
- <tag><c>{length, integer() > 0}</c></tag>
- <item><p>The length of the token's text.</p>
- </item>
- <tag><c>{line, <seealso marker="#type-line">
- line()</seealso>}</c></tag>
- <item><p>The line where the token begins.</p>
- </item>
- <tag><c>{location, <seealso marker="#type-location">
- location()</seealso>}</c></tag>
- <item><p>The line and column where the token begins, or
- just the line if the column unknown.</p>
- </item>
- <tag><c>{text, string()}</c></tag>
- <item><p>The token's text.</p>
- </item>
- </taglist>
- </desc>
- </func>
- <func>
- <name name="set_attribute" arity="3"/>
- <fsummary>Set a token attribute value</fsummary>
- <desc>
- <p>Sets the value of the <c>line</c> attribute of the token
- attributes <c><anno>Attributes</anno></c>.</p>
- <p>The <c><anno>SetAttributeFun</anno></c> is called with the value of
- the <c>line</c> attribute, and is to return the new value of
- the <c>line</c> attribute.</p>
- </desc>
- </func>
- <func>
- <name name="format_error" arity="1"/>
- <fsummary>Format an error descriptor</fsummary>
- <desc>
- <p>Takes an <c><anno>ErrorDescriptor</anno></c> and returns
- a string which
- describes the error or warning. This function is usually
- called implicitly when processing an <c>ErrorInfo</c>
- structure (see below).</p>
+ <p>For a description of the options, see
+ <seealso marker="#string/3"><c>string/3</c></seealso>.</p>
</desc>
</func>
</funcs>
<section>
+ <marker id="errorinfo"/>
<title>Error Information</title>
- <p>The <c>ErrorInfo</c> mentioned above is the standard
- <c>ErrorInfo</c> structure which is returned from all IO
- modules. It has the following format:</p>
+ <p><c>ErrorInfo</c> is the standard <c>ErrorInfo</c> structure that is
+ returned from all I/O modules. The format is as follows:</p>
<code type="none">
{ErrorLocation, Module, ErrorDescriptor}</code>
- <p>A string which describes the error is obtained with the
- following call:</p>
+ <p>A string describing the error is obtained with the following call:</p>
<code type="none">
Module:format_error(ErrorDescriptor)</code>
</section>
@@ -409,14 +316,15 @@ Module:format_error(ErrorDescriptor)</code>
<section>
<title>Notes</title>
<p>The continuation of the first call to the re-entrant input
- functions must be <c>[]</c>. Refer to Armstrong, Virding and
- Williams, 'Concurrent Programming in Erlang', Chapter 13, for a
- complete description of how the re-entrant input scheme works.</p>
+ functions must be <c>[]</c>. For a complete description of how the
+ re-entrant input scheme works, see Armstrong, Virding and
+ Williams: 'Concurrent Programming in Erlang', Chapter 13.</p>
</section>
<section>
<title>See Also</title>
- <p><seealso marker="io">io(3)</seealso>,
- <seealso marker="erl_parse">erl_parse(3)</seealso></p>
+ <p><seealso marker="erl_anno"><c>erl_anno(3)</c></seealso>,
+ <seealso marker="erl_parse"><c>erl_parse(3)</c></seealso>,
+ <seealso marker="io"><c>io(3)</c></seealso></p>
</section>
</erlref>