Diffstat (limited to 'lib/stdlib/doc/src/erl_scan.xml')
-rw-r--r-- | lib/stdlib/doc/src/erl_scan.xml | 337 |
1 files changed, 174 insertions, 163 deletions
diff --git a/lib/stdlib/doc/src/erl_scan.xml b/lib/stdlib/doc/src/erl_scan.xml index ee0d6b6033..137ccd3416 100644 --- a/lib/stdlib/doc/src/erl_scan.xml +++ b/lib/stdlib/doc/src/erl_scan.xml @@ -4,7 +4,7 @@ <erlref> <header> <copyright> - <year>1996</year><year>2015</year> + <year>1996</year><year>2016</year> <holder>Ericsson AB. All Rights Reserved.</holder> </copyright> <legalnotice> @@ -28,16 +28,17 @@ <docno>1</docno> <approved>Bjarne Däcker</approved> <checked></checked> - <date>97-01-24</date> + <date>1997-01-24</date> <rev>B</rev> - <file>erl_scan.sgml</file> + <file>erl_scan.xml</file> </header> <module>erl_scan</module> - <modulesummary>The Erlang Token Scanner</modulesummary> + <modulesummary>The Erlang token scanner.</modulesummary> <description> - <p>This module contains functions for tokenizing characters into + <p>This module contains functions for tokenizing (scanning) characters into Erlang tokens.</p> </description> + <datatypes> <datatype> <name name="category"></name> @@ -70,23 +71,96 @@ <name name="tokens_result"></name> </datatype> </datatypes> + <funcs> <func> + <name name="category" arity="1"/> + <fsummary>Return the category.</fsummary> + <desc> + <p>Returns the category of <c><anno>Token</anno></c>.</p> + </desc> + </func> + + <func> + <name name="column" arity="1"/> + <fsummary>Return the column.</fsummary> + <desc> + <p>Returns the column of <c><anno>Token</anno></c>'s + collection of annotations.</p> + </desc> + </func> + + <func> + <name name="end_location" arity="1"/> + <fsummary>Return the end location of the text.</fsummary> + <desc> + <p>Returns the end location of the text of + <c><anno>Token</anno></c>'s collection of annotations. If + there is no text, <c>undefined</c> is returned.</p> + </desc> + </func> + + <func> + <name name="format_error" arity="1"/> + <fsummary>Format an error descriptor.</fsummary> + <desc> + <p>Uses an <c><anno>ErrorDescriptor</anno></c> and returns a string + that describes the error or warning. 
This function is usually + called implicitly when an <c>ErrorInfo</c> structure is + processed (see section + <seealso marker="#errorinfo">Error Information</seealso>).</p> + </desc> + </func> + + <func> + <name name="line" arity="1"/> + <fsummary>Return the line.</fsummary> + <desc> + <p>Returns the line of <c><anno>Token</anno></c>'s collection + of annotations.</p> + </desc> + </func> + + <func> + <name name="location" arity="1"/> + <fsummary>Return the location.</fsummary> + <desc> + <p>Returns the location of <c><anno>Token</anno></c>'s + collection of annotations.</p> + </desc> + </func> + + <func> + <name name="reserved_word" arity="1"/> + <fsummary>Test for a reserved word.</fsummary> + <desc> + <p>Returns <c>true</c> if <c><anno>Atom</anno></c> is an + Erlang reserved word, otherwise <c>false</c>.</p> + </desc> + </func> + + <func> <name name="string" arity="1"/> <name name="string" arity="2"/> <name name="string" arity="3"/> - <fsummary>Scan a string and return the Erlang tokens</fsummary> + <fsummary>Scan a string and return the Erlang tokens.</fsummary> <desc> <p>Takes the list of characters <c><anno>String</anno></c> and tries to - scan (tokenize) them. Returns <c>{ok, <anno>Tokens</anno>, - <anno>EndLocation</anno>}</c>, - where <c><anno>Tokens</anno></c> are the Erlang tokens from - <c><anno>String</anno></c>. <c><anno>EndLocation</anno></c> - is the first location after the last token.</p> - <p><c>{error, <anno>ErrorInfo</anno>, <anno>ErrorLocation</anno>}</c> - is returned if an error occurs. - <c><anno>ErrorLocation</anno></c> is the first location after - the erroneous token.</p> + scan (tokenize) them. Returns one of the following:</p> + <taglist> + <tag><c>{ok, <anno>Tokens</anno>, <anno>EndLocation</anno>}</c></tag> + <item> + <p><c><anno>Tokens</anno></c> are the Erlang tokens from + <c><anno>String</anno></c>. 
<c><anno>EndLocation</anno></c> + is the first location after the last token.</p> + </item> + <tag><c>{error, <anno>ErrorInfo</anno>, + <anno>ErrorLocation</anno>}</c></tag> + <item> + <p>An error occurred. <c><anno>ErrorLocation</anno></c> is the + first location after the erroneous token.</p> + </item> + </taglist> <p><c>string(<anno>String</anno>)</c> is equivalent to <c>string(<anno>String</anno>, 1)</c>, and <c>string(<anno>String</anno>, @@ -95,80 +169,102 @@ <anno>StartLocation</anno>, [])</c>.</p> <p><c><anno>StartLocation</anno></c> indicates the initial location when scanning starts. If <c><anno>StartLocation</anno></c> is a line, - <c>Anno</c> as well as <c><anno>EndLocation</anno></c> and - <c><anno>ErrorLocation</anno></c> will be lines. If - <c><anno>StartLocation</anno></c> is a pair of a line and a column + <c>Anno</c>, <c><anno>EndLocation</anno></c>, and + <c><anno>ErrorLocation</anno></c> are lines. If + <c><anno>StartLocation</anno></c> is a pair of a line and a column, <c>Anno</c> takes the form of an opaque compound data type, and <c><anno>EndLocation</anno></c> and <c><anno>ErrorLocation</anno></c> - will be pairs of a line and a column. The <em>token + are pairs of a line and a column. 
The <em>token annotations</em> contain information about the column and the line where the token begins, as well as the text of the - token (if the <c>text</c> option is given), all of which can + token (if option <c>text</c> is specified), all of which can be accessed by calling - <seealso marker="#column/1">column/1</seealso>, - <seealso marker="#line/1">line/1</seealso>, - <seealso marker="#location/1">location/1</seealso>, and - <seealso marker="#text/1">text/1</seealso>.</p> + <seealso marker="#column/1"><c>column/1</c></seealso>, + <seealso marker="#line/1"><c>line/1</c></seealso>, + <seealso marker="#location/1"><c>location/1</c></seealso>, and + <seealso marker="#text/1"><c>text/1</c></seealso>.</p> <p>A <em>token</em> is a tuple containing information about - syntactic category, the token annotations, and the actual - terminal symbol. For punctuation characters (e.g. <c>;</c>, + syntactic category, the token annotations, and the + terminal symbol. For punctuation characters (such as <c>;</c> and <c>|</c>) and reserved words, the category and the symbol coincide, and the token is represented by a two-tuple. 
- Three-tuples have one of the following forms: <c>{atom, - Info, atom()}</c>, - <c>{char, Info, integer()}</c>, <c>{comment, Info, - string()}</c>, <c>{float, Info, float()}</c>, <c>{integer, - Info, integer()}</c>, <c>{var, Info, atom()}</c>, - and <c>{white_space, Info, string()}</c>.</p> - <p>The valid options are:</p> + Three-tuples have one of the following forms:</p> + <list type="bulleted"> + <item><c>{atom, Anno, atom()}</c></item> + <item><c>{char, Anno, char()}</c></item> + <item><c>{comment, Anno, string()}</c></item> + <item><c>{float, Anno, float()}</c></item> + <item><c>{integer, Anno, integer()}</c></item> + <item><c>{var, Anno, atom()}</c></item> + <item><c>{white_space, Anno, string()}</c></item> + </list> + <p>Valid options:</p> <taglist> - <tag><c>{reserved_word_fun, reserved_word_fun()}</c></tag> - <item><p>A callback function that is called when the scanner - has found an unquoted atom. If the function returns - <c>true</c>, the unquoted atom itself will be the category - of the token; if the function returns <c>false</c>, - <c>atom</c> will be the category of the unquoted atom.</p> - </item> - <tag><c>return_comments</c></tag> - <item><p>Return comment tokens.</p> - </item> - <tag><c>return_white_spaces</c></tag> - <item><p>Return white space tokens. By convention, if there is - a newline character, it is always the first character of the - text (there cannot be more than one newline in a white space - token).</p> - </item> - <tag><c>return</c></tag> - <item><p>Short for <c>[return_comments, return_white_spaces]</c>.</p> - </item> - <tag><c>text</c></tag> - <item><p>Include the token's text in the token annotation. The - text is the part of the input corresponding to the token.</p> - </item> + <tag><c>{reserved_word_fun, reserved_word_fun()}</c></tag> + <item><p>A callback function that is called when the scanner + has found an unquoted atom. If the function returns + <c>true</c>, the unquoted atom itself becomes the category + of the token. 
If the function returns <c>false</c>, + <c>atom</c> becomes the category of the unquoted atom.</p> + </item> + <tag><c>return_comments</c></tag> + <item><p>Return comment tokens.</p> + </item> + <tag><c>return_white_spaces</c></tag> + <item><p>Return white space tokens. By convention, a newline + character, if present, is always the first character of the + text (there cannot be more than one newline in a white space + token).</p> + </item> + <tag><c>return</c></tag> + <item><p>Short for <c>[return_comments, return_white_spaces]</c>.</p> + </item> + <tag><c>text</c></tag> + <item><p>Include the token text in the token annotation. The + text is the part of the input corresponding to the token.</p> + </item> </taglist> </desc> </func> + + <func> + <name name="symbol" arity="1"/> + <fsummary>Return the symbol.</fsummary> + <desc> + <p>Returns the symbol of <c><anno>Token</anno></c>.</p> + </desc> + </func> + + <func> + <name name="text" arity="1"/> + <fsummary>Return the text.</fsummary> + <desc> + <p>Returns the text of <c><anno>Token</anno></c>'s collection + of annotations. If there is no text, <c>undefined</c> is + returned.</p> + </desc> + </func> + <func> <name name="tokens" arity="3"/> <name name="tokens" arity="4"/> - <fsummary>Re-entrant scanner</fsummary> + <fsummary>Re-entrant scanner.</fsummary> <type name="char_spec"/> <type name="return_cont"/> - <type_desc name="return_cont">An opaque continuation</type_desc> + <type_desc name="return_cont">An opaque continuation.</type_desc> <desc> - <p>This is the re-entrant scanner which scans characters until - a <em>dot</em> ('.' followed by a white space) or - <c>eof</c> has been reached. It returns:</p> + <p>This is the re-entrant scanner, which scans characters until + either a <em>dot</em> ('.' followed by a white space) or + <c>eof</c> is reached. 
It returns:</p> <taglist> <tag><c>{done, <anno>Result</anno>, <anno>LeftOverChars</anno>}</c> </tag> <item> - <p>This return indicates that there is sufficient input + <p>Indicates that there is sufficient input data to get a result. <c><anno>Result</anno></c> is:</p> <taglist> - <tag><c>{ok, Tokens, EndLocation}</c> - </tag> + <tag><c>{ok, Tokens, EndLocation}</c></tag> <item> <p>The scanning was successful. <c>Tokens</c> is the list of tokens including <em>dot</em>.</p> @@ -177,8 +273,7 @@ <item> <p>End of file was encountered before any more tokens.</p> </item> - <tag><c>{error, ErrorInfo, EndLocation}</c> - </tag> + <tag><c>{error, ErrorInfo, EndLocation}</c></tag> <item> <p>An error occurred. <c><anno>LeftOverChars</anno></c> is the remaining characters of the input data, @@ -194,110 +289,26 @@ </item> </taglist> <p>The <c><anno>CharSpec</anno></c> <c>eof</c> signals end of file. - <c><anno>LeftOverChars</anno></c> will then take the value <c>eof</c> + <c><anno>LeftOverChars</anno></c> then takes the value <c>eof</c> as well.</p> <p><c>tokens(<anno>Continuation</anno>, <anno>CharSpec</anno>, <anno>StartLocation</anno>)</c> is equivalent to <c>tokens(<anno>Continuation</anno>, <anno>CharSpec</anno>, <anno>StartLocation</anno>, [])</c>.</p> - <p>See <seealso marker="#string/3">string/3</seealso> for a - description of the various options.</p> - </desc> - </func> - <func> - <name name="reserved_word" arity="1"/> - <fsummary>Test for a reserved word</fsummary> - <desc> - <p>Returns <c>true</c> if <c><anno>Atom</anno></c> is an Erlang - reserved word, otherwise <c>false</c>.</p> - </desc> - </func> - <func> - <name name="category" arity="1"/> - <fsummary>Return the category</fsummary> - <desc> - <p>Returns the category of <c><anno>Token</anno></c>. - </p> - </desc> - </func> - <func> - <name name="symbol" arity="1"/> - <fsummary>Return the symbol</fsummary> - <desc> - <p>Returns the symbol of <c><anno>Token</anno></c>. 
- </p> - </desc> - </func> - <func> - <name name="column" arity="1"/> - <fsummary>Return the column</fsummary> - <desc> - <p>Returns the column of <c><anno>Token</anno></c>'s - collection of annotations. - </p> - </desc> - </func> - <func> - <name name="end_location" arity="1"/> - <fsummary>Return the end location of the text</fsummary> - <desc> - <p>Returns the end location of the text of - <c><anno>Token</anno></c>'s collection of annotations. If - there is no text, - <c>undefined</c> is returned. - </p> - </desc> - </func> - <func> - <name name="line" arity="1"/> - <fsummary>Return the line</fsummary> - <desc> - <p>Returns the line of <c><anno>Token</anno></c>'s collection - of annotations. - </p> - </desc> - </func> - <func> - <name name="location" arity="1"/> - <fsummary>Return the location</fsummary> - <desc> - <p>Returns the location of <c><anno>Token</anno></c>'s - collection of annotations. - </p> - </desc> - </func> - <func> - <name name="text" arity="1"/> - <fsummary>Return the text</fsummary> - <desc> - <p>Returns the text of <c><anno>Token</anno></c>'s collection - of annotations. If there is no text, <c>undefined</c> is - returned. - </p> - </desc> - </func> - <func> - <name name="format_error" arity="1"/> - <fsummary>Format an error descriptor</fsummary> - <desc> - <p>Takes an <c><anno>ErrorDescriptor</anno></c> and returns - a string which - describes the error or warning. This function is usually - called implicitly when processing an <c>ErrorInfo</c> - structure (see below).</p> + <p>For a description of the options, see + <seealso marker="#string/3"><c>string/3</c></seealso>.</p> </desc> </func> </funcs> <section> + <marker id="errorinfo"/> <title>Error Information</title> - <p>The <c>ErrorInfo</c> mentioned above is the standard - <c>ErrorInfo</c> structure which is returned from all IO - modules. It has the following format:</p> + <p><c>ErrorInfo</c> is the standard <c>ErrorInfo</c> structure that is + returned from all I/O modules. 
The format is as follows:</p> <code type="none"> {ErrorLocation, Module, ErrorDescriptor}</code> - <p>A string which describes the error is obtained with the - following call:</p> + <p>A string describing the error is obtained with the following call:</p> <code type="none"> Module:format_error(ErrorDescriptor)</code> </section> @@ -305,15 +316,15 @@ Module:format_error(ErrorDescriptor)</code> <section> <title>Notes</title> <p>The continuation of the first call to the re-entrant input - functions must be <c>[]</c>. Refer to Armstrong, Virding and - Williams, 'Concurrent Programming in Erlang', Chapter 13, for a - complete description of how the re-entrant input scheme works.</p> + functions must be <c>[]</c>. For a complete description of how the + re-entrant input scheme works, see Armstrong, Virding and + Williams: 'Concurrent Programming in Erlang', Chapter 13.</p> </section> <section> <title>See Also</title> - <p><seealso marker="io">io(3)</seealso>, - <seealso marker="erl_anno">erl_anno(3)</seealso>, - <seealso marker="erl_parse">erl_parse(3)</seealso></p> + <p><seealso marker="erl_anno"><c>erl_anno(3)</c></seealso>, + <seealso marker="erl_parse"><c>erl_parse(3)</c></seealso>, + <seealso marker="io"><c>io(3)</c></seealso></p> </section> </erlref> |