author | Sverker Eriksson | 2017-08-30 20:55:08 +0200 |
---|---|---|
committer | Sverker Eriksson | 2017-08-30 20:55:08 +0200 |
commit | 7c67bbddb53c364086f66260701bc54a61c9659c (patch) | |
tree | 92ab0d4b91d5e2f6e7a3f9d61ea25089e8a71fe0 /lib/stdlib/doc/src/erl_scan.xml | |
parent | 97dc5e7f396129222419811c173edc7fa767b0f8 (diff) | |
parent | 3b7a6ffddc819bf305353a593904cea9e932e7dc (diff) | |
Merge tag 'OTP-19.0' into sverker/19/binary_to_atom-utf8-crash/ERL-474/OTP-14590
Diffstat (limited to 'lib/stdlib/doc/src/erl_scan.xml')
-rw-r--r-- | lib/stdlib/doc/src/erl_scan.xml | 470 |
1 files changed, 189 insertions, 281 deletions
diff --git a/lib/stdlib/doc/src/erl_scan.xml b/lib/stdlib/doc/src/erl_scan.xml index 54240dea19..137ccd3416 100644 --- a/lib/stdlib/doc/src/erl_scan.xml +++ b/lib/stdlib/doc/src/erl_scan.xml @@ -1,23 +1,24 @@ -<?xml version="1.0" encoding="latin1" ?> +<?xml version="1.0" encoding="utf-8" ?> <!DOCTYPE erlref SYSTEM "erlref.dtd"> <erlref> <header> <copyright> - <year>1996</year><year>2011</year> + <year>1996</year><year>2016</year> <holder>Ericsson AB. All Rights Reserved.</holder> </copyright> <legalnotice> - The contents of this file are subject to the Erlang Public License, - Version 1.1, (the "License"); you may not use this file except in - compliance with the License. You should have received a copy of the - Erlang Public License along with this software. If not, it can be - retrieved online at http://www.erlang.org/. + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 - Software distributed under the License is distributed on an "AS IS" - basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See - the License for the specific language governing rights and limitations - under the License. + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
</legalnotice> @@ -27,51 +28,28 @@ <docno>1</docno> <approved>Bjarne Däcker</approved> <checked></checked> - <date>97-01-24</date> + <date>1997-01-24</date> <rev>B</rev> - <file>erl_scan.sgml</file> + <file>erl_scan.xml</file> </header> <module>erl_scan</module> - <modulesummary>The Erlang Token Scanner</modulesummary> + <modulesummary>The Erlang token scanner.</modulesummary> <description> - <p>This module contains functions for tokenizing characters into + <p>This module contains functions for tokenizing (scanning) characters into Erlang tokens.</p> </description> + <datatypes> <datatype> - <name name="attribute_info"></name> - </datatype> - <datatype> - <name name="attributes"></name> - </datatype> - <datatype> - <name name="attributes_data"></name> - </datatype> - <datatype> <name name="category"></name> </datatype> <datatype> - <name name="column"></name> - </datatype> - <datatype> <name name="error_description"></name> </datatype> <datatype> <name name="error_info"></name> </datatype> <datatype> - <name name="info_line"></name> - </datatype> - <datatype> - <name name="info_location"></name> - </datatype> - <datatype> - <name name="line"></name> - </datatype> - <datatype> - <name name="location"></name> - </datatype> - <datatype> <name name="option"></name> </datatype> <datatype> @@ -87,32 +65,102 @@ <name name="token"></name> </datatype> <datatype> - <name name="token_info"></name> - </datatype> - <datatype> <name name="tokens"></name> </datatype> <datatype> <name name="tokens_result"></name> </datatype> </datatypes> + <funcs> <func> + <name name="category" arity="1"/> + <fsummary>Return the category.</fsummary> + <desc> + <p>Returns the category of <c><anno>Token</anno></c>.</p> + </desc> + </func> + + <func> + <name name="column" arity="1"/> + <fsummary>Return the column.</fsummary> + <desc> + <p>Returns the column of <c><anno>Token</anno></c>'s + collection of annotations.</p> + </desc> + </func> + + <func> + <name name="end_location" arity="1"/> + 
<fsummary>Return the end location of the text.</fsummary> + <desc> + <p>Returns the end location of the text of + <c><anno>Token</anno></c>'s collection of annotations. If + there is no text, <c>undefined</c> is returned.</p> + </desc> + </func> + + <func> + <name name="format_error" arity="1"/> + <fsummary>Format an error descriptor.</fsummary> + <desc> + <p>Uses an <c><anno>ErrorDescriptor</anno></c> and returns a string + that describes the error or warning. This function is usually + called implicitly when an <c>ErrorInfo</c> structure is + processed (see section + <seealso marker="#errorinfo">Error Information</seealso>).</p> + </desc> + </func> + + <func> + <name name="line" arity="1"/> + <fsummary>Return the line.</fsummary> + <desc> + <p>Returns the line of <c><anno>Token</anno></c>'s collection + of annotations.</p> + </desc> + </func> + + <func> + <name name="location" arity="1"/> + <fsummary>Return the location.</fsummary> + <desc> + <p>Returns the location of <c><anno>Token</anno></c>'s + collection of annotations.</p> + </desc> + </func> + + <func> + <name name="reserved_word" arity="1"/> + <fsummary>Test for a reserved word.</fsummary> + <desc> + <p>Returns <c>true</c> if <c><anno>Atom</anno></c> is an + Erlang reserved word, otherwise <c>false</c>.</p> + </desc> + </func> + + <func> <name name="string" arity="1"/> <name name="string" arity="2"/> <name name="string" arity="3"/> - <fsummary>Scan a string and return the Erlang tokens</fsummary> + <fsummary>Scan a string and return the Erlang tokens.</fsummary> <desc> <p>Takes the list of characters <c><anno>String</anno></c> and tries to - scan (tokenize) them. Returns <c>{ok, <anno>Tokens</anno>, - <anno>EndLocation</anno>}</c>, - where <c><anno>Tokens</anno></c> are the Erlang tokens from - <c><anno>String</anno></c>. 
<c><anno>EndLocation</anno></c> - is the first location after the last token.</p> - <p><c>{error, <anno>ErrorInfo</anno>, <anno>ErrorLocation</anno>}</c> - is returned if an error occurs. - <c><anno>ErrorLocation</anno></c> is the first location after - the erroneous token.</p> + scan (tokenize) them. Returns one of the following:</p> + <taglist> + <tag><c>{ok, <anno>Tokens</anno>, <anno>EndLocation</anno>}</c></tag> + <item> + <p><c><anno>Tokens</anno></c> are the Erlang tokens from + <c><anno>String</anno></c>. <c><anno>EndLocation</anno></c> + is the first location after the last token.</p> + </item> + <tag><c>{error, <anno>ErrorInfo</anno>, + <anno>ErrorLocation</anno>}</c></tag> + <item> + <p>An error occurred. <c><anno>ErrorLocation</anno></c> is the + first location after the erroneous token.</p> + </item> + </taglist> <p><c>string(<anno>String</anno>)</c> is equivalent to <c>string(<anno>String</anno>, 1)</c>, and <c>string(<anno>String</anno>, @@ -120,79 +168,103 @@ <c>string(<anno>String</anno>, <anno>StartLocation</anno>, [])</c>.</p> <p><c><anno>StartLocation</anno></c> indicates the initial location - when scanning starts. If <c><anno>StartLocation</anno></c> is a line - <c>attributes()</c> as well as <c><anno>EndLocation</anno></c> and - <c><anno>ErrorLocation</anno></c> will be lines. If - <c><anno>StartLocation</anno></c> is a pair of a line and a column - <c>attributes()</c> takes the form of an opaque compound + when scanning starts. If <c><anno>StartLocation</anno></c> is a line, + <c>Anno</c>, <c><anno>EndLocation</anno></c>, and + <c><anno>ErrorLocation</anno></c> are lines. If + <c><anno>StartLocation</anno></c> is a pair of a line and a column, + <c>Anno</c> takes the form of an opaque compound data type, and <c><anno>EndLocation</anno></c> and <c><anno>ErrorLocation</anno></c> - will be pairs of a line and a column. The <em>token - attributes</em> contain information about the column and the + are pairs of a line and a column. 
The <em>token + annotations</em> contain information about the column and the line where the token begins, as well as the text of the - token (if the <c>text</c> option is given), all of which can - be accessed by calling <seealso - marker="#token_info/1">token_info/1,2</seealso> or <seealso - marker="#attributes_info/1">attributes_info/1,2</seealso>.</p> + token (if option <c>text</c> is specified), all of which can + be accessed by calling + <seealso marker="#column/1"><c>column/1</c></seealso>, + <seealso marker="#line/1"><c>line/1</c></seealso>, + <seealso marker="#location/1"><c>location/1</c></seealso>, and + <seealso marker="#text/1"><c>text/1</c></seealso>.</p> <p>A <em>token</em> is a tuple containing information about - syntactic category, the token attributes, and the actual - terminal symbol. For punctuation characters (e.g. <c>;</c>, + syntactic category, the token annotations, and the + terminal symbol. For punctuation characters (such as <c>;</c> and <c>|</c>) and reserved words, the category and the symbol coincide, and the token is represented by a two-tuple. 
- Three-tuples have one of the following forms: <c>{atom, - Info, atom()}</c>, - <c>{char, Info, integer()}</c>, <c>{comment, Info, - string()}</c>, <c>{float, Info, float()}</c>, <c>{integer, - Info, integer()}</c>, <c>{var, Info, atom()}</c>, - and <c>{white_space, Info, string()}</c>.</p> - <p>The valid options are:</p> + Three-tuples have one of the following forms:</p> + <list type="bulleted"> + <item><c>{atom, Anno, atom()}</c></item> + <item><c>{char, Anno, char()}</c></item> + <item><c>{comment, Anno, string()}</c></item> + <item><c>{float, Anno, float()}</c></item> + <item><c>{integer, Anno, integer()}</c></item> + <item><c>{var, Anno, atom()}</c></item> + <item><c>{white_space, Anno, string()}</c></item> + </list> + <p>Valid options:</p> <taglist> - <tag><c>{reserved_word_fun, reserved_word_fun()}</c></tag> - <item><p>A callback function that is called when the scanner - has found an unquoted atom. If the function returns - <c>true</c>, the unquoted atom itself will be the category - of the token; if the function returns <c>false</c>, - <c>atom</c> will be the category of the unquoted atom.</p> - </item> - <tag><c>return_comments</c></tag> - <item><p>Return comment tokens.</p> - </item> - <tag><c>return_white_spaces</c></tag> - <item><p>Return white space tokens. By convention, if there is - a newline character, it is always the first character of the - text (there cannot be more than one newline in a white space - token).</p> - </item> - <tag><c>return</c></tag> - <item><p>Short for <c>[return_comments, return_white_spaces]</c>.</p> - </item> - <tag><c>text</c></tag> - <item><p>Include the token's text in the token attributes. The - text is the part of the input corresponding to the token.</p> - </item> + <tag><c>{reserved_word_fun, reserved_word_fun()}</c></tag> + <item><p>A callback function that is called when the scanner + has found an unquoted atom. If the function returns + <c>true</c>, the unquoted atom itself becomes the category + of the token. 
If the function returns <c>false</c>, + <c>atom</c> becomes the category of the unquoted atom.</p> + </item> + <tag><c>return_comments</c></tag> + <item><p>Return comment tokens.</p> + </item> + <tag><c>return_white_spaces</c></tag> + <item><p>Return white space tokens. By convention, a newline + character, if present, is always the first character of the + text (there cannot be more than one newline in a white space + token).</p> + </item> + <tag><c>return</c></tag> + <item><p>Short for <c>[return_comments, return_white_spaces]</c>.</p> + </item> + <tag><c>text</c></tag> + <item><p>Include the token text in the token annotation. The + text is the part of the input corresponding to the token.</p> + </item> </taglist> </desc> </func> + + <func> + <name name="symbol" arity="1"/> + <fsummary>Return the symbol.</fsummary> + <desc> + <p>Returns the symbol of <c><anno>Token</anno></c>.</p> + </desc> + </func> + + <func> + <name name="text" arity="1"/> + <fsummary>Return the text.</fsummary> + <desc> + <p>Returns the text of <c><anno>Token</anno></c>'s collection + of annotations. If there is no text, <c>undefined</c> is + returned.</p> + </desc> + </func> + <func> <name name="tokens" arity="3"/> <name name="tokens" arity="4"/> + <fsummary>Re-entrant scanner.</fsummary> <type name="char_spec"/> <type name="return_cont"/> - <type_desc name="return_cont">An opaque continuation</type_desc> - <fsummary>Re-entrant scanner</fsummary> + <type_desc name="return_cont">An opaque continuation.</type_desc> <desc> - <p>This is the re-entrant scanner which scans characters until - a <em>dot</em> ('.' followed by a white space) or - <c>eof</c> has been reached. It returns:</p> + <p>This is the re-entrant scanner, which scans characters until + either a <em>dot</em> ('.' followed by a white space) or + <c>eof</c> is reached. 
It returns:</p> <taglist> <tag><c>{done, <anno>Result</anno>, <anno>LeftOverChars</anno>}</c> </tag> <item> - <p>This return indicates that there is sufficient input + <p>Indicates that there is sufficient input data to get a result. <c><anno>Result</anno></c> is:</p> <taglist> - <tag><c>{ok, Tokens, EndLocation}</c> - </tag> + <tag><c>{ok, Tokens, EndLocation}</c></tag> <item> <p>The scanning was successful. <c>Tokens</c> is the list of tokens including <em>dot</em>.</p> @@ -201,8 +273,7 @@ <item> <p>End of file was encountered before any more tokens.</p> </item> - <tag><c>{error, ErrorInfo, EndLocation}</c> - </tag> + <tag><c>{error, ErrorInfo, EndLocation}</c></tag> <item> <p>An error occurred. <c><anno>LeftOverChars</anno></c> is the remaining characters of the input data, @@ -218,190 +289,26 @@ </item> </taglist> <p>The <c><anno>CharSpec</anno></c> <c>eof</c> signals end of file. - <c><anno>LeftOverChars</anno></c> will then take the value <c>eof</c> + <c><anno>LeftOverChars</anno></c> then takes the value <c>eof</c> as well.</p> <p><c>tokens(<anno>Continuation</anno>, <anno>CharSpec</anno>, <anno>StartLocation</anno>)</c> is equivalent to <c>tokens(<anno>Continuation</anno>, <anno>CharSpec</anno>, <anno>StartLocation</anno>, [])</c>.</p> - <p>See <seealso marker="#string/3">string/3</seealso> for a - description of the various options.</p> - </desc> - </func> - <func> - <name name="reserved_word" arity="1"/> - <fsummary>Test for a reserved word</fsummary> - <desc> - <p>Returns <c>true</c> if <c><anno>Atom</anno></c> is an Erlang - reserved word, otherwise <c>false</c>.</p> - </desc> - </func> - <func> - <name name="token_info" arity="1"/> - <fsummary>Return information about a token</fsummary> - <desc> - <p>Returns a list containing information about the token - <c><anno>Token</anno></c>. The order of the - <c><anno>TokenInfoTuple</anno></c>s is not - defined. 
See <seealso - marker="#token_info/2">token_info/2</seealso> for - information about specific - <c><anno>TokenInfoTuple</anno></c>s.</p> - <p>Note that if <c>token_info(Token, TokenItem)</c> returns - <c>undefined</c> for some <c>TokenItem</c>, the - item is not included in <c><anno>TokenInfo</anno></c>.</p> - </desc> - </func> - <func> - <name name="token_info" arity="2" clause_i="1"/> - <name name="token_info" arity="2" clause_i="2"/> - <type name="token_item"/> - <type name="attribute_item"/> - <fsummary>Return information about a token</fsummary> - <desc> - <p>Returns a list containing information about the token - <c><anno>Token</anno></c>. If one single - <c><anno>TokenItem</anno></c> is given the returned value is - the corresponding - <c>TokenInfoTuple</c>, or <c>undefined</c> if the - <c>TokenItem</c> has no value. If a list of - <c><anno>TokenItem</anno></c>s is given the result is a list of - <c><anno>TokenInfoTuple</anno></c>. The - <c><anno>TokenInfoTuple</anno></c>s will - appear with the corresponding <c><anno>TokenItem</anno></c>s in - the same order as the <c><anno>TokenItem</anno></c>s - appear in the list of <c>TokenItem</c>s. 
- <c><anno>TokenItem</anno></c>s with no value are not included - in the list of <c><anno>TokenInfoTuple</anno></c>.</p> - <p>The following <c><anno>TokenInfoTuple</anno></c>s with corresponding - <c><anno>TokenItem</anno></c>s are valid:</p> - <taglist> - <tag><c>{category, <seealso marker="#type-category"> - category()</seealso>}</c></tag> - <item><p>The category of the token.</p> - </item> - <tag><c>{column, <seealso marker="#type-column"> - column()</seealso>}</c></tag> - <item><p>The column where the token begins.</p> - </item> - <tag><c>{length, integer() > 0}</c></tag> - <item><p>The length of the token's text.</p> - </item> - <tag><c>{line, <seealso marker="#type-line"> - line()</seealso>}</c></tag> - <item><p>The line where the token begins.</p> - </item> - <tag><c>{location, <seealso marker="#type-location"> - location()</seealso>}</c></tag> - <item><p>The line and column where the token begins, or - just the line if the column unknown.</p> - </item> - <tag><c>{symbol, <seealso marker="#type-symbol"> - symbol()</seealso>}</c></tag> - <item><p>The token's symbol.</p> - </item> - <tag><c>{text, string()}</c></tag> - <item><p>The token's text.</p> - </item> - </taglist> - </desc> - </func> - <func> - <name name="attributes_info" arity="1"/> - <fsummary>Return information about token attributes</fsummary> - <desc> - <p>Returns a list containing information about the token - attributes <c><anno>Attributes</anno></c>. The order of the - <c><anno>AttributeInfoTuple</anno></c>s is not defined. 
- See <seealso - marker="#attributes_info/2">attributes_info/2</seealso> for - information about specific - <c><anno>AttributeInfoTuple</anno></c>s.</p> - <p>Note that if <c>attributes_info(Token, AttributeItem)</c> - returns <c>undefined</c> for some <c>AttributeItem</c> in - the list above, the item is not included in - <c><anno>AttributesInfo</anno></c>.</p> - </desc> - </func> - <func> - <name name="attributes_info" arity="2" clause_i="1"/> - <name name="attributes_info" arity="2" clause_i="2"/> - <fsummary>Return information about a token attributes</fsummary> - <type name="attribute_item"/> - <desc> - <p>Returns a list containing information about the token - attributes <c><anno>Attributes</anno></c>. If one single - <c><anno>AttributeItem</anno></c> is given the returned value is the - corresponding <c><anno>AttributeInfoTuple</anno></c>, - or <c>undefined</c> if the <c><anno>AttributeItem</anno></c> - has no value. If a list of <c><anno>AttributeItem</anno></c> - is given the result is a list of - <c><anno>AttributeInfoTuple</anno></c>. - The <c><anno>AttributeInfoTuple</anno></c>s - will appear with the corresponding <c><anno>AttributeItem</anno></c>s - in the same order as the <c><anno>AttributeItem</anno></c>s - appear in the list of <c><anno>AttributeItem</anno></c>s. 
- <c><anno>AttributeItem</anno></c>s with no - value are not included in the list of - <c><anno>AttributeInfoTuple</anno></c>.</p> - <p>The following <c><anno>AttributeInfoTuple</anno></c>s with - corresponding <c><anno>AttributeItem</anno></c>s are valid:</p> - <taglist> - <tag><c>{column, <seealso marker="#type-column"> - column()</seealso>}</c></tag> - <item><p>The column where the token begins.</p> - </item> - <tag><c>{length, integer() > 0}</c></tag> - <item><p>The length of the token's text.</p> - </item> - <tag><c>{line, <seealso marker="#type-line"> - line()</seealso>}</c></tag> - <item><p>The line where the token begins.</p> - </item> - <tag><c>{location, <seealso marker="#type-location"> - location()</seealso>}</c></tag> - <item><p>The line and column where the token begins, or - just the line if the column unknown.</p> - </item> - <tag><c>{text, string()}</c></tag> - <item><p>The token's text.</p> - </item> - </taglist> - </desc> - </func> - <func> - <name name="set_attribute" arity="3"/> - <fsummary>Set a token attribute value</fsummary> - <desc> - <p>Sets the value of the <c>line</c> attribute of the token - attributes <c><anno>Attributes</anno></c>.</p> - <p>The <c><anno>SetAttributeFun</anno></c> is called with the value of - the <c>line</c> attribute, and is to return the new value of - the <c>line</c> attribute.</p> - </desc> - </func> - <func> - <name name="format_error" arity="1"/> - <fsummary>Format an error descriptor</fsummary> - <desc> - <p>Takes an <c><anno>ErrorDescriptor</anno></c> and returns - a string which - describes the error or warning. 
This function is usually - called implicitly when processing an <c>ErrorInfo</c> - structure (see below).</p> + <p>For a description of the options, see + <seealso marker="#string/3"><c>string/3</c></seealso>.</p> </desc> </func> </funcs> <section> + <marker id="errorinfo"/> <title>Error Information</title> - <p>The <c>ErrorInfo</c> mentioned above is the standard - <c>ErrorInfo</c> structure which is returned from all IO - modules. It has the following format:</p> + <p><c>ErrorInfo</c> is the standard <c>ErrorInfo</c> structure that is + returned from all I/O modules. The format is as follows:</p> <code type="none"> {ErrorLocation, Module, ErrorDescriptor}</code> - <p>A string which describes the error is obtained with the - following call:</p> + <p>A string describing the error is obtained with the following call:</p> <code type="none"> Module:format_error(ErrorDescriptor)</code> </section> @@ -409,14 +316,15 @@ Module:format_error(ErrorDescriptor)</code> <section> <title>Notes</title> <p>The continuation of the first call to the re-entrant input - functions must be <c>[]</c>. Refer to Armstrong, Virding and - Williams, 'Concurrent Programming in Erlang', Chapter 13, for a - complete description of how the re-entrant input scheme works.</p> + functions must be <c>[]</c>. For a complete description of how the + re-entrant input scheme works, see Armstrong, Virding and + Williams: 'Concurrent Programming in Erlang', Chapter 13.</p> </section> <section> <title>See Also</title> - <p><seealso marker="io">io(3)</seealso>, - <seealso marker="erl_parse">erl_parse(3)</seealso></p> + <p><seealso marker="erl_anno"><c>erl_anno(3)</c></seealso>, + <seealso marker="erl_parse"><c>erl_parse(3)</c></seealso>, + <seealso marker="io"><c>io(3)</c></seealso></p> </section> </erlref> |