aboutsummaryrefslogtreecommitdiffstats
path: root/lib/xmerl
diff options
context:
space:
mode:
authorSverker Eriksson <[email protected]>2017-08-30 21:00:35 +0200
committerSverker Eriksson <[email protected]>2017-08-30 21:00:35 +0200
commit44a83c8860bbd00878c720a7b9d940b4630bab8a (patch)
tree101b3c52ec505a94f56c8f70e078ecb8a2e8c6cd /lib/xmerl
parent7c67bbddb53c364086f66260701bc54a61c9659c (diff)
parent040bdce67f88d833bfb59adae130a4ffb4c180f0 (diff)
downloadotp-44a83c8860bbd00878c720a7b9d940b4630bab8a.tar.gz
otp-44a83c8860bbd00878c720a7b9d940b4630bab8a.tar.bz2
otp-44a83c8860bbd00878c720a7b9d940b4630bab8a.zip
Merge tag 'OTP-20.0' into sverker/20/binary_to_atom-utf8-crash/ERL-474/OTP-14590
Diffstat (limited to 'lib/xmerl')
-rw-r--r--lib/xmerl/doc/src/notes.xml112
-rw-r--r--lib/xmerl/notes.html28
-rw-r--r--lib/xmerl/src/xmerl_eventp.erl84
-rw-r--r--lib/xmerl/src/xmerl_regexp.erl4
-rw-r--r--lib/xmerl/src/xmerl_sax_old_dom.erl9
-rw-r--r--lib/xmerl/src/xmerl_sax_parser.erl198
-rw-r--r--lib/xmerl/src/xmerl_sax_parser.hrl11
-rw-r--r--lib/xmerl/src/xmerl_sax_parser_base.erlsrc411
-rw-r--r--lib/xmerl/src/xmerl_sax_parser_latin1.erlsrc38
-rw-r--r--lib/xmerl/src/xmerl_sax_parser_list.erlsrc21
-rw-r--r--lib/xmerl/src/xmerl_sax_parser_utf16be.erlsrc50
-rw-r--r--lib/xmerl/src/xmerl_sax_parser_utf16le.erlsrc50
-rw-r--r--lib/xmerl/src/xmerl_sax_parser_utf8.erlsrc50
-rw-r--r--lib/xmerl/src/xmerl_sax_simple_dom.erl7
-rw-r--r--lib/xmerl/src/xmerl_scan.erl42
-rw-r--r--lib/xmerl/src/xmerl_xpath.erl30
-rw-r--r--lib/xmerl/src/xmerl_xs.erl7
-rw-r--r--lib/xmerl/src/xmerl_xsd.erl13
-rw-r--r--lib/xmerl/test/Makefile7
-rw-r--r--lib/xmerl/test/xmerl_SUITE.erl56
-rw-r--r--lib/xmerl/test/xmerl_sax_SUITE.erl84
-rw-r--r--lib/xmerl/test/xmerl_sax_SUITE_data/test_data_1.xml4
-rw-r--r--lib/xmerl/test/xmerl_sax_std_SUITE.erl100
-rw-r--r--lib/xmerl/test/xmerl_sax_stream_SUITE.erl260
-rw-r--r--lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_one.xml17
-rw-r--r--lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_one_junk.xml18
-rw-r--r--lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_two.xml34
-rw-r--r--lib/xmerl/vsn.mk2
-rw-r--r--lib/xmerl/xmerl.pub19
29 files changed, 1344 insertions, 422 deletions
diff --git a/lib/xmerl/doc/src/notes.xml b/lib/xmerl/doc/src/notes.xml
index 0abcb87998..1162561225 100644
--- a/lib/xmerl/doc/src/notes.xml
+++ b/lib/xmerl/doc/src/notes.xml
@@ -4,7 +4,7 @@
<chapter>
<header>
<copyright>
- <year>2004</year><year>2016</year>
+ <year>2004</year><year>2017</year>
<holder>Ericsson AB. All Rights Reserved.</holder>
</copyright>
<legalnotice>
@@ -32,6 +32,116 @@
<p>This document describes the changes made to the Xmerl application.</p>
+<section><title>Xmerl 1.3.15</title>
+
+ <section><title>Fixed Bugs and Malfunctions</title>
+ <list>
+ <item>
+ <p>
+ Improves accumulator fun in xmerl_scan so that only one
+ #xmlText record is returned for strings which have
+ character references.</p>
+ <p>
+ (Thanks to Jimmy Zöger)</p>
+ <p>
+ Own Id: OTP-14377 Aux Id: PR-1369 </p>
+ </item>
+ </list>
+ </section>
+
+</section>
+
+<section><title>Xmerl 1.3.14</title>
+
+ <section><title>Fixed Bugs and Malfunctions</title>
+ <list>
+ <item>
+ <p> A couple of bugs are fixed in the sax parser
+ (xmerl_sax_parser). </p> <list> <item>The continuation
+ function was not called correctly when the XML directive
+ was fragmented. </item> <item>When the event callback
+ modules (xmerl_sax_old_dom and xmerl_sax_simple) got an
+ endDocument event at certain conditions the parser
+ crashed.</item> <item>Replaced internal ets table with
+ map to avoid table leakage.</item> </list>
+ <p>
+ Own Id: OTP-14430</p>
+ </item>
+ </list>
+ </section>
+
+</section>
+
+<section><title>Xmerl 1.3.13</title>
+
+ <section><title>Fixed Bugs and Malfunctions</title>
+ <list>
+ <item>
+ <p>
+ The namespace_conformant option in xmerl_scan did not
+ work when parsing documents without explicit XML
+ namespace declaration.</p>
+ <p>
+ Own Id: OTP-14139</p>
+ </item>
+ <item>
+ <p> Fix a "well-formedness" bug in the XML Sax parser so
+ it returns an error if there are something more in the
+ file after the matching document. If one using the
+ xmerl_sax_parser:stream() a rest is allowed which then
+ can be sent to a new call of xmerl_sax_parser:stream() to
+ parse next document. </p> <p> This is done to be
+ compliant with XML conformance tests. </p>
+ <p>
+ Own Id: OTP-14211</p>
+ </item>
+ <item>
+ <p> Fixed compiler and dialyzer warnings in the XML SAX
+ parser. </p>
+ <p>
+ Own Id: OTP-14212</p>
+ </item>
+ <item>
+ <p> Change how to interpret end of document in the XML
+ SAX parser to comply with Tim Brays comment on the
+ standard. This makes it possible to handle more than one
+ doc on a stream, the standard makes it impossible to know
+ when the document is ended without waiting for the next
+ document (and not always even that). </p> <p> Tim Brays
+ comment: </p> <p> Trailing "Misc"<br/> The fact that
+ you're allowed some trailing junk after the root element,
+ I decided (but unfortunately too late) is a real design
+ error in XML. If I'm writing a network client, I'm
+ probably going to close the link as soon as a I see the
+ root element end-tag, and not depend on the other end
+ closing it down properly.<br/> Furthermore, if I want to
+ send a succession of XML documents over a network link,
+ if I find a processing instruction after a root element,
+ is it a trailer on the previous document, or part of the
+ prolog of the next? </p>
+ <p>
+ Own Id: OTP-14213</p>
+ </item>
+ </list>
+ </section>
+
+</section>
+
+<section><title>Xmerl 1.3.12</title>
+
+ <section><title>Fixed Bugs and Malfunctions</title>
+ <list>
+ <item>
+ <p> Fix a number of broken links in the xmerl
+ documentation. </p>
+ <p>
+ Own Id: OTP-13880</p>
+ </item>
+ </list>
+ </section>
+
+</section>
+
<section><title>Xmerl 1.3.11</title>
<section><title>Improvements and New Features</title>
diff --git a/lib/xmerl/notes.html b/lib/xmerl/notes.html
deleted file mode 100644
index 43610a37fa..0000000000
--- a/lib/xmerl/notes.html
+++ /dev/null
@@ -1,28 +0,0 @@
-<HTML>
-<HEAD>
- <TITLE>xmerl Release Notes</TITLE>
- <style type="text/css">
-<!--
- body { background: white; margin: 3em }
-
- body { font-family: Verdana, Arial, Helvetica, sans-serif }
- h1 h2 h3 h4 { font-family: Verdana, Arial, Helvetica, sans-serif }
- h1 { font-size: 48 }
- p li { font-family: Verdana, Arial, Helvetica, sans-serif }
--->
- </style>
-</HEAD>
-
-<BODY BGCOLOR="#FFFFFF">
-
-<CENTER><H1>xmerl Release Notes</H1></CENTER>
-
-<h2>xmerl 1.0</h2>
-
-
-<p>
-There are also release notes for
-<a href="notes_history.html">older versions</a>.
-
-</body>
-</html>
diff --git a/lib/xmerl/src/xmerl_eventp.erl b/lib/xmerl/src/xmerl_eventp.erl
index 2cb76abc6e..8d7ea25e24 100644
--- a/lib/xmerl/src/xmerl_eventp.erl
+++ b/lib/xmerl/src/xmerl_eventp.erl
@@ -25,6 +25,90 @@
%% Each contain more elaborate settings of xmerl_scan that makes usage of
%% the customization functions.
%%
+%% @type xmlElement() = #xmlElement{}.
+%%
+%% @type option_list(). <p>Options allow to customize the behaviour of the
+%% scanner.
+%% See also <a href="xmerl_examples.html">tutorial</a> on customization
+%% functions.
+%% </p>
+%% <p>
+%% Possible options are:
+%% </p>
+%% <dl>
+%% <dt><code>{acc_fun, Fun}</code></dt>
+%% <dd>Call back function to accumulate contents of entity.</dd>
+%% <dt><code>{continuation_fun, Fun} |
+%% {continuation_fun, Fun, ContinuationState}</code></dt>
+%% <dd>Call back function to decide what to do if the scanner runs into EOF
+%% before the document is complete.</dd>
+%% <dt><code>{event_fun, Fun} |
+%% {event_fun, Fun, EventState}</code></dt>
+%% <dd>Call back function to handle scanner events.</dd>
+%% <dt><code>{fetch_fun, Fun} |
+%% {fetch_fun, Fun, FetchState}</code></dt>
+%% <dd>Call back function to fetch an external resource.</dd>
+%% <dt><code>{hook_fun, Fun} |
+%% {hook_fun, Fun, HookState}</code></dt>
+%% <dd>Call back function to process the document entities once
+%% identified.</dd>
+%% <dt><code>{close_fun, Fun}</code></dt>
+%% <dd>Called when document has been completely parsed.</dd>
+%% <dt><code>{rules, ReadFun, WriteFun, RulesState} |
+%% {rules, Rules}</code></dt>
+%% <dd>Handles storing of scanner information when parsing.</dd>
+%% <dt><code>{user_state, UserState}</code></dt>
+%% <dd>Global state variable accessible from all customization functions</dd>
+%%
+%% <dt><code>{fetch_path, PathList}</code></dt>
+%% <dd>PathList is a list of
+%% directories to search when fetching files. If the file in question
+%% is not in the fetch_path, the URI will be used as a file
+%% name.</dd>
+%% <dt><code>{space, Flag}</code></dt>
+%% <dd>'preserve' (default) to preserve spaces, 'normalize' to
+%% accumulate consecutive whitespace and replace it with one space.</dd>
+%% <dt><code>{line, Line}</code></dt>
+%% <dd>To specify starting line for scanning in document which contains
+%% fragments of XML.</dd>
+%% <dt><code>{namespace_conformant, Flag}</code></dt>
+%% <dd>Controls whether to behave as a namespace conformant XML parser,
+%% 'false' (default) to not otherwise 'true'.</dd>
+%% <dt><code>{validation, Flag}</code></dt>
+%% <dd>Controls whether to process as a validating XML parser:
+%% 'off' (default) no validation, or validation 'dtd' by DTD or 'schema'
+%% by XML Schema. 'false' and 'true' options are obsolete
+%% (i.e. they may be removed in a future release), if used 'false'
+%% equals 'off' and 'true' equals 'dtd'.</dd>
+%% <dt><code>{schemaLocation, [{Namespace,Link}|...]}</code></dt>
+%% <dd>Tells explicitly which XML Schema documents to use to validate
+%% the XML document. Used together with the
+%% <code>{validation,schema}</code> option.</dd>
+%% <dt><code>{quiet, Flag}</code></dt>
+%% <dd>Set to 'true' if xmerl should behave quietly and not output any
+%% information to standard output (default 'false').</dd>
+%% <dt><code>{doctype_DTD, DTD}</code></dt>
+%% <dd>Allows to specify DTD name when it isn't available in the XML
+%% document. This option has effect only together with
+%% <code>{validation,'dtd'</code> option.</dd>
+%% <dt><code>{xmlbase, Dir}</code></dt>
+%% <dd>XML Base directory. If using string/1 default is current directory.
+%% If using file/1 default is directory of given file.</dd>
+%% <dt><code>{encoding, Enc}</code></dt>
+%% <dd>Set default character set used (default UTF-8).
+%% This character set is used only if not explicitly given by the XML
+%% declaration. </dd>
+%% <dt><code>{document, Flag}</code></dt>
+%% <dd>Set to 'true' if xmerl should return a complete XML document
+%% as an xmlDocument record (default 'false').</dd>
+%% <dt><code>{comments, Flag}</code></dt>
+%% <dd>Set to 'false' if xmerl should skip comments otherwise they will
+%% be returned as xmlComment records (default 'true').</dd>
+%% <dt><code>{default_attrs, Flag}</code></dt>
+%% <dd>Set to 'true' if xmerl should add to elements missing attributes
+%% with a defined default value (default 'false').</dd>
+%% </dl>
+%%
-module(xmerl_eventp).
-vsn('0.19').
-date('03-09-17').
diff --git a/lib/xmerl/src/xmerl_regexp.erl b/lib/xmerl/src/xmerl_regexp.erl
index fc89b80ff1..1bf8496673 100644
--- a/lib/xmerl/src/xmerl_regexp.erl
+++ b/lib/xmerl/src/xmerl_regexp.erl
@@ -1,7 +1,7 @@
%%
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2006-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2006-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -1154,7 +1154,7 @@ comp_crs([], Last) -> [{Last,maxchar}].
%% build_dfa(NFA, NfaStartState) -> {DFA,DfaStartState}.
%% Build a DFA from an NFA using "subset construction". The major
%% difference from the book is that we keep the marked and unmarked
-%% DFA states in seperate lists. New DFA states are added to the
+%% DFA states in separate lists. New DFA states are added to the
%% unmarked list and states are marked by moving them to the marked
%% list. We assume that the NFA accepting state numbers are in
%% ascending order for the rules and use ordsets to keep this order.
diff --git a/lib/xmerl/src/xmerl_sax_old_dom.erl b/lib/xmerl/src/xmerl_sax_old_dom.erl
index fefcf03fce..6d0d836487 100644
--- a/lib/xmerl/src/xmerl_sax_old_dom.erl
+++ b/lib/xmerl/src/xmerl_sax_old_dom.erl
@@ -2,7 +2,7 @@
%%--------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2009-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2009-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -127,9 +127,10 @@ build_dom(endDocument,
State#xmerl_sax_old_dom_state{dom=[Decl, Current#xmlElement{
content=lists:reverse(C)
}]};
- _ ->
- %%?dbg("~p\n", [D]),
- ?error("we're not at end the document when endDocument event is encountered.")
+ _ ->
+ %% endDocument is also sent by the parser when a fault occur to tell
+ %% the event receiver that no more input will be sent
+ State
end;
%% Element
diff --git a/lib/xmerl/src/xmerl_sax_parser.erl b/lib/xmerl/src/xmerl_sax_parser.erl
index 318a0cf7f4..e383c4c349 100644
--- a/lib/xmerl/src/xmerl_sax_parser.erl
+++ b/lib/xmerl/src/xmerl_sax_parser.erl
@@ -1,7 +1,7 @@
%%--------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2008-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2008-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -33,6 +33,7 @@
%% External exports
%%----------------------------------------------------------------------
-export([file/2,
+ stream/3,
stream/2]).
%%----------------------------------------------------------------------
@@ -63,7 +64,7 @@
%% Description: Parse file containing an XML document.
%%----------------------------------------------------------------------
file(Name,Options) ->
- case file:open(Name, [raw, read,binary]) of
+ case file:open(Name, [raw, read_ahead, read,binary]) of
{error, Reason} ->
{error,{Name, file:format_error(Reason)}};
{ok, FD} ->
@@ -72,11 +73,12 @@ file(Name,Options) ->
File = filename:basename(Name),
ContinuationFun = fun default_continuation_cb/1,
Res = stream(<<>>,
- [{continuation_fun, ContinuationFun},
- {continuation_state, FD},
- {current_location, CL},
- {entity, File}
- |Options]),
+ [{continuation_fun, ContinuationFun},
+ {continuation_state, FD},
+ {current_location, CL},
+ {entity, File}
+ |Options],
+ file),
ok = file:close(FD),
Res
end.
@@ -92,19 +94,22 @@ file(Name,Options) ->
%% EventState = term()
%% Description: Parse a stream containing an XML document.
%%----------------------------------------------------------------------
-stream(Xml, Options) when is_list(Xml), is_list(Options) ->
+stream(Xml, Options) ->
+ stream(Xml, Options, stream).
+
+stream(Xml, Options, InputType) when is_list(Xml), is_list(Options) ->
State = parse_options(Options, initial_state()),
- case State#xmerl_sax_parser_state.file_type of
+ case State#xmerl_sax_parser_state.file_type of
dtd ->
xmerl_sax_parser_list:parse_dtd(Xml,
State#xmerl_sax_parser_state{encoding = list,
- input_type = stream});
+ input_type = InputType});
normal ->
xmerl_sax_parser_list:parse(Xml,
State#xmerl_sax_parser_state{encoding = list,
- input_type = stream})
+ input_type = InputType})
end;
-stream(Xml, Options) when is_binary(Xml), is_list(Options) ->
+stream(Xml, Options, InputType) when is_binary(Xml), is_list(Options) ->
case parse_options(Options, initial_state()) of
{error, Reason} -> {error, Reason};
State ->
@@ -115,21 +120,22 @@ stream(Xml, Options) when is_binary(Xml), is_list(Options) ->
normal ->
parse
end,
- case detect_charset(Xml, State) of
- {error, Reason} -> {fatal_error,
- {
- State#xmerl_sax_parser_state.current_location,
- State#xmerl_sax_parser_state.entity,
- 1
- },
- Reason,
- [],
- State#xmerl_sax_parser_state.event_state};
- {Xml1, State1} ->
- parse_binary(Xml1,
- State1#xmerl_sax_parser_state{input_type = stream},
- ParseFunction)
- end
+ try
+ {Xml1, State1} = detect_charset(Xml, State),
+ parse_binary(Xml1,
+ State1#xmerl_sax_parser_state{input_type = InputType},
+ ParseFunction)
+ catch
+ throw:{fatal_error, {State2, Reason}} ->
+ {fatal_error,
+ {
+ State2#xmerl_sax_parser_state.current_location,
+ State2#xmerl_sax_parser_state.entity,
+ 1
+ },
+ Reason, [],
+ State2#xmerl_sax_parser_state.event_state}
+ end
end.
%%----------------------------------------------------------------------
@@ -151,8 +157,8 @@ parse_binary(Xml, #xmerl_sax_parser_state{encoding={utf16,big}}=State, F) ->
xmerl_sax_parser_utf16be:F(Xml, State);
parse_binary(Xml, #xmerl_sax_parser_state{encoding=latin1}=State, F) ->
xmerl_sax_parser_latin1:F(Xml, State);
-parse_binary(_, #xmerl_sax_parser_state{encoding=Enc}, _) ->
- {error, lists:flatten(io_lib:format("Charcter set ~p not supported", [Enc]))}.
+parse_binary(_, #xmerl_sax_parser_state{encoding=Enc}, State) ->
+ ?fatal_error(State, lists:flatten(io_lib:format("Charcter set ~p not supported", [Enc]))).
%%----------------------------------------------------------------------
%% Function: initial_state/0
@@ -206,8 +212,7 @@ parse_options([{entity, Entity} |Options], State) ->
parse_options([skip_external_dtd |Options], State) ->
parse_options(Options, State#xmerl_sax_parser_state{skip_external_dtd = true});
parse_options([O |_], _State) ->
- {error,
- lists:flatten(io_lib:format("Option: ~p not supported", [O]))}.
+ {error, lists:flatten(io_lib:format("Option: ~p not supported", [O]))}.
check_encoding_option(E) when E==utf8; E=={utf16,little}; E=={utf16,big};
@@ -225,16 +230,10 @@ check_encoding_option(E) ->
%% Output: {utf8|utf16le|utf16be|iso8859, Xml, State}
%% Description: Detects which character set is used in a binary stream.
%%----------------------------------------------------------------------
-detect_charset(<<>>, #xmerl_sax_parser_state{continuation_fun = undefined} = _) ->
- throw({error, "Can't detect character encoding due to no indata"});
-detect_charset(<<>>, #xmerl_sax_parser_state{continuation_fun = CFun,
- continuation_state = CState} = State) ->
- case CFun(CState) of
- {<<>>, _} ->
- throw({error, "Can't detect character encoding due to lack of indata"});
- {NewBytes, NewContState} ->
- detect_charset(NewBytes, State#xmerl_sax_parser_state{continuation_state = NewContState})
- end;
+detect_charset(<<>>, #xmerl_sax_parser_state{continuation_fun = undefined} = State) ->
+ ?fatal_error(State, "Can't detect character encoding due to lack of indata");
+detect_charset(<<>>, State) ->
+ cf(<<>>, State, fun detect_charset/2);
detect_charset(Bytes, State) ->
case unicode:bom_to_encoding(Bytes) of
{latin1, 0} ->
@@ -244,25 +243,47 @@ detect_charset(Bytes, State) ->
{RealBytes, State#xmerl_sax_parser_state{encoding=Enc}}
end.
+detect_charset_1(<<16#00>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
+detect_charset_1(<<16#00, 16#3C>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
+detect_charset_1(<<16#00, 16#3C, 16#00>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
detect_charset_1(<<16#00, 16#3C, 16#00, 16#3F, _/binary>> = Xml, State) ->
{Xml, State#xmerl_sax_parser_state{encoding={utf16, big}}};
+detect_charset_1(<<16#3C>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
+detect_charset_1(<<16#3C, 16#00>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
+detect_charset_1(<<16#3C, 16#00, 16#3F>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
detect_charset_1(<<16#3C, 16#00, 16#3F, 16#00, _/binary>> = Xml, State) ->
{Xml, State#xmerl_sax_parser_state{encoding={utf16, little}}};
-detect_charset_1(<<16#3C, 16#3F, 16#78, 16#6D, 16#6C, Xml2/binary>> = Xml, State) ->
- case parse_xml_directive(Xml2) of
+detect_charset_1(<<16#3C>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
+detect_charset_1(<<16#3C, 16#3F>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
+detect_charset_1(<<16#3C, 16#3F, 16#78>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
+detect_charset_1(<<16#3C, 16#3F, 16#78, 16#6D>> = Xml, State) ->
+ cf(Xml, State, fun detect_charset_1/2);
+detect_charset_1(<<16#3C, 16#3F, 16#78, 16#6D, 16#6C, Xml2/binary>>, State) ->
+ {Xml3, State1} = read_until_end_of_xml_directive(Xml2, State),
+ case parse_xml_directive(Xml3) of
{error, Reason} ->
- {error, Reason};
+ ?fatal_error(State, Reason);
AttrList ->
case lists:keysearch("encoding", 1, AttrList) of
{value, {_, E}} ->
case convert_encoding(E) of
{error, Reason} ->
- {error, Reason};
+ ?fatal_error(State, Reason);
Enc ->
- {Xml, State#xmerl_sax_parser_state{encoding=Enc}}
+ {<<16#3C, 16#3F, 16#78, 16#6D, 16#6C, Xml3/binary>>,
+ State1#xmerl_sax_parser_state{encoding=Enc}}
end;
_ ->
- {Xml, State}
+ {<<16#3C, 16#3F, 16#78, 16#6D, 16#6C, Xml3/binary>>, State1}
end
end;
detect_charset_1(Xml, State) ->
@@ -372,7 +393,7 @@ parse_value_1(<<C, Rest/binary>>, Stop, Acc) ->
parse_value_1(Rest, Stop, [C |Acc]).
%%======================================================================
-%%Default functions
+%% Default functions
%%======================================================================
%%----------------------------------------------------------------------
%% Function: default_event_cb(Event, LineNo, State) -> Result
@@ -388,7 +409,7 @@ default_event_cb(_Event, _LineNo, State) ->
%%----------------------------------------------------------------------
%% Function: default_continuation_cb(IoDevice) -> Result
%% IoDevice = iodevice()
-%% Output: Result = {[char()], State}
+%% Output: Result = {binary(), IoDevice}
%% Description: Default continuation callback reading blocks.
%%----------------------------------------------------------------------
default_continuation_cb(IoDevice) ->
@@ -398,3 +419,82 @@ default_continuation_cb(IoDevice) ->
{ok, FileBin} ->
{FileBin, IoDevice}
end.
+
+%%----------------------------------------------------------------------
+%% Function: read_until_end_of_xml_directive(Rest, State) -> Result
+%% Rest = binary()
+%% Output: Result = {binary(), State}
+%% Description: Reads a utf8 or latin1 until it finds '?>'
+%%----------------------------------------------------------------------
+read_until_end_of_xml_directive(Rest, State) ->
+ case binary:match(Rest, <<"?>">>) of
+ nomatch ->
+ case cf(Rest, State) of
+ {<<>>, _} ->
+ ?fatal_error(State, "Can't detect character encoding due to lack of indata");
+ {NewBytes, NewState} ->
+ read_until_end_of_xml_directive(NewBytes, NewState)
+ end;
+ _ ->
+ {Rest, State}
+ end.
+
+
+%%----------------------------------------------------------------------
+%% Function : cf(Rest, State) -> Result
+%% Parameters: Rest = binary()
+%% State = #xmerl_sax_parser_state{}
+%% NextCall = fun()
+%% Result : {Rest, State}
+%% Description: Function that uses provided fun to read another chunk from
+%% input stream and calls the fun in NextCall.
+%%----------------------------------------------------------------------
+cf(_Rest, #xmerl_sax_parser_state{continuation_fun = undefined} = State) ->
+ ?fatal_error(State, "Continuation function undefined");
+cf(Rest, #xmerl_sax_parser_state{continuation_fun = CFun, continuation_state = CState} = State) ->
+ Result =
+ try
+ CFun(CState)
+ catch
+ throw:ErrorTerm ->
+ ?fatal_error(State, ErrorTerm);
+ exit:Reason ->
+ ?fatal_error(State, {'EXIT', Reason})
+ end,
+ case Result of
+ {<<>>, _} ->
+ ?fatal_error(State, "Can't detect character encoding due to lack of indata");
+ {NewBytes, NewContState} ->
+ {<<Rest/binary, NewBytes/binary>>,
+ State#xmerl_sax_parser_state{continuation_state = NewContState}}
+ end.
+
+%%----------------------------------------------------------------------
+%% Function : cf(Rest, State, NextCall) -> Result
+%% Parameters: Rest = binary()
+%% State = #xmerl_sax_parser_state{}
+%% NextCall = fun()
+%% Result : {Rest, State}
+%% Description: Function that uses provided fun to read another chunk from
+%% input stream and calls the fun in NextCall.
+%%----------------------------------------------------------------------
+cf(_Rest, #xmerl_sax_parser_state{continuation_fun = undefined} = State, _) ->
+ ?fatal_error(State, "Continuation function undefined");
+cf(Rest, #xmerl_sax_parser_state{continuation_fun = CFun, continuation_state = CState} = State,
+ NextCall) ->
+ Result =
+ try
+ CFun(CState)
+ catch
+ throw:ErrorTerm ->
+ ?fatal_error(State, ErrorTerm);
+ exit:Reason ->
+ ?fatal_error(State, {'EXIT', Reason})
+ end,
+ case Result of
+ {<<>>, _} ->
+ ?fatal_error(State, "Can't detect character encoding due to lack of indata");
+ {NewBytes, NewContState} ->
+ NextCall(<<Rest/binary, NewBytes/binary>>,
+ State#xmerl_sax_parser_state{continuation_state = NewContState})
+ end.
diff --git a/lib/xmerl/src/xmerl_sax_parser.hrl b/lib/xmerl/src/xmerl_sax_parser.hrl
index 932ab0cec5..56a3a42e5f 100644
--- a/lib/xmerl/src/xmerl_sax_parser.hrl
+++ b/lib/xmerl/src/xmerl_sax_parser.hrl
@@ -1,7 +1,7 @@
%%--------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2008-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2008-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -88,14 +88,7 @@
current_location, % Location of the currently parsed XML entity
entity, % Parsed XML entity
skip_external_dtd = false,% If true the external DTD is skipped during parsing
- input_type % Source type: file | stream.
- % This field is a preparation for an fix in R17 of a bug in
- % the conformance against the standard.
- % Today a file which contains two XML documents will be considered
- % well-formed and the second is placed in the rest part of the
- % return tuple, according to the conformance tests this should fail.
- % In the future this will fail if xmerl_sax_aprser:file/2 is used but
- % left to the user in the xmerl_sax_aprser:stream/2 case.
+ input_type % Source type: file | stream
}).
diff --git a/lib/xmerl/src/xmerl_sax_parser_base.erlsrc b/lib/xmerl/src/xmerl_sax_parser_base.erlsrc
index 4d75805b9b..1dca9608cb 100644
--- a/lib/xmerl/src/xmerl_sax_parser_base.erlsrc
+++ b/lib/xmerl/src/xmerl_sax_parser_base.erlsrc
@@ -1,7 +1,7 @@
%%-*-erlang-*-
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2008-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2008-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -64,29 +64,42 @@
%% Description: Parsing XML from input stream.
%%----------------------------------------------------------------------
parse(Xml, State) ->
- RefTable = ets:new(xmerl_sax_entity_refs, [private]),
-
- State1 = event_callback(startDocument, State),
-
- case catch parse_document(Xml, State1#xmerl_sax_parser_state{ref_table=RefTable}) of
- {ok, Rest, State2} ->
- State3 = event_callback(endDocument, State2),
- ets:delete(RefTable),
- {ok, State3#xmerl_sax_parser_state.event_state, Rest};
- {fatal_error, {State2, Reason}} ->
- State3 = event_callback(endDocument, State2),
- ets:delete(RefTable),
- format_error(fatal_error, State3, Reason);
- {event_receiver_error, State2, {Tag, Reason}} ->
- State3 = event_callback(endDocument, State2),
- ets:delete(RefTable),
- format_error(Tag, State3, Reason);
- Other ->
- _State2 = event_callback(endDocument, State1),
- ets:delete(RefTable),
- throw(Other)
+ RefTable = maps:new(),
+
+ try
+ State1 = event_callback(startDocument, State),
+ Result = parse_document(Xml, State1#xmerl_sax_parser_state{ref_table=RefTable}),
+ handle_end_document(Result)
+ catch
+ throw:Exception ->
+ handle_end_document(Exception);
+ _:OtherError ->
+ handle_end_document({other, OtherError, State})
end.
+ % case catch parse_document(Xml, State1#xmerl_sax_parser_state{ref_table=RefTable}) of
+ % {ok, Rest, State2} ->
+ % State3 = event_callback(endDocument, State2),
+ % case check_if_rest_ok(State3#xmerl_sax_parser_state.input_type, Rest) of
+ % true ->
+ % {ok, State3#xmerl_sax_parser_state.event_state, Rest};
+ % false ->
+ % format_error(fatal_error, State3, "Input found after legal document")
+ % end;
+ % {fatal_error, {State2, Reason}} ->
+ % State3 = event_callback(endDocument, State2),
+ % format_error(fatal_error, State3, Reason);
+ % {event_receiver_error, State2, {Tag, Reason}} ->
+ % State3 = event_callback(endDocument, State2),
+ % format_error(Tag, State3, Reason);
+ % {endDocument, Rest, State2} ->
+ % State3 = event_callback(endDocument, State2),
+ % {ok, State3#xmerl_sax_parser_state.event_state, Rest};
+ % Other ->
+ % _State2 = event_callback(endDocument, State1),
+ % {fatal_error, Other}
+ % end.
+
%%----------------------------------------------------------------------
%% Function: parse_dtd(Xml, State) -> Result
%% Input: Xml = string() | binary()
@@ -96,38 +109,120 @@ parse(Xml, State) ->
%% Description: Parsing XML DTD from input stream.
%%----------------------------------------------------------------------
parse_dtd(Xml, State) ->
- RefTable = ets:new(xmerl_sax_entity_refs, [private]),
-
- State1 = event_callback(startDocument, State),
-
- case catch parse_external_entity_1(Xml, State1#xmerl_sax_parser_state{ref_table=RefTable}) of
- {fatal_error, {State2, Reason}} ->
- State3 = event_callback(endDocument, State2),
- ets:delete(RefTable),
- format_error(fatal_error, State3, Reason);
- {event_receiver_error, State2, {Tag, Reason}} ->
- State3 = event_callback(endDocument, State2),
- format_error(Tag, State3, Reason);
- {Rest, State2} when is_record(State2, xmerl_sax_parser_state) ->
- State3 = event_callback(endDocument, State2),
- ets:delete(RefTable),
- {ok, State3#xmerl_sax_parser_state.event_state, Rest};
- {endDocument, Rest, State2} when is_record(State2, xmerl_sax_parser_state) ->
- State3 = event_callback(endDocument, State2),
- ets:delete(RefTable),
- {ok, State3#xmerl_sax_parser_state.event_state, Rest};
- Other ->
- _State2 = event_callback(endDocument, State1),
- ets:delete(RefTable),
- throw(Other)
+ RefTable = maps:new(),
+
+ try
+ State1 = event_callback(startDocument, State),
+ Result = parse_external_entity_1(Xml, State1#xmerl_sax_parser_state{ref_table=RefTable}),
+ handle_end_document(Result)
+ catch
+ throw:Exception ->
+ handle_end_document(Exception);
+ _:OtherError ->
+ handle_end_document({other, OtherError, State})
end.
+ % case catch parse_external_entity_1(Xml, State1#xmerl_sax_parser_state{ref_table=RefTable}) of
+ % {fatal_error, {State2, Reason}} ->
+ % State3 = event_callback(endDocument, State2),
+ % format_error(fatal_error, State3, Reason);
+ % {event_receiver_error, State2, {Tag, Reason}} ->
+ % State3 = event_callback(endDocument, State2),
+ % format_error(Tag, State3, Reason);
+ % {Rest, State2} when is_record(State2, xmerl_sax_parser_state) ->
+ % State3 = event_callback(endDocument, State2),
+ % {ok, State3#xmerl_sax_parser_state.event_state, Rest};
+ % {endDocument, Rest, State2} when is_record(State2, xmerl_sax_parser_state) ->
+ % State3 = event_callback(endDocument, State2),
+ % {ok, State3#xmerl_sax_parser_state.event_state, Rest};
+ % Other ->
+ % _State2 = event_callback(endDocument, State1),
+ % {fatal_error, Other}
+ % end.
+
+
%%======================================================================
%% Internal functions
%%======================================================================
%%----------------------------------------------------------------------
+%% Function: handle_end_document(ParserResult) -> Result
+%% Input: ParseResult = term()
+%% Output: Result = {ok, Rest, EventState} |
+%% EventState = term()
+%% Description: Ends the parsing and formats output
+%%----------------------------------------------------------------------
+handle_end_document({ok, Rest, State}) ->
+ %%ok case from parse
+ try
+ State1 = event_callback(endDocument, State),
+ case check_if_rest_ok(State1#xmerl_sax_parser_state.input_type, Rest) of
+ true ->
+ {ok, State1#xmerl_sax_parser_state.event_state, Rest};
+ false ->
+ format_error(fatal_error, State1, "Input found after legal document")
+ end
+ catch
+ throw:{event_receiver_error, State2, {Tag, Reason}} ->
+ format_error(Tag, State2, Reason);
+ _:Other ->
+ {fatal_error, Other}
+ end;
+handle_end_document({endDocument, Rest, State}) ->
+ %% ok case from parse and parse_dtd
+ try
+ State1 = event_callback(endDocument, State),
+ {ok, State1#xmerl_sax_parser_state.event_state, Rest}
+ catch
+ throw:{event_receiver_error, State2, {Tag, Reason}} ->
+ format_error(Tag, State2, Reason);
+ _:Other ->
+ {fatal_error, Other}
+ end;
+handle_end_document({fatal_error, {State, Reason}}) ->
+ try
+ State1 = event_callback(endDocument, State),
+ format_error(fatal_error, State1, Reason)
+ catch
+ throw:{event_receiver_error, State2, {Tag, Reason}} ->
+ format_error(Tag, State2, Reason);
+ _:Other ->
+ {fatal_error, Other}
+ end;
+handle_end_document({event_receiver_error, State, {Tag, Reason}}) ->
+ try
+ State1 = event_callback(endDocument, State),
+ format_error(Tag, State1, Reason)
+ catch
+ throw:{event_receiver_error, State2, {Tag, Reason}} ->
+ format_error(Tag, State2, Reason);
+ _:Other ->
+ {fatal_error, Other}
+ end;
+handle_end_document({Rest, State}) when is_record(State, xmerl_sax_parser_state) ->
+ %%ok case from parse_dtd
+ try
+ State1 = event_callback(endDocument, State),
+ {ok, State1#xmerl_sax_parser_state.event_state, Rest}
+ catch
+ throw:{event_receiver_error, State2, {Tag, Reason}} ->
+ format_error(Tag, State2, Reason);
+ _:Other ->
+ {fatal_error, Other}
+ end;
+handle_end_document({other, Error, State}) ->
+ try
+ _State1 = event_callback(endDocument, State),
+ {fatal_error, Error}
+ catch
+ throw:{event_receiver_error, State2, {Tag, Reason}} ->
+ format_error(Tag, State2, Reason);
+ _:Other ->
+ {fatal_error, Other}
+ end.
+
+%%----------------------------------------------------------------------
%% Function: parse_document(Rest, State) -> Result
%% Input: Rest = string() | binary()
%% State = #xmerl_sax_parser_state{}
@@ -136,10 +231,11 @@ parse_dtd(Xml, State) ->
%% [1] document ::= prolog element Misc*
%%----------------------------------------------------------------------
parse_document(Rest, State) when is_record(State, xmerl_sax_parser_state) ->
- {Rest1, State1} = parse_xml_decl(Rest, State),
+ {Rest1, State1} = parse_byte_order_mark(Rest, State),
{Rest2, State2} = parse_misc(Rest1, State1, true),
{ok, Rest2, State2}.
+?PARSE_BYTE_ORDER_MARK(Bytes, State).
%%----------------------------------------------------------------------
%% Function: parse_xml_decl(Rest, State) -> Result
@@ -150,15 +246,8 @@ parse_document(Rest, State) when is_record(State, xmerl_sax_parser_state) ->
%% [22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
%% [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
%%----------------------------------------------------------------------
--dialyzer({[no_fail_call, no_match], parse_xml_decl/2}).
parse_xml_decl(?STRING_EMPTY, State) ->
cf(?STRING_EMPTY, State, fun parse_xml_decl/2);
-parse_xml_decl(?BYTE_ORDER_MARK_1, State) ->
- cf(?BYTE_ORDER_MARK_1, State, fun parse_xml_decl/2);
-parse_xml_decl(?BYTE_ORDER_MARK_2, State) ->
- cf(?BYTE_ORDER_MARK_2, State, fun parse_xml_decl/2);
-parse_xml_decl(?BYTE_ORDER_MARK_REST(Rest), State) ->
- cf(Rest, State, fun parse_xml_decl/2);
parse_xml_decl(?STRING("<") = Bytes, State) ->
cf(Bytes, State, fun parse_xml_decl/2);
parse_xml_decl(?STRING("<?") = Bytes, State) ->
@@ -170,31 +259,19 @@ parse_xml_decl(?STRING("<?xm") = Bytes, State) ->
parse_xml_decl(?STRING("<?xml") = Bytes, State) ->
cf(Bytes, State, fun parse_xml_decl/2);
parse_xml_decl(?STRING_REST("<?xml", Rest1), State) ->
- parse_xml_decl_1(Rest1, State);
-parse_xml_decl(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State) when is_binary(Bytes) ->
- case unicode:characters_to_list(Bytes, Enc) of
- {incomplete, _, _} ->
- cf(Bytes, State, fun parse_xml_decl/2);
- {error, _Encoded, _Rest} ->
- ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])));
- _ ->
- parse_prolog(Bytes, State)
- end;
-parse_xml_decl(Bytes, State) ->
- parse_prolog(Bytes, State).
-
+ parse_xml_decl_rest(Rest1, State);
+?PARSE_XML_DECL(Bytes, State).
-parse_xml_decl_1(?STRING_UNBOUND_REST(C, Rest) = Bytes, State) ->
+parse_xml_decl_rest(?STRING_UNBOUND_REST(C, Rest) = Bytes, State) ->
if
?is_whitespace(C) ->
{_XmlAttributes, Rest1, State1} = parse_version_info(Rest, State, []),
- %State2 = event_callback({processingInstruction, "xml", XmlAttributes}, State1),% The XML decl. should not be reported as a PI
parse_prolog(Rest1, State1);
true ->
parse_prolog(?STRING_REST("<?xml", Bytes), State)
end;
-parse_xml_decl_1(Bytes, State) ->
- unicode_incomplete_check([Bytes, State, fun parse_xml_decl_1/2], undefined).
+parse_xml_decl_rest(Bytes, State) ->
+ unicode_incomplete_check([Bytes, State, fun parse_xml_decl_rest/2], undefined).
@@ -216,8 +293,6 @@ parse_prolog(?STRING_REST("<?", Rest), State) ->
parse_prolog(Rest1, State1);
{endDocument, Rest1, State1} ->
parse_prolog(Rest1, State1)
- % IValue = ?TO_INPUT_FORMAT("<?"),
- % {?APPEND_STRING(IValue, Rest1), State1}
end;
parse_prolog(?STRING_REST("<!", Rest), State) ->
parse_prolog_1(Rest, State);
@@ -230,7 +305,6 @@ parse_prolog(Bytes, State) ->
unicode_incomplete_check([Bytes, State, fun parse_prolog/2],
"expecting < or whitespace").
-
parse_prolog_1(?STRING_EMPTY, State) ->
cf(?STRING_EMPTY, State, fun parse_prolog_1/2);
parse_prolog_1(?STRING("D") = Bytes, State) ->
@@ -442,6 +516,16 @@ check_if_new_doc_allowed(stream, []) ->
check_if_new_doc_allowed(_, _) ->
false.
+check_if_rest_ok(file, []) ->
+ true;
+check_if_rest_ok(file, <<>>) ->
+ true;
+check_if_rest_ok(stream, _) ->
+ true;
+check_if_rest_ok(_, _) ->
+ false.
+
+
%%----------------------------------------------------------------------
%% Function: parse_pi_1(Rest, State) -> Result
%% Input: Rest = string() | binary()
@@ -886,11 +970,11 @@ send_end_prefix_mapping_event([{Prefix, _Uri} |Ns], State) ->
parse_eq(?STRING_EMPTY, State) ->
cf(?STRING_EMPTY, State, fun parse_eq/2);
parse_eq(?STRING_REST("=", Rest), State) ->
- {Rest, State};
+ {Rest, State};
parse_eq(?STRING_UNBOUND_REST(C, _) = Bytes, State) when ?is_whitespace(C) ->
- {_WS, Rest, State1} =
- whitespace(Bytes, State, []),
- parse_eq(Rest, State1);
+ {_WS, Rest, State1} =
+ whitespace(Bytes, State, []),
+ parse_eq(Rest, State1);
parse_eq(Bytes, State) ->
unicode_incomplete_check([Bytes, State, fun parse_eq/2],
"expecting = or whitespace").
@@ -908,11 +992,11 @@ parse_eq(Bytes, State) ->
parse_att_value(?STRING_EMPTY, State) ->
cf(?STRING_EMPTY, State, fun parse_att_value/2);
parse_att_value(?STRING_UNBOUND_REST(C, Rest), State) when C == $'; C == $" ->
- parse_att_value(Rest, State, C, []);
+ parse_att_value(Rest, State, C, []);
parse_att_value(?STRING_UNBOUND_REST(C, _) = Bytes, State) when ?is_whitespace(C) ->
- {_WS, Rest, State1} =
- whitespace(Bytes, State, []),
- parse_att_value(Rest, State1);
+ {_WS, Rest, State1} =
+ whitespace(Bytes, State, []),
+ parse_att_value(Rest, State1);
parse_att_value(Bytes, State) ->
unicode_incomplete_check([Bytes, State, fun parse_att_value/2],
"\', \" or whitespace expected").
@@ -1024,16 +1108,21 @@ parse_etag(Bytes, State) ->
unicode_incomplete_check([Bytes, State, fun parse_etag/2],
undefined).
-
parse_etag_1(?STRING_REST(">", Rest),
#xmerl_sax_parser_state{end_tags=[{_ETag, Uri, LocalName, QName, OldNsList, NewNsList}
- |RestOfETags]} = State, _Tag) ->
+ |RestOfETags],
+ input_type=InputType} = State, _Tag) ->
State1 = event_callback({endElement, Uri, LocalName, QName}, State),
State2 = send_end_prefix_mapping_event(NewNsList, State1),
- parse_content(Rest,
- State2#xmerl_sax_parser_state{end_tags=RestOfETags,
- ns = OldNsList},
- [], true);
+ case check_if_new_doc_allowed(InputType, RestOfETags) of
+ true ->
+ throw({endDocument, Rest, State2#xmerl_sax_parser_state{ns = OldNsList}});
+ false ->
+ parse_content(Rest,
+ State2#xmerl_sax_parser_state{end_tags=RestOfETags,
+ ns = OldNsList},
+ [], true)
+ end;
parse_etag_1(?STRING_UNBOUND_REST(_C, _), State, Tag) ->
{P,TN} = Tag,
?fatal_error(State, "Bad EndTag: " ++ P ++ ":" ++ TN);
@@ -1051,21 +1140,26 @@ parse_etag_1(Bytes, State, Tag) ->
%% Description: Parsing the content part of tags
%% [43] content ::= (element | CharData | Reference | CDSect | PI | Comment)*
%%----------------------------------------------------------------------
-
parse_content(?STRING_EMPTY, State, Acc, IgnorableWS) ->
- case catch cf(?STRING_EMPTY, State, Acc, IgnorableWS, fun parse_content/4) of
- {Rest, State1} when is_record(State1, xmerl_sax_parser_state) ->
- {Rest, State1};
- {fatal_error, {State1, Msg}} ->
- case check_if_document_complete(State1, Msg) of
- true ->
- State2 = send_character_event(length(Acc), IgnorableWS, lists:reverse(Acc), State1),
- {?STRING_EMPTY, State2};
- false ->
- ?fatal_error(State1, Msg)
- end;
- Other ->
- throw(Other)
+ case check_if_document_complete(State, "No more bytes") of
+ true ->
+ State1 = send_character_event(length(Acc), IgnorableWS, lists:reverse(Acc), State),
+ {?STRING_EMPTY, State1};
+ false ->
+ case catch cf(?STRING_EMPTY, State, Acc, IgnorableWS, fun parse_content/4) of
+ {Rest, State1} when is_record(State1, xmerl_sax_parser_state) ->
+ {Rest, State1};
+ {fatal_error, {State1, Msg}} ->
+ case check_if_document_complete(State1, Msg) of
+ true ->
+ State2 = send_character_event(length(Acc), IgnorableWS, lists:reverse(Acc), State1),
+ {?STRING_EMPTY, State2};
+ false ->
+ ?fatal_error(State1, Msg)
+ end;
+ Other ->
+ throw(Other)
+ end
end;
parse_content(?STRING("\r") = Bytes, State, Acc, IgnorableWS) ->
cf(Bytes, State, Acc, IgnorableWS, fun parse_content/4);
@@ -1094,7 +1188,7 @@ parse_content(?STRING_REST("<?", Rest), State, Acc, IgnorableWS) ->
parse_content(?STRING_REST("<!", Rest1) = Rest, #xmerl_sax_parser_state{end_tags = ET} = State, Acc, IgnorableWS) ->
case ET of
[] ->
- {Rest, State}; %%LATH : Skicka ignorable WS ???
+ {Rest, State}; %% Skicka ignorable WS ???
_ ->
State1 = send_character_event(length(Acc), IgnorableWS, lists:reverse(Acc), State),
parse_cdata(Rest1, State1)
@@ -1102,7 +1196,7 @@ parse_content(?STRING_REST("<!", Rest1) = Rest, #xmerl_sax_parser_state{end_tags
parse_content(?STRING_REST("<", Rest1) = Rest, #xmerl_sax_parser_state{end_tags = ET} = State, Acc, IgnorableWS) ->
case ET of
[] ->
- {Rest, State}; %%LATH : Skicka ignorable WS ???
+ {Rest, State}; %% Skicka ignorable WS ???
_ ->
State1 = send_character_event(length(Acc), IgnorableWS, lists:reverse(Acc), State),
parse_stag(Rest1, State1)
@@ -1204,7 +1298,6 @@ send_character_event(_, true, String, State) ->
%% Description: Parse whitespaces.
%% [3] S ::= (#x20 | #x9 | #xD | #xA)+
%%----------------------------------------------------------------------
--dialyzer({no_fail_call, whitespace/3}).
whitespace(?STRING_EMPTY, State, Acc) ->
case cf(?STRING_EMPTY, State, Acc, fun whitespace/3) of
{?STRING_EMPTY, State} ->
@@ -1230,16 +1323,7 @@ whitespace(?STRING_REST("\r", Rest), State, Acc) ->
whitespace(Rest, State#xmerl_sax_parser_state{line_no=N+1}, [?lf |Acc]);
whitespace(?STRING_UNBOUND_REST(C, Rest), State, Acc) when ?is_whitespace(C) ->
whitespace(Rest, State, [C|Acc]);
-whitespace(?STRING_UNBOUND_REST(_C, _) = Bytes, State, Acc) ->
- {lists:reverse(Acc), Bytes, State};
-whitespace(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State, Acc) when is_binary(Bytes) ->
- case unicode:characters_to_list(Bytes, Enc) of
- {incomplete, _, _} ->
- cf(Bytes, State, Acc, fun whitespace/3);
- {error, _Encoded, _Rest} ->
- ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])))
- end.
-
+?WHITESPACE(Bytes, State, Acc).
%%----------------------------------------------------------------------
%% Function: parse_reference(Rest, State, HaveToExist) -> Result
@@ -1362,23 +1446,24 @@ parse_pe_reference_1(Bytes, State, Name) ->
"missing ; after reference " ++ Name).
-
%%----------------------------------------------------------------------
-%% Function: insert_reference(Reference, State) -> Result
-%% Parameters: Reference = string()
+%% Function: insert_reference(Name, Ref, State) -> Result
+%% Parameters: Name = string()
+%% Ref = {Type, Value}
+%% Type = atom()
+%% Value = term()
%% State = #xmerl_sax_parser_state{}
%% Result :
%%----------------------------------------------------------------------
-insert_reference({Name, Type, Value}, Table) ->
- case ets:lookup(Table, Name) of
- [{Name, _, _}] ->
- ok;
+insert_reference(Name, Value, #xmerl_sax_parser_state{ref_table = Map} = State) ->
+ case maps:find(Name, Map) of
+ error ->
+ State#xmerl_sax_parser_state{ref_table = maps:put(Name, Value, Map)};
_ ->
- ets:insert(Table, {Name, Type, Value})
+ State
end.
-
%%----------------------------------------------------------------------
%% Function: look_up_reference(Reference, State) -> Result
%% Parameters: Reference = string()
@@ -1396,8 +1481,8 @@ look_up_reference("apos", _, _) ->
look_up_reference("quot", _, _) ->
{internal_general, "quot", "\""};
look_up_reference(Name, HaveToExist, State) ->
- case ets:lookup(State#xmerl_sax_parser_state.ref_table, Name) of
- [{Name, Type, Value}] ->
+ case maps:find(Name, State#xmerl_sax_parser_state.ref_table) of
+ {ok, {Type, Value}} ->
{Type, Name, Value};
_ ->
case HaveToExist of
@@ -1479,7 +1564,7 @@ parse_system_litteral(?STRING_EMPTY, State, Stop, Acc) ->
parse_system_litteral(?STRING_UNBOUND_REST(Stop, Rest), State, Stop, Acc) ->
{lists:reverse(Acc), Rest, State};
parse_system_litteral(?STRING_UNBOUND_REST(C, Rest), State, Stop, Acc) ->
- parse_system_litteral(Rest, State, Stop, [C |Acc]);
+ parse_system_litteral(Rest, State, Stop, [C |Acc]);
parse_system_litteral(Bytes, State, Stop, Acc) ->
unicode_incomplete_check([Bytes, State, Stop, Acc, fun parse_system_litteral/4],
undefined).
@@ -1651,9 +1736,11 @@ parse_external_entity(State, _PubId, SysId) ->
end_tags = []},
- EventState = handle_external_entity(ExtRef, State1),
+ {EventState, RefTable} = handle_external_entity(ExtRef, State1),
- NewState = event_callback({endEntity, SysId}, SaveState#xmerl_sax_parser_state{event_state=EventState}),
+ NewState = event_callback({endEntity, SysId},
+ SaveState#xmerl_sax_parser_state{event_state=EventState,
+ ref_table=RefTable}),
NewState#xmerl_sax_parser_state{file_type=normal}.
@@ -1680,7 +1767,8 @@ handle_external_entity({file, FileToOpen}, State) ->
entity=filename:basename(FileToOpen),
input_type=file}),
ok = file:close(FD),
- EntityState#xmerl_sax_parser_state.event_state
+ {EntityState#xmerl_sax_parser_state.event_state,
+ EntityState#xmerl_sax_parser_state.ref_table}
end;
handle_external_entity({http, Url}, State) ->
@@ -1693,14 +1781,16 @@ handle_external_entity({http, Url}, State) ->
++ file:format_error(Reason));
{ok, FD} ->
{?STRING_EMPTY, EntityState} =
- parse_external_entity_1(<<>>,
+ parse_external_entity_byte_order_mark(<<>>,
State#xmerl_sax_parser_state{continuation_state=FD,
current_location=filename:dirname(Url),
entity=filename:basename(Url),
input_type=file}),
ok = file:close(FD),
ok = file:delete(TmpFile),
- EntityState#xmerl_sax_parser_state.event_state
+ {EntityState#xmerl_sax_parser_state.event_state,
+ EntityState#xmerl_sax_parser_state.ref_table}
+
end
catch
throw:{error, Error} ->
@@ -1709,6 +1799,8 @@ handle_external_entity({http, Url}, State) ->
handle_external_entity({Tag, _Url}, State) ->
?fatal_error(State, "Unsupported URI type: " ++ atom_to_list(Tag)).
+?PARSE_EXTERNAL_ENTITY_BYTE_ORDER_MARK(Bytes, State).
+
%%----------------------------------------------------------------------
%% Function : parse_external_entity_1(Rest, State) -> Result
%% Parameters: Rest = string() | binary()
@@ -1716,22 +1808,15 @@ handle_external_entity({Tag, _Url}, State) ->
%% Result : {Rest, State}
%% Description: Parse the external entity.
%%----------------------------------------------------------------------
--dialyzer({[no_fail_call, no_match], parse_external_entity_1/2}).
parse_external_entity_1(?STRING_EMPTY, #xmerl_sax_parser_state{file_type=Type} = State) ->
case catch cf(?STRING_EMPTY, State, fun parse_external_entity_1/2) of
{Rest, State1} when is_record(State1, xmerl_sax_parser_state) ->
- {Rest, State};
+ {Rest, State1};
{fatal_error, {State1, "No more bytes"}} when Type == dtd; Type == entity ->
{?STRING_EMPTY, State1};
Other ->
throw(Other)
end;
-parse_external_entity_1(?BYTE_ORDER_MARK_1, State) ->
- cf(?BYTE_ORDER_MARK_1, State, fun parse_external_entity_1/2);
-parse_external_entity_1(?BYTE_ORDER_MARK_2, State) ->
- cf(?BYTE_ORDER_MARK_2, State, fun parse_external_entity_1/2);
-parse_external_entity_1(?BYTE_ORDER_MARK_REST(Rest), State) ->
- parse_external_entity_1(Rest, State);
parse_external_entity_1(?STRING("<") = Bytes, State) ->
cf(Bytes, State, fun parse_external_entity_1/2);
parse_external_entity_1(?STRING("<?") = Bytes, State) ->
@@ -2452,24 +2537,24 @@ parse_entity_def(?STRING_EMPTY, State, Name) ->
cf(?STRING_EMPTY, State, Name, fun parse_entity_def/3);
parse_entity_def(?STRING_UNBOUND_REST(C, Rest), State, Name) when C == $'; C == $" ->
{Value, Rest1, State1} = parse_entity_value(Rest, State, C, []),
- insert_reference({Name, internal_general, Value}, State1#xmerl_sax_parser_state.ref_table),
- State2 = event_callback({internalEntityDecl, Name, Value}, State1),
- {_WS, Rest2, State3} = whitespace(Rest1, State2, []),
- parse_def_end(Rest2, State3);
+ State2 = insert_reference(Name, {internal_general, Value}, State1),
+ State3 = event_callback({internalEntityDecl, Name, Value}, State2),
+ {_WS, Rest2, State4} = whitespace(Rest1, State3, []),
+ parse_def_end(Rest2, State4);
parse_entity_def(?STRING_UNBOUND_REST(C, _) = Rest, State, Name) when C == $S; C == $P ->
{PubId, SysId, Rest1, State1} = parse_external_id(Rest, State, false),
{Ndata, Rest2, State2} = parse_ndata(Rest1, State1),
case Ndata of
undefined ->
- insert_reference({Name, external_general, {PubId, SysId}},
- State2#xmerl_sax_parser_state.ref_table),
- State3 = event_callback({externalEntityDecl, Name, PubId, SysId}, State2),
- {Rest2, State3};
+ State3 = insert_reference(Name, {external_general, {PubId, SysId}},
+ State2),
+ State4 = event_callback({externalEntityDecl, Name, PubId, SysId}, State3),
+ {Rest2, State4};
_ ->
- insert_reference({Name, unparsed, {PubId, SysId, Ndata}},
- State2#xmerl_sax_parser_state.ref_table),
- State3 = event_callback({unparsedEntityDecl, Name, PubId, SysId, Ndata}, State2),
- {Rest2, State3}
+ State3 = insert_reference(Name, {unparsed, {PubId, SysId, Ndata}},
+ State2),
+ State4 = event_callback({unparsedEntityDecl, Name, PubId, SysId, Ndata}, State3),
+ {Rest2, State4}
end;
parse_entity_def(Bytes, State, Name) ->
unicode_incomplete_check([Bytes, State, Name, fun parse_entity_def/3],
@@ -2646,19 +2731,19 @@ parse_pe_def(?STRING_EMPTY, State, Name) ->
parse_pe_def(?STRING_UNBOUND_REST(C, Rest), State, Name) when C == $'; C == $" ->
{Value, Rest1, State1} = parse_entity_value(Rest, State, C, []),
Name1 = "%" ++ Name,
- insert_reference({Name1, internal_parameter, Value},
- State1#xmerl_sax_parser_state.ref_table),
- State2 = event_callback({internalEntityDecl, Name1, Value}, State1),
- {_WS, Rest2, State3} = whitespace(Rest1, State2, []),
- parse_def_end(Rest2, State3);
+ State2 = insert_reference(Name1, {internal_parameter, Value},
+ State1),
+ State3 = event_callback({internalEntityDecl, Name1, Value}, State2),
+ {_WS, Rest2, State4} = whitespace(Rest1, State3, []),
+ parse_def_end(Rest2, State4);
parse_pe_def(?STRING_UNBOUND_REST(C, _) = Bytes, State, Name) when C == $S; C == $P ->
{PubId, SysId, Rest1, State1} = parse_external_id(Bytes, State, false),
Name1 = "%" ++ Name,
- insert_reference({Name1, external_parameter, {PubId, SysId}},
- State1#xmerl_sax_parser_state.ref_table),
- State2 = event_callback({externalEntityDecl, Name1, PubId, SysId}, State1),
- {_WS, Rest2, State3} = whitespace(Rest1, State2, []),
- parse_def_end(Rest2, State3);
+ State2 = insert_reference(Name1, {external_parameter, {PubId, SysId}},
+ State1),
+ State3 = event_callback({externalEntityDecl, Name1, PubId, SysId}, State2),
+ {_WS, Rest2, State4} = whitespace(Rest1, State3, []),
+ parse_def_end(Rest2, State4);
parse_pe_def(Bytes, State, Name) ->
unicode_incomplete_check([Bytes, State, Name, fun parse_pe_def/3],
"\", \', SYSTEM or PUBLIC expected").
@@ -3290,7 +3375,7 @@ cf(Rest, #xmerl_sax_parser_state{continuation_fun = CFun, continuation_state = C
catch
throw:ErrorTerm ->
?fatal_error(State, ErrorTerm);
- exit:Reason ->
+ exit:Reason ->
?fatal_error(State, {'EXIT', Reason})
end,
case Result of
diff --git a/lib/xmerl/src/xmerl_sax_parser_latin1.erlsrc b/lib/xmerl/src/xmerl_sax_parser_latin1.erlsrc
index 961806bf4c..6e59347fb8 100644
--- a/lib/xmerl/src/xmerl_sax_parser_latin1.erlsrc
+++ b/lib/xmerl/src/xmerl_sax_parser_latin1.erlsrc
@@ -2,7 +2,7 @@
%%--------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2008-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2008-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -34,8 +34,36 @@
-define(APPEND_STRING(Rest, New), <<Rest/binary, New/binary>>).
-define(TO_INPUT_FORMAT(Val), unicode:characters_to_binary(Val, unicode, latin1)).
-%% STRING_REST and STRING_UNBOUND_REST is only different in the list case
-define(STRING_UNBOUND_REST(MatchChar, Rest), <<MatchChar, Rest/binary>>).
--define(BYTE_ORDER_MARK_1, undefined_bom1).
--define(BYTE_ORDER_MARK_2, undefined_bom2).
--define(BYTE_ORDER_MARK_REST(Rest), <<undefined, Rest/binary>>).
+
+-define(PARSE_BYTE_ORDER_MARK(Bytes, State),
+ parse_byte_order_mark(Bytes, State) ->
+ parse_xml_decl(Bytes, State)).
+
+-define(PARSE_XML_DECL(Bytes, State),
+ parse_xml_decl(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State) when is_binary(Bytes) ->
+ case unicode:characters_to_list(Bytes, Enc) of
+ {incomplete, _, _} ->
+ cf(Bytes, State, fun parse_xml_decl/2);
+ {error, _Encoded, _Rest} ->
+ ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])));
+ _ ->
+ parse_prolog(Bytes, State)
+ end;
+ parse_xml_decl(Bytes, State) ->
+ parse_prolog(Bytes, State)).
+
+-define(WHITESPACE(Bytes, State, Acc),
+ whitespace(?STRING_UNBOUND_REST(_C, _) = Bytes, State, Acc) ->
+ {lists:reverse(Acc), Bytes, State};
+ whitespace(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State, Acc) when is_binary(Bytes) ->
+ case unicode:characters_to_list(Bytes, Enc) of
+ {incomplete, _, _} ->
+ cf(Bytes, State, Acc, fun whitespace/3);
+ {error, _Encoded, _Rest} ->
+ ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])))
+ end).
+
+-define(PARSE_EXTERNAL_ENTITY_BYTE_ORDER_MARK(Bytes, State),
+ parse_external_entity_byte_order_mark(Bytes, State) ->
+ parse_external_entity_1(Bytes, State)).
diff --git a/lib/xmerl/src/xmerl_sax_parser_list.erlsrc b/lib/xmerl/src/xmerl_sax_parser_list.erlsrc
index 624a621d92..ac89896215 100644
--- a/lib/xmerl/src/xmerl_sax_parser_list.erlsrc
+++ b/lib/xmerl/src/xmerl_sax_parser_list.erlsrc
@@ -2,7 +2,7 @@
%%--------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2008-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2008-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -36,6 +36,19 @@
%% In the list case we can't use a '++' when matchin against an unbound variable
-define(STRING_UNBOUND_REST(MatchChar, Rest), [MatchChar | Rest]).
--define(BYTE_ORDER_MARK_1, undefined_bom1).
--define(BYTE_ORDER_MARK_2, undefined_bom2).
--define(BYTE_ORDER_MARK_REST(Rest), [undefined|Rest]).
+
+-define(PARSE_BYTE_ORDER_MARK(Bytes, State),
+ parse_byte_order_mark(Bytes, State) ->
+ parse_xml_decl(Bytes, State)).
+
+-define(PARSE_XML_DECL(Bytes, State),
+ parse_xml_decl(Bytes, State) ->
+ parse_prolog(Bytes, State)).
+
+-define(WHITESPACE(Bytes, State, Acc),
+ whitespace(?STRING_UNBOUND_REST(_C, _) = Bytes, State, Acc) ->
+ {lists:reverse(Acc), Bytes, State}).
+
+-define(PARSE_EXTERNAL_ENTITY_BYTE_ORDER_MARK(Bytes, State),
+ parse_external_entity_byte_order_mark(Bytes, State) ->
+ parse_external_entity_1(Bytes, State)).
diff --git a/lib/xmerl/src/xmerl_sax_parser_utf16be.erlsrc b/lib/xmerl/src/xmerl_sax_parser_utf16be.erlsrc
index ff84ece97a..ec89024729 100644
--- a/lib/xmerl/src/xmerl_sax_parser_utf16be.erlsrc
+++ b/lib/xmerl/src/xmerl_sax_parser_utf16be.erlsrc
@@ -2,7 +2,7 @@
%%--------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2008-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2008-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -34,8 +34,50 @@
-define(APPEND_STRING(Rest, New), <<Rest/binary, New/binary>>).
-define(TO_INPUT_FORMAT(Val), unicode:characters_to_binary(Val, unicode, {utf16, big})).
-%% STRING_REST and STRING_UNBOUND_REST is only different in the list case
-define(STRING_UNBOUND_REST(MatchChar, Rest), <<MatchChar/big-utf16, Rest/binary>>).
--define(BYTE_ORDER_MARK_1, undefined_bom1).
--define(BYTE_ORDER_MARK_2, <<16#FE>>).
+-define(BYTE_ORDER_MARK_1, <<16#FE>>).
-define(BYTE_ORDER_MARK_REST(Rest), <<16#FE, 16#FF, Rest/binary>>).
+
+-define(PARSE_BYTE_ORDER_MARK(Bytes, State),
+ parse_byte_order_mark(?STRING_EMPTY, State) ->
+ cf(?STRING_EMPTY, State, fun parse_byte_order_mark/2);
+ parse_byte_order_mark(?BYTE_ORDER_MARK_1, State) ->
+ cf(?BYTE_ORDER_MARK_1, State, fun parse_byte_order_mark/2);
+ parse_byte_order_mark(?BYTE_ORDER_MARK_REST(Rest), State) ->
+ parse_xml_decl(Rest, State);
+ parse_byte_order_mark(Bytes, State) ->
+ parse_xml_decl(Bytes, State)).
+
+-define(PARSE_XML_DECL(Bytes, State),
+ parse_xml_decl(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State) when is_binary(Bytes) ->
+ case unicode:characters_to_list(Bytes, Enc) of
+ {incomplete, _, _} ->
+ cf(Bytes, State, fun parse_xml_decl/2);
+ {error, _Encoded, _Rest} ->
+ ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])));
+ _ ->
+ parse_prolog(Bytes, State)
+ end;
+ parse_xml_decl(Bytes, State) ->
+ parse_prolog(Bytes, State)).
+
+-define(WHITESPACE(Bytes, State, Acc),
+ whitespace(?STRING_UNBOUND_REST(_C, _) = Bytes, State, Acc) ->
+ {lists:reverse(Acc), Bytes, State};
+ whitespace(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State, Acc) when is_binary(Bytes) ->
+ case unicode:characters_to_list(Bytes, Enc) of
+ {incomplete, _, _} ->
+ cf(Bytes, State, Acc, fun whitespace/3);
+ {error, _Encoded, _Rest} ->
+ ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])))
+ end).
+
+-define(PARSE_EXTERNAL_ENTITY_BYTE_ORDER_MARK(Bytes, State),
+ parse_external_entity_byte_order_mark(?STRING_EMPTY, State) ->
+ cf(?STRING_EMPTY, State, fun parse_external_entity_byte_order_mark/2);
+ parse_external_entity_byte_order_mark(?BYTE_ORDER_MARK_1, State) ->
+ cf(?BYTE_ORDER_MARK_1, State, fun parse_external_entity_byte_order_mark/2);
+ parse_external_entity_byte_order_mark(?BYTE_ORDER_MARK_REST(Rest), State) ->
+ parse_external_entity_1(Rest, State);
+ parse_external_entity_byte_order_mark(Bytes, State) ->
+ parse_external_entity_1(Bytes, State)).
diff --git a/lib/xmerl/src/xmerl_sax_parser_utf16le.erlsrc b/lib/xmerl/src/xmerl_sax_parser_utf16le.erlsrc
index a330fce8d0..566333a045 100644
--- a/lib/xmerl/src/xmerl_sax_parser_utf16le.erlsrc
+++ b/lib/xmerl/src/xmerl_sax_parser_utf16le.erlsrc
@@ -2,7 +2,7 @@
%%--------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2008-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2008-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -34,8 +34,50 @@
-define(APPEND_STRING(Rest, New), <<Rest/binary, New/binary>>).
-define(TO_INPUT_FORMAT(Val), unicode:characters_to_binary(Val, unicode, {utf16, little})).
-%% STRING_REST and STRING_UNBOUND_REST is only different in the list case
-define(STRING_UNBOUND_REST(MatchChar, Rest), <<MatchChar/little-utf16, Rest/binary>>).
--define(BYTE_ORDER_MARK_1, undefined_bom1).
--define(BYTE_ORDER_MARK_2, <<16#FF>>).
+-define(BYTE_ORDER_MARK_1, <<16#FF>>).
-define(BYTE_ORDER_MARK_REST(Rest), <<16#FF, 16#FE, Rest/binary>>).
+
+-define(PARSE_BYTE_ORDER_MARK(Bytes, State),
+ parse_byte_order_mark(?STRING_EMPTY, State) ->
+ cf(?STRING_EMPTY, State, fun parse_byte_order_mark/2);
+ parse_byte_order_mark(?BYTE_ORDER_MARK_1, State) ->
+ cf(?BYTE_ORDER_MARK_1, State, fun parse_byte_order_mark/2);
+ parse_byte_order_mark(?BYTE_ORDER_MARK_REST(Rest), State) ->
+ parse_xml_decl(Rest, State);
+ parse_byte_order_mark(Bytes, State) ->
+ parse_xml_decl(Bytes, State)).
+
+-define(PARSE_XML_DECL(Bytes, State),
+ parse_xml_decl(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State) when is_binary(Bytes) ->
+ case unicode:characters_to_list(Bytes, Enc) of
+ {incomplete, _, _} ->
+ cf(Bytes, State, fun parse_xml_decl/2);
+ {error, _Encoded, _Rest} ->
+ ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])));
+ _ ->
+ parse_prolog(Bytes, State)
+ end;
+ parse_xml_decl(Bytes, State) ->
+ parse_prolog(Bytes, State)).
+
+-define(WHITESPACE(Bytes, State, Acc),
+ whitespace(?STRING_UNBOUND_REST(_C, _) = Bytes, State, Acc) ->
+ {lists:reverse(Acc), Bytes, State};
+ whitespace(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State, Acc) when is_binary(Bytes) ->
+ case unicode:characters_to_list(Bytes, Enc) of
+ {incomplete, _, _} ->
+ cf(Bytes, State, Acc, fun whitespace/3);
+ {error, _Encoded, _Rest} ->
+ ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])))
+ end).
+
+-define(PARSE_EXTERNAL_ENTITY_BYTE_ORDER_MARK(Bytes, State),
+ parse_external_entity_byte_order_mark(?STRING_EMPTY, State) ->
+ cf(?STRING_EMPTY, State, fun parse_external_entity_byte_order_mark/2);
+ parse_external_entity_byte_order_mark(?BYTE_ORDER_MARK_1, State) ->
+ cf(?BYTE_ORDER_MARK_1, State, fun parse_external_entity_byte_order_mark/2);
+ parse_external_entity_byte_order_mark(?BYTE_ORDER_MARK_REST(Rest), State) ->
+ parse_external_entity_1(Rest, State);
+ parse_external_entity_byte_order_mark(Bytes, State) ->
+ parse_external_entity_1(Bytes, State)).
diff --git a/lib/xmerl/src/xmerl_sax_parser_utf8.erlsrc b/lib/xmerl/src/xmerl_sax_parser_utf8.erlsrc
index d46d60d237..f41d06d013 100644
--- a/lib/xmerl/src/xmerl_sax_parser_utf8.erlsrc
+++ b/lib/xmerl/src/xmerl_sax_parser_utf8.erlsrc
@@ -2,7 +2,7 @@
%%--------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2008-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2008-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -34,11 +34,55 @@
-define(APPEND_STRING(Rest, New), <<Rest/binary, New/binary>>).
-define(TO_INPUT_FORMAT(Val), unicode:characters_to_binary(Val, unicode, utf8)).
-
-%% STRING_REST and STRING_UNBOUND_REST is only different in the list case
-define(STRING_UNBOUND_REST(MatchChar, Rest), <<MatchChar/utf8, Rest/binary>>).
-define(BYTE_ORDER_MARK_1, <<16#EF>>).
-define(BYTE_ORDER_MARK_2, <<16#EF, 16#BB>>).
-define(BYTE_ORDER_MARK_REST(Rest), <<16#EF, 16#BB, 16#BF, Rest/binary>>).
+-define(PARSE_BYTE_ORDER_MARK(Bytes, State),
+ parse_byte_order_mark(?STRING_EMPTY, State) ->
+ cf(?STRING_EMPTY, State, fun parse_byte_order_mark/2);
+ parse_byte_order_mark(?BYTE_ORDER_MARK_1, State) ->
+ cf(?BYTE_ORDER_MARK_1, State, fun parse_byte_order_mark/2);
+ parse_byte_order_mark(?BYTE_ORDER_MARK_2, State) ->
+ cf(?BYTE_ORDER_MARK_2, State, fun parse_byte_order_mark/2);
+ parse_byte_order_mark(?BYTE_ORDER_MARK_REST(Rest), State) ->
+ parse_xml_decl(Rest, State);
+ parse_byte_order_mark(Bytes, State) ->
+ parse_xml_decl(Bytes, State)).
+
+-define(PARSE_XML_DECL(Bytes, State),
+ parse_xml_decl(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State) when is_binary(Bytes) ->
+ case unicode:characters_to_list(Bytes, Enc) of
+ {incomplete, _, _} ->
+ cf(Bytes, State, fun parse_xml_decl/2);
+ {error, _Encoded, _Rest} ->
+ ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])));
+ _ ->
+ parse_prolog(Bytes, State)
+ end;
+ parse_xml_decl(Bytes, State) ->
+ parse_prolog(Bytes, State)).
+
+-define(WHITESPACE(Bytes, State, Acc),
+ whitespace(?STRING_UNBOUND_REST(_C, _) = Bytes, State, Acc) ->
+ {lists:reverse(Acc), Bytes, State};
+ whitespace(Bytes, #xmerl_sax_parser_state{encoding=Enc} = State, Acc) when is_binary(Bytes) ->
+ case unicode:characters_to_list(Bytes, Enc) of
+ {incomplete, _, _} ->
+ cf(Bytes, State, Acc, fun whitespace/3);
+ {error, _Encoded, _Rest} ->
+ ?fatal_error(State, lists:flatten(io_lib:format("Bad character, not in ~p\n", [Enc])))
+ end).
+-define(PARSE_EXTERNAL_ENTITY_BYTE_ORDER_MARK(Bytes, State),
+ parse_external_entity_byte_order_mark(?STRING_EMPTY, State) ->
+ cf(?STRING_EMPTY, State, fun parse_external_entity_byte_order_mark/2);
+ parse_external_entity_byte_order_mark(?BYTE_ORDER_MARK_1, State) ->
+ cf(?BYTE_ORDER_MARK_1, State, fun parse_external_entity_byte_order_mark/2);
+ parse_external_entity_byte_order_mark(?BYTE_ORDER_MARK_2, State) ->
+ cf(?BYTE_ORDER_MARK_2, State, fun parse_external_entity_byte_order_mark/2);
+ parse_external_entity_byte_order_mark(?BYTE_ORDER_MARK_REST(Rest), State) ->
+ parse_external_entity_1(Rest, State);
+ parse_external_entity_byte_order_mark(Bytes, State) ->
+ parse_external_entity_1(Bytes, State)).
diff --git a/lib/xmerl/src/xmerl_sax_simple_dom.erl b/lib/xmerl/src/xmerl_sax_simple_dom.erl
index 7eb3afd499..7b15cd92dc 100644
--- a/lib/xmerl/src/xmerl_sax_simple_dom.erl
+++ b/lib/xmerl/src/xmerl_sax_simple_dom.erl
@@ -2,7 +2,7 @@
%%--------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2009-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2009-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -129,8 +129,9 @@ build_dom(endDocument,
State#xmerl_sax_simple_dom_state{dom=[Decl, {Tag, Attributes,
lists:reverse(Content)}]};
_ ->
- ?dbg("~p\n", [D]),
- ?error("we're not at end the document when endDocument event is encountered.")
+ %% endDocument is also sent by the parser when a fault occur to tell
+ %% the event receiver that no more input will be sent
+ State
end;
%% Element
diff --git a/lib/xmerl/src/xmerl_scan.erl b/lib/xmerl/src/xmerl_scan.erl
index 2147a46a13..a1f6ad4e2c 100644
--- a/lib/xmerl/src/xmerl_scan.erl
+++ b/lib/xmerl/src/xmerl_scan.erl
@@ -1,7 +1,7 @@
%%
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2003-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2003-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -111,13 +111,16 @@
%% <dd>Set to 'true' if xmerl should add to elements missing attributes
%% with a defined default value (default 'false').</dd>
%% </dl>
+%% @type xmlElement() = #xmlElement{}.
+%% The record definition is found in xmerl.hrl.
+%% @type xmlDocument() = #xmlDocument{}.
+%% The record definition is found in xmerl.hrl.
%% @type document() = xmlElement() | xmlDocument(). <p>
%% The document returned by <tt>xmerl_scan:string/[1,2]</tt> and
%% <tt>xmerl_scan:file/[1,2]</tt>. The type of the returned record depends on
%% the value of the document option passed to the function.
%% </p>
-
-module(xmerl_scan).
-vsn('0.20').
-date('03-09-16').
@@ -471,8 +474,8 @@ event(_X, S) ->
%% into multiple objects (in which case {Acc',Pos',S'} should be returned.)
%% If {Acc',S'} is returned, Pos will be incremented by 1 by default.
%% Below is an example of an acceptable operation
-acc(X = #xmlText{value = Text}, Acc, S) ->
- {[X#xmlText{value = Text}|Acc], S};
+acc(#xmlText{value = Text}, [X = #xmlText{value = AccText}], S) ->
+ {[X#xmlText{value = AccText ++ Text}], S};
acc(X, Acc, S) ->
{[X|Acc], S}.
@@ -2222,16 +2225,18 @@ processed_whole_element(S=#xmerl_scanner{hook_fun = _Hook,
AllAttrs =
case S#xmerl_scanner.default_attrs of
true ->
- [ #xmlAttribute{name = AttName,
- parents = [{Name, Pos} | Parents],
- language = Lang,
- nsinfo = NSI,
- namespace = Namespace,
- value = AttValue,
- normalized = true} ||
- {AttName, AttValue} <- get_default_attrs(S, Name),
- AttValue =/= no_value,
- not lists:keymember(AttName, #xmlAttribute.name, Attrs) ];
+ DefaultAttrs =
+ [ #xmlAttribute{name = AttName,
+ parents = [{Name, Pos} | Parents],
+ language = Lang,
+ nsinfo = NSI,
+ namespace = Namespace,
+ value = AttValue,
+ normalized = true} ||
+ {AttName, AttValue} <- get_default_attrs(S, Name),
+ AttValue =/= no_value,
+ not lists:keymember(AttName, #xmlAttribute.name, Attrs) ],
+ lists:append(Attrs, DefaultAttrs);
false ->
Attrs
end,
@@ -2304,7 +2309,9 @@ expanded_name(Name, [], #xmlNamespace{default = URI}, S) ->
expanded_name(Name, N = {"xmlns", Local}, #xmlNamespace{nodes = Ns}, S) ->
{_, Value} = lists:keyfind(Local, 1, Ns),
case Name of
- 'xmlns:xml' when Value =/= 'http://www.w3.org/XML/1998/namespace' ->
+ 'xmlns:xml' when Value =:= 'http://www.w3.org/XML/1998/namespace' ->
+ N;
+ 'xmlns:xml' when Value =/= 'http://www.w3.org/XML/1998/namespace' ->
?fatal({xml_prefix_cannot_be_redeclared, Value}, S);
'xmlns:xmlns' ->
?fatal({xmlns_prefix_cannot_be_declared, Value}, S);
@@ -2318,6 +2325,8 @@ expanded_name(Name, N = {"xmlns", Local}, #xmlNamespace{nodes = Ns}, S) ->
N
end
end;
+expanded_name(_Name, {"xml", Local}, _NS, _S) ->
+ {'http://www.w3.org/XML/1998/namespace', list_to_atom(Local)};
expanded_name(_Name, {Prefix, Local}, #xmlNamespace{nodes = Ns}, S) ->
case lists:keysearch(Prefix, 1, Ns) of
{value, {_, URI}} ->
@@ -2328,9 +2337,6 @@ expanded_name(_Name, {Prefix, Local}, #xmlNamespace{nodes = Ns}, S) ->
?fatal({namespace_prefix_not_declared, Prefix}, S)
end.
-
-
-
keyreplaceadd(K, Pos, [H|T], Obj) when K == element(Pos, H) ->
[Obj|T];
keyreplaceadd(K, Pos, [H|T], Obj) ->
diff --git a/lib/xmerl/src/xmerl_xpath.erl b/lib/xmerl/src/xmerl_xpath.erl
index bbebda1030..6146feba49 100644
--- a/lib/xmerl/src/xmerl_xpath.erl
+++ b/lib/xmerl/src/xmerl_xpath.erl
@@ -43,13 +43,27 @@
%% </pre>
%%
%% @type nodeEntity() =
-%% xmlElement()
-%% | xmlAttribute()
-%% | xmlText()
-%% | xmlPI()
-%% | xmlComment()
-%% | xmlNsNode()
-%% | xmlDocument()
+%% #xmlElement{}
+%% | #xmlAttribute{}
+%% | #xmlText{}
+%% | #xmlPI{}
+%% | #xmlComment{}
+%% | #xmlNsNode{}
+%% | #xmlDocument{}
+%%
+%% @type docNodes() = #xmlElement{}
+%% | #xmlAttribute{}
+%% | #xmlText{}
+%% | #xmlPI{}
+%% | #xmlComment{}
+%% | #xmlNsNode{}
+%%
+%% @type docEntity() = #xmlDocument{} | [docNodes()]
+%%
+%% @type xPathString() = string()
+%%
+%% @type parentList() = [{atom(), integer()}]
+%%
%% @type option_list(). <p>Options allows to customize the behaviour of the
%% XPath scanner.
%% </p>
@@ -115,7 +129,7 @@ string(Str, Doc, Options) ->
%% Parents = parentList()
%% Doc = nodeEntity()
%% Options = option_list()
-%% Scalar = xmlObj
+%% Scalar = #xmlObj{}
%% @doc Extracts the nodes from the parsed XML tree according to XPath.
%% xmlObj is a record with fields type and value,
%% where type is boolean | number | string
diff --git a/lib/xmerl/src/xmerl_xs.erl b/lib/xmerl/src/xmerl_xs.erl
index 3e9f6622b8..1ce76cfa41 100644
--- a/lib/xmerl/src/xmerl_xs.erl
+++ b/lib/xmerl/src/xmerl_xs.erl
@@ -45,7 +45,6 @@
% XSLT package which is written i C++.
% See also the <a href="xmerl_xs_examples.html">Tutorial</a>.
% </p>
-
-module(xmerl_xs).
-export([xslapply/2, value_of/1, select/2, built_in_rules/2 ]).
@@ -71,15 +70,13 @@
%% xslapply(fun template/1, E),
%% "&lt;/h1>"];
%% </pre>
-
xslapply(Fun, EList) when is_list(EList) ->
- lists:map( Fun, EList);
+ lists:map(Fun, EList);
xslapply(Fun, E = #xmlElement{})->
lists:map( Fun, E#xmlElement.content).
-
%% @spec value_of(E) -> List
-%% E = unknown()
+%% E = term()
%%
%% @doc Concatenates all text nodes within the tree.
%%
diff --git a/lib/xmerl/src/xmerl_xsd.erl b/lib/xmerl/src/xmerl_xsd.erl
index 4b5efae8dd..a89b3159ec 100644
--- a/lib/xmerl/src/xmerl_xsd.erl
+++ b/lib/xmerl/src/xmerl_xsd.erl
@@ -49,6 +49,7 @@
%% <dd>It is possible by this option to provide a state with process
%% information from an earlier validation.</dd>
%% </dl>
+%% @type filename() = string()
%% @end
%%%-------------------------------------------------------------------
-module(xmerl_xsd).
@@ -138,7 +139,7 @@ state2file(S=#xsd_state{schema_name=SN}) ->
%% @spec state2file(State,FileName) -> ok | {error,Reason}
%% State = global_state()
-%% FileName = filename()
+%% FileName = string()
%% @doc Saves the schema state with all information of the processed
%% schema in a file. You can provide the file name for the saved
%% state. FileName is saved with the <code>.xss</code> extension
@@ -153,7 +154,7 @@ state2file(S,FileName) when is_record(S,xsd_state) ->
%% @spec file2state(FileName) -> {ok,State} | {error,Reason}
%% State = global_state()
-%% FileName = filename()
+%% FileName = string()
%% @doc Reads the schema state with all information of the processed
%% schema from a file created with <code>state2file/[1,2]</code>. The
%% format of this file is internal. The state can then be used
@@ -202,7 +203,7 @@ xmerl_xsd_vsn_check(S=#xsd_state{vsn=MD5_VSN}) ->
process_validate(Schema,Xml) ->
process_validate(Schema,Xml,[]).
%% @spec process_validate(Schema,Element,Options) -> Result
-%% Schema = filename()
+%% Schema = string()
%% Element = XmlElement
%% Options = option_list()
%% Result = {ValidXmlElement,State} | {error,Reason}
@@ -282,7 +283,7 @@ validate3(_,_,S) ->
process_schema(Schema) ->
process_schema(Schema,[]).
%% @spec process_schema(Schema,Options) -> Result
-%% Schema = filename()
+%% Schema = string()
%% Result = {ok,State} | {error,Reason}
%% State = global_state()
%% Reason = [ErrorReason] | ErrorReason
@@ -324,7 +325,7 @@ process_schema2({SE,_},State,_Schema) ->
process_schemas(Schemas) ->
process_schemas(Schemas,[]).
%% @spec process_schemas(Schemas,Options) -> Result
-%% Schemas = [{NameSpace,filename()}|Schemas] | []
+%% Schemas = [{NameSpace,string()}|Schemas] | []
%% Result = {ok,State} | {error,Reason}
%% Reason = [ErrorReason] | ErrorReason
%% Options = option_list()
@@ -5426,7 +5427,7 @@ add_key_once(Key,N,El,L) ->
%% {filename:join([[io_lib:format("/~w(~w)",[X,Y])||{X,Y}<-Parents],Type]),Pos}.
%% @spec format_error(Errors) -> Result
-%% Errors = error_tuple() | [error_tuple()]
+%% Errors = tuple() | [tuple()]
%% Result = string() | [string()]
%% @doc Formats error descriptions to human readable strings.
format_error(L) when is_list(L) ->
diff --git a/lib/xmerl/test/Makefile b/lib/xmerl/test/Makefile
index 7a326e334f..3204f081ba 100644
--- a/lib/xmerl/test/Makefile
+++ b/lib/xmerl/test/Makefile
@@ -1,7 +1,7 @@
#
# %CopyrightBegin%
#
-# Copyright Ericsson AB 2004-2016. All Rights Reserved.
+# Copyright Ericsson AB 2004-2017. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -55,7 +55,8 @@ SUITE_FILES= \
xmerl_xsd_SUITE.erl \
xmerl_xsd_MS2002-01-16_SUITE.erl \
xmerl_xsd_NIST2002-01-16_SUITE.erl \
- xmerl_xsd_Sun2002-01-16_SUITE.erl
+ xmerl_xsd_Sun2002-01-16_SUITE.erl \
+ xmerl_sax_stream_SUITE.erl
XML_FILES= \
testcases.dtd \
@@ -125,4 +126,6 @@ release_tests_spec: opt
@tar cfh - xmerl_xsd_MS2002-01-16_SUITE_data | (cd "$(RELSYSDIR)"; tar xf -)
@tar cfh - xmerl_xsd_NIST2002-01-16_SUITE_data | (cd "$(RELSYSDIR)"; tar xf -)
@tar cfh - xmerl_xsd_Sun2002-01-16_SUITE_data | (cd "$(RELSYSDIR)"; tar xf -)
+ @tar cfh - xmerl_sax_SUITE_data | (cd "$(RELSYSDIR)"; tar xf -)
+ @tar cfh - xmerl_sax_stream_SUITE_data | (cd "$(RELSYSDIR)"; tar xf -)
chmod -R u+w "$(RELSYSDIR)"
diff --git a/lib/xmerl/test/xmerl_SUITE.erl b/lib/xmerl/test/xmerl_SUITE.erl
index e97b8c6a4b..66f04f6c25 100644
--- a/lib/xmerl/test/xmerl_SUITE.erl
+++ b/lib/xmerl/test/xmerl_SUITE.erl
@@ -1,7 +1,7 @@
%%
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2008-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2008-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -54,7 +54,8 @@ groups() ->
cpd_expl_provided_DTD]},
{misc, [],
[latin1_alias, syntax_bug1, syntax_bug2, syntax_bug3,
- pe_ref1, copyright, testXSEIF, export_simple1, export]},
+ pe_ref1, copyright, testXSEIF, export_simple1, export,
+ default_attrs_bug, xml_ns, scan_splits_string_bug]},
{eventp_tests, [], [sax_parse_and_export]},
{ticket_tests, [],
[ticket_5998, ticket_7211, ticket_7214, ticket_7430,
@@ -223,6 +224,54 @@ syntax_bug3(Config) ->
Err -> Err
end.
+default_attrs_bug(Config) ->
+ file:set_cwd(datadir(Config)),
+ Doc = "<!DOCTYPE doc [<!ATTLIST doc b CDATA \"default\">]>\n"
+ "<doc a=\"explicit\"/>",
+ {#xmlElement{attributes = [#xmlAttribute{name = a, value = "explicit"},
+ #xmlAttribute{name = b, value = "default"}]},
+ []
+ } = xmerl_scan:string(Doc, [{default_attrs, true}]),
+ Doc2 = "<!DOCTYPE doc [<!ATTLIST doc b CDATA \"default\">]>\n"
+ "<doc b=\"also explicit\" a=\"explicit\"/>",
+ {#xmlElement{attributes = [#xmlAttribute{name = b, value = "also explicit"},
+ #xmlAttribute{name = a, value = "explicit"}]},
+ []
+ } = xmerl_scan:string(Doc2, [{default_attrs, true}]),
+ ok.
+
+
+xml_ns(Config) ->
+ Doc = "<?xml version='1.0'?>\n"
+ "<doc xml:attr1=\"implicit xml ns\"/>",
+ {#xmlElement{namespace=#xmlNamespace{default = [], nodes = []},
+ attributes = [#xmlAttribute{name = 'xml:attr1',
+ expanded_name = {'http://www.w3.org/XML/1998/namespace',attr1},
+ nsinfo = {"xml","attr1"},
+ namespace = #xmlNamespace{default = [], nodes = []}}]},
+ []
+ } = xmerl_scan:string(Doc, [{namespace_conformant, true}]),
+ Doc2 = "<?xml version='1.0'?>\n"
+ "<doc xmlns:xml=\"http://www.w3.org/XML/1998/namespace\" xml:attr1=\"explicit xml ns\"/>",
+ {#xmlElement{namespace=#xmlNamespace{default = [], nodes = [{"xml",'http://www.w3.org/XML/1998/namespace'}]},
+ attributes = [#xmlAttribute{name = 'xmlns:xml',
+ expanded_name = {"xmlns","xml"},
+ nsinfo = {"xmlns","xml"},
+ namespace = #xmlNamespace{default = [],
+ nodes = [{"xml",'http://www.w3.org/XML/1998/namespace'}]}},
+ #xmlAttribute{name = 'xml:attr1',
+ expanded_name = {'http://www.w3.org/XML/1998/namespace',attr1},
+ nsinfo = {"xml","attr1"},
+ namespace = #xmlNamespace{default = [],
+ nodes = [{"xml",'http://www.w3.org/XML/1998/namespace'}]}}]},
+ []
+ } = xmerl_scan:string(Doc2, [{namespace_conformant, true}]),
+ ok.
+
+scan_splits_string_bug(_Config) ->
+ {#xmlElement{ content = [#xmlText{ value = "Jimmy Zöger" }] }, []}
+ = xmerl_scan:string("<name>Jimmy Z&#246;ger</name>").
+
pe_ref1(Config) ->
file:set_cwd(datadir(Config)),
{#xmlElement{},[]} = xmerl_scan:file(datadir_join(Config,[misc,"PE_ref1.xml"]),[{validation,true}]).
@@ -488,8 +537,7 @@ ticket_7430(Config) ->
{xmlElement,a,a,[],
{xmlNamespace,[],[]},
[],1,[],
- [{xmlText,[{a,1}],1,[],"é",text},
- {xmlText,[{a,1}],2,[],"\né",text}],
+ [{xmlText,[{a,1}],1,[],"é\né",text}],
[],_,undeclared} ->
ok;
_ ->
diff --git a/lib/xmerl/test/xmerl_sax_SUITE.erl b/lib/xmerl/test/xmerl_sax_SUITE.erl
index f5c0a783c4..68b9bcc4a2 100644
--- a/lib/xmerl/test/xmerl_sax_SUITE.erl
+++ b/lib/xmerl/test/xmerl_sax_SUITE.erl
@@ -2,7 +2,7 @@
%%----------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2010-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2010-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -33,7 +33,6 @@
%%======================================================================
%% External functions
%%======================================================================
-
%%----------------------------------------------------------------------
%% Initializations
%%----------------------------------------------------------------------
@@ -42,12 +41,15 @@ all() ->
[{group, bugs}].
groups() ->
- [{bugs, [], [ticket_8213, ticket_8214, ticket_11551]}].
+ [{bugs, [], [ticket_8213, ticket_8214, ticket_11551,
+ fragmented_xml_directive,
+ old_dom_event_fun_endDocument_bug,
+ event_fun_endDocument_error_test,
+ event_fun_startDocument_error_test]}].
-%%----------------------------------------------------------------------
+%%======================================================================
%% Tests
-%%----------------------------------------------------------------------
-
+%%======================================================================
%%----------------------------------------------------------------------
%% Test Case
%% ID: ticket_8213
@@ -56,7 +58,6 @@ ticket_8213(_Config) ->
{ok,ok,[]} = xmerl_sax_parser:stream("<elem/>", [{event_fun, fun (_E,_,_) -> ok end}]),
ok.
-
%%----------------------------------------------------------------------
%% Test Case
%% ID: ticket_8214
@@ -85,17 +86,80 @@ ticket_11551(_Config) ->
<a>hej</a>
<?xml version=\"1.0\" encoding=\"utf-8\" ?>
<a>hej</a>">>,
- {ok, undefined, <<"<?xml", _/binary>>} = xmerl_sax_parser:stream(Stream1, []),
+ {ok, undefined, <<"\n<?xml", _/binary>>} = xmerl_sax_parser:stream(Stream1, []),
Stream2= <<"<?xml version=\"1.0\" encoding=\"utf-8\" ?>
<a>hej</a>
<?xml version=\"1.0\" encoding=\"utf-8\" ?>
<a>hej</a>">>,
- {ok, undefined, <<"<?xml", _/binary>>} = xmerl_sax_parser:stream(Stream2, []),
+ {ok, undefined, <<"\n\n\n<?xml", _/binary>>} = xmerl_sax_parser:stream(Stream2, []),
Stream3= <<"<a>hej</a>
<?xml version=\"1.0\" encoding=\"utf-8\" ?>
<a>hej</a>">>,
- {ok, undefined, <<"<?xml", _/binary>>} = xmerl_sax_parser:stream(Stream3, []),
+ {ok, undefined, <<"\n\n<?xml", _/binary>>} = xmerl_sax_parser:stream(Stream3, []),
+ ok.
+
+%%----------------------------------------------------------------------
+%% Test Case
+%% ID: fragmented_xml_directive
+%% Test of fragmented xml directive by reading one byte per continuation ca
+fragmented_xml_directive(Config) ->
+ DataDir = proplists:get_value(data_dir, Config),
+ Name = filename:join(DataDir, "test_data_1.xml"),
+ {ok, Fd} = file:open(Name, [raw, read,binary]),
+ Cf = fun cf_fragmented_xml_directive/1,
+ {ok, undefined, _} = xmerl_sax_parser:stream(<<>>,
+ [{continuation_fun, Cf},
+ {continuation_state, Fd}]),
+ ok.
+
+%%----------------------------------------------------------------------
+%% Test Case
+%% ID: old_dom_event_fun_endDocument_bug
+%% The old_dom backend previous generateded an uncatched exception
+%% instead of the correct fatal_error from the parser.
+old_dom_event_fun_endDocument_bug(_Config) ->
+ %% Stream contains bad characters,
+ {fatal_error, _, _, _, _} =
+ xmerl_sax_parser:stream([60,63,120,109,108,32,118,101,114,115,105,111,110,61,39,49,46,48,39,32,101,110,99,111,100,105,110,103,61,39,117,116,102,45,56,39,63,62,60,
+ 99,111,109,109,97,110,100,62,60,104,101,97,100,101,114,62,60,116,114,97,110,115,97,99,116,105,111,110,73,100,62,49,60,47,116,114,97,110,
+ 115,97,99,116,105,111,110,73,100,62,60,47,104,101,97,100,101,114,62,60,98,111,100,121,62,95,226,130,172,59,60,60,47,98,111,100,121,62,60,
+ 47,99,111,109,109,97,110,100,62,60,47,120,49,95,49,62],
+ [{event_fun,fun xmerl_sax_old_dom:event/3},
+ {event_state,xmerl_sax_old_dom:initial_state()}]),
ok.
+
+%%----------------------------------------------------------------------
+%% Test Case
+%% ID: event_fun_endDocument_error_test
+event_fun_endDocument_error_test(_Config) ->
+ Stream = <<"<?xml version=\"1.0\" encoding=\"utf-8\"?><a>hej</a>">>,
+ Ef = fun(endDocument, _ , _) -> throw({event_error, "endDocument error"});
+ (_, _, S) -> S
+ end,
+ {event_error, _, _, _, _} = xmerl_sax_parser:stream(Stream, [{event_fun, Ef}]),
+ ok.
+
+%%----------------------------------------------------------------------
+%% Test Case
+%% ID: event_fun_startDocument_error_test
+event_fun_startDocument_error_test(_Config) ->
+ Stream = <<"<?xml version=\"1.0\" encoding=\"utf-8\"?><a>hej</a>">>,
+ Ef = fun(startDocument, _ , _) -> throw({event_error, "endDocument error"});
+ (_, _, S) -> S
+ end,
+ {event_error, _, _, _, _} = xmerl_sax_parser:stream(Stream, [{event_fun, Ef}]),
+ ok.
+
+%%======================================================================
+%% Internal functions
+%%======================================================================
+cf_fragmented_xml_directive(IoDevice) ->
+ case file:read(IoDevice, 1) of
+ eof ->
+ {<<>>, IoDevice};
+ {ok, FileBin} ->
+ {FileBin, IoDevice}
+ end.
diff --git a/lib/xmerl/test/xmerl_sax_SUITE_data/test_data_1.xml b/lib/xmerl/test/xmerl_sax_SUITE_data/test_data_1.xml
new file mode 100644
index 0000000000..efbaee6b81
--- /dev/null
+++ b/lib/xmerl/test/xmerl_sax_SUITE_data/test_data_1.xml
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="utf-8"?>
+<a>
+Hej
+</a>
diff --git a/lib/xmerl/test/xmerl_sax_std_SUITE.erl b/lib/xmerl/test/xmerl_sax_std_SUITE.erl
index 525a3b175a..b8412206cc 100644
--- a/lib/xmerl/test/xmerl_sax_std_SUITE.erl
+++ b/lib/xmerl/test/xmerl_sax_std_SUITE.erl
@@ -2,7 +2,7 @@
%%----------------------------------------------------------------------
%% %CopyrightBegin%
%%
-%% Copyright Ericsson AB 2010-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2010-2017. All Rights Reserved.
%%
%% Licensed under the Apache License, Version 2.0 (the "License");
%% you may not use this file except in compliance with the License.
@@ -507,11 +507,8 @@ end_per_testcase(_Func,_Config) ->
'not-wf-sa-036'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"xmltest","not-wf/sa/036.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_,<<"Illegal data\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -522,11 +519,8 @@ end_per_testcase(_Func,_Config) ->
'not-wf-sa-037'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"xmltest","not-wf/sa/037.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_,<<"&#32;\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -561,11 +555,8 @@ end_per_testcase(_Func,_Config) ->
'not-wf-sa-040'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"xmltest","not-wf/sa/040.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_,<<"<doc></doc>\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -576,11 +567,8 @@ end_per_testcase(_Func,_Config) ->
'not-wf-sa-041'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"xmltest","not-wf/sa/041.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_,<<"<doc></doc>\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -603,11 +591,8 @@ end_per_testcase(_Func,_Config) ->
'not-wf-sa-043'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"xmltest","not-wf/sa/043.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_,<<"Illegal data\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -618,11 +603,8 @@ end_per_testcase(_Func,_Config) ->
'not-wf-sa-044'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"xmltest","not-wf/sa/044.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_,<<"<doc/>\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -669,11 +651,8 @@ end_per_testcase(_Func,_Config) ->
'not-wf-sa-048'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"xmltest","not-wf/sa/048.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_,<<"<![CDATA[]]>\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -1416,11 +1395,8 @@ end_per_testcase(_Func,_Config) ->
'not-wf-sa-110'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"xmltest","not-wf/sa/110.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_,<<"&e;\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -1914,9 +1890,9 @@ end_per_testcase(_Func,_Config) ->
%% Special case becase we returns everything after a legal document
%% as an rest instead of giving and error to let the user handle
%% multipple docs on a stream.
- {ok,_,<<"<?xml version=\"1.0\"?>\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- % R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
- % check_result(R, "not-wf").
+ %{ok,_,<<"<?xml version=\"1.0\"?>\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -7784,11 +7760,8 @@ end_per_testcase(_Func,_Config) ->
'o-p01fail3'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"oasis","p01fail3.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_, <<"<bad/>", _/binary>>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -11417,12 +11390,8 @@ end_per_testcase(_Func,_Config) ->
'ibm-not-wf-P01-ibm01n02'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"ibm","not-wf/P01/ibm01n02.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_, <<"<?xml version=\"1.0\"?>", _/binary>>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- % R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
- % check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Case
@@ -11433,11 +11402,8 @@ end_per_testcase(_Func,_Config) ->
'ibm-not-wf-P01-ibm01n03'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"ibm","not-wf/P01/ibm01n03.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_, <<"<title>Wrong combination!</title>", _/binary>>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Cases
@@ -13027,11 +12993,8 @@ end_per_testcase(_Func,_Config) ->
'ibm-not-wf-P27-ibm27n01'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"ibm","not-wf/P27/ibm27n01.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_, <<"<!ELEMENT cat EMPTY>">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Cases
@@ -13461,11 +13424,8 @@ end_per_testcase(_Func,_Config) ->
'ibm-not-wf-P39-ibm39n06'(Config) ->
file:set_cwd(xmerl_test_lib:get_data_dir(Config)),
Path = filename:join([xmerl_test_lib:get_data_dir(Config),"ibm","not-wf/P39/ibm39n06.xml"]),
- %% Special case becase we returns everything after a legal document
- %% as an rest instead of giving and error to let the user handle
- %% multipple docs on a stream.
- {ok,_,<<"content after end tag\r\n">>} = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]).
- %%check_result(R, "not-wf").
+ R = xmerl_sax_parser:file(Path, [{event_fun, fun(_,_,S) -> S end}]),
+ check_result(R, "not-wf").
%%----------------------------------------------------------------------
%% Test Cases
diff --git a/lib/xmerl/test/xmerl_sax_stream_SUITE.erl b/lib/xmerl/test/xmerl_sax_stream_SUITE.erl
new file mode 100644
index 0000000000..7315f67374
--- /dev/null
+++ b/lib/xmerl/test/xmerl_sax_stream_SUITE.erl
@@ -0,0 +1,260 @@
+%%-*-erlang-*-
+%%----------------------------------------------------------------------
+%% %CopyrightBegin%
+%%
+%% Copyright Ericsson AB 2017. All Rights Reserved.
+%%
+%% Licensed under the Apache License, Version 2.0 (the "License");
+%% you may not use this file except in compliance with the License.
+%% You may obtain a copy of the License at
+%%
+%% http://www.apache.org/licenses/LICENSE-2.0
+%%
+%% Unless required by applicable law or agreed to in writing, software
+%% distributed under the License is distributed on an "AS IS" BASIS,
+%% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+%% See the License for the specific language governing permissions and
+%% limitations under the License.
+%%
+%% %CopyrightEnd%
+%%----------------------------------------------------------------------
+%% File : xmerl_sax_stream_SUITE.erl
+%%----------------------------------------------------------------------
+-module(xmerl_sax_stream_SUITE).
+-compile(export_all).
+
+%%----------------------------------------------------------------------
+%% Include files
+%%----------------------------------------------------------------------
+-include_lib("common_test/include/ct.hrl").
+-include_lib("kernel/include/file.hrl").
+
+%%======================================================================
+%% External functions
+%%======================================================================
+
+%%----------------------------------------------------------------------
+%% Initializations
+%%----------------------------------------------------------------------
+all() ->
+ [
+ one_document,
+ two_documents,
+ one_document_and_junk,
+ end_of_stream
+ ].
+
+%%----------------------------------------------------------------------
+%% Initializations
+%%----------------------------------------------------------------------
+
+init_per_suite(Config) ->
+ Config.
+
+end_per_suite(_Config) ->
+ ok.
+
+init_per_testcase(_TestCase, Config) ->
+ Config.
+
+end_per_testcase(_Func, _Config) ->
+ ok.
+
+%%----------------------------------------------------------------------
+%% Tests
+%%----------------------------------------------------------------------
+
+%%----------------------------------------------------------------------
+%% Send One doc over stream
+one_document(Config) ->
+ Port = 11111,
+
+ {ok, ListenSocket} = listen(Port),
+ Self = self(),
+
+ spawn(
+ fun() ->
+ case catch gen_tcp:accept(ListenSocket) of
+ {ok, S} ->
+ Result = xmerl_sax_parser:stream(<<>>,
+ [{continuation_state, S},
+ {continuation_fun,
+ fun(Sd) ->
+ io:format("Continuation called!!", []),
+ case gen_tcp:recv(Sd, 0) of
+ {ok, Packet} ->
+ io:format("Packet: ~p\n", [Packet]),
+ {Packet, Sd};
+ {error, Reason} ->
+ throw({error, Reason})
+ end
+ end}]),
+ Self ! {xmerl_sax, Result},
+ close(S);
+ Error ->
+ Self ! {xmerl_sax, {error, {accept, Error}}}
+ end
+ end),
+
+ {ok, SendSocket} = connect(localhost, Port),
+
+ {ok, Binary} = file:read_file(filename:join([datadir(Config), "xmerl_sax_stream_one.xml"])),
+
+ send_chunks(SendSocket, Binary),
+
+ receive
+ {xmerl_sax, {ok, undefined, Rest}} ->
+ <<"\n">> = Rest,
+ io:format("Ok Rest: ~p\n", [Rest])
+ after 5000 ->
+ ct:fail("Timeout")
+ end,
+ ok.
+
+%%----------------------------------------------------------------------
+%% Send Two doc over stream
+two_documents(Config) ->
+ Port = 11111,
+
+ {ok, ListenSocket} = listen(Port),
+ Self = self(),
+
+ spawn(
+ fun() ->
+ case catch gen_tcp:accept(ListenSocket) of
+ {ok, S} ->
+ Result = xmerl_sax_parser:stream(<<>>,
+ [{continuation_state, S},
+ {continuation_fun,
+ fun(Sd) ->
+ io:format("Continuation called!!", []),
+ case gen_tcp:recv(Sd, 0) of
+ {ok, Packet} ->
+ io:format("Packet: ~p\n", [Packet]),
+ {Packet, Sd};
+ {error, Reason} ->
+ throw({error, Reason})
+ end
+ end}]),
+ Self ! {xmerl_sax, Result},
+ close(S);
+ Error ->
+ Self ! {xmerl_sax, {error, {accept, Error}}}
+ end
+ end),
+
+ {ok, SendSocket} = connect(localhost, Port),
+
+ {ok, Binary} = file:read_file(filename:join([datadir(Config), "xmerl_sax_stream_two.xml"])),
+
+ send_chunks(SendSocket, Binary),
+
+ receive
+ {xmerl_sax, {ok, undefined, Rest}} ->
+ <<"\n<?x", _R/binary>> = Rest,
+ io:format("Ok Rest: ~p\n", [Rest])
+ after 5000 ->
+ ct:fail("Timeout")
+ end,
+ ok.
+
+%%----------------------------------------------------------------------
+%% Send one doc and then junk on stream
+one_document_and_junk(Config) ->
+ Port = 11111,
+
+ {ok, ListenSocket} = listen(Port),
+ Self = self(),
+
+ spawn(
+ fun() ->
+ case catch gen_tcp:accept(ListenSocket) of
+ {ok, S} ->
+ Result = xmerl_sax_parser:stream(<<>>,
+ [{continuation_state, S},
+ {continuation_fun,
+ fun(Sd) ->
+ io:format("Continuation called!!", []),
+ case gen_tcp:recv(Sd, 0) of
+ {ok, Packet} ->
+ io:format("Packet: ~p\n", [Packet]),
+ {Packet, Sd};
+ {error, Reason} ->
+ throw({error, Reason})
+ end
+ end}]),
+ Self ! {xmerl_sax, Result},
+ close(S);
+ Error ->
+ Self ! {xmerl_sax, {error, {accept, Error}}}
+ end
+ end),
+
+ {ok, SendSocket} = connect(localhost, Port),
+
+ {ok, Binary} = file:read_file(filename:join([datadir(Config), "xmerl_sax_stream_one_junk.xml"])),
+
+ send_chunks(SendSocket, Binary),
+
+ receive
+ {xmerl_sax, {ok, undefined, Rest}} ->
+ <<"\nth", _R/binary>> = Rest,
+ io:format("Ok Rest: ~p\n", [Rest])
+ after 10000 ->
+ ct:fail("Timeout")
+ end,
+ ok.
+
+%%----------------------------------------------------------------------
+%% Test of continuation when end of stream
+end_of_stream(Config) ->
+ Stream = <<"<?xml version=\"1.0\" encoding=\"utf-8\"?><a>hej</a>">>,
+ {ok, undefined, <<>>} = xmerl_sax_parser:stream(Stream, []),
+ ok.
+
+%%----------------------------------------------------------------------
+%% Utility functions
+%%----------------------------------------------------------------------
+listen(Port) ->
+ case catch gen_tcp:listen(Port, [{active, false},
+ binary,
+ {keepalive, true},
+ {reuseaddr,true}]) of
+ {ok, ListenSocket} ->
+ {ok, ListenSocket};
+ {error, Reason} ->
+ {error, {listen, Reason}}
+ end.
+
+close(Socket) ->
+ (catch gen_tcp:close(Socket)).
+
+connect(Host, Port) ->
+ Timeout = 5000,
+ % Options1 = check_options(Options),
+ Options = [binary],
+ case catch gen_tcp:connect(Host, Port, Options, Timeout) of
+ {ok, Socket} ->
+ {ok, Socket};
+ {error, Reason} ->
+ {error, Reason}
+ end.
+
+send_chunks(Socket, Binary) ->
+ BSize = erlang:size(Binary),
+ if
+ BSize > 25 ->
+ <<Head:25/binary, Tail/binary>> = Binary,
+ case gen_tcp:send(Socket, Head) of
+ ok ->
+ timer:sleep(1000),
+ send_chunks(Socket, Tail);
+ {error,closed} ->
+ ok
+ end;
+ true ->
+ gen_tcp:send(Socket, Binary)
+ end.
+
+datadir(Config) ->
+ proplists:get_value(data_dir, Config).
diff --git a/lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_one.xml b/lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_one.xml
new file mode 100644
index 0000000000..30328bb188
--- /dev/null
+++ b/lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_one.xml
@@ -0,0 +1,17 @@
+<?xml version="1.0"?>
+<person>
+<name>
+Arne Andersson
+</name>
+<address>
+<street>
+ Old Road 456
+</street>
+<zip>
+12323
+</zip>
+<city>
+Small City
+</city>
+</address>
+</person>
diff --git a/lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_one_junk.xml b/lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_one_junk.xml
new file mode 100644
index 0000000000..f730a95865
--- /dev/null
+++ b/lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_one_junk.xml
@@ -0,0 +1,18 @@
+<?xml version="1.0"?>
+<person>
+<name>
+Arne Andersson
+</name>
+<address>
+<street>
+ Old Road 456
+</street>
+<zip>
+12323
+</zip>
+<city>
+Small City
+</city>
+</address>
+</person>
+this is junk ......
diff --git a/lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_two.xml b/lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_two.xml
new file mode 100644
index 0000000000..e241a02190
--- /dev/null
+++ b/lib/xmerl/test/xmerl_sax_stream_SUITE_data/xmerl_sax_stream_two.xml
@@ -0,0 +1,34 @@
+<?xml version="1.0"?>
+<person>
+<name>
+Arne Andersson
+</name>
+<address>
+<street>
+ Old Road 456
+</street>
+<zip>
+12323
+</zip>
+<city>
+Small City
+</city>
+</address>
+</person>
+<?xml version="1.0"?>
+<person>
+<name>
+Bertil Bengtson
+</name>
+<address>
+<street>
+ New Road 4
+</street>
+<zip>
+12328
+</zip>
+<city>
+Small City
+</city>
+</address>
+</person>
diff --git a/lib/xmerl/vsn.mk b/lib/xmerl/vsn.mk
index a78a035a1f..2e9c9061d9 100644
--- a/lib/xmerl/vsn.mk
+++ b/lib/xmerl/vsn.mk
@@ -1 +1 @@
-XMERL_VSN = 1.3.11
+XMERL_VSN = 1.3.15
diff --git a/lib/xmerl/xmerl.pub b/lib/xmerl/xmerl.pub
deleted file mode 100644
index 29a81bbde2..0000000000
--- a/lib/xmerl/xmerl.pub
+++ /dev/null
@@ -1,19 +0,0 @@
-{name, "xmerl"}.
-{vsn, {0,18}}.
-{summary, "XML processing tools"}.
-{author, "Ulf Wiger, Johan Blom, Richard Carlsson, Mickael Remond", "[email protected]", "20020307"}.
-{keywords, ["xml", "html"]}.
-{needs, [{compiler, "3.0"}]}.
-{abstract,
- "Implements a set of tools for processing XML documents, as well as "
- "working with XML-like structures in Erlang. The main attraction so far "
- "is a single-pass, highly customizable XML processor. Other components are "
- "an export/translation facility and an XPATH query engine. "
- "This version fixes a few bugs in the scanner, and improves HTML export. "
- "The latest version can be found at http://sowap.sourceforge.net/ "
- "Note that this is still very much a beta product."}.
-
-
-
-
-