aboutsummaryrefslogtreecommitdiffstats
path: root/lib/xmerl/src/xmerl_scan.erl
AgeCommit message (Collapse)Author
2013-01-09Prepare OTP files for Unicode as default encodingHans Bolinder
2011-12-13[xmerl] Remove trailing blanksLars Thorsen
OTP-9821
2011-12-13[xmerl] Fix bug in namespace handling for attributesLars Thorsen
The uniqueness check of attributes failed when the namespace_conformant flag was set to true.
2011-11-11Fix bugs and add a flag to skip commentsLars Thorsen
Add flag {comments, Flag} to xmerl_scan for filtering of comments. Default (true) is that xmlComment records are returned from the scanner and this flag should be set to false if one don't want comments in the output. Fix some bugs to get the test cases to run clean.
2011-11-11Accumulate comments in element nodesAnthony Ramine
2011-11-11Add `default_attrs` optionAnthony Ramine
When `default_attrs` is `true`, any attribute with a default value defined in the doctype but not in the attribute axis of the currently scanned element is added to it.
2011-11-11Allow whole documents to be returnedAnthony Ramine
Functions `xmerl_scan:file/2` and `xmerl_scan:string/2` now accepts a new option `{document, true}` to produce a whole document as a `xmlDocument` record instead of just the root element node. You may wonder why this would be useful, this option is the only way to get to the top-level comments and processing instructions without hooking through the customization functions. Those nodes are needed to implement [Canonical XML][c14n-xml] support. [c14n-xml]: http://www.w3.org/TR/2008/PR-xml-c14n11-20080129/ "Canonical XML"
2011-11-11Track parents and namespace in `#xmlAttribute` nodesAnthony Ramine
2011-11-11Track parents in `#xmlPI` nodesAnthony Ramine
2011-11-11Set `vsn` field in `#xmlDecl` recordAnthony Ramine
2011-11-11Fix namespace-conformance constraintsAnthony Ramine
See [Namespaces in XML 1.0 (Third Edition)][1]: > The prefix xml is by definition bound to the namespace name > http://www.w3.org/XML/1998/namespace. It MAY, but need not, be > declared, and MUST NOT be bound to any other namespace name. Other > prefixes MUST NOT be bound to this namespace name, and it MUST NOT be > declared as the default namespace. > > The prefix xmlns is used only to declare namespace bindings and is by > definition bound to the namespace name http://www.w3.org/2000/xmlns/. > It MUST NOT be declared . Other prefixes MUST NOT be bound to this > namespace name, and it MUST NOT be declared as the default namespace. > Element names MUST NOT have the prefix xmlns. > > In XML documents conforming to this specification, no tag may containe > two attributes which have identical names, or have qualified names > with the same local part and with prefixes which have been bound to > namespace names that are identical. [1] http://www.w3.org/TR/REC-xml-names/
2011-09-26[xmerl] Fix streaming bug in xmerl_scan (Thanks to Simon Cornish)Lars Thorsen
2011-08-09Entity replacement in attributes doesn't work poperly.Lars Thorsen
2011-05-20Update copyright yearsBjörn-Egil Dahlberg
2011-05-10Fix separator error in tokenlists.Lars Thorsen
2011-05-09Add ticet number for tm/xmerl_attr_charref_fixHenrik Nord
OTP-9274
2011-04-28Prevent xmerl from over-normalizing character references in attributesTom Moertel
Section 3.3.3 of the XML Recommendation gives the rules for attribute-value normalization. One of those rules requires that character references not be re-normalized after being replaced with the referenced characters: For a character reference, append the referenced character to the normalized value. And, in particular: Note that if the unnormalized attribute value contains a character reference to a white space character other than space (#x20), the normalized value contains the referenced character itself (#xD, #xA or #x9). Source: http://www.w3.org/TR/xml/#AVNormalize In xmerl_scan, however, character references in attributes are normalized an extra time after replacement. For example, the character reference "&#xA" in the following XML document gets normalized (incorrectly) into a space when parsed: 2> xmerl_scan:string("<root x='&#xA;'/>"). {... [{xmlAttribute,x,[],[],[],[],1,[]," ",false}] ...} This short patch restores the correct behavior: 2> xmerl_scan:string("<root x='&#xA;'/>"). {... [{xmlAttribute,x,[],[],[],[],1,[],"\n",false}] ...} NOTE: This change does not include tests because I could not find a test suite for xmerl.
2010-12-01Fix format_man_pages so it handles all man sections and remove ↵Lars Thorsen
warnings/errors in man pages
2010-09-06Fix improperly hex replacement when document is in UTF-8 format.Lars Thorsen
2009-11-20The R13B03 release.OTP_R13B03Erlang/OTP