Age | Commit message (Collapse) | Author |
|
Add flag {comments, Flag} to xmerl_scan for filtering of comments.
Default (true) is that xmlComment records are returned from the scanner
and this flag should be set to false if one don't want comments
in the output.
Fix some bugs to get the test cases to run clean.
|
|
|
|
When `default_attrs` is `true`, any attribute with a default value defined
in the doctype but not in the attribute axis of the currently scanned
element is added to it.
|
|
Functions `xmerl_scan:file/2` and `xmerl_scan:string/2` now accepts a
new option `{document, true}` to produce a whole document as a
`xmlDocument` record instead of just the root element node.
You may wonder why this would be useful, this option is the only way to
get to the top-level comments and processing instructions without
hooking through the customization functions. Those nodes are needed to
implement [Canonical XML][c14n-xml] support.
[c14n-xml]: http://www.w3.org/TR/2008/PR-xml-c14n11-20080129/
"Canonical XML"
|
|
|
|
|
|
|
|
See [Namespaces in XML 1.0 (Third Edition)][1]:
> The prefix xml is by definition bound to the namespace name
> http://www.w3.org/XML/1998/namespace. It MAY, but need not, be
> declared, and MUST NOT be bound to any other namespace name. Other
> prefixes MUST NOT be bound to this namespace name, and it MUST NOT be
> declared as the default namespace.
>
> The prefix xmlns is used only to declare namespace bindings and is by
> definition bound to the namespace name http://www.w3.org/2000/xmlns/.
> It MUST NOT be declared . Other prefixes MUST NOT be bound to this
> namespace name, and it MUST NOT be declared as the default namespace.
> Element names MUST NOT have the prefix xmlns.
>
> In XML documents conforming to this specification, no tag may containe
> two attributes which have identical names, or have qualified names
> with the same local part and with prefixes which have been bound to
> namespace names that are identical.
[1] http://www.w3.org/TR/REC-xml-names/
|
|
|
|
|
|
|
|
|
|
OTP-9274
|
|
Section 3.3.3 of the XML Recommendation gives the rules for
attribute-value normalization. One of those rules requires
that character references not be re-normalized after being
replaced with the referenced characters:
For a character reference, append the referenced
character to the normalized value.
And, in particular:
Note that if the unnormalized attribute value contains
a character reference to a white space character other
than space (#x20), the normalized value contains the
referenced character itself (#xD, #xA or #x9).
Source: http://www.w3.org/TR/xml/#AVNormalize
In xmerl_scan, however, character references in attributes are
normalized an extra time after replacement. For example, the
character reference "
" in the following XML document gets
normalized (incorrectly) into a space when parsed:
2> xmerl_scan:string("<root x='
'/>").
{... [{xmlAttribute,x,[],[],[],[],1,[]," ",false}] ...}
This short patch restores the correct behavior:
2> xmerl_scan:string("<root x='
'/>").
{... [{xmlAttribute,x,[],[],[],[],1,[],"\n",false}] ...}
NOTE: This change does not include tests because I could not
find a test suite for xmerl.
|
|
warnings/errors in man pages
|
|
|
|
|