author Björn Gustavsson <[email protected]> 2011-10-20 10:49:13 +0200
committer Björn Gustavsson <[email protected]> 2011-10-20 10:49:13 +0200
commit 6ef9aef50dbe839098e4330a97247aa21a15ecde
tree 4e556a50e08da2e9ab139ec0c82c00c7de1a4b4f /system/doc
parent 907772538853d2f89d60702eb140e164a72503ad
parent 34db76765561487e526fe66d3d19ecf3b3fb9dc8
Merge branch 'bjorn/unicode-noncharacters/OTP-9624'
* bjorn/unicode-noncharacters/OTP-9624: Allow noncharacter code points in unicode encoding and decoding
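
A minimal sketch of the behaviour this merge describes, assuming a release that includes the change. The module name `utf_range_demo` and the helper `encode/1` are hypothetical and exist only for illustration. The code points 16#FFFE and 16#FFFF are noncharacters that the old documented ranges excluded. Surrogates (16#D800..16#DFFF) and values above 16#10FFFF remain outside the allowed ranges.

```erlang
-module(utf_range_demo).
-export([encode/1, demo/0]).

%% Hypothetical helper: try to build a single utf8 segment from Value.
%% Construction raises badarg when Value is outside the allowed ranges.
encode(Value) ->
    try
        {ok, <<Value/utf8>>}
    catch
        error:badarg -> {error, badarg}
    end.

demo() ->
    %% Noncharacters are accepted on releases that include this change.
    {ok, <<16#EF,16#BF,16#BE>>} = encode(16#FFFE),
    {ok, <<16#EF,16#BF,16#BF>>} = encode(16#FFFF),
    %% Surrogates and values above 16#10FFFF are still rejected.
    {error, badarg} = encode(16#D800),
    {error, badarg} = encode(16#110000),
    %% Matching accepts the same code points.
    <<CodePoint/utf8>> = <<16#EF,16#BF,16#BF>>,
    16#FFFF = CodePoint,
    ok.
```

On a release without this change, the first two assertions in `demo/0` would be expected to fail with a `badmatch`, because `encode/1` would return `{error, badarg}` for the noncharacters.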
Diffstat (limited to 'system/doc')
-rw-r--r-- system/doc/reference_manual/expressions.xml 12
1 files changed, 5 insertions, 7 deletions
diff --git a/system/doc/reference_manual/expressions.xml b/system/doc/reference_manual/expressions.xml
index 497d7eb464..644896cd7f 100644
--- a/system/doc/reference_manual/expressions.xml
+++ b/system/doc/reference_manual/expressions.xml
@@ -879,9 +879,8 @@ Ei = Value |
and UTF-32, respectively.</p>
<p>When constructing a segment of a <c>utf</c> type, <c>Value</c>
- must be an integer in one of the ranges 0..16#D7FF,
- 16#E000..16#FFFD, or 16#10000..16#10FFFF
- (i.e. a valid Unicode code point). Construction
+ must be an integer in the range 0..16#D7FF or
+ 16#E000....16#10FFFF. Construction
will fail with a <c>badarg</c> exception if <c>Value</c> is
outside the allowed ranges. The size of the resulting binary
segment depends on the type and/or <c>Value</c>. For <c>utf8</c>,
@@ -896,14 +895,13 @@ Ei = Value |
<c><![CDATA[<<$a/utf8,$b/utf8,$c/utf8>>]]></c>.</p>
<p>A successful match of a segment of a <c>utf</c> type results
- in an integer in one of the ranges 0..16#D7FF, 16#E000..16#FFFD,
- or 16#10000..16#10FFFF
- (i.e. a valid Unicode code point). The match will fail if returned value
+ in an integer in the range 0..16#D7FF or 16#E000..16#10FFFF.
+ The match will fail if returned value
would fall outside those ranges.</p>
<p>A segment of type <c>utf8</c> will match 1 to 4 bytes in the binary,
if the binary at the match position contains a valid UTF-8 sequence.
- (See RFC-2279 or the Unicode standard.)</p>
+ (See RFC-3629 or the Unicode standard.)</p>
<p>A segment of type <c>utf16</c> may match 2 or 4 bytes in the binary.
The match will fail if the binary at the match position does not contain