From 1bb2c76c09510bf761c4a6908ae78d1e2a87d574 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A9ter=20Dimitrov?=
Date: Tue, 7 Nov 2017 16:23:09 +0100
Subject: stdlib: Implement compose and dissect query (HTML5)

Implement functions for handling form-urlencoded query strings based
on the HTML5 specification.
---
 lib/stdlib/doc/src/uri_string.xml | 98 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 97 insertions(+), 1 deletion(-)

diff --git a/lib/stdlib/doc/src/uri_string.xml b/lib/stdlib/doc/src/uri_string.xml
index 9ace2b0a05..21f470e763 100644
--- a/lib/stdlib/doc/src/uri_string.xml
+++ b/lib/stdlib/doc/src/uri_string.xml
@@ -31,7 +31,8 @@
URI processing functions.

This module contains functions for parsing and handling URIs
- (RFC 3986).
+ (RFC 3986) and
+ form-urlencoded query strings (HTML5).

A URI is an identifier consisting of a sequence of characters matching the syntax rule named URI in RFC 3986.
@@ -71,6 +72,13 @@
Transforming URIs into a normalized form

normalize/1
+ Composing form-urlencoded query strings from a list of key-value pairs

+ compose_query/1

+ compose_query/2
+
+ Dissecting form-urlencoded query strings into a list of key-value pairs

+ dissect_query/1
+

There are four different encodings present during the handling of URIs:

@@ -102,12 +110,15 @@

Error tuple indicating the type of error. Possible values of the second component:

+ invalid_character
+ invalid_encoding
invalid_input
invalid_map
invalid_percent_encoding
invalid_scheme
invalid_uri
invalid_utf8
+ missing_value

The third component is a term providing additional information about the cause of the error.

@@ -133,6 +144,91 @@
+
+
+ Compose urlencoded query string.
+
+ Composes a form-urlencoded QueryString based on a
+ QueryList, a list of non-percent-encoded key-value pairs.
+ Form-urlencoding is defined in section
+ 4.10.22.6 of the HTML5
+ specification.
+
+ See also the opposite operation
+ dissect_query/1.
+
+ Example:
+
+1> uri_string:compose_query([{"foo bar","1"},{"city","örebro"}]).
+
+2> uri_string:compose_query([{<<"foo bar">>,<<"1">>},
+2> {<<"city">>,<<"örebro"/utf8>>}]).
+>]]>
+	
+
+
+
+
+
+ Compose urlencoded query string.
+
+ Same as compose_query/1 but with an additional
+ Options parameter that controls the encoding ("charset")
+ used by the encoding algorithm. Two encodings are supported: utf8
+ (or unicode) and latin1.
+
+ Each character in the entry's name and value that cannot be expressed using
+ the selected character encoding is replaced by a string consisting of a U+0026
+ AMPERSAND character (&), a "#" (U+0023) character, one or more ASCII
+ digits representing the Unicode code point of the character in base ten, and
+ finally a ";" (U+003B) character.
+
+ Bytes outside the ranges 0x2A, 0x2D, 0x2E, 0x30 to 0x39, 0x41 to 0x5A, 0x5F,
+ 0x61 to 0x7A are percent-encoded (a U+0025 PERCENT SIGN character (%) followed by
+ uppercase ASCII hex digits representing the hexadecimal value of the byte).
+
+ See also the opposite operation
+ dissect_query/1.
+
+ Example:
+
+1> uri_string:compose_query([{"foo bar","1"},{"city","örebro"}],
+1> [{encoding, latin1}]).
+2> uri_string:compose_query([{<<"foo bar">>,<<"1">>},
+2> {<<"city">>,<<"東京"/utf8>>}], [{encoding, latin1}]).
+>]]>
+	
+
+
+
+
+
+ Dissect query string.
+
+ Dissects a urlencoded QueryString and returns a
+ QueryList, a list of non-percent-encoded key-value pairs.
+ Form-urlencoding is defined in section
+ 4.10.22.6 of the HTML5
+ specification.
+
+ It is less strict about its input than the decoding algorithm defined by
+ HTML5
+ and accepts all Unicode characters.
+
+ See also the opposite operation
+ compose_query/1.
+
+ Example:
+
+1> 
+[{"foo bar","1"},{"city","örebro"}]
+2> >).]]>
+[{<<"foo bar">>,<<"1">>},
+ {<<"city">>,<<230,157,177,228,186,172>>}]
+	
+
+
+
Syntax-based normalization.
--
cgit v1.2.3
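
Reviewer note, not part of the patch: the following is a minimal sketch for trying out the functions documented above. It assumes an OTP release that ships uri_string:compose_query/1 and uri_string:dissect_query/1. The module name query_sketch and the functions roundtrip/0, encode_utf8/1 and encode_byte/1 are hypothetical names introduced here for illustration. The two helpers only restate the byte-level rule from the compose_query/2 text, plus the HTML5 rule that a space is serialized as "+"; they are not a description of how uri_string is implemented.

```erlang
-module(query_sketch).
-export([roundtrip/0, encode_utf8/1, encode_byte/1]).

%% Compose a query string from key-value pairs, then dissect it again.
%% Fails with badmatch if either call returns an error tuple or if the
%% round trip does not give back the original pairs.
roundtrip() ->
    Pairs = [{"foo bar", "1"}, {"city", "berlin"}],
    Query = uri_string:compose_query(Pairs),
    Pairs = uri_string:dissect_query(Query),
    Query.

%% Hypothetical helper: convert a string of code points to UTF-8 bytes
%% and apply the byte rule to each byte.
encode_utf8(Chars) ->
    Bytes = binary_to_list(unicode:characters_to_binary(Chars)),
    lists:flatmap(fun encode_byte/1, Bytes).

%% Hypothetical helper: a space becomes "+", bytes in the unreserved set
%% (0x2A, 0x2D, 0x2E, 0x30-0x39, 0x41-0x5A, 0x5F, 0x61-0x7A) are kept,
%% and every other byte is percent-encoded with uppercase hex digits.
encode_byte($\s) ->
    "+";
encode_byte(B) when B =:= $*; B =:= $-; B =:= $.; B =:= $_;
                    B >= $0, B =< $9;
                    B >= $A, B =< $Z;
                    B >= $a, B =< $z ->
    [B];
encode_byte(B) when is_integer(B), B >= 0, B =< 255 ->
    lists:flatten(io_lib:format("%~2.16.0B", [B])).
```

In a shell with the module compiled, query_sketch:encode_utf8("foo bar") evaluates to "foo+bar", and query_sketch:roundtrip() returns the composed query string or crashes if the two functions are not inverses for that input.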