From c69bbc7ce1af2dc295fc17fcb31485e2d4caafa7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A9ter=20Dimitrov?=
Date: Wed, 7 Feb 2018 16:08:32 +0100
Subject: stdlib: Improve URI normalization in uri_string

- normalize/1 accepts uri_map() as input type and can return
  error() if URI parsing fails.
- Added normalize/2 that can return a normalized uri_map().

Change-Id: Icdd2e60c15019d3eec2e7bc994cae03066a79194
---
 lib/stdlib/doc/src/uri_string.xml | 36 ++++++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/lib/stdlib/doc/src/uri_string.xml b/lib/stdlib/doc/src/uri_string.xml
index 21f470e763..6b52ffdd4d 100644
--- a/lib/stdlib/doc/src/uri_string.xml
+++ b/lib/stdlib/doc/src/uri_string.xml
@@ -4,7 +4,7 @@
-2017 2017
+2017 2018
 Ericsson AB. All Rights Reserved.
@@ -24,7 +24,7 @@
 uri_string
 Péter Dimitrov
 1
-2017-10-24
+2018-02-07
 A
 uri_string
@@ -70,7 +70,8 @@
 transcode/2
 Transforming URIs into a normalized form

-normalize/1
+normalize/1
+normalize/2
Composing form-urlencoded query strings from a list of key-value pairs

compose_query/1

@@ -233,7 +234,7 @@
 Syntax-based normalization.
-Transforms URIString into a normalized form
+Transforms an URI into a normalized form
 using Syntax-Based Normalization as defined by RFC 3986.

 This function implements case normalization, percent-encoding
@@ -247,6 +248,33 @@
 <![CDATA[<<"mid/6">>]]>
 3> uri_string:normalize("http://localhost:80").
 "https://localhost/"
+4> uri_string:normalize(#{scheme => "http",port => 80,path => "/a/b/c/./../../g",
+4> host => "localhost-örebro"}).
+"http://localhost-%C3%B6rebro/a/g"
+
+
+
+
+
+
+Syntax-based normalization.
+
+Same as normalize/1 but with an additional
+Options parameter, that controls if the normalized URI
+shall be returned as an uri_map().
+There is one supported option: return_map.
+
+Example:
+
+1> uri_string:normalize("/a/b/c/./../../g", [return_map]).
+#{path => "/a/g"}
+2> <![CDATA[uri_string:normalize(<<"mid/content=5/../6">>, [return_map]).]]>
+<![CDATA[#{path => <<"mid/6">>}]]>
+3> uri_string:normalize("http://localhost:80", [return_map]).
+#{scheme => "http",path => "/",host => "localhost"}
+4> uri_string:normalize(#{scheme => "http",port => 80,path => "/a/b/c/./../../g",
+4> host => "localhost-örebro"}, [return_map]).
+#{scheme => "http",path => "/a/g",host => "localhost-örebro"}
 	
-- cgit v1.2.3

From c903da9a67c4900c3113bd503c9fc3adaa85bb69 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A9ter=20Dimitrov?=
Date: Tue, 13 Feb 2018 15:48:09 +0100
Subject: stdlib: Update uri_string documentation (HTML 5.2)

- Original link to HTML 5.0 specification was broken as the document
  was moved when later revisions were released.
- Form-urlencoded query string handling conforms to the HTML 5.2
  specification that references WHATWG URL (10 Jan 2018).
- HTML 5.2 does not specify handling of non-UTF-8 form-urlencoded
  query strings, but it is still supported as described in HTML 5.0.

Change-Id: I44603bb501530b16651ecbb9a26ea64e119f83d9
---
 lib/stdlib/doc/src/uri_string.xml | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/lib/stdlib/doc/src/uri_string.xml b/lib/stdlib/doc/src/uri_string.xml
index 6b52ffdd4d..88d4600611 100644
--- a/lib/stdlib/doc/src/uri_string.xml
+++ b/lib/stdlib/doc/src/uri_string.xml
@@ -32,7 +32,11 @@

 This module contains functions for parsing and handling URIs (RFC 3986) and
-form-urlencoded query strings (HTML5).
+form-urlencoded query strings (HTML 5.2).
+
+
+Parsing and serializing non-UTF-8 form-urlencoded query strings are also supported
+(HTML 5.0).

 A URI is an identifier consisting of a sequence of characters matching the syntax
 rule named URI in RFC 3986.
@@ -152,8 +156,10 @@

 Composes a form-urlencoded QueryString based on a
 QueryList, a list of non-percent-encoded key-value pairs.
 Form-urlencoding is defined in section
-4.10.22.6 of the HTML5
-specification.
+4.10.21.6 of the HTML 5.2
+specification and in section 4.10.22.6 of the
+HTML 5.0 specification for
+non-UTF-8 encodings.

 See also the opposite operation dissect_query/1.
@@ -210,12 +216,11 @@

 Dissects an urlencoded QueryString and returns a
 QueryList, a list of non-percent-encoded key-value pairs.
 Form-urlencoding is defined in section
-4.10.22.6 of the HTML5
-specification.
+4.10.21.6 of the HTML 5.2
+specification and in section 4.10.22.6 of the
+HTML 5.0 specification for
+non-UTF-8 encodings.

-It is not as strict for its input as the decoding algorithm defined by
-HTML5
-and accepts all unicode characters.

See also the opposite operation compose_query/1.

-- cgit v1.2.3