From 3e6877b06ae395a9d4310ef664d0360867a47f62 Mon Sep 17 00:00:00 2001
From: Patrik Nyblom
Date: Wed, 1 Dec 2010 17:35:40 +0100
Subject: Add documentation about raw filenames and Unicode file name translation mode

---
 lib/stdlib/doc/src/filelib.xml  | 27 ++++++++++++++++++++++++---
 lib/stdlib/doc/src/filename.xml | 15 ++++++++++++---
 2 files changed, 36 insertions(+), 6 deletions(-)

(limited to 'lib/stdlib/doc')

diff --git a/lib/stdlib/doc/src/filelib.xml b/lib/stdlib/doc/src/filelib.xml
index 4ff3b22f32..969aff4fcb 100644
--- a/lib/stdlib/doc/src/filelib.xml
+++ b/lib/stdlib/doc/src/filelib.xml
@@ -36,14 +36,23 @@

This module contains utilities on a higher level than the file module.

+

The module supports Unicode file names: it matches against regular expressions given in Unicode, and it finds and processes raw file names (i.e. files named in a way that does not conform to the expected encoding).

+

If the VM operates in Unicode file naming mode on a machine with transparent file naming, the fun() provided to fold_files/5 needs to be prepared to handle binary file names.

+

For more information about raw file names, see the file module.

DATA TYPES
-filename() = string() | atom() | DeepList
-dirname() = filename()
-DeepList = [char() | atom() | DeepList]
+filename() = string() | atom() | DeepList | RawFilename
+ DeepList = [char() | atom() | DeepList]
+ RawFilename = binary()
+ If the VM is in Unicode filename mode, string() and char() are allowed to be > 255.
+ RawFilename is a filename not subject to Unicode translation, meaning that it
+ can contain characters not conforming to the Unicode encoding expected from the
+ filesystem (i.e. non-UTF-8 characters although the VM is started in Unicode
+ filename mode).
+dirname() = filename()
@@ -90,6 +99,18 @@ DeepList = [char() | atom() | DeepList]
 If Recursive is true all sub-directories to Dir are processed. The regular expression matching is done on just the filename without the directory part.

+ +

If Unicode file name translation is in effect and the file
+ system is completely transparent, file names that cannot be
+ interpreted as Unicode may be encountered, in which case the
+ fun() must be prepared to handle raw file names
+ (i.e. binaries). If the regular expression contains
+ codepoints beyond 255, it will not match file names that do
+ not conform to the expected character encoding (i.e. are not
+ encoded in valid UTF-8).

+ +

For more information about raw file names, see the
+ file module.
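
As an illustration of the requirement above that the fun() passed to fold_files/5 handle raw file names, here is a minimal sketch. The module name raw_names_example and the functions collect/1 and tag/2 are hypothetical and not part of this patch; only filelib:fold_files/5 is the documented call. The sketch assumes raw names arrive as binaries and translated names as strings, as the text describes.

```erlang
-module(raw_names_example).
-export([collect/1, tag/2]).

%% Fold over every file below Dir (recursively) and tag each name.
%% Hypothetical helper for illustration only.
collect(Dir) ->
    filelib:fold_files(Dir, ".*", true, fun tag/2, []).

%% A binary name is a raw file name that could not be translated.
tag(Name, Acc) when is_binary(Name) ->
    [{raw, Name} | Acc];
%% A list name was translated according to the native name encoding.
tag(Name, Acc) when is_list(Name) ->
    [{translated, Name} | Acc].
```

Matching on is_binary/1 first keeps the fun() total for both kinds of name, so the fold does not crash with function_clause when a raw name is encountered.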

diff --git a/lib/stdlib/doc/src/filename.xml b/lib/stdlib/doc/src/filename.xml
index fe6c6f898e..cdee6e4a81 100644
--- a/lib/stdlib/doc/src/filename.xml
+++ b/lib/stdlib/doc/src/filename.xml
@@ -4,7 +4,7 @@
- 1997 2009
+ 1997 2010
 Ericsson AB. All Rights Reserved.
@@ -43,13 +43,22 @@
 only, even if the arguments contain back slashes. Use join/1 to normalize a file name by removing redundant directory separators.

+

The module supports raw file names: if a binary is present, or the file name cannot be interpreted according to the return value of
+ file:native_name_encoding/0, a raw file name is returned as well. For example, filename:join/1 given a path component that is a binary (and that cannot be interpreted under the current native file name encoding) returns a raw file name, with the join operation still performed. For more information about raw file names, see the file module.

DATA TYPES
-name() = string() | atom() | DeepList
-DeepList = [char() | atom() | DeepList]
+name() = string() | atom() | DeepList | RawFilename
+ DeepList = [char() | atom() | DeepList]
+ RawFilename = binary()
+ If the VM is in Unicode filename mode, string() and char() are allowed to be > 255.
+ RawFilename is a filename not subject to Unicode translation, meaning that it
+ can contain characters not conforming to the Unicode encoding expected from the
+ filesystem (i.e. non-UTF-8 characters although the VM is started in Unicode
+ filename mode).
+
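
The filename:join/1 behaviour described in the filename.xml hunk can be sketched as follows. The module raw_join_example and the function join_raw/0 are hypothetical names chosen for illustration. The sketch assumes a binary path component whose bytes are not valid UTF-8, and only checks that the joined result is a binary (a raw file name).

```erlang
-module(raw_join_example).
-export([join_raw/0]).

%% Join an ordinary string component with a binary component.
%% The byte 16#E9 is Latin-1 "e acute" and is not valid UTF-8 on its
%% own, so this component cannot be read as a UTF-8 encoded name.
join_raw() ->
    Raw = <<"caf", 16#E9>>,
    Joined = filename:join(["dir", Raw]),
    %% A binary result is what the text above calls a raw file name.
    {is_binary(Joined), Joined}.
```

Callers that pass the result on to string-handling code would need to check for a binary first, for the same reason the fold_files/5 fun() does.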
-- cgit v1.2.3