From 1b7b4f7398765188815f697444e42029454dcd3d Mon Sep 17 00:00:00 2001
From: Sverker Eriksson
Date: Wed, 8 Mar 2017 20:34:55 +0100
Subject: erts: Mark latin1 atom encoding as deprecated
which means tags ATOM_EXT and SMALL_ATOM_EXT.
---
erts/doc/src/erl_ext_dist.xml | 156 ++++++++++++++++++++----------------------
1 file changed, 76 insertions(+), 80 deletions(-)
(limited to 'erts/doc')
diff --git a/erts/doc/src/erl_ext_dist.xml b/erts/doc/src/erl_ext_dist.xml
index a436a9ca74..da2dc94e5b 100644
--- a/erts/doc/src/erl_ext_dist.xml
+++ b/erts/doc/src/erl_ext_dist.xml
@@ -51,7 +51,7 @@
term into the external format.
To convert binary data encoding to a term, the BIF
- erlang:binary_to_term/1c> is used.
+ erlang:binary_to_term/1 is used.
The distribution does this implicitly when sending messages across
@@ -119,22 +119,18 @@
Compressed Data Format when Expanded
- As from ERTS 9.0 (OTP 20), UTF-8 encoded atoms may contain any Unicode
- character. Although the support for UTF-8 encoded atoms in the external
- format is available since ERTS 5.10 (OTP R16), passing atoms that cannot
- be encoded in Latin-1 is an error in versions earlier than
- Erlang/OTP 20, and the behavior is undefined.
- When distribution flag
- DFLAG_UTF8_ATOMS has been exchanged between both nodes
- in the
- distribution handshake, all atoms in the distribution header
- are encoded in UTF-8, otherwise in Latin-1. The two
- new tags ATOM_UTF8_EXT
- and
- SMALL_ATOM_UTF8_EXT
- are only used if the distribution flag DFLAG_UTF8_ATOMS has
- been exchanged between nodes, or if an atom containing characters
- that cannot be encoded in Latin-1 is encountered.
+ As from ERTS 9.0 (OTP 20), atoms may contain any Unicode
+ characters and are always encoded using the UTF-8 external formats
+ ATOM_UTF8_EXT
+ or SMALL_ATOM_UTF8_EXT.
+ The old Latin-1 formats ATOM_EXT
+ and SMALL_ATOM_EXT
+ are deprecated and are only kept for backward
+ compatibility when decoding terms encoded by older nodes.
+ Support for UTF-8 encoded atoms in the external format has been
+ available since ERTS 5.10 (OTP R16). This ability allows such old nodes
+ to decode, store and encode any Unicode atoms received from a new OTP 20
+ node.
The maximum number of allowed characters in an atom is 255. In the
UTF-8 case, each character can need 4 bytes to be encoded.
@@ -389,28 +385,6 @@
-
-
- ATOM_EXT
-
-
- 1 |
- 2 |
- Len |
-
-
- 100 |
- Len |
- AtomName |
-
- ATOM_EXT
-
- An atom is stored with a 2 byte unsigned length in big-endian order,
- followed by Len numbers of 8-bit Latin-1 characters that forms
- the AtomName. The maximum allowed value for Len is 255.
-
-
-
REFERENCE_EXT
@@ -432,8 +406,8 @@
Encodes a reference object (an object generated with
erlang:make_ref/0).
The Node term is an encoded atom, that is,
- ATOM_EXT,
- SMALL_ATOM_EXT, or
+ ATOM_UTF8_EXT,
+ SMALL_ATOM_UTF8_EXT, or
ATOM_CACHE_REF.
The ID field contains a big-endian unsigned integer,
but is to be regarded as uninterpreted data,
@@ -771,39 +745,6 @@
-
-
- SMALL_ATOM_EXT
-
-
- 1 |
- 1 |
- Len |
-
-
- 115 |
- Len |
- AtomName |
-
- SMALL_ATOM_EXT
-
- An atom is stored with a 1 byte unsigned length,
- followed by Len numbers of 8-bit Latin-1 characters that
- forms the AtomName. Longer atoms can be represented
- by ATOM_EXT.
-
-
-
- SMALL_ATOM_EXT was introduced in ERTS 5.7.2 and
- require an exchange of distribution flag
-
- DFLAG_SMALL_ATOM_TAGS in the
-
- distribution handshake.
-
-
-
-
FUN_EXT
@@ -838,8 +779,8 @@
Module
-
Encoded as an atom, using
- ATOM_EXT,
- SMALL_ATOM_EXT,
+ ATOM_UTF8_EXT,
+ SMALL_ATOM_UTF8_EXT,
or
ATOM_CACHE_REF.
This is the module that the fun is implemented in.
@@ -933,8 +874,8 @@
Module
-
Encoded as an atom, using
- ATOM_EXT,
- SMALL_ATOM_EXT,
+ ATOM_UTF8_EXT,
+ SMALL_ATOM_UTF8_EXT,
or
ATOM_CACHE_REF.
Is the module that the fun is implemented in.
@@ -996,8 +937,8 @@
Module and Function are atoms
- (encoded using ATOM_EXT,
- SMALL_ATOM_EXT, or
+ (encoded using ATOM_UTF8_EXT,
+ SMALL_ATOM_UTF8_EXT, or
ATOM_CACHE_REF).
@@ -1109,6 +1050,61 @@
in the beginning of this section.
+
+
+
+ ATOM_EXT (deprecated)
+
+
+ 1 |
+ 2 |
+ Len |
+
+
+ 100 |
+ Len |
+ AtomName |
+
+ ATOM_EXT
+
+ An atom is stored with a 2 byte unsigned length in big-endian order,
+ followed by Len numbers of 8-bit Latin-1 characters that forms
+ the AtomName. The maximum allowed value for Len is 255.
+
+
+
+
+
+ SMALL_ATOM_EXT (deprecated)
+
+
+ 1 |
+ 1 |
+ Len |
+
+
+ 115 |
+ Len |
+ AtomName |
+
+ SMALL_ATOM_EXT
+
+ An atom is stored with a 1 byte unsigned length,
+ followed by Len numbers of 8-bit Latin-1 characters that
+ forms the AtomName.
+
+
+
+ SMALL_ATOM_EXT was introduced in ERTS 5.7.2 and
+ require an exchange of distribution flag
+
+ DFLAG_SMALL_ATOM_TAGS in the
+
+ distribution handshake.
+
+
+
+
--
cgit v1.2.3
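
Reviewer note: the layouts this patch documents can be sketched as a small decoder. This is a minimal illustration, not the ERTS implementation. The tag values 100 and 115 and the length-field sizes come from the patch above. The values 118 (ATOM_UTF8_EXT) and 119 (SMALL_ATOM_UTF8_EXT) come from the external term format and do not appear in this patch. The helper names `decode_atom` and `encode_atom_utf8` are hypothetical, and choosing the small tag whenever the UTF-8 byte length fits in one byte is an assumption of this sketch.

```python
import struct

# 100 and 115 appear in the patch above; 118 and 119 are the UTF-8
# counterparts from the external term format (not shown in this patch).
ATOM_EXT = 100
SMALL_ATOM_EXT = 115
ATOM_UTF8_EXT = 118
SMALL_ATOM_UTF8_EXT = 119

# tag -> (size of the Len field in bytes, character encoding of AtomName)
_ATOM_TAGS = {
    ATOM_EXT: (2, "latin-1"),
    SMALL_ATOM_EXT: (1, "latin-1"),
    ATOM_UTF8_EXT: (2, "utf-8"),
    SMALL_ATOM_UTF8_EXT: (1, "utf-8"),
}

MAX_ATOM_CHARS = 255  # limit stated in the documentation


def decode_atom(data: bytes, offset: int = 0):
    """Return (atom_name, next_offset) for the encoded atom at data[offset]."""
    tag = data[offset]
    if tag not in _ATOM_TAGS:
        raise ValueError(f"not an atom tag: {tag}")
    len_size, encoding = _ATOM_TAGS[tag]
    start = offset + 1
    if len_size == 2:
        # 2 byte unsigned length in big-endian order
        (length,) = struct.unpack_from(">H", data, start)
    else:
        # 1 byte unsigned length
        length = data[start]
    start += len_size
    end = start + length
    if end > len(data):
        raise ValueError("truncated atom")
    name = data[start:end].decode(encoding)
    if len(name) > MAX_ATOM_CHARS:
        raise ValueError("atom longer than 255 characters")
    return name, end


def encode_atom_utf8(name: str) -> bytes:
    """Encode an atom name with one of the two UTF-8 tags.

    Assumption of this sketch: the small tag is used whenever the
    UTF-8 byte length fits in the 1-byte Len field.
    """
    if len(name) > MAX_ATOM_CHARS:
        raise ValueError("atom longer than 255 characters")
    raw = name.encode("utf-8")
    if len(raw) <= 255:
        return bytes([SMALL_ATOM_UTF8_EXT, len(raw)]) + raw
    # Up to 255 characters at 4 bytes each needs the 2-byte Len field.
    return bytes([ATOM_UTF8_EXT]) + struct.pack(">H", len(raw)) + raw
```

The decoder accepts all four tags, which matches the stated intent of keeping the deprecated Latin-1 tags only for decoding terms from older nodes, while the encoder emits only the UTF-8 tags.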