path: root/lib/stdlib/src/io_lib_format.erl
Age | Commit message | Author
2017-06-09 | stdlib: Handle Unicode atoms better in io_lib_format | Hans Bolinder
The field width calculation did not handle grapheme clusters well.
2017-04-24 | stdlib: Add Unicode modifier t to control sequences w and W | Hans Bolinder
As of the introduction of Unicode characters in atoms, the control sequences 'w' and 'W' can return non-Latin-1 characters, unless some measure is taken. This commit makes sure that '~w' and '~W' always return Latin-1 characters, or bytes, which can be output to ports or written to raw files. The Unicode translation modifier 't' is needed to return non-Latin-1 characters.
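The effect described above can be checked with a small sketch. The module and function names (`w_modifier_demo`, `latin1_only/1`, `demo/0`) are hypothetical, and the example assumes an OTP release that includes this commit (OTP 20 or later) and a UTF-8 encoded source file.

```erlang
-module(w_modifier_demo).
-export([latin1_only/1, demo/0]).

%% True if every character in the (possibly deep) character list
%% is at most 255, that is, fits in a single byte.
latin1_only(Chars) ->
    lists:all(fun(C) -> C =< 255 end, lists:flatten(Chars)).

demo() ->
    Atom = 'спутник',
    %% Without the modifier, non-Latin-1 characters are expected to be escaped.
    Plain = io_lib:format("~w", [Atom]),
    %% With the 't' modifier, the result may contain code points above 255.
    Unicode = io_lib:format("~tw", [Atom]),
    {latin1_only(Plain), latin1_only(Unicode)}.
```

Under those assumptions, `w_modifier_demo:demo()` should return `{true, false}`: only the `~tw` result contains characters that cannot be written to a port or a raw file as bytes.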
2017-02-02 | Make "~s" fail for Unicode atoms | Björn Gustavsson
26b59dfe67e introduced support for arbitrary Unicode characters in atoms. After that commit, it is possible to print any atom with a "~s" format string:

    1> io:format("~s\n", ['спутник']).
    спутник

Note that the same text as a string will fail:

    2> io:format("~s\n", ["спутник"]).
    ** exception error: bad argument
         in function  io:format/3
            called as io:format(<0.53.0>,"~s\n", [[1089,1087,1091,1090,1085,1080,1082]])

Being more permissive for atoms is probably beneficial for io:format/2. However, for io_lib:format/2, the new behavior breaks this guarantee in the documentation for io_lib:format/2:

    If and only if the Unicode translation modifier is used in the format string (that is, ~ts or ~tc), the resulting list can contain characters beyond the ISO Latin-1 character range (that is, numbers > 255).

The problem is that you can no longer be sure whether io_lib:format/2 will return an iolist that can be successfully passed to a port or iolist_to_binary/1.

We see three solutions:

1. Keep the new behavior. That means that you can get non-iolist data when you use ~s for printing an atom, but a 'badarg' when printing Unicode strings. That is inconsistent, and it delays error detection if the result is passed to a port or iolist_to_binary/1.

2. Always allow Unicode characters for ~s. That would be incompatible, because ~s says that any binary is encoded in latin1, while ~ts says that any binary is encoded in UTF-8. To implement this solution, we could no longer support latin1 binaries; all binaries would have to be encoded in UTF-8.

3. Only allow ~s for atoms where all characters are less than 256. Require ~ts to print atoms such as 'спутник'.

We reject solution 1 because it is slightly incompatible and is inconsistent. We reject solution 2 because it is too incompatible. Therefore, this commit implements solution 3.
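The behavior chosen in solution 3 can be observed with a minimal sketch. The module `s_atom_demo` and its functions are hypothetical helpers, and the example assumes an OTP release that includes this commit and a UTF-8 encoded source file.

```erlang
-module(s_atom_demo).
-export([try_format/2, demo/0]).

%% Run io_lib:format/2 and turn a badarg error into a tagged tuple,
%% so that both outcomes can be compared side by side.
try_format(Format, Args) ->
    try io_lib:format(Format, Args) of
        Chars -> {ok, lists:flatten(Chars)}
    catch
        error:badarg -> {error, badarg}
    end.

demo() ->
    Atom = 'спутник',
    %% "~s" is expected to be rejected for an atom with characters >= 256,
    %% while "~ts" is expected to succeed.
    {try_format("~s", [Atom]), try_format("~ts", [Atom])}.
```

With this commit applied, `s_atom_demo:demo()` should return `{{error,badarg}, {ok, "спутник"}}`, so the error is detected at formatting time instead of later in a port or in iolist_to_binary/1.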
2016-12-01 | Add comments for understanding io_lib_pretty | Richard Carlsson
2016-03-15 | update copyright-year | Henrik Nord
2015-06-18 | Change license text to APLv2 | Bruce Yinhe
2015-03-10 | Make the scanned form of the io_lib format strings available for processing | Richard Carlsson
This adds three new functions to io_lib - scan_format/2, unscan_format/1, and build_text/1 - which expose the parsed form of the format control sequences to make it possible to easily modify or filter the input to io_lib:format/2. This can e.g. be used in order to replace unbounded-size control sequences like ~w or ~p with corresponding depth-limited ~W and ~P before doing the actual formatting.
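The depth-limiting use case mentioned above might look like the following sketch. The module `depth_limit_demo` and the function `limit_format/3` are hypothetical; the sketch assumes that the scanned form is a list of plain characters and maps with `control_char` and `args` keys, as in the documented `io_lib:format_spec()` type.

```erlang
-module(depth_limit_demo).
-export([limit_format/3]).

%% Format like io_lib:format/2, but first rewrite every ~p and ~w
%% into the depth-limited ~P and ~W with the given Depth.
limit_format(Format, Args, Depth) ->
    Scanned = io_lib:scan_format(Format, Args),
    Limited = [limit(Item, Depth) || Item <- Scanned],
    lists:flatten(io_lib:build_text(Limited)).

%% A control sequence: switch to the upper-case variant and append
%% the depth as an extra argument.
limit(#{control_char := C, args := Args} = Spec, Depth)
  when C =:= $p; C =:= $w ->
    Spec#{control_char := C - ($a - $A), args := Args ++ [Depth]};
%% Literal characters and all other control sequences pass through.
limit(Other, _Depth) ->
    Other.
```

For example, `depth_limit_demo:limit_format("~p", [lists:seq(1, 100)], 5)` should give the same text as `io_lib:format("~P", [lists:seq(1, 100), 5])`, without the caller having to edit the format string by hand.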
2014-05-20 | Properly handle fields too short in io_lib_format | Anthony Ramine
Values for which the precision or field width were too small in io_lib_format could trigger an infinite loop or crash in term/5.
Reported-by: Richard Carlsson
2013-02-15 | [stdlib] Add new STDLIB application variable 'shell_strings' | Hans Bolinder
Use the new function shell:strings/1 to toggle how the Erlang shell outputs lists of integers.
2013-02-15 | [stdlib] Add control sequence modifier 'l' | Hans Bolinder
The modifier 'l' can be used for turning off the string recognition of ~p and ~P.
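A short sketch of the difference, using a hypothetical module `l_modifier_demo`; the only library call involved is `io_lib:format/2`.

```erlang
-module(l_modifier_demo).
-export([demo/0]).

demo() ->
    %% A list of printable character codes, which ~p treats as a string.
    Term = [104, 101, 108, 108, 111],
    AsString = lists:flatten(io_lib:format("~p", [Term])),
    %% With the 'l' modifier, string recognition is turned off.
    AsList = lists:flatten(io_lib:format("~lp", [Term])),
    {AsString, AsList}.
```

Here `l_modifier_demo:demo()` should return `{"\"hello\"", "[104,101,108,108,111]"}`: the same term is printed as a quoted string by `~p` and as a list of integers by `~lp`.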
2013-02-13 | Extend ~ts to handle binaries with characters coded in ISO-latin-1 | Hans Bolinder
Make sure io_lib:fwrite() with a format string including "~ts" does not crash when given binaries that cannot be interpreted as UTF-8-encoded strings. We want to avoid crashes caused by excessive use of the 't' modifier.
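As a rough illustration, the sketch below formats a binary that is not valid UTF-8. The module `ts_latin1_demo` is hypothetical, and the fallback to a Latin-1 interpretation is an assumption based on the description above.

```erlang
-module(ts_latin1_demo).
-export([demo/0]).

demo() ->
    %% <<228>> is "ä" in ISO Latin-1, but it is not a valid UTF-8 sequence.
    Latin1Bin = <<228>>,
    %% Expected not to crash; the byte should be passed through as a character.
    lists:flatten(io_lib:format("~ts", [Latin1Bin])).
```

Under that assumption, `ts_latin1_demo:demo()` should return `[228]` rather than raising `badarg`.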
2013-01-25 | Extend char() to Unicode characters | Hans Bolinder
The code related to the introduction of unicode_string() and unicode_char() has been removed. The types char() and string() have been extended to include Unicode characters. In fact char() was changed some time ago; this commit is about cleaning up the documentation and introducing better names for some functions.
2013-01-02 | [stdlib, kernel] Introduce Unicode support for Erlang source files | Hans Bolinder
Expect modifications, additions and corrections. There is a kludge in file_io_server and erl_scan:continuation_location() that's not so pleasing.
2011-03-10 | Fix ~F.Fs bug, add testcase and improve documentation | Raimo Niskanen
2011-03-10 | io_lib_format string precision fix | Ali Yakout
2009-11-20 | The R13B03 release. (tag: OTP_R13B03) | Erlang/OTP