Age | Commit message (Collapse) | Author |
|
|
|
Bad optimizing code introduced in 5c51e87bee9d
|
|
|
|
It has been decided that it was to early to deprecate the old
string functions.
This partially reverts commit ccb3f7f9768d3c28783c771df47eec1829e51802.
|
|
* maint:
ERL-558 Add the missing function clause for string:prefix (#1702)
|
|
OTP-14942
|
|
* maint:
Avoid falling measurements testcases on slow machines
stdlib: string optimize special case for ASCII
stdlib: Minor unicode_util opts
|
|
Avoid unicode_util module call for ASCII strings
|
|
They should not be used.
|
|
Do not invoke the internal string:lenght/1 function
when the list length is wanted.
Fixes backwards compatibility for old string functions.
|
|
Prior to this patch, the normalization functions in the
unicode module would raise a function clause error for
non-utf8 binaries.
This patch changes it so it returns {error, SoFar, Invalid}
as characters_to_binary and characters_to_list does in
the unicode module.
Note string:next_codepoint/1 and string:next_grapheme had
to be changed accordingly and also return an error tuple.
|
|
|
|
|
|
Works with unicode:chardata() as input as was decided on OTP board
meeting as response to EEP-35 a long time ago.
Works on graphemes clusters as base, with a few exceptions, does not
handle classic (nor nfd'ified) Hangul nor the extended grapheme
clusters such as the prepend class. That would make handling binaries
as input/output very slow.
List input => list output, binary input => binary output and
mixed input => mixed output for all find/split functions.
So that results can be post-processed without the need to invoke
unicode:characters_to_list|binary for intermediate data.
pad functions return lists of unicode:chardata() for performance.
|
|
|
|
|
|
We can save some time by reversing the original string
before starting the tokenization.
When there is only one separator, we can save even more time
by treating that case specially so that we don't have to call
lists:member/2 for each character.
|
|
|
|
files as delimiters.
While working on a tool that processes Erlang code and testing it against this repo,
I found out about those little sneaky 0xff. I thought it may be of help to other
people build such tools to remove non-conforming-to-standard characters.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Using a float for the number of copies results in an infinite loop.
Check that the argument is an integer.
Reported-By: Eric Pailleau
|
|
|