* rb/stdlib_re_unicode_fixes:
Fix lost unicode option in re:compile()
Refactor out repeated block in re module
Fix re:replace/4 to handle unicode charlist Replacement argument
Fix re:replace/4 to handle unicode charlist RE argument
Fix re:replace/4 to handle binary unicode output when nothing replaced
OTP-8394 A number of bugs concerning re and unicode are corrected:
- re:compile no longer loses the unicode option, which also fixes a
bug in re:split.
- re:replace now handles unicode charlist replacement argument
- re:replace now handles unicode RE charlist argument correctly
- re:replace now handles binary unicode output correctly when
nothing is replaced.
Most code, testcases and error isolation done by Rory Byrne.
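
A minimal sketch of the re:compile case, assuming a release that contains these fixes. The module name compile_sketch and the sample codepoint U+0400 are illustrative and not taken from the patch. The pattern is compiled once with the unicode option and then reused by re:split/3 without passing the option again, which relies on the compiled pattern keeping it.

```erlang
-module(compile_sketch).
-export([demo/0]).

%% Hypothetical demo, not part of the original commits.
demo() ->
    %% Compile once with the unicode option; the pattern is U+0400.
    {ok, MP} = re:compile([1024], [unicode]),
    %% Reuse the compiled pattern; the unicode option is not repeated here,
    %% so the split depends on the option surviving compilation.
    re:split([$a, 1024, $b], MP, [{return, list}]).
```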
|
|
A bug in re:replace/4 causes a badarg exception to be thrown when the
Replacement argument is a charlist containing non-ASCII codepoints.
The problem is that the code incorrectly assumes that the Replacement
text is iodata() and calls iolist_to_binary/1 on it. This patch fixes
it to obey the 'unicode' option and handle charlist() Replacement
arguments correctly.
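
The root cause described above can be illustrated outside re itself. This is a sketch, not the actual patch: the module name conversion_sketch is hypothetical, and it only contrasts the two standard conversion calls on a charlist such as [1024] (U+0400).

```erlang
-module(conversion_sketch).
-export([demo/1]).

%% Hypothetical helper: Chars is a charlist that may hold codepoints > 255.
demo(Chars) ->
    %% iolist_to_binary/1 accepts only bytes (0..255) and binaries,
    %% so a codepoint such as 1024 raises badarg.
    Failed = try iolist_to_binary(Chars) of
                 _ -> false
             catch
                 error:badarg -> true
             end,
    %% unicode:characters_to_binary/1 encodes the codepoints as UTF-8.
    Utf8 = unicode:characters_to_binary(Chars),
    {Failed, Utf8}.
```

Calling conversion_sketch:demo([1024]) reports that the iolist conversion failed and returns the UTF-8 encoding from the unicode module.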
|
|
The real problem is in the re:run/3 BIF.
Noticed-by: Rory Byrne
Tests-by: Rory Byrne
|
|
A bug with re:replace/4 causes an exception when: (a) it's given a
unicode charlist as input; (b) it's set to {return,binary}; and
(c) it finds nothing to replace.
The problem is: when re:replace/4 does not find anything to replace
in its Subject input, it calls iolist_to_binary/1 on this data. This
fails if the original input is a charlist with non-ASCII codepoints.
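
A sketch of a call that meets all three conditions, assuming a release that contains the fix. The module name no_match_sketch and the sample data are illustrative. Before the fix, a call of this shape would raise an exception; afterwards it is expected to return the Subject as a UTF-8 binary.

```erlang
-module(no_match_sketch).
-export([demo/0]).

%% Hypothetical demo, not part of the original commits.
demo() ->
    %% (a) unicode charlist input: U+0400 followed by "ab"
    Subject = [1024, $a, $b],
    %% (b) {return, binary} is requested; (c) the pattern "x" matches nothing.
    re:replace(Subject, "x", "y", [unicode, {return, binary}]).
```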
|