path: root/erts/emulator/beam/erl_unicode.c
Age | Commit message | Author
2018-09-07 | erts: Beautify away #ifdef DEBUG | Sverker Eriksson
"(void)result" will silence warning about unused variable and compiler will optimize away such unused variables.
2018-06-04 | erts: Refactor usage of am_atom_put to ERTS_MAKE_AM | Sverker Eriksson
and let compiler determine string lengths. These were actually wrong in erl_db.c: count_trap\0 replace_tra select_tra
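The idea of letting the compiler determine string lengths can be sketched as follows. `LITERAL_LEN` and `literal_name` are hypothetical names used only for illustration; this is not the actual definition of `ERTS_MAKE_AM`.

```c
#include <stddef.h>

/* Hypothetical macro: length of a string literal without its terminating
 * NUL, computed by the compiler instead of being typed in by hand. */
#define LITERAL_LEN(s) (sizeof(s) - 1)

/* Hypothetical pair of a name and its length. */
struct literal_name {
    const char *text;
    size_t len;
};

/* Builds the pair from a literal; the length can never drift out of sync
 * with the text because it is derived from the same token. */
#define MAKE_LITERAL_NAME(s) { (s), LITERAL_LEN(s) }

static const struct literal_name count_trap_name = MAKE_LITERAL_NAME("count_trap");

size_t count_trap_name_len(void)
{
    return count_trap_name.len;
}
```

A hand-written length that is one too large or one too small is not diagnosed by the compiler, which is why deriving it from the literal is the safer choice.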
2018-03-09 | Always use sys_memcpy/cmp/etc instead of plain memcpy/cmp/etc | John Högberg
2018-01-11 | Merge branch 'maint' | Rickard Green
* maint: Fix encoding of filenames in stacktraces
2018-01-11 | Fix encoding of filenames in stacktraces | Rickard Green
2018-01-03 | Disallow NULs in filename-encoded strings | John Högberg
Previously we accepted trailing NULs, which was backwards compatible as such usage never resulted in misbehavior in the first place. The downside is that it prevented erts_native_filename_need from returning an accurate number of *actual characters*, needlessly complicating encoding-agnostic code like erts_osenv.
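A minimal sketch of such a check, assuming a byte-oriented encoding such as latin1 or UTF-8 (it would not apply to UTF-16, where zero bytes occur inside ordinary characters). `has_no_nul` is a hypothetical helper, not a function from erl_unicode.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if none of the first 'len' bytes of 'buf'
 * is a NUL byte, 0 otherwise. A trailing NUL inside 'len' counts as a
 * rejection too, matching the stricter behaviour described above. */
int has_no_nul(const char *buf, size_t len)
{
    return memchr(buf, '\0', len) == NULL;
}
```

Once every NUL is rejected, the byte count of an accepted string is also the count of bytes that carry actual characters, which is the property the commit message is after.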
2018-01-03 | Replace the libc environment with a thread-safe emulation | John Högberg
putenv(3) and friends aren't thread-safe regardless of how you slice it; a global lock around all environment operations (like before) keeps things safe as far as our own operations go, but we have absolutely no control over what libc or a library dragged in by a driver/NIF does -- they're free to call getenv(3) or putenv(3) without honoring our lock. This commit solves this by setting up an "emulated" environment which can't be touched without going through our interfaces. Third-party libraries can still shoot themselves in the foot but benign uses of os:putenv/2 will no longer risk crashing the emulator.
2017-10-12 | Revert "Merge branch 'rickard/null-char-filenames/ERL-370/OTP-14543' into maint" | Rickard Green
This reverts commit 0717a2194e863f3a78595184ccc5637697f03353, reversing changes made to 71a40658a0cef8b3e25df3a8e48a72d0563a89bf.
2017-10-11 | Don't allow null chars in various strings | Rickard Green
Various places now reject null chars inside strings:
- Primitive file operations reject them in filenames.
- Primitive environment variable operations reject them in names and values.
- os:cmd() rejects them in its input.
Also, '=' characters in environment variable names are rejected by primitive environment variable operations. Documentation has been updated to document null characters in these types of data as invalid. Currently these operations accept null chars at the end of strings, but that will change in the future.
2017-09-27 | Don't allow null in filenames | Rickard Green
2017-05-04 | Update copyright year | Raimo Niskanen
2017-02-14 | erts: Add deallocation veto for magic destructors | Sverker Eriksson
A magic destructor can return 0 and thereby take control and prolong the lifetime of a magic binary.
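The veto mechanism can be illustrated with a small sketch. All names here (`magic_bin`, `magic_release`, `keep_while_pending`, the `pending` field) are hypothetical and do not reflect the actual ERTS data structures; only the "return 0 to keep the object alive" convention is taken from the commit message.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a magic binary. */
typedef struct magic_bin {
    int (*destructor)(struct magic_bin *); /* returns 0 to veto deallocation */
    int pending;                           /* hypothetical: outstanding work */
} magic_bin;

/* Hypothetical destructor: vetoes deallocation while work is pending. */
int keep_while_pending(magic_bin *b)
{
    return b->pending == 0;
}

/* Releases the object unless its destructor vetoes.
 * Returns 1 if the memory was freed, 0 if the destructor took control. */
int magic_release(magic_bin *b)
{
    if (b->destructor != NULL && !b->destructor(b))
        return 0; /* vetoed: the object stays alive, owned by the destructor's side */
    free(b);
    return 1;
}
```

After a veto, whoever prolonged the lifetime is responsible for releasing the object later.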
2017-02-06 | Use magic refs for unicode static NIFs traps | Rickard Green
2017-01-30 | Add new AtU8 beam chunk | José Valim
The new chunk stores atoms encoded in UTF-8. beam_lib has also been modified to handle the new 'utf8_atoms' attribute while the 'atoms' attribute may be a missing chunk from now on. The binary_to_atom/2 BIF can now encode any utf8 binary with up to 255 characters. The list_to_atom/1 BIF can now accept codepoints higher than 255 with up to 255 characters (thanks to Björn Gustavsson).
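The limit is expressed in characters, not bytes, so a UTF-8 name of up to 255 characters can occupy more than 255 bytes. A minimal sketch of counting characters in UTF-8 data, assuming the input is already known to be well-formed; `utf8_char_count`, `fits_atom_limit` and `MAX_ATOM_CHARS` are hypothetical names.

```c
#include <stddef.h>

#define MAX_ATOM_CHARS 255 /* the limit quoted in the commit message */

/* Counts code points in well-formed UTF-8 by counting every byte that is
 * not a continuation byte (continuation bytes have the form 10xxxxxx). */
size_t utf8_char_count(const unsigned char *s, size_t nbytes)
{
    size_t i, count = 0;
    for (i = 0; i < nbytes; i++) {
        if ((s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Returns 1 if the name is within the character limit, 0 otherwise. */
int fits_atom_limit(const unsigned char *s, size_t nbytes)
{
    return utf8_char_count(s, nbytes) <= MAX_ATOM_CHARS;
}
```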
2016-05-05 | erl_unicode.c: fix integer truncation problems | Mikael Pettersson
- use Sint for 'left' and Uint for 'pos'
- cost_to_proc(): adjust types, remove unused return value
- copy_utf8_bin(): return Uint
- characters_to_utf8_trap(): remove harmful cast
- unicode_characters_to_binary_2(): correct types in debug code
- erts_convert_filename_to_encoding(): remove useless cast
2016-03-15 | update copyright-year | Henrik Nord
2016-02-24 | erts: Change erl_exit into erts_exit | Sverker Eriksson
This is mostly a pure refactoring, except for the buggy cases when calling erlang:halt() with a positive integer in the range -(INT_MIN+2) to -INT_MIN that got confused with ERTS_ABORT_EXIT, ERTS_DUMP_EXIT and ERTS_INTR_EXIT.
Outcome         | OLD erl_exit(n, )   | NEW erts_exit(n, )
-------         | ------------------- | -------------------------------------------
exit(Status)    | n = -Status <= 0    | n = Status >= 0
crashdump+abort | n > 0, ignore n     | n = ERTS_ERROR_EXIT < 0
The outcomes of the old ERTS_ABORT_EXIT, ERTS_INTR_EXIT and ERTS_DUMP_EXIT are the same as before (even though their values have changed).
2015-06-18 | Change license text to APLv2 | Bruce Yinhe
2014-05-22 | Fix conversion of empty string in erts_convert_native_to_filename() | Rickard Green
2013-12-16 | erts: Add 'extra' argument to erts_convert_filename_to_encoding | Sverker Eriksson
2013-10-16 | Merge branch 'sverk/load-nif-unicode' | Sverker Eriksson
OTP-11408
* sverk/load-nif-unicode:
  erts: Fix bug in atom to filename conversions
  Fix open_ddll for win
  erts, crypto: Support NIF library with unicode filename on windows
  erts: Factor out erts_convert_filename_to_wchar()
  erts: Fix compiler warning
  erts: Fix loading of NIF library with unicode in path
  erts: Remove unused constant DRIVER_TAB_SIZE
2013-10-15 | Merge branch 'maint' | Sverker Eriksson
Conflicts: erts/preloaded/ebin/erlang.beam
2013-09-30 | erts: erts_mmap supercarrier management and erts_mseg usage | Rickard Green
* Coalescing and trimming of free segments in supercarrier
* Management of super aligned and super unaligned areas in supercarrier
* Management of reservation of physical memory
* erts_mseg usage of erts_mmap
2013-09-24 | erts: Fix bug in atom to filename conversions | Sverker Eriksson
Buggy old code assumed latin1 atoms.
2013-09-19 | erts: Factor out erts_convert_filename_to_wchar() | Sverker Eriksson
from erts_convert_filename_to_encoding()
2013-06-05 | Merge branch 'maint' | Patrik Nyblom
2013-06-03 | erts: Change erlang:open_port spawn to handle unicode | Dan Gudmundsson
Previously only 'spawn_executable' handled unicode input. Also change the 'cd' option to always handle unicode. Update open_port documentation and tests.
2013-05-02 | Fix faulty rest on error in unicode:characters_to_list | Patrik Nyblom
2013-03-04 | erts: Use block comments - ansi style | Björn-Egil Dahlberg
2013-02-18 | Add +pc {latin1|unicode} switch and io:printable_range/0 | Patrik Nyblom
This is the base for implementing configurable ~tp printouts, so that the user can define which characters to view as actually printable in the shell and by io_lib:format. The functionality is neither documented nor used in this commit.
2013-02-11 | Teach prim_file:set_cwd() to avoid entering non-translatable directories | Björn Gustavsson
We have decided that we don't want to deal with the complications of prim_file:get_cwd() returning a binary when the current directory name cannot be translated losslessly to a list (i.e. when the run-time system was started with +fnu and the current directory name contains bytes that are not part of a valid UTF-8 sequence). Therefore, if prim_file:set_cwd() is given a binary as the pathname, we will need to check the binary to make sure it can be translated to a list. We will introduce a new BIF, called prim_file:is_translatable/1, which will check both the filename encoding mode and, if it is one of the Unicode modes, the binary as well. We don't need to do anything special if prim_file:set_cwd() is passed a list.
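The binary part of such a check amounts to testing whether the bytes form well-formed UTF-8. The following is a minimal sketch of that test only; it is not the actual implementation of prim_file:is_translatable/1, it ignores the filename encoding mode, and `is_valid_utf8` is a hypothetical name.

```c
#include <stddef.h>

/* Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise.
 * Rejects truncated sequences, stray continuation bytes, overlong
 * encodings, surrogates and values above 0x10FFFF. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need, k;        /* 'need' = number of continuation bytes */
        unsigned long cp, min; /* decoded value and smallest legal value */

        if (c < 0x80) {        /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;          /* stray continuation or invalid lead byte */
        }

        if (len - i - 1 < need)
            return 0;          /* sequence runs past the end of the buffer */

        for (k = 1; k <= need; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;      /* not a continuation byte */
            cp = (cp << 6) | (unsigned long)(buf[i + k] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFFUL)
            return 0;          /* overlong encoding or out of range */
        if (cp >= 0xD800UL && cp <= 0xDFFFUL)
            return 0;          /* surrogate code point */

        i += need + 1;
    }
    return 1;
}
```

A directory name that fails this test is one that could only be returned as a raw binary, which is the case the commit wants set_cwd() to refuse.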
2013-02-11 | Make prim_file skip invalid filenames in unicode mode | Patrik Nyblom
The fix affects list_dir and read_link. Raw filenames are now never produced, just consumed even if +fnu or +fna is used on Linux etc. This also adds the options to get error return or error handler warning messages with +fn{u|a}{i|w|e} as an option to erl. This is still not documented and there needs to be other versions of read_dir and read_link to facilitate reading of all types of filenames and links. A check that we will not change to an invalid directory is also needed.
2013-01-28 | Merge branch 'sverk/enc_atom-opt' | Sverker Eriksson
* sverk/enc_atom-opt:
  erts: Optimize atom encoding to use memcpy for pure ascii
  erts: Refactor erts_atom_get to use ErtsAtomEncoding
2013-01-25 | Update copyright years | Björn-Egil Dahlberg
2013-01-25 | erts: Refactor erts_atom_get to use ErtsAtomEncoding | Sverker Eriksson
instead of 'is_latin1' boolean argument.
2013-01-22 | erts: Fix bug in analyze_utf8 causing faulty latin1 detection | Sverker Eriksson
2013-01-16 | atom fixes for NIFs and atom_to_binary | Sverker Eriksson
2013-01-16 | UTF-8 support for distribution | Rickard Green
2013-01-08 | erts: Change internal representation of atoms to utf8 | Sverker Eriksson
2012-08-31 | Merge branch 'maint' | Björn-Egil Dahlberg
Conflicts:
  lib/diameter/autoconf/vxworks/sed.general
  xcomp/README.md
2012-08-31 | Update copyright years | Björn-Egil Dahlberg
2012-08-20 | Merge branch 'maint' | Patrik Nyblom
Conflicts:
  erts/doc/src/erlang.xml
  erts/preloaded/ebin/init.beam
  lib/kernel/doc/src/os.xml
  lib/stdlib/test/filename_SUITE.erl
2012-08-14 | Make get/putenv and erlexec understand Unicode | Patrik Nyblom
Putenv and getenv need to convert to the proper environment strings in Unicode depending on platform and user settings for filename encoding. Also, erlexec needs to pass environment strings in an appropriate way for kernel to pick up. All environment strings on the command line, as well as the home directory, are now passed in UTF8 on windows and in whatever encoding you have on Unix; kernel tries to convert all parameters and environments from UTF8 before making strings.
2012-02-21 | erts: Refactor new helper function erts_init_trap_export | Sverker Eriksson
2011-11-16 | Remove remaining gcc 4.6 assigned-but-not-used warnings from erts | Patrik Nyblom
2011-10-26 | Use the proper macros in all BIFs | Björn Gustavsson
As a preparation for changing the calling convention for BIFs, make sure that all BIFs use the macros. Also, eliminate all calls from one BIF to another, since that also breaks the calling convention abstraction.
2011-10-13 | Allow noncharacter code points in unicode encoding and decoding | Björn Gustavsson
The two noncharacter code points 16#FFFE and 16#FFFF were not allowed to be encoded or decoded using the unicode module or bit syntax. That causes an inconsistency, since the noncharacters 16#FDD0 to 16#FDEF could be encoded/decoded. There are two ways to fix that inconsistency. We have chosen to allow 16#FFFE and 16#FFFF to be encoded and decoded, because the noncharacters could be useful internally within an application and it will make encoding and decoding slightly faster. Reported-by: Alisdair Sullivan
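The resulting rule can be sketched as a single range check. `is_encodable_code_point` is a hypothetical name, and the rejection of surrogates and of values above 16#10FFFF is an assumption based on the general rules of Unicode encoding forms rather than something stated in this commit message.

```c
/* Hypothetical check for whether a code point may be encoded or decoded.
 * Noncharacters (0xFDD0..0xFDEF, 0xFFFE, 0xFFFF) need no special case,
 * which is what makes the check slightly cheaper than before. */
int is_encodable_code_point(unsigned long cp)
{
    if (cp > 0x10FFFFUL)
        return 0; /* beyond the Unicode range (assumed rejected) */
    if (cp >= 0xD800UL && cp <= 0xDFFFUL)
        return 0; /* surrogates (assumed rejected) */
    return 1;
}
```

Dropping the two extra comparisons for 16#FFFE and 16#FFFF is consistent with the "slightly faster" remark in the commit message.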
2011-05-20 | Update copyright years | Björn-Egil Dahlberg
2011-03-16 | erts: Remove unused variables | Tuncer Ayaz
2010-12-10 | Fix a couple typos in filename encoding docs | Tuncer Ayaz