path: root/erts/emulator/beam/erl_unicode.c
Age  Commit message  Author
2013-05-02  Fix faulty rest on error in unicode:characters_to_list  Patrik Nyblom
2013-03-04  erts: Use block comments - ansi style  Björn-Egil Dahlberg
2013-02-18  Add +pc {latin1|unicode} switch and io:printable_range/0  Patrik Nyblom
This is the base for implementing configurable ~tp printouts, so that the user can define which characters to view as actually printable in the shell and by io_lib:format. The functionality is neither documented nor used in this commit.
2013-02-11  Teach prim_file:set_cwd() to avoid entering non-translatable directories  Björn Gustavsson
We have decided that we don't want to deal with the complications of prim_file:get_cwd() returning a binary when the current directory name cannot be translated losslessly to a list (i.e. when the run-time system was started with +fnu and the current directory name contains bytes that are not part of a valid UTF-8 sequence). Therefore, if prim_file:set_cwd() is given a binary as the pathname, we will need to check the binary to make sure it can be translated to a list. We will introduce a new BIF, called prim_file:is_translatable/1, which will check the filename encoding mode and, if it is one of the Unicode modes, the binary as well. We don't need to do anything special if prim_file:set_cwd() is passed a list.
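The check described in this commit message can be illustrated with a short Python sketch. This is not the BIF's implementation: the function name mirrors prim_file:is_translatable/1 only for readability, the `unicode_mode` flag is a hypothetical stand-in for the +fnu/+fna setting, and the assumption that non-Unicode modes accept any binary is inferred from the message rather than confirmed by the source.

```python
def is_translatable(name: bytes, unicode_mode: bool) -> bool:
    """Illustrative stand-in for the check described above (not the real BIF)."""
    if not unicode_mode:
        # Assumption: outside the Unicode modes every byte maps to one
        # character, so any binary can be turned into a list.
        return True
    try:
        # Strict decoding fails on bytes that are not part of a valid
        # UTF-8 sequence, which is the case the commit wants to reject.
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

With this sketch, `b"caf\xc3\xa9"` is translatable in Unicode mode, while the Latin-1 encoded `b"caf\xe9"` is not.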
2013-02-11  Make prim_file skip invalid filenames in unicode mode  Patrik Nyblom
The fix affects list_dir and read_link. Raw filenames are now never produced, just consumed even if +fnu or +fna is used on Linux etc. This also adds the options to get error return or error handler warning messages with +fn{u|a}{i|w|e} as an option to erl. This is still not documented and there needs to be other versions of read_dir and read_link to facilitate reading of all types of filenames and links. A check that we will not change to an invalid directory is also needed.
2013-01-28  Merge branch 'sverk/enc_atom-opt'  Sverker Eriksson
* sverk/enc_atom-opt:
  erts: Optimize atom encoding to use memcpy for pure ascii
  erts: Refactor erts_atom_get to use ErtsAtomEncoding
2013-01-25  Update copyright years  Björn-Egil Dahlberg
2013-01-25  erts: Refactor erts_atom_get to use ErtsAtomEncoding  Sverker Eriksson
instead of 'is_latin1' boolean argument.
2013-01-22  erts: Fix bug in analyze_utf8 causing faulty latin1 detection  Sverker Eriksson
2013-01-16  atom fixes for NIFs and atom_to_binary  Sverker Eriksson
2013-01-16  UTF-8 support for distribution  Rickard Green
2013-01-08  erts: Change internal representation of atoms to utf8  Sverker Eriksson
2012-08-31  Merge branch 'maint'  Björn-Egil Dahlberg
Conflicts: lib/diameter/autoconf/vxworks/sed.general xcomp/README.md
2012-08-31  Update copyright years  Björn-Egil Dahlberg
2012-08-20  Merge branch 'maint'  Patrik Nyblom
Conflicts: erts/doc/src/erlang.xml erts/preloaded/ebin/init.beam lib/kernel/doc/src/os.xml lib/stdlib/test/filename_SUITE.erl
2012-08-14  Make get/putenv and erlexec understand Unicode  Patrik Nyblom
Putenv and getenv need to convert to the proper environment strings in Unicode depending on platform and user settings for filename encoding. Also, erlexec needs to pass environment strings in an appropriate way for kernel to pick up. All environment strings on the command line, as well as the home directory, are now passed in UTF-8 on Windows and in whatever encoding you have on Unix; kernel tries to convert all parameters and environments from UTF-8 before making strings.
2012-02-21  erts: Refactor new helper function erts_init_trap_export  Sverker Eriksson
2011-11-16  Remove remaining gcc 4.6 assigned-but-not-used warnings from erts  Patrik Nyblom
2011-10-26  Use the proper macros in all BIFs  Björn Gustavsson
As a preparation for changing the calling convention for BIFs, make sure that all BIFs use the macros. Also, eliminate all calls from one BIF to another, since that also breaks the calling convention abstraction.
2011-10-13  Allow noncharacter code points in unicode encoding and decoding  Björn Gustavsson
The two noncharacter code points 16#FFFE and 16#FFFF were not allowed to be encoded or decoded using the unicode module or bit syntax. That causes an inconsistency, since the noncharacters 16#FDD0 to 16#FDEF could be encoded/decoded. There are two ways to fix that inconsistency. We have chosen to allow 16#FFFE and 16#FFFF to be encoded and decoded, because the noncharacters could be useful internally within an application and it will make encoding and decoding slightly faster. Reported-by: Alisdair Sullivan
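To make the behaviour described in this commit message concrete, here is a minimal Python sketch of a UTF-8 encoder that has no special case for noncharacters. It is not the code in erl_unicode.c; the function name `encode_utf8` is hypothetical, and the rejection of surrogates and out-of-range values follows the general UTF-8 definition rather than anything stated in the message.

```python
def encode_utf8(cp: int) -> bytes:
    """Encode one code point as UTF-8 (illustrative sketch only)."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range")
    if 0xD800 <= cp <= 0xDFFF:
        # Assumption: surrogates stay invalid, as in standard UTF-8.
        raise ValueError("surrogate code point")
    # No test for 0xFFFE, 0xFFFF or 0xFDD0..0xFDEF: noncharacters are
    # encoded like any other code point, which also saves a comparison.
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

Under this sketch 16#FFFE, 16#FFFF and 16#FDD0 are all handled by the same three-byte branch, so the inconsistency the commit describes cannot arise.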
2011-05-20  Update copyright years  Björn-Egil Dahlberg
2011-03-16  erts: Remove unused variables  Tuncer Ayaz
2010-12-10  Fix a couple typos in filename encoding docs  Tuncer Ayaz
2010-11-30  Teach prim_file not to accept atoms and not to throw exceptions  Patrik Nyblom
2010-11-30  Teach spawn_executable about Unicode  Patrik Nyblom
Also corrected compressed files on Windows
2010-11-30  Convert filenames read on MacOSX to canonical form  Patrik Nyblom
2010-11-30  Make Unicode filenames work on Windows  Patrik Nyblom
2010-11-30  Handle binary file names and conversion of unicode strings  Patrik Nyblom
2010-11-29  Teach filename to accept raw data and add filename enc option to emu  Patrik Nyblom
2010-11-29  Add bifs to translate between erlang filenames and native encoding  Patrik Nyblom
2010-03-22  Merge branch 'pan/otp_8332_halfword' into dev  Erlang/OTP
* pan/otp_8332_halfword:
  Teach testcase in driver_suite the new prototype for driver_async
  wx: Correct usage of driver callbacks from wx thread
  Adopt the new (R13B04) Nif functionality to the halfword codebase
  Support monitoring and demonitoring from driver threads
  Fix further test-suite problems
  Correct the VM to work for more test suites
  Teach {wordsize,internal|external} to system_info/1
  Make tracing and distribution work
  Turn on instruction packing in the loader and virtual machine
  Add the BeamInstr data type for loaded BEAM code
  Fix the BEAM disassembler for the half-word emulator
  Store pointers to heap data in 32-bit words
  Add a custom mmap wrapper to force heaps into the lower address range
  Fit all heap data into the 32-bit address range
2010-03-10  Add the BeamInstr data type for loaded BEAM code  Patrik Nyblom
For cleanliness, use BeamInstr instead of the UWord data type for any machine-sized words that are used for BEAM instructions. Only use UWord for untyped words in general.
2010-03-10  Store pointers to heap data in 32-bit words  Patrik Nyblom
Store Erlang terms in 32-bit entities on the heap, expanding the pointers to 64-bit when needed. This works because all terms are stored on addresses in the 32-bit address range (the 32 most significant bits of pointers to term data are always 0). Introduce a new datatype called UWord (along with its companion SWord), which is an integer having the exact same size as the machine word (a void *), but might be larger than Eterm/Uint. Store code as machine words, as the instructions are pointers to executable code which might reside outside the 32-bit address range. Continuation pointers are stored on the 32-bit stack and hence must point to addresses in the low range, which means that loaded beam code must be placed in the low 32-bit address range (but, as said earlier, the instructions themselves are full words). No Erlang term data can be stored on C stacks (enforced by an earlier commit). This version gives a prompt, but test cases still fail (and dump core). The loader (and emulator loop) has instruction packing disabled. The main issues have been in rewriting the loader and the actual virtual machine. Subsystems (like distribution) do not work yet.
2009-11-20  The R13B03 release.  OTP_R13B03  Erlang/OTP