|
`beam_asm` would encode `{literal,[]}`, `{literal,erlang}`, and
`{literal,42}` in a less efficient way than the equivalent values
`nil`, `{atom,erlang}`, and `{integer,42}`. That would increase the
size of BEAM files and could increase the loaded code size. It would
probably not harm performance, because `literal` was only used this
way in code that generates `badmatch` and `case_clause` exceptions.
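The normalization described above can be sketched as follows. This is an illustration only: the module name `literal_norm_sketch` and the function `normalize/1` are hypothetical and are not the actual `beam_asm` code.

```erlang
-module(literal_norm_sketch).
-export([normalize/1]).

%% Hypothetical helper: rewrite a {literal,_} operand to the cheaper
%% equivalent operand when one exists; leave everything else alone.
normalize({literal, []}) -> nil;
normalize({literal, A}) when is_atom(A) -> {atom, A};
normalize({literal, I}) when is_integer(I) -> {integer, I};
normalize(Other) -> Other.
```

Operands that have no cheaper form, such as `{literal,{a,b}}`, pass through unchanged.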
|
|
There are two instructions that take string operands:
    {bs_put_string,Fail,NumberOfBytes,{string,String}}
    {bs_match_string,Fail,Register,NumberOfBits,{string,String}}
In the canonical BEAM code that is passed to beam_asm, string String
is currently represented as a list. (The string in bs_match_string is
a bitstring before the beam_z compiler pass.) That is wasteful,
because there will be unnecessary conversions between lists and
binaries.
Change the representation of String to be a binary.
Furthermore, bs_put_string is an optimization of a bs_put_binary
instruction with a literal binary operand. Currently, the
bs_put_string instruction is introduced in beam_bs. Delay the
introduction of bs_put_string to the beam_z pass. That will simplify
optimizations and allow us to do the optimization currently done
in beam_bs in a SSA pass in a future commit.
|
|
|
|
|
|
Numbers that clearly are not smalls can be encoded as
literals. Conservatively, we assume that integers whose
absolute value is greater than 1 bsl 128 are bignums and
that they can be encoded as literals.
Literals are slightly easier for the loader to handle than
huge integers.
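A minimal sketch of the rule above, assuming a hypothetical helper `classify/1` (the real logic lives inside beam_asm and may be structured differently):

```erlang
-module(int_literal_sketch).
-export([classify/1]).

%% Hypothetical helper: choose the operand form for an integer using
%% the conservative threshold mentioned in the commit message.
classify(N) when is_integer(N), abs(N) > (1 bsl 128) ->
    {literal, N};   % assumed to be a bignum, so store it as a literal
classify(N) when is_integer(N) ->
    {integer, N}.   % might still be a small; keep the integer encoding
```

Note that the threshold is deliberately far above any real small-integer limit, so an integer just below it is still encoded the old way.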
|
|
|
|
The clause that converted an iolist to a binary was never
executed.
Note that chunk/2 is called for all chunks in the
{extra_chunks,Chunks} option. This change will enforce that the
contents of each chunk must be a binary (as documented).
|
|
This allows compilers built on top of the compile module
to attach external compilation metadata to the compile_info
chunk.
For example, Erlang uses this chunk to store the compiler
version. Elixir and LFE may augment this by also adding
their own compiler versions, which can be useful when
debugging.
The deterministic option does not affect the user-supplied
compile_info. It is therefore the responsibility of external
compilers to guarantee that any added information does not violate
the deterministic option, if that option is supported.
Finally, this code moves the building of the compile_info
options to the compile module instead of beam_asm, moving
all of the option mangling code to a single place.
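As an illustration, an external compiler could attach its own version roughly like this. The module names and the key `my_compiler_version` are made up for the example; only the `{compile_info, [{atom(), term()}]}` option itself comes from the compile module.

```erlang
-module(compile_info_sketch).
-export([demo/0]).

demo() ->
    %% Abstract forms for a tiny module with one function, f/0.
    Forms = [{attribute,1,module,meta_demo},
             {attribute,2,export,[{f,0}]},
             {function,3,f,0,[{clause,3,[],[],[{atom,3,ok}]}]}],
    %% my_compiler_version is a hypothetical key chosen for this sketch.
    Opts = [{compile_info, [{my_compiler_version, "0.1.0"}]}],
    {ok, meta_demo, Beam} = compile:forms(Forms, Opts),
    {module, meta_demo} = code:load_binary(meta_demo, "meta_demo.beam", Beam),
    %% The extra entry is read back through module_info(compile).
    proplists:get_value(my_compiler_version, meta_demo:module_info(compile)).
```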
|
|
The undocumented compiler option 'slim' is used when compiling
the primary bootstrap. The purpose is to make the bootstrap smaller
and to avoid unnecessary churn in the git repository. That is,
the BEAM file should be different only if the actual code in the
file is different, and not if it has merely been re-compiled on
a different computer.
Two commits have fattened the 'slim' option. In 36f7087ae0f,
extra chunks are included even in slim BEAM files. In dfb899c0229f7,
the "Dbgi" chunk was added as an extra chunk, causing it to be included
in slim files.
Make 'slim' slim again by only including the essential chunks and
the attribute chunk (as was the case before the {extra,...} option
was added).
|
|
|
|
This allows languages such as Elixir and LFE to attach
extra chunks to the .beam file without having to parse
the beam file after compilation.
This commit also cleans up the interface to beam_asm,
allowing chunks to be passed from the compiler without
a need to change the beam_asm API for every new chunk.
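A small sketch of how a front end might use this, assuming the `{extra_chunks, [{binary(), binary()}]}` compiler option. The chunk name "Meta" and its payload are invented for the example.

```erlang
-module(extra_chunk_sketch).
-export([demo/0]).

demo() ->
    Forms = [{attribute,1,module,chunk_demo},
             {attribute,2,export,[{f,0}]},
             {function,3,f,0,[{clause,3,[],[],[{atom,3,ok}]}]}],
    %% "Meta" is a made-up four-byte chunk name; the payload must be
    %% a binary.
    Opts = [{extra_chunks, [{<<"Meta">>, <<"hello">>}]}],
    {ok, chunk_demo, Beam} = compile:forms(Forms, Opts),
    %% Read the chunk back directly from the in-memory BEAM binary.
    {ok, {chunk_demo, [{"Meta", Payload}]}} = beam_lib:chunks(Beam, ["Meta"]),
    Payload.
```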
|
|
The new chunk stores atoms encoded in UTF-8.
beam_lib has also been modified to handle the new
'utf8_atoms' attribute, while the chunk for the 'atoms'
attribute may be missing from now on.
The binary_to_atom/2 BIF can now encode any UTF-8
binary with up to 255 characters.
The list_to_atom/1 BIF can now accept codepoints
higher than 255 with up to 255 characters (thanks
to Björn Gustavsson).
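For illustration, the two BIFs can be exercised with a code point above 255. The module name is hypothetical; 969 is the code point of the Greek small letter omega.

```erlang
-module(utf8_atom_sketch).
-export([demo/0]).

demo() ->
    %% Build the same one-character atom from a UTF-8 binary and
    %% from a list of code points.
    FromBinary = binary_to_atom(<<969/utf8>>, utf8),
    FromList = list_to_atom([969]),
    {FromBinary =:= FromList, atom_to_list(FromList)}.
```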
|
|
|
|
Add the option 'deterministic' to make it easier to
achieve reproducible builds.
This option omits the {options,...} and {source,...} tuples in
M:module_info(compile), because those options may contain absolute
paths.
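The effect can be checked with a sketch along these lines (the module names are invented; the forms are compiled in memory, so no source file is involved):

```erlang
-module(deterministic_sketch).
-export([demo/0]).

demo() ->
    Forms = [{attribute,1,module,det_demo},
             {attribute,2,export,[{f,0}]},
             {function,3,f,0,[{clause,3,[],[],[{atom,3,ok}]}]}],
    {ok, det_demo, Beam} = compile:forms(Forms, [deterministic]),
    {module, det_demo} = code:load_binary(det_demo, "det_demo.beam", Beam),
    Info = det_demo:module_info(compile),
    %% Report whether the two tuples are present in the compile info.
    {proplists:is_defined(options, Info),
     proplists:is_defined(source, Info)}.
```

With the option in effect, neither tuple should be present.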
The author of ERL-310 suggested that only compiler options that
may contain absolute paths (such as {i,...}) should be excluded. But I
find it confusing to keep only some options.
Alternatives considered: Always omitting this information. Since this
information has been available for a long time, that would probably
break some workflows. As an example that some people care about
{source,...}, 2d785c07fbf9 made it possible to give a compiler option
to set {source,...}.
ERL-310
|
|
As long as anyone can remember, the compilation time has been included
in BEAM files (and can be retrieved using Mod:module_info(compile)).
The timestamp has caused problems for anyone using tools such as 'cmp'
to compare BEAM files or for package managers:
http://erlang.org/pipermail/erlang-questions/2016-April/088717.html
Rarely has the timestamp been of any use. Yes, sometimes the timestamp
could help in figuring out which version of a module was used, but
nowadays a better way is to use Mod:module_info(md5).
To get rid of this problem, remove the timestamp from BEAM files in
OTP 19. Don't add an option to include timestamps.
Utilities that depend on the timestamp will need to be modified.
For example:
http://erlang.org/pipermail/erlang-questions/2016-April/088730.html
Instead of using the compilation time, the MD5 for the BEAM code can
be used. Example:
    1> c:module_info(md5).
    <<79,26,188,243,168,60,58,45,34,69,19,222,138,190,214,118>>
    2> beam_lib:md5(code:which(c)).
    {ok,{c,<<79,26,188,243,168,60,58,45,34,69,19,222,138,190,214,118>>}}
    3>
|
|
* henrik/update-copyrightyear:
update copyright-year
|
|
compile:forms/1,2 will crash when the current working directory has
been deleted. Fix that problem, and while we are at it, also stop
including {source,""} in module_info() when no source code file is
given.
Reported-at: http://bugs.erlang.org/browse/ERL-113
Reported-by: Adam Lindberg
|
|
|
|
Eliminate searching in the list of exported functions in favor of
using a map. For modules with a huge number of exported functions
(such as NBAP-PDU-Contents in the asn1 test suite), that will mean a
significant speed-up.
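The idea can be sketched as follows. The function names are hypothetical and the real beam_asm data structures differ; the point is only that a map lookup replaces a linear scan of the export list.

```erlang
-module(export_lookup_sketch).
-export([build/1, is_exported/3]).

%% Build the lookup map once from the list of {Name,Arity} pairs.
build(Exports) ->
    maps:from_list([{FA, true} || FA <- Exports]).

%% Each query is then a single map lookup instead of a list search.
is_exported(Name, Arity, ExportMap) ->
    maps:is_key({Name, Arity}, ExportMap).
```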
|
|
|
|
The misc_SUITE:integer_encoding/1 test case is annoyingly slow.
Rewrite the encoding of integers in beam_asm to use the
binary:encode_unsigned/1 BIF.
Also tweak the test case itself. Scale down the maximum
size of the numbers being generated, but also add tests of
numbers around power-of-two boundaries (which are the numbers
most likely to expose bugs in the encoding).
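A rough sketch of the approach for non-negative integers only. The helper `to_bytes/1` is hypothetical, and this sketch leaves out negative numbers entirely. It assumes the byte string must be readable as a signed value, so a zero byte is prepended whenever the top bit would otherwise be set.

```erlang
-module(int_bytes_sketch).
-export([to_bytes/1]).

%% Hypothetical helper: big-endian bytes for a non-negative integer,
%% padded so that the most significant bit is always clear.
to_bytes(N) when is_integer(N), N >= 0 ->
    Bin = binary:encode_unsigned(N),
    case Bin of
        <<1:1, _/bits>> -> <<0, Bin/binary>>;
        _ -> Bin
    end.
```

Values on either side of a power of two, such as 127, 128, 255, and 256, exercise both branches.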
|
|
|
|
* Combine multiple get values with one instruction
* Combine multiple check keys with one instruction
|
|
|
|
The BEAM loader will put floating point constants into the
literal pools for the module, but it will not check for duplicates.
We can do much better by having the compiler use the literal
pool for floating point constants.
|
|
|
|
The calculation of the NewIndex field in fun entries is broken: the
sys_pre_expand and v3_kernel modules keep separate index counters
starting at zero; thus there is no guarantee that each fun within a
module will have its own unique NewIndex.
We don't really need the NewIndex any more (see below), but since
we do need the NewUniq field, we should fix NewIndex for cleanliness'
sake. The simplest way is to assign NewIndex as late as possible,
namely in beam_asm, and to set it to the same value as Index.
Historical Note: Why NewIndex Was Introduced
There was once an idea that the debugger should be able to interpret
only a single function in a module (for speed). To make sure that
interpreted funs could be called from BEAM code and vice versa,
the fun identification must be visible in the abstract code.
Therefore a NewIndex field was introduced in each fun in the abstract
code.
However, it turned out that interpreting single functions does not
play well with aggressive code optimization. For example, in this
code:
    f() ->
        X = 1,
        fun() -> X+2 end.
the variable X will seem to be free in the fun, but an aggressive
optimizer will replace X with 1 in the fun; thus the fun will no
longer have any free variables. Therefore, the debugger will always
interpret entire modules.
|
|
Funs are identified by a triple, <Module,Uniq,Index>, where Module is
the module name, Uniq is a 27 bit hash value of some intermediate
representation of the code for the fun, and Index is a small integer.
When a fun is loaded, the triple for the fun will be compared to
previously loaded funs. If all elements in the triple in the newly
loaded fun are the same, the newly loaded fun will replace the previous
fun. The idea is that if the Uniq values are the same, the code for the
fun is also the same.
The problem is that Uniq is only based on the intermediate representation
of the fun itself. If the fun calls local functions in the same module,
Uniq may remain the same even if the behavior of the fun has been changed.
See
http://erlang.org/pipermail/erlang-bugs/2007-June/000368.htlm
for an example.
As a long-term plan to fix this problem, the NewIndex and NewUniq
fields were added to each fun in the R8 release (where NewUniq is the
MD5 of the BEAM code for the module). Unfortunately, it turns
out that the compiler does not assign unique values to NewIndex (if it
isn't tested, it doesn't work), so we cannot use the
<Module,NewUniq,NewIndex> triple as identification.
It would be possible to use <Module,NewUniq,Index>, but that seems
ugly. Therefore, fix the problem by making Uniq more unique by
taking 27 bits from the MD5 for the BEAM code. That only requires
a change to the compiler.
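A minimal sketch of deriving such a value. Which 27 bits of the digest are taken is an assumption here (the leading ones), and `uniq/1` is a hypothetical helper, not the compiler's code.

```erlang
-module(fun_uniq_sketch).
-export([uniq/1]).

%% Hypothetical helper: derive a 27-bit Uniq from the MD5 of the
%% BEAM code. The MD5 digest is 128 bits; the remaining 101 bits
%% are ignored.
uniq(BeamCode) when is_binary(BeamCode) ->
    <<Uniq:27, _:101/bits>> = erlang:md5(BeamCode),
    Uniq.
```

Because the digest covers the code for the whole module, a change in a local function called by the fun also changes Uniq.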
Also update a test case for cover, which now fails because of the
stronger Uniq calculation. (The comment in the test case about why the
Pid2 process survived is not correct.)
|
|
|
|
|
|
Introduce the line/1 instruction in the compiler and the BEAM
virtual machine. It will not yet be generated by the compiler and
will not actually carry any information.
|
|
Add the gc_bif's to the VM.
Add infrastructure for gc_bif's (guard BIFs that can GC) with two and
three arguments in the VM (loader and VM).
Add compiler support for gc_bif with three arguments.
Add compiler (and interpreter) support for new guard BIFs.
Add testcases for new guard BIFs in compiler and emulator.
|
|
* bg/compiler-attributes:
Remove opaque declarations from the attributes
|
|
-opaque declarations should not be retained in the attributes
(because they will be loaded along with the code and are not
useful).
While at it, filter away those Dialyzer attributes as early
as possible - in v3_kernel.
|
|
The original intention for the undocumented 'slim' option was
to omit non-essential parts of *.beam files to reduce the
size of the primary bootstrap. Therefore, debug information and
local function information (only used by xref, not by the loader)
are omitted, but information about the compilation time and
compiler version are still included.
Including compilation information is troublesome, however, when
committing the bootstrap into a revision control system, because
every beam file is guaranteed to be changed every time the bootstrap
is updated.
Therefore, change the 'slim' option to also omit compilation
information.
|
|
|