Age | Commit message (Collapse) | Author |
|
Symptom: SEGV crash on ARM in delete_code() -> export_list().
Could probably happen on other machines as well.
Problem: Staging export table was iterated in an unsafe way
while an entry was added for a new export fun.
Solution: Correct write order and some memory barriers.
|
|
Normally, calling code:delete/1 before re-loading the code for a
module is unnecessary but causes no problem.
But there will be be problems if the new code has an on_load function.
Code with an on_load function will always be loaded as old code
to allowed it to be easily purged if the on_load function would fail.
If the on_load function succeeds, the old and current code will be
swapped.
So in the scenario where code:delete/1 has been called explicitly,
there is old code but no current code. Loading code with an
on_load function will cause the reference to the old code to be
overwritten. That will at best cause a memory leak, and at worst
an emulator crash (especially if NIFs are involved).
To avoid that situation, we will put the code with the on_load
function in a special, third slot in Module.
ERL-240
|
|
|
|
* sverker/load-corrupt-beam/ERL-216:
erts:: Unsignify a bunch of loader variables
erts: Reject an invalid beam code header size
erts: Fix load of beam with invalid imports and atom numbers
|
|
|
|
|
|
|
|
|
|
This merge is actually only some left overs.
The bulk work for hipe-amd64-code-alloc has already been
merge (without ticket number) at 42a1166b47721cd444.
|
|
'hx' may be used uninitialized
|
|
Load the module as old code; swap old and new code if the
-on_load function succeeds. That way, a failed update attempt
for a module that has an -on_load function will preserve the
previous version of the code.
|
|
As a preparation for fixing some issues with -on_load(), we will
integrate insert_new_code() into erts_finish_loading. Also rename
insert_new_code() to stub_insert_new_code() to make it clear that
it will only be used when making a stub module
|
|
* henrik/update-copyrightyear:
update copyright-year
|
|
Optimizations that are possible to do by the compiler should be
done by the compiler and not by the loader.
If the compiler has done its job correctly, attempting to do the two
transformations only wastes time.
|
|
In transformations such as:
move S X0=x==0 | line Loc | call_ext Ar Func => \
line Loc | move S X0 | call_ext Ar Func
we can avoid rebuilding the last instruction in the sequence
by introducing a 'keep' instruction.
Currently, there are only 13 transformations that are hit by
this optimization, but most of them are frequently used.
|
|
Introduce a 'rename' instruction that can be used to optimize
simple renaming with unchanged operands such as:
get_tuple_element Reg P Dst => i_get_tuple_element Reg P Dst
By allowing it to lower the arity of instruction, transformations
such as the following can be handled:
trim N Remaining => i_trim N
All in all, currently 67 transformations can be optimized in this
way, including some commonly used ones.
|
|
Generic instructions have a min_window field. Its purpose is to
avoid calling transform_engine() when there are too few instructions
in the current "transformation window" for a transformation to
succeed.
Currently it does not do much good since the window size will be
decremented by one before being used. The reason for the subtraction
is probably that in some circumstances in the past, the loader could
read past the end of the BEAM module while attempting to fetch
instructions to increase the window size. Therefore, it would not
be safe to just remove the subtraction by one.
The simplest and safest solution seems to always ensure that there
are always at least TWO instructions when calling transform_engine().
That will be safe, as long as a BEAM module is always finished with
an int_code_end/0 that is not involved in any transformation.
|
|
When an instruction with a variable number operands (such as
select_val) is seen of the left side of a transformation, the
'next_arg' instruction will allocate a buffer to fit all variables and
all operands will be copied into the buffer. Very often, the 'commit'
instruction will never be reached because of a test or predicate
failing or because of a short window; in that case, the variable
buffer will be deallocated.
Note that originally there were only few instructions with a variable
number of operands, but now common operations such as tuple building
also have a variable number of operands.
To avoid those frequent allocations and deallocations, modify the
'next_arg' instruction to only save a pointer to the first of the
"rest" arguments. Also move the deallocation of the instructions
on the left side from the 'commit' instruction to the 'end'
instruction to ensure that 'store_rest_args' will still work.
|
|
We used to set last_op_next and last_op to NULL just in case.
Setting last_op_next to causes a rescan of the instructions
to find the last instruction in the chain, so we would want
to avoid that unless really necessary.
|
|
62473daf introduced an unsafe optimization in the loader.
See the comments in the test case for an explanation of
the problem.
|
|
* Accept a raw data buffer instead of ErlNifBinary
* Accept option ERL_NIF_BIN2TERM_SAFE
* Return number of read bytes
|
|
|
|
We will need a way to check whether an prepared BEAM modules has
an on_load function.
|
|
* sverk/fix-list-length-int/OTP-13288:
erts: Fix error cases in enif_get_list_length
erts: Use Sint instead of int for list lengths
|
|
This avoids potential integer arithmetic overflow for very large lists.
|
|
|
|
Line table was left uninitialized for hipe (stub) modules
causing process_info(OtherPid, current_location) to crash.
|
|
* egil/pd-opt-get/OTP-13167:
erts: Add i_get_hash instruction
erts: Use internal hash for process dictionaries
|
|
Calculate hashvalue in load-time for constant process dictionary gets.
|
|
* The youngest generation of the heap can now consist of multiple
blocks. Heap fragments and message fragments are added to the
youngest generation when needed without triggering a GC. After
a GC the youngest generation is contained in one single block.
* The off_heap_message_queue process flag has been added. When
enabled all message data in the queue is kept off heap. When
a message is selected from the queue, the message fragment (or
heap fragment) containing the actual message is attached to the
youngest generation. Messages stored off heap is not part of GC.
|
|
|
|
to use real C struct with correct types
|
|
to use a real C struct instead of array.
|
|
|
|
|
|
Since 'd' operands can only either an X register or an Y register,
we only need a single bit to distinguish them. Furthermore, we can
pre-multiply the register number with the word size to speed up
address calculation.
|
|
Sequences of three move instructionst that effectively swap the
contents of two registers are fairly common. We can replace them
with a swap_temp/3 instruction. The third operand is the temporary
register to be used for swapping, since the temporary register
may actually be used.
If swap_temp/3 instruction is followed by a call, the temporary
register will often (but not always) be killed by the call. If
it is killed, we can replace the swap_temp/3 instruction with a
slightly cheaper swap/2 instruction.
|
|
|
|
|
|
The 'r' type is now mandatory. That means in order to handle
both of the following instructions:
move x(0) y(7)
move x(1) y(7)
we would need to define two specific operations in ops.tab:
move r y
move x y
We want to make 'r' operands optional. That is, if we have
only this specific instruction:
move x y
it will match both of the following instructions:
move x(0) y(7)
move x(1) y(7)
Make 'r' optional allows us to save code space when we don't
want to make handling of x(0) a special case, but we can still
use 'r' to optimize commonly used instructions.
|
|
Consider the try_case_end instruction:
try_case_end s
The 's' operand type means that the operand can either be a
literal of one of the types atom, integer, or empty list, or
a register. That worked well before R12. In R12 additional
types of literals where introduced. Because of way the
overloading was done, an 's' operand cannot handle the
new types of literals. Therefore, code such as the following
is necessary in ops.tab to avoid giving an 's' operand a
literal:
try_case_end Literal=q => move Literal x | try_case_end x
While this work, it is error-prone in that it is easy to
forget to add that kind of rule. It would also be complicated
in case we wanted to introduce a new kind of addition operator
such as:
i_plus jssd
Since there are two 's' operands, two scratch registers and
two 'move' instructions would be needed.
Therefore, we'll need to find a smarter way to find tag
register operands. We will overload the pid and port tags
for X and Y register, respectively. That works because pids
and port are immediate values (fit in one word), and there
are no literals for pids and ports.
|
|
|
|
The purpose of this series of commits is to improve code generation
for the Clang compiler.
As a first step we want to change the meaning of 'x' in a
transformation such as:
operation Literal=q => move Literal x | operation x
Currently, a plain 'x' means reg[0] or x(0), which is the first
element in the X register array. That element is distinct from
r(0) which is a variable in process_main(). Therefore, since r(0)
and x(0) are currently distinct it is fine to use x(0) as a
scratch register.
However, in the next commit we will eliminate the separate variable
for storing the contents of X register zero (thus, x(0) and r(0)
will point to the same location in the X register array). Therefore,
we must use another scratch register in transformation. Redefine
a plain 'x' in a transformation to mean x(1023). Also define
SCRATCH_X_REG so that we can refer to the register by name from
C code.
|
|
|
|
|
|
* hamt_bin2term:
erts: Add erts_factory_trim_and_close
erts: Optimize driver_deliver_term
erts: Remove hashmap probabilistic heap overestimation
Conflicts:
erts/emulator/beam/beam_load.c
|
|
by adding a dynamic heap factory.
"binary_to_term" is now a hybrid solution with both
a call to decoded_size() to calculate needed heap space
AND possible dynamic allocation of more heap space
if needed for big maps.
The heap size returned from decoded_size() is guaranteed
to be sufficient for all term heap data except for hashmap
nodes. All hashmap nodes are created at the end of dec_term()
by invoking the heap factory interface that may allocate more
heap space on process heap or in fragments.
With this commit it is no longer guaranteed that a message
is confined to only one heap fragment.
|
|
Add a check to protect from segfault when erlang:get_module_info/1/2 is
called on a deleted module (i.e. with no current code). Also refactor
erts_module_info_0/1 to avoid repeated calls to erts_active_code_ix() and
remove some obsolete comments. Add test for module_info on deleted modules.
|
|
Use the md5 of the native code chunk instead of the Beam code md5.
|
|
|