aboutsummaryrefslogblamecommitdiffstats
path: root/erts/doc/src/persistent_term.xml
blob: d2a138d65fad487b881c482e970f1db3ab348657 (plain) (tree)

































































































































































































































































































                                                                               
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE erlref SYSTEM "erlref.dtd">

<erlref>
  <header>
    <copyright>
      <year>2018</year><year>2018</year>
      <holder>Ericsson AB. All Rights Reserved.</holder>
    </copyright>
    <legalnotice>
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.

    </legalnotice>

    <title>persistent_term</title>
    <prepared></prepared>
    <docno></docno>
    <date></date>
    <rev></rev>
    <file>persistent_term.xml</file>
  </header>
  <module>persistent_term</module>
  <modulesummary>Persistent terms.</modulesummary>
  <description>
    <p>This module is similar to <seealso
    marker="stdlib:ets"><c>ets</c></seealso> in that it provides a
    storage for Erlang terms that can be accessed in constant time,
    but with the difference that <c>persistent_term</c> has been
    highly optimized for reading terms at the expense of writing and
    updating terms. When a persistent term is updated or deleted, a
    global garbage collection pass is run to scan all processes for
    the deleted term, and to copy it into each process that still uses
    it. Therefore, <c>persistent_term</c> is suitable for storing
    Erlang terms that are frequently accessed but never or
    infrequently updated.</p>

    <warning><p>Persistent terms is an advanced feature and is not a
    general replacement for ETS tables. Before using persistent terms,
    make sure to fully understand the consequence to system
    performance when updating or deleting persistent terms.</p></warning>

    <p>Term lookup (using <seealso
    marker="#get/1"><c>get/1</c></seealso>), is done in constant time
    and without taking any locks, and the term is <strong>not</strong>
    copied to the heap (as is the case with terms stored in ETS
    tables).</p>

    <p>Storing or updating a term (using <seealso
    marker="#put/2"><c>put/2</c></seealso>) is proportional to the
    number of already created persistent terms because the hash table
    holding the keys will be copied. In addition, the term itself will
    be copied.</p>

    <p>When a (complex) term is deleted (using <seealso
    marker="#erase/1"><c>erase/1</c></seealso>) or replaced by another
    (using <seealso marker="#put/2"><c>put/2</c></seealso>), a global
    garbage collection is initiated. It works like this:</p>

    <list>
      <item><p>All processes in the system will be scheduled to run a
      scan of their heaps for the term that has been deleted.  While
      such scan is relatively light-weight, if there are many
      processes, the system can become less responsive until all
      process have scanned theirs heaps.</p></item>

      <item><p>If the deleted term (or any part of it) is still used
      by a process, that process will do a major (fullsweep) garbage
      collection and copy the term into the process. However, at most
      two processes at a time will be scheduled to do that kind of
      garbage collection.</p></item>
    </list>

    <p>Deletion of atoms and other terms that fit in one machine word
    is specially optimized to avoid doing a global GC. It is still not
    recommended to update persistent terms with such values too
    frequently because the hash table holding the keys is copied every
    time a persistent term is updated.</p>

    <p>Some examples are suitable uses for persistent terms are:</p>

    <list>
      <item><p>Storing of configuration data that must be easily
      accessible by all processes.</p></item>

      <item><p>Storing of references for NIF resources.</p></item>

      <item><p>Storing of references for efficient counters.</p></item>

      <item><p>Storing an atom to indicate a logging level or whether debugging
      is turned on.</p></item>
    </list>

  </description>

  <section>
    <title>Storing Huge Persistent Terms</title>
    <p>The current implementation of persistent terms uses the literal
    <seealso marker="erts_alloc">allocator</seealso> also used for
    literals (constant terms) in BEAM code.  By default, 1 GB of
    virtual address space is reserved for literals in BEAM code and
    persistent terms. The amount of virtual address space reserved for
    literals can be changed by using the <seealso
    marker="erts_alloc#MIscs"><c>+MIscs option</c></seealso> when
    starting the emulator.</p>

    <p>Here is an example how the reserved virtual address space for literals
    can be raised to 2 GB (2048 MB):</p>

    <pre>
    erl +MIscs 2048</pre>
  </section>

  <section>
    <title>Warning For Many Persistent Terms</title>
    <p>The runtime system will send a warning report to the
    error logger if more than 20000 persistent terms have been
    created. It will look like this:</p>

<pre>
More than 20000 persistent terms have been created.
It is recommended to avoid creating an excessive number of
persistent terms, as creation and deletion of persistent terms
will be slower as the number of persistent terms increases.</pre>
  </section>

  <section>
    <title>Best Practices for Using Persistent Terms</title>

    <p>It is recommended to use keys like <c>?MODULE</c> or
    <c>{?MODULE,SubKey}</c> to avoid name collisions.</p>

    <p>Prefer creating a few large persistent terms to creating many
    small persistent terms. The execution time for storing a
    persistent term is proportional to the number of already existing
    terms.</p>

    <p>Updating a persistent term with the same value as it already
    has is specially optimized to do nothing quickly; thus, there is
    no need compare the old and new values and avoid calling
    <seealso marker="#put/2"><c>put/2</c></seealso> if the values
    are equal.</p>

    <p>When atoms or other terms that fit in one machine word are
    deleted, no global GC is needed. Therefore, persistent terms that
    have atoms as their values can be updated more frequently, but
    note that updating such persistent terms is still much more
    expensive than reading them.</p>

    <p>Updating or deleting a persistent term will trigger a global GC
    if the term does not fit in one machine word. Processes will be
    scheduled as usual, but all processes will be made runnable at
    once, which will make the system less responsive until all process
    have run and scanned their heaps for the deleted terms. One way to
    minimize the effects on responsiveness could be to minimize the
    number of processes on the node before updating or deleting a
    persistent term. It would also be wise to avoid updating terms
    when the system is at peak load.</p>

    <p>Avoid storing a retrieved persistent term in a process if that
    persistent term could be deleted or updated in the future. If a
    process holds a reference to a persistent term when the term is
    deleted, the process will be garbage collected and the term copied
    to process.</p>

    <p>Avoid updating or deleting more than one persistent term at a
    time.  Each deleted term will trigger its own global GC. That
    means that deleting N terms will make the system less responsive N
    times longer than deleting a single persistent term. Therefore,
    terms that are to be updated at the same time should be collected
    into a larger term, for example, a map or a tuple.</p>
  </section>

  <section>
    <title>Example</title>

    <p>The following example shows how lock contention for ETS tables
    can be minimized by having one ETS table for each scheduler. The
    table identifiers for the ETS tables are stored as a single
    persistent term:</p>

<pre>
    %% There is one ETS table for each scheduler.
    Sid = erlang:system_info(scheduler_id),
    Tid = element(Sid, persistent_term:get(?MODULE)),
    ets:update_counter(Tid, Key, 1).</pre>

  </section>

  <datatypes>
    <datatype>
      <name name="key"/>
      <desc>
        <p>Any Erlang term.</p>
      </desc>
    </datatype>
    <datatype>
      <name name="value"/>
      <desc>
        <p>Any Erlang term.</p>
      </desc>
    </datatype>
  </datatypes>

  <funcs>
    <func>
      <name name="erase" arity="1"/>
      <fsummary>Erase the name for a persistent term.</fsummary>
      <desc>
        <p>Erase the name for the persistent term with key
	<c><anno>Key</anno></c>. The return value will be <c>true</c>
	if there was a persistent term with the key
	<c><anno>Key</anno></c>, and <c>false</c> if there was no
	persistent term associated with the key.</p>
	<p>If there existed a previous persistent term associated with
	key <c><anno>Key</anno></c>, a global GC has been initiated
	when <c>erase/1</c> returns. See <seealso
	marker="#description">Description</seealso>.</p>
      </desc>
    </func>

    <func>
      <name name="get" arity="0"/>
      <fsummary>Get all persistent terms.</fsummary>
      <desc>
        <p>Retrieve the keys and values for all persistent terms.
	The keys will be copied to the heap for the process calling
	<c>get/0</c>, but the values will not.</p>
      </desc>
    </func>

    <func>
      <name name="get" arity="1"/>
      <fsummary>Get the value for a persistent term.</fsummary>
      <desc>
        <p>Retrieve the value for the persistent term associated with
        the key <c><anno>Key</anno></c>. The lookup will be made in
	constant time and the value will not be copied to the heap
	of the calling process.</p>
	<p>This function fails with a <c>badarg</c> exception if no
	term has been stored with the key
	<c><anno>Key</anno></c>.</p>
	<p>If the calling process holds on to the value of the
	persistent term and the persistent term is deleted in the future,
	the term will be copied to the process.</p>
      </desc>
    </func>

    <func>
      <name name="info" arity="0"/>
      <fsummary>Get information about persistent terms.</fsummary>
      <desc>
        <p>Return information about persistent terms in a map. The map
	has the following keys:</p>
	<taglist>
	  <tag><c>count</c></tag>
	  <item><p>The number of persistent terms.</p></item>
	  <tag><c>memory</c></tag>
	  <item><p>The total amount of memory (measured in bytes)
	  used by all persistent terms.</p></item>
	</taglist>
      </desc>
    </func>

    <func>
      <name name="put" arity="2"/>
      <fsummary>Store a term.</fsummary>
      <desc>
        <p>Store the value <c><anno>Value</anno></c> as a persistent term and
	associate it with the key <c><anno>Key</anno></c>.</p>
	<p>If the value <c><anno>Value</anno></c> is equal to the value
	previously stored for the key, <c>put/2</c> will do nothing and return
	quickly.</p>
	<p>If there existed a previous persistent term associated with
	key <c><anno>Key</anno></c>, a global GC has been initiated
	when <c>put/2</c> returns. See <seealso
	marker="#description">Description</seealso>.</p>
      </desc>
    </func>
  </funcs>
</erlref>