The R13B03 release. OTP_R13B03

author: Erlang/OTP <[email protected]> 2009-11-20 14:54:40 +0000
committer: Erlang/OTP <[email protected]> 2009-11-20 14:54:40 +0000
commit: 84adefa331c4159d432d22840663c38f155cd4c1 (patch)
tree: bff9a9c66adda4df2106dfd0e5c053ab182a12bd /lib/stdlib/doc/src/ms_transform.xml
1 files changed, 651 insertions, 0 deletions
diff --git a/lib/stdlib/doc/src/ms_transform.xml b/lib/stdlib/doc/src/ms_transform.xml
new file mode 100644
index 0000000000..9f178b426c
--- /dev/null
+++ b/lib/stdlib/doc/src/ms_transform.xml
@@ -0,0 +1,651 @@
+<?xml version="1.0" encoding="latin1" ?>
+<!DOCTYPE erlref SYSTEM "erlref.dtd">
+
+<erlref>
+  <header>
+    <copyright>
+      <year>2002</year><year>2009</year>
+      <holder>Ericsson AB. All Rights Reserved.</holder>
+    </copyright>
+    <legalnotice>
+      The contents of this file are subject to the Erlang Public License,
+      Version 1.1, (the "License"); you may not use this file except in
+      compliance with the License. You should have received a copy of the
+      Erlang Public License along with this software. If not, it can be
+      retrieved online at http://www.erlang.org/.
+    
+      Software distributed under the License is distributed on an "AS IS"
+      basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
+      the License for the specific language governing rights and limitations
+      under the License.
+    
+    </legalnotice>
+
+    <title>ms_transform</title>
+    <prepared>Patrik Nyblom</prepared>
+    <responsible>Bjarne Dacker</responsible>
+    <docno>1</docno>
+    <approved>Bjarne D&auml;cker</approved>
+    <checked></checked>
+    <date>99-02-09</date>
+    <rev>C</rev>
+    <file>ms_transform.sgml</file>
+  </header>
+  <module>ms_transform</module>
+  <modulesummary>Parse_transform that translates fun syntax into match specifications. </modulesummary>
+  <description>
+    <marker id="top"></marker>
+    <p>This module implements the parse_transform that makes calls to
+      <c>ets</c> and <c>dbg</c>:<c>fun2ms/1</c> translate into literal
+      match specifications. It also implements the back end for the same
+      functions when called from the Erlang shell.</p>
+    <p>The translation from funs to match specifications
+      is accessed through the two "pseudo
+      functions" <c>ets:fun2ms/1</c> and <c>dbg:fun2ms/1</c>.</p>
+    <p>This introduction is more or less an introduction to the
+      whole concept of match specifications. Since everyone trying to use
+      <c>ets:select</c> or <c>dbg</c> seems to end up reading
+      this page, it seems a good place to explain a little more than
+      just what this module does.</p>
+    <p>There are some caveats one should be aware of, so please read through
+      the whole manual page if it's the first time you're using the
+      transformations.</p>
+    <p>Match specifications are used more or less as filters. 
+      They resemble usual Erlang matching in a list comprehension or in
+      a <c>fun</c> used in conjunction with <c>lists:foldl</c> etc. The
+      syntax of pure match specifications is somewhat awkward though, as
+      they are made up purely by Erlang terms and there is no syntax in the
+      language to make the match specifications more readable.</p>
+    <p>As the execution and structure of a match specification are quite like
+      those of a fun, it would for most programmers be more straightforward
+      to simply write it using the familiar fun syntax and have that
+      translated into a match specification automatically. Of course a real
+      fun is more powerful than the match specifications allow, but bearing
+      the match specifications in mind, and what they can do, it's still
+      more convenient to write it all as a fun. This module contains the
+      code that simply translates the fun syntax into match_spec terms.</p>
+    <p>Let's start with an ets example. Using <c>ets:select</c> and
+      a match specification, one can filter out rows of a table and construct
+      a list of tuples containing relevant parts of the data in these
+      rows. Of course one could use <c>ets:foldl</c> instead, but the
+      select call is far more efficient. Without the translation, one has to
+      struggle with writing match specification terms to accomplish this,
+      or one has to resort to the less powerful
+      <c>ets:match(_object)</c> calls, or simply give up and use
+      the less efficient method of <c>ets:foldl</c>. Using the
+      <c>ets:fun2ms</c> transformation, an <c>ets:select</c> call
+      is at least as easy to write as any of the alternatives.</p>
+    <p>As an example, consider a simple table of employees:</p>
+    <code type="none">
+-record(emp, {empno,     %Employee number as a string, the key
+              surname,   %Surname of the employee
+              givenname, %Given name of employee
+              dept,      %Department one of {dev,sales,prod,adm}
+              empyear}). %Year the employee was employed    </code>
+    <p>We create the table using:</p>
+    <code type="none">
+ets:new(emp_tab,[{keypos,#emp.empno},named_table,ordered_set]).    </code>
+    <p>Let's also fill it with some randomly chosen data for the examples:</p>
+    <code type="none">
+[{emp,"011103","Black","Alfred",sales,2000},
+ {emp,"041231","Doe","John",prod,2001},
+ {emp,"052341","Smith","John",dev,1997},
+ {emp,"076324","Smith","Ella",sales,1995},
+ {emp,"122334","Weston","Anna",prod,2002},
+ {emp,"535216","Chalker","Samuel",adm,1998},
+ {emp,"789789","Harrysson","Joe",adm,1996},
+ {emp,"963721","Scott","Juliana",dev,2003},
+ {emp,"989891","Brown","Gabriel",prod,1999}]    </code>
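+    <p>One way to load these rows, as a minimal sketch that assumes the table
+      was created as shown above and relies on <c>ets:insert/2</c> accepting
+      a list of tuples as well as a single tuple:</p>
+    <code type="none">
+%% Sketch only: inserts all example rows into emp_tab in one call.
+%% ets:insert/2 returns true, which is asserted by the match.
+true = ets:insert(emp_tab,
+                  [{emp,"011103","Black","Alfred",sales,2000},
+                   {emp,"041231","Doe","John",prod,2001},
+                   {emp,"052341","Smith","John",dev,1997},
+                   {emp,"076324","Smith","Ella",sales,1995},
+                   {emp,"122334","Weston","Anna",prod,2002},
+                   {emp,"535216","Chalker","Samuel",adm,1998},
+                   {emp,"789789","Harrysson","Joe",adm,1996},
+                   {emp,"963721","Scott","Juliana",dev,2003},
+                   {emp,"989891","Brown","Gabriel",prod,1999}]).    </code>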
+    <p>Now, the amount of data in the table is of course too small to justify
+      complicated ets searches, but on real tables, using <c>select</c> to get
+      exactly the data you want will increase efficiency remarkably.</p>
+    <p>Let's say for example that we want the employee numbers of
+      everyone in the sales department. One might use <c>ets:match</c>
+      in such a situation:</p>
+    <pre>
+1> <input>ets:match(emp_tab, {'_', '$1', '_', '_', sales, '_'}).</input>
+[["011103"],["076324"]]    </pre>
+    <p>Even though <c>ets:match</c> does not require a full match
+      specification, but a simpler type, it's still somewhat unreadable, and
+      one has little control over the returned result: it's always a list of
+      lists. One might use <c>ets:foldl</c> or
+      <c>ets:foldr</c> instead:</p>
+    <code type="none">
+ets:foldr(fun(#emp{empno = E, dept = sales},Acc) -> [E | Acc];
+             (_,Acc) -> Acc
+          end,
+          [],
+          emp_tab).    </code>
+    <p>Running that would result in <c>["011103","076324"]</c>,
+      which at least gets rid of the extra lists. The fun is also quite
+      straightforward, so the only problem is that all the data from the
+      table has to be transferred from the table to the calling process for
+      filtering. That's inefficient compared to the <c>ets:match</c>
+      call where the filtering can be done "inside" the emulator and only
+      the result is transferred to the process. Remember that ets tables are
+      all about efficiency. If it weren't for efficiency, all of ets could be
+      implemented in Erlang, as a process receiving requests and sending
+      answers back. One uses ets because one wants performance, and
+      therefore one wouldn't want all of the table transferred to the
+      process for filtering. Now let's look at a pure
+      <c>ets:select</c> call that does what the <c>ets:foldr</c>
+      does:</p>
+    <code type="none">
+ets:select(emp_tab,[{#emp{empno = '$1', dept = sales, _='_'},[],['$1']}]).    </code>
+    <p>Even though the record syntax is used, it's still somewhat hard to
+      read and even harder to write. The first element of the tuple,
+      <c>#emp{empno = '$1', dept = sales, _='_'}</c>, tells what to
+      match; elements not matching this will not be returned at all, as in
+      the <c>ets:match</c> example. The second element, the empty list,
+      is a list of guard expressions, of which we need none, and the third
+      element is the list of expressions constructing the return value (in
+      ets this is almost always a list containing one single term). In our
+      case <c>'$1'</c> is bound to the employee number in the head
+      (first element of tuple), and hence it is the employee number that is
+      returned. The result is <c>["011103","076324"]</c>, just as in
+      the <c>ets:foldr</c> example, but the result is retrieved much
+      more efficiently in terms of execution speed and memory consumption.</p>
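+    <p>To see how the head, guard list and body cooperate, a single tuple can
+      be run through a match specification with <c>ets:test_ms/2</c>. The
+      following is only a sketch: the head is written as a plain tuple (the
+      expanded form of the record pattern) so that it can be tried in the
+      shell without the record definition loaded, and the variable name
+      <c>MS</c> is chosen for illustration:</p>
+    <pre>
+1> <input>MS = [{{emp,'$1','_','_',sales,'_'},[],['$1']}].</input>
+[{{emp,'$1','_','_',sales,'_'},[],['$1']}]
+2> <input>ets:test_ms({emp,"011103","Black","Alfred",sales,2000}, MS).</input>
+{ok,"011103"}
+3> <input>ets:test_ms({emp,"041231","Doe","John",prod,2001}, MS).</input>
+{ok,false}    </pre>
+    <p>The second tuple does not match the head (its department is
+      <c>prod</c>), so no result is constructed for it.</p>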
+    <p>We have one efficient but hardly readable way of doing it and one
+      inefficient but fairly readable (at least to the skilled Erlang
+      programmer) way of doing it. With the use of <c>ets:fun2ms</c>,
+      one could have something that is as efficient as possible but still is
+      written as a filter using the fun syntax:</p>
+    <code type="none">
+-include_lib("stdlib/include/ms_transform.hrl").
+
+% ...
+
+ets:select(emp_tab, ets:fun2ms(
+                      fun(#emp{empno = E, dept = sales}) ->
+                              E
+                      end)).    </code>
+    <p>This may not be the shortest of the expressions, but it requires no
+      special knowledge of match specifications to read. The fun's head
+      should simply match what you want to filter out and the body returns
+      what you want returned. As long as the fun can be kept within the
+      limits of the match specifications, there is no need to transfer all
+      data of the table to the process for filtering as in the
+      <c>ets:foldr</c> example. In fact it's even easier to read than
+      the <c>ets:foldr</c> example, as the select call in itself
+      discards anything that doesn't match, while the fun of the
+      <c>foldr</c> call needs to handle both the elements matching and
+      the ones not matching.</p>
+    <p>It's worth noting in the above <c>ets:fun2ms</c> example that one
+      needs to include <c>ms_transform.hrl</c> in the source code, as this is
+      what triggers the parse transformation of the <c>ets:fun2ms</c> call
+      to a valid match specification. This also implies that the
+      transformation is done at compile time (except when called from the
+      shell of course) and therefore will take no resources at all at
+      runtime. So although you use the more intuitive fun syntax, it gets as
+      efficient at runtime as writing match specifications by hand.</p>
+    <p>Let's look at some more <c>ets</c> examples. Let's say one
+      wants to get all the employee numbers of any employee hired before the
+      year 2000. Using <c>ets:match</c> isn't an alternative here as
+      relational operators cannot be expressed there. Once again, an
+      <c>ets:foldr</c> could do it (slowly, but correctly):</p>
+    <code type="none"><![CDATA[
+ets:foldr(fun(#emp{empno = E, empyear = Y},Acc) when Y < 2000 -> [E | Acc];
+                  (_,Acc) -> Acc
+          end,
+          [],
+          emp_tab).    ]]></code>
+    <p>The result will be
+      <c>["052341","076324","535216","789789","989891"]</c>, as
+      expected. Now the equivalent expression using a handwritten match
+      specification would look something like this:</p>
+    <code type="none"><![CDATA[
+ets:select(emp_tab,[{#emp{empno = '$1', empyear = '$2', _='_'},
+                     [{'<', '$2', 2000}],
+                     ['$1']}]).    ]]></code>
+    <p>This gives the same result. The <c><![CDATA[[{'<', '$2', 2000}]]]></c> is in
+      the guard part and therefore discards anything that does not have an
+      empyear (bound to '$2' in the head) less than 2000, just as the guard
+      in the <c>foldr</c> example. Let's move on to writing it using
+      <c>ets:fun2ms</c>:</p>
+    <code type="none"><![CDATA[
+-include_lib("stdlib/include/ms_transform.hrl").
+
+% ...
+
+ets:select(emp_tab, ets:fun2ms(
+                      fun(#emp{empno = E, empyear = Y}) when Y < 2000 ->
+                              E
+                      end)).    ]]></code>
+    <p>Obviously readability is gained by using the parse transformation.</p>
+    <p>I'll show some more examples without the tiresome
+      comparing-to-alternatives stuff. Let's say we'd want the whole object
+      matching instead of only one element. We could of course assign a
+      variable to every part of the record and build it up once again in the
+      body of the <c>fun</c>, but it's easier to do like this:</p>
+    <code type="none"><![CDATA[
+ets:select(emp_tab, ets:fun2ms(
+                      fun(Obj = #emp{empno = E, empyear = Y}) 
+                         when Y < 2000 ->
+                              Obj
+                      end)).    ]]></code>
+    <p>Just as in ordinary Erlang matching, you can bind a variable to the
+      whole matched object using a "match inside the match", i.e. a
+      <c>=</c>. Unfortunately this is not general in funs translated
+      to match specifications; it is allowed only on the "top level", i.e.
+      when matching the <em>whole</em> object arriving to be matched into a
+      separate variable. For those used to writing match specifications by
+      hand, I'll have to mention that the variable <c>Obj</c> will simply be
+      translated into <c>'$_'</c>. It's not general, but it is so commonly
+      used that it is handled as a special, but useful, case. If this bothers you,
+      the pseudo function <c>object</c> also returns the whole matched
+      object, see the part about caveats and limitations below.</p>
+    <p>Let's do something in the <c>fun</c>'s body too: Let's say
+      that someone realizes that there are a few people having an employee
+      number beginning with a zero (<c>0</c>), which shouldn't be
+      allowed. All those should have their numbers changed to begin with a
+      one (<c>1</c>) instead  and one wants the
+      list <c><![CDATA[[{<Old empno>,<New empno>}]]]></c> created:</p>
+    <code type="none">
+ets:select(emp_tab, ets:fun2ms(
+                      fun(#emp{empno = [$0 | Rest] }) ->
+                              {[$0|Rest],[$1|Rest]}
+                      end)).    </code>
+    <p>As a matter of fact, this query hits the feature of partially bound
+      keys in the table type <c>ordered_set</c>, so that the whole
+      table need not be searched; only the part of the table containing keys
+      beginning with <c>0</c> is in fact looked into.</p>
+    <p>The fun can of course have several clauses, so one could do
+      the following: for each employee, if he or she was hired prior to 1997,
+      return the tuple <c><![CDATA[{inventory, <employee number>}]]></c>; for each hired in 1997
+      or later, but not after 2001, return <c><![CDATA[{rookie, <employee number>}]]></c>; for all others return <c><![CDATA[{newbie, <employee number>}]]></c>. All except for the ones named <c>Smith</c>, as
+      they would be affronted by anything other than the tag
+      <c>guru</c>, and that is also what's returned for their numbers,
+      <c><![CDATA[{guru, <employee number>}]]></c>:</p>
+    <code type="none"><![CDATA[
+ets:select(emp_tab, ets:fun2ms(
+                      fun(#emp{empno = E, surname = "Smith" }) ->
+                              {guru,E};
+                         (#emp{empno = E, empyear = Y}) when Y < 1997  ->
+                              {inventory, E};
+                         (#emp{empno = E, empyear = Y}) when Y > 2001  ->
+                              {newbie, E};
+                         (#emp{empno = E, empyear = Y}) -> % 1997 -- 2001
+                              {rookie, E}
+                      end)).    ]]></code>
+    <p>The result will be:</p>
+    <code type="none">
+[{rookie,"011103"},
+ {rookie,"041231"},
+ {guru,"052341"},
+ {guru,"076324"},
+ {newbie,"122334"},
+ {rookie,"535216"},
+ {inventory,"789789"},
+ {newbie,"963721"},
+ {rookie,"989891"}]    </code>
+    <p>and so the Smiths will be happy...</p>
+    <p>So, what more can you do? Well, the simple answer would be: look
+      in the documentation of match specifications in the ERTS User's
+      Guide. However, let's briefly go through the most useful "built in
+      functions" that you can use when the <c>fun</c> is to be
+      translated into a match specification by <c>ets:fun2ms</c>. It's
+      worth mentioning, although it might be obvious to some, that calling
+      functions other than the ones allowed in match specifications cannot
+      be done. No "usual" Erlang code can be executed by the <c>fun</c> being
+      translated by <c>fun2ms</c>; the <c>fun</c> is after all limited
+      exactly to the power of the match specifications, which is
+      unfortunate, but the price one has to pay for the execution speed of
+      an <c>ets:select</c> compared to <c>ets:foldl/foldr</c>.</p>
+    <p>The head of the <c>fun</c> is obviously a head matching (or mismatching)
+      <em>one</em> parameter, one object of the table we <c>select</c>
+      from. The object is always a single variable (can be <c>_</c>) or
+      a tuple, as that's what's in <c>ets</c>, <c>dets</c> and
+      <c>mnesia</c> tables (the match specification returned by
+      <c>ets:fun2ms</c> can of course be used with
+      <c>dets:select</c> and <c>mnesia:select</c> as well as
+      with <c>ets:select</c>). The use of <c>=</c> in the head
+      is allowed (and encouraged) on the top level.</p>
+    <p>The guard section can contain any guard expression of Erlang.
+      Even the "old" type tests are allowed on the top level of the guard
+      (<c>integer(X)</c> instead of <c>is_integer(X)</c>). As the new type tests (the
+      <c>is_</c> tests) are in practice just guard BIFs, they can also
+      be called from within the body of the fun, just as they can in ordinary
+      Erlang code. Arithmetic is also allowed, as well as ordinary guard
+      BIFs. Here's a list of BIFs and expressions:</p>
+    <list type="bulleted">
+      <item>The type tests: is_atom, is_constant, is_float, is_integer,
+       is_list, is_number, is_pid, is_port, is_reference, is_tuple,
+       is_binary, is_function, is_record</item>
+      <item>The boolean operators: not, and, or, andalso, orelse </item>
+      <item>The relational operators: >, >=, &lt;, =&lt;, =:=, ==, =/=, /=</item>
+      <item>Arithmetic: +, -, *, div, rem</item>
+      <item>Bitwise operators: band, bor, bxor, bnot, bsl, bsr</item>
+      <item>The guard BIFs: abs, element, hd, length, node, round, size, tl,
+       trunc, self</item>
+      <item>The obsolete type tests (only in guards):
+       atom, constant, float, integer,
+       list, number, pid, port, reference, tuple,
+       binary, function, record</item>
+    </list>
+    <p>Contrary to the case with "handwritten" match specifications, the
+      <c>is_record</c> guard works as in ordinary Erlang code.</p>
+    <p>Semicolons (<c>;</c>) in guards are allowed; the result will be (as
+      expected) one "match_spec-clause" for each semicolon-separated
+      part of the guard. The semantics are identical to the Erlang
+      semantics.</p>
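+    <p>As a sketch of the semicolon rule, to be tried in the shell: a guard
+      with two semicolon-separated alternatives is expected to produce a list
+      of two match specification clauses sharing the same head. The printed
+      terms are left out here, and the variable name <c>MS</c> is chosen for
+      illustration:</p>
+    <code type="none">
+%% One clause per guard alternative: one guarded by is_atom,
+%% the other by is_integer, both returning the second element.
+MS = ets:fun2ms(fun({A,B}) when is_atom(A); is_integer(A) -> B end),
+2 = length(MS).    </code>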
+    <p>The body of the <c>fun</c> is used to construct the
+      resulting value. When selecting from tables one usually just constructs
+      a suitable term here, using ordinary Erlang term construction, like
+      tuple braces, list brackets and variables matched out in the
+      head, possibly in conjunction with the occasional constant. Whatever
+      expressions are allowed in guards are also allowed here, but there are
+      no special functions except <c>object</c> and
+      <c>bindings</c> (see further down), which return the whole
+      matched object and all known variable bindings respectively.</p>
+    <p>The <c>dbg</c> variants of match specifications have an
+      imperative approach to the match specification body, the ets dialect
+      hasn't. The fun body for <c>ets:fun2ms</c> returns the result
+      without side effects, and as matching (<c>=</c>) in the body of
+      the match specifications is not allowed (for performance reasons) the
+      only thing left, more or less, is term construction...</p>
+    <p>Let's move on to the <c>dbg</c> dialect, the slightly
+      different match specifications translated by <c>dbg:fun2ms</c>. </p>
+    <p>The same reasons for using the parse transformation apply to
+      <c>dbg</c>, maybe even more so as filtering using Erlang code is
+      simply not a good idea when tracing (except afterwards, if you trace
+      to file). The concept is similar to that of <c>ets:fun2ms</c>
+      except that you usually use it directly from the shell (which can also
+      be done with <c>ets:fun2ms</c>).</p>
+    <p>Let's manufacture a toy module to trace on:</p>
+    <code type="none">
+-module(toy).
+
+-export([start/1, store/2, retrieve/1]).
+
+start(Args) ->
+    toy_table = ets:new(toy_table,Args).
+
+store(Key, Value) ->
+    ets:insert(toy_table,{Key,Value}).
+
+retrieve(Key) ->
+    [{Key, Value}] = ets:lookup(toy_table,Key),
+    Value.    </code>
+    <p>During module testing, the first test bails out with a
+      <c>{badmatch,16}</c> in <c>{toy,start,1}</c>. Why?</p>
+    <p>We suspect the ets call, as we match hard on the return value, but
+      want only the particular <c>new</c> call with
+      <c>toy_table</c> as first parameter.
+      So we start a default tracer on the node:</p>
+    <pre>
+1> <input>dbg:tracer().</input>
+{ok,&lt;0.88.0>}</pre>
+    <p>And so we turn on call tracing for all processes. We are going to
+      make a pretty restrictive trace pattern, so there's no need to call
+      trace only a few processes (there usually isn't):</p>
+    <pre>
+2> <input>dbg:p(all,call).</input>
+{ok,[{matched,nonode@nohost,25}]}    </pre>
+    <p>It's time to specify the filter. We want to view calls that resemble
+      <c><![CDATA[ets:new(toy_table,<something>)]]></c>:</p>
+    <pre>
+3> <input>dbg:tp(ets,new,dbg:fun2ms(fun([toy_table,_]) -> true end)).</input>
+{ok,[{matched,nonode@nohost,1},{saved,1}]}    </pre>
+    <p>As can be seen, the funs used with
+      <c>dbg:fun2ms</c> take a single list as parameter instead of a
+      single tuple. The list matches a list of the parameters to the traced
+      function. A single variable may also be used of course. The body
+      of the fun expresses in a more imperative way actions to be taken if
+      the fun head (and the guards) matches. I return <c>true</c> here, but
+      only because the body of a fun cannot be empty; the return value will
+      be discarded.</p>
+    <p>When we run the test of our module now, we get the following trace
+      output:</p>
+    <code type="none"><![CDATA[
+(<0.86.0>) call ets:new(toy_table,[ordered_set])    ]]></code>
+    <p>Let's pretend we haven't spotted the problem yet, and want to see what
+      <c>ets:new</c> returns. We use a slightly different trace
+      pattern:</p>
+    <pre>
+4> <input>dbg:tp(ets,new,dbg:fun2ms(fun([toy_table,_]) -> return_trace() end)).</input></pre>
+    <p>Resulting in the following trace output when we run the test:</p>
+    <code type="none"><![CDATA[
+(<0.86.0>) call ets:new(toy_table,[ordered_set])
+(<0.86.0>) returned from ets:new/2 -> 24    ]]></code>
+    <p>The call to <c>return_trace</c> makes a trace message appear
+      when the function returns. It applies only to the specific function call
+      triggering the match specification (and matching the head/guards of
+      the match specification). This is by far the most common call in the
+      body of a <c>dbg</c> match specification.</p>
+    <p>As the test now fails with <c>{badmatch,24}</c>, it's obvious
+      that the badmatch is because the atom <c>toy_table</c> does not
+      match the number returned for an unnamed table. So we spotted the
+      problem: the table should be named, and the arguments supplied by our
+      test program do not include <c>named_table</c>. We rewrite the
+      start function to:</p>
+    <code type="none">
+start(Args) ->
+    toy_table = ets:new(toy_table,[named_table |Args]).    </code>
+    <p>And with the same tracing turned on, we get the following trace
+      output:</p>
+    <code type="none"><![CDATA[
+(<0.86.0>) call ets:new(toy_table,[named_table,ordered_set])
+(<0.86.0>) returned from ets:new/2 -> toy_table    ]]></code>
+    <p>Very well. Let's say the module now passes all testing and goes into
+      the system. After a while someone realizes that the table
+      <c>toy_table</c> grows while the system is running and that for some
+      reason there are a lot of elements with atoms as keys. You had
+      expected only integer keys and so does the rest of the system. Well,
+      obviously not all of the system. You turn on call tracing and try to
+      see calls to your module with an atom as the key:</p>
+    <pre>
+1> <input>dbg:tracer().</input>
+{ok,&lt;0.88.0>}
+2> <input>dbg:p(all,call).</input>
+{ok,[{matched,nonode@nohost,25}]}
+3> <input>dbg:tpl(toy,store,dbg:fun2ms(fun([A,_]) when is_atom(A) -> true end)).</input>
+{ok,[{matched,nonode@nohost,1},{saved,1}]}</pre>
+    <p>We use <c>dbg:tpl</c> here to make sure to catch local calls
+      (let's say the module has grown since the smaller version and we're
+      not sure this inserting of atoms is not done locally...). When in
+      doubt always use local call tracing.</p>
+    <p>Let's say nothing happens when we trace in this way. Our function
+      is never called with these parameters. We conclude that
+      someone else (some other module) is doing it and we realize that we
+      must trace on <c>ets:insert</c> and want to see the calling function. The
+      calling function may be retrieved using the match specification
+      function <c>caller</c> and to get it into the trace message, one
+      has to use the match spec function <c>message</c>. The filter
+      call looks like this (looking for calls to <c>ets:insert</c>):</p>
+    <pre>
+4> <input>dbg:tpl(ets,insert,dbg:fun2ms(fun([toy_table,{A,_}]) when is_atom(A) -> </input>
+<input>                                    message(caller()) </input>
+<input>                                  end)). </input>
+{ok,[{matched,nonode@nohost,1},{saved,2}]}    </pre>
+    <p>The caller will now appear in the "additional message" part of the
+      trace output, and so after a while, the following output comes:</p>
+    <code type="none"><![CDATA[
+(<0.86.0>) call ets:insert(toy_table,{garbage,can}) ({evil_mod,evil_fun,2})    ]]></code>
+    <p>You have found out that the function <c>evil_fun</c> of the
+      module <c>evil_mod</c>, with arity <c>2</c>, is the one
+      causing all this trouble.</p>
+    <p>This was just a toy example, but it illustrated the most used
+      calls in match specifications for <c>dbg</c>. The other, more
+      esoteric calls are listed and explained in the <em>User's Guide of the ERTS application</em>; they really are beyond the scope of this
+      document.</p>
+    <p>To end this chatty introduction with something more precise, here
+      follow some parts about caveats and restrictions concerning the funs
+      used in conjunction with <c>ets:fun2ms</c> and
+      <c>dbg:fun2ms</c>:</p>
+    <warning>
+      <p>To use the pseudo functions triggering the translation, one
+        <em>has to</em> include the header file <c>ms_transform.hrl</c>
+        in the source code. Failure to do so will possibly result in
+        runtime errors rather than compile time, as the expression may
+        be valid as a plain Erlang program without translation.</p>
+    </warning>
+    <warning>
+      <p>The <c>fun</c> has to be literally constructed inside the
+        parameter list to the pseudo functions. The <c>fun</c> cannot
+        be bound to a variable first and then passed to
+        <c>ets:fun2ms</c> or <c>dbg:fun2ms</c>, i.e. this
+        will work: <c>ets:fun2ms(fun(A) -> A end)</c> but not this:
+        <c>F = fun(A) -> A end, ets:fun2ms(F)</c>. The latter will result
+        in a compile time error if the header is included, otherwise a
+        runtime error. Even if the latter construction would ever
+        appear to work, it really doesn't, so don't ever use it.</p>
+    </warning>
+    <p>Several restrictions apply to the fun that is being translated
+      into a match_spec. To put it simply, you cannot use anything in
+      the fun that you cannot use in a match_spec. This means that,
+      among others, the following restrictions apply to the fun itself:</p>
+    <list type="bulleted">
+      <item>Functions written in Erlang cannot be called, neither
+       local functions, global functions nor real funs</item>
+      <item>Everything that is written as a function call will be
+       translated into a match_spec call to a builtin function, so that
+       the call <c>is_list(X)</c> will be translated to <c>{'is_list', '$1'}</c> (<c>'$1'</c> is just an example, the numbering may
+       vary). If one tries to call a function that is not a match_spec
+       builtin, it will cause an error.</item>
+      <item>Variables occurring in the head of the <c>fun</c> will be
+       replaced by match_spec variables in the order of occurrence, so
+       that the fragment <c>fun({A,B,C})</c> will be replaced by
+      <c>{'$1', '$2', '$3'}</c> etc. Every occurrence of such a
+       variable later in the match_spec will be replaced by a
+       match_spec variable in the same way, so that the fun
+      <c>fun({A,B}) when is_atom(A) -> B end</c> will be translated into
+      <c>[{{'$1','$2'},[{is_atom,'$1'}],['$2']}]</c>.</item>
+      <item>
+        <p>Variables that do not appear in the head are imported
+          from the environment and made into
+          match_spec <c>const</c> expressions. Example from the shell:</p>
+        <pre>
+1> <input>X = 25.</input>
+25
+2> <input>ets:fun2ms(fun({A,B}) when A > X -> B end).</input>
+[{{'$1','$2'},[{'>','$1',{const,25}}],['$2']}]</pre>
+      </item>
+      <item>
+        <p>Matching with <c>=</c> cannot be used in the body. It can only
+          be used on the top level in the head of the fun. 
+          Example from the shell again:</p>
+        <pre>
+1> <input>ets:fun2ms(fun({A,[B|C]} = D) when A > B -> D end).</input>
+[{{'$1',['$2'|'$3']},[{'>','$1','$2'}],['$_']}]
+2> <input>ets:fun2ms(fun({A,[B|C]=D}) when A > B -> D end).</input>
+Error: fun with head matching ('=' in head) cannot be translated into 
+match_spec 
+{error,transform_error}
+3> <input>ets:fun2ms(fun({A,[B|C]}) when A > B -> D = [B|C], D end).</input>
+Error: fun with body matching ('=' in body) is illegal as match_spec
+{error,transform_error}        </pre>
+        <p>All variables are bound in the head of a match_spec, so the
+          translator cannot allow multiple bindings. The special case
+          when matching is done on the top level makes the variable bind
+          to <c>'$_'</c> in the resulting match_spec; this is to allow a more
+          natural access to the whole matched object. The pseudo
+          function <c>object()</c> could be used instead, see below.
+          The following expressions are translated equally:</p>
+        <code type="none">
+ets:fun2ms(fun({a,_} = A) -> A end).
+ets:fun2ms(fun({a,_}) -> object() end).</code>
+      </item>
+      <item>
+        <p>The special match_spec variables <c>'$_'</c> and <c>'$*'</c>
+          can be accessed through the pseudo functions <c>object()</c>
+          (for <c>'$_'</c>) and <c>bindings()</c> (for <c>'$*'</c>).
+          As an example, one could translate the following
+          <c>ets:match_object/2</c> call to an <c>ets:select</c> call:</p>
+        <code type="none">
+ets:match_object(Table, {'$1',test,'$2'}). </code>
+        <p>...is the same as...</p>
+        <code type="none">
+ets:select(Table, ets:fun2ms(fun({A,test,B}) -> object() end)).</code>
+        <p>(This is just an example; in this simple case the former
+          expression is probably preferable in terms of readability.)
+          The <c>ets:select/2</c> call will conceptually look like this
+          in the resulting code:</p>
+        <code type="none">
+ets:select(Table, [{{'$1',test,'$2'},[],['$_']}]).</code>
+        <p>Matching on the top level of the fun head might feel like a
+          more natural way to access <c>'$_'</c>, see above.</p>
+      </item>
+      <item>Term constructions/literals are translated as much as is
+       needed to get them into valid match_specs, so that tuples are
+       made into match_spec tuple constructions (a one element tuple
+       containing the tuple) and constant expressions are used when
+       importing variables from the environment. Records are also
+       translated into plain tuple constructions, calls to element
+       etc. The guard test <c>is_record/2</c> is translated into
+       match_spec code using the three parameter version that's built
+       into match_specs, so that <c>is_record(A,t)</c> is translated
+       into <c>{is_record,'$1',t,5}</c> given that the record size of
+       record type <c>t</c> is 5.</item>
+      <item>Language constructions like <c>case</c>, <c>if</c>,
+      <c>catch</c> etc. that are not present in match_specs are not
+       allowed.</item>
+      <item>If the header file <c>ms_transform.hrl</c> is not included,
+       the fun won't be translated, which may result in a
+      <em>runtime error</em> (depending on whether the fun is valid in a
+       pure Erlang context). Be absolutely sure that the header is
+       included when using <c>ets:fun2ms/1</c> and <c>dbg:fun2ms/1</c> in
+       compiled code.</item>
+      <item>If the pseudo function triggering the translation is
+      <c>ets:fun2ms/1</c>, the fun's head must contain a single
+       variable or a single tuple. If the pseudo function is
+      <c>dbg:fun2ms/1</c> the fun's head must contain a single
+       variable or a single list.</item>
+    </list>
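+    <p>The rules above can be tried out in compiled code with a small
+      module. The following is a minimal sketch and not part of the
+      library: the module name <c>ms_example</c>, the function
+      <c>big_values/1</c> and the table contents are assumptions made
+      for illustration only.</p>
+    <code type="none">
+-module(ms_example).
+-export([big_values/1]).
+
+%% Without this header the fun below is not translated.
+-include_lib("stdlib/include/ms_transform.hrl").
+
+%% Hypothetical helper: returns {MatchSpec, Values}, where Values are
+%% the second elements of all {Key, Value} objects with Key > Limit.
+big_values(Limit) ->
+    Tab = ets:new(example, [set]),
+    true = ets:insert(Tab, [{10, a}, {30, b}, {50, c}]),
+    %% Limit does not appear in the fun head, so it is imported from
+    %% the environment as a const expression.
+    MS = ets:fun2ms(fun({K, V}) when K > Limit -> V end),
+    %% The order of objects in a set table is unspecified, hence the sort.
+    Values = lists:sort(ets:select(Tab, MS)),
+    true = ets:delete(Tab),
+    {MS, Values}.</code>
+    <p>Under these assumptions, <c>ms_example:big_values(25)</c> is
+      expected to return the same match specification as the shell
+      example with <c>X = 25</c> above, together with the list
+      <c>[b,c]</c>.</p>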
+    <p>The translation from funs to match_specs is done at compile
+      time, so runtime performance is not affected by using these pseudo
+      functions. Compilation might take somewhat longer, though.</p>
+    <p>For more information about match_specs, see the
+      <em>ERTS User's Guide</em>.</p>
+  </description>
+  <funcs>
+    <func>
+      <name>parse_transform(Forms,_Options) -> Forms</name>
+      <fsummary>Transforms Erlang abstract format containing calls to ets/dbg:fun2ms into literal match specifications.</fsummary>
+      <type>
+        <v>Forms = Erlang abstract code format, see the erl_parse module description </v>
+        <v>_Options = Option list, required but not used</v>
+      </type>
+      <desc>
+        <p>Implements the actual transformation at compile time. This
+          function is called by the compiler to do the source code
+          transformation if and when the <c>ms_transform.hrl</c> header
+          file is included in your source code. See the <c>ets</c> and
+          <c>dbg</c>:<c>fun2ms/1</c> function manual pages for
+          documentation on how to use this parse_transform. See the
+          <c>match_spec</c> chapter in the <c>ERTS</c> User's Guide for a
+          description of match specifications.</p>
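+        <p>Including the header file amounts to adding a compile
+          attribute that names this module as a parse transform. The
+          sketch below spells the attribute out instead of including the
+          header; the module name <c>ms_attr_example</c> and the function
+          <c>spec/0</c> are hypothetical and serve only as illustration.
+          Including the header remains the documented way to enable the
+          transformation.</p>
+        <code type="none">
+-module(ms_attr_example).
+-export([spec/0]).
+
+%% Assumed to have the same effect as including ms_transform.hrl.
+-compile({parse_transform, ms_transform}).
+
+%% The call to ets:fun2ms/1 is replaced by a literal match
+%% specification before the module is compiled.
+spec() ->
+    ets:fun2ms(fun({A, test, B}) -> object() end).</code>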
+      </desc>
+    </func>
+    <func>
+      <name>transform_from_shell(Dialect,Clauses,BoundEnvironment) -> term()</name>
+      <fsummary>Used when transforming funs created in the shell into match specifications.</fsummary>
+      <type>
+        <v>Dialect = ets | dbg</v>
+        <v>Clauses = Erlang abstract form for a single fun</v>
+        <v>BoundEnvironment = [{atom(), term()}, ...], list of variable bindings in the shell environment</v>
+      </type>
+      <desc>
+        <p>Implements the actual transformation when the <c>fun2ms</c>
+          functions are called from the shell. In this case the abstract
+          form is for one single fun (parsed by the Erlang shell), and
+          all imported variables should be in the key-value list passed
+          as <c>BoundEnvironment</c>. The result is a term, normalized,
+          i.e. not in abstract format.</p>
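+        <p>This function is normally reached through <c>fun2ms/1</c>
+          in the shell, but it can also be called directly. The following
+          sketch builds the abstract form of a fun with
+          <c>erl_scan</c> and <c>erl_parse</c> and passes its clauses
+          together with a binding for <c>X</c>. The source string and the
+          binding are assumptions chosen to mirror the shell example with
+          <c>X = 25</c> in the description.</p>
+        <code type="none">
+%% Scan and parse the source text of a single fun expression.
+{ok, Tokens, _} = erl_scan:string("fun({A,B}) when A > X -> B end."),
+{ok, [{'fun', _, {clauses, Clauses}}]} = erl_parse:parse_exprs(Tokens),
+
+%% X is not bound in the fun head, so it is looked up in the
+%% key-value list passed as BoundEnvironment.
+MS = ms_transform:transform_from_shell(ets, Clauses, [{'X', 25}]).</code>
+        <p>The returned term is expected to be an ordinary match
+          specification that can be passed to <c>ets:select/2</c>.</p>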
+      </desc>
+    </func>
+    <func>
+      <name>format_error(Errcode) -> ErrMessage</name>
+      <fsummary>Error formatting function as required by the parse_transform interface.</fsummary>
+      <type>
+        <v>Errcode = term()</v>
+        <v>ErrMessage = string()</v>
+      </type>
+      <desc>
+        <p>Takes an error code returned by one of the other functions
+          in the module and creates a textual description of the
+          error. Fairly uninteresting function actually.</p>
+      </desc>
+    </func>
+  </funcs>
+</erlref>
+