diff options
author | Erlang/OTP <[email protected]> | 2009-11-20 14:54:40 +0000 |
---|---|---|
committer | Erlang/OTP <[email protected]> | 2009-11-20 14:54:40 +0000 |
commit | 84adefa331c4159d432d22840663c38f155cd4c1 (patch) | |
tree | bff9a9c66adda4df2106dfd0e5c053ab182a12bd /lib/stdlib/doc/src/qlc.xml | |
download | otp-84adefa331c4159d432d22840663c38f155cd4c1.tar.gz otp-84adefa331c4159d432d22840663c38f155cd4c1.tar.bz2 otp-84adefa331c4159d432d22840663c38f155cd4c1.zip |
The R13B03 release.OTP_R13B03
Diffstat (limited to 'lib/stdlib/doc/src/qlc.xml')
-rw-r--r-- | lib/stdlib/doc/src/qlc.xml | 1486 |
1 files changed, 1486 insertions, 0 deletions
diff --git a/lib/stdlib/doc/src/qlc.xml b/lib/stdlib/doc/src/qlc.xml new file mode 100644 index 0000000000..da24ee9914 --- /dev/null +++ b/lib/stdlib/doc/src/qlc.xml @@ -0,0 +1,1486 @@ +<?xml version="1.0" encoding="latin1" ?> +<!DOCTYPE erlref SYSTEM "erlref.dtd"> + +<erlref> + <header> + <copyright> + <year>2004</year><year>2009</year> + <holder>Ericsson AB. All Rights Reserved.</holder> + </copyright> + <legalnotice> + The contents of this file are subject to the Erlang Public License, + Version 1.1, (the "License"); you may not use this file except in + compliance with the License. You should have received a copy of the + Erlang Public License along with this software. If not, it can be + retrieved online at http://www.erlang.org/. + + Software distributed under the License is distributed on an "AS IS" + basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See + the License for the specific language governing rights and limitations + under the License. + + </legalnotice> + + <title>qlc</title> + <prepared>Hans Bolinder</prepared> + <responsible>nobody</responsible> + <docno></docno> + <approved>nobody</approved> + <checked>no</checked> + <date>2004-08-25</date> + <rev>PA1</rev> + <file>qlc.sgml</file> + </header> + <module>qlc</module> + <modulesummary>Query Interface to Mnesia, ETS, Dets, etc</modulesummary> + <description> + <p>The <c>qlc</c> module provides a query interface to Mnesia, ETS, + Dets and other data structures that implement an iterator style + traversal of objects. </p> + </description> + + <section><title>Overview</title> + + <p>The <c>qlc</c> module implements a query interface to <em>QLC + tables</em>. Typical QLC tables are ETS, Dets, and Mnesia + tables. There is also support for user defined tables, see the + <seealso marker="#implementing_a_qlc_table">Implementing a QLC + table</seealso> section. A <em>query</em> is stated using + <em>Query List Comprehensions</em> (QLCs). The answers to a + query are determined by data in QLC tables that fulfill the + constraints expressed by the QLCs of the query. QLCs are similar + to ordinary list comprehensions as described in the Erlang + Reference Manual and Programming Examples except that variables + introduced in patterns cannot be used in list expressions. In + fact, in the absence of optimizations and options such as + <c>cache</c> and <c>unique</c> (see below), every QLC free of + QLC tables evaluates to the same list of answers as the + identical ordinary list comprehension. </p> + + <p>While ordinary list comprehensions evaluate to lists, calling + <seealso marker="#q">qlc:q/1,2</seealso> returns a <em>Query + Handle</em>. To obtain all the answers to a query, <seealso + marker="#eval">qlc:eval/1,2</seealso> should be called with the + query handle as first argument. Query handles are essentially + functional objects ("funs") created in the module calling <c>q/1,2</c>. + As the funs refer to the module's code, one should + be careful not to keep query handles too long if the module's + code is to be replaced. + Code replacement is described in the <seealso + marker="doc/reference_manual:code_loading">Erlang Reference + Manual</seealso>. The list of answers can also be traversed in + chunks by use of a <em>Query Cursor</em>. Query cursors are + created by calling <seealso + marker="#cursor">qlc:cursor/1,2</seealso> with a query handle as + first argument. Query cursors are essentially Erlang processes. + One answer at a time is sent from the query cursor process to + the process that created the cursor.</p> + + </section> + + <section><title>Syntax</title> + + <p>Syntactically QLCs have the same parts as ordinary list + comprehensions:</p> + + <code type="none">[Expression || Qualifier1, Qualifier2, ...]</code> + + <p><c>Expression</c> (the <em>template</em>) is an arbitrary + Erlang expression. Qualifiers are either <em>filters</em> or + <em>generators</em>. Filters are Erlang expressions returning + <c>bool()</c>. Generators have the form + <c><![CDATA[Pattern <- ListExpression]]></c>, where + <c>ListExpression</c> is an expression evaluating to a query + handle or a list. Query handles are returned from + <c>qlc:table/2</c>, <c>qlc:append/1,2</c>, <c>qlc:sort/1,2</c>, + <c>qlc:keysort/2,3</c>, <c>qlc:q/1,2</c>, and + <c>qlc:string_to_handle/1,2,3</c>.</p> + + </section> + + <section><title>Evaluation</title> + + <p>The evaluation of a query handle begins by the inspection of + options and the collection of information about tables. As a + result qualifiers are modified during the optimization phase. + Next all list expressions are evaluated. If a cursor has been + created evaluation takes place in the cursor process. For those + list expressions that are QLCs, the list expressions of the + QLCs' generators are evaluated as well. One has to be careful if + list expressions have side effects since the order in which list + expressions are evaluated is unspecified. Finally the answers + are found by evaluating the qualifiers from left to right, + backtracking when some filter returns <c>false</c>, or + collecting the template when all filters return <c>true</c>.</p> + + <p>Filters that do not return <c>bool()</c> but fail are handled + differently depending on their syntax: if the filter is a guard + it returns <c>false</c>, otherwise the query evaluation fails. + This behavior makes it possible for the <c>qlc</c> module to do + some optimizations without affecting the meaning of a query. For + example, when testing some position of a table and one or more + constants for equality, only + the objects with equal values are candidates for further + evaluation. The other objects are guaranteed to make the filter + return <c>false</c>, but never fail. The (small) set of + candidate objects can often be found by looking up some key + values of the table or by traversing the table using a match + specification. It is necessary to place the guard filters + immediately after the table's generator, otherwise the candidate + objects will not be restricted to a small set. The reason is + that objects that could make the query evaluation fail must not + be excluded by looking up a key or running a match + specification.</p> + + </section> + + <section><title>Join</title> + + <p>The <c>qlc</c> module supports fast join of two query handles. + Fast join is possible if some position <c>P1</c> of one query + handler and some position <c>P2</c> of another query handler are + tested for equality. Two fast join methods have been + implemented:</p> + + <list type="bulleted"> + <item>Lookup join traverses all objects of one query handle and + finds objects of the other handle (a QLC table) such that the + values at <c>P1</c> and <c>P2</c> match or compare equal. + The <c>qlc</c> module does not create + any indices but looks up values using the key position and + the indexed positions of the QLC table. + </item> + <item>Merge join sorts the objects of each query handle if + necessary and filters out objects where the values at + <c>P1</c> and <c>P2</c> do not compare equal. If there are + many objects with the same value of <c>P2</c> a temporary + file will be used for the equivalence classes. + </item> + </list> + + <p>The <c>qlc</c> module warns at compile time if a QLC + combines query handles in such a way that more than one join is + possible. In other words, there is no query planner that can + choose a good order between possible join operations. It is up + to the user to order the joins by introducing query handles.</p> + + <p>The join is to be expressed as a guard filter. The filter must + be placed immediately after the two joined generators, possibly + after guard filters that use variables from no other generators + but the two joined generators. The <c>qlc</c> module inspects + the operands of + <c>=:=/2</c>, <c>==/2</c>, <c>is_record/2</c>, <c>element/2</c>, + and logical operators (<c>and/2</c>, <c>or/2</c>, + <c>andalso/2</c>, <c>orelse/2</c>, <c>xor/2</c>) when + determining which joins to consider.</p> + + </section> + + <section><title>Common options</title> + + <p>The following options are accepted by <c>cursor/2</c>, + <c>eval/2</c>, <c>fold/4</c>, and <c>info/2</c>:</p> + + <list type="bulleted"> + <item><c>{cache_all, Cache}</c> where <c>Cache</c> is + equal to <c>ets</c> or <c>list</c> adds a + <c>{cache, Cache}</c> option to every list expression + of the query except tables and lists. Default is + <c>{cache_all, no}</c>. The option <c>cache_all</c> is + equivalent to <c>{cache_all, ets}</c>. + </item> + <item><c>{max_list_size, MaxListSize}</c> <marker + id="max_list_size"></marker> where <c>MaxListSize</c> is the + size in bytes of terms on the external format. If the + accumulated size of collected objects exceeds + <c>MaxListSize</c> the objects are written onto a temporary + file. This option is used by the <c>{cache, list}</c> + option as well as by the merge join method. Default is + 512*1024 bytes. + </item> + <item><c>{tmpdir_usage, TmpFileUsage}</c> determines the + action taken when <c>qlc</c> is about to create temporary + files on the directory set by the <c>tmpdir</c> option. If the + value is <c>not_allowed</c> an error tuple is returned, + otherwise temporary files are created as needed. Default is + <c>allowed</c> which means that no further action is taken. + The values <c>info_msg</c>, <c>warning_msg</c>, and + <c>error_msg</c> mean that the function with the corresponding + name in the module <c>error_logger</c> is called for printing + some information (currently the stacktrace). + </item> + <item><c>{tmpdir, TempDirectory}</c> sets the directory used by + merge join for temporary files and by the + <c>{cache, list}</c> option. The option also overrides + the <c>tmpdir</c> option of <c>keysort/3</c> and + <c>sort/2</c>. The default value is <c>""</c> which means that + the directory returned by <c>file:get_cwd()</c> is used. + </item> + <item><c>{unique_all, true}</c> adds a + <c>{unique, true}</c> option to every list expression of + the query. Default is <c>{unique_all, false}</c>. The + option <c>unique_all</c> is equivalent to + <c>{unique_all, true}</c>. + </item> + </list> + + </section> + + <section><title>Common data types</title> + + <list type="bulleted"> + <item><p><c>QueryCursor = {qlc_cursor, term()}</c></p> + </item> + <item><p><c>QueryHandle = {qlc_handle, term()}</c></p> + </item> + <item><p><c>QueryHandleOrList = QueryHandle | list()</c></p> + </item> + <item><p><c>Answers = [Answer]</c></p> + </item> + <item><p><c>Answer = term()</c></p> + </item> + <item><p><c>AbstractExpression = </c> - parse trees + for Erlang expressions, see the <seealso + marker="erts:absform">abstract format</seealso> + documentation in the ERTS User's Guide -</p> + </item> + <item><p><c>MatchExpression = </c> + - match specifications, see the <seealso + marker="erts:match_spec">match specification</seealso> + documentation in the ERTS User's Guide and <seealso + marker="ms_transform">ms_transform(3)</seealso> -</p> + </item> + <item><p><c>SpawnOptions = default | spawn_options()</c></p> + </item> + <item><p><c>SortOptions = [SortOption] | SortOption</c></p> + </item> + <item><p><c>SortOption = {compressed, bool()} + | {no_files, NoFiles} + | {order, Order} + | {size, Size} + | {tmpdir, TempDirectory} + | {unique, bool()} </c> + - see <seealso + marker="file_sorter">file_sorter(3)</seealso> -</p> + </item> + <item><p><c>Order = ascending | descending | OrderFun</c></p> + </item> + <item><p><c>OrderFun = fun(term(), term()) -> bool()</c></p> + </item> + <item><p><c>TempDirectory = "" | filename()</c></p> + </item> + <item><p><c>Size = int() > 0</c></p> + </item> + <item><p><c>NoFiles = int() > 1</c></p> + </item> + <item><p><c>KeyPos = int() > 0 | [int() > 0]</c></p> + </item> + <item><p><c>MaxListSize = int() >= 0</c></p> + </item> + <item><p><c>bool() = true | false</c></p> + </item> + <item><p><c>Cache = ets | list | no</c></p> + </item> + <item><p><c>TmpFileUsage = allowed | not_allowed | info_msg + | warning_msg | error_msg</c></p> + </item> + <item><p><c>filename() = </c> - see <seealso + marker="filename">filename(3)</seealso> -</p> + </item> + <item><p><c>spawn_options() = </c> - see <seealso + marker="erts:erlang">erlang(3)</seealso> -</p> + </item> + + </list> + + </section> + + <section><title>Getting started</title> + + <p><marker id="getting_started"></marker> As already mentioned + queries are stated in the list comprehension syntax as described + in the <seealso marker="doc/reference_manual:expressions">Erlang + Reference Manual</seealso>. In the following some familiarity + with list comprehensions is assumed. There are examples in + <seealso + marker="doc/programming_examples:list_comprehensions">Programming + Examples</seealso> that can get you started. It should be + stressed that list comprehensions do not add any computational + power to the language; anything that can be done with list + comprehensions can also be done without them. But they add a + syntax for expressing simple search problems which is compact + and clear once you get used to it.</p> + + <p>Many list comprehension expressions can be evaluated by the + <c>qlc</c> module. Exceptions are expressions such that + variables introduced in patterns (or filters) are used in some + generator later in the list comprehension. As an example + consider an implementation of lists:append(L): + <c><![CDATA[[X ||Y <- L, X <- Y]]]></c>. + Y is introduced in the first generator and used in the second. + The ordinary list comprehension is normally to be preferred when + there is a choice as to which to use. One difference is that + <c>qlc:eval/1,2</c> collects answers in a list which is finally + reversed, while list comprehensions collect answers on the stack + which is finally unwound.</p> + + <p>What the <c>qlc</c> module primarily adds to list + comprehensions is that data can be read from QLC tables in small + chunks. A QLC table is created by calling <c>qlc:table/2</c>. + Usually <c>qlc:table/2</c> is not called directly from the query + but via an interface function of some data structure. There are + a few examples of such functions in Erlang/OTP: + <c>mnesia:table/1,2</c>, <c>ets:table/1,2</c>, and + <c>dets:table/1,2</c>. For a given data structure there can be + several functions that create QLC tables, but common for all + these functions is that they return a query handle created by + <c>qlc:table/2</c>. Using the QLC tables provided by OTP is + probably sufficient in most cases, but for the more advanced + user the section <seealso + marker="#implementing_a_qlc_table">Implementing a QLC + table</seealso> describes the implementation of a function + calling <c>qlc:table/2</c>.</p> + + <p>Besides <c>qlc:table/2</c> there are other functions that + return query handles. They might not be used as often as tables, + but are useful from time to time. <c>qlc:append</c> traverses + objects from several tables or lists after each other. If, for + instance, you want to traverse all answers to a query QH and + then finish off by a term <c>{finished}</c>, you can do that by + calling <c>qlc:append(QH, [{finished}])</c>. <c>append</c> first + returns all objects of QH, then <c>{finished}</c>. If there is + one tuple <c>{finished}</c> among the answers to QH it will be + returned twice from <c>append</c>.</p> + + <p>As another example, consider concatenating the answers to two + queries QH1 and QH2 while removing all duplicates. The means to + accomplish this is to use the <c>unique</c> option:</p> + + <code type="none"><![CDATA[ +qlc:q([X || X <- qlc:append(QH1, QH2)], {unique, true})]]></code> + + <p>The cost is substantial: every returned answer will be stored + in an ETS table. Before returning an answer it is looked up in + the ETS table to check if it has already been returned. Without + the <c>unique</c> options all answers to QH1 would be returned + followed by all answers to QH2. The <c>unique</c> options keeps + the order between the remaining answers.</p> + + <p>If the order of the answers is not important there is the + alternative to sort the answers uniquely:</p> + + <code type="none"><![CDATA[ +qlc:sort(qlc:q([X || X <- qlc:append(QH1, QH2)], {unique, true})).]]></code> + + <p>This query also removes duplicates but the answers will be + sorted. If there are many answers temporary files will be used. + Note that in order to get the first unique answer all answers + have to be found and sorted. Both alternatives find duplicates + by comparing answers, that is, if A1 and A2 are answers found in + that order, then A2 is a removed if A1 == A2.</p> + + <p>To return just a few answers cursors can be used. The following + code returns no more than five answers using an ETS table for + storing the unique answers:</p> + + <code type="none"><![CDATA[ +C = qlc:cursor(qlc:q([X || X <- qlc:append(QH1, QH2)],{unique,true})), +R = qlc:next_answers(C, 5), +ok = qlc:delete_cursor(C), +R.]]></code> + + <p>Query list comprehensions are convenient for stating + constraints on data from two or more tables. An example that + does a natural join on two query handles on position 2:</p> + + <code type="none"><![CDATA[ +qlc:q([{X1,X2,X3,Y1} || + {X1,X2,X3} <- QH1, + {Y1,Y2} <- QH2, + X2 =:= Y2])]]></code> + + <p>The <c>qlc</c> module will evaluate this differently depending on + the query + handles <c>QH1</c> and <c>QH2</c>. If, for example, <c>X2</c> is + matched against the key of a QLC table the lookup join method + will traverse the objects of <c>QH2</c> while looking up key + values in the table. On the other hand, if neither <c>X2</c> nor + <c>Y2</c> is matched against the key or an indexed position of a + QLC table, the merge join method will make sure that <c>QH1</c> + and <c>QH2</c> are both sorted on position 2 and next do the + join by traversing the objects one by one.</p> + + <p>The <c>join</c> option can be used to force the <c>qlc</c> module + to use a + certain join method. For the rest of this section it is assumed + that the excessively slow join method called "nested loop" has + been chosen:</p> + + <code type="none"><![CDATA[ +qlc:q([{X1,X2,X3,Y1} || + {X1,X2,X3} <- QH1, + {Y1,Y2} <- QH2, + X2 =:= Y2], + {join, nested_loop})]]></code> + + <p>In this case the filter will be applied to every possible pair + of answers to QH1 and QH2, one at a time. If there are M answers + to QH1 and N answers to QH2 the filter will be run M*N + times.</p> + + <p>If QH2 is a call to the function for <c>gb_trees</c> as defined + in the <seealso marker="#implementing_a_qlc_table">Implementing + a QLC table</seealso> section, <c>gb_table:table/1</c>, the + iterator for the gb-tree will be initiated for each answer to + QH1 after which the objects of the gb-tree will be returned one + by one. This is probably the most efficient way of traversing + the table in that case since it takes minimal computational + power to get the following object. But if QH2 is not a table but + a more complicated QLC, it can be more efficient use some RAM + memory for collecting the answers in a cache, particularly if + there are only a few answers. It must then be assumed that + evaluating QH2 has no side effects so that the meaning of the + query does not change if QH2 is evaluated only once. One way of + caching the answers is to evaluate QH2 first of all and + substitute the list of answers for QH2 in the query. Another way + is to use the <c>cache</c> option. It is stated like this:</p> + + <code type="none"><![CDATA[ +QH2' = qlc:q([X || X <- QH2], {cache, ets})]]></code> + + <p>or just</p> + + <code type="none"><![CDATA[ +QH2' = qlc:q([X || X <- QH2], cache)]]></code> + + <p>The effect of the <c>cache</c> option is that when the + generator QH2' is run the first time every answer is stored in + an ETS table. When next answer of QH1 is tried, answers to QH2' + are copied from the ETS table which is very fast. As for the + <c>unique</c> option the cost is a possibly substantial amount + of RAM memory. The <c>{cache, list}</c> option offers the + possibility to store the answers in a list on the process heap. + While this has the potential of being faster than ETS tables + since there is no need to copy answers from the table it can + often result in slower evaluation due to more garbage + collections of the process' heap as well as increased RAM memory + consumption due to larger heaps. Another drawback with cache + lists is that if the size of the list exceeds a limit a + temporary file will be used. Reading the answers from a file is + very much slower than copying them from an ETS table. But if the + available RAM memory is scarce setting the <seealso + marker="#max_list_size">limit</seealso> to some low value is an + alternative.</p> + + <p>There is an option <c>cache_all</c> that can be set to + <c>ets</c> or <c>list</c> when evaluating a query. It adds a + <c>cache</c> or <c>{cache, list}</c> option to every list + expression except QLC tables and lists on all levels of the + query. This can be used for testing if caching would improve + efficiency at all. If the answer is yes further testing is + needed to pinpoint the generators that should be cached.</p> + + </section> + + <section><title>Implementing a QLC table</title> + + <p><marker id="implementing_a_qlc_table"></marker>As an example of + how to use the <seealso marker="#q">qlc:table/2</seealso> + function the implementation of a QLC table for the <seealso + marker="gb_trees">gb_trees</seealso> module is given:</p> + + <code type="none"><![CDATA[ +-module(gb_table). + +-export([table/1]). + +table(T) -> + TF = fun() -> qlc_next(gb_trees:next(gb_trees:iterator(T))) end, + InfoFun = fun(num_of_objects) -> gb_trees:size(T); + (keypos) -> 1; + (is_sorted_key) -> true; + (is_unique_objects) -> true; + (_) -> undefined + end, + LookupFun = + fun(1, Ks) -> + lists:flatmap(fun(K) -> + case gb_trees:lookup(K, T) of + {value, V} -> [{K,V}]; + none -> [] + end + end, Ks) + end, + FormatFun = + fun({all, NElements, ElementFun}) -> + ValsS = io_lib:format("gb_trees:from_orddict(~w)", + [gb_nodes(T, NElements, ElementFun)]), + io_lib:format("gb_table:table(~s)", [ValsS]); + ({lookup, 1, KeyValues, _NElements, ElementFun}) -> + ValsS = io_lib:format("gb_trees:from_orddict(~w)", + [gb_nodes(T, infinity, ElementFun)]), + io_lib:format("lists:flatmap(fun(K) -> " + "case gb_trees:lookup(K, ~s) of " + "{value, V} -> [{K,V}];none -> [] end " + "end, ~w)", + [ValsS, [ElementFun(KV) || KV <- KeyValues]]) + end, + qlc:table(TF, [{info_fun, InfoFun}, {format_fun, FormatFun}, + {lookup_fun, LookupFun},{key_equality,'=='}]). + +qlc_next({X, V, S}) -> + [{X,V} | fun() -> qlc_next(gb_trees:next(S)) end]; +qlc_next(none) -> + []. + +gb_nodes(T, infinity, ElementFun) -> + gb_nodes(T, -1, ElementFun); +gb_nodes(T, NElements, ElementFun) -> + gb_iter(gb_trees:iterator(T), NElements, ElementFun). + +gb_iter(_I, 0, _EFun) -> + '...'; +gb_iter(I0, N, EFun) -> + case gb_trees:next(I0) of + {X, V, I} -> + [EFun({X,V}) | gb_iter(I, N-1, EFun)]; + none -> + [] + end.]]></code> + + <p><c>TF</c> is the traversal function. The <c>qlc</c> module + requires that there is a way of traversing all objects of the + data structure; in <c>gb_trees</c> there is an iterator function + suitable for that purpose. Note that for each object returned a + new fun is created. As long as the list is not terminated by + <c>[]</c> it is assumed that the tail of the list is a nullary + function and that calling the function returns further objects + (and functions).</p> + + <p>The lookup function is optional. It is assumed that the lookup + function always finds values much faster than it would take to + traverse the table. The first argument is the position of the + key. Since <c>qlc_next</c> returns the objects as + {Key, Value} pairs the position is 1. Note that the lookup + function should return {Key, Value} pairs, just as the + traversal function does.</p> + + <p>The format function is also optional. It is called by + <c>qlc:info</c> to give feedback at runtime of how the query + will be evaluated. One should try to give as good feedback as + possible without showing too much details. In the example at + most 7 objects of the table are shown. The format function + handles two cases: <c>all</c> means that all objects of the + table will be traversed; <c>{lookup, 1, KeyValues}</c> + means that the lookup function will be used for looking up key + values.</p> + + <p>Whether the whole table will be traversed or just some keys + looked up depends on how the query is stated. If the query has + the form</p> + + <code type="none"><![CDATA[ +qlc:q([T || P <- LE, F])]]></code> + + <p>and P is a tuple, the <c>qlc</c> module analyzes P and F in + compile time to find positions of the tuple P that are tested + for equality to constants. If such a position at runtime turns + out to be the key position, the lookup function can be used, + otherwise all objects of the table have to be traversed. It is + the info function <c>InfoFun</c> that returns the key position. + There can be indexed positions as well, also returned by the + info function. An index is an extra table that makes lookup on + some position fast. Mnesia maintains indices upon request, + thereby introducing so called secondary keys. The <c>qlc</c> + module prefers to look up objects using the key before secondary + keys regardless of the number of constants to look up.</p> + + </section> + + <section><title>Key equality</title> + + <p>In Erlang there are two operators for testing term equality, + namely <c>==/2</c> and <c>=:=/2</c>. The difference between them + is all about the integers that can be represented by floats. For + instance, <c>2 == 2.0</c> evaluates to + <c>true</c> while <c>2 =:= 2.0</c> evaluates to <c>false</c>. + Normally this is a minor issue, but the <c>qlc</c> module cannot + ignore the difference, which affects the user's choice of + operators in QLCs.</p> + + <p>If the <c>qlc</c> module can find out at compile time that some + constant is free of integers, it does not matter which one of + <c>==/2</c> or <c>=:=/2</c> is used:</p> + + <pre> +1> <input>E1 = ets:new(t, [set]), % uses =:=/2 for key equality</input> +<input>Q1 = qlc:q([K ||</input> +<input>{K} <- ets:table(E1),</input> +<input>K == 2.71 orelse K == a]),</input> +<input>io:format("~s~n", [qlc:info(Q1)]).</input> +ets:match_spec_run(lists:flatmap(fun(V) -> + ets:lookup(20493, V) + end, + [a,2.71]), + ets:match_spec_compile([{{'$1'},[],['$1']}]))</pre> + + <p>In the example the <c>==/2</c> operator has been handled + exactly as <c>=:=/2</c> would have been handled. On the other + hand, if it cannot be determined at compile time that some + constant is free of integers and the table uses <c>=:=/2</c> + when comparing keys for equality (see the option <seealso + marker="#key_equality">key_equality</seealso>), the + <c>qlc</c> module will not try to look up the constant. The + reason is that there is in the general case no upper limit on + the number of key values that can compare equal to such a + constant; every combination of integers and floats has to be + looked up:</p> + + <pre> +2> <input>E2 = ets:new(t, [set]),</input> +<input>true = ets:insert(E2, [{{2,2},a},{{2,2.0},b},{{2.0,2},c}]),</input> +<input>F2 = fun(I) -></input> +<input>qlc:q([V || {K,V} <- ets:table(E2), K == I])</input> +<input>end,</input> +<input>Q2 = F2({2,2}),</input> +<input>io:format("~s~n", [qlc:info(Q2)]).</input> +ets:table(53264, + [{traverse, + {select,[{{'$1','$2'},[{'==','$1',{const,{2,2}}}],['$2']}]}}]) +3> <input>lists:sort(qlc:e(Q2)).</input> +[a,b,c]</pre> + + <p>Looking up just <c>{2,2}</c> would not return <c>b</c> and + <c>c</c>.</p> + + <p>If the table uses <c>==/2</c> when comparing keys for equality, + the <c>qlc</c> module will look up the constant regardless of + which operator is used in the QLC. However, <c>==/2</c> is to + be preferred:</p> + + <pre> +4> <input>E3 = ets:new(t, [ordered_set]), % uses ==/2 for key equality</input> +<input>true = ets:insert(E3, [{{2,2.0},b}]),</input> +<input>F3 = fun(I) -></input> +<input>qlc:q([V || {K,V} <- ets:table(E3), K == I])</input> +<input>end,</input> +<input>Q3 = F3({2,2}),</input> +<input>io:format("~s~n", [qlc:info(Q3)]).</input> +ets:match_spec_run(ets:lookup(86033, {2,2}), + ets:match_spec_compile([{{'$1','$2'},[],['$2']}])) +5> <input>qlc:e(Q3).</input> +[b]</pre> + + <p>Lookup join is handled analogously to lookup of constants in a + table: if the join operator is <c>==/2</c> and the table where + constants are to be looked up uses <c>=:=/2</c> when testing + keys for equality, the <c>qlc</c> module will not consider + lookup join for that table.</p> + + </section> + + <funcs> + + <func> + <name>append(QHL) -> QH</name> + <fsummary>Return a query handle.</fsummary> + <type> + <v>QHL = [QueryHandleOrList]</v> + <v>QH = QueryHandle</v> + </type> + <desc> + <p>Returns a query handle. When evaluating the query handle + <c>QH</c> all answers to the first query handle in + <c>QHL</c> is returned followed by all answers to the rest + of the query handles in <c>QHL</c>.</p> + </desc> + </func> + + <func> + <name>append(QH1, QH2) -> QH3</name> + <fsummary>Return a query handle.</fsummary> + <type> + <v>QH1 = QH2 = QueryHandleOrList</v> + <v>QH3 = QueryHandle</v> + </type> + <desc> + <p>Returns a query handle. When evaluating the query handle + <c>QH3</c> all answers to <c>QH1</c> are returned followed + by all answers to <c>QH2</c>.</p> + + <p><c>append(QH1, QH2)</c> is equivalent to + <c>append([QH1, QH2])</c>.</p> + </desc> + </func> + + <func> + <name>cursor(QueryHandleOrList [, Options]) -> QueryCursor</name> + <fsummary>Create a query cursor.</fsummary> + <type> + <v>Options = [Option] | Option</v> + <v>Option = {cache_all, Cache} | cache_all + | {max_list_size, MaxListSize} + | {spawn_options, SpawnOptions} + | {tmpdir_usage, TmpFileUsage} + | {tmpdir, TempDirectory} + | {unique_all, bool()} | unique_all</v> + </type> + <desc> + <p><marker id="cursor"></marker>Creates a query cursor and + makes the calling process the owner of the cursor. The + cursor is to be used as argument to <c>next_answers/1,2</c> + and (eventually) <c>delete_cursor/1</c>. Calls + <c>erlang:spawn_opt</c> to spawn and link a process which + will evaluate the query handle. The value of the option + <c>spawn_options</c> is used as last argument when calling + <c>spawn_opt</c>. The default value is <c>[link]</c>.</p> + + <pre> +1> <input>QH = qlc:q([{X,Y} || X <- [a,b], Y <- [1,2]]),</input> +<input>QC = qlc:cursor(QH),</input> +<input>qlc:next_answers(QC, 1).</input> +[{a,1}] +2> <input>qlc:next_answers(QC, 1).</input> +[{a,2}] +3> <input>qlc:next_answers(QC, all_remaining).</input> +[{b,1},{b,2}] +4> <input>qlc:delete_cursor(QC).</input> +ok</pre> + </desc> + </func> + + <func> + <name>delete_cursor(QueryCursor) -> ok</name> + <fsummary>Delete a query cursor.</fsummary> + <desc> + <p>Deletes a query cursor. Only the owner of the cursor can + delete the cursor.</p> + </desc> + </func> + + <func> + <name>eval(QueryHandleOrList [, Options]) -> Answers | Error</name> + <name>e(QueryHandleOrList [, Options]) -> Answers</name> + <fsummary>Return all answers to a query.</fsummary> + <type> + <v>Options = [Option] | Option</v> + <v>Option = {cache_all, Cache} | cache_all + | {max_list_size, MaxListSize} + | {tmpdir_usage, TmpFileUsage} + | {tmpdir, TempDirectory} + | {unique_all, bool()} | unique_all</v> + <v>Error = {error, module(), Reason}</v> + <v>Reason = - as returned by file_sorter(3) -</v> + </type> + <desc> + <p><marker id="eval"></marker>Evaluates a query handle in the + calling process and collects all answers in a list.</p> + + <pre> +1> <input>QH = qlc:q([{X,Y} || X <- [a,b], Y <- [1,2]]),</input> +<input>qlc:eval(QH).</input> +[{a,1},{a,2},{b,1},{b,2}]</pre> + </desc> + </func> + + <func> + <name>fold(Function, Acc0, QueryHandleOrList [, Options]) -> + Acc1 | Error</name> + <fsummary>Fold a function over the answers to a query.</fsummary> + <type> + <v>Function = fun(Answer, AccIn) -> AccOut</v> + <v>Acc0 = Acc1 = AccIn = AccOut = term()</v> + <v>Options = [Option] | Option</v> + <v>Option = {cache_all, Cache} | cache_all + | {max_list_size, MaxListSize} + | {tmpdir_usage, TmpFileUsage} + | {tmpdir, TempDirectory} + | {unique_all, bool()} | unique_all</v> + <v>Error = {error, module(), Reason}</v> + <v>Reason = - as returned by file_sorter(3) -</v> + </type> + <desc> + <p>Calls <c>Function</c> on successive answers to the query + handle together with an extra argument <c>AccIn</c>. The + query handle and the function are evaluated in the calling + process. <c>Function</c> must return a new accumulator which + is passed to the next call. <c>Acc0</c> is returned if there + are no answers to the query handle.</p> + + <pre> +1> <input>QH = [1,2,3,4,5,6],</input> +<input>qlc:fold(fun(X, Sum) -> X + Sum end, 0, QH).</input> +21</pre> + </desc> + </func> + + <func> + <name>format_error(Error) -> Chars</name> + <fsummary>Return an English description of a an error tuple.</fsummary> + <type> + <v>Error = {error, module(), term()}</v> + <v>Chars = [char() | Chars]</v> + </type> + <desc> + <p>Returns a descriptive string in English of an error tuple + returned by some of the functions of the <c>qlc</c> module + or the parse transform. This function is mainly used by the + compiler invoking the parse transform.</p> + </desc> + </func> + + <func> + <name>info(QueryHandleOrList [, Options]) -> Info</name> + <fsummary>Return code describing a query handle.</fsummary> + <type> + <v>Options = [Option] | Option</v> + <v>Option = EvalOption | ReturnOption</v> + <v>EvalOption = {cache_all, Cache} | cache_all + | {max_list_size, MaxListSize} + | {tmpdir_usage, TmpFileUsage} + | {tmpdir, TempDirectory} + | {unique_all, bool()} | unique_all</v> + <v>ReturnOption = {depth, Depth} + | {flat, bool()} + | {format, Format} + | {n_elements, NElements}</v> + <v>Depth = infinity | int() >= 0</v> + <v>Format = abstract_code | string</v> + <v>NElements = infinity | int() > 0</v> + <v>Info = AbstractExpression | string()</v> + </type> + <desc> + <p><marker id="info"></marker>Returns information about a + query handle. The information describes the simplifications + and optimizations that are the results of preparing the + query for evaluation. This function is probably useful + mostly during debugging.</p> + + <p>The information has the form of an Erlang expression where + QLCs most likely occur. Depending on the format functions of + mentioned QLC tables it may not be absolutely accurate.</p> + + <p>The default is to return a sequence of QLCs in a block, but + if the option <c>{flat, false}</c> is given, one single + QLC is returned. The default is to return a string, but if + the option <c>{format, abstract_code}</c> is given, + abstract code is returned instead. In the abstract code + port identifiers, references, and pids are represented by + strings. The default is to return + all elements in lists, but if the + <c>{n_elements, NElements}</c> option is given, only a + limited number of elements are returned. The default is to + show all of objects and match specifications, but if the + <c>{depth, Depth}</c> option is given, parts of terms + below a certain depth are replaced by <c>'...'</c>.</p> + + <pre> +1> <input>QH = qlc:q([{X,Y} || X <- [x,y], Y <- [a,b]]),</input> +<input>io:format("~s~n", [qlc:info(QH, unique_all)]).</input> +begin + V1 = + qlc:q([ + SQV || + SQV <- [x,y] + ], + [{unique,true}]), + V2 = + qlc:q([ + SQV || + SQV <- [a,b] + ], + [{unique,true}]), + qlc:q([ + {X,Y} || + X <- V1, + Y <- V2 + ], + [{unique,true}]) +end</pre> + + <p>In this example two simple QLCs have been inserted just to + hold the <c>{unique, true}</c> option.</p> + + <pre> +1> <input>E1 = ets:new(e1, []),</input> +<input>E2 = ets:new(e2, []),</input> +<input>true = ets:insert(E1, [{1,a},{2,b}]),</input> +<input>true = ets:insert(E2, [{a,1},{b,2}]),</input> +<input>Q = qlc:q([{X,Z,W} ||</input> +<input>{X, Z} <- ets:table(E1),</input> +<input>{W, Y} <- ets:table(E2),</input> +<input>X =:= Y]),</input> +<input>io:format("~s~n", [qlc:info(Q)]).</input> +begin + V1 = + qlc:q([ + P0 || + P0 = {W,Y} <- ets:table(17) + ]), + V2 = + qlc:q([ + [G1|G2] || + G2 <- V1, + G1 <- ets:table(16), + element(2, G1) =:= element(1, G2) + ], + [{join,lookup}]), + qlc:q([ + {X,Z,W} || + [{X,Z}|{W,Y}] <- V2 + ]) +end</pre> + + <p>In this example the query list comprehension <c>V2</c> has + been inserted to show the joined generators and the join + method chosen. A convention is used for lookup join: the + first generator (<c>G2</c>) is the one traversed, the second + one (<c>G1</c>) is the table where constants are looked up.</p> + </desc> + </func> + + <func> + <name>keysort(KeyPos, QH1 [, SortOptions]) -> QH2</name> + <fsummary>Return a query handle.</fsummary> + <type> + <v>QH1 = QueryHandleOrList</v> + <v>QH2 = QueryHandle</v> + </type> + <desc> + <p>Returns a query handle. When evaluating the query handle + <c>QH2</c> the answers to the query handle <c>QH1</c> are + sorted by <seealso + marker="file_sorter">file_sorter:keysort/4</seealso> + according to the options.</p> + + <p>The sorter will use temporary files only if <c>QH1</c> does + not evaluate to a list and the size of the binary + representation of the answers exceeds <c>Size</c> bytes, + where <c>Size</c> is the value of the <c>size</c> option.</p> + </desc> + </func> + + <func> + <name>next_answers(QueryCursor [, NumberOfAnswers]) -> + Answers | Error</name> + <fsummary>Return some or all answers to a query.</fsummary> + <type> + <v>NumberOfAnswers = all_remaining | int() > 0</v> + <v>Error = {error, module(), Reason}</v> + <v>Reason = - as returned by file_sorter(3) -</v> + </type> + <desc> + <p>Returns some or all of the remaining answers to a query + cursor. Only the owner of <c>Cursor</c> can retrieve + answers.</p> + + <p>The optional argument <c>NumberOfAnswers</c>determines the + maximum number of answers returned. The default value is + <c>10</c>. If less than the requested number of answers is + returned, subsequent calls to <c>next_answers</c> will + return <c>[]</c>.</p> + </desc> + </func> + + <func> + <name>q(QueryListComprehension [, Options]) -> QueryHandle</name> + <fsummary>Return a handle for a query list comprehension.</fsummary> + <type> + <v>QueryListComprehension = + - literal query listcomprehension -</v> + <v>Options = [Option] | Option</v> + <v>Option = {max_lookup, MaxLookup} + | {cache, Cache} | cache + | {join, Join} + | {lookup, Lookup} + | {unique, bool()} | unique</v> + <v>MaxLookup = int() >= 0 | infinity</v> + <v>Join = any | lookup | merge | nested_loop</v> + <v>Lookup = bool() | any</v> + </type> + <desc> + <p><marker id="q"></marker>Returns a query handle for a query + list comprehension. The query list comprehension must be the + first argument to <c>qlc:q/1,2</c> or it will be evaluated + as an ordinary list comprehension. It is also necessary to + add the line</p> + + <code type="none"> +-include_lib("stdlib/include/qlc.hrl").</code> + + <p>to the source file. This causes a parse transform to + substitute a fun for the query list comprehension. The + (compiled) fun will be called when the query handle is + evaluated.</p> + + <p>When calling <c>qlc:q/1,2</c> from the Erlang shell the + parse transform is automatically called. When this happens + the fun substituted for the query list comprehension is not + compiled but will be evaluated by <c>erl_eval(3)</c>. This + is also true when expressions are evaluated by means of + <c>file:eval/1,2</c> or in the debugger.</p> + + <p>To be very explicit, this will not work:</p> + + <pre> +... +A = [X || {X} <- [{1},{2}]], +QH = qlc:q(A), +...</pre> + + <p>The variable <c>A</c> will be bound to the evaluated value + of the list comprehension (<c>[1,2]</c>). The compiler + complains with an error message ("argument is not a query + list comprehension"); the shell process stops with a + <c>badarg</c> reason.</p> + + <p>The <c>{cache, ets}</c> option can be used to cache + the answers to a query list comprehension. The answers are + stored in one ETS table for each cached query list + comprehension. When a cached query list comprehension is + evaluated again, answers are fetched from the table without + any further computations. As a consequence, when all answers + to a cached query list comprehension have been found, the + ETS tables used for caching answers to the query list + comprehension's qualifiers can be emptied. The option + <c>cache</c> is equivalent to <c>{cache, ets}</c>.</p> + + <p>The <c>{cache, list}</c> option can be used to cache + the answers to a query list comprehension just like + <c>{cache, ets}</c>. The difference is that the answers + are kept in a list (on the process heap). If the answers + would occupy more than a certain amount of RAM memory a + temporary file is used for storing the answers. The option + <c>max_list_size</c> sets the limit in bytes and the temporary + file is put on the directory set by the <c>tmpdir</c> option.</p> + + <p>The <c>cache</c> option has no effect if it is known that + the query list comprehension will be evaluated at most once. + This is always true for the top-most query list + comprehension and also for the list expression of the first + generator in a list of qualifiers. Note that in the presence + of side effects in filters or callback functions the answers + to query list comprehensions can be affected by the + <c>cache</c> option.</p> + + <p>The <c>{unique, true}</c> option can be used to remove + duplicate answers to a query list comprehension. The unique + answers are stored in one ETS table for each query list + comprehension. The table is emptied every time it is known + that there are no more answers to the query list + comprehension. The option <c>unique</c> is equivalent to + <c>{unique, true}</c>. If the <c>unique</c> option is + combined with the <c>{cache, ets}</c> option, two ETS + tables are used, but the full answers are stored in one + table only. If the <c>unique</c> option is combined with the + <c>{cache, list}</c> option the answers are sorted + twice using <c>keysort/3</c>; once to remove duplicates, and + once to restore the order.</p> + + <p>The <c>cache</c> and <c>unique</c> options apply not only + to the query list comprehension itself but also to the + results of looking up constants, running match + specifications, and joining handles. </p> + + <pre> +1> <input>Q = qlc:q([{A,X,Z,W} ||</input> +<input>A <- [a,b,c],</input> +<input>{X,Z} <- [{a,1},{b,4},{c,6}],</input> +<input>{W,Y} <- [{2,a},{3,b},{4,c}],</input> +<input>X =:= Y],</input> +<input>{cache, list}),</input> +<input>io:format("~s~n", [qlc:info(Q)]).</input> +begin + V1 = + qlc:q([ + P0 || + P0 = {X,Z} <- + qlc:keysort(1, [{a,1},{b,4},{c,6}], []) + ]), + V2 = + qlc:q([ + P0 || + P0 = {W,Y} <- + qlc:keysort(2, [{2,a},{3,b},{4,c}], []) + ]), + V3 = + qlc:q([ + [G1|G2] || + G1 <- V1, + G2 <- V2, + element(1, G1) == element(2, G2) + ], + [{join,merge},{cache,list}]), + qlc:q([ + {A,X,Z,W} || + A <- [a,b,c], + [{X,Z}|{W,Y}] <- V3, + X =:= Y + ]) +end</pre> + + <p>In this example the cached results of the merge join are + traversed for each value of <c>A</c>. Note that without the + <c>cache</c> option the join would have been carried out + three times, once for each value of <c>A</c></p> + + <p><c>sort/1,2</c> and <c>keysort/2,3</c> can also be used for + caching answers and for removing duplicates. When sorting + answers are cached in a list, possibly stored on a temporary + file, and no ETS tables are used.</p> + + <p>Sometimes (see <seealso + marker="#lookup_fun">qlc:table/2</seealso> below) traversal + of tables can be done by looking up key values, which is + assumed to be fast. Under certain (rare) circumstances it + could happen that there are too many key values to look up. + <marker id="max_lookup"></marker> The + <c>{max_lookup, MaxLookup}</c> option can then be used + to limit the number of lookups: if more than + <c>MaxLookup</c> lookups would be required no lookups are + done but the table traversed instead. The default value is + <c>infinity</c> which means that there is no limit on the + number of keys to look up.</p> + <pre> +1> <input>T = gb_trees:empty(),</input> +<input>QH = qlc:q([X || {{X,Y},_} <- gb_table:table(T),</input> +<input>((X == 1) or (X == 2)) andalso</input> +<input>((Y == a) or (Y == b) or (Y == c))]),</input> +<input>io:format("~s~n", [qlc:info(QH)]).</input> +ets:match_spec_run( + lists:flatmap(fun(K) -> + case + gb_trees:lookup(K, + gb_trees:from_orddict([])) + of + {value,V} -> + [{K,V}]; + none -> + [] + end + end, + [{1,a},{1,b},{1,c},{2,a},{2,b},{2,c}]), + ets:match_spec_compile([{{{'$1','$2'},'_'},[],['$1']}]))</pre> + + <p>In this example using the <c>gb_table</c> module from the + <seealso marker="#implementing_a_qlc_table">Implementing a + QLC table</seealso> section there are six keys to look up: + <c>{1,a}</c>, <c>{1,b}</c>, <c>{1,c}</c>, <c>{2,a}</c>, + <c>{2,b}</c>, and <c>{2,c}</c>. The reason is that the two + elements of the key {X, Y} are compared separately.</p> + + <p>The <c>{lookup, true}</c> option can be used to ensure + that the <c>qlc</c> module will look up constants in some + QLC table. If there + are more than one QLC table among the generators' list + expressions, constants have to be looked up in at least one + of the tables. The evaluation of the query fails if there + are no constants to look up. This option is useful in + situations when it would be unacceptable to traverse all + objects in some table. Setting the <c>lookup</c> option to + <c>false</c> ensures that no constants will be looked up + (<c>{max_lookup, 0}</c> has the same effect). The + default value is <c>any</c> which means that constants will + be looked up whenever possible.</p> + + <p>The <c>{join, Join}</c> option can be used to ensure + that a certain join method will be used: + <c>{join, lookup}</c> invokes the lookup join method; + <c>{join, merge}</c> invokes the merge join method; and + <c>{join, nested_loop}</c> invokes the method of + matching every pair of objects from two handles. The last + method is mostly very slow. The evaluation of the query + fails if the <c>qlc</c> module cannot carry out the chosen + join method. The + default value is <c>any</c> which means that some fast join + method will be used if possible.</p> + </desc> + </func> + + <func> + <name>sort(QH1 [, SortOptions]) -> QH2</name> + <fsummary>Return a query handle.</fsummary> + <type> + <v>QH1 = QueryHandleOrList</v> + <v>QH2 = QueryHandle</v> + </type> + <desc> + <p>Returns a query handle. When evaluating the query handle + <c>QH2</c> the answers to the query handle <c>QH1</c> are + sorted by <seealso + marker="file_sorter">file_sorter:sort/3</seealso> according + to the options.</p> + + <p>The sorter will use temporary files only if <c>QH1</c> does + not evaluate to a list and the size of the binary + representation of the answers exceeds <c>Size</c> bytes, + where <c>Size</c> is the value of the <c>size</c> option.</p> + </desc> + </func> + + <func> + <name>string_to_handle(QueryString [, Options [, Bindings]]) -> + QueryHandle | Error</name> + <fsummary>Return a handle for a query list comprehension.</fsummary> + <type> + <v>QueryString = string()</v> + <v>Options = [Option] | Option</v> + <v>Option = {max_lookup, MaxLookup} + | {cache, Cache} | cache + | {join, Join} + | {lookup, Lookup} + | {unique, bool()} | unique</v> + <v>MaxLookup = int() >= 0 | infinity</v> + <v>Join = any | lookup | merge | nested_loop</v> + <v>Lookup = bool() | any</v> + <v>Bindings = - as returned by + erl_eval:bindings/1 -</v> + <v>Error = {error, module(), Reason}</v> + <v>Reason = - ErrorInfo as returned by + erl_scan:string/1 or erl_parse:parse_exprs/1 -</v> + </type> + <desc> + <p>A string version of <c>qlc:q/1,2</c>. When the query handle + is evaluated the fun created by the parse transform is + interpreted by <c>erl_eval(3)</c>. The query string is to be + one single query list comprehension terminated by a + period.</p> + + <pre> +1> <input>L = [1,2,3],</input> +<input>Bs = erl_eval:add_binding('L', L, erl_eval:new_bindings()),</input> +<input>QH = qlc:string_to_handle("[X+1 || X <- L].", [], Bs),</input> +<input>qlc:eval(QH).</input> +[2,3,4]</pre> + + <p>This function is probably useful mostly when called from + outside of Erlang, for instance from a driver written in C.</p> + </desc> + </func> + + <func> + <name>table(TraverseFun, Options) -> QueryHandle</name> + <fsummary>Return a query handle for a table.</fsummary> + <type> + <v>TraverseFun = TraverseFun0 | TraverseFun1</v> + <v>TraverseFun0 = fun() -> TraverseResult</v> + <v>TraverseFun1 = fun(MatchExpression) -> TraverseResult</v> + <v>TraverseResult = Objects | term()</v> + <v>Objects = [] | [term() | ObjectList]</v> + <v>ObjectList = TraverseFun0 | Objects</v> + <v>Options = [Option] | Option</v> + <v>Option = {format_fun, FormatFun} + | {info_fun, InfoFun} + | {lookup_fun, LookupFun} + | {parent_fun, ParentFun} + | {post_fun, PostFun} + | {pre_fun, PreFun} + | {key_equality, KeyComparison}</v> + <v>FormatFun = undefined | fun(SelectedObjects) -> FormatedTable</v> + <v>SelectedObjects = all + | {all, NElements, DepthFun} + | {match_spec, MatchExpression} + | {lookup, Position, Keys} + | {lookup, Position, Keys, NElements, DepthFun}</v> + <v>NElements = infinity | int() > 0</v> + <v>DepthFun = fun(term()) -> term()</v> + <v>FormatedTable = {Mod, Fun, Args} + | AbstractExpression + | character_list()</v> + <v>InfoFun = undefined | fun(InfoTag) -> InfoValue</v> + <v>InfoTag = indices | is_unique_objects | keypos | num_of_objects</v> + <v>InfoValue = undefined | term()</v> + <v>LookupFun = undefined | fun(Position, Keys) -> LookupResult</v> + <v>LookupResult = [term()] | term()</v> + <v>ParentFun = undefined | fun() -> ParentFunValue</v> + <v>PostFun = undefined | fun() -> void()</v> + <v>PreFun = undefined | fun([PreArg]) -> void()</v> + <v>PreArg = {parent_value, ParentFunValue} | {stop_fun, StopFun}</v> + <v>ParentFunValue = undefined | term()</v> + <v>StopFun = undefined | fun() -> void()</v> + <v>KeyComparison = '=:=' | '=='</v> + <v>Position = int() > 0</v> + <v>Keys = [term()]</v> + <v>Mod = Fun = atom()</v> + <v>Args = [term()]</v> + </type> + <desc> + <p><marker id="table"></marker>Returns a query handle for a + QLC table. In Erlang/OTP there is support for ETS, Dets and + Mnesia tables, but it is also possible to turn many other + data structures into QLC tables. The way to accomplish this + is to let function(s) in the module implementing the data + structure create a query handle by calling + <c>qlc:table/2</c>. The different ways to traverse the table + as well as properties of the table are handled by callback + functions provided as options to <c>qlc:table/2</c>.</p> + + <p>The callback function <c>TraverseFun</c> is used for + traversing the table. It is to return a list of objects + terminated by either <c>[]</c> or a nullary fun to be used + for traversing the not yet traversed objects of the table. + Any other return value is immediately returned as value of + the query evaluation. Unary <c>TraverseFun</c>s are to + accept a match specification as argument. The match + specification is created by the parse transform by analyzing + the pattern of the generator calling <c>qlc:table/2</c> and + filters using variables introduced in the pattern. If the + parse transform cannot find a match specification equivalent + to the pattern and filters, <c>TraverseFun</c> will be + called with a match specification returning every object. + Modules that can utilize match specifications for optimized + traversal of tables should call <c>qlc:table/2</c> with a + unary <c>TraverseFun</c> while other modules can provide a + nullary <c>TraverseFun</c>. <c>ets:table/2</c> is an example + of the former; <c>gb_table:table/1</c> in the <seealso + marker="#implementing_a_qlc_table">Implementing a QLC + table</seealso> section is an example of the latter.</p> + + <p><c>PreFun</c> is a unary callback function that is called + once before the table is read for the first time. If the + call fails, the query evaluation fails. Similarly, the + nullary callback function <c>PostFun</c> is called once + after the table was last read. The return value, which is + caught, is ignored. If <c>PreFun</c> has been called for a + table, <c>PostFun</c> is guaranteed to be called for that + table, even if the evaluation of the query fails for some + reason. The order in which pre (post) functions for + different tables are evaluated is not specified. Other table + access than reading, such as calling <c>InfoFun</c>, is + assumed to be OK at any time. The argument <c>PreArgs</c> is + a list of tagged values. Currently there are two tags, + <c>parent_value</c> and <c>stop_fun</c>, used by Mnesia for + managing transactions. The value of <c>parent_value</c> is + the value returned by <c>ParentFun</c>, or <c>undefined</c> + if there is no <c>ParentFun</c>. <c>ParentFun</c> is called + once just before the call of <c>PreFun</c> in the context of + the process calling <c>eval</c>, <c>fold</c>, or + <c>cursor</c>. The value of <c>stop_fun</c> is a nullary fun + that deletes the cursor if called from the parent, or + <c>undefined</c> if there is no cursor.</p> + + <p><marker id="lookup_fun"></marker>The binary callback + function <c>LookupFun</c> is used for looking up objects in + the table. The first argument <c>Position</c> is the key + position or an indexed position and the second argument + <c>Keys</c> is a sorted list of unique values. The return + value is to be a list of all objects (tuples) such that the + element at <c>Position</c> is a member of <c>Keys</c>. Any + other return value is immediately returned as value of the + query evaluation. <c>LookupFun</c> is called instead of + traversing the table if the parse transform at compile time + can find out that the filters match and compare the element + at <c>Position</c> in such a way that only <c>Keys</c> need + to be looked up in order to find all potential answers. The + key position is obtained by calling <c>InfoFun(keypos)</c> + and the indexed positions by calling + <c>InfoFun(indices)</c>. If the key position can be used for + lookup it is always chosen, otherwise the indexed position + requiring the least number of lookups is chosen. If there is + a tie between two indexed positions the one occurring first + in the list returned by <c>InfoFun</c> is chosen. Positions + requiring more than <seealso + marker="#max_lookup">max_lookup</seealso> lookups are + ignored.</p> + + <p>The unary callback function <c>InfoFun</c> is to return + information about the table. <c>undefined</c> should be + returned if the value of some tag is unknown:</p> + + <list type="bulleted"> + <item><c>indices</c>. Returns a list of indexed + positions, a list of positive integers. + </item> + <item><c>is_unique_objects</c>. Returns <c>true</c> if + the objects returned by <c>TraverseFun</c> are unique. + </item> + <item><c>keypos</c>. Returns the position of the table's + key, a positive integer. + </item> + <item><c>is_sorted_key</c>. Returns <c>true</c> if + the objects returned by <c>TraverseFun</c> are sorted + on the key. + </item> + <item><c>num_of_objects</c>. Returns the number of + objects in the table, a non-negative integer. + </item> + </list> + + <p>The unary callback function <c>FormatFun</c> is used by + <seealso marker="#info">qlc:info/1,2</seealso> for + displaying the call that created the table's query handle. + The default value, <c>undefined</c>, means that + <c>info/1,2</c> displays a call to <c>'$MOD':'$FUN'/0</c>. + It is up to <c>FormatFun</c> to present the selected objects + of the table in a suitable way. However, if a character list + is chosen for presentation it must be an Erlang expression + that can be scanned and parsed (a trailing dot will be added + by <c>qlc:info</c> though). <c>FormatFun</c> is called with + an argument that describes the selected objects based on + optimizations done as a result of analyzing the filters of + the QLC where the call to <c>qlc:table/2</c> occurs. The + possible values of the argument are:</p> + + <list type="bulleted"> + <item><c>{lookup, Position, Keys, NElements, DepthFun}</c>. + <c>LookupFun</c> is used for looking up objects in the + table. + </item> + <item><c>{match_spec, MatchExpression}</c>. No way of + finding all possible answers by looking up keys was + found, but the filters could be transformed into a + match specification. All answers are found by calling + <c>TraverseFun(MatchExpression)</c>. + </item> + <item><c>{all, NElements, DepthFun}</c>. No optimization was + found. A match specification matching all objects will be + used if <c>TraverseFun</c> is unary. + </item> + </list> + + <p><c>NElements</c> is the value of the <c>info/1,2</c> option + <c>n_elements</c>, and <c>DepthFun</c> is a function that + can be used for limiting the size of terms; calling + <c>DepthFun(Term)</c> substitutes <c>'...'</c> for parts of + <c>Term</c> below the depth specified by the <c>info/1,2</c> + option <c>depth</c>. If calling <c>FormatFun</c> with an + argument including <c>NElements</c> and <c>DepthFun</c> + fails, <c>FormatFun</c> is called once again with an + argument excluding <c>NElements</c> and <c>DepthFun</c> + (<c>{lookup, Position, Keys}</c> or + <c>all</c>).</p> + + <p><marker id="key_equality"></marker>The value of + <c>key_equality</c> is to be <c>'=:='</c> if the table + considers two keys equal if they match, and to be + <c>'=='</c> if two keys are equal if they compare equal. The + default is <c>'=:='</c>.</p> + + <p>See <seealso marker="ets#qlc_table">ets(3)</seealso>, + <seealso marker="dets#qlc_table">dets(3)</seealso> and + <seealso marker="mnesia:mnesia#qlc_table">mnesia(3)</seealso> + for the various options recognized by <c>table/1,2</c> in + respective module.</p> + </desc> + </func> + + </funcs> + + <section> + <title>See Also</title> + <p><seealso marker="dets">dets(3)</seealso>, + <seealso marker="doc/reference_manual:users_guide"> + Erlang Reference Manual</seealso>, + <seealso marker="erl_eval">erl_eval(3)</seealso>, + <seealso marker="erts:erlang">erlang(3)</seealso>, + <seealso marker="ets">ets(3)</seealso>, + <seealso marker="kernel:file">file(3)</seealso>, + <seealso marker="error_logger:file">error_logger(3)</seealso>, + <seealso marker="file_sorter">file_sorter(3)</seealso>, + <seealso marker="mnesia:mnesia">mnesia(3)</seealso>, + <seealso marker="doc/programming_examples:users_guide"> + Programming Examples</seealso>, + <seealso marker="shell">shell(3)</seealso></p> + </section> +</erlref> + |