How to implement a driver

Jakob C, 2000-11-28

Copyright 2001-2011 Ericsson AB. All Rights Reserved. The contents of this file are subject to the Erlang Public License, Version 1.1, (the "License"); you may not use this file except in compliance with the License. You should have received a copy of the Erlang Public License along with this software. If not, it can be retrieved online at http://www.erlang.org/. Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License.

This document was written a long time ago. Much of it is still interesting since it explains important concepts, but it was written for an older driver interface, so the examples do not work anymore. The reader is encouraged to read the erl_driver and driver_entry documentation.

Introduction

This chapter tells you how to build your own driver for Erlang.

A driver in Erlang is a library written in C that is linked to the Erlang emulator and called from Erlang. Drivers can be used when C is more suitable than Erlang, to speed things up, or to provide access to OS resources not directly accessible from Erlang.

A driver can be dynamically loaded, as a shared library (known as a DLL on Windows), or statically loaded, linked with the emulator when it is compiled and linked. Only dynamically loaded drivers are described here; statically linked drivers are beyond the scope of this chapter.

When a driver is loaded, it is executed in the context of the emulator and shares the same memory and the same thread. This means that all operations in the driver must be non-blocking, and that any crash in the driver will bring the whole emulator down. In short: you have to be extremely careful!

Sample driver

This is a simple driver for accessing a postgres database using the libpq C client library. Postgres is used because it's free and open source. For information on postgres, refer to the website www.postgres.org.

The driver is synchronous: it uses the synchronous calls of the client library. This is only for simplicity and is generally not good, since it halts the emulator while waiting for the database. This is improved on below with an asynchronous sample driver.

The code is quite straightforward: all communication between Erlang and the driver is done with port_control/3, and the driver returns data back using the rbuf result buffer.

An Erlang driver only exports one function: the driver entry function. This is defined with a macro, DRIVER_INIT, and returns a pointer to a C struct containing the entry points that are called from the emulator. The struct defines the entries that the emulator uses to call the driver, with a NULL pointer for entries that are not defined and used by the driver.
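The idea of a table of entry points with NULL for unused entries can be sketched in plain C as follows. This is only an illustration: sketch_entry, sketch_driver_init and call_init_if_present are hypothetical names, and the real structure in erl_driver.h has more fields and different signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a driver entry table.
 * It is NOT the real structure from erl_driver.h. */
typedef struct {
    int   (*init)(void);                  /* called once at load time */
    void *(*start)(const char *command);  /* called when a port is opened */
    void  (*stop)(void *data);            /* called when the port is closed */
    const char *driver_name;
} sketch_entry;

static int started = 0;

static void *sketch_start(const char *command)
{
    (void)command;
    started++;
    return &started;
}

static void sketch_stop(void *data)
{
    (void)data;
    started--;
}

static sketch_entry entry = {
    NULL,           /* init: not used by this driver */
    sketch_start,
    sketch_stop,
    "sketch_drv"
};

/* The single exported function: it only hands out the table. */
sketch_entry *sketch_driver_init(void)
{
    return &entry;
}

/* A caller must skip entries that are NULL. */
int call_init_if_present(const sketch_entry *e)
{
    assert(e != NULL);
    if (e->init == NULL)
        return 0;
    return e->init();
}
```

Everything except the function that returns the table is static, which mirrors the point that a driver exports a single symbol.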

The start entry is called when the driver is opened as a port with open_port/2. Here we allocate memory for a user data structure. This user data will be passed every time the emulator calls us. First we store the driver handle, because it is needed in subsequent calls. We allocate memory for the connection handle that is used by LibPQ. We also set the port to return allocated driver binaries, by setting the flag PORT_CONTROL_FLAG_BINARY, calling set_port_control_flags. (This is because we don't know whether our data will fit in the result buffer of control, which has a default size set up by the emulator, currently 64 bytes.)
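The concern about a fixed-size result buffer can be illustrated with a generic "use the buffer if it fits, otherwise allocate" helper. This is a sketch of the decision only, not the erl_driver API: put_result is a hypothetical name, and malloc stands in for whatever allocator a real driver would use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of erl_driver.h.
 * Copies len bytes of result data into *rbuf (capacity rlen) if it fits;
 * otherwise allocates a larger buffer and points *rbuf at it.
 * Returns 0 if the original buffer was used, 1 if a new buffer was
 * allocated, and -1 if allocation failed. */
int put_result(const char *src, size_t len, char **rbuf, size_t rlen)
{
    assert(src != NULL && rbuf != NULL && *rbuf != NULL);
    if (len <= rlen) {
        memcpy(*rbuf, src, len);
        return 0;
    }
    char *bigger = malloc(len);   /* a real driver would use driver_alloc */
    if (bigger == NULL)
        return -1;
    memcpy(bigger, src, len);
    *rbuf = bigger;
    return 1;
}
```

The sample driver avoids this decision entirely by always returning an allocated binary.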

There is an entry init which is called when the driver is loaded, but we don't use this, since it is executed only once, and we want to have the possibility of several instances of the driver.

The stop entry is called when the port is closed.

The control entry is called from the emulator when the Erlang code calls port_control/3, to do the actual work. We have defined a simple set of commands: connect to log in to the database, disconnect to log out, and select to send an SQL query and get the result. All results are returned through rbuf. The library ei in erl_interface is used to encode data in binary term format. The result is returned to the emulator as binary terms, so binary_to_term is called in Erlang to convert the result to term form.
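The command dispatch described above can be sketched as a switch on the command code. This is a simplified, self-contained illustration: the command values, sketch_state and sketch_control are hypothetical, and the replies are plain text rather than terms encoded with ei.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical command codes chosen for this sketch only. */
enum { CMD_CONNECT = 1, CMD_DISCONNECT = 2, CMD_SELECT = 3 };

typedef struct { int connected; } sketch_state;

/* Writes a short textual reply into out and returns its length,
 * or -1 for an unknown command or a too-small buffer. */
int sketch_control(sketch_state *st, unsigned int command,
                   char *out, size_t out_size)
{
    const char *reply;
    assert(st != NULL && out != NULL);
    switch (command) {
    case CMD_CONNECT:    st->connected = 1; reply = "ok"; break;
    case CMD_DISCONNECT: st->connected = 0; reply = "ok"; break;
    case CMD_SELECT:     reply = st->connected ? "rows" : "error"; break;
    default:             return -1;
    }
    size_t n = strlen(reply);
    if (n > out_size)
        return -1;
    memcpy(out, reply, n);   /* reply is not zero-terminated in out */
    return (int)n;
}
```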

The code is available in in the directory of .

The driver entry contains the functions that will be called by the emulator. In our simple example, we only provide start, stop and control.

We have a structure to store state needed by the driver; in this case we only need to keep the database connection.

These are control codes we have defined.

This just returns the driver structure. The macro DRIVER_INIT defines the only exported function. All the other functions are static, and will not be exported from the library.

Here we do some initialization; start is called from open_port. The data will be passed to control and stop.

    data->conn = NULL;
    set_port_control_flags(port, PORT_CONTROL_FLAG_BINARY);
    return (ErlDrvData)data;
}

We call disconnect to log out from the database. (This should have been done from Erlang, but just in case.)

We use the binary format only to return data to the emulator; input data is a string parameter for connect and select. The returned data consists of Erlang terms.

Two utility functions are used to make the code shorter. The first duplicates the string and zero-terminates it, since the postgres client library wants that. The second takes an ei_x_buff buffer, allocates a binary and copies the data there. This binary is returned in *rbuf. (Note that this binary is freed by the emulator, not by us.)
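A string-duplicating utility of the kind described above might look like the following sketch. The name dup_zero_terminated is hypothetical, and malloc is used so that the example is self-contained; a driver would normally allocate with driver_alloc instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies len bytes from buf into a newly allocated, zero-terminated
 * string. The input does not need to be zero-terminated. Returns NULL
 * if allocation fails; the caller frees the result. */
char *dup_zero_terminated(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    char *s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    if (len > 0)
        memcpy(s, buf, len);
    s[len] = '\0';
    return s;
}
```

The extra byte matters because data arriving from the emulator is a length-counted buffer, while libpq expects ordinary C strings.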

The connect function is where we log in to the database. If the connection was successful, we store the connection handle in our driver data and return ok. Otherwise, we return the error message from postgres and store NULL in the driver data.

    data->conn = conn;
    return 0;
}

If we are connected (if the connection handle is not NULL), we log out from the database. We need to check if we should encode an ok, since we might get here from the stop function, which doesn't return data to the emulator.

    if (data->conn == NULL)
        return 0;
    PQfinish(data->conn);
    data->conn = NULL;
    if (x != NULL)
        encode_ok(x);
    return 0;
}

We execute a query and encode the result. Encoding is done in another C module, which is also provided as sample code.

    PGresult* res = PQexec(data->conn, s);
    encode_result(x, res, data->conn);
    PQclear(res);
    return 0;
}

Here we simply check the result from postgres, and if it's data we encode it as lists of lists with column data. Everything from postgres is C strings, so we just use ei_x_encode_string to send the result as strings to Erlang. (The head of the list contains the column names.)

Compiling and linking the sample driver

The driver should be compiled and linked to a shared library (DLL on Windows). With gcc this is done with the link flags -shared and -fpic. Since we use the ei library, we should include it too. There are several versions of ei, compiled for debug or non-debug and multi-threaded or single-threaded. In the makefile for the samples the directory is used for the ei library, meaning that we use the non-debug, single-threaded version.

Calling a driver as a port in Erlang

Before a driver can be called from Erlang, it must be loaded and opened. Loading is done using the erl_ddll module (erl_ddll, which loads dynamic drivers, is actually a driver itself). If loading is ok, the port can be opened with open_port/2. The port name must match the name of the shared library and the name in the driver entry structure.

When the port has been opened, the driver can be called. In the example, we don't have any data from the port, only the return value from port_control.

The following code is the Erlang part of the synchronous postgres driver, pg_sync.erl.

connect(ConnectStr) ->
    case erl_ddll:load_driver(".", "pg_sync") of
        ok -> ok;
        {error, already_loaded} -> ok;
        E -> exit({error, E})
    end,
    Port = open_port({spawn, ?MODULE}, []),
    case binary_to_term(port_control(Port, ?DRV_CONNECT, ConnectStr)) of
        ok -> {ok, Port};
        Error -> Error
    end.

disconnect(Port) ->
    R = binary_to_term(port_control(Port, ?DRV_DISCONNECT, "")),
    port_close(Port),
    R.

select(Port, Query) ->
    binary_to_term(port_control(Port, ?DRV_SELECT, Query)).

The API is simple: connect/1 loads the driver, opens it and logs on to the database, returning the Erlang port if successful; select/2 sends a query to the driver and returns the result; disconnect/1 closes the database connection and the driver. (It does not unload it, however.) The connection string should be a connection string for postgres.

The driver is loaded with erl_ddll:load_driver/2, and if this is successful, or if it's already loaded, it is opened. This will call the start function in the driver.

We use the port_control/3 function for all calls into the driver. The result from the driver is returned immediately, and converted to terms by calling binary_to_term. (We trust that the terms returned from the driver are well-formed; otherwise the binary_to_term calls could be contained in a catch.)

Sample asynchronous driver

Sometimes database queries can take a long time to complete. In our synchronous driver, the emulator halts while the driver is doing its job. This is often not acceptable, since no other Erlang process gets a chance to do anything. To improve on our postgres driver, we reimplement it using the asynchronous calls in LibPQ.

The asynchronous version of the driver is in the sample files and .

Here some things have changed from the synchronous driver: we use a single function for the ready_input and ready_output entries, which is called from the emulator only when there is input to be read from the socket. (Actually, the socket is used in a select function inside the emulator, and when the socket is signalled, indicating there is data to read, the ready_input entry is called. More on this below.)

Our driver data is also extended: we keep track of the socket used for communication with postgres, and also the port, which is needed when we send data to the port with driver_output. We have a flag connecting to tell whether the driver is waiting for a connection or waiting for the result of a query. (This is needed since the same entry function is called both when connecting and when there is a query result.)

        driver_output(data->port, x.buff, x.index);
        ei_x_free(&x);
    }
    PQconnectPoll(conn);
    int socket = PQsocket(conn);
    data->socket = socket;
    driver_select(data->port, (ErlDrvEvent)socket, DO_READ, 1);
    driver_select(data->port, (ErlDrvEvent)socket, DO_WRITE, 1);
    data->conn = conn;
    data->connecting = 1;
    return 0;
}

The connect function looks a bit different too. We connect using the asynchronous PQconnectStart function. After the connection is started, we retrieve the socket for the connection with PQsocket. This socket is used with the driver_select function to wait for connection. When the socket is ready for input or for output, our ready_input/ready_output function will be called.

Note that we only return data (with driver_output) if there is an error here; otherwise we wait for the connection to be completed, in which case our ready_input/ready_output function will be called.

    data->connecting = 0;
    PGconn* conn = data->conn;
    /* if there's an error return it now */
    if (PQsendQuery(conn, s) == 0) {
        ei_x_buff x;
        ei_x_new_with_version(&x);
        encode_error(&x, conn);
        driver_output(data->port, x.buff, x.index);
        ei_x_free(&x);
    }
    /* else wait for ready_output to get results */
    return 0;
}

The select function initiates a query, and returns if there is no immediate error. The actual result will be returned when our ready_input/ready_output function is called.

    PGconn* conn = data->conn;
    ei_x_buff x;
    ei_x_new_with_version(&x);
    if (data->connecting) {
        ConnStatusType status;
        PQconnectPoll(conn);
        status = PQstatus(conn);
        if (status == CONNECTION_OK)
            encode_ok(&x);
        else if (status == CONNECTION_BAD)
            encode_error(&x, conn);
    } else {
        PQconsumeInput(conn);
        if (PQisBusy(conn))
            return;
        res = PQgetResult(conn);
        encode_result(&x, res, conn);
        PQclear(res);
        for (;;) {
            res = PQgetResult(conn);
            if (res == NULL)
                break;
            PQclear(res);
        }
    }
    if (x.index > 1) {
        driver_output(data->port, x.buff, x.index);
        if (data->connecting)
            driver_select(data->port, (ErlDrvEvent)data->socket, DO_WRITE, 0);
    }
    ei_x_free(&x);
}

This function will be called when the socket we got from postgres is ready for input or output. Here we first check if we are connecting to the database. In that case we check the connection status and return ok if the connection is successful, or error if it's not. If the connection is not yet established, we simply return; the function will be called again.

If we have a result from a connect, indicated by having data in the x buffer, we no longer need to select on output (ready_output), so we remove this by calling driver_select.

If we're not connecting, we're waiting for results from a PQsendQuery, so we get the result and return it. The encoding is done with the same functions as in the earlier example.

We should add error handling here, for instance checking that the socket is still open, but this is just a simple example.
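The control flow of the readiness callback described above can be reduced to a small state machine. The following sketch replaces libpq and the emulator with plain parameters, so all names in it (poll_status, sketch_data, sketch_ready) are hypothetical stand-ins rather than real API.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the connection status reported by the client library. */
typedef enum { POLL_PENDING, POLL_OK, POLL_BAD } poll_status;

typedef struct {
    int connecting;        /* 1 while waiting for the connection */
    int select_on_output;  /* 1 while registered for write readiness */
} sketch_data;

/* Returns the reply to send to the port, or NULL if there is nothing
 * to send yet (the callback will simply be invoked again later). */
const char *sketch_ready(sketch_data *d, poll_status status, int query_busy)
{
    const char *reply = NULL;
    assert(d != NULL);
    if (d->connecting) {
        if (status == POLL_OK)
            reply = "ok";
        else if (status == POLL_BAD)
            reply = "error";
        /* POLL_PENDING: leave reply as NULL */
    } else {
        if (query_busy)
            return NULL;
        reply = "result";
    }
    /* A connect outcome means write readiness is no longer needed. */
    if (reply != NULL && d->connecting)
        d->select_on_output = 0;
    return reply;
}
```

As in the sample, the connecting flag is not cleared here; it is reset when the next query is started.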

The Erlang part of the asynchronous driver consists of the sample file pg_async.erl.

connect(ConnectStr) ->
    case erl_ddll:load_driver(".", "pg_async") of
        ok -> ok;
        {error, already_loaded} -> ok;
        _ -> exit({error, could_not_load_driver})
    end,
    Port = open_port({spawn, ?MODULE}, [binary]),
    port_control(Port, ?DRV_CONNECT, ConnectStr),
    case return_port_data(Port) of
        ok -> {ok, Port};
        Error -> Error
    end.

disconnect(Port) ->
    port_control(Port, ?DRV_DISCONNECT, ""),
    R = return_port_data(Port),
    port_close(Port),
    R.

select(Port, Query) ->
    port_control(Port, ?DRV_SELECT, Query),
    return_port_data(Port).

return_port_data(Port) ->
    receive
        {Port, {data, Data}} ->
            binary_to_term(Data)
    end.

The Erlang code is slightly different. This is because we don't return the result synchronously from port_control; instead we get it from the driver as data in the message queue. The function return_port_data above receives data from the port. Since the data is in binary format, we use binary_to_term to convert it to an Erlang term. Note that the driver is opened in binary mode (open_port is called with the option [binary]). This means that data sent from the driver to the emulator is sent as binaries. Without the binary option, they would have been lists of integers.

An asynchronous driver using driver_async

As a final example we demonstrate the use of driver_async. We also use the driver term interface. The driver is written in C++. This enables us to use an algorithm from STL. We will use the next_permutation algorithm to get the next permutation of a list of integers. For large lists (more than 100000 elements), this will take some time, so we will perform this as an asynchronous task.
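For readers unfamiliar with the algorithm, here is a plain C sketch of the classic lexicographic next-permutation step. It is not the STL implementation the driver calls; next_perm_ints is a hypothetical name, and the return convention (1 if a next permutation exists, otherwise 0 after resetting to ascending order) is chosen to match the usual behaviour of the STL function.

```c
#include <assert.h>
#include <stddef.h>

/* Reverses a[lo..hi) in place. */
static void reverse_range(int *a, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        int tmp = a[lo];
        a[lo] = a[hi - 1];
        a[hi - 1] = tmp;
        lo++;
        hi--;
    }
}

/* Rearranges a[0..n) into the next lexicographic permutation.
 * Returns 1 on success. If a is already the last permutation, it is
 * reset to ascending order and 0 is returned. */
int next_perm_ints(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n < 2)
        return 0;
    /* Find the start of the longest non-increasing suffix. */
    size_t i = n - 1;
    while (i > 0 && a[i - 1] >= a[i])
        i--;
    if (i == 0) {
        reverse_range(a, 0, n);
        return 0;
    }
    /* Find the rightmost element greater than the pivot a[i-1]. */
    size_t j = n - 1;
    while (a[j] <= a[i - 1])
        j--;
    int tmp = a[i - 1];
    a[i - 1] = a[j];
    a[j] = tmp;
    /* The suffix is still non-increasing; make it ascending. */
    reverse_range(a, i, n);
    return 1;
}
```

Each call is linear in the list length, which is why a very large list is worth moving off the main emulator thread.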

The asynchronous API for drivers is quite complicated. First of all, the work must be prepared. In our example we do this in output. We could have used control just as well, but we want some variation in our examples. In our driver, we allocate a structure that contains anything that's needed for the asynchronous task to do the work. This is done in the main emulator thread. Then the asynchronous function is called from a driver thread, separate from the main emulator thread. Note that the driver functions are not reentrant, so they shouldn't be used there. Finally, after the function is completed, the driver callback ready_async is called from the main emulator thread; this is where we return the result to Erlang. (We can't return the result from within the asynchronous function, since we can't call the driver functions.)

The code below is from the sample file .

The driver entry looks like before, but also contains the callback ready_async.

The output function allocates the work area of the asynchronous function. Since we use C++, we use a struct, and stuff the data in it. We have to copy the original data, since it is not valid after we have returned from the output function, and the do_perm function will be called later, and from another thread. We return no data here; instead it will be sent later from the ready_async callback.

The async_data will be passed to the do_perm function. The last argument to driver_async is a function that frees the data; it's only used if the task is cancelled programmatically.

struct our_async_data {
    bool prev;
    vector<int> data;
    our_async_data(ErlDrvPort p, int command, const char* buf, int len);
};

our_async_data::our_async_data(ErlDrvPort p, int command,
                               const char* buf, int len)
    : prev(command == 2),
      data((int*)buf, (int*)buf + len / sizeof(int))
{
}

static void do_perm(void* async_data);

static void output(ErlDrvData drv_data, char *buf, int len)
{
    if (*buf < 1 || *buf > 2) return;
    ErlDrvPort port = reinterpret_cast<ErlDrvPort>(drv_data);
    void* async_data = new our_async_data(port, *buf, buf+1, len);
    driver_async(port, NULL, do_perm, async_data, do_free);
}

In do_perm we simply do the work, operating on the structure that was allocated in output.

static void do_perm(void* async_data)
{
    our_async_data* d = reinterpret_cast<our_async_data*>(async_data);
    if (d->prev)
        prev_permutation(d->data.begin(), d->data.end());
    else
        next_permutation(d->data.begin(), d->data.end());
}

In the ready_async function, the output is sent back to the emulator. We use the driver term format instead of ei. This is the only way to send Erlang terms directly from a driver, without having the Erlang code call binary_to_term/1. In our simple example this works well, and we don't need to use ei to handle the binary term format.

When the data is returned we deallocate our data.

static void ready_async(ErlDrvData drv_data, ErlDrvThreadData async_data)
{
    ErlDrvPort port = reinterpret_cast<ErlDrvPort>(drv_data);
    our_async_data* d = reinterpret_cast<our_async_data*>(async_data);
    int n = d->data.size(), result_n = n*2 + 3;
    ErlDrvTermData *result = new ErlDrvTermData[result_n], *rp = result;
    for (vector<int>::iterator i = d->data.begin();
         i != d->data.end(); ++i) {
        *rp++ = ERL_DRV_INT;
        *rp++ = *i;
    }
    *rp++ = ERL_DRV_NIL;
    *rp++ = ERL_DRV_LIST;
    *rp++ = n+1;
    driver_output_term(port, result, result_n);
    delete[] result;
    delete d;
}
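The layout of the term array built above, two words per integer followed by a nil tag, a list tag and the element count plus one, can be checked with a self-contained C sketch. The tag values and the names term_word and build_int_list_term are stand-ins invented for this example; the real ERL_DRV_INT, ERL_DRV_NIL, ERL_DRV_LIST and ErlDrvTermData come from erl_driver.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in tags for this sketch only; the real constants differ. */
enum { TAG_INT = 1, TAG_NIL = 2, TAG_LIST = 3 };

typedef long term_word;   /* stand-in for ErlDrvTermData */

/* Builds the flat description of a proper list of n integers:
 *   INT v0, INT v1, ..., NIL, LIST n+1
 * The array has n*2 + 3 words. The count after LIST is n+1 because
 * the terminating nil counts as an element of the construction.
 * Returns NULL if allocation fails; the caller frees the array. */
term_word *build_int_list_term(const int *v, size_t n, size_t *out_len)
{
    assert(out_len != NULL && (v != NULL || n == 0));
    size_t len = n * 2 + 3;
    term_word *t = malloc(len * sizeof *t);
    if (t == NULL)
        return NULL;
    term_word *p = t;
    for (size_t i = 0; i < n; i++) {
        *p++ = TAG_INT;
        *p++ = v[i];
    }
    *p++ = TAG_NIL;
    *p++ = TAG_LIST;
    *p++ = (term_word)(n + 1);
    *out_len = len;
    return t;
}
```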

This driver is called like the others from Erlang. However, since we use driver_output_term, there is no need to call binary_to_term. The Erlang code is in the sample file .

The input is changed into a list of integers and sent to the driver.

load() ->
    case whereis(next_perm) of
        undefined ->
            case erl_ddll:load_driver(".", "next_perm") of
                ok -> ok;
                {error, already_loaded} -> ok;
                E -> exit(E)
            end,
            Port = open_port({spawn, "next_perm"}, []),
            register(next_perm, Port);
        _ ->
            ok
    end.

list_to_integer_binaries(L) ->
    [<<I:32/integer-native>> || I <- L].

next_perm(L) ->
    next_perm(L, 1).

prev_perm(L) ->
    next_perm(L, 2).

next_perm(L, Nxt) ->
    load(),
    B = list_to_integer_binaries(L),
    port_control(next_perm, Nxt, B),
    receive
        Result ->
            Result
    end.

all_perm(L) ->
    New = prev_perm(L),
    all_perm(New, L, [New]).

all_perm(L, L, Acc) ->
    Acc;
all_perm(L, Orig, Acc) ->
    New = prev_perm(L),
    all_perm(New, Orig, [New | Acc]).