From 68d53c01b0b8e9a007a6a30158c19e34b2d2a34e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B6rn=20Gustavsson?=
Date: Wed, 18 May 2016 15:53:35 +0200
Subject: Update STDLIB documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Language cleaned up by the technical writers xsipewe and tmanevik from Combitech. Proofreading and corrections by Björn Gustavsson and Hans Bolinder.
---
 lib/stdlib/doc/src/io_protocol.xml | 1172 +++++++++++++++++++-----------------
 1 file changed, 608 insertions(+), 564 deletions(-)

diff --git a/lib/stdlib/doc/src/io_protocol.xml b/lib/stdlib/doc/src/io_protocol.xml
index f2a669a49a..84b5f62c7f 100644
--- a/lib/stdlib/doc/src/io_protocol.xml
+++ b/lib/stdlib/doc/src/io_protocol.xml
@@ -23,7 +23,7 @@
- The Erlang I/O-protocol
+ The Erlang I/O Protocol
 Patrik Nyblom
@@ -34,183 +34,217 @@
 io_protocol.xml
- -

The I/O-protocol in Erlang specifies a way for a client to communicate -with an I/O server and vice versa. The I/O server is a process that handles -the requests and performs the requested task on e.g. an IO device. The -client is any Erlang process wishing to read or write data from/to the -IO device.

- -

The common I/O-protocol has been present in OTP since the -beginning, but has been fairly undocumented and has also somewhat -evolved over the years. In an addendum to Robert Virdings rationale -the original I/O-protocol is described. This document describes the -current I/O-protocol.

- -

The original I/O-protocol was simple and flexible. Demands for spacial -and execution time efficiency has triggered extensions to the protocol -over the years, making the protocol larger and somewhat less easy to -implement than the original. It can certainly be argued that the -current protocol is too complex, but this text describes how it looks -today, not how it should have looked.

- -

The basic ideas from the original protocol still hold. The I/O server -and client communicate with one single, rather simplistic protocol and -no server state is ever present in the client. Any I/O server can be -used together with any client code and client code need not be aware -of the actual IO device the I/O server communicates with.

- -
-Protocol Basics - -

As described in Robert's paper, I/O servers and clients communicate using -io_request/io_reply tuples as follows:

- -

{io_request, From, ReplyAs, Request}
-{io_reply, ReplyAs, Reply}

- -

The client sends an io_request tuple to the I/O server and -the server eventually sends a corresponding io_reply tuple.

- - -From is the pid() of the client, the process which -the I/O server sends the IO reply to. - -ReplyAs can be any datum and is returned in the corresponding -io_reply. The io module monitors -the I/O server, and uses the monitor reference as the ReplyAs datum. -A more complicated client -could have several outstanding I/O requests to the same I/O server and -would then use different references (or something else) to differentiate among -the incoming IO replies. The ReplyAs element should be considered -opaque by the I/O server. Note that the pid() of the I/O server is not -explicitly present in the io_reply tuple. The reply can be sent from any -process, not necessarily the actual I/O server. The ReplyAs element is -the only thing that connects one I/O request with an I/O-reply. - -Request and Reply are described below. - - -

When an I/O server receives an io_request tuple, it acts upon the actual -Request part and eventually sends an io_reply tuple with the corresponding -Reply part.

-
-
-Output Requests - -

To output characters on an IO device, the following Requests exist:

- -

-{put_chars, Encoding, Characters}
-{put_chars, Encoding, Module, Function, Args} -

- -Encoding is either unicode or latin1, meaning that the - characters are (in case of binaries) encoded as either UTF-8 or - ISO-latin-1 (pure bytes). A well behaved I/O server should also - return error if list elements contain integers > 255 when - Encoding is set to latin1. Note that this does not in any way tell - how characters should be put on the actual IO device or how the - I/O server should handle them. Different I/O servers may handle the - characters however they want, this simply tells the I/O server which - format the data is expected to have. In the Module/Function/Args - case, Encoding tells which format the designated function - produces. Note that byte-oriented data is simplest sent using the ISO-latin-1 - encoding. - -Characters are the data to be put on the IO device. If Encoding is - latin1, this is an iolist(). If Encoding is unicode, this is an - Erlang standard mixed Unicode list (one integer in a list per - character, characters in binaries represented as UTF-8). - -Module, Function, and Args denote a function which will be called to - produce the data (like io_lib:format/2). Args is a list of arguments - to the function. The function should produce data in the given - Encoding. The I/O server should call the function as - apply(Mod, Func, Args) and will put the returned data on the IO device as if it was sent - in a {put_chars, Encoding, Characters} request. If the function - returns anything else than a binary or list or throws an exception, - an error should be sent back to the client. - - -

The I/O server replies to the client with an io_reply tuple where the Reply -element is one of:

-

-ok
-{error, Error} -

- - -Error describes the error to the client, which may do whatever - it wants with it. The Erlang io - module typically returns it as is. - - -

For backward compatibility the following Requests should also be -handled by an I/O server (these requests should not be present after -R15B of OTP):

-

-{put_chars, Characters}
-{put_chars, Module, Function, Args} -

- -

These should behave as {put_chars, latin1, Characters} and -{put_chars, latin1, Module, Function, Args} respectively.

-
-
-Input Requests - -

To read characters from an IO device, the following Requests exist:

- -

{get_until, Encoding, Prompt, Module, Function, ExtraArgs}

- - -Encoding denotes how data is to be sent back to the client and - what data is sent to the function denoted by - Module/Function/ExtraArgs. If the function supplied returns data as a - list, the data is converted to this encoding. If however the - function supplied returns data in some other format, no conversion - can be done and it is up to the client supplied function to return - data in a proper way. If Encoding is latin1, lists of integers - 0..255 or binaries containing plain bytes are sent back to the - client when possible; if Encoding is unicode, lists with integers in - the whole Unicode range or binaries encoded in UTF-8 are sent to the - client. The user supplied function will always see lists of integers, never - binaries, but the list may contain numbers > 255 if the Encoding is - unicode. - -Prompt is a list of characters (not mixed, no binaries) or an atom - to be output as a prompt for input on the IO device. Prompt is - often ignored by the I/O server and if set to '' it should always - be ignored (and result in nothing being written to the IO device). - -

Module, Function, and ExtraArgs denote a function and arguments to - determine when enough data is written. The function should take two - additional arguments, the last state, and a list of characters. The - function should return one of:

-

-{done, Result, RestChars}
-{more, Continuation} -

-

The Result can be any Erlang term, but if it is a list(), the - I/O server may convert it to a binary() of appropriate format before - returning it to the client, if the I/O server is set in binary mode (see - below).

- -

The function will be called with the data the I/O server finds on - its IO device, returning {done, Result, RestChars} when enough data is - read (in which case Result is sent to the client and RestChars is - kept in the I/O server as a buffer for subsequent input) or - {more, Continuation}, indicating that more characters are needed to - complete the request. The Continuation will be sent as the state in - subsequent calls to the function when more characters are - available. When no more characters are available, the function - shall return {done, eof, Rest}. - The initial state is the empty list and the data when an - end of file is reached on the IO device is the atom eof. An emulation - of the get_line request could be (inefficiently) implemented using - the following functions:

- +

The I/O protocol in Erlang enables bi-directional communication between + clients and servers.

+ + + +

The I/O server is a process that handles the requests and performs + the requested task on, for example, an I/O device.

+
+ +

The client is any Erlang process wishing to read or write data from/to + the I/O device.

+
+
+ +

The common I/O protocol has been present in OTP since the beginning, but + has been undocumented and has also evolved over the years. In an + addendum to Robert Virding's rationale, the original I/O protocol is + described. This section describes the current I/O protocol.

+ +

The original I/O protocol was simple and flexible. Demands for memory + efficiency and execution time efficiency have triggered extensions + to the protocol over the years, making the protocol larger and somewhat + less easy to implement than the original. It can certainly be argued that + the current protocol is too complex, but this section describes how it + looks today, not how it should have looked.

+ +

The basic ideas from the original protocol still hold. The I/O server + and client communicate with one single, rather simplistic protocol and no + server state is ever present in the client. Any I/O server can be used + together with any client code, and the client code does not need to be + aware of the I/O device that the I/O server communicates with.

+ +
+ Protocol Basics +

As described in Robert's paper, I/O servers and clients communicate + using io_request/io_reply tuples as follows:

+ +
+{io_request, From, ReplyAs, Request}
+{io_reply, ReplyAs, Reply}
+ +

The client sends an io_request tuple to the I/O server and the + server eventually sends a corresponding io_reply tuple.

+ + + +

From is the pid() of the client, the process which + the I/O server sends the I/O reply to.

+
+ +

ReplyAs can be any datum and is returned in the + corresponding io_reply. The + io module monitors the + I/O server and uses the monitor reference as the ReplyAs + datum. A more complicated client can have many outstanding I/O + requests to the same I/O server and can use different references (or + something else) to differentiate among the incoming I/O replies. + Element ReplyAs is to be considered opaque by the I/O + server.

+

Notice that the pid() of the I/O server is not explicitly + present in tuple io_reply. The reply can be sent from any + process, not necessarily the actual I/O server.

+
+ +

Request and Reply are described below.

+
+
+ +

When an I/O server receives an io_request tuple, it acts upon the + Request part and eventually sends an io_reply tuple with + the corresponding Reply part.
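As a minimal sketch of the exchange described above, a client can send the io_request tuple itself and wait for the matching io_reply. The module name io_client_sketch and the function request/2 are made up for this illustration. Using a monitor reference as ReplyAs follows what the io module is described as doing; the error term returned when the server dies is an assumption.

```erlang
-module(io_client_sketch).
-export([request/2]).

%% Send one Request to IoServer and wait for the corresponding reply.
%% The monitor reference doubles as the opaque ReplyAs datum.
request(IoServer, Request) ->
    Ref = erlang:monitor(process, IoServer),
    IoServer ! {io_request, self(), Ref, Request},
    receive
        {io_reply, Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            Reply;
        {'DOWN', Ref, process, _Pid, Reason} ->
            %% Hypothetical error term, not defined by the protocol.
            {error, {io_server_down, Reason}}
    end.
```

Because the reply is matched on Ref only, it does not matter which process actually sends the io_reply tuple.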

+
+ +
+ Output Requests +

To output characters on an I/O device, the following Requests + exist:

+ +
+{put_chars, Encoding, Characters}
+{put_chars, Encoding, Module, Function, Args}
+ + + +

Encoding is unicode or latin1, meaning that the + characters are (in case of binaries) encoded as UTF-8 or ISO Latin-1 + (pure bytes). A well-behaved I/O server is also to return an error + indication if list elements contain integers > 255 + when Encoding is set to latin1.

+

Notice that this does not in any way tell how characters are to be + put on the I/O device or handled by the I/O server. Different I/O + servers can handle the characters however they want; Encoding only tells + the I/O server which format the data is expected to have. In the + Module/Function/Args case, Encoding tells + which format the designated function produces.

+

Notice also that byte-oriented data is most simply sent using the ISO + Latin-1 encoding.

+
+ +

Characters are the data to be put on the I/O device. If + Encoding is latin1, this is an iolist(). If + Encoding is unicode, this is an Erlang standard mixed + Unicode list (one integer in a list per character, characters in + binaries represented as UTF-8).

+
+ +

Module, Function, and Args denote a function + that is called to produce the data (like + io_lib:format/2). +

+

Args is a list of arguments to the function. The function is + to produce data in the specified Encoding. The I/O server is + to call the function as apply(Mod, Func, Args) and put the + returned data on the I/O device as if it was sent in a + {put_chars, Encoding, Characters} request. If the function + returns anything else than a binary or list, or throws an exception, + an error is to be sent back to the client.

+
+
+ +

The I/O server replies to the client with an io_reply tuple, where + element Reply is one of:

+ +
+ok
+{error, Error}
+ + + Error describes the error to the client, which can do + whatever it wants with it. The + io module typically + returns it "as is". + + +

For backward compatibility, the following Requests are also to be + handled by an I/O server (they are not to be present after + Erlang/OTP R15B):

+ +
+{put_chars, Characters}
+{put_chars, Module, Function, Args}
+ +

These are to behave as {put_chars, latin1, Characters} and + {put_chars, latin1, Module, Function, Args}, respectively.
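The rules for output requests can be sketched on the server side as follows. This is an illustration under stated assumptions, not the code of any OTP I/O server: the module put_chars_sketch, the function collect/1, and the error reasons are hypothetical. The sketch only turns a request into a character list (or an error) and leaves the actual writing to the I/O device out.

```erlang
-module(put_chars_sketch).
-export([collect/1]).

%% Returns {ok, CharList} or {error, Reason} for a put_chars request.
collect({put_chars, Encoding, Chars}) ->
    to_list(Encoding, Chars);
collect({put_chars, Encoding, M, F, As}) ->
    try apply(M, F, As) of
        Data when is_list(Data); is_binary(Data) ->
            to_list(Encoding, Data);
        _Other ->
            {error, F}              % neither list nor binary
    catch
        _:_ -> {error, F}           % the function threw or crashed
    end.

%% latin1 data must be an iolist(); integers > 255 make the BIF fail.
to_list(latin1, Chars) ->
    try iolist_to_binary(Chars) of
        Bin -> {ok, binary_to_list(Bin)}
    catch
        error:badarg -> {error, put_chars}
    end;
%% unicode data is a mixed list: code points plus UTF-8 binaries.
to_list(unicode, Chars) ->
    try unicode:characters_to_list(Chars, unicode) of
        List when is_list(List) -> {ok, List};
        _ErrorOrIncomplete -> {error, put_chars}
    catch
        error:badarg -> {error, put_chars}
    end.
```

Relying on iolist_to_binary/1 for the latin1 case is one way to get the required error for list elements above 255; a real server can of course perform the check differently.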

+
+ +
+ Input Requests +

To read characters from an I/O device, the following Requests + exist:

+ +
+{get_until, Encoding, Prompt, Module, Function, ExtraArgs}
+ + + +

Encoding denotes how data is to be sent back to the client + and what data is sent to the function denoted by + Module/Function/ExtraArgs. If the function + supplied returns data as a list, the data is converted to this + encoding. If the function supplied returns data in some other format, + no conversion can be done, and it is up to the client-supplied + function to return data in a proper way.

+

If Encoding is latin1, lists of integers 0..255 + or binaries containing plain bytes are sent back to the client when + possible. If Encoding is unicode, lists with integers + in the whole Unicode range or binaries encoded in UTF-8 are sent to + the client. The user-supplied function always sees lists of + integers, never binaries, but the list can contain numbers > 255 + if Encoding is unicode.

+
+ +

Prompt is a list of characters (not mixed, no binaries) or an + atom to be output as a prompt for input on the I/O device. + Prompt is often ignored by the I/O server; if set to '', + it is always to be ignored (and results in nothing being written to + the I/O device).

+
+ +

Module, Function, and ExtraArgs denote a + function and arguments to determine when enough data has been read. The + function is to take two more arguments: the last state and a list of + characters. The function is to return one of:

+
+{done, Result, RestChars}
+{more, Continuation}
+

Result can be any Erlang term, but if it is a list(), + the I/O server can convert it to a binary() of appropriate + format before returning it to the client, if the I/O server is set in + binary mode (see below).

+

The function is called with the data the I/O server finds on its I/O + device, returning one of:

+ + +

{done, Result, RestChars} when enough data is read. In + this case Result is sent to the client and RestChars + is kept in the I/O server as a buffer for later input.

+
+ +

{more, Continuation}, which indicates that more + characters are needed to complete the request.

+
+
+

Continuation is sent as the state in later calls to the + function when more characters are available. When no more characters + are available, the function must return {done, eof, Rest}. The + initial state is the empty list. The data when an end of file is + reached on the I/O device is the atom eof.

+

An emulation of the get_line request can be (inefficiently) + implemented using the following functions:

+ -module(demo). -export([until_newline/3, get_line/1]). @@ -234,226 +268,253 @@ get_line(IoServer) -> receive {io_reply, IoServer, Data} -> Data - end. - -

Note especially that the last element in the Request tuple ([$\n]) - is appended to the argument list when the function is called. The - function should be called like - apply(Module, Function, [ State, Data | ExtraArgs ]) by the I/O server

-
-
- -

A fixed number of characters is requested using this Request:

-

-{get_chars, Encoding, Prompt, N} -

- - -Encoding and Prompt as for get_until. - -N is the number of characters to be read from the IO device. - - -

A single line (like in the example above) is requested with this Request:

-

-{get_line, Encoding, Prompt} -

- - -Encoding and Prompt as above. - - -

Obviously, the get_chars and get_line could be implemented with the -get_until request (and indeed they were originally), but demands for -efficiency has made these additions necessary.

- -

The I/O server replies to the client with an io_reply tuple where the Reply -element is one of:

-

-Data
-eof
-{error, Error} -

- - -Data is the characters read, in either list or binary form - (depending on the I/O server mode, see below). -Error describes the error to the client, which may do whatever it - wants with it. The Erlang io - module typically returns it as is. -eof is returned when input end is reached and no more data is -available to the client process. - - -

For backward compatibility the following Requests should also be -handled by an I/O server (these reqeusts should not be present after -R15B of OTP):

- -

-{get_until, Prompt, Module, Function, ExtraArgs}
-{get_chars, Prompt, N}
-{get_line, Prompt}
-

- -

These should behave as {get_until, latin1, Prompt, Module, Function, -ExtraArgs}, {get_chars, latin1, Prompt, N} and {get_line, latin1, -Prompt} respectively.

-
-
-I/O-server Modes - -

Demands for efficiency when reading data from an I/O server has not -only lead to the addition of the get_line and get_chars requests, but -has also added the concept of I/O server options. No options are -mandatory to implement, but all I/O servers in the Erlang standard -libraries honor the binary option, which allows the Data element of the -io_reply tuple to be a binary instead of a list when possible. -If the data is sent as a binary, Unicode data will be sent in the -standard Erlang Unicode -format, i.e. UTF-8 (note that the function of the get_until request still gets -list data regardless of the I/O server mode).

- -

Note that i.e. the get_until request allows for a function with the data specified as always being a list. Also the return value data from such a function can be of any type (as is indeed the case when an io:fread request is sent to an I/O server). The client has to be prepared for data received as answers to those requests to be in a variety of forms, but the I/O server should convert the results to binaries whenever possible (i.e. when the function supplied to get_until actually returns a list). The example shown later in this text does just that.

- -

An I/O-server in binary mode will affect the data sent to the client, -so that it has to be able to handle binary data. For convenience, it -is possible to set and retrieve the modes of an I/O server using the -following I/O requests:

- -

-{setopts, Opts} -

- - - -Opts is a list of options in the format recognized by proplists (and - of course by the I/O server itself). - -

As an example, the I/O server for the interactive shell (in group.erl) -understands the following options:

-

-{binary, boolean()} (or binary/list)
-{echo, boolean()}
-{expand_fun, fun()}
-{encoding, unicode/latin1} (or unicode/latin1) -

- -

- of which the binary and encoding options are common for all -I/O servers in OTP, while echo and expand are valid only for this -I/O server. It is worth noting that the unicode option notifies how -characters are actually put on the physical IO device, i.e. if the -terminal per se is Unicode aware, it does not affect how characters -are sent in the I/O-protocol, where each request contains encoding -information for the provided or returned data.

- -

The I/O server should send one of the following as Reply:

-

-ok
-{error, Error} -

- -

An error (preferably enotsup) is to be expected if the option is -not supported by the I/O server (like if an echo option is sent in a -setopts request to a plain file).

- -

To retrieve options, this request is used:

-

-getopts -

- -

The getopts request asks for a complete list of all options -supported by the I/O server as well as their current values.

- -

The I/O server replies:

-

-OptList
-{error, Error} -

- - -OptList is a list of tuples {Option, Value} where Option is always - an atom. - -
-
-Multiple I/O Requests - -

The Request element can in itself contain several Requests by using -the following format:

-

-{requests, Requests} -

- -Requests is a list of valid io_request tuples for the protocol, they - shall be executed in the order in which they appear in the list and - the execution should continue until one of the requests result in an - error or the list is consumed. The result of the last request is - sent back to the client. - - -

The I/O server can for a list of requests send any of the valid results in -the reply:

- -

-ok
-{ok, Data}
-{ok, Options}
-{error, Error} -

-

- depending on the actual requests in the list.

-
-
-Optional I/O Requests - -

The following I/O request is optional to implement and a client -should be prepared for an error return:

-

-{get_geometry, Geometry} -

- -Geometry is either the atom rows or the atom columns. - -

The I/O server should send the Reply as:

-

-{ok, N}
-{error, Error} -

- - -N is the number of character rows or columns the IO device has, if - applicable to the IO device the I/O server handles, otherwise {error, - enotsup} is a good answer. - -
-
-Unimplemented Request Types - -

If an I/O server encounters a request it does not recognize (i.e. the -io_request tuple is in the expected format, but the actual Request is -unknown), the I/O server should send a valid reply with the error tuple:

-

-{error, request} -

- -

This makes it possible to extend the protocol with optional requests -and for the clients to be somewhat backwards compatible.

-
-
-An Annotated and Working Example I/O Server - -

An I/O server is any process capable of handling the I/O protocol. There is -no generic I/O server behavior, but could well be. The framework is -simple enough, a process handling incoming requests, usually both -I/O-requests and other IO device-specific requests (for i.e. positioning, -closing etc.).

- -

Our example I/O server stores characters in an ETS table, making up a -fairly crude ram-file (it is probably not useful, but working).

- -

The module begins with the usual directives, a function to start the -I/O server and a main loop handling the requests:

- - + end. +

Notice that the last element in the Request tuple + ([$\n]) is appended to the argument list when the function is + called. The function is to be called like + apply(Module, Function, [ State, Data | ExtraArgs ]) by the + I/O server.
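The calling convention can be sketched from the I/O server's point of view. In this illustration, which is an assumption and not taken from any OTP server, the module get_until_sketch and the functions run/4 and until_char/3 are hypothetical, and a list of character chunks stands in for the successive reads from the I/O device.

```erlang
-module(get_until_sketch).
-export([run/4, until_char/3]).

%% Chunks is a list of character lists standing in for device reads.
%% Returns {Result, RestChars, UnreadChunks}.
run(M, F, ExtraArgs, Chunks) ->
    loop(M, F, ExtraArgs, [], Chunks).      % initial state is []

loop(M, F, XArgs, State, [Data | More]) ->
    case apply(M, F, [State, Data | XArgs]) of
        {done, Result, RestChars} -> {Result, RestChars, More};
        {more, Continuation} -> loop(M, F, XArgs, Continuation, More)
    end;
loop(M, F, XArgs, State, []) ->
    %% End of input: the data passed to the function is the atom eof.
    case apply(M, F, [State, eof | XArgs]) of
        {done, Result, RestChars} -> {Result, RestChars, []};
        {more, _} -> {{error, F}, [], []}
    end.

%% A stop function in the style of the demo module: read up to Stop.
until_char(_ThisFar, eof, _Stop) ->
    {done, eof, []};
until_char(ThisFar, Chars, Stop) ->
    case lists:splitwith(fun(C) -> C =/= Stop end, Chars) of
        {L, []} -> {more, ThisFar ++ L};
        {L, [Stop | Rest]} -> {done, ThisFar ++ L ++ [Stop], Rest}
    end.
```

A real server would keep RestChars as its buffer and hand them to the function before reading more from the device; the sketch only returns them.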

+ + + +

A fixed number of characters is requested using the following + Request:

+ +
+{get_chars, Encoding, Prompt, N}
+ + + +

Encoding and Prompt as for get_until.

+
+ +

N is the number of characters to be read from the I/O + device.

+
+
+ +

A single line (as in the former example) is requested with the + following Request:

+ +
+{get_line, Encoding, Prompt}
+ + + Encoding and Prompt as for get_until. + + +

Clearly, get_chars and get_line could be implemented with + the get_until request (and indeed they were originally), but + demands for efficiency have made these additions necessary.

+ +

The I/O server replies to the client with an io_reply tuple, where + element Reply is one of:

+ +
+Data
+eof
+{error, Error}
+ + + +

Data is the characters read, in list or binary form + (depending on the I/O server mode, see the next section).

+
+ +

eof is returned when input end is reached and no more data is + available to the client process.

+
+ +

Error describes the error to the client, which can do + whatever it wants with it. The + io module typically + returns it as is.

+
+
+ +

For backward compatibility, the following Requests are also to be + handled by an I/O server (they are not to be present after + Erlang/OTP R15B):

+ +
+{get_until, Prompt, Module, Function, ExtraArgs}
+{get_chars, Prompt, N}
+{get_line, Prompt}
+ +

These are to behave as + {get_until, latin1, Prompt, Module, Function, ExtraArgs}, + {get_chars, latin1, Prompt, N}, and + {get_line, latin1, Prompt}, respectively.
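One way to support the old request forms is to rewrite them into their latin1 equivalents before dispatching. The following sketch assumes a hypothetical module legacy_sketch with a function normalize/1; requests that already carry an encoding pass through unchanged.

```erlang
-module(legacy_sketch).
-export([normalize/1]).

%% Map requests without an Encoding element to their latin1 forms.
%% The tuple sizes differ, so old and new forms cannot be confused.
normalize({put_chars, Chars}) ->
    {put_chars, latin1, Chars};
normalize({put_chars, M, F, As}) ->
    {put_chars, latin1, M, F, As};
normalize({get_until, Prompt, M, F, As}) ->
    {get_until, latin1, Prompt, M, F, As};
normalize({get_chars, Prompt, N}) ->
    {get_chars, latin1, Prompt, N};
normalize({get_line, Prompt}) ->
    {get_line, latin1, Prompt};
normalize(Request) ->
    Request.
```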

+
+ +
+ I/O Server Modes +

Demands for efficiency when reading data from an I/O server have not only + led to the addition of the get_line and get_chars requests, + but have also added the concept of I/O server options. No options are + mandatory to implement, but all I/O servers in the Erlang standard + libraries honor the binary option, which allows element + Data of the io_reply tuple to be a binary instead of a list + when possible. If the data is sent as a binary, Unicode data is + sent in the standard Erlang Unicode format, that is, UTF-8 (notice that + the function of the get_until request still gets list data + regardless of the I/O server mode).

+ +

Notice that the get_until request allows for a function with the + data specified as always being a list. Also, the return value data from + such a function can be of any type (as is indeed the case when an + io:fread/2,3 + request is sent to an I/O server). + The client must be prepared for data received as + answers to those requests to be in various forms. However, the I/O + server is to convert the results to binaries whenever possible (that is, + when the function supplied to get_until returns a list). This is + done in the example in section + An Annotated and Working Example I/O Server. +

+ +

An I/O server in binary mode affects the data sent to the client, so the + client must be able to handle binary data. For convenience, the modes of an + I/O server can be set and retrieved using the following I/O requests:

+ +
+{setopts, Opts}
+ + + Opts is a list of options in the format recognized by the + proplists module + (and by the I/O server). + + +

As an example, the I/O server for the interactive shell (in + group.erl) understands the following options:

+ +
+{binary, boolean()} (or binary/list)
+{echo, boolean()}
+{expand_fun, fun()}
+{encoding, unicode/latin1} (or unicode/latin1)
+ +

Options binary and encoding are common for all I/O servers + in OTP, while echo and expand_fun are valid only for this I/O + server. Option encoding (for example, unicode) tells how characters are put on the + physical I/O device, that is, whether the terminal itself is Unicode-aware. + It does not affect how characters are sent in the I/O protocol, where + each request contains encoding information for the provided or returned + data.

+ +

The I/O server is to send one of the following as Reply:

+ +
+ok
+{error, Error}
+ +

An error (preferably enotsup) is to be expected if the option is + not supported by the I/O server (for example, if an echo option is sent in + a setopts request to a plain file).

+ +

To retrieve options, the following request is used:

+ +
+getopts
+ +

This request asks for a complete list of all options supported by the + I/O server as well as their current values.

+ +

The I/O server replies:

+ +
+OptList
+{error, Error}
+ + + OptList is a list of tuples {Option, Value}, where + Option always is an atom. + +
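The setopts and getopts requests can be sketched for a server that supports only the binary and encoding options. Everything named here is an assumption made for the illustration: the module opts_sketch, the function request/2, and the use of a map as the server's option state. Only the option formats and the enotsup error come from the description above.

```erlang
-module(opts_sketch).
-export([request/2]).

%% State is a map such as #{binary => false, encoding => latin1}.
%% Returns {Reply, NewState}.
request({setopts, Opts0}, State) ->
    %% Turn 'list' into {binary, false} and bare atoms into {Atom, true}.
    Opts = proplists:unfold(
             proplists:substitute_negations([{list, binary}], Opts0)),
    case lists:all(fun valid/1, Opts) of
        true -> {ok, maps:merge(State, maps:from_list(Opts))};
        false -> {{error, enotsup}, State}
    end;
request(getopts, State) ->
    {maps:to_list(State), State}.

%% Only these two options are understood by this sketch.
valid({binary, B}) -> is_boolean(B);
valid({encoding, E}) -> E =:= unicode orelse E =:= latin1;
valid(_) -> false.
```

The bare unicode and latin1 shorthands accepted by the shell's I/O server are deliberately left out to keep the sketch short.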
+ +
+ Multiple I/O Requests +

The Request element can in itself contain many Requests + by using the following format:

+ +
+{requests, Requests}
+ + + Requests is a list of valid io_request tuples for the + protocol. They must be executed in the order that they appear in + the list. The execution is to continue until one of the requests results + in an error or the list is consumed. The result of the last request is + sent back to the client. + + +

The I/O server can, for a list of requests, send any of the following + valid results in the reply, depending on the requests in the list:

+ +
+ok
+{ok, Data}
+{ok, Options}
+{error, Error}
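The execution rule for a list of requests (run in order, stop at the first error, reply with the last result) can be sketched as below. The module requests_sketch and the function run/3 are hypothetical, as is the convention that the handler is a fun taking a request and a state and returning {Reply, NewState}. Replying ok for an empty list is also an assumption.

```erlang
-module(requests_sketch).
-export([run/3]).

%% Handle is fun(Request, State) -> {Reply, NewState}.
%% Returns {LastReply, FinalState}.
run(Requests, State, Handle) ->
    run(Requests, ok, State, Handle).

%% An error reply stops the execution; later requests are skipped.
run(_Requests, {error, _} = Error, State, _Handle) ->
    {Error, State};
%% List consumed: the reply of the last request is the overall reply.
run([], LastReply, State, _Handle) ->
    {LastReply, State};
run([Request | Rest], _PrevReply, State0, Handle) ->
    {Reply, State} = Handle(Request, State0),
    run(Rest, Reply, State, Handle).
```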
+
+ +
+ Optional I/O Request +

The following I/O request is optional to implement and a client is to + be prepared for an error return:

+ +
+{get_geometry, Geometry}
+ + + Geometry is the atom rows or the atom + columns. + + +

The I/O server is to send the Reply as:

+ +
+{ok, N}
+{error, Error}
+ + + N is the number of character rows or columns that the I/O + device has, if applicable to the I/O device handled by the I/O server, + otherwise {error, enotsup} is a good answer. + +
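On the client side, being prepared for an error return can look like the following sketch. It uses io:columns/0 from the io module; the module name geometry_sketch, the function columns_or/1, and the idea of falling back to a caller-supplied default are assumptions made for the illustration.

```erlang
-module(geometry_sketch).
-export([columns_or/1]).

%% Ask the standard I/O server for its width; fall back to Default
%% when the request is not supported (for example, output to a file).
columns_or(Default) ->
    case io:columns() of
        {ok, N} -> N;
        {error, _Reason} -> Default
    end.
```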
+ +
+ Unimplemented Request Types +

If an I/O server encounters a request that it does not recognize (that + is, the io_request tuple has the expected format, but the + Request is unknown), the I/O server is to send a valid reply with + the error tuple:

+ +
+{error, request}
+ +

This makes it possible to extend the protocol with optional requests + and for the clients to be somewhat backward compatible.

+
+ +
+ An Annotated and Working Example I/O Server + +

An I/O server is any process capable of handling the I/O protocol. There + is no generic I/O server behavior, but there could well be one. The framework is + simple: a process handling incoming requests, usually both I/O requests + and other I/O device-specific requests (positioning, closing, and so on). +

+ +

The example I/O server stores characters in an ETS table, making + up a fairly crude RAM file.

+ +

The module begins with the usual directives, a function to start the + I/O server and a main loop handling the requests:

+ + -module(ets_io_server). -export([start_link/0, init/0, loop/1, until_newline/3, until_enough/3]). @@ -490,39 +551,34 @@ loop(State) -> ?MODULE:loop(State#state{position = 0}); _Unknown -> ?MODULE:loop(State) - end. - - -

The main loop receives messages from the client (which might be using -the io module to send requests). -For each request the function -request/2 is called and a reply is eventually sent using the reply/3 -function.

+ end.
-

The "private" message {From, rewind} results in the -current position in the pseudo-file to be reset to 0 (the beginning of -the "file"). This is a typical example of IO device-specific -messages not being part of the I/O-protocol. It is usually a bad idea -to embed such private messages in io_request tuples, as that might be -confusing to the reader.

+

The main loop receives messages from the client (which can use the + io module to send + requests). For each request, the function request/2 is called and a + reply is eventually sent using function reply/3.

-

Let us look at the reply function first...

+

The "private" message {From, rewind} results in the + current position in the pseudo-file being reset to 0 (the beginning + of the "file"). This is a typical example of I/O device-specific + messages not being part of the I/O protocol. It is usually a bad idea to + embed such private messages in io_request tuples, as that can + confuse the reader.

- +

First, we examine the reply function:

+ reply(From, ReplyAs, Reply) -> - From ! {io_reply, ReplyAs, Reply}. + From ! {io_reply, ReplyAs, Reply}. -
+

It sends the io_reply tuple back to the client, providing element + ReplyAs received in the request along with the result of the + request, as described earlier.

-

Simple enough, it sends the io_reply tuple back to the client, -providing the ReplyAs element received in the request along with the -result of the request, as described above.

+

We need to handle some requests. First the requests for writing + characters:

-

Now look at the different requests we need to handle. First the -requests for writing characters:

- - + request({put_chars, Encoding, Chars}, State) -> put_chars(unicode:characters_to_list(Chars,Encoding),State); request({put_chars, Encoding, Module, Function, Args}, State) -> @@ -531,23 +587,22 @@ request({put_chars, Encoding, Module, Function, Args}, State) -> catch _:_ -> {error, {error,Function}, State} - end; - + end; -

The Encoding tells us how the characters in the request are -represented. We want to store the characters as lists in the -ETS table, so we convert them to lists using the -unicode:characters_to_list/2 function. The conversion function -conveniently accepts the encoding types unicode or latin1, so we can -use Encoding directly.

+

The Encoding says how the characters in the request are + represented. We want to store the characters as lists in the ETS + table, so we convert them to lists using function + unicode:characters_to_list/2. + The conversion function conveniently accepts the encoding types + unicode and latin1, so we can use Encoding directly.

-

When Module, Function and Arguments are provided, we simply apply it -and do the same thing with the result as if the data was provided -directly.

+

When Module, Function, and Arguments are provided, + we apply the function and treat its result as if the data had been provided + directly.
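The two put_chars forms can be reduced to the same character list, roughly as sketched here. The module and function names (put_chars_mfa_sketch, to_chars/1) are hypothetical, and unlike the server code this sketch does not catch errors raised by the applied function.

```erlang
-module(put_chars_mfa_sketch).
-export([to_chars/1]).

%% Data sent directly: convert it to a flat list of characters.
to_chars({put_chars, Encoding, Chars}) ->
    unicode:characters_to_list(Chars, Encoding);
%% A call sent instead of data: run it here, then convert the result.
to_chars({put_chars, Encoding, Module, Function, Args}) ->
    unicode:characters_to_list(apply(Module, Function, Args), Encoding).
```

For instance, a request carrying io_lib, format and ["~p~n", [42]] yields the same list as a request carrying "42\n" directly.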

-

Let us handle the requests for retrieving data too:

+

We handle the requests for retrieving data:

- + request({get_until, Encoding, _Prompt, M, F, As}, State) -> get_until(Encoding, M, F, As, State); request({get_chars, Encoding, _Prompt, N}, State) -> @@ -555,17 +610,16 @@ request({get_chars, Encoding, _Prompt, N}, State) -> get_until(Encoding, ?MODULE, until_enough, [N], State); request({get_line, Encoding, _Prompt}, State) -> %% To simplify the code, get_line is implemented using get_until - get_until(Encoding, ?MODULE, until_newline, [$\n], State); - + get_until(Encoding, ?MODULE, until_newline, [$\n], State); -

Here we have cheated a little by more or less only implementing -get_until and using internal helpers to implement get_chars and -get_line. In production code, this might be too inefficient, but that -of course depends on the frequency of the different requests. Before -we start actually implementing the functions put_chars/2 and -get_until/5, let us look into the few remaining requests:

+

Here we have cheated a little by more or less only implementing + get_until and using internal helpers to implement get_chars + and get_line. In production code, this can be inefficient, but + that depends on the frequency of the different requests. Before we start + implementing functions put_chars/2 and get_until/5, we + examine the few remaining requests:

- + request({get_geometry,_}, State) -> {error, {error,enotsup}, State}; request({setopts, Opts}, State) -> @@ -573,23 +627,23 @@ request({setopts, Opts}, State) -> request(getopts, State) -> getopts(State); request({requests, Reqs}, State) -> - multi_request(Reqs, {ok, ok, State}); - + multi_request(Reqs, {ok, ok, State}); -

The get_geometry request has no meaning for this I/O server, so the -reply will be {error, enotsup}. The only option we handle is the -binary/list option, which is done in separate functions.

+

Request get_geometry has no meaning for this I/O server, so the + reply is {error, enotsup}. The only option we handle is + binary/list, which is done in separate functions.

-

The multi-request tag (requests) is handled in a separate loop -function applying the requests in the list one after another, -returning the last result.

+

The multi-request tag (requests) is handled in a separate loop + function applying the requests in the list one after another, returning + the last result.

-

What is left is to handle backward compatibility and the file module -(which uses the old requests until backward compatibility with pre-R13 -nodes is no longer needed). Note that the I/O server will not work with -a simple file:write/2 if these are not added:

+

We need to handle backward compatibility and the + file module (which + uses the old requests until backward compatibility with pre-R13 nodes is + no longer needed). Notice that the I/O server does not work with a simple + file:write/2 if these are not added:

- + request({put_chars,Chars}, State) -> request({put_chars,latin1,Chars}, State); request({put_chars,M,F,As}, State) -> @@ -599,38 +653,35 @@ request({get_chars,Prompt,N}, State) -> request({get_line,Prompt}, State) -> request({get_line,latin1,Prompt}, State); request({get_until, Prompt,M,F,As}, State) -> - request({get_until,latin1,Prompt,M,F,As}, State); - + request({get_until,latin1,Prompt,M,F,As}, State); -

OK, what is left now is to return {error, request} if the request is -not recognized:

+

{error, request} must be returned if the request is not + recognized:

- + request(_Other, State) -> - {error, {error, request}, State}. - + {error, {error, request}, State}. -

Let us move further and actually handle the different requests, first -the fairly generic multi-request type:

+

Next we handle the different requests, first the fairly generic + multi-request type:

- + multi_request([R|Rs], {ok, _Res, State}) -> multi_request(Rs, request(R, State)); multi_request([_|_], Error) -> Error; multi_request([], Result) -> - Result. - + Result. -

We loop through the requests one at the time, stopping when we either -encounter an error or the list is exhausted. The last return value is -sent back to the client (it is first returned to the main loop and then -sent back by the function io_reply).

+

We loop through the requests one at a time, stopping when we either + encounter an error or the list is exhausted. The last return value is + sent back to the client (it is first returned to the main loop and then + sent back by function reply/3 in an io_reply tuple).

-

The getopts and setopts requests are also simple to handle, we just -change or read our state record:

+

Requests getopts and setopts are also simple to handle. + We only change or read the state record:

- + setopts(Opts0,State) -> Opts = proplists:unfold( proplists:substitute_negations( @@ -662,46 +713,44 @@ getopts(#state{mode=M} = S) -> true; _ -> false - end}],S}. - + end}],S}. -

As a convention, all I/O servers handle both {setopts, [binary]}, -{setopts, [list]} and {setopts,[{binary, boolean()}]}, hence the trick -with proplists:substitute_negations/2 and proplists:unfold/1. If -invalid options are sent to us, we send {error, enotsup} back to the -client.

+

As a convention, all I/O servers handle both {setopts, [binary]}, + {setopts, [list]}, and {setopts,[{binary, boolean()}]}, + hence the trick with proplists:substitute_negations/2 and + proplists:unfold/1. If invalid options are sent to us, we send + {error, enotsup} back to the client.
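The effect of the two proplists calls can be seen in isolation with a small sketch. The module name setopts_sketch and the function normalize/1 are hypothetical; only the proplists calls come from the server code.

```erlang
-module(setopts_sketch).
-export([normalize/1]).

%% Rewrite list / {list, Bool} in terms of binary, then expand bare atoms
%% to {Atom, true}, so that only {binary, boolean()} tuples remain.
normalize(Opts) ->
    proplists:unfold(
      proplists:substitute_negations([{list, binary}], Opts)).
```

With this normalization, [binary] and [{binary, true}] both become [{binary, true}], while [list] becomes [{binary, false}].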

-

The getopts request should return a list of {Option, Value} tuples, -which has the twofold function of providing both the current values -and the available options of this I/O server. We have only one option, -and hence return that.

+

Request getopts is to return a list of {Option, Value} + tuples. This has the twofold function of providing both the current values + and the available options of this I/O server. We have only one option, and + hence return that.

-

So far our I/O server has been fairly generic (except for the rewind -request handled in the main loop and the creation of an ETS table). -Most I/O servers contain code similar to the one above.

+

So far this I/O server is fairly generic (except for request + rewind handled in the main loop and the creation of an ETS + table). Most I/O servers contain code similar to this one.

-

To make the example runnable, we now start implementing the actual -reading and writing of the data to/from the ETS table. First the -put_chars/3 function:

+

To make the example runnable, we start implementing the reading and + writing of the data to/from the ETS table. First function + put_chars/2:

- + put_chars(Chars, #state{table = T, position = P} = State) -> R = P div ?CHARS_PER_REC, C = P rem ?CHARS_PER_REC, [ apply_update(T,U) || U <- split_data(Chars, R, C) ], - {ok, ok, State#state{position = (P + length(Chars))}}. - + {ok, ok, State#state{position = (P + length(Chars))}}. -

We already have the data as (Unicode) lists and therefore just split -the list in runs of a predefined size and put each run in the -table at the current position (and forward). The functions -split_data/3 and apply_update/2 are implemented below.

+

We already have the data as (Unicode) lists and therefore only split + the list in runs of a predefined size and put each run in the table at + the current position (and forward). Functions split_data/3 and + apply_update/2 are implemented below.
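The splitting step can be illustrated with a stand-alone sketch. It is not the split_data/3 function of the example: the names position_sketch and split_runs/3 are hypothetical, and the record size is passed as an argument instead of being taken from ?CHARS_PER_REC.

```erlang
-module(position_sketch).
-export([split_runs/3]).

%% Split Chars into {Row, Col, Run} tuples, starting at Position,
%% with at most Size characters stored per row.
split_runs(Chars, Position, Size) ->
    split_runs(Chars, Position div Size, Position rem Size, Size).

split_runs([], _Row, _Col, _Size) ->
    [];
split_runs(Chars, Row, Col, Size) ->
    Room = Size - Col,                       % free space left in this row
    {Run, Rest} = lists:split(min(Room, length(Chars)), Chars),
    [{Row, Col, Run} | split_runs(Rest, Row + 1, 0, Size)].
```

Writing "abcdefgh" at position 8 with 5 characters per row touches three rows: the tail of row 1, all of row 2, and the start of row 3.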

-

Now we want to read data from the table. The get_until/5 function reads -data and applies the function until it says it is done. The result is -sent back to the client:

+

Now we want to read data from the table. Function get_until/5 + reads data and applies the function until it says that it is done. The + result is sent back to the client:

- + get_until(Encoding, Mod, Func, As, #state{position = P, mode = M, table = T} = State) -> case get_loop(Mod,Func,As,T,P,[]) of @@ -737,34 +786,34 @@ get_loop(M,F,A,T,P,C) -> get_loop(M,F,A,T,NewP,NewC); _ -> {error,F} - end. - - -

Here we also handle the mode (binary or list) that can be set by -the setopts request. By default, all OTP I/O servers send data back to -the client as lists, but switching mode to binary might increase -efficiency if the I/O server handles it in an appropriate way. The -implementation of get_until is hard to get efficient as the supplied -function is defined to take lists as arguments, but get_chars and -get_line can be optimized for binary mode. This example does not -optimize anything however. It is important though that the returned -data is of the right type depending on the options set, so we convert -the lists to binaries in the correct encoding if possible -before returning. The function supplied in the get_until request tuple may, -as its final result return anything, so only functions actually -returning lists can get them converted to binaries. If the request -contained the encoding tag unicode, the lists can contain all Unicode -codepoints and the binaries should be in UTF-8, if the encoding tag -was latin1, the client should only get characters in the range -0..255. The function check/2 takes care of not returning arbitrary -Unicode codepoints in lists if the encoding was given as latin1. If -the function did not return a list, the check cannot be performed and -the result will be that of the supplied function untouched.

- -

Now we are more or less done. We implement the utility functions below -to actually manipulate the table:

- - + end. + +

Here we also handle the mode (binary or list) that can be + set by request setopts. By default, all OTP I/O servers send data + back to the client as lists, but switching mode to binary can + increase efficiency if the I/O server handles it in an appropriate way. + The implementation of get_until is difficult to get efficient, as + the supplied function is defined to take lists as arguments, but + get_chars and get_line can be optimized for binary mode. + However, this example does not optimize anything.

+ +

It is important though that the returned data is of the correct type + depending on the options set. We therefore convert the lists to binaries + in the correct encoding if possible before returning. The + function supplied in the get_until request tuple can, as its final + result, return anything, so only functions returning lists can get them + converted to binaries. If the request contains encoding tag + unicode, the lists can contain all Unicode code points and the + binaries are to be in UTF-8. If the encoding tag is latin1, the + client is only to get characters in the range 0..255. Function + check/2 takes care of not returning arbitrary Unicode code points + in lists if the encoding was specified as latin1. If the function + does not return a list, the check cannot be performed and the result is + that of the supplied function untouched.
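The conversion for binary mode can be sketched as below. The names encode_sketch and to_binary/2 are hypothetical, and the error tuple is simply modeled on the one the example returns for latin1 requests.

```erlang
-module(encode_sketch).
-export([to_binary/2]).

%% unicode: any code point is allowed, the binary is UTF-8 encoded.
to_binary(unicode, Chars) ->
    unicode:characters_to_binary(Chars, unicode, unicode);
%% latin1: only code points 0..255 can be represented.
to_binary(latin1, Chars) ->
    case unicode:characters_to_binary(Chars, unicode, latin1) of
        Bin when is_binary(Bin) -> Bin;
        _ -> {error, {cannot_convert, unicode, latin1}}
    end.
```

Code point 229 becomes two bytes in UTF-8 but one byte in latin1, while code point 1024 has no latin1 form at all.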

+ +

To manipulate the table we implement the following utility functions:

+ + check(unicode, List) -> List; check(latin1, List) -> @@ -775,18 +824,16 @@ check(latin1, List) -> catch throw:_ -> {error,{cannot_convert, unicode, latin1}} - end. - + end.
-

The function check takes care of providing an error tuple if Unicode -codepoints above 255 is to be returned if the client requested -latin1.

+

The function check provides an error tuple if Unicode code points above + 255 are to be returned when the client requested latin1.

-

The two functions until_newline/3 and until_enough/3 are helpers used -together with the get_until/5 function to implement get_chars and -get_line (inefficiently):

- - +

The two functions until_newline/3 and until_enough/3 are + helpers used together with function get_until/5 to implement + get_chars and get_line (inefficiently):

+ + until_newline([],eof,_MyStopCharacter) -> {done,eof,[]}; until_newline(ThisFar,eof,_MyStopCharacter) -> @@ -810,16 +857,15 @@ until_enough(ThisFar,CharList,N) {Res,Rest} = my_split(N,ThisFar ++ CharList, []), {done,Res,Rest}; until_enough(ThisFar,CharList,_N) -> - {more,ThisFar++CharList}. - + {more,ThisFar++CharList}.
-

As can be seen, the functions above are just the type of functions -that should be provided in get_until requests.

+

As can be seen, the functions above are just the type of functions that + are to be provided in get_until requests.
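A client can supply its own function of the same shape. The following sketch, with the hypothetical names until_sketch and until_char/3, reads up to (and consumes) a stop character chosen by the caller. It follows the same calling convention as the helpers above: the continuation so far, the new data or eof, then the extra argument.

```erlang
-module(until_sketch).
-export([until_char/3]).

%% End of file with nothing collected: report eof.
until_char([], eof, _Stop) ->
    {done, eof, []};
%% End of file with data collected: return what we have.
until_char(ThisFar, eof, _Stop) ->
    {done, ThisFar, []};
%% New data: finish at the stop character, otherwise ask for more.
until_char(ThisFar, Chars, Stop) ->
    case lists:splitwith(fun(C) -> C =/= Stop end, Chars) of
        {Before, [Stop | Rest]} -> {done, ThisFar ++ Before, Rest};
        {Before, []} -> {more, ThisFar ++ Before}
    end.
```

Such a function would travel in a request of the form {get_until, unicode, Prompt, until_sketch, until_char, [$;]}.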

-

Now we only need to read and write the table in an appropriate way to -complete the I/O server:

+

To complete the I/O server, we only need to read and write the table in + an appropriate way:

- + get(P,Tab) -> R = P div ?CHARS_PER_REC, C = P rem ?CHARS_PER_REC, @@ -856,18 +902,16 @@ apply_update(Table, {Row, Col, List}) -> {Part1,_} = my_split(Col,OldData,[]), {_,Part2} = my_split(Col+length(List),OldData,[]), ets:insert(Table,{Row, Part1 ++ List ++ Part2}) - end. - - -

The table is read or written in chunks of ?CHARS_PER_REC, overwriting -when necessary. The implementation is obviously not efficient, it is -just working.

- -

This concludes the example. It is fully runnable and you can read or -write to the I/O server by using i.e. the io module or even the file -module. It is as simple as that to implement a fully fledged I/O server -in Erlang.

-
+ end. + +

The table is read or written in chunks of ?CHARS_PER_REC, + overwriting when necessary. The implementation is clearly not efficient; + it merely works.

+ +

This concludes the example. It is fully runnable and you can read or + write to the I/O server by using, for example, the + io module or even the + file module. It is + as simple as that to implement a fully fledged I/O server in Erlang.

+
- -