The HTTP server, also referred to as httpd, handles HTTP requests as described in RFC 2616, with a few exceptions such as gateway and proxy functionality. The server supports IPv6 as long as the underlying mechanisms also do so.
The server implements numerous features such as SSL (Secure Sockets Layer), ESI (Erlang Scripting Interface), CGI (Common Gateway Interface), User Authentication (using Mnesia, dets or plain text database), Common Logfile Format (with or without disk_log(3) support), URL Aliasing, Action Mappings, Directory Listings and SSI (Server-Side Includes).
The server is configured with an Erlang property list. For backwards compatibility, a configuration file using Apache-style configuration directives is also supported.
As of Inets version 5.0, the HTTP server is an easy to start, stop and customize web server that provides the most basic web server functionality. Depending on your needs, other Erlang-based web servers may also be of interest, such as Yaws, http://yaws.hyber.org, which for instance has its own markup support for generating HTML and supports certain buzzword technologies such as SOAP.
Almost all server functionality has been implemented using a specially crafted server API, which is described in the Erlang Web Server API. This API can be used to advantage by anyone who wants to enhance the server core functionality, for example with custom logging and authentication.
The following shows what to put in the Erlang node application configuration file in order to start an HTTP server at application startup:
[{inets, [{services, [{httpd, [{proplist_file,
             "/var/tmp/server_root/conf/8888_props.conf"}]},
            {httpd, [{proplist_file,
             "/var/tmp/server_root/conf/8080_props.conf"}]}]}]}].
The server is configured using an Erlang property list.
For the available properties see
All possible config properties are as follows:
httpd_service() -> {httpd, httpd()}
httpd() -> [httpd_config()]
httpd_config() -> {file, file()} |
                  {proplist_file, file()} |
                  {debug, debug()} |
                  {accept_timeout, integer()}
debug() -> disable | [debug_options()]
debug_options() -> {all_functions, modules()} |
{exported_functions, modules()} |
{disable, modules()}
modules() -> [atom()]
{proplist_file, file()} - File containing an Erlang property list, followed by a full stop, describing the HTTP server configuration.
{file, file()} - Use this if you have an old Apache-like configuration file.
{debug, debug()} - Enables trace on all functions, or only on exported functions, of the chosen modules.
{accept_timeout, integer()} - Sets the timeout value for the server to set up a request connection.
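As an illustration of the proplist_file format, a file such as the 8888_props.conf mentioned earlier might look roughly like the sketch below. The concrete values (port, server name, directories) are assumptions made up for this example; only the general shape, a single property list terminated by a full stop, is taken from the description above.

```erlang
%% Hypothetical contents of /var/tmp/server_root/conf/8888_props.conf.
%% The file holds exactly one property list, terminated by a full stop.
%% All values below are illustrative assumptions, not required settings.
[{port, 8888},
 {server_name, "httpd_test"},
 {server_root, "/var/tmp/server_root"},
 {document_root, "/var/tmp/server_root/htdocs"},
 {bind_address, "localhost"}].
```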
1 > inets:start().
ok
Start an HTTP server with the minimal required configuration. If you specify port 0, an arbitrary available port is used, and you can use the info function to find out which port number was picked.
2 > {ok, Pid} = inets:start(httpd, [{port, 0},
{server_name,"httpd_test"}, {server_root,"/tmp"},
{document_root,"/tmp/htdocs"}, {bind_address, "localhost"}]).
{ok, <0.79.0>}
3 > httpd:info(Pid).
[{mime_types,[{"html","text/html"},{"htm","text/html"}]},
{server_name,"httpd_test"},
{bind_address, {127,0,0,1}},
{server_root,"/tmp"},
{port,59408},
{document_root,"/tmp/htdocs"}]
Reload the configuration without restarting the server. Note that port and bind_address cannot be changed. Clients trying to access the server during the reload get a service temporarily unavailable answer.
4 > httpd:reload_config([{port, 59408},
{server_name,"httpd_test"}, {server_root,"/tmp/www_test"},
{document_root,"/tmp/www_test/htdocs"},
{bind_address, "localhost"}], non_disturbing).
ok
5 > httpd:info(Pid, [server_root, document_root]).
[{server_root,"/tmp/www_test"},{document_root,"/tmp/www_test/htdocs"}]
6 > ok = inets:stop(httpd, Pid).
Alternative:
6 > ok = inets:stop(httpd, {{127,0,0,1}, 59408}).
Note that bind_address must be the IP address reported by the info function; it cannot be the hostname that is allowed when specifying bind_address.
If users of the web server need to manage authentication of web pages that are local to their user, and do not have server administrative privileges, they can use the per-directory runtime configurable user-authentication scheme that Inets calls htaccess. It works the following way:
In every directory under the
DIRECTIVE: "allow"
Syntax:
Default:
Same as the directive allow for the server config file.
DIRECTIVE: "AllowOverRide"
Syntax:
Default:
If only one access-file exists setting this parameter to none can lessen the burden on the server since the server will stop looking for access-files.
DIRECTIVE: "AuthGroupfile"
Syntax:
Default:
AuthGroupFile indicates which file contains the list of groups. The filename must contain the absolute path to the file. The file has one group per row, and every row contains the name of the group followed by the members of the group separated by spaces, for example:
GroupName: Member1 Member2 .... MemberN
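To make the row format concrete, the following is a minimal sketch of how one such row could be split into a group name and its members. The module and function names are hypothetical and this is not how mod_htaccess itself reads the file; it also assumes the row is passed without a trailing newline.

```erlang
-module(group_file_sketch).
-export([parse_line/1]).

%% Parse one row of the form "GroupName: Member1 Member2 ... MemberN".
%% Returns {GroupName, Members}, or the atom error if the row
%% contains no colon. Hypothetical helper, for illustration only.
parse_line(Line) ->
    case string:split(Line, ":") of
        [Name, Rest] ->
            %% Members are separated by spaces (tabs are tolerated too).
            {string:trim(Name), string:lexemes(Rest, " \t")};
        _ ->
            error
    end.
```

For instance, group_file_sketch:parse_line("group1: alice bob") yields the group name "group1" together with the member list ["alice", "bob"].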
DIRECTIVE: "AuthName"
Syntax:
Default:
Same as the directive AuthName for the server config file.
DIRECTIVE: "AuthType"
Syntax:
Default:
DIRECTIVE: "AuthUserFile"
Syntax:
Default:
UserName:Password
UserName:Password
DIRECTIVE: "deny"
Syntax:
Context: Limit
Same as the directive deny for the server config file.
DIRECTIVE: "Limit"
Syntax:
Default: - None -
<Limit POST GET HEAD>
order allow deny
require group group1
allow from 123.145.244.5
</Limit>
DIRECTIVE: "order"
Syntax:
Default: allow deny
If the order is set to allow deny, the user's network address is first checked to be in the allow subset. If the address is not in the allowed subset, the user is denied access to the asset. If the address is in the allowed subset, a second check is performed to verify that the address is not in the subset of network addresses that are to be denied, as specified by the deny parameter.
If the order is set to deny allow, only users from networks specified to be in the allowed subset succeed in requesting assets in the limited area.
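The two evaluation orders described for the order directive can be sketched as a small decision function. This is only a reading of the prose above, not the server's implementation: the module name, the atoms allow_deny and deny_allow, and the use of exact string comparison instead of network matching are all assumptions made for the example.

```erlang
-module(order_sketch).
-export([permitted/4]).

%% permitted(Order, Address, AllowSet, DenySet) -> boolean()
%% Order is allow_deny or deny_allow. The sets are plain lists of
%% addresses compared for exact equality, which is a simplification.
permitted(allow_deny, Addr, Allow, Deny) ->
    %% Must be in the allow subset and must not be in the deny subset.
    lists:member(Addr, Allow) andalso not lists:member(Addr, Deny);
permitted(deny_allow, Addr, Allow, _Deny) ->
    %% Only addresses in the allow subset succeed.
    lists:member(Addr, Allow).
```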
DIRECTIVE: "require"
Syntax:
Default:
Context: Limit
See the require directive in the documentation of mod_auth(3) for more information.
The Inets HTTP server provides two ways of creating dynamic web pages, each with its own advantages and disadvantages.
First there are CGI-scripts that can be written in any programming language. CGI-scripts are standardized and supported by most web servers. The drawback with CGI-scripts is that they are resource intensive because of their design. CGI requires the server to fork a new OS process for each executable it needs to start.
Second, there are ESI functions, which provide a tight and efficient interface to the execution of Erlang functions. This interface, on the other hand, is Inets specific.
The mod_cgi module makes it possible to execute CGI scripts in the server. A file that matches the definition of a ScriptAlias config directive is treated as a CGI script. A CGI script is executed by the server and its output is returned to the client.
The CGI script response comprises a message-header and a message-body, separated by a blank line. The message-header contains one or more header fields. The body may be empty. Example:
"Content-Type:text/plain\nAccept-Ranges:none\n\nsome very plain text"
The server interprets the CGI headers, and most of them are transformed into HTTP headers and sent back to the client together with the body.
Support for CGI-1.1 is implemented in accordance with RFC 3875.
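The header/body split of a CGI script response can be illustrated with a short sketch. It is not the parser used by mod_cgi; the module and function names are hypothetical, and it assumes "\n" line endings as in the example string above.

```erlang
-module(cgi_response_sketch).
-export([parse/1]).

%% Split CGI output into {Headers, Body}, where Headers is a list of
%% {Name, Value} string pairs. The first blank line ends the header
%% part; header lines without a colon are skipped.
parse(Response) ->
    {Head, Body} =
        case string:split(Response, "\n\n") of
            [H, B] -> {H, B};
            [H]    -> {H, ""}   %% no blank line: treat all as headers
        end,
    Headers = [{string:trim(N), string:trim(V)}
               || Line <- string:split(Head, "\n", all),
                  [N, V] <- [string:split(Line, ":")]],
    {Headers, Body}.
```

Applied to the example response, this gives the two header pairs for Content-Type and Accept-Ranges and the body "some very plain text".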
The Erlang Server Interface is implemented by the module mod_esi.
The erl scheme is designed to mimic plain CGI, but without the extra overhead. A URL that calls an Erlang erl function has the following syntax (regular expression):
http://your.server.org/***/Module[:/]Function(?QueryString|/PathInfo)
*** above depends on how the ErlScriptAlias config directive has been used
The module (Module) referred to must be found in the code
path, and it must define a function (Function) with an arity
of two or three. It is preferable to implement a function
with arity three, as it permits you to send chunks of the
web page being generated to the client during the generation
phase, instead of first generating the whole web page and
then sending it to the client. The option to implement a
function with arity two is only kept for
backward compatibility reasons.
See
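A minimal sketch of an arity-three function for the erl scheme is shown here. The module name esi_sketch and the page content are made up for the example; the sketch assumes that mod_esi:deliver/2 is used to send each chunk and that the first chunk carries the CGI-style header followed by a blank line.

```erlang
-module(esi_sketch).
-export([hello/3, chunks/0]).

%% Arity-three callback for the erl scheme. Each call to
%% mod_esi:deliver/2 sends one chunk to the client while the
%% rest of the page is still being generated.
hello(SessionID, _Env, _Input) ->
    lists:foreach(fun(Chunk) -> mod_esi:deliver(SessionID, Chunk) end,
                  chunks()).

%% The chunks of the example page. The first chunk is the header
%% part, terminated by an empty line.
chunks() ->
    ["Content-Type: text/html\r\n\r\n",
     "<html><body>",
     "Hello from ESI",
     "</body></html>"].
```

With a suitable ErlScriptAlias configuration, such a function would be reachable as http://your.server.org/***/esi_sketch:hello, where *** again depends on the alias.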
The eval scheme is straightforward and does not mimic the behavior of plain CGI. A URL that calls an Erlang eval function has the following syntax:
http://your.server.org/***/Mod:Func(Arg1,...,ArgN)
*** above depends on how the ErlScriptAlias config directive has been used
The module (Mod) referred to must be found in the code
path, and data returned by the function (Func) is passed
back to the client. Data returned from the
function must furthermore take the form as specified in
the CGI specification. See
The eval scheme can seriously threaten the integrity of the Erlang node housing a Web server, for example:
http://your.server.org/eval?httpd_example:print(atom_to_list(apply(erlang,halt,[])))
which effectively shuts down the Erlang node. Therefore, use the erl scheme instead until this security breach has been fixed.
Today there is no good way of solving this problem, and therefore the eval scheme may be removed in a future release of Inets.
Three types of logs are supported: transfer logs, security logs and error logs. The de facto standard Common Logfile Format is used for the transfer and security logging. There are numerous statistics programs available to analyze Common Logfile Format. The Common Logfile Format looks as follows:
remotehost rfc931 authuser [date] "request" status bytes
Internal server errors are recorded in the error log file. The format of this file is more ad hoc than that of the logs using Common Logfile Format, but conforms to the following syntax:
[date] access to path failed for remotehost, reason: reason
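As a small illustration of the Common Logfile Format layout given above, the following sketch assembles one log line from its fields. The module name and the sample values used with it are assumptions for the example; the logging modules of the server have their own formatting code.

```erlang
-module(clf_sketch).
-export([format/7]).

%% Build one Common Logfile Format line:
%%   remotehost rfc931 authuser [date] "request" status bytes
%% The string fields are expected to be already formatted;
%% Status and Bytes are integers.
format(RemoteHost, Rfc931, AuthUser, Date, Request, Status, Bytes) ->
    lists:flatten(
      io_lib:format("~s ~s ~s [~s] \"~s\" ~w ~w",
                    [RemoteHost, Rfc931, AuthUser, Date, Request,
                     Status, Bytes])).
```

Fields that are not known, such as rfc931 or authuser for an unauthenticated request, are conventionally written as "-".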
Server-Side Includes enable the server to run code embedded in HTML pages to generate the response to the client.
Having the server parse HTML pages is a double-edged sword! It can be costly for a heavily loaded server to parse HTML pages while sending them. Furthermore, it can be considered a security risk to have average users executing commands in the name of the Erlang node user. Carefully consider these items before activating server-side includes.
The server must be told which filename extensions to use
for the parsed files. These files, while very similar to HTML,
are not HTML and are thus not treated the same. Internally, the
server uses the magic MIME type
text/x-server-parsed-html shtml shtm
This makes files ending with
text/x-server-parsed-html html htm
All server-side include directives to the server are formatted as SGML comments within the HTML page. This is in case the document should ever find itself in the client's hands unparsed. Each directive has the following format:
<!--#command tag1="value1" tag2="value2" -->
Each command takes different arguments; most accept only one tag at a time. Here is a breakdown of the commands and their associated tags:
The config directive controls various aspects of the file parsing. There are two valid tags:
controls the message sent back to the client if an error occurred while parsing the document. All errors are logged in the server's error log.
determines the format used to display the size of
a file. Valid choices are
The include directive inserts the text of a document into the parsed document. This command accepts two tags:
gives a virtual path to a document on the server. Only normal files and other parsed documents can be accessed in this way.
gives a pathname relative to the current
directory.
The echo directive prints the value of one of the include
variables (defined below). The only valid tag to this
command is
The fsize directive prints the size of the specified
file. Valid tags are the same as with the
The lastmod directive prints the last modification date of
the specified file. Valid tags are the same as with the
The exec directive executes a given shell command or CGI script. Valid tags are:
executes the given string using
executes the given virtual path to a CGI script and includes its output. The server does not perform error checking on the script output.
A number of variables are made available to parsed documents. In addition to the CGI variable set, the following variables are made available:
The current filename.
The virtual path to this document (such as
The unescaped version of any search query the client
sent, with all shell-special characters escaped with
The current date, local time zone.
Same as DATE_LOCAL but in Greenwich mean time.
The last modification date of the current document.
The process of handling an HTTP request involves several steps, such as:
To provide customization and extensibility of the HTTP server's request handling, most of these steps are handled by one or more modules that may be replaced or removed at runtime, and new ones can of course be added. For each request, all modules are traversed in the order specified by the modules directive in the server configuration file. Some parts, mainly the communication-related steps, are considered server core functionality and are not implemented using the Erlang Web Server API. The functionality implemented using the Erlang Web Server API is described in the section Inets Webserver Modules.
A module can use data generated by previous modules in the Erlang Web Server API module sequence, or generate data to be used by subsequent Erlang Web Server API modules. This is made possible by an internal list of key-value tuples, also referred to as interaction data.
Interaction data enforces module dependencies and should be avoided if possible. This means the order of modules in the Modules property is significant.
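The way interaction data ties modules together can be pictured with a simplified sketch. It is not the server's actual traversal code: the module name, the use of funs in place of real modules, and the keys resolved_path and served are all hypothetical. It assumes that each step either lets processing continue with {proceed, Data} or stops it with {break, Data}.

```erlang
-module(interaction_sketch).
-export([traverse/2, demo/0]).

%% Run each step in order, threading the interaction data (a list of
%% key-value tuples) through. A step returns {proceed, Data} to
%% continue with the next step or {break, Data} to stop.
traverse([], Data) ->
    Data;
traverse([Step | Rest], Data) ->
    case Step(Data) of
        {proceed, NewData} -> traverse(Rest, NewData);
        {break, NewData}   -> NewData
    end.

%% The first step exports a key and the second step depends on it,
%% so swapping the two steps would change the result.
demo() ->
    Resolve = fun(Data) ->
                      {proceed, [{resolved_path, "/tmp/htdocs/index.html"} | Data]}
              end,
    Serve = fun(Data) ->
                    Path = proplists:get_value(resolved_path, Data),
                    {proceed, [{served, Path} | Data]}
            end,
    traverse([Resolve, Serve], []).
```

If the two steps in demo/0 were run in the opposite order, the second one would find no resolved_path entry, which is the kind of ordering dependency that interaction data introduces.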
Each module that implements server functionality using the Erlang Web Server API should implement the following callback functions:
The latter functions are needed only when new config
directives are to be introduced. For details see
The convention is that all modules implementing some web server functionality have the name mod_*. When configuring the web server, an appropriate selection of these modules should be present in the Module directive. Note that there are some interaction dependencies to take into account, so the order of the modules cannot be arbitrary.
Runs CGI scripts whenever a file of a certain type or HTTP method (See RFC 1945) is requested.
Uses the following Erlang Web Server API interaction data:
Exports the following Erlang Web Server API interaction data, if possible:
This module makes it possible to map different parts of the host file system into the document tree, that is, it creates aliases and redirections.
Exports the following Erlang Web Server API interaction data, if possible:
This module provides for basic user authentication using textual files, dets databases as well as mnesia databases.
Uses the following Erlang Web Server API interaction data:
Exports the following Erlang Web Server API interaction data:
If Mnesia is used as the storage method, Mnesia must be started prior to the HTTP server. The first time Mnesia is used, the schema and the tables must be created. A naive example of a module with two functions that create and start Mnesia is provided here. The first function, first_start/0, creates the schema and the tables and is to be used the first time. The second function, start/0, is to be used in consecutive startups. start/0 starts Mnesia and waits for the tables to be initiated. This function must only be used when the schema and the tables have already been created.
-module(mnesia_test).
-export([first_start/0, start/0]).
%% Record definitions for httpd_user and httpd_group.
-include_lib("inets/include/mod_auth.hrl").

%% Use the first time: creates the schema and the tables.
first_start() ->
    mnesia:create_schema([node()]),
    mnesia:start(),
    mnesia:create_table(httpd_user,
                        [{type, bag},
                         {disc_copies, [node()]},
                         {attributes, record_info(fields,
                                                  httpd_user)}]),
    mnesia:create_table(httpd_group,
                        [{type, bag},
                         {disc_copies, [node()]},
                         {attributes, record_info(fields,
                                                  httpd_group)}]),
    mnesia:wait_for_tables([httpd_user, httpd_group], 60000).

%% Use in consecutive startups: the schema and tables must already exist.
start() ->
    mnesia:start(),
    mnesia:wait_for_tables([httpd_user, httpd_group], 60000).
To create the Mnesia tables, two records defined in mod_auth.hrl are used, so the file must be included. The first function, first_start/0, creates a schema that specifies on which nodes the database is to reside. Then it starts Mnesia and creates the tables. The first argument is the name of the table and the second argument is a list of options for how the table is to be created; see the Mnesia documentation for more information. Since the current implementation of mod_auth_mnesia saves one row for each user, the type must be bag. When the schema and the tables are created, the second function, start/0, is to be used to start Mnesia. It starts Mnesia and waits for the tables to be loaded. Mnesia uses the directory specified as mnesia_dir at startup if specified; otherwise Mnesia uses the current directory.
For security reasons, make sure that the Mnesia tables are stored outside the document tree of the HTTP server. If they are placed in the directory that they protect, clients will be able to download the tables. Only the dets and mnesia storage methods allow writing of dynamic user data to disk; plain is a read-only method.
This module handles invoking of CGI scripts.
This module generates an HTML directory listing (Apache-style) if a client sends a request for a directory instead of a file. This module needs to be removed from the Modules config directive if directory listings are unwanted.
Uses the following Erlang Web Server API interaction data:
Exports the following Erlang Web Server API interaction data:
Standard logging using the "Common Logfile Format" and disk_log(3).
Uses the following Erlang Web Server API interaction data:
This module implements the Erlang Server Interface (ESI) that provides a tight and efficient interface to the execution of Erlang functions.
Uses the following Erlang Web Server API interaction data:
Exports the following Erlang Web Server API interaction data:
This module is responsible for handling GET requests to regular files. GET requests for parts of files are handled by mod_range.
Uses the following Erlang Web Server API interaction data:
This module is responsible for handling HEAD requests to regular files. HEAD requests for dynamic content are handled by each module responsible for dynamic content.
Uses the following Erlang Web Server API interaction data:
This module provides per-directory user configurable access control.
Uses the following Erlang Web Server API interaction data:
Exports the following Erlang Web Server API interaction data:
This module makes it possible to expand "macros" embedded in HTML pages before they are delivered to the client, that is Server-Side Includes (SSI).
Uses the following Erlang Webserver API interaction data:
Exports the following Erlang Webserver API interaction data:
Standard logging using the "Common Logfile Format" and text files.
Uses the following Erlang Webserver API interaction data:
This module responds to requests for one or many ranges of a file. This is especially useful when downloading large files, since a broken download may be resumed.
Note that requests for multiple parts of a document report a size of zero to the log file.
Uses the following Erlang Webserver API interaction data:
This module checks that the conditions in the request are fulfilled. For example, a request may specify that the answer is only of interest if the content is unchanged since the last retrieval, or that, if the content has changed, the range request is to be converted to a request for the whole file instead.
If a client sends more than one of the header fields that restrict
the server's right to respond, the standard does not specify how
this is to be handled. httpd checks each field in the
following order, and if one of the fields does not match the current
state, the request is rejected with a proper response.
1. If-Modified-Since
2. If-Unmodified-Since
3. If-Match
4. If-None-Match
Uses the following Erlang Webserver API interaction data:
Exports the following Erlang Webserver API interaction data:
This module serves as a filter for authenticated requests handled in mod_auth. It makes it possible to restrict users from access for a specified amount of time if they fail to authenticate several times. It logs failed authentication as well as blocking of users, and it also calls a configurable callback module when the events occur.
There is also an API to manually block, unblock and list blocked users, or users who have been authenticated within a configurable amount of time.
mod_trace is responsible for handling TRACE requests. TRACE is a new request method in HTTP/1.1. The intended use of TRACE requests is for testing. The body of the TRACE response is the request message that the responding web server or proxy received.