Age | Commit message | Author |
|
Based on RabbitMQ performance testing.
|
|
|
|
There's not a big performance difference between 8192 and 1024
so let's use less memory at the start of the connection.
|
|
Cowboy will set the socket's buffer size dynamically to
better fit the current workload. When the incoming data
is small, a low buffer size reduces the memory footprint
and improves responsiveness and therefore performance.
When the incoming data is large, such as large HTTP
request bodies, a larger buffer size helps us avoid
doing too many binary appends and related allocations.
Setting a large buffer size for all use cases is
sub-optimal because allocating more than needed
necessarily results in a performance hit (not just
increased memory usage).
By default Cowboy starts with a buffer size of 8192 bytes.
It then doubles or halves the buffer size depending on
the size of the data it receives from the socket. It
stops decreasing at 8192 and increasing at 131072 by
default.
To keep track of the size of the incoming data Cowboy
maintains a moving average. It allows Cowboy to avoid
changing the buffer too often but still react quickly
when necessary. Cowboy will increase the buffer size
when the moving average is above 90% of the current
buffer size, and decrease when the moving average is
below 40% of the current buffer size.
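As a rough illustration of the resizing rule described above, here
is a minimal Python model. It is not Cowboy's Erlang code. The class
and method names, and the 2:1 weighting used for the moving average,
are assumptions made for this sketch; the message itself only gives
the 90%/40% thresholds and the 8192/131072 limits.

```python
from dataclasses import dataclass


@dataclass
class DynamicBuffer:
    """Toy model of a socket buffer that doubles or halves on demand."""
    low: int = 8192        # never shrink below this size
    high: int = 131072     # never grow above this size
    size: int = 8192       # current buffer size
    moving_avg: int = 0    # moving average of received data sizes

    def observe(self, data_len: int) -> int:
        """Record one read of data_len bytes and return the new size."""
        # Assumed weighting: old average counts twice, new sample once.
        self.moving_avg = (self.moving_avg * 2 + data_len) // 3
        if self.size < self.high and self.moving_avg > self.size * 0.9:
            # Average is above 90% of the buffer: double, capped at high.
            self.size = min(self.size * 2, self.high)
        elif self.size > self.low and self.moving_avg < self.size * 0.4:
            # Average is below 40% of the buffer: halve, floored at low.
            self.size = max(self.size // 2, self.low)
        return self.size
```

Feeding this model a run of large reads grows the buffer step by
step up to the upper limit, and a run of small reads brings it back
down to the lower limit, without reacting to a single outlier.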
The current buffer size and moving average are
propagated when switching protocols. The dynamic buffer
is implemented in HTTP/1, HTTP/2 and HTTP/1 Websocket.
HTTP/2 Websocket has it disabled because it doesn't
interact directly with the socket; in that case it
is HTTP/2 that has a dynamic buffer.
The dynamic buffer provides a very large performance improvement
in many scenarios, at minimal cost for others. Because it largely
depends on the underlying protocol, the improvements are not all equal.
TLS and compression also impact the results.
The improvements when reading a large request body, with the
requests repeated in a fast loop, are:
* HTTP: 6x to 20x faster
* HTTPS: 2x to 6x faster
* H2: 4x to 5x faster
* H2C: 20x to 40x faster
I am not sure why H2C's performance was so bad, especially compared
to H2, when using default buffer sizes. Dynamic buffers make H2C a
lot more viable with default settings.
The performance impact on "hello world" type requests is minimal:
it ranges from roughly -5% to +5%.
Websocket improvements vary again depending on the protocol, but
also depending on whether compression is enabled:
* HTTP echo: roughly 2x faster
* HTTP send: roughly 4x faster
* H2C echo: roughly 2x faster
* H2C send: 3x to 4x faster
In the echo test we reply back, and Gun doesn't have the dynamic
buffer optimisation, so that probably explains the 2x difference.
With compression however there isn't much improvement. The results
are roughly within -10% to +10% of each other. Zlib compression
seems to be a bottleneck, or at least to modify the performance
profile to such an extent that the size of the buffer does not
matter. This happens to randomly generated binary data as well
so it is probably not caused by the test data.
|
|
This includes Websocket over HTTP/3.
Since quicer, which provides the QUIC implementation,
is a NIF, Cowboy cannot depend directly on it. In order
to enable QUIC and HTTP/3, users have to set the
COWBOY_QUICER environment variable:
export COWBOY_QUICER=1
In order to run the test suites, the same must be done
for Gun:
export GUN_QUICER=1
HTTP/3 support is currently not available on Windows
due to compilation issues of quicer which have yet to
be looked at or resolved.
HTTP/3 support is also unavailable on the upcoming
OTP-27 due to compilation errors in quicer dependencies.
Once resolved HTTP/3 should work on OTP-27.
Because of how QUIC currently works, it's possible
that streams that get reset after sending a response
do not receive that response. The test suite was
modified to accommodate that. A future extension
to QUIC will allow us to gracefully reset streams.
This also updates Erlang.mk.
|
This commit reworks the logging that Cowboy does via
error_logger to make the module that will do the actual
logging configurable.
The logger module interface must be the same as logger
and lager: a separate function per log level with the
same log levels they support.
The default behavior remains to call error_logger,
although some messages were downgraded to warnings
instead of errors. Since error_logger only supports
three different log levels, some messages may get
downgraded/upgraded depending on what the original
log level was to make them compatible with error_logger.
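The dispatch described above can be modeled in a few lines of
Python. This is only a sketch, not Cowboy's Erlang code: the
function and class names are hypothetical, and the exact mapping
from the eight logger-style levels to the three error_logger
functions is an assumption (the message only says that some levels
get downgraded or upgraded).

```python
# Assumed mapping from logger/lager-style levels to the three
# functions error_logger offers. The real mapping may differ.
ERROR_LOGGER_FUNCTION = {
    "emergency": "error_msg",
    "alert": "error_msg",
    "critical": "error_msg",
    "error": "error_msg",
    "warning": "warning_msg",
    "notice": "warning_msg",
    "info": "info_msg",
    "debug": "info_msg",
}


class RecordingErrorLogger:
    """Stand-in for error_logger: only three log functions exist."""
    def __init__(self):
        self.calls = []

    def error_msg(self, fmt, args):
        self.calls.append(("error_msg", fmt, args))

    def warning_msg(self, fmt, args):
        self.calls.append(("warning_msg", fmt, args))

    def info_msg(self, fmt, args):
        self.calls.append(("info_msg", fmt, args))


class RecordingLogger:
    """Stand-in for a configurable module: one function per level."""
    def __init__(self):
        self.calls = []

    def __getattr__(self, level):
        # Only reached for names not found normally, i.e. log levels.
        if level not in ERROR_LOGGER_FUNCTION:
            raise AttributeError(level)
        return lambda fmt, args: self.calls.append((level, fmt, args))


def log(level, fmt, args, opts, error_logger):
    """Log through opts["logger"] if set, else fall back."""
    logger = opts.get("logger")
    if logger is not None:
        # Configured module: call the function named after the level.
        getattr(logger, level)(fmt, args)
    else:
        # Default: squeeze the level into what error_logger supports.
        getattr(error_logger, ERROR_LOGGER_FUNCTION[level])(fmt, args)
```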
The {log, Level, Format, Args} command was also
added to stream handlers. Stream handlers should
use this command to log messages because it allows
writing a stream handler to intercept some of those
messages and extract information or block them as
necessary.
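To illustrate why routing log messages through a command is
useful, here is a small Python model of a handler that blocks some
of them. The command tuples mirror the {log, Level, Format, Args}
shape from this message; the function name, the level ordering and
the idea of filtering by a minimum level are assumptions made for
the sketch, not part of Cowboy.

```python
# Assumed severity order, from least to most severe.
LEVELS = ["debug", "info", "notice", "warning",
          "error", "critical", "alert", "emergency"]


def block_below(commands, min_level):
    """Drop log commands less severe than min_level, keep the rest."""
    threshold = LEVELS.index(min_level)
    kept = []
    for command in commands:
        # A log command is modeled as ("log", level, format, args).
        if command[0] == "log" and LEVELS.index(command[1]) < threshold:
            continue  # intercepted: this message is blocked
        kept.append(command)
    return kept
```

A handler placed in front of another one could apply such a filter
to the commands it receives before passing them on, which would not
be possible if the inner handler called the logger directly.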
The logger option only applies to Cowboy itself,
not to the messages Ranch logs, so more work remains
to be done in that area.
|
|
|
|
They are now cowboy:start_clear/3 and cowboy:start_tls/3.
The NumAcceptors argument can be specified via the
num_acceptors transport option. Ranch has been updated
to 1.4.0 to that effect.
|
|
|
|
The stream handlers can be specified using the protocol
option 'stream_handlers'. It defaults to [cowboy_stream_h].
The cowboy_stream_h module currently does not forward the
calls to further stream handlers. It feels like an edge
case; usually we'd want to put our own handlers between
the protocol code and the request process. I am therefore
going to focus on other things for now.
The various types and specifications for stream handlers
have been updated and the cowboy_stream module can now
be safely used as a behavior. The interface might change
a little more, though.
This commit does not include tests or documentation.
They will follow separately.
|
This is a large commit. The cowboy_req interface has largely
changed, and will change a little more. It's possible that
some examples or tests have not been converted to the new
interface yet. The documentation has not yet been updated.
All of this will be fixed in smaller subsequent commits.
Gotta start somewhere...
|
|
|
|
|
|
Breaking changes with previous commit. This is a very large change,
and I am giving up on making a single commit that fixes everything.
More commits will follow slowly adding back features, introducing
new tests and fixing the documentation.
This change contains most of the work toward unifying the interface
for handling both HTTP/1.1 and HTTP/2. HTTP/1.1 connections no longer
use a single process per connection; by default, 1 process per
request is now created as well. This has a number of pros and cons.
Because it has cons, we also allow users to use a lower-level API
that acts on "streams" (requests/responses) directly at the connection
process-level. If performance is a concern, one can always write a
stream handler. The performance in this case will be even greater
than with Cowboy 1, although all the special handlers are unavailable.
When switching to Websocket, after the handler returns from init/2,
Cowboy stops the stream and the Websocket protocol takes over the
connection process. Websocket then calls websocket_init/2 for any
additional initialization such as timers, because the process is
different in init/2 and the websocket_*/* functions. This however would
allow us to use websocket_init/2 for sending messages on connect,
instead of sending ourselves a message and being subject to races.
Note that websocket_init/2 is optional.
This is all a big change and while most of the tests pass, some
functionality currently doesn't. SPDY is broken and will be removed
soon in favor of HTTP/2. Automatic compression is currently disabled.
The cowboy_req interface probably still has a few functions that
need to be updated. The docs and examples no longer reflect the current
functionality.
Everything will be fixed over time. Feedback is more than welcome.
Open a ticket!
|
|
This commit is not only an early preview of HTTP/2, it is an
early preview of the new Cowboy architecture that will be
presented tomorrow in my talk. If you have found it before
the talk, great! It's not complete so you better go watch
the talk anyway.
|
|
It was redundant with middlewares. This allows us to save a few operations
for every incoming request.
|
|
Simplify the interface for most cowboy_req functions. They all return
a single value except the four body reading functions. The reply functions
now only return a Req value.
Access functions do not return a Req anymore.
Functions that used to cache results do not have a cache anymore.
The interface for accessing query string and cookies has therefore
been changed.
There are now three query string functions: qs/1 provides access
to the raw query string value; parse_qs/1 returns the query string
as a list of key/values; match_qs/2 returns a map containing the
values requested in the second argument, after applying constraints
and default values.
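The difference between the three functions can be sketched with a
small Python model. This is not the cowboy_req implementation: the
request is modeled as a plain dict with a "qs" key, a constraint is
modeled as a callable that converts the value, and the field forms
(name, (name, constraint), (name, constraint, default)) are
assumptions chosen for illustration.

```python
from urllib.parse import parse_qsl

_NO_DEFAULT = object()  # marker for "no default value given"


def qs(req):
    """Return the raw query string."""
    return req["qs"]


def parse_qs(req):
    """Return the query string as a list of key/value pairs."""
    return parse_qsl(req["qs"], keep_blank_values=True)


def match_qs(req, fields):
    """Return a dict with only the requested fields.

    Raises KeyError when a field is missing and has no default,
    mirroring the "works or crashes" behavior.
    """
    values = dict(parse_qs(req))
    result = {}
    for field in fields:
        if isinstance(field, str):
            name, constraint, default = field, None, _NO_DEFAULT
        elif len(field) == 2:
            (name, constraint), default = field, _NO_DEFAULT
        else:
            name, constraint, default = field
        if name in values:
            value = values[name]
            # Apply the constraint (here: a converting callable).
            result[name] = constraint(value) if constraint else value
        elif default is not _NO_DEFAULT:
            result[name] = default
        else:
            raise KeyError(name)
    return result
```

For a request modeled as {"qs": "page=2&lang=en"}, asking for
["lang", ("page", int), ("limit", int, 10)] yields the string for
lang, the integer 2 for page, and the default 10 for limit.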
Similarly, there are two cookie functions: parse_cookies/1 and
match_cookies/2. More match functions will be added in future commits.
None of the functions return an error tuple anymore. They either work
or crash. Cowboy will attempt to provide an appropriate status code
in the response of crashed handlers.
As a result, the content decode function has its return value changed
to a simple binary, and the body reading functions only return on success.
|
|
The SPDY connection processes are also supervisors.
Missing:
* sendfile support
* request body reading support
|
|
|
|
|
|
|
|
Should improve the detection of wrong protocol options.
|
This is the first of many API incompatible changes.
You have been warned.
|
|
Also update the CHANGELOG and copyright years.
|
|
|
|
This allows any application to upgrade the protocol options without
having to restart the listener. This is most useful to update the
dispatch list of HTTP servers, for example.
The upgrade is done at the acceptor level, meaning only new connections
receive the new protocol options.
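The consequence of upgrading at the acceptor level can be shown
with a tiny Python model. The class and method names here are
hypothetical and do not correspond to Cowboy's API; the sketch only
demonstrates that a connection keeps the options it was accepted
with.

```python
class Connection:
    def __init__(self, proto_opts):
        # Options are fixed for the lifetime of the connection.
        self.proto_opts = proto_opts


class Listener:
    def __init__(self, proto_opts):
        self._proto_opts = dict(proto_opts)

    def set_protocol_options(self, proto_opts):
        """Replace the options without restarting the listener."""
        self._proto_opts = dict(proto_opts)

    def accept(self):
        """Accept a connection with a snapshot of the current options."""
        return Connection(dict(self._proto_opts))
```

A connection accepted before the upgrade keeps using the old
dispatch list until it closes, while every connection accepted
afterwards sees the new one.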
|
|
|
|
This new exported function returns a Child Spec suitable for embedding
Cowboy in another application's supervisor structure. While here,
implement `start_listener/6` in terms of it.
|