Age | Commit message (Collapse) | Author |
|
|
|
Cowboy will set the socket's buffer size dynamically to
better fit the current workload. When the incoming data
is small, a low buffer size reduces the memory footprint
and improves responsiveness and therefore performance.
When the incoming data is large, such as large HTTP
request bodies, a larger buffer size helps us avoid
doing too many binary appends and related allocations.
Setting a large buffer size for all use cases is
sub-optimal because allocating more than needed
necessarily results in a performance hit (not just
increased memory usage).
By default Cowboy starts with a buffer size of 8192 bytes.
It then doubles or halves the buffer size depending on
the size of the data it receives from the socket. It
stops decreasing at 8192 and increasing at 131072 by
default.
To keep track of the size of the incoming data Cowboy
maintains a moving average. It allows Cowboy to avoid
changing the buffer too often but still react quickly
when necessary. Cowboy will increase the buffer size
when the moving average is above 90% of the current
buffer size, and decrease when the moving average is
below 40% of the current buffer size.
The current buffer size and moving average are
propagated when switching protocols. The dynamic buffer
is implemented in HTTP/1, HTTP/2 and HTTP/1 Websocket.
HTTP/2 Websocket has it disabled because it doesn't
interact directly with the socket; in that case it
is HTTP/2 that has a dynamic buffer.
The dynamic buffer provides a very large performance improvement
in many scenarios, at minimal cost for others. Because it largely
depend on the underlying protocol the improvements are no all equal.
TLS and compression also impact the results.
The improvement when reading a large request body, with the
requests repeated in a fast loop are:
* HTTP: 6x to 20x faster
* HTTPS: 2x to 6x faster
* H2: 4x to 5x faster
* H2C: 20x to 40x faster
I am not sure why H2C's performance was so bad, especially compared
to H2, when using default buffer sizes. Dynamic buffers make H2C a
lot more viable with default settings.
The performance impact on "hello world" type requests is minimal,
it goes from -5% to +5% roughly.
Websocket improvements vary again depending on the protocol, but
also depending on whether compression is enabled:
* HTTP echo: roughly 2x faster
* HTTP send: roughly 4x faster
* H2C echo: roughly 2x faster
* H2C send: 3x to 4x faster
In the echo test we reply back, and Gun doesn't have the dynamic
buffer optimisation, so that probably explains the x2 difference.
With compression however there isn't much improvement. The results
are roughly within -10% to +10% of each other. Zlib compression
seems to be a bottleneck, or at least to modify the performance
profile to such an extent that the size of the buffer does not
matter. This happens to randomly generated binary data as well
so it is probably not caused by the test data.
|
|
Where it wasn't already async. To slightly improve performance.
|
|
Before this commit frames could "cheat" by compressing data
below the limit which would get expanded above the limit.
Now Cowboy will stop decompressing data when the limit is
reached.
|
|
This can be used to limit the maximum frame size before
some authentication or other validation is completed.
|
|
`perf` has shown that Cowboy spends a lot of time
cancelling and starting this timer. Instead of resetting
for every data received, we now only reset a field in the
state.
Before it was working like this:
- start idle timeout timer
- on trigger, close the connection
- on data, cancel and start again
Now it's working like this:
- start idle timeout timer for a tenth of its duration, with tick number = 0
- on trigger, if tick number != 10
- start the timer again, again for a tenth of its duration
- increment tick number
- on trigger, if tick number = 10
- close the connection
- on data, set tick number to 0
|
|
This includes Websocket over HTTP/3.
Since quicer, which provides the QUIC implementation,
is a NIF, Cowboy cannot depend directly on it. In order
to enable QUIC and HTTP/3, users have to set the
COWBOY_QUICER environment variable:
export COWBOY_QUICER=1
In order to run the test suites, the same must be done
for Gun:
export GUN_QUICER=1
HTTP/3 support is currently not available on Windows
due to compilation issues of quicer which have yet to
be looked at or resolved.
HTTP/3 support is also unavailable on the upcoming
OTP-27 due to compilation errors in quicer dependencies.
Once resolved HTTP/3 should work on OTP-27.
Because of how QUIC currently works, it's possible
that streams that get reset after sending a response
do not receive that response. The test suite was
modified to accomodate for that. A future extension
to QUIC will allow us to gracefully reset streams.
This also updates Erlang.mk.
|
|
|
|
This reduces the number of times we need to ask for more packets,
and as a result we get a fairly large boost in performance,
especially with HTTP/1.1.
Unfortunately this makes Cowboy require at least Erlang/OTP 21.3+
because the ssl application did not have active,N. For simplicity
the version required will be Erlang/OTP 22+.
In addition this change improves hibernate handling in
cowboy_websocket. Hibernate will now work for HTTP/2 transport
as well, and stray or unrelated messages will no longer cancel
hibernate (the process will handle the message and go back into
hibernation).
Thanks go to Stressgrid for benchmarking an early version of this
commit: https://stressgrid.com/blog/cowboy_performance_part_2/
|
|
It has been deprecated in OTP and the new way is available
on all supported OTP versions.
|
|
This allows changing the normal exit reason of Websocket
processes, providing a way to signal other processes of
why the exit occurred.
|
|
This allows disabling the UTF-8 validation check
for text and close frames.
|
|
While the protocol does not allow sending data before
receiving a successful Websocket upgrade response, we
do not want to discard that data if it does come in.
|
|
|
|
If we bind too late there might be an exception triggered
in the terminate function and we will not get the correct
stacktrace as a result.
|
|
It allows overriding the idle_timeout option only for now.
|
|
It allows to temporarily disable Websocket compression
when it was negotiated. It's ignored otherwise. This
can be used as fine-grained control when some frames
do not compress well.
|
|
They allow the server to configure what it is willing to accept
for both the negotiated configuration (takeover and window bits)
and the other zlib options (level, mem_level and strategy).
This can be used to reduce the memory and/or CPU footprint of
the compressed data, which comes with a cost in compression ratio.
|
|
|
|
This command is currently not documented. It allows disabling
the reading of incoming data from the socket, and can be used
as a poor man's flow control.
|
|
This feature is currently experimental. It will become the
preferred way to use Websocket handlers once it becomes
documented.
A commands-based interface enables adding commands without
having to change the interface much. It mirrors the interface
of stream handlers or gen_statem. It will enable adding
commands that have been needed for some time but were not
implemented for fear of making the interface too complex.
|
|
|
|
Thanks Artem.
|
|
|
|
Using the current draft:
https://tools.ietf.org/html/draft-ietf-httpbis-h2-websockets-01
|
|
Option allows to limit a frame by size before decoding its payload.
LH: I have added a test for when the limit is reached on a nofin
fragmented frame (the last commit addressed that case but it had
no test). I have fixed formatting and other, and changed the
default value to infinity since it might otherwise be incompatible
with existing code. I also added documentation and a bunch of other
minor changes.
|
|
Also rename a bunch of functions to make the code easier to read.
|
|
|
|
|
|
|
|
|
|
|
|
Unfortunately compression will be disabled for 20.1, 20.1.1
and 20.1.2. In additiona I do not recommend 20.1.3 due to
issues inflating some specific sizes.
|
|
|
|
This option allows customizing the compacting of the Req object
when using Websocket. By default it will keep most public fields
excluding headers of course, since those can be large.
|
|
|
|
Before this commit we had an issue where configuring a
Websocket connection was simply not possible without
doing magic, adding callbacks or extra return values.
The init/2 function only allowed setting hibernate
and timeout options.
After this commit, when switching to a different
type of handler you can either return
{module, Req, State}
or
{module, Req, State, Opts}
where Opts is any value (as far as the sub protocol
interface is concerned) and is ultimately checked
by the custom handlers.
A large protocol like Websocket would accept only
a map there, with many different options, while a
small interface like loop handlers would allow
passing hibernate and nothing else.
For Websocket, hibernate must be set from the
websocket_init/1 callback, because init/2 executes
in a separate process.
Sub protocols now have two callbacks: one with the
Opts value, one without.
The loop handler code was largely reworked and
simplified. It does not need to manage a timeout
or read from the socket anymore, it's the job of
the protocol code. A lot of unnecessary stuff was
therefore removed.
Websocket compression must now be enabled from
the handler options instead of per listener. This
means that a project can have two separate Websocket
handlers with different options. Compression is
still disabled by default, and the idle_timeout
value was changed from inifnity to 60000 (60 seconds),
as that's safer and is also a good value for mobile
devices.
|
|
|
|
|
|
Includes refactoring of the related code to avoid repetition.
|
|
|
|
|
|
After the switch to Websocket, we are no longer in a request/response
scenario, therefore a lot of the cowboy_req functions do not apply
anymore.
Any data required from the request will need to be taken from Req
in init/2 and saved in the handler's state.
|
|
|
|
The option for enabling Websocket compression has been
renamed. Previously it was shared with HTTP compression,
now it's specific to Websocket. The new option is named
'websocket_compress'.
|
|
|
|
|
|
|
|
|
|
|