Age | Commit message (Collapse) | Author |
|
Since messages are accumulated by appending binaries as of three
commits back, the accumulated binary is prone to being copied, as
discussed in the Efficiency Guide. Matching the Message Length header
as bytes are being accumulated is one cause of this, so work around it
by splitting the binary and extracting the length without a match.
This doesn't feel like something that should be necessary: that matching
a binary would cause an append to copy isn't obvious. The first attempt
at simplifying the accumulation was simply to append an incoming binary
to the current fragment, match against <<_, Len:24, _/binary>> to
extract the length, and then test if there are enough bytes for a
message. This lead to horrible performance (response times for 2 MB
messages approximately 100 times worse than previously), and it wasn't
at all obvious that the reason was the accumulated binary being copied
with each append as a result of the match. Using split_binary avoids the
match context that forces the copying.
|
|
Not with each setopts to re-activate the socket.
|
|
Only reset the fragment after all messages have been extracted.
|
|
Appending to a binary is efficient, so just append message fragments
Only match out the length once per message since doing so for every
packet from TCP causes the binary to be copied.
|
|
* anders/diameter/transport/ERL-332: (35 commits)
Capitulate on SCTP vs sparc-sun-solaris2.10
Remove obsolete traffic testcase
Fix dialyzer warnings
Remove client/server string decode from traffic suite
Add diameter_sctp option packet
Add diameter_sctp send/recv callbacks
Let diameter_tcp send/recv callbacks deal in diameter_packet
Randomly select traffic testcases
Exercise diameter_tcp message callbacks in traffic suite
Exercise diameter_{tcp,sctp} sender in traffic suite
Remove upgrade from diameter_traffic
Add diameter_tcp send/recv callbacks
Make diameter_{tcp,sctp} sender configurable
Remove upgrade from diameter_sctp; tweak diameter_tcp to match
Fix incomprehensible dialyzer warning
Simplify acks to transport processes
Strip throttling callbacks from diameter_tcp
Deal with (another) SCTP association id quirk on Solaris
Use binary:copy/2 when generating largish data in test suites
Deal with SCTP association id quirk on Solaris
...
|
|
diameter_sctp.erl:292: Record construction #transport{parent::pid(),mode::{'accept',atom() | pid() | port() | {atom(),atom()}},active::'false',recv::'true',os::0,packet::'true',message_cb::'undefined',send::'false'} violates the declared type of field message_cb::'false' | fun() | maybe_improper_list(fun() | maybe_improper_list(any(),[any()]) | {atom(),atom(),[any()]},[any()]) | {atom(),atom(),[any()]}
diameter_sctp.erl:302: Record construction #transport{mode::{'accept',atom() | pid() | port() | {atom(),atom()}},active::'false',recv::'true',os::0,packet::'true',message_cb::'undefined',send::'false'} violates the declared type of field message_cb::'false' | fun() | maybe_improper_list(fun() | maybe_improper_list(any(),[any()]) | {atom(),atom(),[any()]},[any()]) | {atom(),atom(),[any()]}
|
|
To determine the wrapping of messages passed to recv callbacks and into
diameter. The default passing of the input stream in transport_data is
probably of no practical use, but has been set since time immemorial.
|
|
Corresponding to diameter_tcp callbacks a few commits back. Exercise the
callbacks in the traffic suite.
|
|
To let a recv callback for an incoming request set transport_data and
have it returned in a send callback.
|
|
From the receiver process, that can return binaries to send/receive and
stop the transport process from reading on the socket.
This is still undocumented, and may change.
|
|
With sends still from the receiving process by default, since changing
the default behaviour may well have negative effects. A separate sender
probably implies a greater need for some form of load regulation for
one, since a blocking send would no longer imply that incoming messages
are no longer recevied. Dealing with this could result in the same
deadlock that the sending process intends to avoid, but the user should
be in control over how/when incoming traffic is regulated.
|
|
Added in commit 2afd1fe5. Only rename variables in diameter_tcp, no
functional change.
|
|
This:
diameter_tcp.erl:241: Record construction #transport{parent::'false',ssl::boolean() | maybe_improper_list(),frag::<<>>,tref::'false',flush::'false',pending::0,reset::{1 | 4,0 | 2},throttled::boolean(),q::{0,queue:queue(_)},monitor::'undefined' | pid()} violates the declared type of field parent::pid()
The problem isn't #transport.pid at all, it's #monitor.pid, and the only
relation is that the pid that's assigned to the latter is also (later)
assigned to the former. There is no record construction that assigns
false to #transport.parent.
Introduced in commit 33a535e4.
|
|
Commits starting at 472a080c added a throttle_cb option to diameter_tcp
to let a callback apply backpressure when it decides that additional
requests should not be read. It didn't provide a hook for knowing that
an answer was sent however, which is needed when sends no longer take
place in the receiver process, and is more complicated than it should
be. Strip it all away, in preparation for a simpler incarnation.
|
|
Shutdown events have been seen to get a different association id.
For example, first incoming message with association id = 0:
+ {trace_ts,<6421.268.0>,call,
{diameter_sctp,handle_info,
[{sctp,#Port<6421.2588>,
{10,67,16,179},
44159,
{[{sctp_sndrcvinfo,0,0,[],0,0,0,269950872,269950872,0}],
<<1,0,0,156,128,0,1,1,0,0,0,0,6,193,40,137,6,193,40,137,0,0,
1,8,64,0,0,30,67,45,49,51,52,50,49,55,52,52,49,46,101,114,
108,97,110,103,46,111,114,103,0,0,0,0,1,40,64,0,0,18,101,
114,108,97,110,103,46,111,114,103,0,0,0,0,1,1,64,0,0,14,0,
1,127,0,0,1,0,0,0,0,1,10,64,0,0,12,0,0,48,57,0,0,1,13,0,0,
0,20,79,84,80,47,100,105,97,109,101,116,101,114,0,0,1,22,
64,0,0,12,0,0,0,1,0,0,1,2,64,0,0,12,0,0,0,0,0,0,1,3,64,0,0,
12,0,0,0,3>>}},
{transport,<6421.252.0>,accept,#Port<6421.2588>,true,undefined,
{32,32},
0,undefined}]},
{1493,21505,577938}}
Later, a shutdown event with association id 1536:
+ {trace_ts,<6421.268.0>,call,
{diameter_sctp,handle_info,
[{sctp,#Port<6421.2588>,
{10,67,16,179},
44159,
{[],{sctp_shutdown_event,1536}}},
{transport,<6421.252.0>,accept,#Port<6421.2588>,0,undefined,
{32,32},
2,<6421.304.0>}]},
{1493,21505,746929}}
Both this and the grandparent commit are on this:
$ uname -a
SunOS beren 5.10 Generic_118833-33 sun4v sparc SUNW,Netra-T2000
|
|
In particular, that the association id received in messages on a
one-to-one socket after peeloff may be different from the id received on
the listen socket at comm_up.
This seems odd, since it's then not possible to send until the id is
discovered by reception of an SCTP message containing it, but it's
unclear if this is a bug or a feature, or if it's specific to certain
platforms. Treat it as a feature in this commit, and get the association
id as mentioned, an incoming CER being expected before anything is sent.
Commit da3e5d67 has more history.
|
|
Which dialyzer itself has never complained about.
|
|
Both diameter_tcp and diameter_sctp are susceptible to deadlock since a
peer that blocks send also prevents additional messages from being
received. Send from a process that's paired with the transport process
to avoid this. Use the existing monitor process in the TCP case, add one
in the SCTP case.
This has been the reason for many sporadic testcase failures, mostly in
diameter_traffic_SUITE.
|
|
Should be ssl:close/1.
|
|
Since smooth upgrade won't be supported in this branch.
|
|
|
|
|
|
Commit 5ca5fb71 ensured that they were closed immediately at transport
removal, but in so doing broke their closing at stop service completely,
by removing the timer that caused sockets to be closed even belatedly.
Monitor on the service process to make it happen.
This could still be improved, since stop_service listening ports aren't
closed until after the service process has died. They could be closed
earlier in the case of stop_service.
|
|
The transport interface documented in diameter_transport(3) is used to
start/stop accepting/connecting transport processes: they're started
with a function call, and told to die with their parent process. In the
accepting case, both diameter_tcp and diameter_sctp start a listening
process when the first accepting transport is started. However, there's
no way for a listening process to find out that that it should stop
listening when transport configuration is removed.
Both diameter_tcp and diameter_sctp have used a timer to terminate the
listening process after all existing accepting processes have died as a
consequence of transport removal. The problem with this is that nothing
stops a new client from connecting before this, and also that no new
transport can succeed in opening the same listening port (eg.
reconfiguration) until the old listener dies.
This commit solves the problem by adding diameter_reg:subscribe/2, to
allow callers to subscribe to messages about added/removed associations.
A call to diameter:add_transport/2 results in a new child process that
registers a term that a listening process subscribes to. Transport
removal results in the death of the child, and the resulting
notification to the listener causes the latter to close its socket and
terminate.
This is still an internal interface, but the subscription mechanism
should probably be made external (eg. a diameter:subscribe/1 that can
be used to subscribe to specified messages), so that transport modules
other than diameter's own can make use of it. There is no support for
soft upgrade.
|
|
* anders/diameter/overload/OTP-13330:
Suppress dialyzer warning
Remove dead case clause
Let throttling callback send a throttle message
Acknowledge answers to notification pids when throttling
Throttle properly with TLS
Don't ask throttling callback to receive more unless needed
Let a throttling callback answer a received message
Let a throttling callback discard a received message
Let throttling callback return a notification pid
Make throttling callbacks on message reception
Add diameter_tcp option throttle_cb
|
|
This one:
diameter_tcp.erl:928: (call)
The call diameter_tcp:throttle({'timeout',_},#transport{socket::port() | {'sslsocket',_,_},parent::pid(),module::atom(),frag::binary() | {non_neg_integer(),non_neg_integer(),binary(),[binary()]},ssl::boolean() | [any()],timeout::'infinity' | non_neg_integer(),tref::'false' | reference(),flush::boolean(),throttle_cb::'false' | fun() | maybe_improper_list(fun() | maybe_improper_list(any(),[any()]) | {atom(),atom(),[any()]},[any()]) | {atom(),atom(),[any()]},throttled::'true' | binary()})
will never return since it differs in the 1st argument from the
success typing arguments:
('discard' | 'ok' | binary() | pid() | {'discard' | 'ok' | binary() | pid(),'false' | fun() | [fun() | [any()] | {atom(),atom(),[any()]}] | {atom(),atom(),[any()]}},#transport{socket::port() | {'sslsocket',_,_},parent::pid(),module::atom(),frag::binary() | {non_neg_integer(),non_neg_integer(),binary(),[binary()]},ssl::boolean() | [any()],timeout::'infinity' | non_neg_integer(),tref::'false' | reference(),flush::boolean(),throttle_cb::'false' | fun() | [fun() | [any()] | {atom(),atom(),[any()]}] | {atom(),atom(),[any()]},throttled::binary()})
It's true that the clause doesn't return, because of the throw, and
that's the intention.
|
|
Orphaned in commit 9298872b.
|
|
That is, don't assume that it's only diameter_tcp doing so: allow it to
be received when not throttling. This lets a callback module trigger a
new throttling callback itself, but it's not clear if this will be
useful in practice.
|
|
|
|
In particular, let a callback decide when to receive the initial
message.
|
|
TCP packets can contain more than one message, so only ask to receive
another message if it hasn't already been received.
|
|
As discussed in the parent commit. This is easier said than done in
practice, but there's no harm in allowing it.
|
|
This can be used as a simple form of overload protection, discarding the
message before it's passed into diameter to become one more request
process in a flood. Replying with 3004 would be more appropriate when
the request has been directed at a specific server (the RFC's
requirement) however, and possibly it should be possible for a callback
to do this as well.
|
|
In addition to returning ok or {timeout, Tmo}, let a throttling callback
for message reception return a pid(), which is then notified if the
message in question is either discarded or results in a request process.
Notification is by way of messages of the form
{diameter, discard | {request, pid()}}
where the pid is that of a request process resulting from the received
message. This allows the notification process to keep track of the
maximum number of request processes a peer connection can have given
rise to.
|
|
The callback is now applied to the atom 'false' when asking if another
message should be received on the socket, and to a received binary
message after reception. Throttling on received messages makes it
possible to distinguish between requests and answers.
There is no callback on outgoing messages since these don't have to go
through the transport process, even if they currently do.
|
|
To let a callback module decide whether or to receive another message
from the peer, so that backpressure can be applied when it's
inappropriate. This is to let a callback protect against reading more
than can be processed, which is otherwise possible since diameter_tcp
otherwise always asks for more.
A callback is made after each message, and can answer to continue
reading or to ask again after a timeout. It's each message instead of
each packet partly for simplicity, but also since this should be
sufficiently fine-grained. Per packet would require some interaction
with the fragment timer that flushes partial messages that haven't been
completely received.
|
|
Record field types have been modified due to commit 8ce35b2:
"Take out automatic insertion of 'undefined' from typed record fields".
|
|
The diffs are all about adapting to the OTP 18 time interface. The code
was previously backwards compatible, falling back on the erlang:now/0 if
erlang:monotonic_time/0 is unavailable, but this was seen to be a bad
thing in commit 9c0f2f2c. Use of erlang:now/0 is now removed.
|
|
By doing away with more wrapping that the parent commit started to
remove.
|
|
OTP-12845
* bruce/change-license:
fix errors caused by changed line numbers
Change license text to APLv2
|
|
A listener process in diameter_sctp starts accepting transport processes
as required, either as associations are established or as diameter asks
for a processes to be started. Since this can happen in any order, the
listener maintains two queues: one for processes that diameter has
requested and which are waiting to be given an association, another for
processes that have been started to become owners of an association but
are waiting for diameter to request them. Only one queue at a time is
non-empty. The first queue's length is bounded by the number of
accepting processes configured as pool_size. Entries in the second queue
are short-lived since diameter starts a replacement transport process
whenever an existing one dies or communicates that it has an
association.
The two queues were previously implemented in an ets ordered_set, whose
keys were the pid() of transport processes. Removing an element from the
queue was then done with ets:first/1. The problem with this it's not
really a queue: there's no guarantee that pid-ordering is the same as
the order in which processes are started. If it isn't then it's possible
that an established association never be given to diameter as a
transport process if there's always a newer association whose pid sorts
first. This isn't a problem in practice since it would require new
associations to be established faster than diameter starts transport
processes, but redo the implementation as a queue, with strict FIFO
semantics.
|
|
The changes in some of the previous commits assume application restart.
|
|
Don't pass an association id that's no longer used.
|
|
In particular, don't give the accepting transport process the listening
socket. It was used to match the initial sctp message received in a
peeloff message, but replace the socket in the forwarded message
instead.
|
|
The existing code was a remnant of the pre-peeloff implementation.
There's no need to close anything but the whole socket.
|
|
Listener death should have no effect on a peeled off association.
|
|
Forwarding an sctp message from the listener process at the same time
that the controlling process is changed means there's no guarantee that
the message order will be preserved. Selectively receive the peeloff
message before entering the gen_server loop to ensure the order is
preserved.
|
|
This is not the case under Solaris for one: successive
associations can receive the same association id as a result of peeloff,
the id only being unique for the controlling port, not for the listening
port as is the case under Linux for example. This made for many failures
in the diameter test suites, the traffic suite in particular.
Peeloff in diameter_sctp was introduced in 9a671bf0, before which the
assumption was fine since it was the listening process that owned all
associations. (Which obviously had other drawbacks.) Other remnants of
the pre-peeloff implementation have also been removed: that the listener
process might receive a message on a socket after peeloff for one.
Peeloff in gen_sctp became available in commit 067cfe79, after the
original implementation of diameter_sctp.
This is trace on the unpatched code showing id reuse under Solaris:
+ {trace_ts,<0.103.0>,call,
{diameter_sctp,handle_info,
[{sctp,#Port<0.1625>,
{127,0,0,1},
35904,
{[],{sctp_assoc_change,comm_up,0,32,32,1}}},
{listener,#Ref<0.0.1.948>,#Port<0.1625>,4,
57384,
{-4,61481},
#Ref<0.0.8.12>,
[]}]},
{1432,458752,612168}}
+ {trace_ts,<0.103.0>,call,
{diameter_sctp,handle_info,
[{sctp,#Port<0.1625>,
{127,0,0,1},
35905,
{[],{sctp_assoc_change,comm_up,0,32,32,1}}},
{listener,#Ref<0.0.1.948>,#Port<0.1625>,4,
57384,
{-3,61481},
#Ref<0.0.8.12>,
[]}]},
{1432,458752,613042}}
The result was this, when the second association was incorrectly
forwarded to the first association's controlling process:
** {function_clause,
[{diameter_sctp,transition,
[{peeloff,#Port<0.1635>,
{sctp,#Port<0.1625>,
{127,0,0,1},
35892,
{[],{sctp_assoc_change,comm_up,0,32,32,1}}},
[]},
{transport,<0.107.0>,accept,#Port<0.1634>,1,undefined,{32,32},0}],
[{file,"transport/diameter_sctp.erl"},{line,561}]},
{diameter_sctp,t,2,[{file,"transport/diameter_sctp.erl"},{line,549}]},
{diameter_sctp,handle_info,2,
[{file,"transport/diameter_sctp.erl"},{line,397}]},
{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,614}]},
{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,680}]},
{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,238}]}]}
|
|
|
|
Commit 4b691d8d made it possible for accepting transport processes to be
started concurrently, and commit 77c1b162 adapted diameter_sctp to this,
but missed that the publication of the listener process in diameter_reg
has to precede the return of its start function. As a result, concurrent
starts could result in multiple listener processes.
|