Age | Commit message (Collapse) | Author |
|
What's interesting when implementing some form of load regulation is
when an incoming request has been answered or discarded. Acknowledge
exactly this, not the identity of handler processes as previously. A
transport process can request acks of nonforthcoming answers by sending
{diameter, ack} to the parent peer_fsm, a handler processes identifies
itself with a {handler, pid()} message, and the peer_fsm monitors on
this to be able to send a notification to the transport if the handler
dies before sending an answer.
|
|
When relaying outgoing requests through transport on a remote node,
terms that were stripped when sending to the transport process weren't
stripped when spawning a process on the remote node.
Also, don't save the request to the process dictionary in a process that
just relays an answer.
|
|
The table has existed forever, to route incoming answers to a waiting
request process: each outgoing request writes to the table, and each
incoming answer reads. This has been seen to suffer from lock contention
at high load however, so this commit moves the routing into the
diameter_peer_fsm processes that are diameter's conduit to transport
processes. The request table is still used for failover detection, but
entries are only written when a watchdog state transitions leaves or
enters state OKAY.
|
|
Commit 9a878743 addressed inefficiency at failover, but introduced
inefficiency in the sending of outgoing requests in so doing: each
outgoing request added an request table entry keyed on a transport pid,
then looked for a specific element with this key, and then (later)
removed the inserted element. Since the request table is a bag, this
results in linear searches over a potentially long list of element
keyed on the same pid. The higher the rate of outgoing calls, the more
costly it becomes.
Instead of writing entries to the request table, the peer_up/down calls
to diameter_traffic that mirror transitions to and from the OKAY state
in the RFC 3539 watchdog state machine now result in a process for
request processes to monitor in order to detect failover.
|
|
* anders/diameter/19.1/OTP-13838:
vsn -> 1.12.1
Update appup for 19.1
Fix xmllint errors in documentation
Remove documentation overkill
Don't run traffic tests in parallel when {string_decode, true}
Remove copyright from generated dictionary modules
Fix dictionary function typo
Fix dictionary typo in relay example
|
|
* anders/diameter/failover/OTP-13412:
Make peer failover more efficient
|
|
It's '#get-'/2, not 'get-'/2. Only failed if the dictionary in question
defined no Failed-AVP, which is rarely the case in practice.
Thanks to Ferenc Holzhauser.
|
|
* anders/diameter/overload/OTP-13330:
Suppress dialyzer warning
Remove dead case clause
Let throttling callback send a throttle message
Acknowledge answers to notification pids when throttling
Throttle properly with TLS
Don't ask throttling callback to receive more unless needed
Let a throttling callback answer a received message
Let a throttling callback discard a received message
Let throttling callback return a notification pid
Make throttling callbacks on message reception
Add diameter_tcp option throttle_cb
|
|
Failover caused the entire request table to be scanned in search of
entries with the transport process in question. With many entries
(possibly as a result of the leak fixed in commit 6c9cbd96), this can
lead to the service process hanging in ets:select_trap/1, with memory
growth when many request processes write concurrently. Now write entries
keyed on the transport pid, so that finding request processes at
failover is a lookup rather than a select scanning the entire table.
There is no upgrade handling in that new code doesn't consider that old
code didn't write entries on the transport pid. Thus, a request whose
table entries were written in old code will timeout rather than failover
in new code. That is, there is a small window for failover to be missed
(since request processes are short-lived), but it requires that it take
place during the upgrade.
As a minor aside, don't ignore failovers when sending binaries (which
isn't officially supported), let prepare_retransmit callbacks deal with
modifying the binary as required.
|
|
By sending {diameter, {answer, pid()}} when an incoming answer is sent
to the specified pid, instead of a discard message as previously. The
latter now literally means that the message has been discarded.
|
|
In addition to returning ok or {timeout, Tmo}, let a throttling callback
for message reception return a pid(), which is then notified if the
message in question is either discarded or results in a request process.
Notification is by way of messages of the form
{diameter, discard | {request, pid()}}
where the pid is that of a request process resulting from the received
message. This allows the notification process to keep track of the
maximum number of request processes a peer connection can have given
rise to.
|
|
* anders/diameter/dialyzer/OTP-13400:
Fix dialyzer warnings
|
|
Whether making record declarations unreadable to compensate for
dialyzer's ignorance of match specs is worth it is truly debatable.
|
|
Whether making record declarations unreadable to compensate for
dialyzer's ignorance of match specs is worth it is truly debatable.
|
|
* anders/diameter/retransmission/OTP-13342:
Fix handling of shared peer connections in watchdog state SUSPECT
Remove unnecessary parentheses
Remove dead export
|
|
Not needed as of commit 6c9cbd96.
|
|
The export of diameter_traffic:failover/1 happened with the creation of
the module in commit e49e7acc, but was never needed since the calling
code was also moved into diameter_traffic.
|
|
* anders/diameter/request_leak/OTP-13137:
Fix request table leak at retransmission
Fix request table leak at exit signal
|
|
* anders/diameter/request_leak/OTP-13137:
Fix request table leak at retransmission
Fix request table leak at exit signal
|
|
In the case of retranmission, a prepare_retransmit callback could modify
End-to-End and/or Hop-by-Hop identifiers so that the resulting
diameter_request entry was not removed, since the removal was of entries
with the identifiers of the original request. The chances someone doing
this in practice are probably minimal.
|
|
The storing of request records in the ets table diameter_request was
wrapped in a try/after so that the latter would unconditionally remove
written entries. The problem is that it didn't deal with the process
exiting as a result of an exit signal, since this doesn't raise in an
exception. Since the process in question applies callbacks to user code,
we can potentially be linked to other process and exit as a result.
Trapping exits changes the current behaviour of the process, so spawn a
monitoring process that cleans up upon reception of 'DOWN'.
|
|
* anders/diameter/M-bit/OTP-12947:
Add service_opt() strict_mbit
|
|
There are differing opinions on whether or not reception of an arbitrary
AVP setting the M-bit is an error. 1.3.4 of RFC 6733 says this about
how an existing Diameter application may be modified:
o The M-bit allows the sender to indicate to the receiver whether or
not understanding the semantics of an AVP and its content is
mandatory. If the M-bit is set by the sender and the receiver
does not understand the AVP or the values carried within that AVP,
then a failure is generated (see Section 7).
It is the decision of the protocol designer when to develop a new
Diameter application rather than extending Diameter in other ways.
However, a new Diameter application MUST be created when one or more
of the following criteria are met:
M-bit Setting
An AVP with the M-bit in the MUST column of the AVP flag table is
added to an existing Command/Application. An AVP with the M-bit
in the MAY column of the AVP flag table is added to an existing
Command/Application.
The point here is presumably interoperability: that the command grammar
should specify explicitly what mandatory AVPs much be understood, and
that anything more is an error.
On the other hand, 3.2 says thus about command grammars:
avp-name = avp-spec / "AVP"
; The string "AVP" stands for *any* arbitrary AVP
; Name, not otherwise listed in that Command Code
; definition. The inclusion of this string
; is recommended for all CCFs to allow for
; extensibility.
This renders 1.3.4 pointless unless "*any* AVP" is qualified by "not
setting the M-bit", since the sender can effectively violate 1.3.4
without this necessitating an error at the receiver. If clients add
arbitrary AVPs setting the M-bit then request handling becomes more
implementation-dependent.
The current interpretation in diameter is strict: if a command grammar
doesn't explicitly allow an AVP setting the M-bit then reception of such
an AVP is regarded as an error. The strict_mbit option now allows this
behaviour to be changed, false turning all responsibility for the M-bit
over to the user.
|
|
To diameter_lib:log/4, which was last motivated in commit 39acfdb0.
|
|
OTP-12845
* bruce/change-license:
fix errors caused by changed line numbers
Change license text to APLv2
|
|
* anders/diameter/18/OTP-12588:
vsn -> 1.10
Remove dead upgrade-related code
Update appup for 18
Fix release note typo
Fix comment typo
|
|
Not needed with the parent commit's restart_application.
|
|
To diameter_lib:log/4, which was last motivated in commit 39acfdb0.
|
|
|
|
The message was regarded as unknown if the answer message in question
set the E-bit and the application dictionary was not the common
dictionary.
|
|
That is, outgoing answer messages received in response to a
handle_request callback having returned {relay, Opts}.
|
|
To clarify what it is that's being computed, which isn't entirely
obvious. No functional change, just renaming.
|
|
As the first step in starting to count outgoing, relayed answer
messages.
|
|
An incoming Diameter message is either a request, an answer to an
outstanding request, or an unexpected answer. The latter weren't
counted, but are now counted on keys of this form:
{pid(), {{unknown, 0}, recv, discarded}}
The form of the second element is similar to those of other counters,
like:
{{relay, 0|1}, send|recv, invalid_error_bit}
Compare this to the key used when counting known answers:
{{ApplicationId, CommandCode, 0}, recv}
The application id and command code aren't included so as not to count
on arbitrary keys, a topic last visited in commit 49e8b11c.
|
|
To differentiate between requests and answers, in analogy with relay
counters. This isn't backwards compatible, but these counters aren't yet
documented.
|
|
Commit 49e8b11c broke the counting of relayed message, causing them to
be accumulated as unknown messages.
|
|
Commit a1df50b3 broke result code counters in the case of answer
messages sent as a header/avp lists (unless the avps, untypically, set
the name field), and for answers sent/received in the relay application.
|
|
* anders/diameter/counters/OTP-12701:
Add counters testcase to 3xxx suite
Fix counting error with unknown application id
Add missing doc wording
|
|
Decode of an answer message not setting the E-bit, and containing
Experiment-Result but not Result-Code, identified Result-Code as the
erroneous when Erroneous-Result-Code was 3xxx. Here's an example (from
trace) of a the errors field after decode:
[{5004,
{diameter_avp,undefined,undefined,false,false,undefined,'Result-Code',
3001,undefined,undefined}}],
The diameter_avp was just constructed from the AVP name and decoded
result, without regard for which result code AVP contained the value.
Fix by extracting the AVP from the incoming message.
|
|
Statistics could be accumulated on a key like {{23,275,0}, recv} even
though 23 was not the application id of the dictionary in question.
Missed in commits df19c272 and 7816ab2f.
|
|
To bound the length of incoming messages that will be decoded. A message
longer than the specified number of bytes is discarded. An
incoming_maxlen_exceeded counter is incremented to make note of the
occurrence.
The motivation is to prevent a sufficiently malicious peer from
generating significant load by sending long messages with many AVPs for
diameter to decode. The 24-bit message length header accomodates
(16#FFFFFF - 20) div 12 = 1398099
Unsigned32 AVPs for example, which the current record-valued decode is
too slow with in practice. A bound of 16#FFFF bytes allows for 5461
small AVPs, which is probably more than enough for the majority of
applications, but the default is the full 16#FFFFFF.
|
|
Despite claims of full backwards compatibility, the text of RFC 6733
changes the interpretation of unspecified values in a DiameterURI. In
particular, 3588 says that the default port and transport are 3868 and
sctp respectively, while 6733 says it's either 3868/tcp (aaa) or
5658/tcp (aaas). The 3588 defaults were used regardless, but now use
them only if the common dictionary is diameter_gen_base_rfc3588. The
6733 defaults are used otherwise.
This kind of change in the standard can lead to interop problems, since
a node has to know which RFC its peer is following to know that it will
properly interpret missing URI components. Encode of a URI includes all
components to avoid such confusion.
That said, note that the defaults in the diameter_uri record have *not*
been changed. This avoids breaking code that depends on them, but the
risk is that such code sends inappropriate values. The record defaults
may be changed in a future release, to force values to be explicitly
specified.
|
|
* anders/diameter/string_decode/OTP-11952:
Let examples override default service options
Set {restrict_connections, false} in example server
Set {string_decode, false} in examples
Test {string_decode, false} in traffic suite
Add service_opt() string_decode
Strip potentially large terms when sending outgoing Diameter messages
Improve language consistency in diameter(1)
|
|
* anders/diameter/route_record/OTP-12551:
Fix ordering of AVPs in relayed messages
|
|
To control whether stringish Diameter types are decoded to string or
left as binary. The motivation is the same as in the parent commit: to
avoid large strings being copied when incoming Diameter messages are
passed between processes; or *if* in the case of messages destined for
handle_request and handle_answer callbacks, since these are decoded in
the dedicated processes that the callbacks take place in. It would be
possible to do something about other messages without requiring an
option, but disabling the decode is the most effective.
The value is a boolean(), true being the default for backwards
compatibility. Setting false causes both diameter_caps records and
decoded messages to contain binary() in relevant places that previously
had string(): diameter_app(3) callbacks need to be prepared for the
change.
The Diameter types affected are OctetString and the derived types that
can contain arbitrarily large values: OctetString, UTF8String,
DiameterIdentity, DiameterURI, IPFilterRule, and QoSFilterRule. Time and
Address are unaffected.
The DiameterURI decode has been redone using re(3), which both
simplifies and does away with a vulnerability resulting from the
conversion of arbitrary strings to atom.
The solution continues the use and abuse of the process dictionary for
encode/decode purposes, last seen in commit 0f9cdba.
|
|
Both incoming and outgoing Diameter messages pass through two or three
processes, depending on whether they're incoming or outgoing: the
transport process and corresponding peer_fsm process and (for incoming)
watchdog processes. Since terms other than binary are copied when
passing process boundaries, large terms lead to copying that can be
problematic, if frequent enough. Since only the bin and transport_data
fields of a diameter_packet record are needed by the transport process,
discard others when sending outgoing messages.
Strictly speaking, the statement that only the aforementioned fields are
needed by the transport process depends on the transport process. It's
true of those implemented by diameter (in diameter_tcp and
diameter_sctp), but an implementation that makes use of other fields is
assuming more than the documentation in diameter_transport(3) promises.
|
|
6.1.9 of RFC 6733 states this:
A relay or proxy agent MUST append a Route-Record AVP to all requests
forwarded.
The AVP was inserted as the head of the AVP list, not appended, since
the entire AVP list was reversed relative to the received order.
Thanks to Andrzej TrawiĆski.
|
|
* anders/diameter/retransmission/OTP-12415:
Fix retransmission of messages sent as header/avps list
|
|
Outgoing answers missing a Result-Code AVP or setting an E-bit
inappropriately were discarded, but there's no particular reason for
doing so if the answer can be encoded, and the sender has no way of
knowing that their answer has been discarded. It's also inappropriate
that the message be discarded in the relay case. Answers are now sent,
and an error counter incremented.
|
|
Extracting the End-to-End and Hop-by-Hop identifiers resulted in a
function clause error, causing the send to fail.
|