Age | Commit message (Collapse) | Author |
|
OTP-12845
* bruce/change-license:
fix errors caused by changed line numbers
Change license text to APLv2
|
|
Not needed with the parent commit's restart_application.
|
|
|
|
To bound the length of incoming messages that will be decoded. A message
longer than the specified number of bytes is discarded. An
incoming_maxlen_exceeded counter is incremented to make note of the
occurrence.
The motivation is to prevent a sufficiently malicious peer from
generating significant load by sending long messages with many AVPs for
diameter to decode. The 24-bit message length header accomodates
(16#FFFFFF - 20) div 12 = 1398099
Unsigned32 AVPs for example, which the current record-valued decode is
too slow with in practice. A bound of 16#FFFF bytes allows for 5461
small AVPs, which is probably more than enough for the majority of
applications, but the default is the full 16#FFFFFF.
|
|
* anders/diameter/dpr/OTP-12609:
Discard incoming/outgoing requests after incoming DPR
Add transport_opt() dpr_timeout
Be lenient with errors in incoming DPR
|
|
Despite claims of full backwards compatibility, the text of RFC 6733
changes the interpretation of unspecified values in a DiameterURI. In
particular, 3588 says that the default port and transport are 3868 and
sctp respectively, while 6733 says it's either 3868/tcp (aaa) or
5658/tcp (aaas). The 3588 defaults were used regardless, but now use
them only if the common dictionary is diameter_gen_base_rfc3588. The
6733 defaults are used otherwise.
This kind of change in the standard can lead to interop problems, since
a node has to know which RFC its peer is following to know that it will
properly interpret missing URI components. Encode of a URI includes all
components to avoid such confusion.
That said, note that the defaults in the diameter_uri record have *not*
been changed. This avoids breaking code that depends on them, but the
risk is that such code sends inappropriate values. The record defaults
may be changed in a future release, to force values to be explicitly
specified.
|
|
* anders/diameter/string_decode/OTP-11952:
Let examples override default service options
Set {restrict_connections, false} in example server
Set {string_decode, false} in examples
Test {string_decode, false} in traffic suite
Add service_opt() string_decode
Strip potentially large terms when sending outgoing Diameter messages
Improve language consistency in diameter(1)
|
|
To control whether stringish Diameter types are decoded to string or
left as binary. The motivation is the same as in the parent commit: to
avoid large strings being copied when incoming Diameter messages are
passed between processes; or *if* in the case of messages destined for
handle_request and handle_answer callbacks, since these are decoded in
the dedicated processes that the callbacks take place in. It would be
possible to do something about other messages without requiring an
option, but disabling the decode is the most effective.
The value is a boolean(), true being the default for backwards
compatibility. Setting false causes both diameter_caps records and
decoded messages to contain binary() in relevant places that previously
had string(): diameter_app(3) callbacks need to be prepared for the
change.
The Diameter types affected are OctetString and the derived types that
can contain arbitrarily large values: OctetString, UTF8String,
DiameterIdentity, DiameterURI, IPFilterRule, and QoSFilterRule. Time and
Address are unaffected.
The DiameterURI decode has been redone using re(3), which both
simplifies and does away with a vulnerability resulting from the
conversion of arbitrary strings to atom.
The solution continues the use and abuse of the process dictionary for
encode/decode purposes, last seen in commit 0f9cdba.
|
|
With the same motivation as in commits 5bd2d72 and b1fd629.
As in the latter, incoming DPR is the only exception.
|
|
To cause a peer connection to be closed following an outgoing DPA, in
case the peer fails to do so. It is the recipient of DPA that should
close the connection according to RFC 6733.
|
|
To avoid having the peer interpret the error as meaning the connection
shouldn't be closed, which probably does more harm than ignoring
syntactic errors in the DPR.
Note that RFC 6733 says this about incoming DPR, in 5.4 Disconnecting
Peer Connections:
Upon receipt of the message, the Disconnect-Peer-Answer message is
returned, which SHOULD contain an error if messages have recently been
forwarded, and are likely in flight, which would otherwise cause a race
condition.
The race here is presumably between answers to forwarded requests and
the outgoing DPA, but we have no handling for this: whether or not there
are pending answers is irrelevant to how DPR is answered. It's
questionable that a peer should be able to prevent disconnection in any
case: it has to be the node sending DPR that decides if it's approriate,
and the peer should take it as an indication of what's coming.
Incoming DPA is already treated leniently: the only error that's not
ignored is mismatching End-to-End and Hop-by-Hop Identifiers, since
there's no distinguishing an erroneous value from an unsolicited DPA.
This mismatch could also be ignored, which is the case for DWA for
example, but this problem is already dealt with by dpa_timeout, which
causes a connection to be closed even when the expected DPA isn't
received.
|
|
These are requests that diameter itself sends. It's previously been
possible to send them, but answers timed out at the caller since they
were discarded in diameter_watchdog. Answers will still timeout, but now
the requests are discarded before being sent.
|
|
DPR is sent by diameter at application shutdown, service stop, or
transport removal. It has been possible to send the request with
diameter:call/4, but the answer was discarded, instead of the transport
process being terminated. This commit causes DPR to be handled in the
same way regardless of whether it's sent by diameter or by
diameter:call/4.
Note that the behaviour subsequent to DPA is unchanged. In particular,
in the connecting case, the closed connection will be reestablished
after a connect_timer expiry unless the transport is removed. The more
probable use case is the listening case, to disconnect a single peer
associated with a listening transport.
|
|
To make the default DPA timeout configurable. The timeout say how many
milliseconds to wait for DPA in response to an outgoing DPR before
terminating the transport process regardless.
|
|
Since there's a race between an answer being sent and the connection
being closed upon the reception of DPA that's likely to be lost, and
because of the questionability of sending messages after DPR, as
discussed in the parent commit. An exception is made for DPR so that
simultaneous DPR in both directions doesn't result in it being discarded
on both ends.
|
|
RFC 6733 isn't terribly clear about what should happen to incoming or
outgoing messages once DPR is sent and the Peer State Machine
transitions into state Closing. There's no event for this in section
5.6, Peer State Machine, and no clarification in section 5.4,
Disconnecting Peer Connections. There is a little bit of discussion in
2.1.1, SCTP Guidelines, in relation to unordered message delivery, but
the tone there is that messages might be received after DPR because of
unordered delivery, not because they were actually sent after DPR.
Discarding outgoing answers may do more harm than good, but requests are
more likely to be unexpected, as has been seen to be the case with DWR
following DPR. DPR indicates a desire to close the connection: discard
any subsequent outgoing requests.
|
|
From {error, Reason} to {no_connection, Reason} when a connection can't
be established. The exit reason of a diameter_peer_fsm process is turned
into a message from the corresponding diameter_watchdog process to the
relevant diameter_service process, the latter sending a 'closed' event
including the reason to any subscribers. Reason = [] when none of the
configured transport modules succeeds in establishing a connection,
which admittedly isn't terribly descriptive. (The lists is of error
reasons from transport start functions, which is empty as long as
transport processes start successfully.)
Note that this form of the closed event is undocumented, aside from the
documentation saying that one should expect undocumented events. The
explicitly documented forms are currently specific to CER/CEA failures.
|
|
Commit df19c272 broke this in avoiding counting on arbitrary keys.
It didn't break it sufficiently for the only counters usage in the test
suites to fail however: watchdog counters worked as intended, but no
others, not even CER and DPR. More testcases are needed.
This commit does change/fix the previous semantics somewhat:
- Retransmissions are no longer counted. This previously made it
impossible to distinguish between these and unanswered requests, since
both counted as an outgoing request. There should probably be a
retransmission counter but it should be distinct from the sent request
counter.
- The counting is always on the node from which diameter:call/4 is
invoked, not the node on which the transport resides, as was previously
the case. (Although they're typically one and the same.)
Note that none of these semantics are documented as yet, so we're not
changing a documented interface.
|
|
* anders/diameter/hardening/OTP-11721:
Simplify example server
Make example server answer unsupported requests with 3001
Make example code quiet
Don't count messages on arbitrary keys
Replace traffic-related log reports with no-op function calls
|
|
That is, don't use a key constructed from an incoming Diameter header
unless the message is known to the dictionary in question. Otherwise
there are 2^32 application ids, 2^24 command codes, and 2 R-bits for an
ill-willed peer to choose from, each resulting in new keys in the
counter table (diameter_stats).
The usual {ApplicationId, CommandCode, Rbit} in a key is replaced by the
atom 'unknown' if the message in question is unknown to the decoding
dictionary.
Counters for messages sent and received by a relay are (still) not
implemented.
|
|
The former were a little over-enthusiastic and could cause a node to be
logged to death if a peer Diameter node was sufficiently ill-willed.
The function calls are to diameter_lib:log/4, the arguments of which
identify the happening in question, and which does nothing but provide a
function to trace on. Many existing log calls have been shrunk.
The only remaining traffic-related report (hopefully) is that resulting
from {answer_errors, report} config, and this has been slimmed.
|
|
* anders/diameter/dpr/OTP-11938:
Ensure watchdog dies with transport if DPA was sent
|
|
* anders/diameter/rc_counters/OTP-11937:
Count encode errors in outgoing messages
Count decode errors in incoming requests
Count decode errors independently of result codes
|
|
* anders/diameter/rc_counters/OTP-11891:
Count result codes in CEA/DWA/DPA
|
|
A DPR/DPA exchange should always cause the watchdog process in question
to die with the transport, so that a subsequent connection with the same
peer doesn't result in a 3 x DWR/DWA exchange. Commit 5903d6db saw to
this for the sending of DPR but neglected the corresponding problem for
DPA.
In the case of sending DPR (the aforementioned commit), note that
there's no distinction between receiving DPA as expected and not: the
watchdog dies with the transport regardless.
diameter_watchdog must be loaded first at upgrade.
|
|
Only decode errors were counted previously. Keys are of the form
{Id, send, error}, where Id is:
{ApplicationId, CommandCode, Rbit} | unknown
The latter will be the case if not even a #diameter_header{} can be
constructed.
|
|
Errors were only counted in incoming answers. Counters are keyed on
tuples of the same form:
{{ApplicationId, CommandCode, Rbit}, recv, error}
|
|
Corresponding counters for other answer messages have been counted
previously, but those for CEA, DWA, and DPA have been missing since
diameter itself sends these messages and the implementation is as bit
more separate than it might be. The counters are keyed on values of the
following form.
{{ApplicationId, CommandCode, 0 = Rbit}, send|recv, {'Result-Code', RC}}
|
|
No longer needed to update code in runtime since the emulator is
restarted at a major release.
|
|
* anders/diameter/failed_avp/OTP-11293:
Fix Failed-AVP construction for CEA/DWA/DPA
|
|
Send of Failed-AVP resulted in encode failure.
|
|
Having the peer_fsm process answer DWR meant that watchdog timer expiry
could result in an outgoing DWR despite the fact that an incoming DWR
was just answered. Having the watchdog process answer avoids this.
diameter_peer_fsm must be loaded before diameter_watchdog. It's
possible for one incoming DWR to go unanswered but a subsequent DWR will
be answered so no harm is done.
|
|
* anders/diameter/one_failed_avp/OTP-11127:
Adapt CEA/DPA Failed-AVP to RFC 6733
|
|
* anders/diameter/host_ip_address/OTP-11045:
Respect Host-IP-Address configuration
|
|
Addresses returned from a transport module were always used to populate
Host-IP-Address AVP's in an outgoing CER/CEA, which precluded the
sending of a VIP address. Transport addresses are now only used if
Host-IP-Address is unspecified.
In other words, respect any configured Host-IP-Address, regardless of
the physical addresses returned by the transport. To use the physical
addresses, don't configure Host-IP-Address.
|
|
By setting only one, not many. The handling for other messages (except
DWA, which is forgiving of errors) was dealt with in commit f7ec93e3.
|
|
RFC 6733 recommends against the use of Inband-Security-Id, so only send
a value that differs from the default.
|
|
A transport module can return a local address list from its start/3
function in order to specify addresses to be used as Host-IP-Address
during capabilities exchange. Now allow addresses to be communicated in
a 'connected' message in the case of a connecting transport, so that
diameter_tcp (in particular) can make local address configuration
optional, communicating the gen_tcp default after connection
establishment instead.
|
|
Crashing watchdog and peer_fsm processes was somewhat unseemly. Emit an
error report and die silently instead.
|
|
Faulty configuration was previously passed directly on to watchdog and
peer_fsm processes, diameter:add_transport/2 happily returning ok and
the error resulting on failure of watchdog and/or peer_fsm processes.
Now check for errors before getting this far, returning {error, Reason}
from diameter:add_transport/2 when one is detected. There are still
some errors that can only be detected after transport start (eg. a
misbehaving callback) but most will be caught early.
|
|
The value determines whether or not an unexpected message length in the
header of an incoming messages causes the peer process to exit, the
message to be discarded or handled as usual. The latter may only be
appropriate for message-oriented transport (eg. SCTP) since
stream-oriented transport (eg. TCP) may not be able to recover the
message boundary once a length error has occurred.
|
|
|
|
Instead, use whatever dictionary a transport has configured as
supporting application id 0. This is to support the updated RFC 6733
dictionaries (which bring with them updated records) and also to be able
to transparently support any changed semantics (eg. 5xxx in
answer-message).
|
|
There is no such transition in RFC 3539, the state remains in INITIAL.
|
|
This was the result of the watchdog process exiting as a consequence of
peer death in some casesi, causing a restarted transport to enter
INITIAL when it should enter REOPEN. The watchdog now remains alive as
long as peer shutdown isn't requested and a 'close' message to the
service process (instead of watchdog death) generates 'closed' events
from the service.
|
|
Service process informs the watchdog process which informs the peer
process. (Instead of going directly to the latter in one case.)
|
|
|
|
|
|
|
|
|