Age | Commit message (Collapse) | Author |
|
Despite claims of full backwards compatibility, the text of RFC 6733
changes the interpretation of unspecified values in a DiameterURI. In
particular, 3588 says that the default port and transport are 3868 and
sctp respectively, while 6733 says it's either 3868/tcp (aaa) or
5658/tcp (aaas). The 3588 defaults were previously used regardless; now
use them only if the common dictionary is diameter_gen_base_rfc3588. The
6733 defaults are used otherwise.
This kind of change in the standard can lead to interop problems, since
a node has to know which RFC its peer is following to know that it will
properly interpret missing URI components. Encode of a URI includes all
components to avoid such confusion.
That said, note that the defaults in the diameter_uri record have *not*
been changed. This avoids breaking code that depends on them, but the
risk is that such code sends inappropriate values. The record defaults
may be changed in a future release, to force values to be explicitly
specified.
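As an illustration of the dictionary-dependent defaults described above,
here is a minimal Python sketch. It is not the diameter implementation
(which is Erlang): the function name and the integer rfc argument are
hypothetical, and only the port and transport values are taken from this
message.

```python
def uri_defaults(scheme, rfc):
    """Return (port, transport) for a DiameterURI that specifies neither.

    Hypothetical helper: rfc is 3588 or 6733, standing in for the choice
    of common dictionary; scheme is "aaa" or "aaas".
    """
    if rfc == 3588:
        # RFC 3588 defaults as quoted in the commit message.
        return (3868, "sctp")
    if rfc == 6733:
        # RFC 6733 defaults depend on the scheme.
        defaults = {"aaa": (3868, "tcp"), "aaas": (5658, "tcp")}
        return defaults[scheme]
    raise ValueError("unknown RFC: %r" % (rfc,))
```

The same URI with no port or transport therefore resolves differently
depending on which RFC the decoding node follows, which is why encode
includes all components.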
|
|
* anders/diameter/string_decode/OTP-11952:
Let examples override default service options
Set {restrict_connections, false} in example server
Set {string_decode, false} in examples
Test {string_decode, false} in traffic suite
Add service_opt() string_decode
Strip potentially large terms when sending outgoing Diameter messages
Improve language consistency in diameter(1)
|
|
To control whether stringish Diameter types are decoded to string or
left as binary. The motivation is the same as in the parent commit: to
avoid large strings being copied when incoming Diameter messages are
passed between processes; or *if* they are, in the case of messages
destined for handle_request and handle_answer callbacks, since these are
decoded in the dedicated processes that the callbacks take place in. It would be
possible to do something about other messages without requiring an
option, but disabling the decode is the most effective.
The value is a boolean(), true being the default for backwards
compatibility. Setting false causes both diameter_caps records and
decoded messages to contain binary() in relevant places that previously
had string(): diameter_app(3) callbacks need to be prepared for the
change.
The Diameter types affected are OctetString and the derived types that
can contain arbitrarily large values: OctetString, UTF8String,
DiameterIdentity, DiameterURI, IPFilterRule, and QoSFilterRule. Time and
Address are unaffected.
The DiameterURI decode has been redone using re(3), which both
simplifies the code and does away with a vulnerability resulting from
the conversion of arbitrary strings to atom.
The solution continues the use and abuse of the process dictionary for
encode/decode purposes, last seen in commit 0f9cdba.
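The set of types listed above as affected by string_decode can be
summarized in a small Python sketch. This only models the rule stated in
this message; the names STRINGISH and yields_binary are hypothetical and
do not exist in diameter.

```python
# Types named in the commit message as affected by string_decode.
STRINGISH = frozenset([
    "OctetString", "UTF8String", "DiameterIdentity",
    "DiameterURI", "IPFilterRule", "QoSFilterRule",
])

def yields_binary(avp_type, string_decode=True):
    """True if values of this type are left as binary rather than string.

    string_decode defaults to True, matching the backwards-compatible
    default described in the message.
    """
    return (not string_decode) and avp_type in STRINGISH
```

With the default, nothing changes; with string_decode set to false, only
the listed types are affected, while Time and Address are decoded as
before.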
|
|
Both incoming and outgoing Diameter messages pass through two or three
processes, depending on whether they're incoming or outgoing: the
transport process and corresponding peer_fsm process and (for incoming)
watchdog processes. Since terms other than binary are copied when
passing process boundaries, large terms lead to copying that can be
problematic, if frequent enough. Since only the bin and transport_data
fields of a diameter_packet record are needed by the transport process,
discard others when sending outgoing messages.
Strictly speaking, the statement that only the aforementioned fields are
needed by the transport process depends on the transport process. It's
true of those implemented by diameter (in diameter_tcp and
diameter_sctp), but an implementation that makes use of other fields is
assuming more than the documentation in diameter_transport(3) promises.
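A minimal Python sketch of the stripping described above, assuming a
packet modelled on the diameter_packet record. The class and the function
name strip_for_transport are hypothetical, and the field types are
illustrative only.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

@dataclass
class DiameterPacket:
    # Field names mirror the diameter_packet record; types are illustrative.
    header: Any = None
    avps: Any = None
    msg: Any = None
    bin: Optional[bytes] = None
    errors: List[Any] = field(default_factory=list)
    transport_data: Any = None

def strip_for_transport(pkt):
    """Return a copy holding only the fields the transport process needs."""
    return DiameterPacket(bin=pkt.bin, transport_data=pkt.transport_data)
```

The potentially large header, avps, and msg terms are reset to their
defaults, so only the encoded binary and the transport data cross the
process boundary.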
|
|
* anders/diameter/dpr/OTP-12542:
Discard CER or DWR sent with diameter:call/4
Allow DPR to be sent with diameter:call/4
Add transport_opt() dpa_timeout
Add testcase for sending DPR with diameter:call/4
|
|
DPR is sent by diameter at application shutdown, service stop, or
transport removal. It has been possible to send the request with
diameter:call/4, but the answer was discarded, instead of the transport
process being terminated. This commit causes DPR to be handled in the
same way regardless of whether it's sent by diameter or by
diameter:call/4.
Note that the behaviour subsequent to DPA is unchanged. In particular,
in the connecting case, the closed connection will be reestablished
after a connect_timer expiry unless the transport is removed. The more
probable use case is the listening case, to disconnect a single peer
associated with a listening transport.
|
|
In particular, deal with the deprecation of erlang:now/0 in OTP 18. Be
backwards compatible with older releases: the new api is only used when
available.
The test suites have not been modified.
|
|
There are two timers governing the establishment of peer connections:
connect_timer and watchdog_timer. The former is the RFC 6733 Tc timer
and is used by diameter_service to establish an initial connection. The
latter is RFC 3539 TwInit and is used by diameter_watchdog for
connection reestablishment after the watchdog leaves state INITIAL. A
connecting transport ignored the connect timer since the watchdog
process never died, regardless of the watchdog state, causing the
watchdog timer to handle reconnection.
This seems to have been broken for some time.
|
|
* anders/diameter/hardening/OTP-11721:
Simplify example server
Make example server answer unsupported requests with 3001
Make example code quiet
Don't count messages on arbitrary keys
Replace traffic-related log reports with no-op function calls
|
|
That is, don't use a key constructed from an incoming Diameter header
unless the message is known to the dictionary in question. Otherwise
there are 2^32 application ids, 2^24 command codes, and 2 R-bits for an
ill-willed peer to choose from, each resulting in new keys in the
counter table (diameter_stats).
The usual {ApplicationId, CommandCode, Rbit} in a key is replaced by the
atom 'unknown' if the message in question is unknown to the decoding
dictionary.
Counters for messages sent and received by a relay are (still) not
implemented.
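The key substitution described above can be sketched as follows, in
Python for illustration only. The function name and the
known_to_dictionary flag are hypothetical; in diameter the decision is
made by the decoding dictionary.

```python
def counter_id(app_id, cmd_code, r_bit, known_to_dictionary):
    """Return the identifying part of a counter key.

    Header values are used only for messages the dictionary knows;
    everything else collapses onto the single value "unknown".
    """
    if known_to_dictionary:
        return (app_id, cmd_code, r_bit)
    return "unknown"
```

However many distinct headers a peer sends for unknown messages, they
all map to the same identifier, so the counter table no longer grows
with peer-chosen values.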
|
|
The former were a little over-enthusiastic and could cause a node to be
logged to death if a peer Diameter node was sufficiently ill-willed.
The function calls are to diameter_lib:log/4, the arguments of which
identify the happening in question, and which does nothing but provide a
function to trace on. Many existing log calls have been shrunk.
The only remaining traffic-related report (hopefully) is that resulting
from {answer_errors, report} config, and this has been slimmed.
|
|
* anders/diameter/dpr/OTP-11938:
Ensure watchdog dies with transport if DPA was sent
|
|
* anders/diameter/rc_counters/OTP-11937:
Count encode errors in outgoing messages
Count decode errors in incoming requests
Count decode errors independently of result codes
|
|
* anders/diameter/rc_counters/OTP-11891:
Count result codes in CEA/DWA/DPA
|
|
* anders/diameter/watchdog_leak/OTP-11934:
Simplify sending of 'close' to watchdog
Fix watchdog table leak
|
|
A DPR/DPA exchange should always cause the watchdog process in question
to die with the transport, so that a subsequent connection with the same
peer doesn't result in a 3 x DWR/DWA exchange. Commit 5903d6db saw to
this for the sending of DPR but neglected the corresponding problem for
DPA.
In the case of sending DPR (the aforementioned commit), note that
there's no distinction between receiving DPA as expected and not: the
watchdog dies with the transport regardless.
diameter_watchdog must be loaded first at upgrade.
|
|
Only decode errors were counted previously. Keys are of the form
{Id, send, error}, where Id is:
    {ApplicationId, CommandCode, Rbit} | unknown
The latter will be the case if not even a #diameter_header{} can be
constructed.
|
|
Errors were only counted in incoming answers. Counters are keyed on
tuples of the same form:
    {{ApplicationId, CommandCode, Rbit}, recv, error}
|
|
Corresponding counters for other answer messages have been maintained
previously, but those for CEA, DWA, and DPA have been missing since
diameter itself sends these messages and the implementation is a bit
more separate than it might be. The counters are keyed on values of the
following form:
    {{ApplicationId, CommandCode, 0 = Rbit}, send|recv, {'Result-Code', RC}}
|
|
There's no need to send the message immediately if there's no transport
configuration since that in itself means the service process will tell
the watchdogs to die.
|
|
Commit ef5fddcb (diameter-1.4.1, R16B) caused the leak in the case of an
accepting watchdog with restrict_connections = false. It (correctly)
ensured the state remained at INITIAL but a subsequent 'close' message
to terminate the process was ignored since the state was not DOWN. In
fact, no 'close' was sent since there was no state transition or
previous connection: the former triggers the message from
diameter_service, the latter from diameter_watchdog. The message is now
sent to self() from the watchdog itself.
Send 'close' in the same way when multiple connections to the same peer
are allowed, to avoid waiting for a watchdog timer expiry for the
process to terminate in this case.
|
|
No longer needed to update code in runtime since the emulator is
restarted at a major release.
|
|
* anders/diameter/timer_confusion/OTP-11168:
Rename reconnect_timer -> connect_timer
|
|
The former was misleading since the timer only applies to initial
connection attempts, reconnection attempts being governed by
watchdog_timer. The name is a historic remnant from a (dark, pre-OTP)
time in which RFC 3539 was followed less slavishly than it is now, and
the timer actually did apply to reconnection attempts.
Note that connect_timer corresponds to RFC 6733 Tc, while watchdog_timer
corresponds to RFC 3539 TwInit. The latter RFC makes clear that TwInit
should apply to reconnection attempts. It's less clear if only RFC 6733
is read.
Note also that reconnect_timer is still accepted for backwards
compatibility. It would be possible to add an option to make
reconnect_timer behave strictly as the name suggests (ie. ignore RFC 3539
and interpret RFC 6733 at face value; something that has some value for
testing at least) but no such option is implemented in this commit.
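A minimal Python sketch of accepting the old option name as an alias, as
described above. The function and its arguments are hypothetical, and the
choice that the first matching option wins when both names are given is
an assumption, not something this commit states.

```python
def connect_timer_value(opts, default):
    """Look up connect_timer in a list of (name, value) transport options.

    reconnect_timer is treated as an alias for connect_timer.
    Assumption: the first matching entry wins if both are present.
    """
    for name, value in opts:
        if name in ("connect_timer", "reconnect_timer"):
            return value
    return default
```

Configuration written against the old name keeps working, while new
configuration can use the name that matches what the timer does.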
|
|
Commit e762d7d1 broke outgoing DWA by setting new Hop-by-Hop and
End-to-End identifiers instead of those of the incoming DWR.
|
|
Having the peer_fsm process answer DWR meant that watchdog timer expiry
could result in an outgoing DWR despite the fact that an incoming DWR
was just answered. Having the watchdog process answer avoids this.
diameter_peer_fsm must be loaded before diameter_watchdog. It's
possible for one incoming DWR to go unanswered but a subsequent DWR will
be answered so no harm is done.
|
|
Commit 0b7c87dc caused diameter_watchdog:restart/2 to start returning
'stop', so that a watchdog process for a listening transport that
allowed multiple connections to the same peer would die one watchdog
timeout after losing a connection. The new return value was supposed to
be passed up to transition/2, but was instead passed to set_watchdog/1,
resulting in a function_clause error. The resulting crash was harmless
but unseemly.
Not detected by dialyzer.
Thanks to Aleksander Nycz.
|
|
Crashing watchdog and peer_fsm processes was somewhat unseemly. Emit an
error report and die silently instead.
|
|
Faulty configuration was previously passed directly on to watchdog and
peer_fsm processes, diameter:add_transport/2 happily returning ok and
the error resulting in failure of watchdog and/or peer_fsm processes.
Now check for errors before getting this far, returning {error, Reason}
from diameter:add_transport/2 when one is detected. There are still
some errors that can only be detected after transport start (eg. a
misbehaving callback) but most will be caught early.
|
|
Make it just a number of timeouts, without a new DWR being sent.
|
|
To make the number of watchdogs sent before the transitions REOPEN ->
OKAY and OKAY -> SUSPECT configurable. Using anything other than the
default config is non-standard and should only be used for testing.
|
|
Traffic handling is connected to the service implementation through the
pick_peer callback and failover but diameter_service was getting
unwieldy as home to both the service process and traffic handling.
|
|
Instead, use whatever dictionary a transport has configured as
supporting application id 0. This is to support the updated RFC 6733
dictionaries (which bring with them updated records) and also to be able
to transparently support any changed semantics (eg. 5xxx in
answer-message).
|
|
There is no such transition in RFC 3539: the state remains in INITIAL.
|
|
This was the result of the watchdog process exiting as a consequence of
peer death in some cases, causing a restarted transport to enter
INITIAL when it should enter REOPEN. The watchdog now remains alive as
long as peer shutdown isn't requested and a 'close' message to the
service process (instead of watchdog death) generates 'closed' events
from the service.
|
|
In particular, use watchdog messages as input and do away with the older
connection_up/down (and other) messages. Also, only maintain the
watchdog state, not the older up/down op state.
|
|
Service process informs the watchdog process which informs the peer
process. (Instead of going directly to the latter in one case.)
|
|
Which will be the case with R16B.
|
|
|
|
|
|
A watchdog timeout after DPR but before DPA would previously result
in the watchdog restarting the transport.
|
|
|
|
This makes capabilities available to service_info as soon as
capabilities exchange has been completed. In particular, before state
OKAY is reached.
|
|
Code should be loaded in this order:
    diameter_session   (sequence/1)
    diameter_peer_fsm  (calls to sequence/1)
    diameter_service   (sequence config, mask in receive_message/3)
    diameter_watchdog  (mask in peer start and receive_message/3)
    diameter_config    (accept sequence config)
Order of diameter and diameter_peer doesn't matter.
|
|
This was a remnant of the time when sasl interpreted everything but
shutdown or normal as a crash.
|
|
|
|
|
|
In particular, not before the service process has a monitor on
the watchdog since the watchdog's exit reason is meaningful.
|
|
In diameter_service:
    make_packet       -> make_request_packet
    make_header       -> make_request_header
    make_reply_packet -> make_answer_packet
|
|
Simpler, no duplication of similar makefiles and makes for
better dependencies. (Aka, recursive make considered harmful.)
|