Age | Commit message (Collapse) | Author |
|
* anders/diameter/transport/OTP-12929:
Fix start order of alternate transports
Log discarded answers
|
|
* anders/diameter/lcnt/OTP-12912:
Make ets diameter_stats a set
Remove unnecessary sorting in stats suite
Set ets {write_concurrency, true} on diameter_stats
Don't start watchdog timers unnecessarily
Remove unnecessary erlang:monitor/2 qualification
Add missing watchdog suite clause
|
|
* anders/diameter/caseless/OTP-12902:
Match allowable peer addresses case insensitively
Replace calls to module inet_parse to equivalents in inet
|
|
* anders/diameter/grouped_decode/OTP-12879:
Fix relay encode of decoded diameter_avp lists
|
|
There's no need for it to be ordered, and the ordering has been seen to
have an unexpectedly negative impact on performance in some cases. Order
when retrieving statistics instead, so as not to change the
presentation in diameter:service_info/2.
|
|
A transport configured with diameter:add_transport/2 can be passed
multiple transport_module/transport_config tuples in order to specify
alternate configuration, modules being attempted in order until one
succeeds. This is primarily for the connecting case, to allow a
transport to be configured to first attempt connection over SCTP, and
then TCP in case SCTP fails, with configuration like that documented:
{transport_module, diameter_sctp},
{transport_config, [...], 5000},
{transport_module, diameter_tcp},
{transport_config, [...]}
If the options are the same in both cases, another possibility would be
configuration like this, which attaches the same transport_config to
both modules:
{transport_module, diameter_sctp},
{transport_module, diameter_tcp},
{transport_config, [...], 5000},
However, in this case the start order was reversed relative to the
documented order: first tcp, then sctp. This commit restores the
intended order.
|
|
To diameter_lib:log/4, which was last motivated in commit 39acfdb0.
|
|
Commit c74b593a fixed the problem that a decoded deep diameter_avp list
couldn't be encoded, but did so in the wrong way: there's no need to
reencode component AVPs since the Grouped AVP itself already contains
the encoded binary. The blunder caused diameter_codec:pack_avp/1 to fail
if the first element of the AVP list to be encoded was itself a list.
Thanks to Andrzej TrawiĆski for reporting the problem.
|
|
Both diameter_tcp and diameter_sctp can be configured with one or more
IP addresses from which connections should be accepted (an 'accept'
tuple), specified either as a tuple-valued address or as a regular
expression. In the latter case, peer addresses are mapped to string
using inet:ntoa/1 and the result matched against the regexp. Since
(ipv6) addresses are case insensitive, this should also be the case with
the match, but was not.
|
|
Commits b563c796 (R16B) and 0fad6449 (R16B02) added parse_address/1 and
ntoa/1 to module inet, providing documented alternatives to address/1
and ntoa/1 in the undocumented (save comments in inet(3)) inet_parse.
|
|
Last visited in commit 00584303.
|
|
lcnt:inspect/1 recently showed this:
lock id #tries collisions [%] time [us]
----- --- ------- --------------- ----------
db_tab diameter_stats 932920 92.9326 330332554
|
|
In particular, restart the timer with each incoming Diameter message,
only when the previous timer has expired. Doing so has been seen to
result in high lock contention at load, as in the example below:
(diameter@test)9> lcnt:conflicts([{print, [name, tries, ratio, time]}]).
lock #tries collisions [%] time [us]
----- ------- --------------- ----------
bif_timers 7844528 99.4729 1394434884
db_tab 17240988 1.7947 6286664
timeofday 7358692 5.6729 1399624
proc_link 4814938 2.2736 482985
drv_ev_state 2324012 0.5951 98920
run_queue 21768213 0.2091 63516
pollset 1190174 1.7170 42499
pix_lock 1956 2.5562 39770
make_ref 4697067 0.3669 20211
proc_msgq 9475944 0.0295 5200
timer_wheel 5325966 0.0568 2654
proc_main 10005332 2.8190 1079
pollset_rm_list 59768 1.7752 480
|
|
The function has been auto-exported since R14B.
|
|
* anders/diameter/17.5.5/OTP-12757:
vsn -> 1.9.2
Update appup for 17.5.5
Fix mangled release note
|
|
* anders/diameter/sctp/OTP-12744:
Fix diameter_sctp listener race
Tweak transport suite failures
Run traffic suite over SCTP
|
|
Commit 4b691d8d made it possible for accepting transport processes to be
started concurrently, and commit 77c1b162 adapted diameter_sctp to this,
but missed that the publication of the listener process in diameter_reg
has to precede the return of its start function. As a result, concurrent
starts could result in multiple listener processes.
|
|
- OTP-12741: disfunctional counters
- OTP-12744: diameter_sctp race
No load order requirements.
|
|
The message was regarded as unknown if the answer message in question
set the E-bit and the application dictionary was not the common
dictionary.
|
|
That is, outgoing answer messages received in response to a
handle_request callback having returned {relay, Opts}.
|
|
To clarify what it is that's being computed, which isn't entirely
obvious. No functional change, just renaming.
|
|
As the first step in starting to count outgoing, relayed answer
messages.
|
|
An incoming Diameter message is either a request, an answer to an
outstanding request, or an unexpected answer. The latter weren't
counted, but are now counted on keys of this form:
{pid(), {{unknown, 0}, recv, discarded}}
The form of the second element is similar to those of other counters,
like:
{{relay, 0|1}, send|recv, invalid_error_bit}
Compare this to the key used when counting known answers:
{{ApplicationId, CommandCode, 0}, recv}
The application id and command code aren't included so as not to count
on arbitrary keys, a topic last visited in commit 49e8b11c.
|
|
To differentiate between requests and answers, in analogy with relay
counters. This isn't backwards compatible, but these counters aren't yet
documented.
|
|
Commit 49e8b11c broke the counting of relayed message, causing them to
be accumulated as unknown messages.
|
|
Commit a1df50b3 broke result code counters in the case of answer
messages sent as a header/avp lists (unless the avps, untypically, set
the name field), and for answers sent/received in the relay application.
|
|
* anders/diameter/17.5.3/OTP-12702:
Fix broken pre-17.4 appup
Update appup for 17.5.3
vsn -> 1.9.1
|
|
* anders/diameter/counters/OTP-12701:
Add counters testcase to 3xxx suite
Fix counting error with unknown application id
Add missing doc wording
|
|
* anders/diameter/result_codes/OTP-12654:
Fix broken traffic testcase
Match harder in traffic suite
Don't confuse Result-Code and Experimental-Result
|
|
Decode of an answer message not setting the E-bit, and containing
Experiment-Result but not Result-Code, identified Result-Code as the
erroneous when Erroneous-Result-Code was 3xxx. Here's an example (from
trace) of a the errors field after decode:
[{5004,
{diameter_avp,undefined,undefined,false,false,undefined,'Result-Code',
3001,undefined,undefined}}],
The diameter_avp was just constructed from the AVP name and decoded
result, without regard for which result code AVP contained the value.
Fix by extracting the AVP from the incoming message.
|
|
Upgrade instructions have been added for each 17.X release without
adjusting the instructions for preceeding releases: the instructions
have only been sufficient to upgrading one release at a time: 17.0 to
17.1, 17.1 to 17.2, etc.
Conficting load order requirements make smooth upgrade from an
arbitrarily old release impossible. In this case, 17.3 looks to be as
far back as we can go, so require restart from 17.[0-2] or older.
Update the app suite to deal with binary regexps in appup, and to match
version numbers harder.
|
|
Required load order by ticket.
- OTP-12642, extra bit in diameter_avp.data
- OTP-12654, Result-Code/Experimental-Result confusion
- OTP-12701, counting error with unknown Application Id
none
|
|
Statistics could be accumulated on a key like {{23,275,0}, recv} even
though 23 was not the application id of the dictionary in question.
Missed in commits df19c272 and 7816ab2f.
|
|
In the case of a faulty AVP Length (pointing past the end of a message
or not spanning the header), an extra bit is prepended to data bytes in
diameter_avp:collect_avps/1 in order to force a 5014 decode error. The
bit is supposed to be removed as part of the decode in diameter_gen.hrl
but this didn't happen in case of an AVP that unknown to the dictionary
in question.
|
|
As for the port number in the parent commit, a FQDN can't be arbitrarily
long, at most 255 octets. Make decode fail if it's more.
|
|
A port number is a 16-bit integer, but the regexp used to parse it in
commit 1590920 slavishly followed the RFC 6733 grammar in matching an
arbitrary number of digits. Make decode fail if it's anything more than
5, to avoid doing erlang:list_to_integer/1 on arbitrarily large lists.
Also make it fail if the resulting integer is outside of the expected
range.
|
|
To bound the length of incoming messages that will be decoded. A message
longer than the specified number of bytes is discarded. An
incoming_maxlen_exceeded counter is incremented to make note of the
occurrence.
The motivation is to prevent a sufficiently malicious peer from
generating significant load by sending long messages with many AVPs for
diameter to decode. The 24-bit message length header accomodates
(16#FFFFFF - 20) div 12 = 1398099
Unsigned32 AVPs for example, which the current record-valued decode is
too slow with in practice. A bound of 16#FFFF bytes allows for 5461
small AVPs, which is probably more than enough for the majority of
applications, but the default is the full 16#FFFFFF.
|
|
It was possible to configure the option, but doing so caused the service
to fail when starting a watchdog process:
{function_clause,
[{diameter_service,'-spawn_opts/1-lc$^0/1-0-',
[false],
[{file,"base/diameter_service.erl"},{line,846}]},
{diameter_service,start,5,
[{file,"base/diameter_service.erl"},{line,820}]},
{diameter_service,start,3,
[{file,"base/diameter_service.erl"},{line,782}]},
{diameter_service,handle_call,3,
[{file,"base/diameter_service.erl"},{line,385}]},
{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,607}]},
{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,639}]},
{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}
Tests for the option in the config suite were also missing.
Bungled in commit 78b3dc6.
|
|
Required load order by ticket.
- OTP-11492, answer messages discarded
- OTP-12415, retransmission failure
- OTP-12475, grouped AVP decode
- OTP-12543, no requests after DPR
none
- OTP-12412, shutdown issues
diameter_lib
diameter_service
- OTP-12428, transport_opt() pool_size
diameter_lib
diameter_service
diameter, diameter_config
diameter_{tcp,sctp}
diameter, diameter_config
- OTP-12439, new time api in Erlang/OTP 18
diameter_lib
diameter_{config,peer,reg,service,session,stats,sync,watchdog,sctp}
- OTP-11952, service_opt() decode_string
- OTP-12589, DiameterURI encode/decode
diameter_{capx,codec,peer}
diameter_types
diameter_traffic
diameter_{service,peer_fsm}
diameter_watchdog
diameter, diameter_config
- OTP-12542, DPR with diameter:call/4
diameter_{peer_fsm,watchdog}
diameter, diameter_config
- OTP-12609, transport_opt() dpr_timeout
diameter_peer_fsm
diameter, diameter_config
|
|
* anders/diameter/dpr/OTP-12609:
Discard incoming/outgoing requests after incoming DPR
Add transport_opt() dpr_timeout
Be lenient with errors in incoming DPR
|
|
Despite claims of full backwards compatibility, the text of RFC 6733
changes the interpretation of unspecified values in a DiameterURI. In
particular, 3588 says that the default port and transport are 3868 and
sctp respectively, while 6733 says it's either 3868/tcp (aaa) or
5658/tcp (aaas). The 3588 defaults were used regardless, but now use
them only if the common dictionary is diameter_gen_base_rfc3588. The
6733 defaults are used otherwise.
This kind of change in the standard can lead to interop problems, since
a node has to know which RFC its peer is following to know that it will
properly interpret missing URI components. Encode of a URI includes all
components to avoid such confusion.
That said, note that the defaults in the diameter_uri record have *not*
been changed. This avoids breaking code that depends on them, but the
risk is that such code sends inappropriate values. The record defaults
may be changed in a future release, to force values to be explicitly
specified.
|
|
Both RFC 3588 and 6733 disallow the combination. Make its encode fail.
|
|
* anders/diameter/string_decode/OTP-11952:
Let examples override default service options
Set {restrict_connections, false} in example server
Set {string_decode, false} in examples
Test {string_decode, false} in traffic suite
Add service_opt() string_decode
Strip potentially large terms when sending outgoing Diameter messages
Improve language consistency in diameter(1)
|
|
* anders/diameter/route_record/OTP-12551:
Fix ordering of AVPs in relayed messages
|
|
To control whether stringish Diameter types are decoded to string or
left as binary. The motivation is the same as in the parent commit: to
avoid large strings being copied when incoming Diameter messages are
passed between processes; or *if* in the case of messages destined for
handle_request and handle_answer callbacks, since these are decoded in
the dedicated processes that the callbacks take place in. It would be
possible to do something about other messages without requiring an
option, but disabling the decode is the most effective.
The value is a boolean(), true being the default for backwards
compatibility. Setting false causes both diameter_caps records and
decoded messages to contain binary() in relevant places that previously
had string(): diameter_app(3) callbacks need to be prepared for the
change.
The Diameter types affected are OctetString and the derived types that
can contain arbitrarily large values: OctetString, UTF8String,
DiameterIdentity, DiameterURI, IPFilterRule, and QoSFilterRule. Time and
Address are unaffected.
The DiameterURI decode has been redone using re(3), which both
simplifies and does away with a vulnerability resulting from the
conversion of arbitrary strings to atom.
The solution continues the use and abuse of the process dictionary for
encode/decode purposes, last seen in commit 0f9cdba.
|
|
Both incoming and outgoing Diameter messages pass through two or three
processes, depending on whether they're incoming or outgoing: the
transport process and corresponding peer_fsm process and (for incoming)
watchdog processes. Since terms other than binary are copied when
passing process boundaries, large terms lead to copying that can be
problematic, if frequent enough. Since only the bin and transport_data
fields of a diameter_packet record are needed by the transport process,
discard others when sending outgoing messages.
Strictly speaking, the statement that only the aforementioned fields are
needed by the transport process depends on the transport process. It's
true of those implemented by diameter (in diameter_tcp and
diameter_sctp), but an implementation that makes use of other fields is
assuming more than the documentation in diameter_transport(3) promises.
|
|
With the same motivation as in commits 5bd2d72 and b1fd629.
As in the latter, incoming DPR is the only exception.
|
|
To cause a peer connection to be closed following an outgoing DPA, in
case the peer fails to do so. It is the recipient of DPA that should
close the connection according to RFC 6733.
|
|
To avoid having the peer interpret the error as meaning the connection
shouldn't be closed, which probably does more harm than ignoring
syntactic errors in the DPR.
Note that RFC 6733 says this about incoming DPR, in 5.4 Disconnecting
Peer Connections:
Upon receipt of the message, the Disconnect-Peer-Answer message is
returned, which SHOULD contain an error if messages have recently been
forwarded, and are likely in flight, which would otherwise cause a race
condition.
The race here is presumably between answers to forwarded requests and
the outgoing DPA, but we have no handling for this: whether or not there
are pending answers is irrelevant to how DPR is answered. It's
questionable that a peer should be able to prevent disconnection in any
case: it has to be the node sending DPR that decides if it's approriate,
and the peer should take it as an indication of what's coming.
Incoming DPA is already treated leniently: the only error that's not
ignored is mismatching End-to-End and Hop-by-Hop Identifiers, since
there's no distinguishing an erroneous value from an unsolicited DPA.
This mismatch could also be ignored, which is the case for DWA for
example, but this problem is already dealt with by dpa_timeout, which
causes a connection to be closed even when the expected DPA isn't
received.
|
|
* anders/diameter/dpr/OTP-12542:
Discard CER or DWR sent with diameter:call/4
Allow DPR to be sent with diameter:call/4
Add transport_opt() dpa_timeout
Add testcase for sending DPR with diameter:call/4
|