Age | Commit message (Collapse) | Author |
|
Failover caused the entire request table to be scanned in search of
entries with the transport process in question. With many entries
(possibly as a result of the leak fixed in commit 6c9cbd96), this can
lead to the service process hanging in ets:select_trap/1, with memory
growth when many request processes write concurrently. Now write entries
keyed on the transport pid, so that finding request processes at
failover is a lookup rather than a select scanning the entire table.
There is no upgrade handling: new code doesn't take into account that old
code didn't write entries keyed on the transport pid. Thus, a request whose
table entries were written in old code will time out rather than fail over
in new code. That is, there is a small window in which failover can be
missed (small since request processes are short-lived), but only if the
failover takes place during the upgrade.
As a minor aside, don't ignore failovers when sending binaries (which
isn't officially supported); let prepare_retransmit callbacks deal with
modifying the binary as required.
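The difference between the scan and the keyed lookup can be sketched as follows. This is an illustration only: the module name, the req record and the table layout are hypothetical and are not the actual diameter_request layout.

```erlang
-module(request_index_sketch).
-export([demo/0]).

%% Hypothetical record; the real diameter_request entries differ.
-record(req, {key, transport, handler}).

demo() ->
    Tab = ets:new(requests, [bag, public, {keypos, #req.key}]),
    TPid = self(),                      %% stand-in for a transport pid
    Other = spawn(fun() -> ok end),     %% stand-in for another transport
    %% Each request writes an entry under its own key ...
    true = ets:insert(Tab, #req{key = {seq, 1}, transport = TPid, handler = h1}),
    true = ets:insert(Tab, #req{key = {seq, 2}, transport = Other, handler = h2}),
    %% ... and a second entry keyed on the transport pid.
    true = ets:insert(Tab, #req{key = TPid, transport = TPid, handler = h1}),
    true = ets:insert(Tab, #req{key = Other, transport = Other, handler = h2}),
    %% Scan: the key is not fully bound, so every object is examined.
    Spec = [{#req{key = {seq, '_'}, transport = TPid, handler = '$1'},
             [],
             ['$1']}],
    Scan = ets:select(Tab, Spec),
    %% Lookup: only the objects stored under the transport pid are read.
    Lookup = [H || #req{handler = H} <- ets:lookup(Tab, TPid)],
    true = ets:delete(Tab),
    {Scan, Lookup}.
```

Both expressions find the same handler, but the cost of the select grows with the size of the table while the lookup depends only on the number of entries for the transport in question.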
|
|
|
|
* anders/diameter/dialyzer/OTP-13400:
Fix dialyzer warnings
|
|
* anders/diameter/17.5.6.9/OTP-13385:
vsn -> 1.9.2.4
Update appup for 17.5.6.9
|
|
* anders/diameter/retransmission/OTP-13342:
Fix handling of shared peer connections in watchdog state SUSPECT
Remove unnecessary parentheses
Remove dead export
|
|
Whether making record declarations unreadable to compensate for
dialyzer's ignorance of match specs is worth it is truly debatable.
|
|
|
|
OTP-13342 remote watchdog transition to state SUSPECT
|
|
A peer connection shared from a remote node was regarded as being
available for peer selection (aka up) as long as its peer_fsm process
was alive; that is, for the lifetime of the peer connection. In
particular, it didn't take note of transitions into watchdog state
SUSPECT, when the connection remains. As a result, retransmissions could
select the same peer connection whose watchdog transition caused the
retransmission.
A service process now broadcasts a peer_down event just as it
does a peer_up event.
The fault predates the table rearrangements of commit 8fd4e5f4.
|
|
Not needed as of commit 6c9cbd96.
|
|
The export of diameter_traffic:failover/1 happened with the creation of
the module in commit e49e7acc, but was never needed since the calling
code was also moved into diameter_traffic.
|
|
A too-wide function clause was used in ssl_connection, which led to ssl
connection process crashes when `{hibernate_after, N}` with an extremely
small N was passed among other options to `ssl:connect`.
|
|
|
|
* anders/diameter/17.5.6.8/OTP-13212:
vsn -> 1.9.2.3
Update appup for 17.5.6.8
|
|
Each service process maintains a dictionary of peers, mapping an
application alias to a {pid(), #diameter_caps{}} list of connected
peers. These lists are potentially large, peers were appended to the end
of the list for no particular reason, and these long lists were
constructed/deconstructed when filtering them for pick_peer callbacks.
Many simultaneous outgoing requests could then slow the VM to a crawl,
with many scheduled processes mired in list manipulation.
The pseudo-dicts are now replaced by plain ets tables. The reason for
them was (once upon a time) to have an interface interchangeable with a
plain dict for debugging purposes, but strict swappability hasn't been the
case for some time now, and in practice a swap has never taken place.
Additional tables mapping Origin-Host/Realm have also been introduced,
to minimize the size of the peers lists when peers are filtered on
host/realm. For example, a filter like
{any, [{all, [realm, host]}, realm]}
is probably a very common case: preferring a Destination-Realm/Host
match before falling back on Destination-Realm alone. This is now more
efficiently (but not equivalently) expressed as
{first, [{all, [realm, host]}, realm]}
to stop the search when the best match is made, and extracts peers from
host/realm tables instead of searching through the list of all peers
supporting the application in question. The code to try and start with a
lookup isn't exhaustive, and the 'any' filter is still as inefficient as
previously.
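As a usage fragment, the filter can be passed as a call option. SvcName, AppAlias and Request are placeholders for whatever the caller already has, and this assumes the filter is given through the {filter, ...} option of diameter:call/4.

```erlang
%% With 'first', filtering stops at the first filter in the list that
%% matches any peer: realm and host together if possible, else realm.
Opts = [{filter, {first, [{all, [realm, host]}, realm]}}],
diameter:call(SvcName, AppAlias, Request, Opts).
```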
|
|
See commit 862af31d.
|
|
|
|
OTP-13164 more efficient peer lists
One module. Downgrade not supported.
|
|
|
|
* anders/diameter/17.5.6.7/OTP-13211:
vsn -> 1.9.2.2
Update/fix appup for 17.5.6.7
Be resilient to diameter_service state upgrades
|
|
* anders/diameter/request_leak/OTP-13137:
Fix request table leak at retransmission
Fix request table leak at exit signal
|
|
* anders/diameter/17/watchdog/OTP-12969:
Fix watchdog function_clause
|
|
* anders/diameter/M-bit/OTP-12947:
Add service_opt() strict_mbit
|
|
|
|
OTP-12947 strict_mbit
OTP-12969 watchdog function_clause
OTP-13137 request leak
diameter_config (that allows the new option) should be loaded after the
others.
Anchor was missing from one regexp. Patches did not accumulate through
older versions.
|
|
By not failing in code that looks up state: pick_peer and service_info.
|
|
In the case of retransmission, a prepare_retransmit callback could modify
End-to-End and/or Hop-by-Hop identifiers so that the resulting
diameter_request entry was not removed, since the removal was of entries
with the identifiers of the original request. The chances of someone doing
this in practice are probably minimal.
|
|
The storing of request records in the ets table diameter_request was
wrapped in a try/after so that the latter would unconditionally remove
written entries. The problem is that it didn't deal with the process
exiting as a result of an exit signal, since this doesn't raise an
exception. Since the process in question applies callbacks to user code,
we can potentially be linked to other processes and exit as a result.
Trapping exits changes the current behaviour of the process, so spawn a
monitoring process that cleans up upon reception of 'DOWN'.
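A minimal sketch of the technique, not the actual diameter_traffic code: the module, function and key names are hypothetical. A helper process monitors the request process and deletes the table entry when it receives 'DOWN', which also covers an exit signal that would bypass a try/after.

```erlang
-module(request_cleanup_sketch).
-export([demo/0]).

%% Spawn a helper that removes Key from Tab when Pid dies, regardless of
%% whether the death is due to an exception or an exit signal.
cleanup_on_exit(Tab, Key, Pid) ->
    spawn(fun() ->
                  MRef = erlang:monitor(process, Pid),
                  receive
                      {'DOWN', MRef, process, _, _} ->
                          ets:delete(Tab, Key)
                  end
          end).

demo() ->
    Tab = ets:new(requests, [set, public]),
    Parent = self(),
    Req = spawn(fun() ->
                        Helper = cleanup_on_exit(Tab, req1, self()),
                        true = ets:insert(Tab, {req1, self()}),
                        Parent ! {ready, Helper},
                        receive never_sent -> ok end
                end),
    Cleaner = receive {ready, H} -> H end,
    CRef = erlang:monitor(process, Cleaner),
    exit(Req, kill),                    %% exit signal: no 'after' would run
    receive {'DOWN', CRef, process, _, _} -> ok end,
    [] = ets:lookup(Tab, req1),         %% the helper has removed the entry
    true = ets:delete(Tab),
    ok.
```

The request process keeps its behaviour since it doesn't need to trap exits; the cost is one extra short-lived process per request.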
|
|
|
|
* ia/ssl/maint-17/backport-of-18-fix:
ssl: Prepare for release
ssl: Do not crash on proprietary hash_sign algorithms
|
|
Commit 4f365c07 introduced the error on set_watchdog/2, as a consequence
of timeout/1 returning stop, which only happens with accepting
transports with {restrict_connections, false}.
|
|
There are differing opinions on whether or not reception of an arbitrary
AVP setting the M-bit is an error. 1.3.4 of RFC 6733 says this about
how an existing Diameter application may be modified:
o The M-bit allows the sender to indicate to the receiver whether or
not understanding the semantics of an AVP and its content is
mandatory. If the M-bit is set by the sender and the receiver
does not understand the AVP or the values carried within that AVP,
then a failure is generated (see Section 7).
It is the decision of the protocol designer when to develop a new
Diameter application rather than extending Diameter in other ways.
However, a new Diameter application MUST be created when one or more
of the following criteria are met:
M-bit Setting
An AVP with the M-bit in the MUST column of the AVP flag table is
added to an existing Command/Application. An AVP with the M-bit
in the MAY column of the AVP flag table is added to an existing
Command/Application.
The point here is presumably interoperability: that the command grammar
should specify explicitly what mandatory AVPs must be understood, and
that anything more is an error.
On the other hand, 3.2 says thus about command grammars:
avp-name = avp-spec / "AVP"
; The string "AVP" stands for *any* arbitrary AVP
; Name, not otherwise listed in that Command Code
; definition. The inclusion of this string
; is recommended for all CCFs to allow for
; extensibility.
This renders 1.3.4 pointless unless "*any* AVP" is qualified by "not
setting the M-bit", since the sender can effectively violate 1.3.4
without this necessitating an error at the receiver. If clients add
arbitrary AVPs setting the M-bit then request handling becomes more
implementation-dependent.
The current interpretation in diameter is strict: if a command grammar
doesn't explicitly allow an AVP setting the M-bit then reception of such
an AVP is regarded as an error. The strict_mbit option now allows this
behaviour to be changed, false turning all responsibility for the M-bit
over to the user.
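As a configuration fragment, the option is given with the other service options. The service name, host names and callback module below are placeholders chosen for illustration; only strict_mbit is the point.

```erlang
%% Options for diameter:start_service/2. Everything except strict_mbit
%% is a placeholder; example_callback is a hypothetical callback module.
SvcOpts = [{'Origin-Host', "client.example.com"},
           {'Origin-Realm', "example.com"},
           {'Vendor-Id', 0},
           {'Product-Name', "Example"},
           {'Auth-Application-Id', [0]},
           {strict_mbit, false},
           {application, [{alias, common},
                          {dictionary, diameter_gen_base_rfc6733},
                          {module, example_callback}]}],
ok = diameter:start_service(example_service, SvcOpts).
```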
|
|
Too much code was removed in commit 560f73141af.
|
|
|
|
* anders/diameter/17.5.6.3/OTP-12927:
vsn -> 1.9.2.1
Update appup for 17.5.6.3
|
|
* anders/diameter/17/time/OTP-12926:
Simplify time manipulation
Remove use of monotonic time in pre-18 code
Remove unnecessary redefinition of erlang:max/2
|
|
* anders/diameter/grouped_errors/OTP-12930:
Fix decode of Grouped AVPs containing errors
Simplify logic
Simplify logic
|
|
* anders/diameter/transport/OTP-12929:
Fix start order of alternate transports
Log discarded answers
|
|
* anders/diameter/lcnt/OTP-12912:
Make ets diameter_stats a set
Remove unnecessary sorting in stats suite
Set ets {write_concurrency, true} on diameter_stats
Don't start watchdog timers unnecessarily
Remove unnecessary erlang:monitor/2 qualification
Add missing watchdog suite clause
|
|
* anders/diameter/caseless/OTP-12902:
Match allowable peer addresses case insensitively
Replace calls to module inet_parse with equivalents in inet
|
|
* anders/diameter/grouped_decode/OTP-12879:
Fix relay encode of decoded diameter_avp lists
|
|
* anders/diameter/decode/OTP-12891:
Don't compute AVP list length unnecessarily at AVP decode
|
|
* anders/diameter/decode/OTP-12871:
Don't traverse errors list unnecessarily when detecting missing AVPs
Don't flag AVP as missing as a consequence of decode error
Correct inaccurate doc
Truncate potentially large terms passed to diameter_lib:log/4
|
|
There's no need for it to be ordered, and the ordering has been seen to
have an unexpectedly negative impact on performance in some cases. Order
when retrieving statistics instead, so as not to change the
presentation in diameter:service_info/2.
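The idea, sketched with a hypothetical table and counter keys rather than the real diameter_stats code: keep the table an unordered set for cheap concurrent writes, and sort only when the counters are read.

```erlang
-module(stats_sketch).
-export([demo/0]).

%% Increment a counter, creating it at zero if it doesn't yet exist.
incr(Tab, Key) ->
    ets:insert_new(Tab, {Key, 0}),
    ets:update_counter(Tab, Key, 1).

demo() ->
    %% An unordered set: writers don't pay for keeping keys ordered.
    Tab = ets:new(stats, [set, public, {write_concurrency, true}]),
    [incr(Tab, K) || K <- [{recv, 2}, {send, 1}, {recv, 2}]],
    %% Order at retrieval so that the presentation is unchanged.
    Sorted = lists:sort(ets:tab2list(Tab)),
    true = ets:delete(Tab),
    Sorted.
```

Sorting at retrieval moves the cost from every counter update to the comparatively rare read of the statistics.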
|
|
The ordering of (ets) diameter_stats (also unnecessary) ensures the
sorting.
|