aboutsummaryrefslogtreecommitdiffstats
path: root/lib/diameter/src/base/diameter_service.erl
AgeCommit message (Collapse)Author
2017-06-13Let spawn_opt config replace erlang:spawn_opt/2 for request processesAnders Svensson
By accepting an MFA that is applied to the fun that is otherwise spawned for each incoming request, to allow handler processes to be reused. This is not yet documented and may change, but the motivation is to let spawn be replaced by process pool, from which the MFA selects. A list-valued spawn_opt is equivalent to {erlang, spawn_opt, [Opts]}.
2017-06-13Don't deconstruct {TPid, Caps} unnecessarilyAnders Svensson
The tuple is returned from and passed to callbacks, so retain the tuple instead of its elements.
2017-06-13Remove use of process dictionary in decodeAnders Svensson
By passing additional arguments through it.
2017-06-12Remove upgrade-related codeAnders Svensson
This and subsequent commits are destined for OTP 20.0.
2017-03-07Don't use request table for answer routingAnders Svensson
The table has existed forever, to route incoming answers to a waiting request process: each outgoing request writes to the table, and each incoming answer reads. This has been seen to suffer from lock contention at high load however, so this commit moves the routing into the diameter_peer_fsm processes that are diameter's conduit to transport processes. The request table is still used for failover detection, but entries are only written when a watchdog state transitions leaves or enters state OKAY.
2016-05-30Don't restart transport processes after transport removalAnders Svensson
A replacement accepting transport could be started after the service process received a shutdown message from diameter_config, if a connection was accepted before the transport process in question was terminated. The replacement lived on until the service needed to restart it.
2016-05-30Rename diameter_reg:del -> removeAnders Svensson
Letters are cheap.
2016-05-09Merge branch 'anders/diameter/info/OTP-13508'Anders Svensson
* anders/diameter/info/OTP-13508: Add diameter:peer_find/1 Add diameter:peer_info/1
2016-05-04Add diameter:peer_info/1Anders Svensson
To return information about a single peer_ref(), to avoid having to retrieve more than is needed with service_info/2.
2016-03-09Merge branch 'maint'Anders Svensson
2016-03-07Fix dialyzer warningsAnders Svensson
Whether making record declarations unreadable to compensate for dialyzer's ignorance of match specs is worth it is truly debatable.
2016-03-07Merge branch 'maint'Anders Svensson
2016-03-07Merge branch 'anders/diameter/retransmission/OTP-13342' into maintAnders Svensson
* anders/diameter/retransmission/OTP-13342: Fix handling of shared peer connections in watchdog state SUSPECT Remove unnecessary parentheses Remove dead export
2016-02-19Fix handling of shared peer connections in watchdog state SUSPECTAnders Svensson
A peer connection shared from a remote node was regarded as being available for peer selection (aka up) as long as its peer_fsm process was alive; that is, for the lifetime of the peer connection. In particular, it didn't take note of transitions into watchdog state SUSPECT, when the connection remains. As a result, retransmissions could select the same peer connection whose watchdog transition caused the retransmission. A service process now broadcasts a peer_down event just as it does a peer_up event. The fault predates the table rearrangements of commit 8fd4e5f4.
2016-02-09Make peer handling more efficientAnders Svensson
Each service process maintains a dictionary of peers, mapping an application alias to a {pid(), #diameter_caps{}} list of connected peers. These lists are potentially large, peers were appended to the end of the list for no particular reason, and these long lists were constructed/deconstructed when filtering them for pick_peer callbacks. Many simultaneous outgoing request could then slow the VM to a crawl, with many scheduled processes mired in list manipulation. The pseudo-dicts are now replaced by plain ets tables. The reason for them was (once upon a time) to have an interface interchangeable with a plain dict for debugging purposes, but strict swapablity hasn't been the case for some time now, and in practice a swap has never taken place. Additional tables mapping Origin-Host/Realm have also been introduced, to minimize the size of the peers lists when peers are filtered on host/realm. For example, a filter like {any, [{all, [realm, host]}, realm]} is probably a very common case: preferring a Destination-Realm/Host match before falling back on Destination-Realm alone. This is now more efficiently (but not equivalently) expressed as {first, [{all, [realm, host]}, realm]} to stop the search when the best match is made, and extracts peers from host/realm tables instead of searching through the list of all peers supporting the application in question. The code to try and start with a lookup isn't exhaustive, and the 'any' filter is still as inefficient as previously.
2016-02-09Remove unnecessary erlang:monitor/2 qualificationAnders Svensson
See commit 862af31d.
2015-12-22Merge branch 'maint'Anders Svensson
2015-12-22Merge branch 'maint-17' into maintAnders Svensson
2015-12-21Make peer handling more efficientAnders Svensson
Each service process maintains a dictionary of peers, mapping an application alias to a {pid(), #diameter_caps{}} list of connected peers. These lists are potentially large, peers were appended to the end of the list for no particular reason, and these long lists were constructed/deconstructed when filtering them for pick_peer callbacks. Many simultaneous outgoing request could then slow the VM to a crawl, with many scheduled processes mired in list manipulation. The pseudo-dicts are now replaced by plain ets tables. The reason for them was (once upon a time) to have an interface interchangeable with a plain dict for debugging purposes, but strict swapablity hasn't been the case for some time now, and in practice a swap has never taken place. Additional tables mapping Origin-Host/Realm have also been introduced, to minimize the size of the peers lists when peers are filtered on host/realm. For example, a filter like {any, [{all, [realm, host]}, realm]} is probably a very common case: preferring a Destination-Realm/Host match before falling back on Destination-Realm alone. This is now more efficiently (but not equivalently) expressed as {first, [{all, [realm, host]}, realm]} to stop the search when the best match is made, and extracts peers from host/realm tables instead of searching through the list of all peers supporting the application in question. The code to try and start with a lookup isn't exhaustive, and the 'any' filter is still as inefficient as previously.
2015-12-21Remove unnecessary erlang:monitor/2 qualificationAnders Svensson
See commit 862af31d.
2015-12-20Merge branch 'anders/diameter/17.5.6.7/OTP-13211' into maint-17Erlang/OTP
* anders/diameter/17.5.6.7/OTP-13211: vsn -> 1.9.2.2 Update/fix appup for 17.5.6.7 Be resilient to diameter_service state upgrades
2015-12-20Be resilient to diameter_service state upgradesAnders Svensson
By not failing in code that looks up state: pick_peer and service_info.
2015-10-09Update DiameterHans Bolinder
Record field types have been modified due to commit 8ce35b2: "Take out automatic insertion of 'undefined' from typed record fields".
2015-09-14Merge branch 'anders/diameter/M-bit/OTP-12947' into maintAnders Svensson
* anders/diameter/M-bit/OTP-12947: Add service_opt() strict_mbit
2015-08-25Add service_opt() strict_mbitAnders Svensson
There are differing opinions on whether or not reception of an arbitrary AVP setting the M-bit is an error. 1.3.4 of RFC 6733 says this about how an existing Diameter application may be modified: o The M-bit allows the sender to indicate to the receiver whether or not understanding the semantics of an AVP and its content is mandatory. If the M-bit is set by the sender and the receiver does not understand the AVP or the values carried within that AVP, then a failure is generated (see Section 7). It is the decision of the protocol designer when to develop a new Diameter application rather than extending Diameter in other ways. However, a new Diameter application MUST be created when one or more of the following criteria are met: M-bit Setting An AVP with the M-bit in the MUST column of the AVP flag table is added to an existing Command/Application. An AVP with the M-bit in the MAY column of the AVP flag table is added to an existing Command/Application. The point here is presumably interoperability: that the command grammar should specify explicitly what mandatory AVPs much be understood, and that anything more is an error. On the other hand, 3.2 says thus about command grammars: avp-name = avp-spec / "AVP" ; The string "AVP" stands for *any* arbitrary AVP ; Name, not otherwise listed in that Command Code ; definition. The inclusion of this string ; is recommended for all CCFs to allow for ; extensibility. This renders 1.3.4 pointless unless "*any* AVP" is qualified by "not setting the M-bit", since the sender can effectively violate 1.3.4 without this necessitating an error at the receiver. If clients add arbitrary AVPs setting the M-bit then request handling becomes more implementation-dependent. The current interpretation in diameter is strict: if a command grammar doesn't explicitly allow an AVP setting the M-bit then reception of such an AVP is regarded as an error. The strict_mbit option now allows this behaviour to be changed, false turning all responsibility for the M-bit over to the user.
2015-08-13Merge branch 'maint-17' into maintAnders Svensson
The diffs are all about adapting to the OTP 18 time interface. The code was previously backwards compatible, falling back on the erlang:now/0 if erlang:monotonic_time/0 is unavailable, but this was seen to be a bad thing in commit 9c0f2f2c. Use of erlang:now/0 is now removed.
2015-08-05Simplify time manipulationAnders Svensson
By doing away with more wrapping that the parent commit started to remove.
2015-06-18Change license text to APLv2Bruce Yinhe
2015-03-27Add service_opt() incoming_maxlenAnders Svensson
To bound the length of incoming messages that will be decoded. A message longer than the specified number of bytes is discarded. An incoming_maxlen_exceeded counter is incremented to make note of the occurrence. The motivation is to prevent a sufficiently malicious peer from generating significant load by sending long messages with many AVPs for diameter to decode. The 24-bit message length header accomodates (16#FFFFFF - 20) div 12 = 1398099 Unsigned32 AVPs for example, which the current record-valued decode is too slow with in practice. A bound of 16#FFFF bytes allows for 5461 small AVPs, which is probably more than enough for the majority of applications, but the default is the full 16#FFFFFF.
2015-03-24Add service_opt() string_decodeAnders Svensson
To control whether stringish Diameter types are decoded to string or left as binary. The motivation is the same as in the parent commit: to avoid large strings being copied when incoming Diameter messages are passed between processes; or *if* in the case of messages destined for handle_request and handle_answer callbacks, since these are decoded in the dedicated processes that the callbacks take place in. It would be possible to do something about other messages without requiring an option, but disabling the decode is the most effective. The value is a boolean(), true being the default for backwards compatibility. Setting false causes both diameter_caps records and decoded messages to contain binary() in relevant places that previously had string(): diameter_app(3) callbacks need to be prepared for the change. The Diameter types affected are OctetString and the derived types that can contain arbitrarily large values: OctetString, UTF8String, DiameterIdentity, DiameterURI, IPFilterRule, and QoSFilterRule. Time and Address are unaffected. The DiameterURI decode has been redone using re(3), which both simplifies and does away with a vulnerability resulting from the conversion of arbitrary strings to atom. The solution continues the use and abuse of the process dictionary for encode/decode purposes, last seen in commit 0f9cdba.
2015-03-05Merge branch 'anders/diameter/time/OTP-12439' into maintAnders Svensson
* anders/diameter/time/OTP-12439: Use new time api in test suites Use new time api in implementation
2015-03-05Merge branch 'anders/diameter/pool/OTP-12428' into maintAnders Svensson
* anders/diameter/pool/OTP-12428: Fix SCTP match blunder in suites Be backwards compatible with diameter_sctp listener state Add gen_tcp testcase that fails sporadically Simplify transport suite Remove (ancient) dead code Don't orphan slave nodes in example suite Refresh example code Improve language consistency in diameter(1) Add pool suite to test transport_opt() pool_size Adapt tcp/sctp transport modules for pool_size > 1 Add transport_opt() pool_size
2015-02-20Use new time api in implementationAnders Svensson
In particular, deal with the deprecation of erlang:now/0 in OTP 18. Be backwards compatible with older releases: the new api is only used when available. The test suites have not been modified.
2015-02-20Add transport_opt() pool_sizeAnders Svensson
Transport processes are started by diameter one at a time. In the listening case, a transport process accepts a connection, tells the peer_fsm process, which tells its watchdog process, which tells its service process, which then starts a new watchdog, which starts a new peer_fsm, which starts a new transport process, which (finally) goes about accepting another connection. In other words, not particularly aggressive in accepting new connections. This behaviour doesn't do particularly well with a large number of concurrent connections: with TCP and 250 connecting peers we see connections being refused. This commit adds the possibilty of configuring a pool of accepting processes, by way of a new transport option, pool_size. Instead of diameter:add_transport/2 starting just a single process, it now starts the configured number, so that instead of a single process waiting for a connection there's now a pool. The option is even available for connecting processes, which provides an alternate to adding multiple transports when multiple connections to the same peer are required. In practice this also means configuring {restrict_connections, false}: this is not implicit. For backwards compatibility, the form of diameter:service_info(_,transport) differs in the connecting case, depending on whether or not pool_size is configured. Note that transport processes for the same transport_ref() can be started concurrently when pool_size > 1. This places additional requirements on diameter_{tcp,sctp}, that will be dealt with in a subsequent commit.
2015-01-19Monitor more efficiently at shutdownAnders Svensson
There's no need for building a pid list only to map it to a list of monitor references. Also, monitoring before banging the shutdown message makes for better trace, avoiding unnecessary noproc reasons when the process dies before the monitor is created.
2014-11-27Order peers in pick_peer callbacksAnders Svensson
The order of peers presented to a diameter_app(3) pick_peer callback has previously not been documented, but there are use cases that are simplified by an ordering. For example, consider preferring a direct connection to a specified Destination-Host/Realm to any host in the realm. The implementation previously treated this as a special case by placing matching hosts at the head of the peers list, but the documentation made no guarantees. Now present peers in match-order, so that the desired sorting is the result of the following filter. {any, [{all, [host, realm]}, realm]} The implementation is not backwards compatible in the sense that a realm filter alone is no longer equivalent in this case. However, as stated, the documentation never made any guarantees regarding the sorting.
2014-08-05Map binary process info to a reference/byte countAnders Svensson
That is, instead of including the list in a diameter:service_info/2 info tuple, only include the number of references and the number of bytes referenced. The list itself can be quite large and typically isn't that interesting, at least not to a diameter user.
2014-07-21Add info item for diameter:service_info/2Anders Svensson
To extract only process info from connections info, which can be useful to reduce the amount of information returned. Choose 'info' for the item since process_info is more than one word: all others are one. Don't choose memory since it's too specific: might want to use it for more.
2014-07-21Add (process) info tuple to diameter:service_info/2Anders Svensson
To show process_info of interest. This is not yet documented since it may well change.
2014-05-26Replace traffic-related log reports with no-op function callsAnders Svensson
The former were a little over-enthusiastic and could cause a node to be logged to death if a peer Diameter node was sufficiently ill-willed. The function calls are to diameter_lib:log/4, the arguments of which identify the happening in question, and which does nothing but provide a function to trace on. Many existing log calls have been shrunk. The only remaining traffic-related report (hopefully) is that resulting from {answer_errors, report} config, and this has been slimmed.
2014-05-25Merge branch 'anders/diameter/watchdog_leak/OTP-11934' into maintAnders Svensson
* anders/diameter/watchdog_leak/OTP-11934: Simplify sending of 'close' to watchdog Fix watchdog table leak
2014-05-25Merge branch 'anders/diameter/request_leak/OTP-11893' into maintAnders Svensson
* anders/diameter/request_leak/OTP-11893: Fix leaking request table Add check that request table is empty to failover suite Comment fix
2014-05-20Simplify sending of 'close' to watchdogAnders Svensson
There's no need to send the message immediately if there's no transport configuration since that in itself means the service process will tell the watchdogs to die.
2014-05-20Fix watchdog table leakAnders Svensson
Commit ef5fddcb (diameter-1.4.1, R16B) caused the leak in the case of an accepting watchdog with restrict_connections = false. It (correctly) ensured the state remained at INITIAL but a subsequent 'close' message to terminate the process was ignored since the state was not DOWN. In fact, no 'close' was sent since there was no state transition or previous connection: the former triggers the message from diameter_service, the latter from diameter_watchdog. The message is now sent to self() from the watchdog itself. Send 'close' in the same way when multiple connections to the same peer are allowed, to avoid waiting for a watchdog timer expiry for the process to terminate in this case.
2014-05-18Fix leaking request tableAnders Svensson
A new connection writes the pid to the table diameter_request. The normal handling is that loss of a connection leads to a watchdog state change in the service process, which removes the entry, but this usually won't happen in the case of diameter:stop_service/1 since the service process is terminated without waiting for watchdog transitions. The request table should really be service-specific, so that the table is deleted when the service is stopped, which requires passing the table identifier into request processes and handling that the table may not exist. Just clear out the service-specific entries at service process termination for now.
2014-03-31Merge branch 'anders/diameter/pick_peer/OTP-11789'Anders Svensson
* anders/diameter/pick_peer/OTP-11789: Fix pick_peer case clause failure
2014-03-20Fix pick_peer case clause failureAnders Svensson
In the case of {call_mutates_state, true} configuration on the service in question, any peer selection that failed to select a peer resulted in a case clause failure in diameter_service:pick_peer/5, when the call to the service process returned false. This was noticed in the case of a peer failover in which an alternate peer wasn't available. The explicit matching is intentional, to match exactly what's expected.
2014-01-27Remove upgrade-related codeAnders Svensson
No longer needed to update code in runtime since the emulator is restarted at a major release.
2013-11-29Rename reconnect_timer -> connect_timerAnders Svensson
The former was misleading since the timer only applies to initial connection attempts, reconnection attempts being governed by watchdog_timer. The name is a historic remnant from a (dark, pre-OTP) time in which RFC 3539 was followed less slavishly than it is now, and the timer actually did apply to reconnection attempts. Note that connect_timer corresponds to RFC 6733 Tc, while watchdog_timer corresponds to RFC 3539 TwInit. The latter RFC makes clear that TwInit should apply to reconnection attempts. It's less clear if only RFC 6733 is read. Note also that reconnect_timer is still accepted for backwards compatibility. It would be possible to add an option to make reconnect_timer behave strictly as the name suggests (ie. ignore RFC 3539 and interpret RFC 6733 at face value; something that has some value for testing at least) but no such option is implemented in this commit.
2013-09-04Fix broken spawn_optAnders Svensson
The option was morphed into a boolean() and then ignored.