aboutsummaryrefslogtreecommitdiffstats
path: root/lib/diameter
AgeCommit message (Collapse)Author
2015-06-19Ensure accepting processes are first in, first outAnders Svensson
A listener process in diameter_sctp starts accepting transport processes as required, either as associations are established or as diameter asks for a processes to be started. Since this can happen in any order, the listener maintains two queues: one for processes that diameter has requested and which are waiting to be given an association, another for processes that have been started to become owners of an association but are waiting for diameter to request them. Only one queue at a time is non-empty. The first queue's length is bounded by the number of accepting processes configured as pool_size. Entries in the second queue are short-lived since diameter starts a replacement transport process whenever an existing one dies or communicates that it has an association. The two queues were previously implemented in an ets ordered_set, whose keys were the pid() of transport processes. Removing an element from the queue was then done with ets:first/1. The problem with this it's not really a queue: there's no guarantee that pid-ordering is the same as the order in which processes are started. If it isn't then it's possible that an established association never be given to diameter as a transport process if there's always a newer association whose pid sorts first. This isn't a problem in practice since it would require new associations to be established faster than diameter starts transport processes, but redo the implementation as a queue, with strict FIFO semantics.
2015-06-19Remove upgrade-related codeAnders Svensson
The changes in some of the previous commits assume application restart.
2015-06-19Be less parallel in traffic suiteAnders Svensson
At the current count, there are 128 groups run in parallel, each of which runs 52 testcases in parallel. That makes for 128*52 = 6656 testcases, which is probably also a factor in the sporadic failures addressed by the parent commit. Don't run the 128 groups in parallel.
2015-06-19Increase send/receive buffers for testsuite SCTP listenersAnders Svensson
The defaults result in sporadic timeouts in the traffic suite after testing over SCTP was added in commit fadf753b. The behaviour looks to be specific to SLES 11, and is presumably the same resends/congestion that lead to the buffers being increased in the gen_sctp suite in commit 12febf13 (and commented in commit e931991f). The behaviour hasn't been seen on SLES 10.
2015-06-19Decrease unnecessarily long testsuite timetrapsAnders Svensson
2015-06-19Simplify accepting transport startAnders Svensson
Don't pass an association id that's no longer used.
2015-06-19Simplify peeloff signalingAnders Svensson
In particular, don't give the accepting transport process the listening socket. It was used to match the initial sctp message received in a peeloff message, but replace the socket in the forwarded message instead.
2015-06-19Simplify socket close at terminateAnders Svensson
The existing code was a remnant of the pre-peeloff implementation. There's no need to close anything but the whole socket.
2015-06-19Don't monitor listener after peeloffAnders Svensson
Listener death should have no effect on a peeled off association.
2015-06-19Don't receive initial messages out of orderAnders Svensson
Forwarding an sctp message from the listener process at the same time that the controlling process is changed means there's no guarantee that the message order will be preserved. Selectively receive the peeloff message before entering the gen_server loop to ensure the order is preserved.
2015-06-19Remove assumption that SCTP association ids will be uniqueAnders Svensson
This is not the case under Solaris for one: successive associations can receive the same association id as a result of peeloff, the id only being unique for the controlling port, not for the listening port as is the case under Linux for example. This made for many failures in the diameter test suites, the traffic suite in particular. Peeloff in diameter_sctp was introduced in 9a671bf0, before which the assumption was fine since it was the listening process that owned all associations. (Which obviously had other drawbacks.) Other remnants of the pre-peeloff implementation have also been removed: that the listener process might receive a message on a socket after peeloff for one. Peeloff in gen_sctp became available in commit 067cfe79, after the original implementation of diameter_sctp. This is trace on the unpatched code showing id reuse under Solaris: + {trace_ts,<0.103.0>,call, {diameter_sctp,handle_info, [{sctp,#Port<0.1625>, {127,0,0,1}, 35904, {[],{sctp_assoc_change,comm_up,0,32,32,1}}}, {listener,#Ref<0.0.1.948>,#Port<0.1625>,4, 57384, {-4,61481}, #Ref<0.0.8.12>, []}]}, {1432,458752,612168}} + {trace_ts,<0.103.0>,call, {diameter_sctp,handle_info, [{sctp,#Port<0.1625>, {127,0,0,1}, 35905, {[],{sctp_assoc_change,comm_up,0,32,32,1}}}, {listener,#Ref<0.0.1.948>,#Port<0.1625>,4, 57384, {-3,61481}, #Ref<0.0.8.12>, []}]}, {1432,458752,613042}} The result was this, when the second association was incorrectly forwarded to the first association's controlling process: ** {function_clause, [{diameter_sctp,transition, [{peeloff,#Port<0.1635>, {sctp,#Port<0.1625>, {127,0,0,1}, 35892, {[],{sctp_assoc_change,comm_up,0,32,32,1}}}, []}, {transport,<0.107.0>,accept,#Port<0.1634>,1,undefined,{32,32},0}], [{file,"transport/diameter_sctp.erl"},{line,561}]}, {diameter_sctp,t,2,[{file,"transport/diameter_sctp.erl"},{line,549}]}, {diameter_sctp,handle_info,2, [{file,"transport/diameter_sctp.erl"},{line,397}]}, {gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,614}]}, {gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,680}]}, {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,238}]}]}
2015-05-29Update release notesErlang/OTP
2015-05-29Merge branch 'anders/diameter/test/OTP-12767' into maint-17Erlang/OTP
* anders/diameter/test/OTP-12767: Replace config suite call to erlang:now/0 Fix incorrect suite usage of OTP 18 monotonic time Make tls suite crash more verbosely
2015-05-29Merge branch 'anders/diameter/17.5.5/OTP-12757' into maint-17Erlang/OTP
* anders/diameter/17.5.5/OTP-12757: vsn -> 1.9.2 Update appup for 17.5.5 Fix mangled release note
2015-05-29Merge branch 'anders/diameter/sctp/OTP-12744' into maint-17Erlang/OTP
* anders/diameter/sctp/OTP-12744: Fix diameter_sctp listener race Tweak transport suite failures Run traffic suite over SCTP
2015-05-24Fix diameter_sctp listener raceAnders Svensson
Commit 4b691d8d made it possible for accepting transport processes to be started concurrently, and commit 77c1b162 adapted diameter_sctp to this, but missed that the publication of the listener process in diameter_reg has to precede the return of its start function. As a result, concurrent starts could result in multiple listener processes.
2015-05-24Tweak transport suite failuresAnders Svensson
Make anything but a comm_up sctp_assoc_change crash. Make timeouts more reasonable.
2015-05-24Run traffic suite over SCTPAnders Svensson
Previously it was only run over TCP. Configure a pool of accepting processes since simultaneous connections are otherwise prone to rejection, as discussed in commit 4b691d8d. Tweak timeouts to more reasonable values.
2015-05-24Replace config suite call to erlang:now/0Anders Svensson
To remove a compilation warning with OTP 18.
2015-05-24Fix incorrect suite usage of OTP 18 monotonic timeAnders Svensson
Value was used as strictly increasing when it's only non-decreasing, causing testcases to fail.
2015-05-24Make tls suite crash more verboselyAnders Svensson
To see why it's failing on at least one test machine.
2015-05-23vsn -> 1.9.2Anders Svensson
2015-05-23Update appup for 17.5.5Anders Svensson
- OTP-12741: disfunctional counters - OTP-12744: diameter_sctp race No load order requirements.
2015-05-23Fix mangled release noteAnders Svensson
2015-05-18Fix counting of no_result_code/invalid_error_bitAnders Svensson
The message was regarded as unknown if the answer message in question set the E-bit and the application dictionary was not the common dictionary.
2015-05-18Count relayed answersAnders Svensson
That is, outgoing answer messages received in response to a handle_request callback having returned {relay, Opts}.
2015-05-18Rename dictionary-related functions/variablesAnders Svensson
To clarify what it is that's being computed, which isn't entirely obvious. No functional change, just renaming.
2015-05-18Lift answer send up the call chainAnders Svensson
As the first step in starting to count outgoing, relayed answer messages.
2015-05-18Count discarded incoming messagesAnders Svensson
An incoming Diameter message is either a request, an answer to an outstanding request, or an unexpected answer. The latter weren't counted, but are now counted on keys of this form: {pid(), {{unknown, 0}, recv, discarded}} The form of the second element is similar to those of other counters, like: {{relay, 0|1}, send|recv, invalid_error_bit} Compare this to the key used when counting known answers: {{ApplicationId, CommandCode, 0}, recv} The application id and command code aren't included so as not to count on arbitrary keys, a topic last visited in commit 49e8b11c.
2015-05-18Include R-bit in unknown message counter keysAnders Svensson
To differentiate between requests and answers, in analogy with relay counters. This isn't backwards compatible, but these counters aren't yet documented.
2015-05-18Fix broken relay countersAnders Svensson
Commit 49e8b11c broke the counting of relayed message, causing them to be accumulated as unknown messages.
2015-05-18Fix broken result code countersAnders Svensson
Commit a1df50b3 broke result code counters in the case of answer messages sent as a header/avp lists (unless the avps, untypically, set the name field), and for answers sent/received in the relay application.
2015-05-17Add counters testcase to relay suiteAnders Svensson
Which fails for a variety of reasons to be addressed in subsequent commits.
2015-05-06Prepare releaseErlang/OTP
2015-05-06Merge branch 'anders/diameter/17.5.3/OTP-12702' into maint-17Erlang/OTP
* anders/diameter/17.5.3/OTP-12702: Fix broken pre-17.4 appup Update appup for 17.5.3 vsn -> 1.9.1
2015-05-06Merge branch 'anders/diameter/counters/OTP-12701' into maint-17Erlang/OTP
* anders/diameter/counters/OTP-12701: Add counters testcase to 3xxx suite Fix counting error with unknown application id Add missing doc wording
2015-05-06Merge branch 'anders/diameter/result_codes/OTP-12654' into maint-17Erlang/OTP
* anders/diameter/result_codes/OTP-12654: Fix broken traffic testcase Match harder in traffic suite Don't confuse Result-Code and Experimental-Result
2015-05-05Fix broken traffic testcaseAnders Svensson
The send_error testcase tested that Session-Id in an answer-message was not undefined, but that's always the case since the AVP has arity 0 or 1. The correct test is that it's a list of length 1, to ensure that diameter has inserted the session id as expected.
2015-05-05Match harder in traffic suiteAnders Svensson
To ensure that the expected answer messages are received.
2015-05-05Don't confuse Result-Code and Experimental-ResultAnders Svensson
Decode of an answer message not setting the E-bit, and containing Experiment-Result but not Result-Code, identified Result-Code as the erroneous when Erroneous-Result-Code was 3xxx. Here's an example (from trace) of a the errors field after decode: [{5004, {diameter_avp,undefined,undefined,false,false,undefined,'Result-Code', 3001,undefined,undefined}}], The diameter_avp was just constructed from the AVP name and decoded result, without regard for which result code AVP contained the value. Fix by extracting the AVP from the incoming message.
2015-05-03Fix broken pre-17.4 appupAnders Svensson
Upgrade instructions have been added for each 17.X release without adjusting the instructions for preceeding releases: the instructions have only been sufficient to upgrading one release at a time: 17.0 to 17.1, 17.1 to 17.2, etc. Conficting load order requirements make smooth upgrade from an arbitrarily old release impossible. In this case, 17.3 looks to be as far back as we can go, so require restart from 17.[0-2] or older. Update the app suite to deal with binary regexps in appup, and to match version numbers harder.
2015-05-03Update appup for 17.5.3Anders Svensson
Required load order by ticket. - OTP-12642, extra bit in diameter_avp.data - OTP-12654, Result-Code/Experimental-Result confusion - OTP-12701, counting error with unknown Application Id none
2015-05-03Add counters testcase to 3xxx suiteAnders Svensson
To start checking that the counters are counting what's expected. The parent commit fixes a case in which they weren't.
2015-05-03Fix counting error with unknown application idAnders Svensson
Statistics could be accumulated on a key like {{23,275,0}, recv} even though 23 was not the application id of the dictionary in question. Missed in commits df19c272 and 7816ab2f.
2015-05-03Add missing doc wordingAnders Svensson
2015-05-03vsn -> 1.9.1Anders Svensson
2015-04-03Remove extra avp bit from diameter_avp decodeAnders Svensson
In the case of a faulty AVP Length (pointing past the end of a message or not spanning the header), an extra bit is prepended to data bytes in diameter_avp:collect_avps/1 in order to force a 5014 decode error. The bit is supposed to be removed as part of the decode in diameter_gen.hrl but this didn't happen in case of an AVP that unknown to the dictionary in question.
2015-03-31Prepare releaseErlang/OTP
2015-03-27Remove potentially large error reason in call to diameter_lib:log/4Anders Svensson
The function is intended to be traced on, to see abnormalities (mostly) without producing excessive output. In the case of decode failure, the error reason can be things like {badmatch, HugeBinary}. Missed in commit 0058430.
2015-03-27Limit FQDN in DiameterURI to 255 octetsAnders Svensson
As for the port number in the parent commit, a FQDN can't be arbitrarily long, at most 255 octets. Make decode fail if it's more.