Age | Commit message | Author |
|
|
|
We have hosts (mostly *very* slooow VMs) where almost anything
can time out. Since we are basically testing communication,
we must therefore check for system events at every failure.
Grrr!
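A minimal sketch of the kind of check meant here (the helper name is an
assumption, not the actual test suite API): on a failure, first ask the
system monitor whether anything suspicious has been reported, and skip
rather than fail if so.

    %% Hypothetical helper; global_sys_monitor_events/0 stands in for
    %% whatever the suite uses to fetch the collected system events.
    maybe_fail(Reason) ->
        case global_sys_monitor_events() of
            []     -> exit({failed, Reason});
            Events -> exit({skip, {system_events, Events}})
        end.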
|
|
Try to detect if the test case function skips, and re-exit
so that "eventually" the test case runner detects it and in
turn re-exits... #@!!£##!...
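Roughly the intent, as a sketch (names made up): the process running
the test case fun catches a skip and exits with it, so the waiting
runner can detect it and re-exit in turn.

    run_tc(Fun) ->
        try
            Fun()
        catch
            throw:{skip, _} = Skip -> exit(Skip);
            exit:{skip, _}  = Skip -> exit(Skip)
        end.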
|
|
In one test run the node died after a series of "busy dist port"
events, which caused the process that ran the test case to crash
with reason = noconnection. And since this was a "permanent" node,
the remaining test cases then crashed as well.
Again, this has nothing to do with snmp or its test suites.
|
|
A very simple way to calculate how long the request timeout
shall be. Previously it was simply a constant (3.5 seconds).
We assume that the number of schedulers matters...
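A sketch of the idea (the exact scale factors are assumptions): scale
the old 3.5 second constant by the number of online schedulers, on the
theory that fewer schedulers means a slower or more loaded machine.

    request_timeout() ->
        Base = 3500,   % ms, the old constant
        case erlang:system_info(schedulers_online) of
            N when N >= 8 -> Base;
            N when N >= 4 -> 2 * Base;
            _             -> 3 * Base
        end.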
|
|
We may need to throw the skip when we detect system events.
|
|
|
|
When we fail to start a (slave) node with reason timeout,
it's possible that it actually succeeded, which will cause
the following test case(s) init to fail (with 'already started',
since they also try to start this node).
But since we don't know if we get this for that reason
or because the previous test case failed to clean up after
itself, we cannot assume that the state of the node is ok
(and therefore use it). So, we attempt to stop it, and then
try to start it again (*one* time).
This has been observed on one (slooow) VM host.
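Sketched below; start_node/1 and stop_node/1 stand in for whatever
wrappers the suite actually uses. If the first start attempt times out,
stop the possibly half-started node and then try exactly one more time.

    start_node_retry(Name) ->
        case start_node(Name) of
            {ok, Node} ->
                {ok, Node};
            {error, timeout} ->
                _ = stop_node(Name),   % it may actually be running
                start_node(Name)       % second (and last) attempt
        end.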
|
|
On Solaris 5.11 (actually OpenIndiana Hipster) the arguments
for (some of) the netsnmp programs differ from what is used
in our test suite.
Added actual checks for *one of* the failures, which is
indicated by the printout "illegal option". In those cases
the test case is instead skipped.
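The check could look something like this (the command string and return
values are just for illustration): run the netsnmp tool, look for the
"illegal option" complaint in its output, and skip instead of fail if
it is there.

    check_netsnmp_args(Cmd) ->
        Output = os:cmd(Cmd),
        case string:find(Output, "illegal option") of
            nomatch -> ok;
            _       -> {skip, {incompatible_netsnmp_args, Cmd}}
        end.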
|
|
When a manager request timeout occurs (the manager sends
a request to the agent but does not receive a reply in time),
we now check if there have been any system monitor events,
and if so we skip the test case instead.
|
|
We have experienced intermittent failures of random test cases
on slow VMs. In order to get more info about the system, we
use system monitor to collect info. We have one global monitor
system, and one local monitor on each node. Each local monitor
reports to the global one, which in turn prints a message.
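Roughly how a local (per node) monitor can be wired up; the chosen
limits and the message format are assumptions, not the actual suite code.

    start_local_monitor(Global) ->
        Local = spawn(fun() -> local_loop(Global) end),
        erlang:system_monitor(Local, [busy_port,
                                      busy_dist_port,
                                      {long_gc, 1000},
                                      {long_schedule, 1000}]),
        Local.

    local_loop(Global) ->
        receive
            {monitor, Pid, Event, Info} ->
                %% forward to the global monitor, which does the printing
                Global ! {sys_event, node(), Pid, Event, Info},
                local_loop(Global)
        end.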
|
|
Improve the checks for if/when we shall run the IPv6
test cases.
OTP-15764
|
|
Add an "os test" to the IPv6 group init.
On "old" version of darwin (9.8.0) its
simply to messy to figure out our IPv6
address, so its better to simply skip the
IPv6 tests on those machines.
OTP-15764
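A sketch of such an "os test" (the version cutoff is an assumption):

    ipv6_os_check() ->
        case {os:type(), os:version()} of
            {{unix, darwin}, {Major, _, _}} when Major < 10 ->
                {skip, "Old darwin: IPv6 address detection too messy"};
            _ ->
                ok
        end.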
|
|
Replaced the monitor (of the tc runner process) with
a link. The point is that if the test case "stalls",
the ts (ct) framework shall kill it (together with the
test case process).
OTP-15764
|
|
The function used by "all" the agent test cases to
actually run the operations has been improved.
There was previously very little monitoring of the
result. Have added some (minor) checks, both before
trying to run the test case, and "during".
Such as: is the node we attempt to use actually alive?
Then, when we spawn the test case runner process on
the (remote) node, make it report back before trying
to run the actual test case (so we know that the spawn
worked).
Also added a monitor of the process, so that we will
detect fatal errors.
OTP-15764
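A condensed sketch of the scheme (not the actual suite code): check that
the node is alive, spawn the runner there, wait for its report before
assuming the spawn worked, and keep a monitor so fatal errors are detected.

    run_on_node(Node, Fun) ->
        pong = net_adm:ping(Node),          % is the node actually alive?
        Self = self(),
        Runner = spawn(Node,
                       fun() ->
                           Self ! {started, self()},
                           Self ! {result, self(), Fun()}
                       end),
        MRef = erlang:monitor(process, Runner),
        receive
            {started, Runner} -> ok;
            {'DOWN', MRef, process, Runner, Reason} ->
                exit({spawn_failed, Reason})
        after 5000 ->
            exit(spawn_timeout)
        end,
        receive
            {result, Runner, Res} ->
                erlang:demonitor(MRef, [flush]),
                Res;
            {'DOWN', MRef, process, Runner, Reason2} ->
                exit({runner_died, Reason2})
        end.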
|
|
OTP-15764
|
|
Use of the deprecated module random has been replaced
by the module rand.
OTP-15331
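The typical mechanical change, for reference; rand seeds itself
automatically on first use, so the explicit seeding the old random
module needed can simply be dropped.

    random_delay() ->
        timer:sleep(rand:uniform(1000)).   % was: random:uniform(1000)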
|
|
|
|
|
|
For the notify_started02 test case we (now) try to estimate how
long we should wait for completion (based on the time of the
first iteration). On slow machines, it takes longer to start and
stop the manager, so adjust the total timeout accordingly.
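Something along these lines (the headroom factor is an assumption):
time the first iteration and derive the total timeout from it.

    estimate_timeout(NumIterations, FirstIterationFun) ->
        {Micros, _} = timer:tc(FirstIterationFun),
        FirstMs = Micros div 1000,
        %% generous headroom for the remaining iterations on slow machines
        NumIterations * FirstMs * 2.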
|
|
1) A test case failed because the times retrieved by
get(snmpEngineTime) reported too large time diffs
(this is basically a sanity check).
Added some more info about time(s) to see if there
are some "gaps" somewhere.
The problem occurred on a slow Virtual Machine.
2) A previous (failing) test case failed to clean up after
itself (see above), which caused later test cases to fail.
Specifically, the app top supervisor was not terminated,
which caused the start agent function to fail (basically
already_started).
|
|
Added common (formatted) timestamp function(s). Made use of
these in the verbosity module (for debug printouts) and in
the test suite(s).
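One possible shape of such a timestamp helper (not necessarily the one
that was added):

    format_timestamp() ->
        {_, _, Micro} = TS = erlang:timestamp(),
        {{Y, Mo, D}, {H, Mi, S}} = calendar:now_to_local_time(TS),
        io_lib:format("~4..0w-~2..0w-~2..0w ~2..0w:~2..0w:~2..0w.~3..0w",
                      [Y, Mo, D, H, Mi, S, Micro div 1000]).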
I also *think* I found the cause for some of the test case
failures (timeouts).
For v3 (agent) test cases the test manager makes use of parts
of the agent code: snmp_framework_mib and snmp_user_based_sm_mib.
And since they store their data in snmpa_local_db, that also
needs to be running.
And this was the problem (I think). On some (slow) machines,
the snmpa_local_db process from the *previous* test case
might still be running when we tried to start it. That meant
that no new snmpa_local_db was started. Instead the old one,
still running but terminating, was retained. For a while. Until
it actually finally stopped. So the next operation towards
snmpa_local_db, insert, simply hung until the process
terminated.
This, in combination with the fact that the packet server process,
which was started using proc_lib, previously called init_ack
before its init was actually done, meant it could start and then
at a much later time hang because some operation timed out
(the packet server was hanging).
Yuckety yuck-yuck.
|
|
Improved test printouts (more with timestamps), and also
fixed the printout of the (fake) local-db start result.
Also updated the copyright end-dates.
|
|
Handle when the agent test manager starts the fake local-db
and that process is already running (for some reason).
|
|
The agent test manager had a bug during start that could
potentially cause deadlock, but at least could cause test
cases to fail because of timeouts. The test manager
(actually the "packet server") used proc_lib to start the
process but it called the init_ack function before the init
was actually complete. This was only a problem for v3 cases
(where it did a bunch of further inits, including starting
the local-db process).
Also did debug/verbosity tweaking. Added a bunch of debug
(verbosity) printouts for the agent test manager "packet server"
during v3 init. Also made sure we could distinguish the
"normal" local-db from the one used by the test manager
(this is done by using a new short-name).
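For reference, the correct shape when using proc_lib (simplified;
do_the_init/0 and loop/2 are placeholders): the starter is acknowledged
only after the initialization is actually complete.

    start_link() ->
        proc_lib:start_link(?MODULE, init, [self()]).

    init(Parent) ->
        State = do_the_init(),                    % e.g. start the local-db
        proc_lib:init_ack(Parent, {ok, self()}),  % only ack when truly ready
        loop(Parent, State).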
|
|
The EXPECT printouts have been improved, partly by
including a timestamp.
|
The test manager used in the agent tests has been tweaked
in order to increase the readability (both of the code
and the output).
|
|
Improved the test manager printouts to make it easier to
diagnose problems...
|
|
|
|
|
|
Some simple tweaking to find the "proper" address
of localhost (to make the test suite work in the office
environment).
|
|
Try to handle "failures" during init_per_testcase such that
if they are either a throw or an exit, they are transformed
into a skip.
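A sketch of the wrapping (do_init_per_testcase/2 is a placeholder for
the suite's real init code); common_test treats a {skip, Reason} return
from init_per_testcase as a skip of the test case.

    init_per_testcase(Case, Config) ->
        try
            do_init_per_testcase(Case, Config)
        catch
            throw:Reason -> {skip, {caught_throw, Reason}};
            exit:Reason  -> {skip, {caught_exit, Reason}}
        end.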
|
|
On some Linux platforms "they" add a 127.0.1.1 entry
in the hosts file, which can cause problems for
some (manager) test cases. So, just to be on
the safe side, make sure we bind to the configured
address.
Note that this has nothing to do with the current
issue.
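The principle, illustrated with plain gen_udp (the actual change goes
through the snmp transport/net-if configuration): bind the socket to
the configured address instead of the wildcard.

    open_bound(Port, Addr) ->
        gen_udp:open(Port, [binary, {ip, Addr}, {reuseaddr, true}]).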
|
|
Added the Extra (net-if data) argument to all the get-
mechanism callback functions.
OTP-15691
|
|
Added the snmpa_get module as the default get-mechanism
for the agent.
This has been done by simply moving the do_get, do_get_next
and do_get_bulk functions from the snmpa_agent module.
Some functions were also moved into the lib module (with
the idea of being more generally useful).
OTP-15691
|
|
Removed the last vestiges of the otp_mibs app from the
compiler test suite (was still trying to use MIBs from
otp_mibs).
|
|
OTP-14984
|
|
|
|
|
|
I did not find any legitimate use of "can not"; however, I
skipped changing e.g. RFCs archived in the source tree.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
* ingela/snmp/ipv6-tests:
snmp: Use ipv6 common test configuration check
|
|
The test for ipv6 could return false positives, which resulted in failing
test cases due to lack of full ipv6 support. It would be nice to have a
working run-time check, but this will do for now.
|