<?xml version="1.0" encoding="latin1" ?>
<!DOCTYPE chapter SYSTEM "chapter.dtd">

<chapter>
  <header>
    <copyright>
      <year>2002</year><year>2009</year>
      <holder>Ericsson AB. All Rights Reserved.</holder>
    </copyright>
    <legalnotice>
      The contents of this file are subject to the Erlang Public License,
      Version 1.1, (the "License"); you may not use this file except in
      compliance with the License. You should have received a copy of the
      Erlang Public License along with this software. If not, it can be
      retrieved online at http://www.erlang.org/.
    
      Software distributed under the License is distributed on an "AS IS"
      basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
      the License for the specific language governing rights and limitations
      under the License.
    
    </legalnotice>

    <title>Why Test</title>
    <prepared>Siri Hansen</prepared>
    <docno></docno>
    <date></date>
    <rev></rev>
  </header>

  <section>
    <title>Goals</title>
    <p>It is not possible to prove that a program is correct by
      testing. On the contrary, it has been formally proven that
      testing cannot, in general, prove a program correct. Theoretical
      program proofs or plain examination of code may be viable options
      for those who wish to certify that a program is correct. The test
      server, as it is based on testing, cannot be used for
      certification. Its intended use is instead to (cost-effectively)
      <em>find bugs</em>. A successful test suite is one that reveals a
      bug. If a test suite results in Ok, then we know very little that
      we didn't know before.
      </p>
  </section>

  <section>
    <title>What to test?</title>
    <p>There are many kinds of test suites. Some concentrate on
      calling every function in the interface to some module or
      server. Others do the same, but use all kinds of illegal
      parameters and verify that the server stays alive and rejects
      the requests with reasonable error codes. Some test suites
      simulate an application (typically consisting of a few modules of
      an application), some try to do tricky requests in general, and
      some even test internal functions.
      </p>
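    <p>As a minimal sketch of a test that uses illegal parameters, the
      module below defines a small server and a test case that sends
      it a value it must reject. The module, the server and all
      function names are hypothetical and only serve as illustration;
      a real test case would call the interface of the server under
      test.
      </p>
    <code type="none"><![CDATA[
-module(illegal_args_example).
-export([start/0, add/2, illegal_argument/0]).

%% Hypothetical server: keeps a running sum and rejects non-integers.
start() ->
    spawn(fun() -> loop(0) end).

loop(Sum) ->
    receive
        {add, From, N} when is_integer(N) ->
            From ! {self(), {ok, Sum + N}},
            loop(Sum + N);
        {add, From, _Other} ->
            From ! {self(), {error, badarg}},
            loop(Sum)
    end.

%% Client function: sends a request and waits for the reply.
add(Pid, N) ->
    Pid ! {add, self(), N},
    receive
        {Pid, Reply} -> Reply
    after 1000 ->
        {error, timeout}
    end.

%% Test case: an illegal parameter must be rejected with an error
%% code, and the server must still be alive and usable afterwards.
illegal_argument() ->
    Pid = start(),
    {error, badarg} = add(Pid, not_a_number),
    true = is_process_alive(Pid),
    {ok, 1} = add(Pid, 1),
    exit(Pid, kill),
    ok.
]]></code>
    <p>The test case checks three things in turn: the error code, that
      the server process survived the bad request, and that a legal
      request still works afterwards.
      </p>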
    <p>Another interesting category of test suites is the one that
      checks that fixed bugs don't recur. When a bug fix is introduced,
      a test case that checks for that specific bug should be written
      and submitted to the affected test suite(s).
      </p>
    <p>Aim for finding bugs. Write whichever test has the highest
      probability of finding a bug, now or in the future. Concentrate
      more on the critical parts. Bugs in critical subsystems are a lot
      more expensive than others.
      </p>
    <p>Aim for functionality testing rather than testing
      implementation details. Implementation details change quite
      often, and the test suites should be long-lived. Implementation
      details also often differ between platforms and versions. If
      implementation details have to be tested, try to factor them out
      into separate test cases. Later on these test cases may be
      rewritten, or just skipped.
      </p>
    <p>Also, aim for testing everything once, no less, no more. It is
      inefficient to have every test case fail just because one
      function in the interface changed.
      </p>
  </section>

  <section>
    <title>How much to test</title>
    <p>There is a Unix shell script that counts the number of
      non-commented words (lines and characters too) of source code in
      each application's test directory and divides it by the number of
      such source words in the src directory. The result is a measure
      of how much test code there is.
      </p>
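    <p>The shell script itself is not reproduced here. As a rough
      sketch of the same measure, written in Erlang rather than shell,
      the hypothetical module below counts non-commented words and
      divides the count for the test files by the count for the source
      files. The module and function names are assumptions made for
      this example, and the comment stripping is deliberately naive: a
      <c>%</c> inside a string or character literal is also treated as
      the start of a comment.
      </p>
    <code type="none"><![CDATA[
-module(test_ratio_example).
-export([ratio/2, count_words/1]).

%% Number of whitespace-separated words in Text, ignoring everything
%% from the first '%' on each line (naive comment detection).
count_words(Text) ->
    Lines = string:tokens(Text, "\n"),
    lists:sum([length(string:tokens(strip_comment(L), " \t\r"))
               || L <- Lines]).

strip_comment(Line) ->
    lists:takewhile(fun(C) -> C =/= $% end, Line).

%% Ratio of test words to source words for two lists of file names.
%% Fails with badarith if the source files contain no words at all.
ratio(TestFiles, SrcFiles) ->
    words_in(TestFiles) / words_in(SrcFiles).

words_in(Files) ->
    lists:sum([count_file(F) || F <- Files]).

count_file(File) ->
    {ok, Bin} = file:read_file(File),
    count_words(binary_to_list(Bin)).
]]></code>
    <p>Assuming an application directory with <c>test</c> and
      <c>src</c> subdirectories, a call could look like
      <c>test_ratio_example:ratio(filelib:wildcard("test/*.erl"), filelib:wildcard("src/*.erl"))</c>.
      </p>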
    <p>There has been much debate over how much test code, compared to
      production code, should be written in a project. More test code
      finds more bugs, but test code needs to be maintained just like
      the production code, and it's expensive to write it in the first
      place. In several articles from relatively mature software
      organizations that I have read, the amount of test code has been
      about the same as the amount of production code.
      </p>
    <p>In OTP, at the time of writing, few applications come even
      close to this; some have no test code at all.
      </p>

    <section>
      <title>Full coverage</title>
      <p>It is possible to cover-compile the modules being tested
        before running the test suites. Doing so shows which branches
        of the code are tested by the test suite, and which are
        not. Many use this as a measure of a good test suite: when
        every single line of source code is covered once by the test
        suite, the test suite is considered finished.
        </p>
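      <p>How the test server itself is told to cover-compile is not
        shown here. As an illustration of the idea only, the sketch
        below uses the standalone <c>cover</c> module from the OTP
        tools application to list the lines of a module that a given
        test never executes. The module name
        <c>cover_example</c> and the function
        <c>uncovered_lines/2</c> are hypothetical.
        </p>
      <code type="none"><![CDATA[
-module(cover_example).
-export([uncovered_lines/2]).

%% Cover-compiles Module (assumes Module.erl is in the current
%% directory), runs the fun Test, and returns the numbers of the
%% executable lines that were never executed.
uncovered_lines(Module, Test) ->
    {ok, Module} = cover:compile_module(Module),
    Test(),
    {ok, Lines} = cover:analyse(Module, coverage, line),
    cover:stop(),
    [Line || {{_Mod, Line}, {0, _NotCovered}} <- Lines].
]]></code>
      <p>An empty result means that every executable line was run at
        least once, which is the "finished" criterion described above.
        </p>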
      <p>A coverage of 100% still proves nothing, though. It doesn't
        mean that the code is error free or that everything is
        tested. For instance, if a function contains a division, it has
        to be executed at least twice: once with parameters that cause
        division by zero, and once with other parameters.
        </p>
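      <p>The division case can be sketched as follows. The module and
        its functions are hypothetical. The function under test is a
        single line, so the first test case alone already gives full
        line coverage, yet only the second one exercises the division
        by zero.
        </p>
      <code type="none"><![CDATA[
-module(average_example).
-export([average/2, test_normal/0, test_zero/0]).

%% Hypothetical function under test: one executable line, so any
%% single call covers it completely.
average(Sum, Count) ->
    Sum / Count.

%% Covers every line of average/2, but says nothing about Count = 0.
test_normal() ->
    2.5 = average(5, 2),
    ok.

%% Same line, different parameters: division by zero raises badarith.
test_zero() ->
    {'EXIT', {badarith, _}} = (catch average(5, 0)),
    ok.
]]></code>
      <p>Whether a <c>badarith</c> exit is the desired behaviour is a
        design decision; the point is that a coverage figure cannot
        tell whether this case was ever tried.
        </p>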
      <p>A high degree of coverage is good, of course; it means that no
        major parts of the code have been left untested. Whether it is
        cost-effective is another question. You're only likely to find
        50% more bugs when going from 67% to 100% coverage, but the work
        (cost) is maybe 200% as large, or more, because reaching all of
        those obscure branches is usually complicated.
        </p>
      <p>Again, the reason for testing with the test server is to find
        bugs, not to create certificates of valid code. Maximizing the
        number of found bugs per hour probably means not going for 100%
        coverage. For some modules the optimum may be 70%, for others
        maybe 250%. 100% shouldn't be a goal in itself.</p>
    </section>

    <section>
      <title>User interface testing</title>
      <p>It is very difficult to do sensible testing of user
        interfaces, especially graphical ones. The test server has
        some support for capturing the text I/O that goes to the user,
        but none for graphics. There are several tools on the market
        that help with this.</p>
    </section>
  </section>
</chapter>