<?xml version="1.0" encoding="latin1" ?>
<!DOCTYPE chapter SYSTEM "chapter.dtd">

<chapter>
  <header>
    <copyright>
      <year>2003</year><year>2009</year>
      <holder>Ericsson AB. All Rights Reserved.</holder>
    </copyright>
    <legalnotice>
      The contents of this file are subject to the Erlang Public License,
      Version 1.1, (the "License"); you may not use this file except in
      compliance with the License. You should have received a copy of the
      Erlang Public License along with this software. If not, it can be
      retrieved online at http://www.erlang.org/.
    
      Software distributed under the License is distributed on an "AS IS"
      basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
      the License for the specific language governing rights and limitations
      under the License.
    
    </legalnotice>

    <title>The SSL Protocol</title>
    <prepared>Peter H&ouml;gfeldt</prepared>
    <docno></docno>
    <date>2003-04-28</date>
    <rev>PA2</rev>
    <file>ssl_protocol.xml</file>
  </header>
  <p>Here we provide a short introduction to the SSL protocol. We only
    consider those parts of the protocol that are important from a
    programming point of view.
    </p>
  <p>For a very good general introduction to SSL and TLS see the book
    <cite id="rescorla"></cite>.
    </p>
  <p><em>Outline:</em></p>
  <list type="bulleted">
    <item>Two types of connections</item>
    <item>Connection: handshake, data transfer, and shutdown</item>
    <item>SSL/TLS protocol</item>
    <item>Server must have certificate</item>
    <item>What the server sends to the client</item>
    <item>Client may verify the server</item>
    <item>Server may ask client for certificate</item>
    <item>What the client sends to the server</item>
    <item>Server may then verify the client</item>
    <item>Verification</item>
    <item>Certificate chains</item>
    <item>Root certificates</item>
    <item>Public keys</item>
    <item>Key agreement</item>
    <item>Purpose of certificate</item>
    <item>References</item>
  </list>

  <section>
    <title>SSL Connections</title>
    <p>The SSL protocol is implemented on top of the TCP/IP protocol.
      From an endpoint's point of view it also has the same type of
      connections as that protocol, almost always created by calls to
      the socket interface functions <em>listen</em>, <em>accept</em> and
      <em>connect</em>. The endpoints are <em>servers</em> and
      <em>clients</em>.
      </p>
    <p>A <em>server</em> <em>listen</em>s for connections on a
      specific address and port. This is done once. The server then
      <em>accept</em>s each connection on that same address and
      port. This is typically done indefinitely many times.
      </p>
    <p>A <em>client</em> connects to a server on a specific address 
      and port. For each purpose this is done once.
      </p>
    <p>For a plain TCP/IP connection the establishment of a connection
      (through an accept or a connect) is followed by data transfer between
      the client and server, finally ended by a connection close. 
      </p>
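    <p>As an illustration only, the sequence of listen, accept, connect,
      data transfer and close can be sketched in Python over the loopback
      interface. The choice of Python, the helper name <c>serve_once</c>
      and the upper-casing reply are assumptions made for this sketch; none
      of them is part of the Erlang ssl application.
      </p>
<code type="none"><![CDATA[
```python
import socket
import threading

def serve_once(listener):
    """Accept one connection, reply with the received bytes in upper case."""
    conn, _addr = listener.accept()            # accept: once per connection
    with conn:                                 # leaving the block closes it
        conn.sendall(conn.recv(1024).upper())  # data transfer

# listen: done once; port 0 asks the operating system for any free port
listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]
server = threading.Thread(target=serve_once, args=(listener,))
server.start()

# connect: done once by the client for this exchange
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

server.join()
listener.close()
```
]]></code>
    <p>A real server would call accept in a loop, once for each incoming
      connection, while keeping the same listening socket.
      </p>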
    <p>An SSL connection also consists of data transfer and connection
      close. However, the data transfer contains encrypted data, and
      in order to establish the encryption parameters, the data
      transfer is preceded by an SSL <em>handshake</em>. In this
      handshake the server plays a dominant role, and the main
      instrument used in achieving a valid SSL connection is the
      server's <em>certificate</em>. We consider certificates in the
      next section, and the SSL handshake in a subsequent section.</p>
  </section>

  <section>
    <title>Certificates</title>
    <p>A certificate is similar to a driver's license, or a
      passport. The holder of the certificate is called the
      <em>subject</em>.  First of all the certificate identifies the
      subject in terms of the name of the subject, its postal address,
      country name, company name (if applicable), etc.
      </p>
    <p>Although a driver's license is always issued by a well-known and
      distinct authority, a certificate may have an <em>issuer</em>
      that is not so well-known. Therefore a certificate also always
      contains information on the issuer of the certificate. That
      information is of the same type as the information on the
      subject. The issuer of a certificate also signs the certificate
      with a <em>digital signature</em> (the signature is an inherent
      part of the certificate), which allows others to verify that the
      issuer really is the issuer of the certificate.
      </p>
    <p>Now that a certificate can be checked by verifying the
      signature of the issuer, the question is how to trust the
      issuer. The answer to this question is to require that there is
      a certificate for the issuer as well. That issuer has in turn an
      issuer, which must also have a certificate, and so on. This
      <em>certificate chain</em> has to have an end, which then must
      be a certificate that is trusted by other means. We shall cover
      this problem of <em>authentication</em> in a subsequent
      section.
      </p>
  </section>

  <section>
    <title>Encryption Algorithms</title>
    <p>An encryption algorithm is a mathematical algorithm for
      encryption and decryption of messages (arrays of bytes,
      say). The algorithm as such is always required to be publicly
      known, otherwise its strength cannot be evaluated, and hence it
      cannot be used reliably. The secrecy of an encrypted message is
      not achieved by the secrecy of the algorithm used, but by the
      secrecy of the <em>keys</em> used as input to the encryption and
      decryption algorithms. For an account of cryptography in general
      see <cite id="schneier"></cite>.
      </p>
    <p>There are two classes of encryption algorithms: <em>symmetric key</em> algorithms and <em>public key</em> algorithms.  Both
      types of algorithms are used in the SSL protocol.
      </p>
    <p>In the sequel we assume holders of keys keep them secret (except
      public keys) and that they in that sense are trusted. How a 
      holder of a secret key is proved to be the one it claims to be
      is a question of <em>authentication</em>, which, in the context
      of the SSL protocol, is described in a section further below.
      </p>

    <section>
      <title>Symmetric Key Algorithms</title>
      <p>A <em>symmetric key</em> algorithm has one key only. The key
        is used for both encryption and decryption. Obviously the key
        of a symmetric key algorithm must always be kept secret by the
        users of the key. DES is an example of a symmetric key
        algorithm.
        </p>
      <p>Symmetric key algorithms are fast compared to public key
        algorithms. They are therefore typically used for encrypting
        bulk data.
        </p>
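      <p>The defining property, one secret key used for both directions,
        can be shown with a deliberately simple toy cipher in Python. The
        cipher below (a SHA-256 based keystream combined with XOR) and the
        names <c>keystream</c> and <c>xor_crypt</c> are assumptions made for
        this sketch. It is neither DES nor a cipher used by SSL, and it
        must not be used to protect real data.
        </p>
<code type="none"><![CDATA[
```python
import hashlib

def keystream(key, length):
    """Derive `length` pseudo-random bytes from the secret key (toy only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_crypt(key, data):
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

key = b"shared secret key"          # known to both parties, and nobody else
ciphertext = xor_crypt(key, b"bulk data")
plaintext = xor_crypt(key, ciphertext)   # the same key decrypts
```
]]></code>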
    </section>

    <section>
      <title>Public Key Algorithms</title>
      <p>A <em>public key</em> algorithm has two keys. Either of the two
        keys can be used for encryption. A message encrypted with one
        of the keys can only be decrypted with the other key. One of
        the keys is public (known to the world), while the other key
        is private (i.e. kept secret by the owner of the two keys).
        </p>
      <p>RSA is an example of a public key algorithm.
        </p>
      <p>Public key algorithms are slow compared to symmetric key
        algorithms, and they are therefore seldom used for bulk data
        encryption. Instead they are used only in cases where the
        fact that one key is public and the other is private provides
        features that cannot be provided by symmetric algorithms.
        </p>
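      <p>A minimal sketch of the two-key property, using textbook RSA in
        Python, may help here. The primes, the exponent and the variable
        names are assumptions chosen for illustration. The key is far too
        small to be secure, and real RSA adds padding that is omitted in
        this sketch.
        </p>
<code type="none"><![CDATA[
```python
# Toy key: two known Mersenne primes and the common public exponent 65537.
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q                            # modulus, part of both keys
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)

message = int.from_bytes(b"hi", "big")   # a message encoded as a number below N

# Encrypted with the public key, decrypted with the private key.
to_owner = pow(message, E, N)
recovered_by_owner = pow(to_owner, D, N)

# Encrypted with the private key, decrypted with the public key.
from_owner = pow(message, D, N)
recovered_by_anyone = pow(from_owner, E, N)
```
]]></code>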
    </section>

    <section>
      <title>Digital Signature Algorithms</title>
      <p>An interesting feature of a public key algorithm is that its
        public and private keys can both be used for encryption.
        Anyone can use the public key to encrypt a message, and send
        that message to the owner of the private key, and be sure
        that only the holder of the private key can decrypt the
        message.
        </p>
      <p>On the other hand, the owner of the private key can encrypt a
        message with the private key, thus obtaining an encrypted
        message that can be decrypted by anyone having the public key.
        </p>
      <p>The last approach can be used as a digital signature
        algorithm. The holder of the private key signs an array of
        bytes by applying a specified well-known <em>message digest
        algorithm</em> to compute a hash of the array, encrypts the
        hash value with its private key, and then presents the original
        array, the name of the digest algorithm, and the encryption of
        the hash value as a <em>signed array of bytes</em>.
        </p>
      <p>Now anyone having the public key can decrypt the encrypted
        hash value with that key, compute the hash of the original array
        with the specified digest algorithm, and check that the two hash
        values are equal in order to verify that the original array was
        indeed signed by the holder of the private key.
        </p>
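      <p>The sign and verify steps just described can be sketched as
        follows in Python. The toy RSA key, the reduction of the SHA1
        hash modulo <c>N</c>, and the function names <c>digest</c>,
        <c>sign</c> and <c>verify</c> are assumptions made for this
        illustration. Real signature schemes use much larger keys and a
        padding scheme.
        </p>
<code type="none"><![CDATA[
```python
import hashlib

# Toy key pair (far too small for real use).
P, Q = 2**61 - 1, 2**31 - 1
N, E = P * Q, 65537                  # public key
D = pow(E, -1, (P - 1) * (Q - 1))    # private key (Python 3.8+)

def digest(data):
    """Hash the byte array and map the hash to a number below N."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % N

def sign(data):
    """Holder of the private key: encrypt the hash with the private key."""
    return pow(digest(data), D, N)

def verify(data, signature):
    """Anyone: decrypt with the public key and compare with a fresh hash."""
    return pow(signature, E, N) == digest(data)

document = b"an array of bytes"
signature = sign(document)
genuine_ok = verify(document, signature)
tampered_ok = verify(b"another array of bytes", signature)
```
]]></code>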
      <p>What we have accounted for so far is by no means all that can
        be said about digital signatures (see <cite id="schneier"></cite> for
        further details).
        </p>
    </section>

    <section>
      <title>Message Digest Algorithms</title>
      <p>A message digest algorithm is a hash function that accepts
        an array of bytes of arbitrary but finite length as input, and
        outputs an array of bytes of fixed length. Such an algorithm
        is also required to be very hard to invert.
        </p>
      <p>MD5 (16 bytes output) and SHA1 (20 bytes output) are examples
        of message digest algorithms.
        </p>
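      <p>The fixed output lengths can be checked with Python's standard
        <c>hashlib</c> module, used here purely as an illustration. The
        input values are arbitrary.
        </p>
<code type="none"><![CDATA[
```python
import hashlib

short_input = b"abc"
long_input = b"abc" * 100000

# The output length depends on the algorithm only, never on the input.
md5_lengths = {len(hashlib.md5(data).digest())
               for data in (short_input, long_input)}
sha1_lengths = {len(hashlib.sha1(data).digest())
                for data in (short_input, long_input)}

sha1_of_abc = hashlib.sha1(short_input).hexdigest()
```
]]></code>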
    </section>
  </section>

  <section>
    <title>SSL Handshake</title>
    <p>The main purpose of the handshake performed before an SSL
      connection is established is to negotiate the encryption
      algorithm and key to be used for the bulk data transfer between
      the client and the server. We are writing <em>the</em> key,
      since the algorithm chosen for bulk encryption is one of the
      symmetric algorithms.
      </p>
    <p>There is thus only one key to agree upon, and obviously that
      key has to be kept secret between the client and the server. To
      achieve that, the handshake has to be encrypted as well.
      </p>
    <p>The SSL protocol requires that the server always sends its
      certificate to the client at the beginning of the handshake. The
      client then retrieves the server's public key from the
      certificate, which means that the client can use the server's
      public key to encrypt messages to the server, and the server can
      decrypt those messages with its private key. Similarly, the
      server can encrypt messages to the client with its private key,
      and the client can decrypt messages with the server's public
      key. It is thus with the server's public and private keys
      that messages in the handshake are encrypted and decrypted, and
      hence the key agreed upon for symmetric encryption of bulk data
      can be kept secret (there are more things to consider to really
      keep it secret, see <cite id="rescorla"></cite>).
      </p>
    <p>The above indicates that the server does not care who is
      connecting, and that only the client has the possibility to
      properly identify the server based on the server's certificate.
      That is indeed true in the minimal use of the protocol. It is,
      however, possible to instruct the server to request the
      certificate of the client in order to have a means to identify
      the client, although this is by no means required to establish
      an SSL connection.
      </p>
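    <p>The difference between the minimal use and a server that requests
      client certificates shows up as a single setting in many SSL
      libraries. The sketch below uses Python's standard <c>ssl</c> module
      as an example; it is not the interface of the Erlang ssl
      application, and the variable names are assumptions made for this
      illustration.
      </p>
<code type="none"><![CDATA[
```python
import ssl

# Client side: the default context verifies the server's certificate.
client_ctx = ssl.create_default_context()

# Server side, minimal use: the server does not ask who is connecting.
server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
minimal_mode = server_ctx.verify_mode      # ssl.CERT_NONE by default

# Server side, optional step: request and require a client certificate.
server_ctx.verify_mode = ssl.CERT_REQUIRED

# A real server must also load its own certificate and private key with
# server_ctx.load_cert_chain(...) before it can complete a handshake;
# that step is left out here because it needs files on disk.
```
]]></code>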
    <p>If a server requests the client certificate, it verifies, as a
      part of the protocol, that the client really holds the private
      key of the certificate by sending the client a string of bytes
      to encrypt with its private key, which the server then decrypts
      with the client's public key, the result of which is compared
      with the original string of bytes (a similar procedure is always
      performed by the client when it has received the server's
      certificate).
      </p>
    <p>The way clients and servers <em>authenticate</em> each other,
      i.e. prove that their respective peers are what they claim to
      be, is the topic of the next section.
      </p>
  </section>

  <section>
    <title>Authentication</title>
    <p>As we have already seen, the reception of a certificate from a
      peer is not enough to prove that the peer is authentic. More
      certificates are needed, and we have to consider how certificates
      are issued and on what grounds.
      </p>
    <p>Certificates are issued by <em>certification authorities</em>
      (<em>CA</em>s) only. They issue certificates both for other CAs
      and ordinary users (which are not CAs). 
      </p>
    <p>Certain CAs are <em>top CAs</em>, i.e. they do not have a
      certificate issued by another CA. Instead they issue their own
      certificate, where the subject and issuer parts of the
      certificate are identical (such a certificate is called a
      self-signed certificate). A top CA has to be well-known, and has
      to have a publicly available policy stating on what grounds it
      issues certificates.
      </p>
    <p>There are a handful of top CAs in the world. You can examine the
      certificates of several of them by clicking through the menus of
      your web browser. 
      </p>
    <p>A top CA typically issues certificates for other CAs, called
      <em>intermediate CAs</em>, but possibly also for ordinary users. Thus
      the certificates derivable from a top CA constitute a tree, where
      the leaves of the tree are ordinary user certificates. 
      </p>
    <p>A <em>certificate chain</em> is an ordered sequence of
      certificates, <c>C1, C2, ..., Cn</c>, say, where <c>C1</c> is a
      top CA certificate, and where <c>Cn</c> is an ordinary user
      certificate, and where the holder of <c>C1</c> is the issuer of
      <c>C2</c>, the holder of <c>C2</c> is the issuer of <c>C3</c>,
      ..., and the holder of <c>Cn-1</c> is the issuer of <c>Cn</c>,
      the ordinary user certificate. The holders of <c>C2, C3, ..., Cn-1</c> are then intermediate CAs.
      </p>
    <p>Now to verify that a certificate chain is unbroken we have to
      take the public key from each certificate <c>Ck</c>, and apply
      that key to decrypt the signature of certificate <c>Ck+1</c>,
      thus obtaining the message digest computed by the holder of the
      <c>Ck</c> certificate, compute the real message digest of the
      <c>Ck+1</c> certificate and compare the results. If they compare
      equal the link of the chain between <c>Ck</c> and <c>Ck+1</c> is
      considered to be unbroken. This is done for each link k = 1, 2,
      ..., n-1. If all links are found to be unbroken, the user
      certificate <c>Cn</c> is considered authenticated.
      </p>
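    <p>The link-by-link check can be sketched in Python as follows. The
      <c>Cert</c> record, the helper names, the subject names and the toy
      RSA keys are all assumptions made for this illustration. Real
      certificates are X.509 structures with many more fields, and real
      verification also checks validity periods, key usage and
      revocation.
      </p>
<code type="none"><![CDATA[
```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by all toy keys

def make_key(p, q):
    """Return (modulus, private exponent) for a textbook RSA key."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))

def digest(data, n):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

@dataclass
class Cert:
    subject: str
    issuer: str
    public_n: int        # public key of the subject
    signature: int = 0   # made with the private key of the issuer

    def signed_bytes(self):
        return f"{self.subject}|{self.issuer}|{self.public_n}".encode()

def issue(subject, subject_n, issuer, issuer_n, issuer_d):
    cert = Cert(subject, issuer, subject_n)
    cert.signature = pow(digest(cert.signed_bytes(), issuer_n), issuer_d, issuer_n)
    return cert

def link_ok(ck, ck_next):
    """Check the link between Ck and Ck+1 using the public key in Ck."""
    n = ck.public_n
    return (ck_next.issuer == ck.subject
            and pow(ck_next.signature, E, n) == digest(ck_next.signed_bytes(), n))

def chain_ok(chain):
    return all(link_ok(chain[k], chain[k + 1]) for k in range(len(chain) - 1))

root_n, root_d = make_key(2**61 - 1, 2**89 - 1)
mid_n, mid_d = make_key(2**107 - 1, 2**127 - 1)
user_n, _user_d = make_key(2**31 - 1, 2**19 - 1)

c1 = issue("Top CA", root_n, "Top CA", root_n, root_d)           # self-signed
c2 = issue("Intermediate CA", mid_n, "Top CA", root_n, root_d)
c3 = issue("server.example", user_n, "Intermediate CA", mid_n, mid_d)

unbroken = chain_ok([c1, c2, c3])
forged = Cert("evil.example", "Intermediate CA", user_n, c3.signature)
broken = chain_ok([c1, c2, forged])
```
]]></code>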

    <section>
      <title>Trusted Certificates</title>
      <p>Now that there is a way to authenticate a certificate by
        checking that all links of a certificate chain are unbroken,
        the question is how you can be sure to trust the certificates
        in the chain, and in particular the top CA certificate of the
        chain.
        </p>
      <p>To provide an answer to that question consider the
        perspective of a client, which has just received the
        certificate of the server. In order to authenticate the server
        the client has to construct a certificate chain and to prove
        that the chain is unbroken. The client has to have a set of CA
        certificates (top CA or intermediate CA certificates) not
        obtained from the server, but obtained by other means. Those
        certificates are kept <c>locally</c> by the client, and are 
        trusted by the client.
        </p>
      <p>More specifically, the client does not really have to have
        top CA certificates in its local storage. In order to
        authenticate a server it is sufficient for the client to
        possess the trusted certificate of the issuer of the server
        certificate.
        </p>
      <p>Now that is not the whole story. A server can send an
        (incomplete) certificate chain to its client, and then the
        task of the client is to construct a certificate chain that
        begins with a trusted certificate and ends with the server's
        certificate. (A client can also send a chain to its server, 
        provided the server requested the client's certificate.)
        </p>
      <p>All this means that an unbroken certificate chain begins with
        a trusted certificate (top CA or not), and ends with the peer
        certificate. That is, the end of the chain is obtained from the
        peer, but the beginning of the chain is obtained from local
        storage, which is considered trusted.
        </p>
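      <p>How a chain is assembled from peer-supplied and locally stored
        certificates can be sketched in Python. The <c>Cert</c> tuple, the
        function <c>build_chain</c> and the subject names are assumptions
        made for this illustration. Signature checking of each link is
        deliberately left out; only the search for a trusted starting
        point is shown.
        </p>
<code type="none"><![CDATA[
```python
from collections import namedtuple

Cert = namedtuple("Cert", "subject issuer")

def build_chain(peer_certs, trusted):
    """Return [trusted, ..., peer] or None if no trusted start is found.

    peer_certs: certificates sent by the peer, its own certificate last.
    trusted:    dict from subject name to a locally stored, trusted Cert.
    """
    from_peer = {cert.subject: cert for cert in peer_certs}
    chain = [peer_certs[-1]]
    while True:
        issuer_name = chain[0].issuer
        if issuer_name in trusted:          # reached local, trusted storage
            return [trusted[issuer_name]] + chain
        issuer = from_peer.get(issuer_name)
        if issuer is None or issuer in chain:
            return None                     # no trusted start, or a loop
        chain.insert(0, issuer)

top = Cert("Top CA", "Top CA")
mid = Cert("Intermediate CA", "Top CA")
server = Cert("server.example", "Intermediate CA")

# The peer sends an incomplete chain; the top CA comes from local storage.
via_top = build_chain([mid, server], {"Top CA": top})
# Trusting the direct issuer of the server certificate is sufficient.
via_mid = build_chain([server], {"Intermediate CA": mid})
# Without any trusted certificate the chain cannot be completed.
untrusted = build_chain([mid, server], {})
```
]]></code>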
    </section>
  </section>
</chapter>