Securing Communications with SSL/TLS: A High-Level Overview

by Chris Pepper

In Part 1 of this article, we discuss SSL/TLS and how they work. In Part 2, we will discuss command-line procedures for actually using SSL/TLS certificates. In Part 3, we will look at some streamlined CA tools.

SSL (Secure Sockets Layer) and TLS (Transport Layer Security) are systems for providing security to Internet communications (especially web browsing). Specifically, they use encryption to provide confidentiality (privacy) and authentication (proof of identity). There are three major versions of SSL (v1, v2, and v3); the fourth version was renamed before release, to become "TLS v1". SSL and TLS are based upon a) public key encryption and decryption, b) simple identifying information, and c) trust relationships. In combination, these elements make SSL/TLS useful for protecting a broad range of Internet communications.

http://en.wikipedia.org/wiki/Public_key

If you just want to surf the web, you can probably ignore this article, but if you are concerned about 'phishing' scams and identity theft (and really, pretty much everybody should be to some degree or other in 2007), this article should help you understand one of the more important protections from online criminals. For people who manage websites (a rapidly increasing population), information about how to deal with SSL/TLS and certificates may be helpful, both for providing privacy & security, and also for deciding whether it's appropriate or worthwhile to purchase a certificate. The rest of this series will provide details on the technical procedures. It's worth mentioning that certificates are useful for purposes beyond SSL/TLS encryption; they can also be used for email encryption and software code signing (of whole applications, ActiveX controls, Firefox plug-ins, etc.).

To establish an SSL/TLS connection, one or both parties must have a certificate, which includes start and end dates for validity, the name of the entity certified, and one or more "signatures", which attest to its validity. In HTTPS communications (encrypted web browsing), the server always provides a certificate; the client may as well, although client certificates are not yet common.

Public Key Encryption: The Short Version -- Regular ('symmetric') encryption works by using a password (also called a key) to mathematically transform text into gibberish. Only the same password can be used to reverse the process, recreating the original plaintext. The problem for communications is that it requires both communicators to know the password (and the encryption/decryption algorithms), and to keep it secret from everybody else. This clearly doesn't scale easily - it wouldn't be possible to visit every person or organization you communicate with, create a new secret password, and then use that password just to communicate with that party, with a unique and secret password for each bank, online vendor, community site, etc. Both establishing all those new passwords, and keeping track of them securely, would be very difficult.
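To make the shared-password idea concrete, here is a toy sketch in Python. It is an illustration only, not a real cipher: the keystream construction (SHA-256 over the password plus a counter) and the function names are assumptions made for this example, and real systems use vetted algorithms such as AES.

```python
import hashlib

def keystream(password: bytes, length: int) -> bytes:
    """Stretch a password into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(password + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(password: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call both encrypts and decrypts."""
    stream = keystream(password, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

plaintext = b"Meet me at noon"
ciphertext = xor_cipher(b"shared secret", plaintext)     # gibberish on the wire
recovered = xor_cipher(b"shared secret", ciphertext)     # same password reverses it
wrong = xor_cipher(b"another password", ciphertext)      # wrong password: still gibberish
```

Both sides must already hold "shared secret" before any message can be exchanged, which is exactly the scaling problem described above.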

Public key encryption (also called asymmetric cryptography) addresses both problems by working with pairs of keys (called "private" and "public"), each of which can reverse the other (but not itself!). This is very strange to people who are only familiar with symmetric encryption. Paired keys solve several problems in privacy and identification.
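The "each key reverses the other, but not itself" property can be seen with textbook RSA and deliberately tiny numbers. This is a sketch under stated assumptions: RSA is one common public key algorithm (the text above does not name one), the primes 61 and 53 are toy values chosen for readability, and real keys are thousands of bits long and use padding.

```python
# Textbook RSA with tiny primes (toy values; real keys are far larger).
p, q = 61, 53
n = p * q                      # 3233, part of both keys
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent: (n, e) is the public key
d = pow(e, -1, phi)            # private exponent (Python 3.8+): 2753

message = 65                   # a message must be a number smaller than n

# The public key scrambles, the private key reverses it.
c = pow(message, e, n)
assert pow(c, d, n) == message

# The private key scrambles, the public key reverses it.
s = pow(message, d, n)
assert pow(s, e, n) == message

# A key does not reverse itself: applying the public key again does not help.
assert pow(c, e, n) != message
```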

Possession of a private key can "prove" identity: As a rule, only a private key's creator can encrypt and decrypt with that key (since private keys are never shared). For an over-simplified example, imagine a Citibank customer uses her private key to encrypt her account number, and sends it to www.citibank.com. If the bank has her public key on file and linked to an account, successful decryption provides fairly strong assurance that the party who sent the encrypted account number is the right customer - private keys are much harder to steal or forge than ink signatures on paper. As a bonus, digital signatures work instantaneously over the Internet.
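A digital signature along these lines can be sketched in a few lines of Python. The assumptions are loud ones: this is unpadded textbook RSA, the key is built from two well-known Mersenne primes purely because they are easy to write down (real keys use secret random primes), and the function names are hypothetical. Real signature schemes also add padding that this sketch omits.

```python
import hashlib

# Toy RSA keypair from two well-known Mersenne primes (illustration only).
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                              # public exponent, on file at the bank
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent, known only to the customer

def digest(message: bytes) -> int:
    """Hash the message and reduce it into the toy key's range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    """Customer side: transform the message hash with the private key."""
    return pow(digest(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Bank side: reverse the signature with the public key and compare hashes."""
    return pow(signature, e, n) == digest(message)

request = b"account 12345: transfer $100"
signature = sign(request)
genuine = verify(request, signature)
tampered = verify(b"account 12345: transfer $900", signature)
```

Here `genuine` is true and `tampered` is false: changing the message changes its hash, so the old signature no longer matches.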

Digital signatures have one highly unusual characteristic. Most secrets tend to 'leak out' if they're used too frequently, but digital signatures (and private keys in general) become more valuable as they are used, building up credibility (in public key terms, this is called "trust"). Keys start out with no trust, and may gain trust through a) faith ("Nobody would bother to break into my personal webmail server."), b) assurance (when someone else vouches for them), c) out-of-band verification, and d) experience. Even better is personal verification. If you give me your key at an event where I can recognize your face, I can have a lot of confidence in that key. Each such key exchange event adds value to the keys exchanged.

In reality, sending account numbers is not a good use of encryption, because if an attacker knows both the ciphertext (which we have to assume could be intercepted - if we knew nobody could tap our communications, we wouldn't need encryption!) and the plaintext, they might (theoretically) be able to find a correlation which helps break the encryption or calculate the key used. Real encryption tends to use lots of random numbers and disposable keys, precisely to avoid the possibility of "known plaintext" attacks.

Unfortunately, private/public key encryption and decryption are slow - they're much more difficult to compute than conventional single-key algorithms, due to the exotic mathematics underpinning asymmetric algorithms. Most public-key cryptosystems (including SSL/TLS) actually encrypt the data to be exchanged with symmetric encryption, which is fast and efficient. Asymmetric encryption is reserved for exchange of the (short-lived) symmetric keys. As a bonus, this combination frustrates cryptanalysis by not providing large amounts of data encrypted with any one key to analyze. Symmetric keys are only used for a short time and then discarded, while asymmetric keys are only used for (symmetric) key exchange, rather than for all encrypted data.

Imagine an idealized example:

Citibank and I each separately create our own private/public keypairs, which we can use with each other and with anyone else.
I create a new bank account, and Citibank and I exchange public keys (in addition to, or instead of, my handwritten signature). Note that we never give our private keys to anyone else; having a private key could be considered to grant a limited power-of-attorney.
I visit www.citibank.com with my web browser.
Citibank's web server randomly generates a number between 0 and 2^1024-1 ("a 1024-bit number"), which we will call 'randomS'.
Citibank encrypts randomS with my public key, and sends it to my browser.
My browser decrypts randomS with my private key.
My browser generates another 1024-bit random number, encrypts it with Citibank's public key, and sends it to Citibank (call this 'randomC').
Now that Citibank's web server and my browser both know two secret numbers (and nobody else can, because they don't have our private keys to decrypt and discover the secrets, even if they are eavesdropping on our communications), we can combine randomS and randomC and some additional random data to create a session key (perhaps use a logical XOR operation on all the bits to derive a new number - the real algorithm is more complicated).
Each time either of us wants to send any information to the other side, whether a URL or a web page, we use symmetric encryption such as AES-128 (the Advanced Encryption Standard with 128-bit keys) to quickly encrypt whatever we send with the session key, and the recipient uses the same session key to decrypt.
Every two minutes we repeat the key exchange procedure to generate a brand-new session key; this counters decryption attacks based on analyzing large amounts of ciphertext, by ensuring that a cryptanalyst never has much encrypted data from any one (session) key to work with.
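The idealized exchange above can be sketched end to end in Python. Everything here is a simplification made for illustration: the keypairs are textbook RSA built from well-known Mersenne primes, the random numbers are 64 bits rather than 1024, the session key is derived with a plain SHA-256 hash, and a toy XOR keystream stands in for AES. None of this is the real SSL/TLS handshake, and the function names are hypothetical.

```python
import hashlib
import secrets

def make_keypair(p: int, q: int, e: int = 65537):
    """Build a toy RSA keypair from two primes: returns (public, private)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)

def rsa_apply(key, number: int) -> int:
    """Apply one half of a keypair; only the other half reverses it."""
    n, exponent = key
    return pow(number, exponent, n)

def session_cipher(session_key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (XOR with a SHA-256 keystream), standing in for AES."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(session_key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def derive_session_key(s: int, c: int) -> bytes:
    """Combine both secrets into one key (a hash, not the real TLS derivation)."""
    return hashlib.sha256(s.to_bytes(8, "big") + c.to_bytes(8, "big")).digest()

# Each party creates its own keypair and publishes only the public half.
bank_public, bank_private = make_keypair(2**61 - 1, 2**89 - 1)
my_public, my_private = make_keypair(2**107 - 1, 2**127 - 1)

# The server picks randomS and sends it encrypted with my public key.
random_s = secrets.randbits(64)
random_s_at_client = rsa_apply(my_private, rsa_apply(my_public, random_s))

# My browser picks randomC and sends it encrypted with the bank's public key.
random_c = secrets.randbits(64)
random_c_at_server = rsa_apply(bank_private, rsa_apply(bank_public, random_c))

# Both sides now hold both secrets and derive the same session key.
server_key = derive_session_key(random_s, random_c_at_server)
client_key = derive_session_key(random_s_at_client, random_c)

# Bulk data travels under the fast symmetric cipher.
wire_bytes = session_cipher(client_key, b"GET /accounts HTTP/1.1")
received = session_cipher(server_key, wire_bytes)
```

Repeating the exchange with fresh random numbers yields a new session key, which is all that the periodic re-keying step amounts to in this model.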

http://en.wikipedia.org/wiki/Advanced_Encryption_Standard

Session key agreement is actually more complicated than that, and SSL/TLS normally works without a client certificate, but you see the key concepts. Note that I can safely use the same procedure with any number of different correspondents (web sites), discarding the session keys after use, and re-using the same private key in all my communications.

If I generate a random SSL/TLS client certificate for each session, I can communicate with Citibank using this model, but there is no cryptographic assurance that I'm the customer I claim to be. In the future, customers may provide public keys when opening important accounts, providing strong cryptographic identification, linked to account authorization. For now, though, banks still use passwords and other methods such as physical password generators (called "hard tokens") and scratch-off password sheets for user authentication. It is worth mentioning that dual-key authentication is a central feature of the ssh ("Secure SHell") protocols for remote login, which were developed separately from SSL but rely on the same public key techniques.

Since banks don't actually provide their public keys to customers when opening bank accounts, things are somewhat more complicated. SSL/TLS certificates combine cryptographic information, identity information, and external assurance information, enabling me to trust that the web server I am communicating with really does belong to Citibank.

Who Do You Trust? -- Keeping in mind that public keys are really just large numbers, how can we know who or what is behind a given public key? After all, I could create a certificate and claim it belongs to the pope. SSL/TLS handles this with trusted certificate authorities, where some trusted party vouches for a given certificate. Every web browser ships with a bundle of trusted "root" SSL/TLS certificates, and every certificate signed by them is trusted by that browser. Additionally, the entities that own these certificates (called "certification authorities", "certificate authorities", or "CAs") may delegate their trust, signing 'intermediate' certificates which are also trusted to sign further certificates; this hierarchy of trust is called a "certificate chain". So long as you stay within the lines (only visit sites certified - directly or indirectly - by CAs trusted by your browser), you need not worry about this. If you want to step outside the lines, however, things are more complicated, as described later.
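The chain-following logic a browser applies can be sketched as a walk up the issuer links. This is a minimal sketch with made-up certificate names; it only follows names, whereas a real browser also verifies every signature, the validity dates, and the hostname at each step.

```python
# Hypothetical certificates, reduced to the one field that matters for chaining.
certificates = {
    "www.example.com": {"issuer": "Example Intermediate CA"},
    "Example Intermediate CA": {"issuer": "Example Root CA"},
    "Example Root CA": {"issuer": "Example Root CA"},     # roots sign themselves
    "www.pope.example": {"issuer": "www.pope.example"},   # self-signed, not bundled
}

# The bundle of root certificates shipped with the (imaginary) browser.
trusted_roots = {"Example Root CA"}

def chain_to_trusted_root(subject, max_depth=10):
    """Follow issuer links upward; return the chain if it reaches a trusted root, else None."""
    chain = [subject]
    for _ in range(max_depth):
        if subject in trusted_roots:
            return chain
        cert = certificates.get(subject)
        if cert is None or cert["issuer"] == subject:
            return None          # unknown issuer, or self-signed and not trusted
        subject = cert["issuer"]
        chain.append(subject)
    return None                  # chain too long to accept

trusted_chain = chain_to_trusted_root("www.example.com")
untrusted_chain = chain_to_trusted_root("www.pope.example")
```

The first lookup returns the three-step chain ending at the bundled root; the self-signed "pope" certificate returns None, which is the point at which a browser would show its warning.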

CAs are not the only way to establish trust, of course. In particular, PGP/GPG (Pretty Good Privacy/GNU Privacy Guard, tools for encrypting and decrypting files and messages) uses a "web of trust" concept, eschewing commercial authorities in favor of people signing each other's public keys. SSH doesn't assume a particular method of establishing trust; access is granted either by an administrator according to local policies, or by installing one's own key file via some pre-existing authorization.

Establishing an SSL/TLS Certificate: a Walkthrough -- To establish a secure website, I first create a public/private keypair. I keep my private key secret, and never share it with anyone. Next, I combine the public key with some identifying information, such as the site's domain name and owner, to create a certificate signing request (CSR). CSRs themselves aren't useful, but once a CSR is signed it becomes a certificate, which can be used to identify a server for SSL/TLS communications.

Typically, I send my CSR to a Certificate Authority, which then confirms to its satisfaction (their requirements vary widely) that I'm authorized to use this (domain) name, adds start and end dates to the CSR, and then signs the whole thing with the CA private key, producing a certificate. The CA sends the certificate back to me, and my web server software can then use it (paired with the corresponding private key, which never left my possession) to establish SSL/TLS connections with browsers. Browsers in turn trust my web server based on recognizing and trusting the CA's signature. The start and end dates help ensure that certificates aren't used forever (good for security), and also to push users to pay for renewed certificates.
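The CSR-to-certificate flow can be modeled in Python as follows. This is a sketch under heavy assumptions: real CSRs and certificates are X.509 structures rather than dictionaries, the keys are textbook RSA built from well-known Mersenne primes, and every name here (ca_sign, browser_verify, "Example CA") is hypothetical.

```python
import hashlib
import json
from datetime import date, timedelta

def make_keypair(p, q, e=65537):
    """Toy RSA keypair from two primes: returns (public, private)."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def fingerprint(fields: dict, n: int) -> int:
    """Hash the certificate fields into a number the toy key can sign."""
    encoded = json.dumps(fields, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big") % n

site_public, site_private = make_keypair(2**61 - 1, 2**89 - 1)
ca_public, ca_private = make_keypair(2**107 - 1, 2**127 - 1)

# 1. The site owner builds a CSR: public key plus identifying information.
#    The site's private key never leaves the owner's possession.
csr = {"subject": "www.example.com", "owner": "Example Corp",
       "public_key": list(site_public)}

# 2. The CA adds start and end dates, then signs everything with its private key.
def ca_sign(request: dict, issued: date, days_valid: int = 365) -> dict:
    fields = dict(request, issuer="Example CA",
                  not_before=issued.isoformat(),
                  not_after=(issued + timedelta(days=days_valid)).isoformat())
    n, d = ca_private
    return {"fields": fields, "signature": pow(fingerprint(fields, n), d, n)}

# 3. A browser that trusts the CA checks the signature with the CA's public key.
def browser_verify(cert: dict) -> bool:
    n, e = ca_public
    return pow(cert["signature"], e, n) == fingerprint(cert["fields"], n)

certificate = ca_sign(csr, date(2007, 1, 1))

# Altering any signed field (here, the site name) invalidates the signature.
forged = {"fields": dict(certificate["fields"], subject="www.evil.example"),
          "signature": certificate["signature"]}
```

Note that the CA signs the dates along with the name and public key, so none of them can be changed afterward without the CA's private key.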

Warnings -- Because CAs vouch for the identity of the certificate's owner, they tend to be picky about the details of the certificate request. Misspelling a name can delay certificate issuance, and requests for certificates under different business names can be even more troublesome.

Since people trust signed certificates to identify web sites and protect their confidentiality, SSL/TLS keys (the secret part) must be kept secret and safe. In the best case, if you lose your signed certificate or key, you could be out a few hundred dollars and offline while getting a new one. In the worst case, if a hostile party (cracker, FBI agent, or your ex) got a copy of an SSL/TLS private key, they could either impersonate the real site, or decrypt all supposedly secure communications sent to that site. There is a US federal standard (FIPS 140) dealing with how to secure such confidential data, and it goes into topics such as tamper-proof hardware and multi-party authorization, but most people secure their private keys either with a password (which must be entered to start the web server after a reboot), or simply by protecting the computer containing the key, which enables rebooted computers to resume serving HTTPS web sites without human intervention. This is important to think about when first venturing into SSL/TLS, and much more so for Certificate Authorities.

http://en.wikipedia.org/wiki/FIPS_140-2

Of Course It's More Complicated Than That -- With Internet Explorer 7, Microsoft has introduced "Extended Validation" (EV) for "High Assurance" SSL/TLS certificates, adding additional checks of the SSL/TLS configuration, warning of reported problematic sites, and adding specific requirements from CAs. Other browsers, such as Firefox (and thus presumably Safari at some point), are expected to follow suit. EV certificates are of course more expensive.

http://en.wikipedia.org/wiki/Extended_Validation_Certificate

A Certificate Authority is responsible for verifying that each request comes from the party described in the certificate, that this organization has legitimate ownership of the domain, and that the requester is authorized to make the request - in the case of Citibank, this would be someone with responsibility for the citibank.com web servers. The details of what is required and how it is verified vary between CAs. EV certificates are more expensive, and intended to have higher standards than non-EV certificates.

There are alternatives to paying a Certification Authority to sign your certificate. First, you could sign it yourself; such a self-signed certificate lacks a third party's assurance of authenticity, but provides exactly the same encryption as a "real" certificate with a CA's signature. Alternatively, you can become your own CA and avoid the CA fees. For large numbers of certificates, the savings can justify the substantial overhead of managing a CA. The problem is that visitors to your site must a) deal with a (legitimate) security warning, and b) trust either the site certificate or your new CA certificate; procedures for doing so vary across browsers and versions, and (because criminals can become CAs as easily as anyone else) some browsers make it deliberately difficult to trust a new CA.

Prices vary widely among the different CA companies. VeriSign is one of the largest and most expensive, charging $1,000 for a 128-bit certificate lasting a year, or $1,500 with EV. When Thawte undercut VeriSign's prices and threatened their market share, VeriSign bought Thawte, retaining the brand for cheaper certificates ($700 or $900 for 128-bit certificates for a year, without or with EV), although the process of installing a Thawte certificate is more difficult, because an intermediate certificate must also be installed. Recently, when GeoTrust threatened VeriSign's popularity and pricing, VeriSign repeated the performance (GeoTrust charges $180 for a 128-bit 1-year certificate, but also $900 for EV). Because certificates are so expensive, CAs offer various discounts for longer-lasting certificates or multiple purchases, and renewals are typically cheaper than new certificates. Most CAs are quite conscientious about reminding their users to renew certificates before they expire (and pay for the privilege), but they're generally good about preserving any unused time, so there is no penalty for early renewal. A late renewal can be quite embarrassing, as users are asked if they trust the expired certificate; putting certificate expirations into a calendar program such as iCal can help avoid these problems.

http://www.verisign.com/ssl/buy-ssl-certificates/secure-site-services/

http://www.thawte.com/ssl-digital-certificates/buy-ssl-certificates/

http://www.geotrust.com/buy/geotrust_ssl_certs.asp

All CAs offer the same basic service (signing of SSL/TLS CSRs to produce trusted certificates), but there are many variables, including reputations, administrator convenience of the certification process, admin convenience of actually using the certs, user convenience accessing certified sites, and CA policies. In an attempt to justify their prices, many CAs offer guarantees of integrity for the certificates (web sites) they certify, such as VeriSign's Secured Seal program.

When Is SSL/TLS Useful? -- In real-life terms, people use SSL/TLS for two reasons: privacy and identity assurance. First, the encryption helps prevent criminals from prying into electronic communications, and particularly from capturing passwords, which would provide access to email, bank accounts, etc. Second, SSL/TLS certificates provide a fairly good (though certainly not perfect) guarantee that a web site with the lock icon is legitimate and trustworthy. Unfortunately, there are easier ways to attack SSL/TLS sites than actually breaking the encryption, including using similar names to legitimate sites (with foreign alphabets, they may even be visually indistinguishable from the legitimate name), using JavaScript to fake the SSL/TLS lock, and even putting a lock icon into the page content, where many people will not realize it's a design artifact rather than a security guarantee.

To see a site's SSL/TLS certificate details, visit the site in a browser (the SSL/TLS URL will start with "https://"), and click the lock icon (Safari shows it in the upper-right corner; Firefox and IE use the lower-right corner). As an example, Apple's https://store.apple.com/ certificate was issued by the "VeriSign Trust Network" and signed by "VeriSign, Inc." That VeriSign certificate was in turn signed by VeriSign's "Class 3 Public Primary Certification Authority". The "Class 3" certificate is trusted by most browsers in use today. In Mac OS X, it's visible in Keychain Access, in the "X509Anchors" keychain (SSL/TLS certificates are based on the X.509 standard); Firefox also keeps the X.509 root certificates inside Firefox.app, because Firefox doesn't use the Apple Keychain. Because the Class 3 certificate is built in, Safari and Firefox users see a lock icon instead of scary warnings when using SSL/TLS sites authorized by that Class 3 certificate, including https://store.apple.com/.
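The same details can also be read programmatically. The sketch below uses Python's standard ssl module; the helper names (fetch_certificate, summarize) and the sample values are hypothetical, and fetch_certificate needs network access, so the example runs summarize on a hand-written dictionary in the shape that getpeercert() returns.

```python
import socket
import ssl

def fetch_certificate(hostname: str, port: int = 443) -> dict:
    """Connect over TLS and return the server certificate as a dict (needs network)."""
    context = ssl.create_default_context()    # trusts the system's root bundle
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()

def summarize(cert: dict) -> dict:
    """Pull the name, issuer, and validity dates out of a getpeercert() dict."""
    def field(name_tuples, key):
        # subject and issuer are tuples of RDNs, each a tuple of (key, value) pairs
        for rdn in name_tuples:
            for k, v in rdn:
                if k == key:
                    return v
        return None
    return {
        "subject": field(cert.get("subject", ()), "commonName"),
        "issuer": field(cert.get("issuer", ()), "organizationName"),
        "not_before": cert.get("notBefore"),
        "not_after": cert.get("notAfter"),
        # seconds since the epoch, handy for "days until expiry" reminders
        "expires_epoch": ssl.cert_time_to_seconds(cert["notAfter"]),
    }

# Hand-written sample in the shape getpeercert() returns (all values made up).
sample = {
    "subject": ((("organizationName", "Example Corp"),),
                (("commonName", "store.example.com"),)),
    "issuer": ((("countryName", "US"),),
               (("organizationName", "Example CA, Inc."),)),
    "notBefore": "Jan  1 00:00:00 2007 GMT",
    "notAfter": "Jan  1 00:00:00 2008 GMT",
}
summary = summarize(sample)
```

With network access, summarize(fetch_certificate("store.apple.com")) would report the live certificate; the issuer shown today will differ from the 2007 chain described above.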

http://en.wikipedia.org/wiki/X.509

There are many CAs, but working with a new CA is problematic compared to using a better established CA. In this case "better established" means bundled into more browsers and versions, because when a browser visits a site with an unknown certificate it presents a (deliberately scary) warning that security cannot be assured, and nobody wants that to be the first user experience of their site - especially when selling online. This applies equally to self-signed certificates, those signed by a private CA (such as a university or company without a "parent" signature from an official CA), and certs signed by "upstart" commercial CAs not yet bundled in the user's particular browser.

http://news.netcraft.com/SSL-survey

On the other hand, establishing your own private CA costs nothing - the free OpenSSL can do it all. It just takes an investment of time to learn the procedures and a security commitment to protect the root key, which is the security linchpin for all child certificates. The details are outside the scope of this article, but there are several online resources to get started, and the procedure can be automated and streamlined quite effectively. In my own testing, I've produced two simple scripts (cert.sh and sign.sh); using either one of these scripts, I provide the hostname (twice) and the root key's passphrase, and hit Return a bunch of times; most of the rest is automated. OpenSSL includes CA.pl, another script to automate these tasks.

http://www.openssl.org/

Stuffing the Genie back in the Bottle: Intermediates and Revocation -- The X.509 standard supports multi-level 'chains' of certificates and signatures. Most chains are only two or three steps long - a root certificate, a server certificate, and perhaps an intermediate certificate. Fortunately, intermediates generally don't affect the user experience of SSL/TLS, although they make life somewhat more complicated for site administrators. If you get an intermediate certificate from your CA (commercial or private), simply follow the instructions that came with it.

If you lose your car or house keys, you can change the locks. For SSL/TLS the equivalent is 'revocation', whereby one identifies a keypair as compromised and informs people not to use it. Unfortunately, revocation is a very difficult problem for several reasons. For one, revocations must be managed as carefully as certificate signatures - it would be unacceptable if someone could revoke Amazon's SSL/TLS certificate out of spite - but private keys are tightly restricted, so what if the computer containing the only copy of the key is stolen? Additionally, the SSL/TLS design doesn't make any assumptions or demands about timeliness, but if a certificate has been compromised, the revocation should happen before anyone is able to commit fraud with the stolen certificate & key. As a result, although there are many revocation systems, they are largely unused.

But Who Cares, Anyway? -- Anyone who ever enters sensitive information at a website, whether it's a credit card number, phone number, home address, or supposedly anonymous rant, should check for 'https' in the URL, and consider seriously any warnings about expired, mis-named, or otherwise untrusted certificates. If your browser warns you about a site, please consider the warning carefully, and decide if it means you should go elsewhere or proceed with your eyes open.

This also applies to other electronic communications. Email communications should also be protected with encryption, and SSL/TLS is one of the easier ways, although server-to-server connections are rarely encrypted, which is a weakness not normally present with web surfing. Fortunately, Mail.app and Apple's .Mac mail service support SSL/TLS (but unfortunately not for webmail, which is bad). To configure a .Mac account to use SSL/TLS, click SMTP Settings in "Account Information" and select "Use Secure Sockets Layer (SSL)", then select "Use SSL" under the "Advanced" tab.

File transfers and remote login can be protected with ssh encryption (similar to SSL/TLS, but implemented differently), or even tunneled through SSL/TLS, although this is uncommon. Unencrypted telnet and FTP are no longer safe on today's Internet, and should not be used. If your ISP provides telnet and FTP access, but not ssh and SFTP access, it's time to find a new ISP!

For site administrators things are more complicated, of course. Every service should offer encryption, to start, and SSL/TLS is often the easiest way to do this. But what kind of certificates should you use? Public e-commerce sites, or those dealing with other highly sensitive information, should be using 128-bit commercial certificates. The details of which certificate you should buy will depend on the site itself, but it's worth keeping in mind that the main differentiators are around visitor confidence (EV certificates, well-established root keys) and ease of use for administrators, while the actual signing process is cryptographically equivalent for all CAs.

But if your site doesn't depend on consumer confidence, then what should you do? The two alternatives to a commercial certificate are: one or more self-signed certificates, and creating your own Certificate Authority (which is actually self-signed as well). In general, for one or two hostnames (since certificates are tied to hostnames), using self-signed certificates is fairly easy. This is perfect for personal sites, where a few hundred dollars to trust yourself or a close friend would be a waste of money. Even for sites which do not provide SSL/TLS access, administrative access (updating blogs, checking statistics online, etc.) is a perfect use for a self-signed certificate.

If you have many sites, such as a university or corporation, it may make more sense to create your own CA, and use that to sign individual certificates. The advantage here is that users can trust your CA once, and never again have to deal with untrusted certificate warnings (unless they switch computers or browsers, in which case the process must be repeated). If you opt to go down this path, you should first think seriously about both electronic and physical security of your root certificate's key, including backups and staff turnover. Fortunately, being a CA is not technically much more complicated than self-signing a certificate, although assisting users with installing root certificates is deliberately more complicated than simply trusting a self-signed certificate in some browsers.

What Comes Next? -- In Part 2 of this article, we will use standard command-line tools to work with SSL/TLS certificates.