Understanding encryption and cryptography basics

Cryptography doesn't have to be so cryptic. Here's a primer, in plain English, to walk you through the basics.

This article can also be found in the Premium Editorial Download: Information Security magazine: Negative exposure: Web scanners reveal unknown holes:

AES, DES, RSA, ECC -- there are so many ways to encrypt your data. Whether your company's protecting customer credit card information, securing remote user connections to your network or protecting your intellectual property from digital piracy, you're using encryption every day.

But crypto can be intimidating to the uninitiated, and there are a daunting array of options. In the 1980s, there was only one real choice -- the Data Encryption Standard (DES). That's changed. Today, we have a broad selection of stronger, faster and better-designed algorithms. Now, the problem is to sort out the choices.

But what's the difference? How do you know if you're buying industrial-strength protection or if your developers are choosing the right encryption for the job? Where do you begin to make sense of it all?

Start right here. This primer will help you decipher the jumble of TLAs (three-letter acronyms) that define encryption.

Key concepts

The basic notion of all ciphers is to allow two people -- popularly called "Alice" and "Bob" -- to exchange information privately.

Although there are many ciphers to choose from, the need for compatibility often limits our choices. For example, to ensure secure connections between a Web site and just about any browser, the site must support RC4, since the overwhelming majority of Web browsers support that cipher. If Alice sends encrypted files to Bob, Bob can't choose the cipher to use. He must use the same cipher as Alice, or he won't be able to decrypt her messages, and she won't understand his replies.

Quality encryption always follows a fundamental rule: the actual procedure being used, the algorithm, doesn't need to be kept secret. But the key does. Even the sharpest hacker in the world will be unable to decrypt data as long as the key remains secret.

Today's ciphers use either secret-key or public-key techniques. Secret-key ciphers can be used to protect critical/sensitive data. Because secret-key ciphers use a single key that Bob and Alice must share, this is also known as symmetric cryptography. In 1949, Claude Shannon of Bell Laboratories published the fundamental theory behind secret-key ciphers, and the decades of evolution since then have yielded high-caliber examples. However, it was not until 1975 that a really strong secret-key algorithm, DES, was made available for general use.

Public-key, or asymmetric, cryptography also emerged in the mid-1970s. Public-key ciphers use a pair of keys: the public key that gets shared with other people, and a corresponding private key that is kept secret by its single owner. For example, Alice can create a key pair and share the public key with Bob and anyone else who might want to send her a secret message. Bob can encrypt a message to Alice by using her public key, and Alice can decrypt it using her private key.

At first, public-key encryption was a solution looking for a problem. Developers knew how to use it to build certain types of mechanisms -- notably digital signatures and key distribution protocols -- but it took a couple of decades of evolution before these applications achieved widespread use.

Secret-key ciphers

Secret-key ciphers generally fall into one of two categories -- stream or block ciphers (see Figure 1).

Stream ciphers encrypt data as a sequence of bits, one bit at a time. The best-known stream cipher is probably Ron Rivest's Cipher #4 (RC4), which most e-commerce sites use to encrypt the data passing between the browser and Web server. Other stream ciphers include A5, which encrypts GSM cellphone traffic, and the cipher that encrypts satellite television signals.

Traditionally, cryptanalysts have broken secret codes by searching for repeating patterns in the data, and the essence of a strong code is the ability to hide such patterns. A stream cipher creates a sequence of bits that doesn't repeat for a very long time, and uses that sequence to hide the message.

There's a catch. While the stream cipher might prevent an attacker from reading the encrypted message, he can modify it by systematically making changes to the message's plaintext, switching individual bits on and off in the ciphertext.

For example, imagine that Bob is sending a payment instruction to his bank. An attacker named Henry can intercept the message and easily change the amount from $50 to $99, or from $50 to $10, as long as he knows the message's layout and changes the appropriate bits. Henry doesn't even need to know the secret key being used. Researchers identified this problem in the way RC4 was used in older versions of Microsoft's PPTP remote protocol, and Microsoft had to fix the problem. Such difficulties help explain why block ciphers are more widely used.

Block ciphers have been the workhorse of computer-based encryption since DES was introduced. A block cipher encrypts data one fixed-size block at a time -- rather than bit by bit -- producing the same sized block of encrypted data. For example, DES accepts 64-bit blocks of data and uses a 56-bit secret key to produce 64-bit blocks of ciphertext. Unlike stream ciphers, the block cipher spreads a single bit of the plaintext across the entire encrypted block. If an attacker modifies a single bit in the ciphertext, the change will usually ripple through every bit in the block when it is decrypted. So, if Henry changes a single bit in a block-encrypted message, he'll turn a whole series of characters into gibberish.

While block ciphers eliminate the problem of bit-by-bit changes to ciphertext, they present another issue. If we use a block cipher to simply encrypt data one block at a time, then any time we repeat a block of data in the plaintext, we'll produce an identical block of ciphertext. Attackers can use this to start guessing what the messages might say.

Figure 1: Block ciphersFigure 1

For example, imagine that Bob and Alice are dissidents under an oppressive regime, and they use encryption to exchange uncensored news reports and have e-mail discussions that quote large portions of those reports. Peeping Tom, who works for the secret police, monitors their encrypted e-mail. If Bob regularly takes Voice of America news reports and encrypts them, Tom can look for repeated blocks of ciphertext and infer that they talk about the same subject. If Tom realizes that he's looking at encrypted news reports, he can even look for patterns in the ciphertext and see if they match patterns of character blocks in the plaintext news reports. After a while, Tom will be able to tell a lot about encrypted messages just by looking at which ciphertext blocks appear in them. Moreover, Tom can pull blocks out of one message and insert them into another, and the resulting message will decrypt into readable plaintext if he sends them to Bob or Alice.

AES is available in VPN products from major firewall manufacturers, and has been incorporated into toolkits produced by cryptography vendors.

Clearly, another layer of cryptographic manipulation is needed to prevent this problem. The solutions are called "modes."

Modes are employed in most encryption products to combine the block cipher with the plaintext. Three modes were developed and standardized for use with DES: cipher block chaining (CBC), cipher feedback (CFB) and output feedback (OFB). Each mode has its own way of combining data between encryption steps, and so each one achieves slightly different results. A fourth method, electronic code book (ECB), is where no mode is used and the cipher is simply applied by itself.

Figure 2: Cipher block chainingFigure 2

CBC, the most common mode, combines the previous block of ciphertext with the next block of plaintext before encrypting it (see Figure 2). A random value -- an "initialization vector" -- is applied to the first block before it is encrypted. This chaining mechanism means the encryption of each block depends on the encryption of all previous blocks. CBC helps detect modifications to the ciphertext: any change in the order of the ciphertext blocks will produce a block of garbage when it's decrypted.

Attack and defense

Often we only need to decide whether the encryption provided is safe enough for the application at hand. If a product uses a reputable, quality cipher, then the cipher will never be the weak link.

Since the introduction of DES, a number of ciphers have been created and introduced as standards; others have appeared as respected alternatives to those standards (see Figure 3).

But what makes a cipher strong? Once we've eliminated repeating patterns, the secrecy of a cipher ultimately rests on three things:

  1. The infrastructure it runs in. If the cryptography is implemented primarily in software, then the infrastructure will be the weakest link. If Bob and Alice are trying to keep their messages secret, Tom's best bet is to hack into one of their computers and steal the messages before they're encrypted. It's always going to be easier to hack into a system, or infect it with a virus, than to crack a large secret key. In many cases, the easiest way to uncover a secret key might be to eavesdrop on the user and intercept the secret key when it's passed to the encryption program. For example, the FBI used a special product called a keystroke monitor to uncover the secret key used by Nicodemo "Little Nicky" Scarfo, an alleged New Jersey mob boss, to encrypt his business records. Scarfo's encryption software used a key formatted as a password, and the keystroke monitor discovered that Scarfo used his father's prison ID number to encrypt his records.
  2. Key size. In cryptography, key size matters. If an attacker can't install a keystroke monitor, then the best way to crack the ciphertext is to try to guess the key through a "brute-force" trial-and-error search. A practical cipher must use a key size that makes brute-force searching impractical. However, since computers get faster every year, the size of a "borderline safe" key keeps growing.

    Experts acknowledge that keys of 64 bits or less, including DES keys, are vulnerable to a determined attacker. In 1999, the Electronic Frontier Foundation (EFF) funded the development of a device called Deep Crack, which could crack a DES encryption key in three days or less. Contemporary cipher keys always contain more than 100 bits, and a few support 256-bit keys.

  3. Algorithm quality. The question of algorithm quality is harder to judge. It's relatively easy to construct a plausible-looking cipher based on existing algorithms, but it can be hard to detect subtle flaws unless experienced people take a close look. Cipher flaws can yield "shortcuts" that allow attackers to skip large blocks of keys while doing their trial-and-error search. For example, the well-known compression utility PKZIP traditionally incorporated a custom-built encryption feature that used a 64-bit key. In theory, it should take 264 trials to check all possible keys. In fact, there is a shortcut attack against PKZIP encryption that only requires 227 trials to crack the ciphertext.

    The only way to find such flaws is to actually try to crack the algorithm, usually by using tricks that have worked against other ciphers. An algorithm usually only shows its quality after being subjected to such analyses and attacks. Even so, the failure to find a flaw today doesn't guarantee that someone won't find one eventually.

It's easy to multiply two prime numbers together and produce a huge number, like one with 300 digits, but it's genuinely impractical to extract those two primes if you don't know what either was.

DES has stood the test of time because the cipher's quality had been proven over many years of published research. After a quarter century of study, researchers only managed to find a few theoretical attacks that ultimately weren't as practical as brute force. The DES cipher's only real weakness has been its key size. But a modification, TripleDES, gets around this by applying the cipher three times in a row, using either a 112-bit key or a 168-bit key. The resulting cipher is much slower than other ciphers of similar strength, but it remains a national standard.

Not everyone wanted to use DES, because of its 56-bit key. Alternatives tended to have the same 64-bit block size as DES, but they provided a larger key size, usually 128 bits or more. These included Ron Rivest's RC2 and RC5; the International Data Encryption Algorithm (IDEA) used in several versions of the PGP encryption package; and the "Blowfish" algorithm developed by Bruce Schneier and used in utility programs developed by his company, Counterpane Internet Security.

However, none of these ciphers could really replace DES. Although each had its strengths -- in security, or in hardware/ software performance -- each had weaknesses. The replacement for DES had to embody the best of cryptographic state of the art, and it had to perform well in a variety of hardware and software environments.

AES

The Advanced Encryption Standard (AES) supports three key sizes: 128, 192 and 256 bits and uses a 128-bit block size.

While DES was explicitly designed to be built in hardware, no thought was given to making it work efficiently in software. The National Institute of Standards and Technology (NIST) evaluated software execution efficiency and storage requirements to ensure that AES worked well in C and Java running on workstations, as well as the more restricted environments of embedded ARM processors and smart cards.

Since AES is a relatively new standard, products are just beginning to incorporate it. It's available in VPN products from major firewall manufacturers, and has been incorporated into toolkits produced by major cryptography vendors. Last summer, the Internet Engineering Task Force (IETF) approved a draft standard for incorporating AES into SSL and related transport protocols, so Web servers and clients may be upgraded to use AES in the near future.

Although Rijndael, developed by Dutch researchers Vincent Rijmen and Joan Daemen, won the NIST competition, all of the AES finalist ciphers provide vast improvements over DES and the DES substitutes. All of them are block ciphers that support 128-bit or larger key sizes. None of the finalists had any serious weaknesses; the final choice involved a balance of cryptographic strength with performance. Some of these ciphers may find their way into various products and implementations, so it's worthwhile to know their names:

  • MARS, submitted by IBM.
  • RC6, submitted by RSA Laboratories.
  • Serpent, submitted by an international team of researchers.
  • Twofish, submitted by an American research team.

Public-key ciphers

Once we have our secret-key algorithm in place to encrypt our secret information, we still need a way to exchange those pesky secret keys. How can Alice and Bob share a secret key if they can exchange e-mail, but they can't send other messages back and forth? If they simply send a key "in the clear," then Tom or Henry might intercept it. Or perhaps Bob needs to send messages to his bank and provide strong evidence that he's written the message, even though he sends it electronically instead of on paper. In other words, he needs a way to certify the authenticity of some information.

Public-key ciphers can handle these tasks, and provide some real advantages over secret-key ciphers alone. The increased key size generally pays for itself by solving problems that are much harder to solve through secret-key techniques. There are two major types of public-key ciphers used today: Diffie-Hellman and RSA.

Diffie-Hellman. The Diffie-Hellman (DH) algorithm, created by Whit Diffie and Martin Hellman, introduced public-key cryptography to the public. Like all public-key algorithms, DH is based on the difficult mathematical problem it presents to would-be crackers: discrete logarithms in a finite field. In particular, DH takes advantage of the fact that it's easy to perform modular exponentiation (that is, multiply a number to a power and then derive its remainder relative to another number), but it's hard to undo that computation and find the discrete logarithm of the result.

In practice, DH isn't really a cipher. It's a process by which two people can create a shared secret that eavesdroppers can't guess. If Alice and Bob want to establish a shared secret, like a secret key for example, Bob sends Alice his DH public key, and Alice sends Bob her DH public key. Then, Alice multiplies her own private key with Bob's public key to derive the shared secret. When Bob multiplies his own private key with Alice's public key, he gets the same result that she does. Now they can use that shared secret to encrypt messages protected with a secret-key cipher, and each can decrypt what the other encrypts. Eavesdroppers can't duplicate the process and derive their shared key, since the eavesdroppers don't know either of the private keys.

Other researchers have developed variants of DH to perform public-key encryption and digital signatures. The Digital Signature Standard (DSS), issued by NIST in 1994, is closely related to one of those variants.

A major application of DH is VPNs. Internet Key Exchange (IKE), the key management protocol used with the IP Security (IPSec) protocol, uses DH when two encrypting devices first try to communicate. DH provides a secret seed value that is used to generate a temporary set of shared secret keys. These temporary keys encrypt messages that will exchange the actual keys the two endpoints use when exchanging encrypted data.

Another technique, elliptic curve cryptography (ECC), is also based on discrete logarithms, but focuses on computations mapped to a two-dimensional elliptic curve. The resulting algorithm requires much smaller keys than DH. However, it hasn't made much progress in supplanting existing applications of DH or the RSA algorithm. The advantages of a smaller key size don't seem to matter except in severely restricted processing environments, like smart cards, RFIDS and PDAs.

RSA. Named for inventors Ron Rivest, Adi Shamir and Len Adelman, RSA is the workhorse of public-key cryptography. It performs key distribution by using a public key to encrypt a randomly generated secret key. Then the recipient uses the corresponding private key to decrypt it. RSA performs digital signatures by encrypting a hash or checksum using the signer's private key; anyone can verify it using the signer's public key.

The mathematical problem at the core of RSA is that of factoring: It's easy to multiply two prime numbers together and produce a huge number, like one with 300 digits, but it's genuinely impractical to extract those two primes if you don't know what either of them was.

RSA is probably used on every desktop that's connected to the Internet. Most e-commerce sites use RSA to set up encrypted communication between the server and the browser. When a browser establishes such a connection, it uses the Secure Sockets Layer (SSL) protocol to establish a shared, secret key. Under SSL, the browser encrypts a randomly generated secret key with the server's RSA public key. Upon receipt, the server uses its RSA private key to decrypt the message from the browser and extract the secret key. In most cases, the shared secret key is used with the RC4 secret-key cipher to encrypt the actual messages between the server and browser.

Attacking public-key ciphers

Public-key ciphers have never seriously challenged secret-key ciphers as techniques for encrypting large amounts of data. First, public-key ciphers are much slower than secret-key ciphers when encrypting the same information. Another reason is tied to the peculiar mathematical nature of public-key encryption.

The encryption process computes a mathematical formula using the plaintext as an input. Attackers have ways to exploit the mathematical nature of public-key encryption to uncover messages. The attack against RSA, for example, consists of an attempt to factor the product of two prime numbers, which forms the essential part of every RSA public key. To account for the speed of such attacks, minimum public-key sizes are much larger than secret keys offering comparable strength. When we look at RSA and DH public keys, most people notice that they're far larger than even the strongest secret keys. These days, a really strong public key is expected to have around 2,000 bits.

While mathematical attacks are generally applied to individual messages, there are also more general attacks to try to recover a private key through a brute-force attack on its public key. For example, the attacker Tom might take Bob's well-known public key and try to find the value of the corresponding private key. Once Tom finds the private key, he can easily masquerade as Bob.

These problems, and several others, place restrictions on the ways that public-key encryption can be used safely. These restrictions have been formalized in a set of standards, the Public Key Cryptography Standards (PKCS), that describe how to use public-key techniques effectively and safely. There are individual standards for RSA, DH and ECC, and there are standards to describe how best to perform important application functions, such as the application of digital signatures and the encryption of secret keys using public keys. Digital signature applications, such as software patch distribution, and key-sharing protocols, like SSL and IKE, comply with these.

A word of advice

That covers the basics. If you want to learn more, there are a number of good resources available. Here are some general guidelines for choosing the right cryptography for the job:

  • RSA and Diffie-Hellman algorithms dominate public-key cryptography. Both have proven effective in real-world applications. ECC carries some promise, particularly in smart cards or other restricted environments.
  • Today, if you encounter a strong secret-key cipher, it will probably be TripleDES or RC4. As time goes on, the new AES cipher will replace the older ones, since it provides a better combination of safety and performance. AES is the best choice for new product development.
  • Sensible vendors use established, well-known, well-vetted ciphers. It is rarely practical to develop a new cipher and to convince potential customers of its effectiveness. Moreover, strong ciphers are already being used in most products, so it makes sense to use the good names simply for interoperability.

About the author
Rick Smith Ph.D., CISSP, is a writer, lecturer and consultant on information security. He is the author of Authentication: From Passwords to Public Keys (Addison-Wesley, 2002) and Internet Cryptography (Addison-Wesley, 1997).

Next Steps

How likely is an NSA encryption backdoor attack?

Get information on how to protect data after the NSA Bullrun revelations

This was first published in January 2003

Dig deeper on Disk Encryption and File Encryption

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

Related Discussions

Rick Smith, Contributor asks:

What do you think is the safest form of encryption?

0  Responses So Far

Join the Discussion

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchCloudSecurity

SearchNetworking

SearchCIO

SearchConsumerization

SearchEnterpriseDesktop

SearchCloudComputing

ComputerWeekly

Close