The following is an excerpt from Safety of Web Applications: Risks, Encryption and Handling Vulnerabilities with PHP by author Eric Quinton and published by Syngress. This section from chapter three explores symmetric and asymmetric encryption.
Encryption is the process of modifying a message so that it becomes incomprehensible to anybody who does not know the key or decryption method. We distinguish two main types of encryption: symmetric encryption (and its variant, hashing) and asymmetric encryption. The implementation of the latter is based on digital certificates.
3.2.1. Symmetric encryption
Symmetric encryption (or private-key encryption) uses the same key both to encrypt and decrypt the message.
A cipher is applied to the information being encrypted. One of the parameters of this cipher is a key known only by the sender and receiver (Figure 3.1).
Given the key and the cipher, the encrypted message can be decrypted by applying the inverse cipher associated with the key.
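As a minimal sketch of this idea, the toy cipher below uses the same key for both directions. It is not a real algorithm such as AES: the keystream construction (SHA-256 over the key and a counter) and the names `keystream` and `xor_cipher` are assumptions chosen only to keep the example within the Python standard library.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = secrets.token_bytes(32)               # 256-bit secret shared by both parties
ciphertext = xor_cipher(key, b"meet me at noon")
plaintext = xor_cipher(key, ciphertext)     # same key, inverse operation
```

Applying `xor_cipher` a second time with the same key restores the original bytes; with any other key the result is unreadable. A toy like this must not be used to protect real data.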
There are two techniques for doing this. Block ciphers divide the message into several parts of equal size (between 64 and 256 bits, depending on the algorithm). This is the most common type of encryption in computer systems. Stream ciphers encrypt the message bit by bit: this technique is mainly used for radio transmission systems (GSM-cellphone networks, Bluetooth-wireless networks, for example).
The size of the key itself is typically between 56 and 256 bits.
The strength of a symmetric cipher depends on several factors. The longer the key, the more secure the encryption. It is widely believed that a key of 256 bits (2^256 is approximately 10^77, which is estimated to be close to the number of electrons in the universe) can never be broken by brute force, i.e. by testing each combination in turn. However, the length of the key is not the only factor that determines the strength of the cipher. Messages are encrypted block by block, and the larger each block, the more robust the cipher. The same computation function is also applied multiple times (number of iterations). The greater the number of iterations, the more robust the cipher; ANSSI recommends performing 65,000 iterations. The quality of the algorithm itself must also be considered, and the key must be generated completely at random.
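The order of magnitude quoted for a 256-bit key can be checked with a few lines of arithmetic. In the sketch below, the attacker speed `GUESSES_PER_SECOND` is a purely illustrative assumption, not a figure from the book.

```python
import math

KEY_BITS = 256
GUESSES_PER_SECOND = 10**12          # assumed attacker speed, illustrative only
SECONDS_PER_YEAR = 365 * 24 * 3600

keyspace = 2**KEY_BITS               # number of possible keys
exponent = math.log10(keyspace)      # power of ten equivalent to 2^256
years = keyspace / (GUESSES_PER_SECOND * SECONDS_PER_YEAR)

print(f"2^{KEY_BITS} is about 10^{exponent:.2f}")
print(f"Testing every key would take on the order of 10^{math.log10(years):.1f} years")
```

Even if the assumed speed were wrong by many orders of magnitude, the exhaustive search would remain far out of reach.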
To improve security, especially for codes or passwords, limiting the number of permitted attempts is also a good idea. For example, smart cards are blocked after three unsuccessful attempts, which means that unlock codes with only 4 to 6 digits are sufficient.
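A minimal sketch of such an attempt limit is given below. The class `PinLock` and its method names are hypothetical. With a 4-digit code and three attempts, an attacker gets at most 3 guesses out of 10,000 before the lock blocks itself.

```python
import hmac


class PinLock:
    """Blocks itself after too many wrong PIN entries (hypothetical helper)."""

    def __init__(self, pin: str, max_attempts: int = 3):
        self._pin = pin
        self._max = max_attempts
        self._remaining = max_attempts

    @property
    def blocked(self) -> bool:
        return self._remaining == 0

    def try_unlock(self, candidate: str) -> bool:
        if self.blocked:
            return False                      # no further guesses are evaluated
        if hmac.compare_digest(candidate.encode(), self._pin.encode()):
            self._remaining = self._max       # success resets the counter
            return True
        self._remaining -= 1
        return False


card = PinLock("4821")
for guess in ("0000", "1111", "2222"):
    card.try_unlock(guess)                    # three wrong guesses in a row
print(card.blocked)                           # the card is now blocked
print(card.try_unlock("4821"))                # even the right code is refused
```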
Today, the most widely used algorithm is AES-256: the blocks are 128 bits in size, and the key is 256 bits. It is currently believed to be secure.
Symmetric encryption is inexpensive in terms of computation time, due to the simplicity of its algorithms (matrix permutations and boolean XOR-type functions are applied to the data). However, it has a disadvantage: the sender and the receiver must first exchange the secret key. Over an Internet connection, confirming the identities of the sender and the receiver is problematic, and the key must be transmitted in such a way that nobody else can see it. We will see below that asymmetric encryption provides a solution to this problem.
3.2.2. Computing hashes and salting passwords
In computing, a hash is a fixed-length sequence of characters calculated from a file or an arbitrary sequence of characters. Ideally, the hash is unique: it should be practically impossible to obtain the same hash from two different inputs. But if the algorithm is not sufficiently secure or the number of possible combinations is too small, there can be collisions, i.e. two different strings can lead to the same hash. Finally, it should not be possible to reconstruct the original information from the hash.
There are two main situations in which hashes are useful. The first is when we wish to verify that a copy of a file is identical to the original, for example when downloading an ISO image. The website hosting the download indicates the hash value, and specifies the method used to calculate it. Once the file has been downloaded, it is easy to recalculate the hash and check that both values are identical. If there is a difference, the downloaded file is not identical to the original, either due to an error during transmission or interference from a hacker, which typically takes the form of a man-in-the-middle attack [WIK 15a]. In this type of attack, the attackers position themselves between the client’s computer and the web server, and rewrite the transmitted information in real time.
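The download check described above might look like the following sketch. SHA-256 is an assumption here (the hosting site specifies which method it actually uses), and the function names are hypothetical.

```python
import hashlib
import hmac


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large ISO images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the recalculated hash with the value shown on the website."""
    return hmac.compare_digest(file_sha256(path), published_hex.strip().lower())
```

A call such as `matches_published_hash("image.iso", value_copied_from_site)` returns `False` as soon as a single byte of the file differs from the original.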
Hashes are also used to encode passwords in such a way that they cannot be decoded.
A special procedure can be used to store passwords. When the password is created, its hash is calculated, and the hash is stored in the database. To check the password, the program calculates the hash of the string entered by the user, and then compares it to the value stored in the database. If the two hashes are identical, the password is accepted as correct (Figure 3.2).
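This store-and-compare procedure can be sketched as follows. The dictionary `stored_hashes` stands in for the database, and SHA-256 is an assumed choice of hash function; the sketch deliberately shows only the basic, unsalted scheme described in this paragraph.

```python
import hashlib
import hmac

stored_hashes = {}   # stands in for the database table: login -> hash


def hash_password(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register(login: str, password: str) -> None:
    """Store only the hash; the password itself is never written down."""
    stored_hashes[login] = hash_password(password)


def check_password(login: str, candidate: str) -> bool:
    """Hash the string entered by the user and compare it to the stored value."""
    stored = stored_hashes.get(login)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hash_password(candidate))
```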
Today, we extend this procedure with a technique known as salting. The hash is calculated from the data given to the hash algorithm. But if these data are predictable (password too easy to guess, for example), it can be relatively easy to recover the original data from the hash.
This can happen in practice for passwords. Too many people choose passwords that are easy to guess: the strings password or 12345678, first names or the date of birth of a child or spouse, etc., are unfortunately very common choices. If an attacker knows the hashing algorithm and has access to the database following an intrusion or data theft, they will be able to run a search that will easily find some of these passwords.
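To see why predictable passwords are a problem, consider this small sketch of the search an attacker could run against unsalted hashes. The word list, the use of SHA-256 and the function names are assumptions made for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


COMMON_PASSWORDS = ["password", "12345678", "qwerty", "letmein"]

# Computed once, then reused against every account in the stolen table.
lookup = {sha256_hex(p): p for p in COMMON_PASSWORDS}


def recover_weak_passwords(stolen: dict) -> dict:
    """Map each login to its password when the hash appears in the lookup table."""
    return {login: lookup[h] for login, h in stolen.items() if h in lookup}
```

Because the lookup table is built only once, each additional stolen account costs the attacker a single dictionary lookup.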
To protect against the risk of this type of attack, one solution is to mix in a piece of variable information when the hash is calculated. This variable information is different for each user. This is called salting. Usually, the account or login id, which is necessarily unique, is appended to the password: even if two users have the same password, this procedure results in different hashes. Below is an example with the password password and the two distinct user accounts john and mark (the code was generated using a Linux command):
88071 bcc ...
We joined the username (john) and the password (password) together before computing the hash. We now do the same with a different username, mark:
The two hashes are different.
Therefore, even though the passwords themselves might be the same, the values stored in the database are never identical. Even if the attacker knows the salting algorithm, they will be forced to recalculate all possible values for each account, which makes their task much more complex.
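The john and mark example can be reproduced with the short sketch below. The hash function is an assumption (SHA-256), since the excerpt does not say which Linux command was used, so the values printed will not necessarily match the book's.

```python
import hashlib


def salted_hash(login: str, password: str) -> str:
    """Join the login (the salt) and the password before hashing."""
    return hashlib.sha256((login + password).encode("utf-8")).hexdigest()


john = salted_hash("john", "password")
mark = salted_hash("mark", "password")

print(john)
print(mark)
print(john != mark)   # same password, different stored values
```

A table of hashes precomputed for common passwords no longer matches anything in the database: the attacker has to redo the whole computation for each login.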
3.2.3. Asymmetric encryption
Symmetric encryption is secure enough to protect communications, but it suffers from a fundamental flaw: the encryption key must be shared between both parties. Thus, we need a way to exchange the key without it being intercepted, while verifying the identity of the person with whom we are communicating.
Asymmetric encryption provides a solution to this problem.
Asymmetric protocols generate two keys instead of one, based on two randomly chosen prime numbers. The remarkable property of this procedure is that a message encrypted with one key can only be decrypted using the other:
The message encrypted with key 1 can only be decrypted with key 2. The reverse is also true: the message encrypted with key 2 can only be decrypted with key 1.
In practice, the first key is kept secret by its owner: it is referred to as the private key. The second key, the public key, is transmitted to all recipients that request it.
This mechanism provides an easy way of accomplishing two different tasks: encrypting messages and verifying the identity of a communication partner.
If Bob wants to send an encrypted message to Alice, he retrieves her public key, and uses it to encrypt his message. Alice can then decrypt the message using her private key: she is the only one able to do so, as she is the only one who knows the private key.
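The exchange between Bob and Alice can be illustrated with a toy version of RSA. This is only a sketch under strong assumptions: the primes are tiny, there is no padding, and the message is a single small integer. The helper name `apply_key` is hypothetical.

```python
# Toy key generation for Alice (textbook RSA, tiny primes, no padding).
p, q = 61, 53
n = p * q                      # 3233, shared by both keys
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent (requires Python 3.8+)

public_key = (e, n)            # handed out to anyone who asks
private_key = (d, n)           # known to Alice only


def apply_key(key, value: int) -> int:
    """Raise `value` to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                                      # must be smaller than n
ciphertext = apply_key(public_key, message)       # Bob encrypts with Alice's public key
recovered = apply_key(private_key, ciphertext)    # Alice decrypts with her private key
```

The two keys are interchangeable in the sense described above: a value transformed with the private key is restored by the public key, and vice versa.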
Now, if Alice sends a message to Bob, and Bob wants to be certain that it was definitely Alice who sent it, the procedure is a little more complicated (Figure 3.4).
The following sequence of operations is performed:
- Alice calculates the hash of her message using a hash function as discussed above;
- she encrypts the hash using her private key;
- she sends the message, together with the encrypted hash, to Bob;
- Bob receives this message, and calculates its hash;
- he decrypts the encrypted hash sent by Alice using her public key;
- finally, he compares both hashes: if they are identical, then it must have been Alice who sent the message.
Of course, this protocol is carried out automatically, and these calculations are performed by software programs, such as email clients like Thunderbird [MOZ 15].
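The sequence of operations listed above can be sketched in a few lines. This is a toy illustration, not what an email client actually runs: the key is built from two small known primes, SHA-256 is an assumed hash function, no padding scheme is applied, and the function names are hypothetical.

```python
import hashlib

# Alice's toy key pair (two known Mersenne primes; far too small for real use).
p, q = 2**61 - 1, 2**31 - 1
n = p * q
e = 65537                                  # public exponent
d = pow(e, -1, (p - 1) * (q - 1))          # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce the result so that it fits below n."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """Alice: transform the hash with her private key."""
    return pow(digest_as_int(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Bob: recover the hash with Alice's public key and compare."""
    return pow(signature, e, n) == digest_as_int(message)


signature = sign(b"Meet me at noon. Alice")
print(verify(b"Meet me at noon. Alice", signature))    # message as sent
print(verify(b"Meet me at nine. Alice", signature))    # altered message
```

Only the holder of the private exponent `d` can produce a value that passes `verify`, which is what lets Bob attribute the message to Alice.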
Asymmetric encryption is relatively robust because it is currently not possible to quickly factor the product of two prime numbers if they are chosen to be sufficiently large (there are other algorithms for managing asymmetric keys based on elliptic curves rather than prime numbers; these algorithms do not require the keys to be so large).
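The role of key size can be seen by attacking a deliberately tiny key. In this sketch, the public key `(17, 3233)` is a toy value and `factor_semiprime` is a hypothetical helper using plain trial division, an approach that is instant here but hopeless for moduli of thousands of bits.

```python
def factor_semiprime(n: int):
    """Trial division: fast for small n, infeasible for real key sizes."""
    if n % 2 == 0:
        return 2, n // 2
    candidate = 3
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 2
    raise ValueError("n has no non-trivial factor")


e, n = 17, 3233                        # toy public key, visible to everyone
p, q = factor_semiprime(n)             # the attacker recovers both primes
d = pow(e, -1, (p - 1) * (q - 1))      # and rebuilds the private exponent

ciphertext = pow(65, e, n)             # something encrypted for the key's owner
print(pow(ciphertext, d, n))           # the attacker reads it anyway
```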
As it currently stands, the keys must be at least 2048 bits long to be considered robust. ANSSI already recommends using keys of 3072 bits, especially if they are intended to remain in use after 2030.