NIST recently published a journal article named "Finding Bugs in Cryptographic Hash Function Implementations,"...
detailing how NIST revisited its five-year SHA-3 competition to find the successor to the Secure Hash Algorithm version 2 (SHA-2).
The paper details how NIST developed a new testing strategy -- and four new tests -- to find bugs in cryptographic hashing functions. The selection process took eight years, ending in August 2015 when the algorithm for the hash cryptographic function Keccak was selected for the SHA-3 competition after NIST experts didn't find bugs in its algorithms.
In this tip, we'll take a look at how NIST developed techniques to automate finding bugs in cryptographic hash functions and why using an entropy source with hashes could help uncover more bugs.
As part of designing test cases to detect bugs in the hashing algorithms in the SHA-3 competition, the NIST authors developed four tests that corresponded to four desired properties of a cryptographic hash function. These properties included the following:
- Preimage resistance is the property of a hash function that makes it infeasible to recover the original data -- the preimage -- given the message digest returned by the hash function.
- Second preimage resistance is the property of a hash function that, given a preimage and a hash value, makes it impractical to discover a second preimage -- a different set of data from the original preimage -- that produces the same hash value. Because a hash function can be second preimage-resistant without being preimage-resistant, a trusted one-way hash can be both. Preimage resistance means an attacker cannot recover the original data being hashed by looking at the hash. Second preimage resistance means an attacker cannot create a second set of data that will produce the same hash value as the original data.
- Collision resistance is the property of the hash function that makes it infeasible to find two different messages that produce the same message digest. If a hash function demonstrates collision resistance, the implication is that it is also second preimage-resistant. But collision resistance means the hash function is able to withstand an attacker choosing both the first and second message instead of choosing the second message for a fixed message, as in second preimage resistance.
- Function behavior is the property of hash functions -- and, indeed, all mathematical functions -- that guarantees the same output -- hash digest -- given the same input.
Starting from these four properties of cryptographic hash functions, the authors of the NIST report -- Nicky Mouha, Mohammad S Raunak, D. Richard Kuhn and Raghu Kacker -- developed four approaches to testing the reference implementations of the SHA-3 candidate algorithms, including:
- The Bit-Contribution Test, which is designed to test whether changes in the message are reflected in changes to the hash value in order to verify second preimage resistance in the hash function.
- The Bit-Exclusion Test, which is designed to meet one of the requirements of the SHA-3 competition -- that the message be provided in an array of bytes. The C programming language doesn't prevent the programmer from going past the end of the array.
- The Update Test is designed to look for potential metaphoric relations to show whether the SHA-3 Competition API, an API that NIST specified as part of the SHA-3 competition, is complied with.
- The Combinatorial Update Test, which is a variation of the Update Test used to reduce the test set size.
Five hash functions
The tests focused mostly on bugs that were discovered in the five SHA-3 implementations that demonstrated why bugs in the hash functions escaped detection during the SHA-3 selection process.
One of the five finalists of the SHA-3 competition was the BLAKE hash function; however, the NIST authors found a bug in the source code of every one of the BLAKE reference implementations. Although the bug was fixed by the BLAKE designer as of 2016, it was rediscovered by the report's authors. The bug affected all sizes of the hash function, and later versions have been used in many protocols.
A second hash function, JH, was designed by Hongjun Wu. All JH implementations shared a bug: they read bits outside of the bounds of the message that are not a multiple of eight bits, which is required by the NIST API. The NIST authors detected this bug using the Bit-Exclusion Test.
Fugue, another of the hash functions, was designed by Shai Halevi, Eric Hall and Charanjit Jutla of the IBM Thomas J. Watson Research Center. A few months after advancing to the second round, Jutla announced that Stefan Tillich discovered a bug in the Fugue implementation. Erroneous implementations occurred with all the messages for which the length was not divisible by eight.
The Hamsi hash function was designed by Özgül Küçük. All Hamsi implementations were submitted in the first round, but they didn't make it to the second round. The NIST authors found serious bugs in the 384-bit and 512-bit hash outputs -- only half of the message bits were processed.
The fifth and final hash function finalist, LANE, was selected for the first round, but it failed to reach the second round. The Update Test discovered an extremely dangerous bug that ensured the same hash value would be returned for all the messages of a particular length.
Entropy for hash functions
In order to reduce bugs, I believe that all the non-winning hash functions from the SHA-3 competition should be applied to an entropy source to reduce bugs. The tests didn't account for entropy to generate random bits for use in cryptography. However, if an entropy source doesn't work properly, then the hash function tests will fail; entropy as a service could be used on low-entropy devices.
The hash function Keccak, published in NIST FIPS 202 as SHA-3, is not widely accepted due to its slower software speeds than SHA-2 -- which replaced deprecated SHA-1. Support for SHA-3 is generally not available for most software and hardware as code, and firmware needs to be customized for each device. SHA-3 also hasn't been applied to entropy sources to reduce or eliminate the flaws that the NIST authors failed to detect in their tests.
All hash functions should be applied to an entropy source before starting any test approaches. If this had been done, then the flaws in Keccak might have been detected earlier.