
How machine learning-powered password guessing impacts security

A new password guessing technique takes advantage of machine learning technologies. Expert Michael Cobb discusses how much of a threat this is to enterprise security.

Michael Cobb

Published: 07 Dec 2017

What happens when authentication and access control measures are attacked by adversaries equipped with machine learning? This question has been examined in a couple of recent university studies, and it's worth taking a look at the potential impact of their findings on the security of real-world information systems.

Password guessing impacts system security in both online and offline attacks. An online password guessing attack can be found in the logs of every server that's on the internet -- a constant series of attempts to log in remotely using guessed credentials. Such attacks can be thwarted by having complex passwords, limiting the number of attempted logins and requiring two-factor authentication.
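To make the "limiting the number of attempted logins" defense concrete, here is a minimal Python sketch of a sliding-window lockout. The class name `LoginThrottle`, the threshold of five failures and the five-minute window are illustrative assumptions, not values from any particular product.

```python
import time
from collections import defaultdict, deque

MAX_FAILURES = 5       # assumed threshold, tune per policy
WINDOW_SECONDS = 300   # assumed sliding window (5 minutes)


class LoginThrottle:
    """Hypothetical helper: tracks recent failed logins per key (user or IP)."""

    def __init__(self, max_failures=MAX_FAILURES, window=WINDOW_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # key -> timestamps of failures

    def _prune(self, key, now):
        # Drop failures that have aged out of the sliding window.
        recent = self._failures[key]
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        return recent

    def is_locked(self, key, now=None):
        now = time.time() if now is None else now
        return len(self._prune(key, now)) >= self.max_failures

    def record_failure(self, key, now=None):
        now = time.time() if now is None else now
        self._prune(key, now).append(now)
```

A login handler would call `is_locked()` before checking a password and `record_failure()` after each bad attempt. Keying on both username and source address is a common choice, but that is a design decision this sketch leaves open.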

In an offline password guessing attack, the adversary obtains a set of system or application credentials -- usernames and hashed passwords -- and can then attempt to guess passwords on their own machine. This is done by checking whether the hash of a guess, such as the word "password", matches any of the hashes obtained from the target system.
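The hash-matching step can be sketched in a few lines of Python. This example assumes the stolen hashes are unsalted SHA-256, purely to keep it short; the usernames and the helper names `sha256_hex` and `crack` are made up for illustration. Salted, deliberately slow schemes such as bcrypt, scrypt or Argon2 make each guess far more expensive, but the matching logic is the same.

```python
import hashlib
from typing import Dict, List


def sha256_hex(password: str) -> str:
    # Assumption: the target stored unsalted SHA-256 hashes.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def crack(stolen: Dict[str, str], guesses: List[str]) -> Dict[str, str]:
    """Return {username: password} for every stolen hash a guess matches."""
    users_by_hash: Dict[str, List[str]] = {}
    for user, digest in stolen.items():
        users_by_hash.setdefault(digest, []).append(user)

    recovered: Dict[str, str] = {}
    for guess in guesses:
        # One hash per guess is compared against every stolen hash at once.
        for user in users_by_hash.get(sha256_hex(guess), []):
            recovered[user] = guess
    return recovered


# Illustrative data only: three accounts, two with guessable passwords.
stolen = {
    "alice": sha256_hex("password"),
    "bob": sha256_hex("D1sn3yW0rld"),
    "carol": sha256_hex("x9$Lq!27vTz"),
}
print(crack(stolen, ["letmein", "password", "D1sn3yW0rld"]))
```

Because nothing here touches the target system, none of the online defenses (lockouts, two-factor prompts) slow the attacker down. Only the cost of the hash and the quality of the guess list matter.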

Offline password guessing depends on having a large collection of plausible passwords, often called a password cracking dictionary. A real dictionary is used as the starting point, and then variations are added based on common tricks, like character swapping (D1sn3yW0rld) and adding special characters (password!). Attackers also add actual passwords disclosed in breaches, such as those at LinkedIn and RockYou, to these dictionaries.
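A rough Python sketch of how such variations might be generated from a base word follows. The substitution table and suffix list are small illustrative assumptions; real cracking tools ship far larger rule sets.

```python
# Assumed, deliberately tiny rule set for illustration.
SWAPS = str.maketrans({"a": "@", "e": "3", "i": "1", "o": "0"})
SUFFIXES = ("!", "1", "123")


def mangle(word):
    """Return a sorted list of simple variations of one dictionary word."""
    # Start from the word as given and a capitalized form.
    bases = {word, word.capitalize()}
    # Character swapping, e.g. DisneyWorld -> D1sn3yW0rld.
    bases |= {base.translate(SWAPS) for base in bases}
    # Appending digits and special characters, e.g. password -> password!
    variants = set(bases)
    for base in bases:
        for suffix in SUFFIXES:
            variants.add(base + suffix)
    return sorted(variants)


print(mangle("password"))
```

Even this toy rule set turns one word into 16 candidates, which is why a dictionary of modest size can expand into billions of guesses.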

How machine learning increases the threat

A new approach to improving password guessing techniques is harnessing the power of machine learning algorithms. For example, researchers at the Stevens Institute of Technology and the New York Institute of Technology came up with something they call PassGAN, a novel technique that "leverages Generative Adversarial Networks (GANs) to enhance password guessing."

Without going into the science of Generative Adversarial Networks, a GAN uses two neural networks, one of which tries to fool the other with fake data that is very close to actual data. What researchers found is that, by training a GAN on a list of leaked passwords, it can rapidly produce a large number of plausible password guesses, potentially outperforming password guessing tools such as Hashcat and John the Ripper.


What does this mean for information system security? First, it underlines the importance of protecting password hashes, given that cybercriminals are increasingly likely to apply machine learning to offline cracking. Second, password cracking tools are classic examples of the double-edged phenomenon: security technology that can be used for evil or good. In this case, the same technique can be adapted to measure the strength of passwords before users are allowed to use them, based on an ease-of-guessing score.
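One simple way to picture an ease-of-guessing score is the position at which a candidate password appears in an attacker's ordered guess list. The Python sketch below assumes such a list is available, whether from a dictionary, a cracking tool's rules or a trained model; the function names and the `min_rank` threshold are hypothetical.

```python
def guess_rank(candidate, ordered_guesses):
    """Return the 1-based position of candidate in the guess stream, or None."""
    for rank, guess in enumerate(ordered_guesses, start=1):
        if guess == candidate:
            return rank
    return None


def is_acceptable(candidate, ordered_guesses, min_rank=1_000_000):
    # Assumed policy: reject anything an attacker would reach
    # within the first min_rank guesses.
    rank = guess_rank(candidate, ordered_guesses)
    return rank is None or rank > min_rank


# Illustrative guess list, most likely guesses first.
guesses = ["123456", "password", "password!", "D1sn3yW0rld"]
print(guess_rank("password!", guesses))
print(is_acceptable("password!", guesses))
print(is_acceptable("correct horse battery staple", guesses))
```

A password that never appears in the guess stream is not proven strong; it has only survived the guesses that were tried, so the score is best read as a lower bound on attacker effort.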

Of course, this latest research also adds to the reasons why system security needs to use stronger authentication than passwords alone to protect access. One popular technology is a one-time passcode generated on a mobile device assumed to be under the control of the device owner.
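The article does not name a specific passcode scheme. One widely used option is the time-based one-time password (TOTP) defined in RFC 6238, which builds on the HMAC-based scheme in RFC 4226. The Python sketch below assumes that scheme, with SHA-1, a 30-second time step and a shared secret already provisioned on the device.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks a 4-byte slice.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time step index."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


# Secret from the RFC test vectors; a real secret would be random per user.
print(totp(b"12345678901234567890"))
```

The server runs the same computation with its copy of the secret and compares the results, usually allowing one time step of clock drift. The scheme's security rests on the secret and the device staying under the owner's control.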

However, the reliability of that assumption is somewhat undermined by another piece of research, this time from Newcastle University. Researchers there have developed a proof-of-concept attack called PINlogger that uses machine learning and a neural network to analyze sensor data on a mobile device to detect when a PIN is being entered, and then determine the actual PIN.

With several dozen sensors on a mobile device -- from the touchscreen to sensors for motion, speed, orientation, rotation and more -- it is perhaps not surprising that combined sensor output, when analyzed with machine learning, can reveal a lot about a user's physical interaction with a device.

However, there are some constraints on this PINlogger attack. It requires the mobile device to have a web browser that supports JavaScript and web APIs that can access onboard sensors. Also, the user needs to be led to the attacker's malicious webpage and must keep that page open during an attack. On the other hand, because the attack uses JavaScript to access sensors via the browser, it does not require users to download an app to become victims.

The researchers were not content just to create a proof of concept for this sensor-based attack; they actually studied how mobile device users perceived the risks from sensors typically found in these systems. The results showed that many people were not aware of all the sensors on their devices or the potential for information like mobile orientation and motion to be used to defeat security measures like a PIN. The researchers also noted a lack of granularity in sensor access control policies.

As more sensors are added to mobile devices, the potential for abuse is likely to grow. The researchers concluded that sensor-based attacks are a hard problem to solve, but one that needs to be addressed fairly urgently, before such attacks start appearing in the wild. Update your security awareness training content now.
