Online privacy might be the biggest oxymoron of the early 21st century. Computer users are so ready to share the most innocuous details about their lives on social networks, for example, that it seems privacy has willingly been surrendered.
In the meantime, information security researchers, privacy experts and hackers alike have become adept at foraging for nuggets of personal data to exploit this phenomenon. Perhaps no one has been more at the forefront of this movement than Carnegie Mellon University professors Alessandro Acquisti and Ralph Gross. Two years ago at the Black Hat Briefings in Las Vegas, the two researchers proved what had long been only a theory: that individuals’ Social Security numbers could be predicted from publicly available information about them.
Next week at Black Hat, Acquisti and Gross will reveal details on another phase of their research. This time, they will explain how the convergence of facial recognition technology and data, including images, posted to social networks could lead to the re-identification of individuals and strangers online.
“We’re entering a point in the near future in which the puzzle of personal, sensitive information about individuals is almost complete,” Acquisti said. “There are fewer missing pieces for third parties interested in finding out so much about you.”
Acquisti would not divulge information about what he sees as the practical application of such capabilities, but did say attacks using facial recognition technology are becoming practical.
“Computer facial recognition used to be pretty bad. If the target changed orientation or lighting, the computer would fail at predicting,” he said. “Computers are getting better and we’re seeing slow but constant progress.”
Acquisti’s and Gross’ research certainly could extend the reach of the social engineering techniques that are the backbone of so many of today’s phishing campaigns and, ultimately, of targeted attacks against organizations and individuals. Social engineering of any kind relies on the trust individuals have in a particular platform.
“People may not realize that once they reveal information A or B, a smart third party can infer C, which could be much more sensitive information,” Acquisti said. “C is not available to us, so people may not think it’s a risk.”
Acquisti, whose background is in behavioral economics, added that people are notoriously prone to choosing instant gratification over weighing the long-term costs of disclosing too much information online.
“This is immediate gratification bias. Something you post today may be intended for only a small set of recipients,” he explained. “Your friends may like or think a risqué picture is funny, but you don’t consider the long-term risk and significant cost if a potential employer sees the photo three years from now.”
Beyond embarrassment, the trove of personal information posted online is the foundation not only for social engineering attacks but also for identity theft. Two years ago, Acquisti and Gross presented their work on the possibility of predicting the Social Security numbers of individuals who were assigned numbers at birth between 1989 and 2003. According to their paper, Acquisti and Gross were able to spot patterns linking an individual’s SSN and birth data and use them to statistically infer the individual’s number.
According to the paper, the first five digits are predictable because of flaws in the way numbers are assigned: the first three digits represent the area where an individual was born, while the next two are representative of an individual’s year of birth. Using the Social Security Administration’s Death Master File, which lists individuals who have died along with their Social Security numbers, the researchers fed this publicly available information into an algorithm they developed to guess at the last four digits.
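To see why a predictable five-digit prefix matters, it helps to look at the arithmetic. The short Python sketch below is an illustration only, not the researchers’ algorithm. It splits a nine-digit number into the three-, two- and four-digit fields described above and counts how many numbers remain possible once a prefix is known. The field names, the function names and the placeholder number are assumptions made for this example, and the count deliberately treats every digit combination as possible, which overstates the real total.

```python
from typing import NamedTuple


class SSNParts(NamedTuple):
    """The three fields of a nine-digit number (names are illustrative)."""
    area: str    # first three digits
    group: str   # next two digits
    serial: str  # last four digits


def split_ssn(ssn: str) -> SSNParts:
    """Split an SSN-formatted string such as 123-45-6789 into its fields."""
    digits = ssn.replace("-", "")
    if len(digits) != 9 or not digits.isdigit():
        raise ValueError("expected nine digits, e.g. 123-45-6789")
    return SSNParts(digits[:3], digits[3:5], digits[5:])


def remaining_candidates(known_digits: int, total_digits: int = 9) -> int:
    """Count the numbers still possible once a prefix of digits is known.

    Simplifying assumption: every digit combination is treated as possible.
    """
    if not 0 <= known_digits <= total_digits:
        raise ValueError("known_digits must be between 0 and total_digits")
    return 10 ** (total_digits - known_digits)


# Placeholder number for illustration only, not a real person's SSN.
parts = split_ssn("123-45-6789")
print(parts.area, parts.group, parts.serial)
print(remaining_candidates(0))  # nothing known: 10**9 possibilities
print(remaining_candidates(5))  # first five digits known: 10**4 possibilities
```

Under these simplified assumptions, knowing the first five digits shrinks the field from a billion possibilities to 10,000, which is why the remaining four digits become a realistic target for statistical guessing.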
“Our results highlight the unexpected privacy consequences of the complex interactions among multiple data sources in modern information economies and quantify privacy risks associated with information revelation in public forums,” they wrote.