In the past, to address such concerns, many organizations implemented blacklists, either via outgoing Web proxies or on edge-filtering devices. Years ago, security professionals would put rules into their routers or firewalls to block specific ports or IP addresses that were known to be malicious. Blacklists simply sought to automate that process with constantly updated lists of known bad domains. However, the idea of "enumerating badness," as renowned expert Marcus Ranum called the practice, has been a failed approach for quite some time. It fails because the amount of "badness" on the Internet is growing much faster than any list can keep up with.
How highly predictive blacklists work
Previous blacklists include the global worst offender list (GWOL), which is a roll of sites that are known to either be attack sources or contain malware, and the local worst offender lists (LWOL), which are local blacklists created by an organization. GWOLs generally come from security vendors that provide Web filtering services, while the LWOLs are generally supported by a local organization. Keeping up to date on these lists can be time-intensive and costly, as they need to be continually updated to keep ahead of threats.
The highly predictive blacklists, however, compare companies' firewall logs -- which are shared via the SANS Institute's DShield data center -- and search for overlap. For each participating organization, attackers can then be ranked based on the calculated probability that they will go after a particular company's network. If, for example, certain networks are hit from the same originating location on the Internet, predictions can be made about future attacks and which individual networks have similar characteristics and are most vulnerable.
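To make the overlap idea concrete, here is a minimal sketch in Python. It is not the algorithm DShield actually runs; the contributor names, the sample addresses (taken from documentation ranges), and the choice of Jaccard similarity as the overlap measure are all assumptions made for illustration. It scores sources that a given organization has not yet seen, weighting each report by how much the reporting organization's attackers overlap with its own.

```python
from collections import defaultdict

# Hypothetical sample data: contributor -> set of source IPs seen attacking it.
logs = {
    "contractor_a": {"203.0.113.5", "203.0.113.9", "198.51.100.7"},
    "contractor_b": {"203.0.113.5", "203.0.113.9", "192.0.2.44"},
    "retailer_c": {"192.0.2.80"},
}

def overlap(a, b):
    """Jaccard similarity between two sets of attacker addresses (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_attackers(victim, logs):
    """Score sources not yet seen by `victim`.

    Each report is weighted by how strongly the reporting contributor's
    attackers overlap with the victim's own attackers.
    """
    scores = defaultdict(float)
    seen = logs[victim]
    for other, attackers in logs.items():
        if other == victim:
            continue
        weight = overlap(seen, attackers)
        for ip in attackers - seen:
            scores[ip] += weight
    # Highest score first; ties broken by address for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

ranking = rank_attackers("contractor_a", logs)
```

In this toy data set, the source reported by the similar contractor ranks above the one reported by the unrelated retailer, which is the behavior the overlap comparison is meant to produce.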
The highly predictive blacklist (HPB) approach is unique in that it allows for custom blacklisting based on relevance to a given company. An HPB gives an organization's firewalls individualized attack data and integrates an inventive relevance-based ranking scheme modeled on Google's PageRank system, which analyzes hypertext links. For example, let's say that you are a Department of Defense contractor that contributes attack logs to DShield. Because you are not the only defense contractor contributing to DShield, the data center can identify IP addresses that have been known to attack other defense contractors and build a list for your organization based on historical attack data from those similar organizations.
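The article does not spell out how the PageRank-style ranking is computed, so the following Python sketch only illustrates the general idea of propagating relevance along links between similar contributors. The similarity weights, the damping factor, the number of rounds and the function name `propagate` are all hypothetical choices, not details of the real HPB system.

```python
# Hypothetical contributor-to-contributor similarity weights.
similarity = {
    "you": {"peer_1": 0.6, "peer_2": 0.1},
    "peer_1": {"you": 0.6, "peer_2": 0.3},
    "peer_2": {"you": 0.1, "peer_1": 0.3},
}

def propagate(reported_by, similarity, damping=0.5, rounds=20):
    """Spread 'this source attacked me' evidence along similarity links.

    Each round, a contributor's relevance score for the source is its own
    report plus a damped, weighted sum of its neighbors' scores, loosely
    in the spirit of PageRank's iterative propagation.
    """
    base = {c: 1.0 if c in reported_by else 0.0 for c in similarity}
    score = dict(base)
    for _ in range(rounds):
        score = {
            c: base[c]
            + damping * sum(w * score[n] for n, w in similarity[c].items())
            for c in similarity
        }
    return score

# Only peer_1 has reported this particular source so far.
relevance = propagate({"peer_1"}, similarity)
```

Here "you" ends up with a higher relevance score for the source than "peer_2" does, because "you" is more similar to the contributor that reported it. Repeating the calculation for every reported source and sorting by score would give each contributor its own ranked list.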
While I was working in the defense sector, it was not uncommon to hear about an attack against another company, only to see the exact same attack against our organization a week later. Highly predictive blacklists would have worked well in that scenario because we would have had the opportunity to block the offending IP addresses before they targeted our organization.
HPBs operate more efficiently than traditional GWOL-based blacklists, which require a sufficient amount of attack or malware traffic before offending IP addresses or IP ranges become incorporated.
Ideally, every organization should make whitelist-based decisions. Sites that employees are allowed to access should be explicitly permitted, and all other domains should be rejected.
However, for many organizations, such an optimal arrangement isn't possible, either technically or politically. Because of these issues, enterprises commonly turn to blacklists, seeking to weed out as many dangerous domains as possible.
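For comparison, the whitelist decision described above is simple to express: everything is denied unless it matches an approved entry. The sketch below is a minimal Python illustration; the domain names and the `is_allowed` helper are hypothetical, and a production proxy would also have to handle IP literals, internationalized names and similar cases.

```python
# Hypothetical allowlist of approved domains.
ALLOWED_DOMAINS = {"example.com", "intranet.example.org"}

def is_allowed(hostname, allowed=ALLOWED_DOMAINS):
    """Permit a host only if it is an approved domain or a subdomain of one."""
    host = hostname.lower().rstrip(".")
    labels = host.split(".")
    # Check the host itself and each parent domain, matching on whole labels
    # so that look-alike names such as "evil-example.com" are not accepted.
    return any(".".join(labels[i:]) in allowed for i in range(len(labels)))
```

Matching on whole labels rather than on a plain string suffix is a deliberate choice here: it keeps a name that merely ends in an approved string from slipping through.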
Although SANS Internet Storm Center Chief Research Officer Johannes Ullrich said the HPB approach can be better "by a factor of 10 or more" than traditional blacklist approaches, he also warned that there are some caveats that security pros need to be aware of.
Because HPBs are customized, some companies may get little benefit from the approach. Remember, these lists are based on attack data from organizations similar to yours, and they depend on those organizations uploading their data to DShield. If your organization is unique, or similar organizations are not uploading information to DShield, there is little data from which to create a list that matches your company. Highly predictive blacklists are an emerging area of research and will get better over time.
Also, many organizations use highly predictive blacklisting as part of their ingress protections, or their methods of defending against attacks coming in. Organizations, however, should also use HPB as part of their egress, or outbound, filtering approach. Be mindful, though, that doing so is hardly a panacea for restricting users from going to bad websites. While this defense may thwart the casual user who inadvertently clicks on a link, a determined user who is deliberately trying to bypass a company's egress filtering will find a way around proxy filters. A simple Google search for "Bypassing Websense" pulled over 50,000 hits, suggesting that users determined to bypass proxy filters can get plenty of help.
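As a rough illustration of applying one list in both directions, the Python sketch below checks the source address of a connection for ingress and the destination address for egress. The blacklist entries (documentation address ranges), the function names and the returned labels are hypothetical; in practice the list would be loaded into a firewall or proxy rather than evaluated in a script.

```python
import ipaddress

# Hypothetical blacklist entries, using documentation address ranges.
BLACKLIST = [
    ipaddress.ip_network(n) for n in ("203.0.113.0/24", "198.51.100.7/32")
]

def should_block(address, blacklist=BLACKLIST):
    """Return True if the address falls inside any blacklisted network."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in blacklist)

def filter_connection(src, dst):
    """Apply the same list in both directions.

    Ingress filtering checks where traffic comes from; egress filtering
    checks where internal hosts are trying to go.
    """
    if should_block(src):
        return "drop-ingress"
    if should_block(dst):
        return "drop-egress"
    return "allow"
```

A check like this stops an internal host from reaching a listed address directly, but, as noted above, it does nothing about a user who tunnels through an unlisted proxy.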
To get started testing HPB for your organization, you will need to sign up with DShield, which is a free service, and start providing attack data from your IDS and firewalls. Any data received by DShield may be shared with employees of SANS, SANS instructors, and third-party contractors. Keep this in mind when uploading your data to DShield. I strongly recommend testing this approach before implementing it, and you should also vet the sharing of your logs with DShield before you get started. Feel free to read the SANS Institute policy for sharing information as well. HPB is an emerging technology, and all emerging technologies need to be fully vetted and tested before implementation.
Implementing highly predictive blacklisting can reduce your technical exposure and keep users from non-approved websites. If you cannot implement the whitelist approach, why not implement the best blacklist approach possible?
About the author:
John Strand is currently a Senior Security Researcher with his company, Black Hills Information Security, and a consultant with Argotek, Inc., for TS/SCI programs. He teaches the SANS 504 "Hacker Techniques, Exploits and Incident Handling," 517 "Cutting Edge Hacking Techniques," and 560 "Network Penetration Testing" classes as a Certified SANS Instructor. Strand also answers your questions on information security threats.
This was first published in November 2008