
Mining NetFlow

Your routers and switches can yield a mother lode of security information about your network--if you know where to dig

Excavating endless logs to detect malicious network activity is a lot like mining for gold--randomly digging holes to find a nugget or two isn't very efficient. Your search will be a lot more fruitful if you know what to look for and where to look for it.

Fortunately, data generated by NetFlow, a de facto standard UDP-based traffic-reporting protocol, yields a rich vein of specific information about data flow--source and destination IP addresses and port numbers, protocol and service types, and the router input interface.

Mining NetFlow data can still be extremely difficult, but a handful of free and/or relatively inexpensive tools allow you to hit pay dirt by easily sorting, viewing and analyzing the information you want to use. The results can help you identify and shut down everything from spam to botnets. This technique is particularly valuable for ISPs, but can produce invaluable security information in any organization.

Drilling Operations
Cisco defines a flow as a stream of packets between a given source and destination, identified by network-layer IP addresses and transport-layer source and destination port numbers. A flow record also includes date and time information, as well as packet and byte counts. NetFlow data, created by Cisco and also supported on Juniper and other routers, has been used for network, application and user monitoring, as well as capacity planning, security analysis and accounting.
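To make the definition concrete, here is a minimal Python sketch of a flow record. The class, the field names and the sample values are illustrative assumptions, not Cisco's export format; the fields simply mirror the items listed above (addresses, ports, protocol and service type, input interface, timestamps, and packet and byte counts).

```python
from dataclasses import dataclass
from datetime import datetime
from ipaddress import IPv4Address


@dataclass(frozen=True)
class FlowRecord:
    """One exported flow. Field names are hypothetical, chosen for readability."""
    start: datetime          # time the first packet of the flow was seen
    end: datetime            # time the last packet of the flow was seen
    src_addr: IPv4Address
    dst_addr: IPv4Address
    src_port: int
    dst_port: int
    protocol: int            # IP protocol number: 6 = TCP, 17 = UDP
    tos: int                 # type-of-service byte
    input_if: int            # index of the router interface the flow entered
    packets: int
    octets: int              # byte count

    def key(self):
        """Fields that identify the flow; packets sharing them belong together."""
        return (self.src_addr, self.dst_addr, self.src_port,
                self.dst_port, self.protocol, self.tos, self.input_if)


# Made-up sample using documentation-only address ranges.
example = FlowRecord(
    start=datetime(2005, 6, 1, 2, 15, 0),
    end=datetime(2005, 6, 1, 2, 15, 12),
    src_addr=IPv4Address("192.0.2.5"),
    dst_addr=IPv4Address("198.51.100.25"),
    src_port=40123,
    dst_port=25,
    protocol=6,
    tos=0,
    input_if=3,
    packets=14,
    octets=9120,
)
print(example.key())
```

Note that the record carries no payload: everything the analyses below rely on is header-level metadata and counters.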

NetFlow security analysis is a basic form of anomaly detection--looking at traffic patterns that stray from the norm. In contrast, signature-based IDSes inspect payload information.

This analysis can generate a wide range of security-related information. Atypical activity, such as a single device generating an unusually high number of connections, could indicate DoS or worm attacks, network scans or spam. You can filter NetFlow data to show only activity on specific ports, which may indicate worm or Trojan activity (Slammer uses UDP port 1434). You may also spot unfamiliar IP addresses generating outbound traffic or, conversely, known IP addresses on your network generating inbound traffic.
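The two checks described above, counting connections per device and filtering on a port, can be sketched in a few lines of Python. This is only an illustration: the tuple layout, the sample flows and the threshold of 20 are assumptions made for the example, and a real threshold would come from your own network's baseline.

```python
from collections import Counter

# Hypothetical flows as (src_addr, dst_addr, dst_port, protocol) tuples.
# One host sprays UDP port 1434 at 50 destinations; two hosts behave normally.
flows = [("192.0.2.5", f"198.51.100.{i}", 1434, 17) for i in range(1, 51)]
flows += [
    ("192.0.2.10", "198.51.100.80", 80, 6),
    ("192.0.2.11", "198.51.100.25", 25, 6),
]


def flows_per_source(flows):
    """Count how many flows each source address generated."""
    return Counter(src for src, _dst, _port, _proto in flows)


def sources_over_threshold(flows, threshold):
    """Return the sources whose flow count exceeds the threshold."""
    counts = flows_per_source(flows)
    return {src: n for src, n in counts.items() if n > threshold}


def flows_to_port(flows, port, protocol):
    """Keep only flows aimed at one destination port and IP protocol."""
    return [f for f in flows if f[2] == port and f[3] == protocol]


print(sources_over_threshold(flows, 20))
print(len(flows_to_port(flows, 1434, 17)))
```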

Discovering an SSH Compromise

NetFlow data exposed a server cluster compromise by revealing a high volume of port 22 (SSH) traffic from foreign IP addresses instead of from authorized administrators. In this example, attackers used the servers to create a rogue IRC network. (IP addresses and DNS names are removed to protect the compromised organization.)

Striking It Rich

For a security analyst, finding and using NetFlow data will aid in the discovery and prevention of network compromises. Here are some real-world examples of the power and flexibility of NetFlow analysis:

It's in the mail: When a PC is sending unsolicited e-mail, NetFlow analysis can identify the compromised machine. Once SMTP analysis becomes a daily routine, it can be turned into scheduled reports, which removes the manual work. The net result is a significant reduction in the spam being generated, which helps keep the organization off e-mail blacklists.

It's easy to identify problems by selecting SMTP traffic and sorting it by the number of flows. For example, an unregistered server generating thousands of e-mail flows in a relatively short time interval is probably compromised, and its owner is contacted. Failure to fix the problem results in subsequent notifications and could lead to account suspension.

Even registered mail servers with excessive flows can be worth further investigation. For example, thousands of e-mail flows out of a small business or a school in the middle of the night raise a red flag, and checking IPs against Internet registries often turns up large numbers of suspicious foreign destinations. Detailed flow analysis of a specific machine generating spam usually shows activity on ports indicating a specific mail Trojan.
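The SMTP report described above amounts to selecting port 25 traffic, counting flows per sender, and sorting. A minimal Python sketch follows; the registered-server list, the tuple layout, the sample data and the threshold of 1,000 flows are all assumptions for illustration.

```python
from collections import Counter

SMTP_PORT = 25
# Hypothetical list of mail servers the organization knows about.
REGISTERED_MAIL_SERVERS = {"192.0.2.25"}

# Hypothetical flows as (src_addr, dst_addr, dst_port) tuples.
flows = (
    [("192.0.2.77", "198.51.100.1", 25)] * 3000    # unknown heavy sender
    + [("192.0.2.25", "198.51.100.2", 25)] * 1200  # registered but busy
    + [("192.0.2.30", "198.51.100.3", 25)] * 4     # ordinary client
    + [("192.0.2.30", "198.51.100.4", 80)] * 50    # non-mail traffic
)


def smtp_senders(flows):
    """Return (source, flow count) pairs for SMTP, busiest first."""
    counts = Counter(src for src, _dst, dport in flows if dport == SMTP_PORT)
    return counts.most_common()


def smtp_report(flows, threshold):
    """List heavy SMTP senders and whether each is a registered server."""
    rows = []
    for src, n in smtp_senders(flows):
        if n < threshold:
            continue
        status = "registered" if src in REGISTERED_MAIL_SERVERS else "unregistered"
        rows.append((src, n, status))
    return rows


for row in smtp_report(flows, 1000):
    print(row)
```

Registered servers are kept in the report rather than filtered out, since the text notes that excessive flows from a known mail server can still merit a closer look.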

On the trail of malware: When a new virus or worm hits, NetFlow analysis can reveal its characteristics, the extent to which it has infected networks and how it's spreading. For example, Sasser and Sober are fairly easy to profile from the ports they use and the number of flows generated. Thousands of flows on port 445 over a 20-minute interval just aren't normal activity--it's Sasser.

Symantec, McAfee, the Internet Storm Center and others are ready sources of information on the activity patterns of viruses, worms and spyware.
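The port 445 example above can be expressed as a windowed count: group flows by source and by 20-minute interval, then report any pair that crosses a threshold. The Python sketch below assumes flows arrive as (start time in epoch seconds, source address, destination port) tuples; that layout, the sample data and the threshold of 1,000 are illustrative assumptions.

```python
from collections import Counter

WINDOW_SECONDS = 20 * 60  # the 20-minute interval used in the text


def busy_windows(flows, port, threshold):
    """Map (source, window start) to flow count where count >= threshold."""
    counts = Counter()
    for start, src, dport in flows:
        if dport == port:
            # Round the start time down to the beginning of its window.
            window = start - start % WINDOW_SECONDS
            counts[(src, window)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


# Hypothetical data: one host opens 2,000 port 445 flows inside one window,
# while another opens three.
base = 1_200_000  # an arbitrary epoch value that falls on a window boundary
flows = [(base + (i % WINDOW_SECONDS), "192.0.2.9", 445) for i in range(2000)]
flows += [(base + 10 * i, "192.0.2.10", 445) for i in range(3)]

print(busy_windows(flows, 445, 1000))
```

The same function works for any port a current threat is known to use; only the port number and threshold change.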

Phishing expedition: An analysis following a trouble call by a dial-up user reveals contact with an Asian Web site whose IP looked familiar. A quick review of recent NetFlow data turns up the IP as the destination in one of the rash of phishing e-mails that had hit the network that week.

Porous firewall: An investigation of multicast traffic showing up in firewall logs for a cluster of servers reveals something the logs did not--unauthorized traffic passing through. Analysis of the IP addresses shows several unknown European and U.S. addresses--none of them the Canadian support group that administers the server cluster using SSH through the firewall.

In this example, SSH had been compromised (see "Discovering an SSH Compromise"), and further port analysis reveals the servers had been set up as IRC servers; they had connected to several other servers in different parts of the world, none of which matched authorized IPs. The traffic showed that the compromised cluster was being used to crack other servers, expanding the underground IRC network.
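A check like the one that exposed this compromise compares the sources of SSH flows against the networks administrators are expected to connect from. Here is a minimal Python sketch; the authorized network, the tuple layout and the sample flows are assumptions for illustration and use documentation-only address ranges.

```python
from collections import Counter
from ipaddress import ip_address, ip_network

SSH_PORT = 22
# Hypothetical network the authorized support group connects from.
AUTHORIZED_ADMIN_NETS = [ip_network("198.51.100.0/24")]


def is_authorized(addr):
    """True if the address falls inside any authorized admin network."""
    ip = ip_address(addr)
    return any(ip in net for net in AUTHORIZED_ADMIN_NETS)


def unauthorized_ssh_sources(flows):
    """Count SSH flows per unauthorized source address, busiest first."""
    counts = Counter()
    for src, _dst, dport in flows:
        if dport == SSH_PORT and not is_authorized(src):
            counts[src] += 1
    return counts.most_common()


# Hypothetical flows as (src_addr, dst_addr, dst_port) tuples.
flows = (
    [("198.51.100.7", "192.0.2.50", 22)] * 5     # expected admin traffic
    + [("203.0.113.9", "192.0.2.50", 22)] * 40   # unknown source
    + [("203.0.113.77", "192.0.2.51", 22)] * 12  # another unknown source
    + [("203.0.113.9", "192.0.2.50", 80)]        # not SSH, ignored
)

print(unauthorized_ssh_sources(flows))
```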

The IRC services were eliminated, SSH passwords were changed, patches were applied and tighter firewall policies (in particular, egress filtering) were implemented. Follow-up analysis shows an unsuccessful effort to repeat the attack.

Tales out of school: Schools have an ongoing struggle with support issues on many fronts. In the course of sorting flows by a port used by a current threat, analysis of traffic from a high school reveals a multitude of problems: significant flows outward on that port, unsanctioned peer-to-peer file sharing, and suspicious conversations between Chinese IP addresses and the school's database server.

Mining Tools

There are many automated tools for dealing with network threats, but recording virtually all traffic and being able to analyze it in many ways at will provides a powerful way to identify and deal with problems that have happened despite other controls.

NetFlow analysis is a largely manual, detailed process conducted over a lengthy time period; it can be a bit tedious, but automatically scheduled reports can expedite analysis of specific areas and complement the ad hoc capabilities. We recommend using a combination of tools for data mining and warehousing, enabling you to maintain several months of information for long-term analysis.

  • Mark Fullmer's Flow-Tools (www.splintered.net/sw/flow-tools) is a compilation of libraries and programs used to collect, send, process and generate reports based on NetFlow data. Among a variety of functions, various programs can generate more than 50 reports, such as source/destination IP pairs and most active devices, or any designated export field; tag flows based on a particular network; and import/export data in ASCII format. The Web site is an excellent resource for information about data flow analysis. An alternative free NetFlow analysis package is SiLK (http://silktools.sourceforge.net), created by the CERT Analysis Center. There are also commercial tools, such as AdventNet's ManageEngine NetFlow Analyzer (http://origin.manageengine.adventnet.com/products/netflow).
  • WebView is a Web-based reporting tool from Berbee Information Networks, an IT/security managed service provider. A front end for Flow-Tools, WebView has a nice ad hoc query interface that makes it easy to rapidly dig through gigabytes of flow data to discover interesting trends and patterns. It allows selection based on such factors as IP addresses, ports, peers, number of flows, and amount of data. It's currently available only to Berbee customers, but it is open source, and Berbee says it will soon be available for free on SourceForge.net.
  • KEDIT, from Mansfield Software Group (www.kedit.com), is a powerful text editor that allows for further sorting of data and offers commands that enable easy data reduction; it also has a powerful macro capability that can be used to remove uninteresting data, such as router-to-router chatter.
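The kind of data reduction described in the last item, stripping router-to-router chatter from an ASCII export, can also be scripted. The Python sketch below is one possible approach, not how KEDIT or Flow-Tools do it: the column names, the router list and the sample rows are hypothetical, and a real export would need its own column mapping.

```python
import csv
import io

# Hypothetical addresses of the organization's own routers.
ROUTERS = {"192.0.2.1", "192.0.2.2"}

# Made-up ASCII export with an assumed column layout.
sample = """srcaddr,dstaddr,dstport
192.0.2.1,192.0.2.2,179
192.0.2.40,198.51.100.9,25
192.0.2.2,192.0.2.1,179
"""


def drop_router_chatter(rows):
    """Remove rows where both endpoints are known routers."""
    return [
        row for row in rows
        if not (row["srcaddr"] in ROUTERS and row["dstaddr"] in ROUTERS)
    ]


rows = list(csv.DictReader(io.StringIO(sample)))
kept = drop_router_chatter(rows)
print(kept)
```

Flows between a router and an ordinary host are kept on purpose, since those may be exactly the traffic worth investigating.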

As threats evolve, the ad hoc nature of data mining makes it a valuable technique for identifying and adapting to new dangers as they emerge. Analyzing NetFlow data brings precious security information to the surface, helping managers understand what's going on in their networks and keeping them safe.


Traffic at a Glance

WebView can show NetFlow data on traffic flows in graphs and tables, which are useful for monitoring the type and volume of traffic on the network and for understanding QoS issues.
