The growth in networking connectivity, complexity and activity has been accompanied by an increase in the number of crimes committed within networks, forcing both enterprises and law enforcement to undertake highly specialized investigations. Forensic analysis, the methodical investigation of a crime scene, presents special difficulties in the virtual world. What is problematic for an investigator to do within a computer, making sense out of fragile digital data arranged in obscure and complex ways, can be almost impossible within the significantly larger digital context of the network. Forensic analysis of network traffic is rarely conclusive by itself, but a wide variety of network analysis, intrusion management and specialty products are making it increasingly practical to draw useful conclusions about the details of network-based incidents.
The biggest challenge in conducting network forensics is the sheer amount of data generated by the network, often comprising gigabytes a day. This data is nearly impossible to store in full and tedious to search, so if an incident is discovered well after the fact, most of the relevant network traffic data is gone. The second challenge of network forensics lies in the inherent anonymity of the Internet protocols. Each network layer uses some form of addressing for the 'to' and 'from' points, such as MAC addresses, IP addresses and e-mail addresses, all of which can be spoofed. Fortunately, the wide range of powerful software, including products purpose-built for forensic analysis, makes it practical to solve cases through the analysis of network activity.
Network forensic tasks that can be facilitated through software include the collection, normalizing, filtering, labeling, stream reassembly, correlation and analysis of multiple sources of traffic data. Although there are single-purpose tools aimed at each of these tasks, feature creep is blurring the distinction between categories, resulting in tools that are useful in addressing a growing number of things that can go wrong on the network. However, before an investigator can perform any other forensic task, suitable network activity data must be collected. Raw network packets, which contain the highest possible level of traffic detail, supplement the often sparse log data available from applications, authentication systems, routers and firewalls. Such network data is collected by sniffing. The original and still popular sniffing software is tcpdump. Run on a network host, tcpdump creates a binary file of the entire packet header and data contents visible on a network interface. It also provides rudimentary filtering and retrieval access to those dump files. Running tcpdump on an Intel Linux host provides sufficient data collection capabilities for a small network. However, technical advances continue to be made in sniffing platforms to accommodate the bandwidth explosion. Gaining performance through the use of either dedicated hosts or special-purpose appliances, vendors such as Network Associates, Intellitactics, DeepNine and Sandstorm are competing to provide the highest reliable rates of packet capture and storage capacity.
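To give a feel for what a tcpdump-style dump file holds, the following minimal sketch walks the records of a classic libpcap capture using only the Python standard library. It assumes the classic format (a 24-byte global header followed by records that each carry a 16-byte header) with microsecond timestamps, and it does not handle the newer pcapng format. The function names read_pcap and build_sample are illustrative, and the two-record capture is synthetic data built in memory rather than real traffic.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic libpcap magic number (microsecond timestamps)


def read_pcap(data: bytes):
    """Yield (timestamp_seconds, captured_bytes) for each record in a classic pcap file."""
    if len(data) < 24:
        raise ValueError("too short to contain a pcap global header")
    # The magic number tells us which byte order the capturing host used.
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset = 24  # skip the global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        offset += 16
        payload = data[offset:offset + incl_len]
        if len(payload) < incl_len:
            break  # final record was truncated on disk
        offset += incl_len
        yield ts_sec + ts_usec / 1_000_000, payload


def build_sample() -> bytes:
    """Build a tiny synthetic capture (hypothetical data) for demonstration."""
    # magic, version 2.4, timezone offset, sigfigs, snaplen, link type 1 (Ethernet)
    header = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
    rec1 = struct.pack("<IIII", 1000, 500000, 4, 4) + b"\xde\xad\xbe\xef"
    # Second record: only 2 of the original 60 bytes were captured.
    rec2 = struct.pack("<IIII", 1001, 0, 2, 60) + b"\x01\x02"
    return header + rec1 + rec2


records = list(read_pcap(build_sample()))
for timestamp, payload in records:
    print(timestamp, payload.hex())
```

In practice an investigator would pass the contents of a dump file to read_pcap instead of the synthetic sample. The gap between the captured length and the original length in the second record shows why a capture taken with a short snapshot length may lack the application data a content-oriented investigation needs.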
Two other types of data collection tools can also be used in network investigations. Honeypots are bogus environments designed to appeal to attackers. Although of limited use, in certain cases they can provide a safe and convenient place to gather ongoing data about specific attackers. Opinions differ as to whether intrusion-detection systems should be considered as forensic tools. By design or configuration, most IDS reduce and normalize their data stores to the point where a significant level of detail is lost. However, they are gaining new forensically-useful functionality along with better capture and storage capacity. Increasingly, high-end commercial products from vendors such as Network Associates and Niksun are positioned as supporting both real-time alarming and after-the-fact forensic investigations.
Although a few protocol-savvy technologists are able to make sense of the filtered output of tcpdump, most investigators prefer having a more coherent picture of network activity. Stream reassembly, or sessioning, is the collation and packaging of raw network traffic from a single source such that all the data within a connection session is presented as a complete stream. Sessioning is performed by protocol analysis tools, which isolate the specific communications that took place between two or more of the apparent endpoints or relay points. Such an analysis is the first step in determining who communicated when and what was transmitted. Most protocol analysis tools provide a tree-oriented view of sessions and protocols used within the sessions. Such a visual presentation of network traffic makes it easier to understand exactly what happened on the network. Open source Ethereal and commercial EtherPeek are two popular protocol analysis tools that are equally useful to network troubleshooters and forensic investigators.
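The idea behind sessioning can be sketched in a few lines. The example below is a simplified illustration, not how any particular protocol analyzer works: it assumes the TCP segments have already been decoded into a hypothetical Segment record, groups them by the pair of endpoints, discards retransmitted copies, and joins the payloads of each direction in sequence-number order. It deliberately ignores sequence-number wraparound, overlapping segments and gaps, all of which a real tool has to handle. The addresses and payloads are made up.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical, already-decoded TCP segment."""
    src: str
    sport: int
    dst: str
    dport: int
    seq: int
    payload: bytes


def session_key(seg: Segment):
    """Return the same key for both directions of one connection."""
    a = (seg.src, seg.sport)
    b = (seg.dst, seg.dport)
    return (a, b) if a <= b else (b, a)


def reassemble(segments):
    """Group segments by session and rebuild each direction's byte stream."""
    sessions = defaultdict(lambda: defaultdict(dict))
    for seg in segments:
        sender = (seg.src, seg.sport)
        # Keep the first copy seen for each sequence number (drops retransmissions).
        sessions[session_key(seg)][sender].setdefault(seg.seq, seg.payload)
    return {
        key: {
            sender: b"".join(data for _, data in sorted(chunks.items()))
            for sender, chunks in directions.items()
        }
        for key, directions in sessions.items()
    }


# Made-up traffic: one client request arriving out of order, with a duplicate.
segments = [
    Segment("10.0.0.5", 40000, "10.0.0.9", 80, 1, b"GET / "),
    Segment("10.0.0.9", 80, "10.0.0.5", 40000, 1, b"HTTP/1.0 200 OK"),
    Segment("10.0.0.5", 40000, "10.0.0.9", 80, 15, b"\r\n\r\n"),
    Segment("10.0.0.5", 40000, "10.0.0.9", 80, 7, b"HTTP/1.0"),
    Segment("10.0.0.5", 40000, "10.0.0.9", 80, 7, b"HTTP/1.0"),
]

streams = reassemble(segments)
for key, directions in streams.items():
    for sender, stream in directions.items():
        print(key, sender, stream)
```

Keying the session on the sorted endpoint pair is what lets both halves of the conversation land in the same bucket, which is the "who communicated with whom" view the investigator is after.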
Investigations of system hacks typically concentrate on examining the protocol information in packet headers to determine the source and victim systems. In contrast, application data gets more emphasis when investigating intellectual property theft incidents, or when scrutinizing inappropriate employee activity. Content-oriented investigations are facilitated by forensic analysis tools, which process raw data and provide parallel views of the same source material at different levels of system abstraction. For example, the popular computer forensic product EnCase allows an investigator to open up a byte-level view of a file alongside a window that shows the formatted text or bitmap that would be visible when the file was loaded into an application.
Likewise, Network Forensic Analysis Tools (NFATs) provide a view of raw packet data alongside corresponding views into higher layers, such as the data flows within a session or the application information visible in a Web browser. NFATs are complete packages that combine protocol analysis with the ability to view application-layer information, such as bitmaps, mail and documents. They usually have a built-in sniffing capability and most support the analysis of data collected by other systems, especially when it is in the de facto standard tcpdump format. Created specifically for forensic and incident investigation, such tools are not aimed at the network administration market. Investigators use commercial NFATs to search through application data, view it and, after finding something suspicious, drill down to the deeper network layers to examine session and packet header information pointing to specific workstations and human suspects.
To address the 'end-to-end' challenge, the NFAT vendors have been expanding their products' capability to correlate data from multiple sources, providing the investigator with a more accurate understanding of the communication flow within the greater enterprise. It is then easier to locate the endpoint computers so that they can be searched for corroborating evidence. To deal with the complexity of network investigations, several NFATs include case management functionality that can link all relevant data objects to a single incident file, a further example of how NFAT vendors have learned from traditional computer forensic products.
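As a rough illustration of multi-source correlation, the sketch below merges event lists from several collection points into a single timeline for one address. It is an assumption-laden toy, not a description of any vendor's product: the Event record, the correlate function and all of the log entries are hypothetical, each source is assumed to be sorted by time already, and the clocks of the sources are assumed to be synchronized, which is rarely true without extra effort.

```python
import heapq
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical normalized event from any data source."""
    timestamp: float
    source: str
    address: str
    detail: str


def correlate(address, *sources):
    """Merge time-sorted event lists and keep the events involving one address."""
    merged = heapq.merge(*sources, key=lambda event: event.timestamp)
    return [event for event in merged if event.address == address]


# Made-up entries standing in for authentication, firewall and sniffer data.
auth_log = [Event(95.0, "auth", "10.0.0.5", "login jdoe")]
firewall_log = [
    Event(100.0, "firewall", "10.0.0.5", "allow tcp/80"),
    Event(130.0, "firewall", "10.0.0.7", "deny tcp/23"),
]
sniffer_log = [
    Event(101.5, "sniffer", "10.0.0.5", "HTTP GET /payroll.xls"),
    Event(140.0, "sniffer", "10.0.0.5", "FTP PUT"),
]

timeline = correlate("10.0.0.5", firewall_log, sniffer_log, auth_log)
for event in timeline:
    print(event.timestamp, event.source, event.detail)
```

Because addresses can be spoofed, a timeline built this way points to an apparent endpoint only; it narrows down which computers to search for corroborating evidence rather than proving who was responsible.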
While developers of forensic products have not introduced any truly revolutionary capabilities lately, the tools we already have are continuing to improve in useful ways. With a wide range of powerful software in your network security toolbox, including products purpose-built for forensic analysis, solving cases through network analysis is a practical feat.
About the author
Jay Heiser, CISSP, is a London-based security analyst with TruSecure Corp.