Many security teams have come to the unfortunate conclusion that our preventive approaches to security, and the controls that block threats to our IT assets, just won't work 100% of the time. It's only a matter of time until an employee clicks on a link or is socially engineered, a piece of unknown malware infects our systems, or a zero-day exploit is used to target us. What then?
For large organizations, the number of alerts our detection tools generate is becoming overwhelming, and even with advanced analytics platforms that help sift through the noise, we're drowning in manual tasks and processes that take up valuable time -- time that could be better spent investigating and responding to unusual activity in the environment. Sadly, we're learning the hard way that people don't scale well, and no one has the budget for an unlimited headcount. The way out? Incident response (IR) tools and methods that automate the process.
In the past several years, as incident response tools have advanced, the information security industry has started to revisit the idea of automation in security controls and processes, a topic that was once quite controversial. Many security professionals have been leery -- and rightly so -- of automating security control configurations (for example, creating or deleting a firewall rule automatically) or responsive security analysis tasks (such as triggering automated responses to threats that intrusion detection systems identify). Today, however, there are many viable options for IR automation.
In a mature enterprise security operations center (SOC), analysts tend to spend the most time on the following types of activities:
- Alert identification and correlation: Alerts come in to centralized collection platforms for analysis, often originating from firewalls, network and host intrusion detection and prevention tools, malware sandboxing, system and application logs and many more sources. Unfortunately, this initial identification phase requires sifting through an inordinate amount of noise and very often leads to follow-up activities that an analyst must perform.
- False positive identification and suppression: For events and patterns we've seen before, tuning false positives may be somewhat more streamlined, but determining what is a false positive and what isn't remains the albatross of event management and analysis.
- Initial investigation and triage: Analysts need to investigate activity in the environment to confirm whether a legitimate incident is underway. This task is often limited by the availability of security and forensic analysts.
- Ticket generation and updates: When an event warrants investigation, tickets need to be opened and assigned to a primary incident response team member who then updates the case as additional details are discovered and vetted.
- Report generation: Tools create many reports automatically, while others are assembled after manual investigative steps.
This is by no means a complete list, but rather, the common areas where enormous amounts of analyst time are consumed on a daily basis. In addition, very simple but essential tasks, such as email alerting and evidence uploads to a secure repository, may make using automation sensible.
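As a minimal sketch of the first two activities above, the following Python groups alerts from several tools by host and signature and drops signatures already tuned out as false positives. The alert fields (`host`, `signature`, `source`), the sample values and the `correlate_alerts` helper are all hypothetical and do not come from any particular product.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def correlate_alerts(
    alerts: Iterable[dict], suppressed_signatures: Set[str]
) -> Dict[Tuple[str, str], List[str]]:
    """Group alerts by (host, signature), skipping known false positives.

    The value for each group is the sorted list of distinct tools that
    reported it, so an analyst can see which sources agree.
    """
    grouped = defaultdict(set)
    for alert in alerts:
        # Assumption: false positives are tuned out by signature name.
        if alert["signature"] in suppressed_signatures:
            continue
        grouped[(alert["host"], alert["signature"])].add(alert["source"])
    return {key: sorted(sources) for key, sources in grouped.items()}


# Hypothetical alerts arriving from three different detection tools.
alerts = [
    {"host": "ws-17", "signature": "suspicious-powershell", "source": "edr"},
    {"host": "ws-17", "signature": "suspicious-powershell", "source": "proxy"},
    {"host": "ws-17", "signature": "suspicious-powershell", "source": "edr"},
    {"host": "db-02", "signature": "backup-agent-scan", "source": "ids"},
]

for key, sources in correlate_alerts(alerts, {"backup-agent-scan"}).items():
    print(key, sources)
```

A real correlation engine would also consider time windows and asset context. The point here is only that grouping and suppression are mechanical steps that do not need an analyst's attention.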
Automating with incident response tools
In a 2015 report, forensic expert Alissa Torres stated that automation of any incident response process should focus on three major phases:
- Continuous data collection
- Aggregating and applying threat intelligence
- Streamlining live response capabilities
Security teams are improving at continuous data collection and analytics. They're starting to employ threat intelligence, too, although this is still an immature market and capability is hampered by a lack of maturity in commercial products as well as a shortage in available skills.
Most of the current focus in incident response automation is on phase 3, streamlining live response capabilities. These same phases were echoed in a 2015 RSA Conference presentation by James Carder and Jessica Hebenstreit, both formerly of Mayo Clinic, who provided tactical examples of security response automation, such as the following:
- Automated lookups of domain names never seen before (driven by proxy and domain name system logs).
- Automated searches for detected indicators of compromise.
- Automated forensic imaging of disk and memory from a suspect system driven by alerts triggered in network and host-based antimalware platforms and tools.
- Network access controls automatically blocking outbound command and control channels from a suspected system.
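The first example in the list, looking up domain names that have never been seen before, might be sketched as follows. This is an illustration under stated assumptions, not the presenters' implementation: the `find_new_domains` helper, the in-memory baseline set and the sample domains are hypothetical, and a real deployment would read the names from proxy or DNS logs and pass new ones to a reputation or threat intelligence lookup.

```python
from typing import Iterable, List, Set


def find_new_domains(log_domains: Iterable[str], seen: Set[str]) -> List[str]:
    """Return domains not seen before, in first-seen order.

    Each new domain is also added to `seen`, so the baseline grows
    as logs are processed.
    """
    new_domains = []
    for raw in log_domains:
        # Normalize case and the optional trailing dot of a fully qualified name.
        domain = raw.strip().lower().rstrip(".")
        if domain and domain not in seen:
            seen.add(domain)
            new_domains.append(domain)
    return new_domains


# Hypothetical baseline and one batch of domains extracted from DNS logs.
baseline = {"example.com", "intranet.corp.example"}
todays_queries = [
    "Example.com",
    "cdn.unknown-site.test.",
    "cdn.unknown-site.test",
    "intranet.corp.example",
]

print(find_new_domains(todays_queries, baseline))
```

Only the domains this function returns need an automated lookup, which keeps the volume sent to external services small.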
Fortunately, a number of products are emerging that offer help with many of these phases. Vendor products such as Swimlane, Invotas (now FireEye Security Orchestrator), CyberSponse, Phantom, Resilient Systems (now part of IBM), Hexadite and more are facilitating IR automation and security orchestration by integrating with numerous other tools in the environment.
The vast majority of these platforms focus on simple API requests and responses from endpoint security tools, event management platforms, network detection and access control systems, and forensics agents within the environment. The interoperability of data among these systems allows IR orchestration platforms to help security analysts determine "what to do next" when working through defined IR run books. For example, a detection event or set of correlated events from one endpoint may trigger a follow-up action to scour the environment for specific indicators of compromise in other systems. This, in turn, could lead to a network quarantine action or lookup of network traffic attributes within the environment, and reporting and alerting could be automated the entire way through this workflow. At any given moment, all SOC team members and forensic analysts could see what stage of an incident run book was in play and could collaborate within a common messaging and reporting interface at any time as new evidence or activities came to light.
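The workflow described above (detection, a sweep for indicators on other systems, network quarantine and reporting) can be sketched as a small run book. Everything named here is hypothetical: the `Incident` record, the `inventory` mapping of hosts to observed indicators and the `quarantine` callback stand in for the API calls an orchestration platform would make to endpoint and network tools.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Incident:
    host: str
    indicators: List[str]
    stage: str = "detected"
    log: List[str] = field(default_factory=list)


def sweep_for_iocs(incident: Incident, inventory: Dict[str, List[str]]) -> List[str]:
    """Return other hosts whose observed indicators overlap the incident's."""
    wanted = set(incident.indicators)
    return sorted(
        host
        for host, observed in inventory.items()
        if host != incident.host and wanted & set(observed)
    )


def run_book(
    incident: Incident,
    inventory: Dict[str, List[str]],
    quarantine: Callable[[str], None],
) -> Incident:
    """Walk the incident through sweep, quarantine and reporting stages."""
    incident.stage = "ioc_sweep"
    hits = sweep_for_iocs(incident, inventory)
    incident.log.append(f"ioc_sweep: {len(hits)} other host(s) matched")

    incident.stage = "quarantine"
    for host in [incident.host] + hits:
        quarantine(host)  # Assumption: wraps a network access control API call.
        incident.log.append(f"quarantine: {host}")

    incident.stage = "reported"
    incident.log.append("report: summary sent to SOC channel")
    return incident


# Hypothetical environment data and a stand-in quarantine action.
inventory = {"ws-17": ["bad.test"], "ws-22": ["bad.test", "x"], "ws-30": ["ok"]}
quarantined: List[str] = []
incident = run_book(Incident("ws-17", ["bad.test"]), inventory, quarantined.append)
print(incident.stage, quarantined)
```

Because the stage and log live on the incident record, every team member looking at it sees the same view of where the run book stands.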
Incident response tools that aid automation
Endpoint security vendors have begun to emphasize response automation activities and integration with detection, response and forensics capabilities. Compromise scenarios so often involve endpoint systems (both end-user devices and central data center systems) that three needs are paramount: quickly identifying indicators of compromise (both behavioral and signature-based), integrating centrally managed threat intelligence that is continuously updated, and allowing analysts to perform lookup actions across other systems. As these capabilities mature, more and more security teams are seeking the ability to automate pre-emptive response actions such as network quarantine, forensic imaging at the host level, and even automated scans of the systems with vulnerability management tools. Without the proper tools and planning, however, this level of incident response automation and orchestration is doomed to failure, as false positives could easily render systems and network segments unreachable.
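One way to limit the damage a false positive can do is to gate automated quarantine behind a confidence threshold and a list of systems that must never be isolated without human review. The sketch below is one possible policy, not a recommendation from any vendor. The `decide_response` function, the 0.9 threshold and the host names are assumptions chosen for illustration.

```python
from typing import Set


def decide_response(
    host: str,
    confidence: float,
    protected_hosts: Set[str],
    threshold: float = 0.9,
) -> str:
    """Choose a response action for a detection on `host`.

    Protected systems are always escalated to a person, and only
    high-confidence detections on other hosts are quarantined automatically.
    """
    if host in protected_hosts:
        return "escalate_to_analyst"
    if confidence >= threshold:
        return "auto_quarantine"
    return "open_ticket"


# Hypothetical critical systems that should never be isolated automatically.
protected = {"dc-01", "erp-db"}

print(decide_response("ws-17", 0.95, protected))
print(decide_response("ws-17", 0.60, protected))
print(decide_response("dc-01", 0.99, protected))
```

Keeping the policy in one small function makes it easy to review and to tighten or loosen as the team gains confidence in its detections.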
There are several major factors to consider when looking into these types of products:
- Vendor maturity. Some of these products have been around for several years and actually have installations within large organizations. This factor is especially important for IR automation tools, as they pose a much greater risk of disrupting production environments if they have not been vetted by mature teams.
- Integration partners. This is one of the most important considerations, as most of these tools fundamentally rely on APIs to perform automation activities. The more integration partners a vendor has in the areas of endpoint security, network security, antimalware, identity management, forensics and so on, the higher the likelihood that integration and ongoing management will go smoothly. Another key integration is with messaging and reporting tools, the most common being help desk ticketing and tracking software.
- Security information and event management (SIEM) tool alignment. These tools are usually implemented with some sort of defensive motivation, and that often means automating detection, response and investigation tasks and processes. For most large clients, this means integrating with the SIEM tool because that is where all event management is taking place currently. How events are passed among the systems and reported should be taken into account when reviewing products.
- Ease of use and implementation. Some of these tools have well-designed and intuitive GUIs. This is critical, as analysts shouldn't spend an enormous amount of time fumbling through a clumsy interface to perform important actions or looking for information during an incident. Creating and monitoring run books and workflows should be fluid and simple, and team member collaboration should be straightforward. Reporting should also be simple, with a variety of reports available for both technical analysts and executive management.
None of these incident response tools come close to replacing skilled, knowledgeable security analysts who understand the environment and know how to properly react during an incident scenario. Some of these tools offer a library of prebuilt workflows for specific incident types, and this can help to kick-start the automation and orchestration process for many teams that want to get moving quickly. But we still need people who understand the attacker methods and attack lifecycles, know how to properly parse and interpret alerts and correlate many sources of information together, and work well as a team to perform specific roles (network security analysis, host forensics and overall incident response coordination, for example).
Another key consideration is tracking improvements and metrics related to response performance. To do that effectively, security teams should ensure they have reasonable measurements for mean time to detect, respond to and eliminate threats. They should make sure that any response automation tools they acquire or build (commercial or in-house) bring these numbers down over time.
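As a rough illustration of these measurements, the sketch below computes mean time to detect and mean time to respond from incident timestamps. The definitions are assumptions: time to detect is taken as detection minus initial compromise, and time to respond as resolution minus detection. The `mean_hours` helper and the sample incidents are hypothetical.

```python
from datetime import datetime
from statistics import mean
from typing import Iterable, Tuple


def mean_hours(intervals: Iterable[Tuple[datetime, datetime]]) -> float:
    """Average length, in hours, of the given (start, end) intervals."""
    return mean((end - start).total_seconds() for start, end in intervals) / 3600


# Hypothetical incident records: (occurred, detected, resolved).
incidents = [
    (datetime(2024, 1, 1, 0), datetime(2024, 1, 1, 4), datetime(2024, 1, 1, 10)),
    (datetime(2024, 1, 5, 0), datetime(2024, 1, 5, 2), datetime(2024, 1, 5, 4)),
]

mttd = mean_hours((occurred, detected) for occurred, detected, _ in incidents)
mttr = mean_hours((detected, resolved) for _, detected, resolved in incidents)
print(f"Mean time to detect: {mttd:.1f} h, mean time to respond: {mttr:.1f} h")
```

Computing the same figures for each quarter, before and after an automation tool is introduced, shows whether the numbers are actually coming down.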
The breach landscape is still ugly and will likely remain that way for some time. Unless we continue to learn, and start detecting and responding more quickly, there's no way we'll ever get ahead of the attackers we face now and in the future.