Incident response team best practices
Date: Jan 03, 2011
Expert Lenny Zeltser discusses incident response best practices including building and testing policies, creating incident response teams and much more.
- Can you speak to the maturity of incident response plans within the enterprise? (0:25)
- Where are enterprises most challenged when it comes to incident response? (1:34)
- Is there an ideal skillset for an incident handler? (2:44)
- Who must sit on an incident response team? (3:43)
- How often should organizations review their incident response plans? (6:22)
- How often should organizations conduct a dry run of their incident response plans? (7:55)
- Can you talk about the NIST framework and whether it should be customized or not? (9:56)
- Where do companies make the most mistakes after an incident has occurred? (11:55)
About the expert:
Lenny Zeltser is the New York security consulting leader at Savvis Inc. He is also a senior faculty member at SANS Institute, where he teaches a course on reverse-engineering malware.
Read the full transcript from this video below:
Incident response best practices
Mike Mimoso: Hi, I'm Mike Mimoso, and joining me today is Lenny Zeltser. We are
going to talk about incident response. Thanks for joining us today, Lenny.
Lenny Zeltser: It's great to be here.
Mike Mimoso: Our first question today. I was just wondering if you could speak to
the overall maturity of incident response plans within the enterprise.
Lenny Zeltser: It really varies. A lot is based on the size of the organization,
how long they have been around and how mature their overall security
practice is. Certainly, a leading framework that we've all been following
for incident response has been around for a while and was published by NIST
a number of years ago and has been popularized by many security courses and
organizations, like SANS. We know what we need to do.
Companies that have been able to invest in creating a formal security
management program are pretty good at having created that incident response
capability and the appropriate procedures around it. But then, you have a
lot of small organizations that maybe have appeared over the last few
years, or organizations that have been small for a while, that really haven't
had a chance to mature their practice overall, so they are still pretty
immature when it comes to incident response. Overall, incident response
tends to be the last thing that companies look at when they try to
formalize their security program. That suggests there's a lot of room for
Mike Mimoso: Where are enterprises most challenged when it comes to incident
response? Are we talking about a lack of planning or a lack of leadership,
Lenny Zeltser: The biggest challenge around incident response is to be able to
justify time to prepare for it before something bad actually happens. With
organizations where staff is stretched thin, it's very difficult to
justify spending time on something that might never occur. We
are so busy fighting fires that the hardest thing is finding time to
do it. And it requires a strong maturity from the organization's management
to be able to tell their people that yes, it's ok to spend a number of
hours, days, or weeks planning for an incident.
The biggest challenge is that we are not prepared for it. Really,
one side of it is that organizations have not planned for it, but the other
similarly interesting aspect of it is that there are some organizations
that I have seen that have spent some time preparing for it, but have
planned in a manner that's completely unrealistic and maybe too
theoretical and too paperwork-driven. And thus, if an actual incident
happens, they don't really know what to do despite having planned
for it, and they have a false sense of security.
Mike Mimoso: Is there an ideal skill set for an incident handler?
Lenny Zeltser: It depends on your understanding of what an incident
handler is, because as part of an incident handling team, you have people
who have different skills and various talents. You have people who are
system and network administrators. You have people who are security
analysts. You have people who are good at interacting with management and
with customers. So, really a wide set of skills comprises an incident
But when I think of individuals that are involved in incident handling, I think
the biggest skill, or maybe it's a talent, is being able to stay calm in a
situation that is very stressful. On the one hand, you might say that is
something a person may need to be born with, but I find that more and more
it's really something that you can train yourself for with the proper
preparation and maybe exercises.
Mike Mimoso: Who must sit on an incident response team?
Lenny Zeltser: You need to be sure to include both technical and non-technical
staff as part of an incident handling team. There's a lot of technical work
that will need to be done when an incident happens. Let's say a server
gets compromised: you need someone with a system administrator's
capabilities who knows how to look at the server and knows what processes
to examine, what logs to look at, how to understand what goes on there.
Similarly, most incidents involve a network and, therefore, somebody with
networking expertise is required as well. At the same time, you need
somebody who has a security background so that they know how to take the
perspective of an attacker, because a regular sys admin might not be as
skilled at reading logs and inferring malicious
actions from what is going on there. So, that's really the technical
aspect of the incident handling team.
From a non-technical perspective, you need somebody who is good at
coordinating multiple people with various agendas and perspectives on the
incident. This is really a person whom I consider an incident coordinator.
He might not even be technical at all, but he is skilled at dealing
with people, knows how to keep everybody calm, and knows how to communicate
what goes on to management, which means understanding the technical
aspects of the incident and communicating them in non-technical language to the
managers. At the same time, he knows how to interpret management's requests
and feedback in a way that the technical team understands.
Then, of course, you have the actual data owners. These are presumably
business people who have the responsibilities for the data that was
compromised. These are the people that interact with the organization's
customers, and they, of course, will need to be involved as well. They will
need to know what to do and how to interpret what goes on. Also, most
incident handling teams have somebody with a legal background who knows how
to interpret and apply compliance requirements or contractual obligations.
Lastly, on the non-technical side, you have somebody who is good with public
relations, because a company may need to communicate what is going on, when
they decide to do so, to their customers, to the public at large, and to media
members. As you can see, a really wide set of skill sets, talents, and
backgrounds comprise an incident handling team.
Mike Mimoso: How often should organizations review their incident response plans?
Should this be an annual exercise?
Lenny Zeltser: The more up-to-date your plan is, the better, of course. Too often,
we have plans that were created years ago. They are nice and thick
and printed on beautiful glossy paper, but they are collecting dust,
because no one ever looks at them. Plans need to be looked at and validated
for relevance and accuracy. As a rule of thumb, I would say at
least on an annual basis. Even more importantly, they should be reviewed
whenever something significant changes with respect to the infrastructure
that the organization is using for technology, or with respect to the company's
business and compliance requirements. When there is a major change, of
course, the incident response plans have to be updated as well.
A major flaw in a lot of incident response plans is that they are too
optimistic about people knowing what to do and too optimistic in their
expectation that people will actually read the plan, and therefore the
plan ends up being too long, too thick, too detailed. When somebody sees a
large volume called an incident response plan, they are scared to even open
it and therefore probably won't read it. Secondly, the larger and more
detailed your plan is, the more difficult and expensive it is to
keep it up-to-date. I would say keep the size and the detail level in your
plan in proportion to the overall maturity of your IT and security
processes, and update it at the same
pace at which you would update any other IT or security documentation.
Mike Mimoso: On a related question, how often should organizations conduct a dry
run of their incident response plans or, at least, some kind of table top
Lenny Zeltser: A dry run, it really depends. If your company is getting hacked
every day, then you're probably doing OK and don't need to do a dry run,
but, of course, it's a little sad. It's very challenging for organizations
who don't frequently deal with incidents because if they don't run through
a dry run or table top exercise, that means that when an incident does
happen, people are out of practice and they don't know what to do.
Organizations that have a mature security program perhaps
don't encounter significant incidents all that frequently, and they become a
little complacent, a little too comfortable. These are the companies that
really need to pay attention to how to simulate an incident and train
both the security staff and the business people on what to do if an
incident actually does happen.
Mike Mimoso: What should team leaders be looking for during those dry runs?
Lenny Zeltser: When you go through a dry run of an incident response exercise, you
are looking for deviations between your written policies and
procedures and what actually happens. You may have made
certain assumptions when you wrote your plan. For example, you never
thought about how certain large log files might be extracted from an
environment that's very locked down. You go through an exercise and you
find out that an environment is locked down so much that you actually can't
extract those logs.
It's those little points that you forget when thinking about a plan and
putting it down that will become apparent. Like how do you get access to a
system to begin with? Who has access to it? Who has the logon
credentials? As someone is going through a dry run of the incident
response, keep a log of those issues that come up, even ones that might be very small.
Maybe it's as simple as somebody not knowing the IP address of a system
he needs to connect to. That's the beauty of a dry run: it's OK to
make mistakes, and you just adjust your plans accordingly.
Mike Mimoso: You mentioned the NIST framework earlier as a guide for incident
response. Do companies use it as is, or are they customizing it to their
business? Can you talk about whether it's a mistake not to
customize it to your business?
Lenny Zeltser: Yeah, certainly. In general, when it comes to security policies,
there are security policies you can buy pre-written. A lot of companies
that I have seen, unfortunately, just buy a set of security policies, pick
the ones they feel are going to be relevant, include them in their own
binding, and say this is their security policy manual. In those situations,
policies are rarely crafted specifically for the company and, therefore, are
kind of irrelevant and will only hurt the company if it actually tries to
go through those policies and follow the steps prescribed there.
This is incredibly relevant to incident response because when an incident
does happen it leaves very little room for error. That means, if you
adopted somebody else's incident response plan that is not very realistic
within your company, one that does not account for the number of people you have
and their skill set, if it's a pie-in-the-sky incident plan, it's not
helpful. It will only hurt you. It's very important to not be threatened by
some of the sample incident response plans that a smaller organization
might see out there and say wow, we will never be able to pull that off.
Even in a larger organization, think of it this way. If something were to
happen today, what would you want to happen and what is realistic for you
to actually implement? Try to make a plan that is, well, realistic, pragmatic. A
plan that is not what you would hope someone might do, but what somebody
might actually do. Write those things down, because when an incident does
happen, people are stressed and they can make mistakes. If they have
those guidelines, and if the guidelines that you give them are specific to
your systems, your business processes, then you will be really helping the
Mike Mimoso: The final question, Lenny. Now that an incident has happened, where
do companies make the most mistakes? Is it in identifying the scope of an
incident, containing it, or even communicating it to shareholders or the
media, if necessary?
Lenny Zeltser: The mistakes are tied to the biggest challenges in incident
response. In my mind, there are two significant challenges. One is
understanding the actual scope of the incident, and the other one is how do
you communicate and stay in touch with all the interested parties that care
about the incident. From the scope perspective, the mistake some companies
make is underestimating which systems are actually affected.
They do that because a lot of times the people involved in the incident
don't actually have the knowledge of the environment. They might come into
an environment whose business processes they don't understand. These might
be sys admins who never cared to ask what do these systems actually do?
What are their dependencies? What other systems depend on the systems that
just got compromised?
Without having an understanding of the overall business information, they
cannot properly estimate what systems may have been affected and even more
importantly what business process may have been affected. That means
organizations need to make that information about business flows
available to incident responders. Similarly, sometimes
organizations might go overboard and, say, overestimate the scope of the
Again, I think it's tied to the lack of knowledge of the business processes,
application makeup, and the systems that drive those processes. They err on
the side of caution and say, let's consider everything in scope. If you
consider that the scope is too big, then you spend too much time on
unnecessary analysis steps and you lose manpower and you lose analysis
The other aspect regarding communications that I think is also very
important to track and to examine is tied to how you simply keep people
informed about what's going on. Especially, a lot of technical people will
be very stressed out and will be very concerned about what is going on, but
if an incident response is considered a technical problem, then your
technologists are busy actually looking through the logs, examining
processes, and looking at the contents of the file system, and nobody thinks
about keeping non-technical people involved.
When non-technical people, especially managers, are not apprised of the
situation, they may ask too many unnecessary questions
because they are frustrated; they don't have the information they need.
That in turn slows down the technology team. Designating somebody as the
person who will make sure that the technical team is able to do their work
and not get bogged down with communication details, but instead is able to
translate what goes on and communicate that to managers, I think is very
Mike Mimoso: Thanks for joining us today, Lenny.
Lenny Zeltser: It's been great. Thank you.
Mike Mimoso: Thank you for watching. I'm Mike Mimoso, and for more security
resources, please go to searchsecurity.com.