Software defects that lead to security problems come in two major flavors -- bugs in the implementation and flaws in the design. The majority of attention in the software security marketplace (too much, we think) is devoted to finding and fixing bugs, mostly because automated code review tools make that process straightforward. But flaws in the design and architecture of software account for roughly 50% of security defects (see McGraw's 2001 book Building Secure Software for more on this).
In this article, we'll explain the difference between bugs and flaws. More importantly, we'll describe an architecture risk analysis (ARA) process that has proven to be useful in finding and fixing flaws.
What is the difference between a bug and a flaw? Perhaps some examples can help.
Bugs are found in software code (source or binary). One of the classic bugs of all time, the buffer overflow, has at its root the misuse of certain string handling functions in C. The most notorious such function is gets() -- a standard library routine that reads input until the user hits return, with no way to limit how much it reads. Imagine a fixed-size buffer as something like an empty drinking glass. Then imagine that you set things up to get more input than fits in the glass (and the attacker is "pouring"). If you pour too much water into a glass and overfill it, water spills all over the counter. In the case of a buffer overflow in C, too much input can overwrite the heap or even overwrite the stack in such a way as to take control of the process. Simple bug. Awful repercussions. (And in the case of gets(), particularly easy to find in source code.)
Hundreds of C library functions and system calls can lead to security bugs if they are used incorrectly, and the hazards range from string handling functions to integer overflow and integer underflow. And there are just as many bugs in Java and other languages. There are also common bugs in Web applications (think cross-site scripting or cross-site request forgery) and bugs related to databases (like SQL injection).
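To make the database case concrete, here is a minimal sketch of a SQL injection bug and its fix, using Python's standard sqlite3 module. The users table and the function names are hypothetical, chosen only for illustration.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # BUG: attacker-controlled text is concatenated into the SQL statement,
    # so quotes in the input can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Fix: the ? placeholder passes the input as data, never as SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With an input such as `nobody' OR '1'='1`, the unsafe version returns every row in the table, while the parameterized version returns nothing. This is exactly the kind of defect that a code review tool can spot, because the problem is visible in a few lines of code.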
There's an endless parade of bugs (and, by the way, there are way more than ten). The fact is, there are so many possible bugs that it makes sense to adopt and use a tool to find them. The many commercial source code review tools available include HP's Fortify, IBM AppScan Source, Coverity Inc.'s Quality Advisor, and Klocwork Inc.'s Klocwork Insight. The latest twist in source code review is to integrate bug finding directly into each developer's integrated development environment (IDE), so that bugs are uncovered as close to conception as possible. For example, Cigital Inc.'s SecureAssist does this.
At the other end of the defect spectrum we find flaws. Flaws are found in software architecture and design. Here's a really simple flaw example. Ready? Forgot to authenticate user. That kind of error of omission will usually not be found with code review. But it can be a serious problem. Does your process run as root? Better be darn sure who's using it!
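A small sketch shows why this kind of omission slips past code review. Everything here is hypothetical (the request dictionary, the verified_user field, and the handler names), and a real system would put the check in its framework or gateway instead of in a decorator.

```python
class AuthenticationError(Exception):
    """Raised when a request reaches a handler without a verified user."""


def require_authentication(handler):
    # One design-level gate that every privileged handler passes through.
    def wrapper(request):
        if request.get("verified_user") is None:
            raise AuthenticationError("request has no verified user")
        return handler(request)
    return wrapper


def delete_account_flawed(request):
    # No single line here is wrong; the authentication step is simply absent.
    return "deleted " + request["account"]


@require_authentication
def delete_account(request):
    # Same logic, but reachable only through the authentication gate.
    return "deleted " + request["account"]
```

The flawed handler contains nothing for a scanner to flag, which is the point: the defect lives in what the design left out, not in what the code does.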
Other examples of flaws include "attacker in the middle" problems that allow tampering or eavesdropping between components, layers, machines, or networks; and "replay attack" problems that have to do with weak protocols.
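As one illustration of closing both holes at the protocol level, the sketch below authenticates each message with an HMAC (to detect tampering) and tracks nonces (to reject replays). The shared key, the 16-byte nonce convention, and the Receiver class are assumptions made for this example; it is not a complete protocol and omits key management and nonce expiry.

```python
import hashlib
import hmac

# Hypothetical shared secret, for illustration only.
SHARED_KEY = b"demo-key-not-for-production"


def sign(message: bytes, nonce: bytes) -> bytes:
    # The tag covers the nonce and the message, so neither can be swapped.
    # Nonces are assumed to be a fixed 16 bytes, which keeps the
    # concatenation unambiguous.
    return hmac.new(SHARED_KEY, nonce + message, hashlib.sha256).digest()


class Receiver:
    def __init__(self):
        self.seen_nonces = set()

    def accept(self, message: bytes, nonce: bytes, tag: bytes) -> bool:
        if not hmac.compare_digest(tag, sign(message, nonce)):
            return False  # tampered with or forged in transit
        if nonce in self.seen_nonces:
            return False  # valid tag, but this exact message was seen before
        self.seen_nonces.add(nonce)
        return True
```

A sender would pick a fresh random nonce per message (for example with `os.urandom(16)`) and transmit the message, nonce, and tag together. A protocol that authenticates messages but never checks freshness would pass the first test and still be open to replay.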
To flesh things out a bit, here is a list of some common Java-related flaws: misuse of cryptography, compartmentalization problems in design, privileged block protection failure (doPrivileged()), catastrophic security failure (fragility), type safety confusion error, insecure auditing, broken or illogical access control (RBAC over tiers), method overriding problems (subclass issues), and too much trust in (client-side) components that should not be trusted. (For more on these issues, see McGraw's ancient book Securing Java.)
Flaws are just as common as bugs. In fact, in most studies, bugs and flaws divide the defect space 50/50. Of course we're really talking about a continuum. There are some tricky cases that may be categorized as both a bug and a flaw depending on how you look at it. But, in general, making a distinction between bugs and flaws is a useful exercise.
Simply put, if we're going to solve the software security problem, we're going to need to focus more attention on flaws.
Finding flaws with ARA
We've specialized in software risk analysis and design review for many years. When we first started reviewing systems for security in 1997, we followed an ad hoc, expertise-driven approach (three smart guys in a room with a white board). These days when we're doing a deep dive into software architecture looking for flaws, we follow a process called architectural risk analysis.
ARA is a four-step process. Step 0 (naturally we start with 0 since we're geeks) is to get an architecture diagram.
That may sound silly or glib, but believe it or not, teasing a relevant and up-to-date architecture diagram out of the development team is sometimes harder than it should be. For example, some extreme Agile methods say things like "the code is the design." We beg to differ.
Our ultimate goal for this step is to create a one-page diagram of the software system. One page is important, because we want a forest-level view of the software. Bugs are found at the tree level, flaws at the forest level. We don't want reams of code, we don't want UML and we don't want a firewall placement network map. In many cases, we build the one-page diagram ourselves through a process of interviewing software architects, developers and testers.
By the way, your diagram will have some essential components, including, but not limited to, DAO/persistence layers, business logic/business rules, security features, toolkits (WSE, WCF, Ajax), middleware, Web services, cloud APIs, caching and distribution.
After we have an architecture diagram in hand, we undertake three specialized analysis steps in an ARA: 1) known-attack analysis, 2) system-specific attack analysis, and 3) dependency analysis. (Note to long-time readers: We re-named the three steps of ARA to make them easier to grok.)
1) Known attack analysis is about as straightforward as it gets. Take a list of known attacks relevant to your architecture and go through them. Microsoft's STRIDE approach (part of what they mistakenly call threat modeling) is a good example. STRIDE is an acronym for spoofing, tampering, repudiation, information disclosure, denial of service, and elevation of privilege. That's the Microsoft list.
The key to known attack analysis is to know some attacks. Developing a canonical list of attacks at the design level is more than half the battle. STRIDE may work for vendors of operating systems, but you'll need your own list based on your own market and your own unique attackers. One way to create such a list is to ask your vulnerability management group which attacks are consistently being uncovered.
When you find an attack on your list that's relevant, calculate its impact and think about how you would fix the architecture to mitigate the risk. Notice our use of the word mitigate here. Sometimes a defect does not need to be entirely solved or completely eradicated. We just need to reduce the risk to an acceptable level for a given set of conditions.
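The bookkeeping for known-attack analysis can be sketched in a few lines: walk every component on the diagram against every attack on the list and record the relevant ones with an impact and a candidate mitigation. The Finding record, the 1-to-5 impact scale, and the assess callback (which stands in for the reviewer's judgment) are all assumptions made for illustration.

```python
from dataclasses import dataclass

# The six STRIDE categories; swap in your own attack list as needed.
STRIDE = [
    "spoofing",
    "tampering",
    "repudiation",
    "information disclosure",
    "denial of service",
    "elevation of privilege",
]


@dataclass
class Finding:
    component: str
    attack: str
    impact: int       # 1 (minor) to 5 (severe); the scale is an assumption
    mitigation: str


def known_attack_analysis(components, attack_list, assess):
    """Check every (component, attack) pair and keep the relevant ones.

    assess(component, attack) represents reviewer judgment: it returns
    None when the attack does not apply, or (impact, mitigation) when it does.
    """
    findings = []
    for component in components:
        for attack in attack_list:
            result = assess(component, attack)
            if result is not None:
                impact, mitigation = result
                findings.append(Finding(component, attack, impact, mitigation))
    return findings
```

In practice the assess step is a conversation among reviewers, not a function call; the value of the loop is that no component and attack pairing gets skipped by accident.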
Finding fundamental flaws can really pay off. Wouldn't it be great to identify a flawed design and mitigate the risk associated with tens or hundreds of bugs with a single properly implemented security control? That happens all the time with output encoding.
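As a rough sketch of the output encoding idea, the function below is the one place where untrusted values become HTML, and it encodes all of them with Python's standard html.escape. The render_comment name and the markup are hypothetical.

```python
import html


def render_comment(author: str, text: str) -> str:
    # Encode every untrusted value at the single point where HTML is built,
    # so individual call sites cannot forget to do it.
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(text))
```

If every page fragment is produced through a choke point like this, a whole family of cross-site scripting bugs is mitigated by one design decision instead of being chased one call site at a time. Real templates also need context-aware encoding (attributes, URLs, JavaScript), which this sketch does not attempt.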
2) System-specific attack analysis focuses on exposing invalid assumptions, ferreting out ambiguity and finding new attacks based on how the system works. This step requires the most hard-core experience and natural ability because security itself is an emergent property of software. You know how sometimes a piece of software seems to behave properly when it's all by itself, but when it's added to a bigger ecosystem it goes entirely off the rails? That's partially what we mean. Anticipating emergent consequences can be tricky business.
In any case, decomposing the problem can be helpful. At the very least, during this step we like to think about trust modeling (identifying trust boundaries explicitly), data sensitivity modeling (identifying privacy and trust issues) and threat modeling (identifying attackers and considering the attacker's perspective). Note that our use of the term threat modeling is different from Microsoft's.
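Trust modeling lends itself to a simple sketch: assign each component on the diagram to a trust zone, list the data flows, and pull out the flows that cross a boundary, since those are where assumptions deserve the closest look. The zone names, the Flow record, and the example components are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    source: str
    destination: str
    data: str


def boundary_crossings(zones, flows):
    """Return the flows whose two endpoints sit in different trust zones.

    zones maps a component name to its trust zone label.
    """
    return [f for f in flows if zones[f.source] != zones[f.destination]]


# Hypothetical one-page architecture.
ZONES = {
    "browser": "untrusted",
    "web app": "dmz",
    "database": "internal",
    "cache": "internal",
}
FLOWS = [
    Flow("browser", "web app", "login form"),
    Flow("web app", "database", "customer record query"),
    Flow("database", "cache", "query results"),
]
```

Here `boundary_crossings(ZONES, FLOWS)` returns the first two flows. For each one, the review questions are what the receiving side assumes about the data and whether anything enforces that assumption.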
One technique we like to emphasize here leverages diverse points of view. Ever seen what happens when you put two software architects together with one architecture in the same room? Hint: not world peace. Take advantage of how very experienced architects differ in their views of the same system. There's gold in them thar hills.
Once again, during this step, note attacks that you find and their impact. Think about how you can mitigate the risk through changes in the design.
3) Dependency analysis has to do with determining how wobbly the tower of other software you're counting on to work turns out to be. Let's face it, today's software almost always relies on the proper behavior of components and frameworks that someone else designed and built. What were their assumptions, and how do they impact your system? What will happen to your design if the frameworks misbehave?
Start with the components you're relying on. Are there known vulnerabilities in the components you're counting on? (Open source or otherwise, the answer is often "yes.") Do you have sufficient security controls built into the framework? Do they actually work? (Sadly, the answer is "no" more often than we would like.) Are there features or functions that need to be disabled? (Probably.) Is the framework secure by default? (Maybe, if you have been very good lately.)
Write down what you find. Think through impact. Determine what to do about it.
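The first of those questions, whether the components have known vulnerabilities, can be sketched as a comparison between a component inventory and a list of advisories. The component names, the version numbers, and the advisory format (component name mapped to the first fixed version) are all hypothetical; a real review would pull advisories from a vulnerability database, and real version schemes are messier than dotted integers.

```python
def parse_version(text):
    # Assumes plain dotted integers such as "2.3.1".
    return tuple(int(part) for part in text.split("."))


def vulnerable_components(inventory, advisories):
    """List components running a version older than the first fixed version.

    inventory:  {component name: version in use}
    advisories: {component name: first version containing the fix}
    """
    hits = []
    for name, version in sorted(inventory.items()):
        fixed_in = advisories.get(name)
        if fixed_in is not None and parse_version(version) < parse_version(fixed_in):
            hits.append((name, version, fixed_in))
    return hits
```

Comparing versions as integer tuples instead of strings matters: as text, "1.10.0" sorts before "1.9.2", which would produce a false alarm. The remaining questions (whether the framework's security controls work, and what must be disabled) still need a human reviewer.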
After completing our four steps to ARA, you'll find yourself with a bunch of risks and some ideas for improving the design. Take those risks and explicitly consider business impact. Then rank your findings, organize them and propose a solution to the most important flaws you've uncovered.
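One simple way to rank findings is to score each risk as likelihood times business impact and sort from highest to lowest. That scoring formula, the 1-to-5 scales, and the dictionary keys below are assumptions for illustration, not a prescribed part of ARA.

```python
def rank_risks(risks):
    """Sort risks by likelihood x business impact, highest score first.

    Each risk is a dict with "name", "likelihood" and "business_impact",
    the last two on an assumed 1-to-5 scale.
    """
    return sorted(
        risks,
        key=lambda r: r["likelihood"] * r["business_impact"],
        reverse=True,
    )
```

For example, a risk with likelihood 2 and business impact 5 (score 10) ranks above one with likelihood 4 and business impact 2 (score 8). The numbers are only a way to force an explicit conversation about priority; the business owners, not the formula, make the final call.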
Next comes the tricky bit: figuring out how and when to introduce major architecture changes. In some cases, solutions unfold only over several years. Do not despair.
Lightweight design review
Does ARA sound easy? Well, it's not. Because the process is intense and the work is high-expertise, an in-depth ARA will not make sense for all of the applications you have in your portfolio. ARA is a must-have for critical systems, but not so much for systems that are not the core of your business. In a future article, we'll tackle what can be done to address the flaws in these "lesser" systems.
Fix your flaws
Whether your architecture review process leads to quick refactorings or multi-year enterprise architecture changes over multiple releases, today is the time to focus real attention on software security flaws. We can't abandon the bug parade and the tools we use to find and fix bugs, but because flaws make up about half the problem, it's only prudent to address them.
About the authors:
Jim DelGrosso is principal consultant at Cigital Corporation, where Gary McGraw serves as CTO.