Black box and white box testing: Which is best?

There's no question that testing application security is essential for enterprises, but which is better: black box security testing or white box security testing? Learn more in this expert tip.

This tip is part of the Data Protection School lesson, How to build secure applications. For more learning resources, visit either the lesson page or the Data Protection School main page.

It's tempting to want a single, do-it-all tool -- in the family room, for example, wouldn't it be nice if there were an HDTV that also had an embedded DVR, an Internet connection for YouTube streaming and speakers that create true surround sound? Sure it would. But, in reality, meeting all those requirements means buying multiple components -- each one serving a unique function.

Application testing is no different. When it comes to application security testing, "there's no silver bullet," said Jay Leek, who heads up corporate IT security services for mobile technology company Nokia Corp. White box and black box application testing tools and techniques address specific non-redundant needs; they can be used in a complementary manner, but no single approach does it all. Or, as Leek sums it up, "You must have multiple approaches looking at the same application from different angles in order to have any kind of confidence that you've got a fairly secure application."

White box testing: Inside looking out

White box testing is also called structural testing and static analysis. The source code -- or a compiled binary of it -- is assessed from an insider's view for security vulnerabilities and coding flaws. White box testing is commonly used early in the development process because it can be applied effectively while the code and modules are still being created.

With white box testing, developers can install plug-ins into their integrated development environment (IDE) of choice and catch semantic coding errors even before the code is checked in or compiled. Think of this kind of analysis like a spell checker; it's great for catching many potential mistakes early in the process, but it's not a replacement for a proficient editor. This is due, in part, to the fact that not all software vulnerabilities are semantic. For example, there is nothing inherently wrong with allowing a user to enter an unmasked 16-digit number into a form, but if that number is a credit card primary account number (PAN) and the form is part of the company's cardholder data environment, it's a violation of the Payment Card Industry Data Security Standard (PCI DSS). In other words, coding errors aren't the only source of software vulnerabilities; business rules and compliance requirements must also be taken into account when measuring an application's overall security posture.

Static source analysis can also be implemented later in the process, after code is checked into a source code repository, or, in the case of static binary analysis, after code is compiled by using a standalone tool. Often, audit or security teams use standalone tools during spot checks of code modules before code is compiled. In the case of binary analysis, the compiled code can be reviewed before it goes into production or even after launch to catch errors in software that is already deployed.

A benefit of static testing is that the tools can pinpoint the exact line (or lines) of code where a weakness or flaw is introduced. On the down side, static analysis tools can return high levels of false positives if they are not tuned carefully. Also, scanning every line of code may be too costly for some organizations. As companies become more proficient with these tools, they can configure them to suppress alerting on specific weaknesses if the company has deemed the risks of those weaknesses acceptable, thereby reducing the false positive rate. For efficiency, Leek advises that companies "focus on areas and applications of high criticality," paying closest attention to those at highest risk. "Then look at a subset of the source code," for those applications.
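
To make the line-pinpointing and suppression ideas concrete, here is a minimal sketch of a toy static check written with Python's standard `ast` module. It is not how any commercial tool works; the rule set (`RISKY_CALLS`), the `scan_source` function and the sample code are all hypothetical names invented for this illustration.

```python
import ast

# Hypothetical rule set: call names this toy checker treats as risky.
RISKY_CALLS = {
    "eval": "eval on possibly untrusted input",
    "exec": "exec on possibly untrusted input",
}

def scan_source(source, suppressed=frozenset()):
    """Return sorted (line_number, call_name, message) tuples for risky calls.

    Names listed in `suppressed` are skipped, mimicking a team that has
    accepted the risk of a specific weakness.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            name = node.func.id
            if name in RISKY_CALLS and name not in suppressed:
                findings.append((node.lineno, name, RISKY_CALLS[name]))
    return sorted(findings)

# Sample code under test; it is only parsed, never executed.
SAMPLE = '''\
def load(expr):
    return eval(expr)

def run(code):
    exec(code)
'''

if __name__ == "__main__":
    for line, name, message in scan_source(SAMPLE):
        print(f"line {line}: {name}: {message}")
```

Calling `scan_source(SAMPLE, suppressed={"exec"})` reports only the `eval` call, which is the kind of tuning described above. Note that the checker has no idea whether either call is reachable with attacker-controlled input; that is the exploitability gap listed in the cons below.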

Pros: Early in the process, white box testing is great at catching semantic errors, can help teach developers how to write better code, and can identify a precise line of code where a flaw occurs.

Cons: White box testing may not cover dependent code (called services, libraries and other artifacts), requires source code access, has limited insight into vulnerability exploitability and produces a high rate of false positives if not tuned appropriately.

Black box testing: Outside looking in

Also known as dynamic analysis and Web application scanning, black box testing analyzes code as it is running to identify vulnerabilities that an attacker could find when the application is running in production. Because these tools take an outsider's view, they are able to test whether the weakness can be exploited, or identify the type of weakness so a human penetration tester can validate exploitability manually.

For example, the tool may find an input-validation error that could allow a SQL injection attack, but verification could prove that, due to tight access control and the use of stored procedures on the back-end database, the vulnerability is not exploitable. Dynamic tools can be used before an application is approved for production, and non-invasive dynamic testing can be conducted continuously on active applications to scan for new vulnerabilities.
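
The difference between a weakness that is exploitable and one that is not can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions: the table, the function names and the `PROBE` string are hypothetical, and a parameterized query stands in for the stored procedures mentioned above, since SQLite does not support them.

```python
import sqlite3

def make_db():
    """Build a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def lookup_unsafe(conn, name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]

def lookup_safe(conn, name):
    # Parameterized: the driver passes the input as data, not as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]

# The kind of input a dynamic scanner might submit to a form field.
PROBE = "x' OR '1'='1"

if __name__ == "__main__":
    conn = make_db()
    print("unsafe:", lookup_unsafe(conn, PROBE))
    print("safe:  ", lookup_safe(conn, PROBE))
```

With the probe string, the concatenating version returns every row, while the parameterized version returns nothing because no user has that literal name. A scanner that only sees missing input validation at the form cannot tell which of the two sits behind it, which is why manual verification matters.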

Dynamic tools crawl, or spider, an application to identify the execution paths. Many Web 2.0 applications create pages on the fly based on user input. This can create recursive looping if the tool is not capable of identifying and ending the loop. If the tool does not find all the paths in the application, parts of it will remain unscanned, providing only a partial picture of the software's vulnerability level. Rich Internet applications (RIA) that implement animation and interactivity with Flash or streaming video create blind spots for scanners, which are not yet fully adept at scanning such file types. Because dynamic tools test running applications, they can only be used once the application has been built.
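
The looping problem can be sketched in a few lines of Python. The example below is a simplified model, not a real scanner: `fake_links` is a hypothetical stand-in for fetching a page and extracting its links, and the per-path cap (`max_per_path`) is one assumed way of ending a loop over pages that are generated on the fly.

```python
from collections import deque
from urllib.parse import urlsplit

def fake_links(url):
    """Hypothetical stand-in for fetching a page and extracting its links."""
    parts = urlsplit(url)
    if parts.path == "/":
        return ["/about", "/cal?day=1", "/"]
    if parts.path == "/cal":
        day = int(parts.query.split("=")[1])
        # Each calendar page links to the next day, so the chain never ends.
        return ["/", f"/cal?day={day + 1}"]
    return ["/"]

def crawl(start, get_links, max_per_path=3):
    """Breadth-first crawl that skips seen URLs and caps visits per path."""
    seen = set()
    per_path = {}
    queue = deque([start])
    visited = []
    while queue:
        url = queue.popleft()
        if url in seen:
            continue  # ordinary cycle, e.g. every page linking back to "/"
        path = urlsplit(url).path
        if per_path.get(path, 0) >= max_per_path:
            continue  # endless generated pages: stop expanding this path
        seen.add(url)
        per_path[path] = per_path.get(path, 0) + 1
        visited.append(url)
        queue.extend(get_links(url))
    return visited

if __name__ == "__main__":
    print(crawl("/", fake_links))
```

The visited set handles ordinary cycles, but only the cap stops the endless calendar chain. The trade-off is the one described above: everything beyond the cap stays unscanned, so the crawl terminates at the price of partial coverage.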

Pros: Black box testing can validate weakness exploitability (though this may require input from human testers for the most accurate results) and test applications within the system view without requiring the source. Some tools can also be run continuously, even on production applications.

Cons: Black box testing happens later in the lifecycle, and poor spidering can result in only partial test coverage. It cannot pinpoint where a vulnerability is introduced in code, and most tools can't scan Web 2.0 and RIA media for vulnerabilities.

Gray box testing: Providing 360-degree views

White box and black box testing uncover different kinds of security weaknesses in applications at different stages in the lifecycle. As Leek explains, there are "things you can't find in one or the other" and you "need access to source in order to validate vulnerabilities." Essentially, he said, white and black box testing are "complementary tools." This is why many organizations choose to implement a hybrid testing approach that combines both white and black box testing techniques.

Such an approach, known as gray box testing, is most effective when there is a level of integration between the testing types, especially when they are tied together with an overarching software assurance management console or dashboard. For example, a dynamic tool that identifies a flaw doesn't have access to the underlying source code, making it hard for developers to know what to fix. If a management console is used to integrate white and black box results, the weakness can be prioritized in the remediation queue and the source code tool can pinpoint the location of the flaw. This helps managers understand exactly where the problems are and gives developers the information they need to address them.
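
As a rough illustration of what such a console does, the sketch below pairs dynamic findings with static findings of the same weakness type and moves confirmed-exploitable issues to the front of the queue. Everything here is assumed for the example: the record types, the `ROUTES` mapping from URL paths to source files, and the matching rule are hypothetical, and real products correlate results in their own ways.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DynamicFinding:
    weakness: str      # e.g. "sql-injection"
    url_path: str      # where the scanner observed it
    exploitable: bool  # confirmed by the tool or a human tester

@dataclass(frozen=True)
class StaticFinding:
    weakness: str
    file: str
    line: int

# Hypothetical mapping from URL paths to the source files that handle them.
ROUTES = {"/login": "auth/login.py", "/search": "catalog/search.py"}

def correlate(dynamic, static, routes):
    """Return (dynamic_finding, [(file, line), ...]) pairs, exploitable first."""
    queue = []
    for d in dynamic:
        handler = routes.get(d.url_path)
        locations = [
            (s.file, s.line)
            for s in static
            if s.weakness == d.weakness and s.file == handler
        ]
        queue.append((d, locations))
    # Stable sort: confirmed-exploitable findings move to the front.
    queue.sort(key=lambda item: not item[0].exploitable)
    return queue

DYNAMIC = [
    DynamicFinding("xss", "/search", False),
    DynamicFinding("sql-injection", "/login", True),
]
STATIC = [
    StaticFinding("sql-injection", "auth/login.py", 42),
    StaticFinding("xss", "catalog/search.py", 17),
    StaticFinding("xss", "auth/login.py", 88),
]

if __name__ == "__main__":
    for finding, locations in correlate(DYNAMIC, STATIC, ROUTES):
        print(finding.weakness, finding.url_path, locations)
```

The dynamic result supplies the priority and the static result supplies the file and line, which is the division of labor the paragraph above describes.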

The human factor

Regardless of which secure software testing tool, or tools, your company decides to use, don't overlook the importance of having skilled testers running the tools and analyzing the results. Although automated tools are designed to be user friendly, the market is still maturing. All automated tools work best when they are configured for your particular environment and application testing priorities. If you don't have application security testing experts at your company, you can still benefit from application security testing by hiring outside experts. Leek recommends that companies "look for a managed service, a SaaS solution or something that is human-assisted." For comprehensive application security analysis, "you need a people element if you don't have it in your own team -- look to an external provider for those services."


As with many questions in security, the answer to "What kind of testing is best?" is: It depends. For finding semantic errors early in the lifecycle, white is the way to go. Don't have the source code or want to know how an attacker views your application? Go with black. And for the most comprehensive software assurance program, consider using a hybrid approach that leverages the intelligence and information from both types of testing into a single management and reporting console.

About the author:
Diana Kelley is a partner with Amherst, N.H.-based consulting firm SecurityCurve. She formerly served as vice president and service director with research firm Burton Group. She has extensive experience creating secure network architectures and business solutions for large corporations and delivering strategic, competitive knowledge to security software vendors.

This was last published in November 2009
