Data classification best practices: Techniques, methods and projects

Effective data classification in the enterprise requires a simple approach.

This article can also be found in the Premium Editorial Download: Information Security magazine: Top considerations for midmarket security.

The process of data classification is not something that comes naturally to us. So why bother to do it at all? Because it makes no sense to treat all information as if it had the same security significance (or integrity significance or availability significance). Without making a conscious decision to apply different levels of control to different types of data, you inevitably either treat all data as if it were nuclear waste, carefully securing it to such a degree that nobody can get any work done, or apply no controls whatsoever, ensuring that proprietary and regulated data will get into the wrong hands.

A data classification model is a mechanism to optimize the level of security effort, maintaining the maximum possible flexibility while ensuring a proportionate level of control for sensitive data. But how do we choose a model that is practical and useful? What data classification best practices should we follow?

Given that classification is a practice we borrowed directly from military organizations, it's illuminating to examine what they've learned over the last century. The U.S. had no formal scheme until World War I, when the French and British forced use of a three-tiered scheme. By the start of World War II, the Allies somewhat reluctantly accommodated a more complex and information-rich world by extending classification to four levels, and then the U.S. demanded that a fifth level be added for the Manhattan Project. Because of the significant global threat presented by thermonuclear weaponry, NATO continued, albeit grudgingly, to use a five-level scheme.

The military has always avoided adding complexity to its schemes because the difficulty of applying classifications compounds with each added level. I am aware of some commercial organizations that use one more level than NATO. That probably doesn't make sense, even if they are dealing with nuclear bombs. My experience has been that anything above three levels is too abstract to be practical.

Ten years ago, I was introduced to the idea of a simple low, medium, high scheme through the National Security Agency's Information Security (INFOSEC) Assessment Methodology course. The NSA, traditionally not an organization that shies away from abstract and impractical computer security concepts, has been promulgating this low-granularity concept for a decade. The Information Security Forum (ISF) also has been recommending a low, medium, high scale.

Ultimately, a data classification model that is "too simple" is superior to one that is "too complex." Simplicity is especially important as it becomes increasingly clear that the lines of business need not only to determine the sensitivity of their own IT assets but also to "own" the associated risk. A complex scheme that requires business managers to become data classification specialists is a non-starter.

Increasingly, we're going to be offering security in the form of service levels, aligning the degree of protection with business needs and budgets. The baseline level of protection is suitable for the majority of business information, which is of low sensitivity, but special information requires special care, and that costs extra. Our service catalog has to be easily understandable, offering specific benefits for higher costs.
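To make the idea of a service catalog concrete, here is a minimal sketch of how a three-level scheme might map to service tiers. It is an illustration only, not a recommended control set: the tier names, the controls listed and the relative cost figures are all hypothetical placeholders, and the choice to default unclassified data to the baseline tier is an assumption drawn from the point that most business information is of low sensitivity.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Tuple


class Sensitivity(IntEnum):
    """Three-level scale; integer values let levels be compared."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ServiceLevel:
    name: str
    controls: Tuple[str, ...]
    relative_cost: int  # illustrative unit, not a real price


# Hypothetical catalog: every name, control and cost here is a placeholder.
CATALOG = {
    Sensitivity.LOW: ServiceLevel(
        "baseline", ("access control",), 1),
    Sensitivity.MEDIUM: ServiceLevel(
        "enhanced", ("access control", "encryption at rest"), 2),
    Sensitivity.HIGH: ServiceLevel(
        "restricted",
        ("access control", "encryption at rest", "access logging and review"),
        4),
}


def service_for(level: Sensitivity = Sensitivity.LOW) -> ServiceLevel:
    """Return the service tier for a level; assumes LOW when none is given."""
    return CATALOG[level]


if __name__ == "__main__":
    for level in Sensitivity:
        tier = service_for(level)
        print(level.name, tier.name, tier.relative_cost, ", ".join(tier.controls))
```

With only three entries, a business manager can read the whole catalog at a glance: each step up adds controls to the tier below it and costs more.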

Of course, to be effective, a data classification model requires processes, techniques, methods, projects and technology to provide the correct level of security specific to each level. And data owners and users must be willing to follow the processes and use the technology, even when it means extra work or reduced system performance. However, a growing number of organizations are learning that low, medium, and high make a very practical starting point for service-level offerings.

Jay Heiser is a London-based research vice president at Gartner.

This was first published in March 2009
