Endpoint DLP fills data protection gap

Learn how endpoint data loss prevention technology complements network DLP and secures data that users interact with on laptops, mobile and portable storage devices.

Be it as part of a full-suite solution or as a standalone product, organizations are increasingly turning to endpoint data loss prevention (DLP) to close the gap on data protection. Although most organizations start with network DLP to gain the broadest coverage as quickly as possible, loss of sensitive data isn't a problem limited to the network or storage repositories. From remote users to portable storage, the endpoint is not only a significant repository for sensitive information; it's also where users spend much, if not most, of their time accessing the data.

But endpoint DLP is also the least mature segment of this increasingly popular class of technology. Due to processor and memory limitations, it's where we see the biggest differences between competing products, and the greatest feature and performance constraints. We also see competing solutions targeting the endpoint from different genealogical backgrounds, with offerings from traditional DLP, traditional endpoint, portable device control, and even encryption vendors. With such a confusing landscape, it's important to understand the value of endpoint DLP, the potential feature set, and how to prioritize your needs.


At the beginning of the DLP market we nearly always recommended organizations start with network DLP. A network tool allows you to protect managed and unmanaged systems (such as contractor laptops), and is typically easier to deploy in an enterprise because you don't have to touch every desktop and server. It also has advantages in terms of the number and types of content protection policies you can deploy, how it integrates with email for workflow, and the scope of channels covered. During the DLP market's first few years, it was hard to even find a content-aware endpoint agent.

But customer demand for endpoint DLP grew quickly because of two major needs: content discovery on the endpoint, and the ability to prevent loss through USB storage devices. That's where the first batches of endpoint DLP tools were focused.

The next major driver for endpoint DLP is supporting network policies when a system is outside the corporate gateway. With an increasingly mobile workforce, we need to support consistent policies no matter where someone is physically located or how they connect to the Internet.

Finally, we see some demand for deeper integration of DLP with how a user interacts with their system. In part, this is to support more intensive policies to reduce malicious loss of data. You might, for example, disallow certain content from moving into certain applications, such as encryption tools. Some of these same hooks are used to limit cut/paste, print screen, and fax, or to enable more advanced security such as automatic encryption and application of DRM rights.


Data loss prevention is defined as:

"Products that, based on central policies, identify, monitor, and protect data at rest, in motion, and in use through deep content analysis."

We normally use network DLP to monitor and protect data in network communications, and content discovery (sometimes built into the network DLP server) to handle data in storage. Endpoint DLP is a little different, since it has to potentially manage all three parts of the problem, and all on a system that's also running everything from antivirus to the actual business and productivity applications users need to get their jobs done.

Ideally, we'd like to monitor and protect network traffic when the endpoint is on remote networks, keep track of sensitive data stored on the endpoint, and track other usage, such as moving a file to portable storage, printing, or even cutting and pasting between applications. I say "ideally" since few products offer all of these capabilities in one package, especially once we start looking for advanced content analysis techniques.

Even for a single given function, there can be a dozen different approaches, all with varying degrees of success. We break out the potential functions into four main categories:

  • Content Discovery: Scanning of stored content for policy violations.
  • File System Protection: Monitoring and enforcement of file operations as they occur (as opposed to discovery, which is scanning of content already written to media). Most often, this is used to prevent content from being written to portable media/USB. It's also where tools hook in for automatic encryption or application of DRM rights.
  • Network Protection: Monitoring and enforcement of network operations. Provides protection similar to gateway DLP when an endpoint is off the corporate network. Since most endpoints treat printing and faxing as a form of network traffic, this is where most print/fax protection can be enforced (the rest comes from special print/fax hooks).
  • GUI/Kernel Protection: A more generic category to cover data-in-use scenarios, such as cut/paste, application restrictions, and print screen.

Between these four categories we cover most of the day-to-day operations a user might perform that place content at risk. They address our primary drivers for endpoint DLP: protecting data from portable storage, protecting systems off the corporate network, and supporting discovery on the endpoint. Most of the tools on the market start with file and (then) networking features before moving on to some of the more complex GUI/kernel functions.


Even if you have an endpoint with a quad-core processor and 8 GB of RAM, your users might get upset if all that horsepower were dedicated to running DLP. As important as content analysis is for data protection, it can also be extremely resource intensive.

The key distinguishing feature of DLP, endpoint or otherwise, is deep content analysis based on central policies. This contrasts with non-DLP endpoint tools, such as encryption or portable device control (USB blocking) based on file type, tagging/metadata, or location. While covering all content analysis techniques is beyond the scope of this article, some of the more common ones include partial document matching, database fingerprinting (or exact data matching), rules-based, conceptual, statistical, predefined categories (like PCI compliance), and combinations of the above. They offer far deeper analysis than just simple keyword and regular expression matching.

This advanced analysis comes at a cost in terms of memory and processing power. For example, to use partial document matching you create a series of overlapping hashes for the entire document, and then scan in real time the file or network traffic to identify matches. For database fingerprinting, we use a similar technique except the hashes are of database fields, correlated with related field hashes for the same row. Imagine monitoring these hashes for tens of thousands of documents or millions of database rows; while it's possible, and commonly done on dedicated servers, the performance and memory requirements would crush even the fastest of laptops.
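To make the overlapping-hash idea concrete, here is a minimal Python sketch of partial document matching. It is an illustration under stated assumptions, not any vendor's algorithm: the word-based window, the use of SHA-1, and the names `shingle_hashes` and `count_matches` are all hypothetical choices made for this example.

```python
import hashlib


def shingle_hashes(text, window=8):
    """Hash every overlapping run of `window` words in the text.

    Assumption: words are split on whitespace and lowercased; real
    products normalize content far more carefully.
    """
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < window:
        # Too short for a full window: hash the whole text once.
        return {hashlib.sha1(" ".join(words).encode("utf-8")).hexdigest()}
    return {
        hashlib.sha1(" ".join(words[i:i + window]).encode("utf-8")).hexdigest()
        for i in range(len(words) - window + 1)
    }


def count_matches(protected_hashes, candidate_text, window=8):
    """Count the candidate's word runs that also occur in the protected document."""
    return len(shingle_hashes(candidate_text, window) & protected_hashes)
```

In this sketch a protected document is registered once with `shingle_hashes`, and each outbound file or message is checked with `count_matches`; any nonzero count means a run of words was copied from the protected document. Note that the sketch stores roughly one hash per word of every protected document, which hints at why the memory cost climbs so quickly at enterprise scale.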

Different agents have different enforcement capabilities, which may or may not match up to their network counterparts. At a minimum, most endpoint tools support rules/regular expressions, some degree of partial document matching, and a whole lot of contextual analysis. Others support their entire repertoire of content analysis techniques, but you will likely have to tune policies to run on more resource-constrained endpoints.

Some tools rely on the central management server for aspects of content analysis, to offload agent overhead. Rather than performing all analysis locally, they ship content back to the server, and act on any results. This obviously isn't ideal, since those policies can't be enforced when the endpoint is off the enterprise network, and it sucks up a bit of bandwidth. But it does allow enforcement of policies that are otherwise totally unrealistic on an endpoint, such as fingerprinting of a large enterprise database.

One option is to use policies that adapt based on endpoint location. When you're on the enterprise network, most policies are enforced at the gateway; once you access the Internet outside the corporate walls, a different set of policies is enforced. For example, you might use database fingerprinting of the customer database at the gateway when the laptop is in the office or on a (non-split-tunneled) VPN, but drop to a rule/regex for Social Security numbers (or account numbers) for mobile workers. Sure, you'll get more false positives, but you're still able to protect your sensitive information while meeting performance requirements.
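A rough Python sketch of this kind of location-aware fallback follows. Everything in it is an assumption for illustration: the policy names, the `select_policy` and `check_outbound` functions, and the deliberately simple Social Security number pattern, which matches any nine digits in a 3-2-4 grouping and so produces exactly the false positives mentioned above.

```python
import re

# Simplified pattern (an assumption for this sketch): three, two, and four
# digits with optional dash or space separators. It performs no validation,
# so any nine-digit number in this shape will match.
SSN_PATTERN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")


def select_policy(on_corporate_network, vpn_full_tunnel=False):
    """Pick which (hypothetical) policy applies for the endpoint's location."""
    if on_corporate_network or vpn_full_tunnel:
        # Traffic passes the gateway, which can run heavyweight fingerprinting.
        return "gateway_fingerprinting"
    # Off the network: fall back to a cheap local rule.
    return "local_regex"


def check_outbound(text, on_corporate_network, vpn_full_tunnel=False):
    """Return the policy chosen and, for local policies, whether it fired."""
    policy = select_policy(on_corporate_network, vpn_full_tunnel)
    if policy == "gateway_fingerprinting":
        # The agent defers the decision; the gateway inspects the traffic.
        return {"policy": policy, "violation": None}
    return {"policy": policy, "violation": bool(SSN_PATTERN.search(text))}
```

Calling `check_outbound("SSN 123-45-6789", on_corporate_network=False)` flags a violation locally, while the same text sent from inside the office is left for the gateway to judge. A nine-digit part number would also trip the local rule, which is the trade-off for a check light enough to run on a laptop.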


Agent management consists of two main functions: deployment and maintenance. On the deployment side, most tools today are designed to work with whatever workstation management tools your organization already uses. As with other software tools, you create a deployment package and then distribute it along with any other software updates. If you don't already have a software deployment tool, you'll want to look for an endpoint DLP tool that includes basic deployment capabilities. Since all endpoint DLP tools include central policy management, deployment is fairly straightforward. There's little need to customize packages based on user, group, or other variables beyond the location of the central management server.

The rest of the agent's lifecycle, aside from major updates, is controlled through the DLP central management server. Agents should communicate regularly with the central server to receive policy updates and report incidents/activity. When the central management server is accessible, this should happen in near real time. When the endpoint is off the enterprise network (without VPN/remote access), the DLP tool will store violations locally in a secure repository that's encrypted and inaccessible to the user. The tool will then connect with the management server next time it's accessible, receiving policy updates and reporting activity. The management server should produce aging reports to help you identify endpoints that are out of date and need to be refreshed. Under some circumstances, the endpoint may be able to communicate remote violations through encrypted email or another secure mechanism from outside the corporate firewall.
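The store-and-forward behavior described here can be sketched in a few lines of Python. This is a simplified illustration, not how any particular agent works: the `Incident` and `IncidentSpool` names are hypothetical, the queue lives in memory, and the encryption and access protection a real agent needs for its local repository are omitted entirely.

```python
import time
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Incident:
    policy: str
    detail: str
    timestamp: float = field(default_factory=time.time)


class IncidentSpool:
    """Holds incidents while the management server is unreachable.

    Assumption: a bounded in-memory queue stands in for the agent's
    encrypted local repository. When the cap is reached, the oldest
    incident is dropped to make room for the newest.
    """

    def __init__(self, max_cached=1000):
        self._pending = deque(maxlen=max_cached)

    def record(self, incident):
        self._pending.append(incident)

    def flush(self, send):
        """Deliver queued incidents oldest first; stop at the first failure.

        `send` is a caller-supplied function that returns True when the
        server accepted the incident. Returns the number delivered.
        """
        delivered = 0
        while self._pending:
            if not send(self._pending[0]):
                break
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```

The agent would call `record` whenever a policy fires and `flush` each time it regains contact with the management server. Incidents are removed only after `send` succeeds, so a dropped connection partway through a flush leaves the remainder queued for the next attempt.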

Aside from content policy updates and activity reporting, there are a few other features that require central management. For content discovery, you'll need to control scanning schedule/frequency, as well as bandwidth and performance (e.g., capping CPU usage). For real-time monitoring and enforcement, you'll also want performance controls, including limits on how much space is used to store policies and the local cache of incident information.
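One simple way to picture a performance cap on discovery scanning is a duty-cycle throttle: after scanning each file, the agent idles long enough that scanning occupies only a set fraction of elapsed time. The Python sketch below is an assumption-laden illustration; the `throttled_scan` function and its parameters are hypothetical, and wall-clock time is only a rough proxy for actual CPU usage.

```python
import time


def throttled_scan(paths, scan_file, max_busy_fraction=0.25,
                   sleep=time.sleep, clock=time.perf_counter):
    """Scan each path, pausing so scanning stays under the busy fraction.

    `scan_file` is a caller-supplied function returning True when a file
    violates policy. `sleep` and `clock` are injectable for testing.
    """
    if not 0 < max_busy_fraction <= 1:
        raise ValueError("max_busy_fraction must be in (0, 1]")
    findings = []
    for path in paths:
        start = clock()
        if scan_file(path):
            findings.append(path)
        busy = clock() - start
        # Idle so that busy / (busy + idle) equals max_busy_fraction.
        idle = busy * (1 - max_busy_fraction) / max_busy_fraction
        if idle > 0:
            sleep(idle)
    return findings
```

With `max_busy_fraction=0.25`, a file that takes one second to scan is followed by a three-second pause. In a centrally managed product, a value like this would be pushed down from the management server along with the scan schedule rather than hard-coded on the endpoint.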

Although there have been some major acquisitions of DLP solutions by vendors with consolidated endpoint agents, the DLP agents are still usually separate executables. Even when they offer unified deployment and management, DLP is still another agent. The exception is endpoint-only tools, such as portable device control solutions, that extended functionality with DLP features.


Today's endpoint DLP solutions fall into three categories:

  • Endpoint agents of a full-suite DLP solution (combined network, storage/discovery, and endpoint DLP).
  • DLP features in another endpoint product (such as portable device control or encryption).
  • Dedicated endpoint DLP.

From a data security perspective, there's a definite advantage to full-suite DLP solutions since they allow you to define a single policy, and then apply it across multiple channels. Rather than having to define a data protection rule in an email tool, and then again in an endpoint tool, it's all managed with a single policy server. Incident handling is also centralized, so you don't have different administrators dealing with data security issues (probably using different processes) based on where the incident occurred. And due to the performance constraints of the endpoint, a full-suite solution supports the widest range of policies for the bulk of your users, with adaptive policies (if the product supports them) when those users leave the corporate network. The disadvantage of the full-suite's endpoint agents is that they often lack other useful functionality; for example, they may be weaker in controlling USB access.

When a non-DLP vendor adds DLP features, it typically starts by supporting only basic regular expressions or preset categories on top of its existing functionality. For example, a portable device control product will add filtering on files moving to USB, or an encryption tool will add content-based encryption. Although these products don't offer full DLP functionality, they are useful in smaller companies and organizations that only require some basic data protection on the endpoint, and don't want the cost or complexity of a full solution. This option also tends to lack a dedicated management interface for DLP, which could complicate incident handling.

Thanks to various acquisitions, there are only a couple of dedicated endpoint DLP tools on the market. They tend to support most, if not all, of the DLP content analysis techniques, and offer more functionality than DLP features in other endpoint tools. In some cases, they even offer a few extra endpoint features over those in a full-suite endpoint agent. The challenge is that they need to play games to replicate certain core DLP functions, such as email quarantine routing, and they still face the performance constraints of the endpoint.

No matter which category you select, and no matter what the vendors tell you, there are major functional differences between products. Few products support all four categories of functions, and there's a lot of practical variation even with the parts they support. It's important to really dig in with a potential vendor and ask them to demonstrate your most important use cases.

Deciding on the solution comes down to your data protection needs and budget. Full-suites with endpoint agents are the most expensive, but offer the broadest coverage, and usually the best content analysis. But we also see some organizations with specific endpoint needs that a full-suite agent lacks; for example, creating shadow files of everything transferred to a USB drive. And those of you on a limited budget with only basic needs might leverage DLP features in an existing endpoint product, and combine that with DLP features in other network tools, such as your email gateway.

Rich Mogull is analyst and CEO of consultancy Securosis. Send comments on this article to [email protected].
