In a recent report, Nemertes Research found that 23% of organizations surveyed were deploying big data frameworks
in their security operations. Why? Because they are already drowning in security data and are not even looking at everything they could. A decade ago, the increasing volume and complexity of firewall, router and other security log streams led to the invention of security information and event management (SIEM) systems. But before most organizations even got a SIEM system in place and tuned to their needs, the number and volume of security log streams began to outstrip the capacity of the tools for assessing them. Consequently, the promise of SIEM systems was undercut by a narrowing focus: IT was able to push more and more data through them, but that growing stream represented less and less of the total universe of security data IT could have put through them.

When data got too big

This course of events led, in turn, to the application of then-emerging big data techniques to security. At first, most security systems that used big data frameworks were homegrown and based on Hadoop. The goal, as with SIEM, was to run analyses that looked across all the available streams of data so that security staff could improve their overall security posture.

But who or what protected the data and systems used for this analysis? In most cases, no one and nothing. Data was amassed on generic server images used as part of a Hadoop farm with no special controls, as though the data they contained was of low risk and little importance. However, the data amassed should be expected to be sensitive at the least, and in some cases confidential and critical. At a minimum, it will likely include logging data on who is using what, from where, when and perhaps even what they are doing while logged in. Depending on logging settings and on how widely the net is cast, it could easily come to include snippets of ordinarily protected data: Social Security or credit card numbers, employee or customer addresses and so on. As sensor nets spread and are connected, IT could even wind up collecting information about who is located where in a building (e.g., from card key systems). And, of course, data lakes that can show you how people are breaking into your organization could just as easily show bad guys how to do so.
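One way to limit the exposure described above is to mask obviously protected values before log lines ever reach the analytics cluster. The following is a minimal Python sketch of that idea, not a description of any particular product: the function names (`redact`, `luhn_valid`), the two patterns and the replacement tokens are all assumptions made for illustration, and real log streams would need broader, locale-aware rules.

```python
import re

# Illustrative patterns only (assumptions, not a complete rule set).
# US Social Security number written as 123-45-6789.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(line: str) -> str:
    """Mask SSN-like and card-like numbers in a single log line."""
    line = SSN_RE.sub("[REDACTED-SSN]", line)

    def mask_card(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        # Only mask digit runs that pass Luhn, to reduce false positives
        # on ordinary numeric identifiers.
        return "[REDACTED-CARD]" if luhn_valid(digits) else match.group()

    return CARD_RE.sub(mask_card, line)


if __name__ == "__main__":
    print(redact("user=jdoe ssn=123-45-6789 card=4111 1111 1111 1111 status=ok"))
```

Masking at ingestion reduces what an intruder could harvest from the data store, but it does not replace access controls on the cluster itself; the who, where and when in the remaining log data is still sensitive.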