There’s a lot of hype about “big data” and Hadoop in our organization right now. What are the key network security considerations we should think about with Hadoop network design and deployment?
Ask the Expert!
Have questions about network security for expert Matt Pascucci? Send them via email today! (All questions are anonymous.)
Since “big data” is a hot topic these days, there’s no question an increasing number of enterprise infosec teams are going to be asked about the security-related ramifications of big data projects. There are many issues to look into, but here are a few tips for making big data projects more secure during the architecture and implementation phases:
- Create data controls as close to the data as possible, since much of this data isn’t “owned” by the security team. The risk of having big data traversing your network is that you have large amounts of confidential data, such as credit card data, Social Security numbers and personally identifiable information (PII), residing in new places and being used in new ways. You’re usually not going to see terabytes of data siphoned from an organization; the more likely concern is someone searching these databases for patterns that pinpoint the sensitive content. Keep the security as close to the data as possible and don’t rely on firewalls, IPS, DLP or other systems to protect the data.
- Verify that sensitive fields are indeed protected by encryption, so that when the data is analyzed, manipulated or sent to other areas of the organization, you’re limiting the risk of exposure. All sensitive information needs to be encrypted once you have control over it.
- After you’ve made the move to encrypt data, the next logical step is to concern yourself with key management. There are a few new ways to perform key management, including creating keys on an as-needed basis so you don’t have to store them.
- In Hadoop designs, review the HDFS permissions of the cluster and verify that all access to HDFS is authenticated. When first implemented, Hadoop frameworks were notoriously bad at authenticating users and services, which allowed a user to impersonate other users or even the cluster services themselves. You can authenticate to the Hadoop framework using Kerberos, which can be used with HDFS access tokens to authenticate to the name node.
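As a starting point for the HDFS permissions review in the last tip, the sketch below scans the text output of a recursive listing (for example, from `hdfs dfs -ls -R /`) and flags paths that any user can read or write. The function name `audit_listing`, the `Finding` type and the sample paths are illustrative assumptions, not part of Hadoop, and the parser assumes the usual eight-column listing layout (mode, replication, owner, group, size, date, time, path).

```python
from typing import List, NamedTuple


class Finding(NamedTuple):
    path: str
    owner: str
    mode: str
    reason: str


def audit_listing(listing: str) -> List[Finding]:
    """Flag entries whose 'other' permission bits grant read or write."""
    findings = []
    for line in listing.splitlines():
        # Columns: mode, replication, owner, group, size, date, time, path.
        parts = line.split(None, 7)
        # Skip headers such as "Found 3 items" and anything malformed.
        if len(parts) != 8 or len(parts[0]) < 10 or parts[0][0] not in "-d":
            continue
        mode, _repl, owner, _group, _size, _date, _time, path = parts
        other = mode[7:10]  # the three permission characters for "other"
        if "w" in other:
            findings.append(Finding(path, owner, mode, "world-writable"))
        elif "r" in other:
            findings.append(Finding(path, owner, mode, "world-readable"))
    return findings


# Hypothetical listing used only to exercise the parser.
SAMPLE = """Found 3 items
drwxr-xr-x   - hdfs  supergroup          0 2013-05-01 10:15 /data/public
drwxrwxrwx   - etl   analytics           0 2013-05-01 10:16 /data/staging
-rw-r-----   3 etl   analytics    10485760 2013-05-01 10:17 /data/cards/batch-001.csv
"""

if __name__ == "__main__":
    for finding in audit_listing(SAMPLE):
        print(f"{finding.reason:15} {finding.mode} {finding.owner:6} {finding.path}")
```

A report like this only covers file permissions. It says nothing about whether the cluster actually authenticates the callers, so it complements the Kerberos check rather than replacing it.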
There are many other areas of security in big data systems like Hadoop, but authentication, encryption and permissions are three of the largest areas of concern during the big data architecture phase. As with most IT projects, building security in from the beginning is always better than trying to add security later.
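The key management tip above mentions creating keys on an as-needed basis so they don’t have to be stored. The answer doesn’t name a specific technique; one common way to get that property is to derive each key on demand from a single protected master secret. The sketch below does this with HKDF (RFC 5869) built from Python’s standard `hmac` and `hashlib` modules. The helper names `hkdf_sha256` and `field_key`, and the dataset and field labels, are assumptions made for illustration.

```python
import hashlib
import hmac


def hkdf_sha256(master_key: bytes, info: bytes, salt: bytes = b"",
                length: int = 32) -> bytes:
    """HKDF with SHA-256: extract a pseudorandom key, then expand it."""
    if length > 255 * 32:
        raise ValueError("requested key is too long for HKDF-SHA256")
    # Extract step: an empty salt is replaced by a block of zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, master_key, hashlib.sha256).digest()
    # Expand step: chain HMAC blocks until enough output is available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def field_key(master_key: bytes, dataset: str, field: str) -> bytes:
    """Derive a 32-byte key bound to one dataset and field (hypothetical labels)."""
    info = f"{dataset}\x00{field}".encode("utf-8")
    return hkdf_sha256(master_key, info)


if __name__ == "__main__":
    # Placeholder secret; a real master key would come from an HSM or key service.
    master = b"\x01" * 32
    print(field_key(master, "payments", "card_number").hex())
```

The derived keys never need to be written to disk because the same inputs always reproduce them. The trade-off is that the master secret becomes the single item that must be protected and rotated carefully.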