Doug Jacobson and Julie A. Rursch
Published: 01 Aug 2013
When it comes to integrating information technology trends into the curriculums of many universities and colleges, the educational system has fallen behind the learning curve. This is true for big data education and, unfortunately, for the IT security needed to protect unstructured information.
The concepts related to the handling of large amounts of data are briefly touched on in courses that focus on databases or algorithms. But when big data is addressed in an algorithms class, it’s primarily as a justification for teaching different sorting algorithms, essentially, ordering lists in “big data” projects.
When universities do offer classes on big data, it is often as graduate-level coursework. Although few computer engineering or computer science classes focus specifically on big data, the concept shows up in other courses, such as bioinformatics, where processing big data is required to complete a task.
Given the void in big data education, it should come as no surprise that the security of big data is not covered in most curriculums. Even the newly proposed National Security Agency and Department of Homeland Security focus areas for the National Centers of Academic Excellence list big data security as an optional knowledge unit in three content areas.
Security of big data is important, but it is difficult to teach for many reasons, including the terminology, current security and monitoring systems, and physical infrastructure. And that’s just for starters. First and foremost, it is hard to classify what is meant by the term “big data.” The term implies incomplete knowledge of what data points may be in the storage set, and trying to secure that which is unknown is difficult. Think about data loss prevention: it is difficult, if not impossible, to tell whether sensitive data is leaving the facility when the data isn’t enumerated.
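To make the data loss prevention point concrete, here is a minimal sketch in Python. It assumes a hypothetical inventory of enumerated sensitive patterns (`ENUMERATED_PATTERNS`) and a hypothetical `scan_outbound` check; neither comes from a real product. The check can only flag what has been enumerated in advance, so sensitive data of an unknown type passes unnoticed.

```python
import re

# Hypothetical inventory: the only sensitive data types that have been enumerated.
ENUMERATED_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}


def scan_outbound(payload: str) -> list[str]:
    """Return the names of enumerated sensitive data types found in the payload."""
    return [
        name
        for name, pattern in ENUMERATED_PATTERNS.items()
        if pattern.search(payload)
    ]


# A payload containing an enumerated type is flagged.
known = "Customer SSN: 123-45-6789"
# A made-up record that may be sensitive but was never enumerated.
unknown = "patient_genome_marker=rs429358:C/C"

print(scan_outbound(known))    # flagged as "ssn"
print(scan_outbound(unknown))  # nothing flagged: the check cannot see it
```

The second call returns an empty list, which is exactly the blind spot described above: the scanner is only as complete as the inventory behind it.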
We’re not teaching big data security. But in our defense, how can we secure something that is hard to classify? Furthermore, how can we teach others how to secure it? The new classification of big data presents a basic problem that needs resolution before we provide solutions.
New security methods
Does the new classification of big data mean new security methods are warranted, or can we use methods that are currently deployed, only on a larger scale? In the case of big data, we argue that the size and complexity require more than just scaling current data security methods.
If we can get beyond the terminology and lack of knowledge, we need to rethink the implementation of security and monitoring systems in big data situations. In current security and monitoring systems, writing to and reviewing log files is the primary technique used to capture events and indicate when security breaches are attempted or have succeeded. In today’s world, we hear lamentations of how large log files grow and how difficult it is to separate the useful data from the noise, even with the help of a vendor’s product. In the world of big data, the complexity of security and monitoring systems only grows exponentially.
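As a small illustration of log-based monitoring, the following Python sketch counts failed logins per source address and reports sources at or above a threshold. The log format, the sample lines and the `failed_logins_by_source` helper are all assumptions made for illustration, not the output or API of any particular tool.

```python
from collections import Counter

# Hypothetical log lines; the format is an assumption for illustration.
LOG_LINES = [
    "2013-08-01T10:00:01 sshd INFO session opened for user alice from 10.0.0.5",
    "2013-08-01T10:00:02 sshd WARN failed password for root from 203.0.113.9",
    "2013-08-01T10:00:03 cron INFO job backup started",
    "2013-08-01T10:00:04 sshd WARN failed password for root from 203.0.113.9",
    "2013-08-01T10:00:05 sshd WARN failed password for admin from 203.0.113.9",
    "2013-08-01T10:00:06 sshd WARN failed password for bob from 198.51.100.7",
]


def failed_logins_by_source(lines, threshold=3):
    """Count failed-password events per source and keep those at or above threshold."""
    counts = Counter(
        line.rsplit(" from ", 1)[1]  # source address follows the last " from "
        for line in lines
        if "failed password" in line
    )
    return {source: n for source, n in counts.items() if n >= threshold}


print(failed_logins_by_source(LOG_LINES))
```

With six lines, a single pass and one hard-coded rule are enough. At big data volumes the same approach has to cope with many log formats, many rules and far more noise than signal, which is where simply scaling current methods starts to break down.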
Although many factors complicate big data security, one final issue we want to note is that big data often lives in the cloud. Therefore, discussions about security methods for big data must include cloud security. Neither of these topics is mature, and organizations taking security measures will need to consider how these measures will work with cloud data.
From the educational perspective, we believe that teaching big data security starts with the fundamentals of data security that are taught in all security programs. There is no stronger foundation for big data security discussions than a deep and broad understanding of security concepts; however, the additional complexities that big data adds to the problem of security need to be included in the curriculum.
While we believe the best way for students to learn is through laboratory experiments or simulations, developing big data security exercises may prove more difficult than developing traditional security exercises. Even if we assume that a definition of big data could be developed and universally accepted, we still see obstacles to overcome. Currently, students work with intrusion detection and data loss prevention, but not in a big data environment. And, we have found, they really aren’t prepared to handle the massive amount of data that pours in from security devices, network monitoring and data loss monitors. Laboratory experiments have to be carefully crafted so that they do not overwhelm students but still provide the look and feel of big data.
No meaningful data
Unfortunately, access to realistic and meaningful data is difficult in higher education. We cannot get access to real big data because, in many cases, it is private. We need to develop example big data sets in which the data types match different industries. This is a perfect place for academia to partner with vertical industries or industry trade groups to develop these data sources. And we, as educators, need to be innovative in combining cloud and big data security concepts and encouraging our students to think about these topics.
So, what can we realistically hope to accomplish in the area of big data security education? We would hope that as educators we can help our students learn the fundamentals needed to adapt to ever-changing threats and technologies. While today’s topics are big data and cloud security, tomorrow’s topics are unknown. As educators we are bound to include the most current security topics and issues, such as big data and cloud security, for our students. However, we must also strive to educate our students so they can adapt to changes once they leave our hallowed halls.
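One way to picture such example data sets is a seeded synthetic generator with a schema per industry. The Python sketch below is only an assumption about how this might look: the `SCHEMAS` table, its field names and value ranges, and the `synthetic_records` helper are hypothetical and are not drawn from any real industry data.

```python
import random

# Hypothetical schemas: field names and value ranges are illustrative assumptions.
SCHEMAS = {
    "healthcare": {
        "patient_id": lambda rng: f"P{rng.randrange(10**6):06d}",
        "heart_rate": lambda rng: rng.randint(50, 120),
        "ward": lambda rng: rng.choice(["ICU", "ER", "General"]),
    },
    "retail": {
        "order_id": lambda rng: f"O{rng.randrange(10**8):08d}",
        "amount_usd": lambda rng: round(rng.uniform(1.0, 500.0), 2),
        "channel": lambda rng: rng.choice(["web", "store", "mobile"]),
    },
}


def synthetic_records(industry, count, seed=0):
    """Yield `count` synthetic records for an industry, reproducible for a given seed."""
    rng = random.Random(seed)  # private generator so results repeat per seed
    schema = SCHEMAS[industry]
    for _ in range(count):
        yield {field: make(rng) for field, make in schema.items()}


for record in synthetic_records("healthcare", 3, seed=42):
    print(record)
```

Because the generator is lazy and seeded, an instructor could scale `count` up for the look and feel of big data while still giving every student the same reproducible data set. Realistic distributions and correlations would still have to come from the industry partners mentioned above.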
Doug Jacobson is a professor in the department of electrical and computer engineering at Iowa State University and director of the Information Assurance Center, which was one of the original seven NSA-certified centers of academic excellence in information assurance education.
Julie A. Rursch is a lecturer in the department of electrical and computer engineering at Iowa State University and director of the Iowa State University Information Systems Security Laboratory, which provides security training, testing and outreach to support business and industry.