In December 2010, Honda experienced a data breach that affected 2.2 million customers. Names, email addresses, vehicle identification numbers (VINs), and credentials for a Honda portal were stolen from a database. The database, however, was not inside Honda’s infrastructure; the information was stolen from a cloud-based marketing service provider that Honda did business with.
A year ago, cloud storage provider Dropbox pushed a code change that eliminated the password authentication required to access users’ stored data, leaving the data in any account accessible to anyone who wanted it. In addition, Dropbox drew criticism for maintaining control of users’ encryption keys, potentially making accounts and data susceptible to compromise should those keys fall into the wrong hands. Also last year, Amazon’s Simple Storage Service (S3) was found to be susceptible to a basic HTTP-focused brute-force attack that could expose customers’ data storage accounts.
As more systems, applications and data are moved into cloud provider environments, these types of stories are likely to become more common. How can users and organizations keep their data safe when moving to a cloud provider environment? Find out about data protection in the cloud as we detail the technologies and challenges you’ll encounter as your organization sends and consumes data from the cloud, and learn what you need to do to protect it.
Technology challenges to consider
Within virtualized environments, numerous virtual machines are housed on a single physical system, a condition known as multi-tenancy. The hypervisor software is responsible for maintaining segmentation and isolation between virtual machines. This can be augmented with open source or commercial virtual network and virtual security appliances or add-ons. However, multi-tenancy still makes it harder to follow traditional security best practices such as separation of duties and system segregation.
- Policy: Different virtual systems and data sets may have widely differing classifications and sensitivity levels. To ensure the proper security policy is applied to sensitive data, systems and applications that store or process this data are often kept physically separate from others. However, in a multi-tenant environment such as the cloud, this may not be feasible. In addition, enforcing internal policies related to data handling and access control may be difficult when migrating systems and applications to a cloud provider. This can be a problem when integrating public cloud services with an existing private cloud (a hybrid cloud scenario), as well as during a wholesale migration of data and systems to a public cloud environment.
- Encryption: Encryption can be challenging to implement internally due to key management and maintenance, performance issues, and access controls. Extending internal encryption platforms and capabilities into the cloud can seem daunting at best. For example, how will administrators manage encryption keys for data and systems in the cloud? When encryption keys need to be generated or revoked, how can this easily be accomplished for resources hosted elsewhere? Will cloud service providers (CSPs) need access to keys, and what kinds of risk will this introduce? For hybrid clouds, handling encryption may be less of an issue, but moving to a public cloud may pose significant challenges.
- DLP: Data loss prevention (DLP) requires a number of distinct technologies and processes to be effective. First, sensitive data needs to be fingerprinted so DLP monitoring tools can recognize the data based on string matching, file types and other attributes. Second, a centralized policy creation and implementation infrastructure needs to be in place to push policy to DLP monitoring tools, and these monitoring tools need to be in place to inspect traffic on network segments and critical host systems alike. Finally, quarantine and response measures should be implemented to take a variety of actions when a potential policy violation is detected. Implementing this in virtualized environments may be problematic due to resource constraints that result from installation of DLP software agents, or lack of virtualization integration options. Extending DLP to a CSP infrastructure may be difficult, especially in a multi-tenant environment where granular data protection policies are not available.
- Monitoring: Security monitoring techniques using intrusion detection, network flow analysis tools, and host-based agents are common in internal data centers. However, ensuring systems are properly monitored in the cloud is a different story. In many cases, cloud providers may not allow or support advanced monitoring technologies or processes, although some may offer this as a service.
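The three DLP stages described in the list above (fingerprinting, centrally defined policy, and a response action) can be sketched in a few lines of Python. This is a minimal illustration, not how any commercial DLP product works: the `Policy` class, the rule names, the VIN regular expression and the `quarantine` and `alert` actions are all assumptions made for the example.

```python
import hashlib
import re
from dataclasses import dataclass, field


def fingerprint(document: str) -> str:
    """Exact-match fingerprint: SHA-256 of the whitespace-normalized, lowercased text."""
    normalized = " ".join(document.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class Policy:
    """Hypothetical central policy that would be pushed to each monitoring agent."""
    patterns: dict = field(default_factory=dict)      # rule name -> compiled regex
    fingerprints: dict = field(default_factory=dict)  # fingerprint -> rule name
    actions: dict = field(default_factory=dict)       # rule name -> response action

    def inspect(self, payload: str) -> list:
        """Return (rule, action) pairs for every rule the payload violates."""
        hits = []
        fp = fingerprint(payload)
        if fp in self.fingerprints:
            hits.append(self.fingerprints[fp])
        for rule, regex in self.patterns.items():
            if regex.search(payload):
                hits.append(rule)
        # Rules without an explicit action default to raising an alert.
        return [(rule, self.actions.get(rule, "alert")) for rule in hits]


# Illustrative rules only; a VIN is 17 characters and excludes I, O and Q.
policy = Policy(
    patterns={"vin": re.compile(r"\b[A-HJ-NPR-Z0-9]{17}\b")},
    actions={"vin": "quarantine", "pricing-sheet": "alert"},
)
policy.fingerprints[fingerprint("Confidential 2012 dealer pricing sheet")] = "pricing-sheet"
```

Calling `policy.inspect(...)` on outbound text returns the matching rules and the action each one requests. A host-based agent on a cloud-hosted virtual machine would need the same policy object delivered to it, which is why connectivity back to the policy and alerting systems matters.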
Encryption in the cloud
Fortunately, numerous data protection options are emerging for cloud environments. The first option for data protection in the cloud is encryption. A variety of new solutions and tools can help organizations control the encryption keys, policies, authentication and authorization associated with data protection in cloud environments, where data and systems are dynamically migrated across platforms and even distinct data centers. For example, Amazon Web Services (AWS) has a number of features that allow users to control encryption keys and access methods. When new AWS user accounts are created, they are provided an access key that allows RESTful and Query protocol requests to AWS APIs. Users can also create X.509 certificates that provide SOAP access to Amazon APIs, or a public-private key pair can be generated, with only the user retaining the private key (as in all asymmetric cryptosystems). Certificates and access keys can be rotated easily, and multiple keys and certificates can be used concurrently to access AWS accounts.
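To make the access-key model more concrete, the following is a minimal sketch of how a secret key can sign an API request and how two keys can remain valid side by side during rotation. It is a generic HMAC-SHA256 scheme written for illustration, not the actual AWS signing protocol; the key IDs, the canonical string layout and the `KeyRing` class are assumptions.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str, timestamp: str) -> str:
    """Sign a simplified canonical request string with HMAC-SHA256."""
    canonical = "\n".join([method.upper(), path, timestamp])
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


class KeyRing:
    """Hypothetical server-side store that accepts several active keys at once."""

    def __init__(self):
        self._keys = {}  # key ID -> secret

    def add(self, key_id: str, secret: bytes) -> None:
        self._keys[key_id] = secret

    def revoke(self, key_id: str) -> None:
        self._keys.pop(key_id, None)

    def verify(self, key_id, method, path, timestamp, signature) -> bool:
        secret = self._keys.get(key_id)
        if secret is None:
            return False  # unknown or revoked key
        expected = sign_request(secret, method, path, timestamp)
        # compare_digest avoids leaking information through timing differences.
        return hmac.compare_digest(expected, signature)


ring = KeyRing()
ring.add("key-old", b"old-secret")
ring.add("key-new", b"new-secret")  # both keys are valid during the rotation window

sig = sign_request(b"new-secret", "get", "/buckets/reports", "2012-01-15T10:00:00Z")
```

Once every client has switched to the new key, `ring.revoke("key-old")` retires the old one without interrupting access. A production scheme would also reject stale timestamps to prevent replay, which this sketch omits.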
In private cloud and Infrastructure as a Service (IaaS) provider environments, there are several options for encrypting data that minimize the need to redesign applications and re-architect system and network design. These include the following:
- Volume-based encryption: While storage volumes are unmounted or offline (as backups, for example), data is encrypted and unreadable without explicit access using encryption keys. However, when cloud data volumes are online, any authenticated user can access data on the volume. This may be highly impractical in a multi-tenant environment unless providers manage access to volumes per cloud instance. In most provider environments, managing storage volume security options will be a significant amount of work, because each customer would need specific encryption options, availability scenarios and access types.
- Application-specific encryption: Custom applications may include encryption with keys and certificates, and it is often incumbent on developers to ensure that key portability and encryption continuity are maintained when applications are moved to a cloud provider environment. In Platform as a Service (PaaS) environments, encryption APIs may be made available. In Microsoft Azure, for example, the SDK for developers exposes all common hashing functions such as MD5 and SHA1, as well as major .NET encryption libraries and capabilities. However, SQL Azure does not have significant encryption support, which may be a severe limitation to developers looking to leverage cloud-based database services in conjunction with Azure-hosted applications.
- File encryption: File encryption is likely the most flexible type of encryption for use within virtualized and cloud environments. Encryption is applied at the source, and managed by customers or third-party providers that act as “proxies” for key management and encryption policy application. Examples of cloud-based key management providers include Voltage Security and Trend Micro.
In addition to the built-in capabilities providers offer, a number of vendors offer products that may simplify cloud-based data encryption or protection of virtual machines. High Cloud Security is one company offering policy-based encryption for entire virtual machines, and the VMs stay encrypted when moved throughout a cloud provider’s environment. All key management and role-based access is defined locally before moving to the cloud, greatly simplifying the ability to migrate VMs without checking compatibility requirements in the CSP infrastructure.
Additional cloud-focused encryption providers include CipherCloud and Vormetric. CipherCloud provides a virtual appliance called the CipherCloud Security Gateway that natively integrates with cloud services such as Salesforce.com and Google Apps. This appliance can provide encryption, key management, tokenization, and user monitoring functionality, among other features. Vormetric, a more traditional encryption solution provider, has adapted its enterprise encryption and key management platforms to extend this functionality into Amazon and other CSP environments.
Information lifecycle in the cloud
Another critical element of data protection in the cloud involves the data lifecycle. Whether data is encrypted or not, customers should have a clearly defined data lifecycle, and ensure CSPs can maintain and support this, especially in the case of a business failure or other critical situation that could expose sensitive information. A reasonable lifecycle approach should include the following:
- Retention: CSPs should state how long they retain data that relates to customer instances and applications. In many cases, this may be log data or other related information that potentially contains sensitive details about customer activities.
- Disposal: Under what circumstances do CSPs dispose of customer data? If the CSP goes out of business, or some other unusual scenario comes to fruition, contractual language should protect the customer by stating that CSPs will dispose of data in a secure manner. This may consist of destroying physical drives or using degaussing or disk wiping software.
- Classification: Data classification can be simple to define, yet challenging to implement. For sensitive data within a cloud environment, organizations may want to ensure the data is appropriately segmented by using dedicated hypervisor platforms or systems versus traditional multi-tenant scenarios. Most providers offer virtual private clouds or standalone cloud servers for an additional cost, and this may be the best option for highly sensitive data.
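The retention, disposal and classification points above can be written down as a machine-checkable policy, which makes it easier to audit what a provider still holds. The sketch below is illustrative only: the classification labels, the retention periods and the `StoredObject` record are assumptions, not terms taken from any CSP contract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical retention periods per data classification.
RETENTION = {
    "public": timedelta(days=365),
    "internal": timedelta(days=180),
    "sensitive": timedelta(days=90),
}


@dataclass
class StoredObject:
    name: str
    classification: str
    created: datetime


def due_for_disposal(objects, now):
    """Return the names of objects whose retention period has elapsed."""
    overdue = []
    for obj in objects:
        # Unknown classifications fall back to the strictest (shortest) period.
        limit = RETENTION.get(obj.classification, min(RETENTION.values()))
        if now - obj.created > limit:
            overdue.append(obj.name)
    return overdue


inventory = [
    StoredObject("access.log", "sensitive", datetime(2012, 1, 1)),
    StoredObject("brochure.pdf", "public", datetime(2012, 1, 1)),
    StoredObject("export.csv", "unlabeled", datetime(2012, 3, 1)),
]
```

Running `due_for_disposal(inventory, now)` against an inventory supplied by the provider gives the customer a list to reconcile with the provider's disposal records. Treating unlabeled data as the most sensitive class is a design choice in this sketch, made so that classification gaps err toward earlier disposal.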
DLP in the cloud
Data loss prevention is another common data protection technology that may require adaptation for virtualized and cloud environments. Following are several key considerations related to cloud DLP:
- Policy and monitoring: Host- and network-based DLP products need to fingerprint sensitive data before they’ll be capable of detecting and preventing potential breaches. For customers who employ host-based DLP agents, software agents with a pre-existing policy can run on virtual machines in the cloud as long as the agent can communicate with policy and alerting systems. Network-based DLP may not translate effectively to a public cloud in any sense, as any monitoring tools in a CSP environment would need to be tuned to each customer’s data types and usage patterns. In a private cloud, and potentially in a hybrid cloud, DLP policies and monitoring can likely operate normally, as long as the DLP technology is compatible with the virtualization platforms in use. Most major DLP product vendors, including McAfee and Symantec, support DLP agents on virtual machines. Network monitoring may require some architecture redesign, however, to ensure traffic from virtual switches is supported. Some providers such as Trend Micro and Palisade Systems offer DLP virtual appliances that can integrate into virtualized networks.
- Incident detection and management: One challenge with cloud-based DLP is the need to tightly integrate into an incident response program. Many CSPs do not provide in-house incident response services for customers, and others may not be able to adequately support event notification service-level agreements (SLAs) that trigger customers’ incident response programs internally. This means any DLP detection or prevention actions taken in the cloud, most likely from a software agent on IaaS-hosted virtual machines, may not quickly lead to investigations from either CSP or customer IR teams.
- Provider DLP Controls: Technologies such as Websense Cloud DLP are attempting to integrate traditional DLP policies and monitoring with SaaS cloud solutions such as Salesforce.com, as well as PaaS and IaaS cloud options such as Azure and AWS. Cloud-based security service providers like Zscaler are offering DLP services specific to their hosted email and Web analysis services, which may be a good option for customers looking to outsource DLP entirely. Unfortunately, major CSPs do not offer robust DLP options that are the equivalent of customers’ in-house DLP today. Another point to consider is the CSP’s own internal controls (including DLP), given the potential access to customer data and systems by CSP staff. For this, look to a CSP’s SAS 70 or SSAE 16 report on internal controls to ensure DLP or other protective technologies are in place internally.
In addition to DLP and encryption, there are a number of other virtualization security tools and controls that can be implemented to help with data protection. These include virtual protection appliances such as Juniper’s vGW series (which provides virtual firewall, intrusion detection and prevention, and policy-based virtualization isolation) and HyTrust’s security appliance that enables control and audit over administration of the entire virtualization infrastructure with a focus on policy and compliance. Numerous security configuration guides, such as those from VMware, Microsoft, Center for Internet Security (CIS), and Defense Information Systems Agency (DISA), can be leveraged to lock down virtualized components.
Many CSPs also offer numerous data protection tools and services that may be of interest. For example, Terremark (a Verizon company) offers managed IDS/IPS, firewall and application firewall, log aggregation and analysis, and incident response services. Akamai offers cloud-based DDoS and Web application firewall services, among others. Even Amazon has some basic firewall and access controls for users built in, although these capabilities are limited and should ideally be augmented with other security products and services.
Ultimately, the state of data protection in cloud environments is still somewhat immature. Most enterprise DLP products support virtualization technology to some extent, but this does not mean these virtualized systems and applications can be easily extended into hybrid and public clouds without losing data protection capabilities. New services are emerging that can help to protect data in SaaS and PaaS provider environments, although these are currently somewhat limited to email and Web traffic.
Encryption is a more reasonable option for many to secure data in the cloud, with varying degrees of support from CSPs and numerous implementation methods ranging from encryption of entire virtual machines to file and folder encryption and application-specific encryption in VMs and SaaS and PaaS environments. As more organizations consider migrating data and applications to cloud providers, CSPs will need to enable broader support for encryption and DLP technologies. This will allow customers to ensure strong data protection controls are in place wherever their data is.
Dave Shackleford is owner and principal consultant at Voodoo Security, senior vice president of research and CTO at IANS, and a SANS analyst, instructor, and course author. He is a VMware vExpert and has extensive experience designing and configuring secure virtualized infrastructures. He coauthored the first published course on virtualization security for the SANS Institute, serves on the board of directors at the SANS Technology Institute and helps lead the Atlanta chapter of the Cloud Security Alliance. Send comments on this article to firstname.lastname@example.org.