What should be done with a RAID-5 array's failed drives?

Even one failed drive in a RAID-5 array can present an enterprise with serious data protection concerns. In this SearchSecurity.com Q&A, expert Michael Cobb explains which policies can protect and recover RAID-5 data.

I have had a failed drive in a RAID-5 array. The drive is dead and cannot have a drive wipe performed on it. The drive is under warranty and needs to be sent back to the server manufacturer. Is the data that can be recovered from the single drive a security concern? Everything I can find indicates two drives would be needed to retrieve any information from the drive. Is that true?

Yes, the data on your failed drive is a security issue: some of it can be recovered, and you're right to be concerned about it. Before you send the drive back to the manufacturer, check what confidentiality and non-disclosure policies the vendor has in place. If the drive will not be returned to you, you also need to know what the vendor's destruction policy is.

When it comes to RAID-5 data recovery, you're assuming that you need two drives out of a three-drive set in order to restore all your files. But the key word here is "all." If files are below a certain size, useful data can be recovered from just one disk. Let me explain by examining how RAID-5 stores your data.

Fundamental to RAID-5 is data striping. When your computer saves data to a RAID-5 array of disks, the data is divided up into segments, and the segments are written across the drive array in sequence. So, for example, the first 32 KB would be written to disk one, the next 32 KB would be written to disk two, and so on. Similarly, when a computer reads a file, the multiple pieces of data from each disk drive are extracted and reassembled to create the file.
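The striping described above can be sketched in a few lines of Python. This is only an illustration: the function name `disk_for_offset` is hypothetical, disks are numbered from 0 rather than 1, and parity placement is deliberately ignored, so it shows plain striping rather than a full RAID-5 layout.

```python
STRIPE_UNIT = 32 * 1024  # bytes per segment, matching the 32 KB example


def disk_for_offset(offset, num_disks, stripe_unit=STRIPE_UNIT):
    """Map a logical byte offset to (disk_index, offset_on_that_disk).

    Simplified model: segments are dealt out to the disks in sequence
    and parity is ignored.
    """
    segment = offset // stripe_unit      # which segment the byte falls in
    disk = segment % num_disks           # segments rotate across the disks
    row = segment // num_disks           # how many full passes came before
    return disk, row * stripe_unit + offset % stripe_unit


# The first 32 KB go to the first disk, the next 32 KB to the second.
print(disk_for_offset(0, 3))           # (0, 0)
print(disk_for_offset(32 * 1024, 3))   # (1, 0)
```

Reading a file back is the reverse process: the controller computes the same mapping for each segment and reassembles the pieces in order.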

Stripe size refers to the size of the single data unit that is written to each disk. The performance of a RAID-5 array can be tuned by finding a stripe size that is well-matched to the type of application being used. For example, on-demand video services or data-intensive applications that access large records should use small stripes so that each file or record will span all the drives in the array. If the data transfer occurs across multiple drives, large amounts of data can be accessed at a greater speed.

RAID-5 also uses distributed parity. Parity is a fault-tolerance feature: for each stripe, the array stores redundant information calculated from that stripe's data segments. Parity data is distributed among the drives, and when one drive fails, the parity information can be combined with the data on the surviving drives to rebuild the contents of the failed disk.
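As a rough sketch of how a rebuild works, RAID-5 parity is computed with a bitwise XOR across the data segments of a stripe, so any one missing segment can be recomputed from the others. The helper name `xor_blocks` and the sample data are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data segments from one stripe (toy-sized for readability).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second segment, then rebuild it from the
# surviving segments plus the parity segment.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

Note that the parity only helps when the other drives are available. The data segments on a single drive are stored as-is, which is why a lone drive can still leak information.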

Larger files will be spread across the disks in your RAID-5 array, and in a three-disk array, you would need two disks to recover those files. But what about smaller files, or pieces of data within larger files? A malicious hacker, for example, may only want the username and password from an email, not the rest of the message. The figure below shows how files of different sizes could be distributed across drives in a four-disk RAID-5 array. If drive 2 were to fail, certain data on it would still be accessible if the stripe size used is 16 KB. File 1 is 4 KB and therefore fits entirely onto drive 2, while File 2, which is 20 KB, almost fits onto one drive as well. A low-level disk reader would be able to read all of File 1 and segments of the other files. Therefore, it's necessary to treat a failed drive with the same care that you would any other data drive.
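To make the exposure concrete, the sketch below counts how many bytes of a file end up on each disk of a four-disk array with a 16 KB stripe size. The layout is an assumption for illustration only (parity rotates from the last disk toward the first, and data segments fill the remaining disks in ascending order); real controllers differ, and the file offsets used here are hypothetical rather than taken from the figure. Disks are numbered from 0, so index 1 corresponds to "drive 2".

```python
KB = 1024


def bytes_per_disk(start, size, num_disks=4, stripe_unit=16 * KB):
    """Count how many bytes of a file land on each disk.

    Assumed layout: each row holds num_disks - 1 data segments plus one
    parity segment; parity starts on the last disk and moves one disk
    toward the first on each row.
    """
    counts = [0] * num_disks
    data_per_row = num_disks - 1
    offset, end = start, start + size
    while offset < end:
        segment = offset // stripe_unit
        row, slot = divmod(segment, data_per_row)
        parity_disk = (num_disks - 1 - row) % num_disks
        # Data segments skip over whichever disk holds parity in this row.
        disk = slot if slot < parity_disk else slot + 1
        segment_end = (segment + 1) * stripe_unit
        n = min(end, segment_end) - offset
        counts[disk] += n
        offset += n
    return counts


# A 4 KB file that happens to start at logical offset 16 KB sits
# entirely on one disk.
print(bytes_per_disk(16 * KB, 4 * KB))    # [0, 4096, 0, 0]

# A 20 KB file at the same offset keeps 16 of its 20 KB on that disk.
print(bytes_per_disk(16 * KB, 20 * KB))   # [0, 16384, 4096, 0]
```

Under these assumptions, anyone who can read the raw contents of that one disk gets the whole 4 KB file and most of the 20 KB file without touching the rest of the array.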


Figure 1: File distribution in a four-disk RAID-5 array

More information:

  • Learn how to enforce a data destruction policy.
  • Secure data with full-disk encryption.
