Ask the Expert

Should confidential data be indexed or used as the index key?

I've read that confidential database data shouldn't be indexed or used as the index key. What does that mean, and what best practices should I employ to ensure that this isn't a problem in my organization?

Database indexes are much like the index in a textbook: they provide quick reference points for finding requested data, which reduces the work the database server has to do and speeds up retrieval. In a relational database, every table should have an indexed primary key whose sole purpose is to give each record a distinct value and a well-defined link to other records in the database. To keep the technical implementation of the database separate from the business logic, this primary key value should not have any real-life significance.

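A table of a bank's customers, for example, may well have a column for storing each customer's unique bank account number. Because it is unique, the account number is a possible candidate for a primary key, but it carries real-life significance, so a separate identifier with no business meaning is the better choice. Whichever column is chosen, the primary key's value distinguishes each row of customer data.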
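As a minimal sketch of that distinction, the snippet below builds a customers table whose primary key is a meaningless integer while the account number is stored as ordinary data. It uses Python's built-in sqlite3 module purely for illustration; the table name, column names and sample values are assumptions, not taken from any real banking schema.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")

conn.execute(
    """
    CREATE TABLE customers (
        customer_id    INTEGER PRIMARY KEY,  -- surrogate key with no business meaning
        account_number TEXT NOT NULL,        -- business identifier, stored as plain data
        full_name      TEXT NOT NULL
    )
    """
)

# The application never supplies customer_id; the database assigns it.
conn.executemany(
    "INSERT INTO customers (account_number, full_name) VALUES (?, ?)",
    [("12345678", "Alice Example"), ("87654321", "Bob Example")],
)

rows = conn.execute(
    "SELECT customer_id, account_number FROM customers ORDER BY customer_id"
).fetchall()
print(rows)
```

Other tables would refer to a customer by customer_id, so the account number can change or be protected without disturbing those links.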

To speed up the retrieval of customer data, a column such as the bank account number or the Social Security number of each customer can be indexed, which allows bank staff to search the database quickly using that particular piece of information. These indexes, however, are the focus of a new timing attack technique demonstrated by researchers from Core Security Technologies. The attack issues a series of insert operations and measures how long they take, exploiting weaknesses in the database's indexing algorithm so that attackers can extract data from indexed fields. The insert commands do not exploit any application logic or code flaws; they are functions typically available to all database users.

The initial defensive recommendation is to not use indexes on confidential data. Without indexes, however, data retrieval is slow. To find the particular row matching a given bank account or Social Security number, the database server would have to perform a full table scan, examining every row in the customers table. Complex queries across multiple tables also depend heavily on indexes. The resulting delays would have a significant impact on performance and cripple most large commercial databases.
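The difference between a full table scan and an indexed lookup can be observed directly. The sketch below, again using Python's sqlite3 module with a hypothetical customers table, asks SQLite for its query plan before and after an index is created on the account number. The exact wording of the plan varies between SQLite versions, so the output is printed rather than assumed.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        customer_id    INTEGER PRIMARY KEY,
        account_number TEXT NOT NULL,
        full_name      TEXT NOT NULL
    )
    """
)
# Made-up sample data: 10,000 customers with sequential account numbers.
conn.executemany(
    "INSERT INTO customers (account_number, full_name) VALUES (?, ?)",
    ((f"{n:08d}", f"Customer {n}") for n in range(10000)),
)

lookup = "SELECT full_name FROM customers WHERE account_number = ?"

# Without an index, the only option is to examine every row.
plan_without_index = query_plan(conn, lookup, ("00004242",))

# With an index, the server can jump straight to the matching entry.
conn.execute("CREATE INDEX idx_customers_account ON customers (account_number)")
plan_with_index = query_plan(conn, lookup, ("00004242",))

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The first plan should report a scan of the table and the second a search using idx_customers_account, which is the performance gap that makes dropping indexes altogether impractical.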

While there are no reports of this attack being used in the wild, it is a plausible threat. Database administrators should monitor log files more closely for abnormal, repetitive insert activity, and application firewalls will also need to be tuned to detect unusual patterns of activity.

For new databases, architects must make some modifications to the data model and application code. Each confidential column that must be indexed needs a corresponding column that stores the hash value of the confidential data, and the index is built on that hash column instead. The attacker will not be able to calculate the value of the confidential data from the hash, effectively negating the attack. Applications can still search for the confidential data efficiently by hashing the search term and querying the indexed hash column with that hashed value.
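The hashed-column approach might look something like the following sketch. It is one possible reading of the recommendation, not a definitive implementation: the table, the column names, the helper functions and the key are all hypothetical, and it swaps in a keyed hash (HMAC-SHA-256 from Python's standard library) where the text above simply says "hash value". A plain hashlib.sha256 digest would follow the description more literally.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical key, for illustration only. A real system would load it
# from a secrets store rather than hard-coding it.
HASH_KEY = b"example-key-not-for-production"


def lookup_hash(value: str) -> str:
    """Keyed hash of a confidential value, used only as an index key."""
    return hmac.new(HASH_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        full_name   TEXT NOT NULL,
        ssn         TEXT NOT NULL,  -- confidential value, deliberately not indexed
        ssn_hash    TEXT NOT NULL   -- hash of ssn, the only column that is indexed
    )
    """
)
conn.execute("CREATE INDEX idx_customers_ssn_hash ON customers (ssn_hash)")


def add_customer(conn, full_name, ssn):
    """Store the confidential value together with its hash."""
    conn.execute(
        "INSERT INTO customers (full_name, ssn, ssn_hash) VALUES (?, ?, ?)",
        (full_name, ssn, lookup_hash(ssn)),
    )


def find_by_ssn(conn, ssn):
    """Search on the indexed hash column, then confirm the exact value."""
    return conn.execute(
        "SELECT customer_id, full_name FROM customers "
        "WHERE ssn_hash = ? AND ssn = ?",
        (lookup_hash(ssn), ssn),
    ).fetchone()


# Obviously fake sample values.
add_customer(conn, "Alice Example", "000-00-0001")
add_customer(conn, "Bob Example", "000-00-0002")

print(find_by_ssn(conn, "000-00-0002"))
print(find_by_ssn(conn, "000-00-0099"))
```

The application hashes the search term before it reaches the database, so the index only ever contains hash values. The extra comparison on the ssn column is a cheap guard against the unlikely case of two values sharing a hash.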

This was first published in October 2007.
