NoSQL, or Not Only SQL, is an approach to data storage and retrieval that is very fashionable with startups developing interactive Web applications and enterprises dealing with huge quantities of data. The main reason for its popularity is that it provides better scalability and availability, as well as faster access to data, than traditional relational database management systems (RDBMSes), including Oracle's MySQL and Microsoft's SQL Server.
Data held in an RDBMS has to be predictable so it can be stored in organized tables and rows, with relationships defined between different elements. Data in a NoSQL database, on the other hand, doesn't need to be so structured or follow a fixed schema. When performance and real-time access are more important than consistency, such as when indexing and retrieving a large number of records, NoSQL is a better fit than a relational database. Data can also be more easily held across multiple servers, providing improved fault tolerance and scalability. Companies like Google and Amazon use their own cloud-friendly NoSQL database technologies, and there are a number of commercial and open source NoSQL databases available, such as Couchbase, MongoDB, Cassandra and Riak.
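To make the schema difference concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for an RDBMS and a plain list of dictionaries as a stand-in for a schemaless document store. The table name, field names and values are hypothetical and chosen only for illustration.

```python
import sqlite3

# Relational side: the table definition fixes which columns a row may have.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
)
conn.execute(
    "INSERT INTO products (sku, name, price) VALUES (?, ?, ?)",
    ("A1", "Widget", 9.99),
)

# A row that does not match the declared columns is rejected outright.
try:
    conn.execute(
        "INSERT INTO products (sku, name, colour) VALUES (?, ?, ?)",
        ("B2", "Gadget", "red"),
    )
    schema_enforced = False
except sqlite3.OperationalError:
    schema_enforced = True

# Document side: each record carries its own structure, so two records
# in the same collection can have entirely different fields.
documents = [
    {"sku": "A1", "name": "Widget", "price": 9.99},
    {"sku": "B2", "name": "Gadget", "colour": "red", "tags": ["new", "sale"]},
]
field_sets = [sorted(doc) for doc in documents]
```

The flexibility on the document side is what makes it awkward to attach permissions to a particular column or row later on: there is no fixed structure to attach them to.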
For all the advantages of storing data in a NoSQL database, NoSQL security is adversely impacted by the need to access data quickly and easily. To store information securely, a database needs to provide confidentiality, integrity and availability (CIA). Enterprise RDBMSes provide CIA through integrated security features such as role-based security, encrypted communications, support for row and field access control, and access control through user-level permissions on stored procedures. RDBMSes also have ACID (atomicity, consistency, isolation, durability) properties that guarantee database transactions are processed reliably; data replication and logging ensure durability and data integrity. These features increase the time it takes to retrieve large amounts of data, so they are not implemented in NoSQL databases.
To maintain fast access to data, NoSQL databases come with little built-in security. They have what are called BASE (basically available, soft state, eventually consistent) properties; rather than requiring consistency after every transaction, the database just needs to eventually reach a consistent state. For example, when users view data, such as the number of items in stock, they may see the last snapshot taken of the data rather than a current view. Because transactions aren't written to the database immediately, there is a possibility that simultaneous transactions could interfere with each other. This inherent race condition, in which users do not necessarily see the same data at the same time, means a NoSQL database could never be used for handling financial transactions.
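The stock-count example can be sketched with a toy in-memory store. This is not how any particular NoSQL product is implemented; the class and method names are hypothetical, and replication is reduced to an explicit sync() call so the stale read and the interfering writes are easy to see.

```python
class EventuallyConsistentStock:
    """Toy two-copy store (hypothetical): writes reach the replica only on sync()."""

    def __init__(self, count):
        self.primary = count   # copy that accepts writes
        self.replica = count   # copy that serves reads
        self._pending = []     # writes not yet propagated to the replica

    def write(self, count):
        self.primary = count
        self._pending.append(count)

    def read_replica(self):
        return self.replica

    def sync(self):
        # Propagation step: the replica converges on the most recent write.
        if self._pending:
            self.replica = self._pending[-1]
            self._pending.clear()


store = EventuallyConsistentStock(count=5)

# Stale read: one item is sold, but the replica still shows the old snapshot.
store.write(4)
stale_view = store.read_replica()     # still the old value
store.sync()
current_view = store.read_replica()   # replica has now caught up

# Interfering writes: two buyers read the same value, and each writes back
# "what I saw minus one", so one of the two sales is silently lost.
seen_by_a = store.read_replica()
seen_by_b = store.read_replica()
store.write(seen_by_a - 1)
store.write(seen_by_b - 1)
store.sync()
final_count = store.read_replica()
```

In this run two items are sold after the first sync, yet the count drops by only one. That is the kind of interference the race condition described above can produce.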
NoSQL databases also lack confidentiality and integrity. Because NoSQL databases don't have a schema, permissions on a table, column or row can't be segregated. The absence of a schema can also lead to multiple copies of the same data, which makes it hard to keep data consistent, particularly as changes to multiple tables can't be wrapped in a transaction, in which a logical unit of insert, update or delete operations is executed as a whole.
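For contrast, this is roughly what wrapping changes to two tables in a single transaction looks like in a relational database. The sketch uses Python's built-in sqlite3 module; the stock and orders tables and the place_order function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))")
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
conn.execute("INSERT INTO stock (item, qty) VALUES ('widget', 5)")
conn.commit()


def place_order(connection, item, qty):
    """Record an order and reduce stock as one logical unit."""
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back every statement in it if an exception is raised.
    with connection:
        connection.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))
        connection.execute("UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item))


place_order(conn, "widget", 3)       # both changes are committed together

try:
    place_order(conn, "widget", 10)  # the UPDATE violates the CHECK constraint
    rolled_back = False
except sqlite3.IntegrityError:
    rolled_back = True               # the order row inserted first is undone too

remaining = conn.execute("SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After the failed second call, the orders table still holds only the first order. Without transactional support in the database, the application would have to detect the failure and undo the first insert itself.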
With more than 20 different implementations of NoSQL available, a lack of standards also increases the complexity of keeping data secure. Confidentiality and integrity have to be provided entirely by the application accessing the NoSQL data. It is not a sound practice to have the last line of defense for any valuable data at the application level. Application developers are not renowned for implementing security features, and new code usually means new bugs. Any requests sent to a NoSQL database need to be escaped, filtered and validated, while the database itself needs to reside in a hardened environment.
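As an illustration of validating a request before it reaches the database, the sketch below checks a login payload that will be turned into a MongoDB-style query filter. No database driver is involved; the function name, field names and limits are assumptions made for the example. The point is that a value such as {"$ne": ""} arriving where a string is expected would otherwise be interpreted as a query operator rather than as data.

```python
import re

# Allow-list for usernames; the pattern and length limits are illustrative assumptions.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,64}")
MAX_PASSWORD_LENGTH = 128


def build_login_filter(payload):
    """Validate untrusted input and return a query filter, or raise ValueError.

    A real application would compare a password hash rather than the
    password itself; plain fields keep this sketch short.
    """
    if not isinstance(payload, dict):
        raise ValueError("payload must be an object")
    username = payload.get("username")
    password = payload.get("password")
    # Type check: nested objects such as {"$ne": ""} are rejected here,
    # so they can never be passed through as query operators.
    if not isinstance(username, str) or not isinstance(password, str):
        raise ValueError("username and password must be plain strings")
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError("username contains unexpected characters")
    if not 1 <= len(password) <= MAX_PASSWORD_LENGTH:
        raise ValueError("password length out of range")
    return {"username": username, "password": password}


safe_filter = build_login_filter({"username": "alice", "password": "s3cret"})

try:
    build_login_filter({"username": "admin", "password": {"$ne": ""}})
    injection_blocked = False
except ValueError:
    injection_blocked = True
```

Checks like these belong in every code path that builds a query, which is exactly why relying on the application as the only line of defense is fragile.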
Interestingly, some NoSQL projects are now starting to add back RDBMS-type security features. Oracle, for example, added transactional control over data written to one node. Cassandra supports transaction logging and automatic replication, and MongoDB supports master-slave replication.
If scalability and availability are the key database requirements for an organization, then NoSQL may be the right choice for certain large data sets. However, system architects should take a close look at their requirements for security, privacy and data integrity before choosing a NoSQL database. The lack of NoSQL security features, namely authentication or authorization support, means that sensitive data is best kept in a traditional RDBMS.
About the author
Michael Cobb, CISSP-ISSAP, is a renowned security author with more than 15 years of experience in the IT industry and another 16 years of experience in finance. He is founder and managing director of Cobweb Applications Ltd., a consultancy that helps companies secure their networks and websites, and also helps them achieve ISO 27001 certification. He co-authored the book IIS Security and has written numerous technical articles for leading IT publications. Michael is also a Microsoft Certified Database Administrator and a Microsoft Certified Professional.