I’m conducting research on the security of various database architectures. Can you help me understand how the security capabilities of NoSQL compare to the security of the major commercial relational databases?
For a database to store information securely, it needs to provide confidentiality, integrity and availability (CIA). The data must be available when it is needed (availability), but only to authorized individuals or systems (confidentiality), and the data can only be modified by those authorized to do so (integrity).
Relational database security includes integrated features such as role-based security, encrypted communications and support for row and field access control, as well as access control through user-level permissions on stored procedures. Enterprise relational database management systems (RDBMS) such as Oracle and Microsoft SQL Server also provide ACID (atomicity, consistency, isolation, durability) properties, which guarantee that database transactions are processed reliably; data replication and logging ensure durability and data integrity. These features come at a cost, however, mainly in licensing fees and speed of data access.
For social network applications such as Facebook and ecommerce sites like Amazon, which handle extremely large data sets, scalability and availability are key database requirements. So that data can be distributed across hundreds, or even thousands, of servers, many organizations have turned to non-relational database management systems, or NoSQL databases. However, NoSQL security is nowhere near as robust as relational database security.
NoSQL databases have what are called BASE (basically available, soft state, eventually consistent) properties: rather than requiring consistency after every transaction, it is enough for the database to eventually reach a consistent state. This means users may not see data as it is now, but as it was when the last snapshot was taken; the number of items shown as in stock, for example, may be out of date. Because transactions aren’t written to the database immediately, simultaneous transactions can interfere with each other. This inherent race condition, where not all users necessarily see the same data at the same time, is a real risk in a database handling, say, share transactions.
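The race condition described above can be illustrated with a small, deterministic simulation. This is only a sketch under stated assumptions: the `Replica` class, the `sell` and `sync` helpers and the last-write-wins reconciliation rule are hypothetical and do not model any particular NoSQL product.

```python
class Replica:
    """A toy replica holding one stock counter (hypothetical, for illustration)."""

    def __init__(self, stock):
        self.stock = stock


def sell(replica, quantity):
    """Read-modify-write against a single replica, with no coordination."""
    seen = replica.stock              # may already be stale
    replica.stock = seen - quantity
    return seen


def sync(replicas, winner):
    """Naive last-write-wins reconciliation (an assumption of this sketch):
    every replica simply adopts the winner's value."""
    for replica in replicas:
        replica.stock = winner.stock


# Two replicas start from the same snapshot of 10 items in stock.
replica_a = Replica(stock=10)
replica_b = Replica(stock=10)

# Two customers buy at the same time, each served by a different replica.
sell(replica_a, 3)   # replica_a now holds 7
sell(replica_b, 4)   # replica_b now holds 6; it never saw the first sale

# The replicas converge, but one of the two sales is silently overwritten.
sync([replica_a, replica_b], winner=replica_b)

final_stock = replica_a.stock               # 6
expected_stock = 10 - 3 - 4                 # 3
lost_units = final_stock - expected_stock   # 3 units sold but never deducted
print(final_stock, expected_stock, lost_units)
```

Both replicas end up agreeing with each other, so the database is "eventually consistent", yet the agreed value is wrong. A transactional relational database would have serialized the two updates instead.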
NoSQL databases also offer little built-in support for confidentiality and integrity. Because NoSQL databases lack a schema, you can’t segregate permissions on a table, column or row, and to keep access to data fast, they have little built-in security. The documentation for the popular MongoDB reads, “One valid way to run the Mongo database is in a trusted environment, with no security and authentication. ... Of course, in such a configuration, one must be sure only trusted machines can access database TCP ports.” The NoSQL database Riak has no authentication or authorization support.
This means confidentiality and integrity have to be provided entirely by the application accessing the data. Having the last line of defense for any valuable data at the application level is not sound practice. Application developers are not renowned for implementing security features, let alone writing them from scratch. That leaves just your firewall protecting your data.
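To give a sense of what "provided entirely by the application" involves, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the `READABLE_FIELDS` policy, the role names, the hard-coded `SECRET_KEY` and the helper functions are hypothetical, and a real system would also need key management, authentication and auditing.

```python
import hashlib
import hmac
import json

# Hypothetical policy: the fields each role is allowed to read.
READABLE_FIELDS = {
    "support": {"name", "email"},
    "billing": {"name", "email", "card_last4"},
}

# Assumption: in practice the key would come from a secrets store,
# never from source code.
SECRET_KEY = b"demo-key-do-not-use-in-production"


def sign(document):
    """Return an HMAC-SHA256 tag over a canonical JSON form of the document."""
    payload = json.dumps(document, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def verify(document, tag):
    """Integrity check: True only if the document still matches its tag."""
    return hmac.compare_digest(sign(document), tag)


def read_as(role, document):
    """Confidentiality check: return only the fields this role may see."""
    allowed = READABLE_FIELDS.get(role, set())
    return {key: value for key, value in document.items() if key in allowed}


customer = {"name": "Ada", "email": "ada@example.com", "card_last4": "4242"}
tag = sign(customer)  # stored alongside the document by the application

support_view = read_as("support", customer)   # card_last4 is filtered out
tampered = dict(customer, email="attacker@example.com")

print(support_view)
print(verify(customer, tag), verify(tampered, tag))
```

Every code path that touches the data store has to call these checks correctly; a single query that bypasses them exposes or alters the raw document, which is why relying on the application alone is fragile.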
This was first published in October 2011