The Twelve Days of NoSQL: Day Two: Requirements and Assumptions
On the second day of Christmas, my true love gave to me
Two turtle doves.
As I mentioned in the previous post, the NoSQL movement got its big boost from the e-commerce giant Amazon. Amazon started out using Oracle Database for its e-commerce platform but later switched to Dynamo, a proprietary database management system that it built in-house. Dynamo is the archetypal NoSQL product; it embodies all the innovations of the NoSQL camp. The Dynamo requirements and assumptions are documented in the paper “Dynamo: Amazon’s Highly Available Key-value Store,” published in 2007. Here are some excerpts from that paper:
“Customers should be able to view and add items to their shopping cart even if disks are failing, network routes are flapping, or data centers are being destroyed by tornados. Therefore, the service responsible for managing shopping carts requires that it can always write to and read from its data store, and that its data needs to be available across multiple data centers.”
“There are many services on Amazon’s platform that only need primary-key access to a data store. For many services, such as those that provide best seller lists, shopping carts, customer preferences, session management, sales rank, and product catalog, the common pattern of using a relational database would lead to inefficiencies and limit scale and availability. Dynamo provides a simple primary-key only interface to meet the requirements of these applications.”
“Experience at Amazon has shown that data stores that provide ACID guarantees tend to have poor availability.”
“Dynamo targets applications that operate with weaker consistency (the “C” in ACID) if this results in high availability.”
“… since each service uses its distinct instance of Dynamo, its initial design targets a scale of up to hundreds of storage hosts.”
To paraphrase, Amazon’s requirements were extreme performance, extreme scalability, and extreme availability, surpassing anything that had ever been achieved before. Amazon’s prior experience with the relational model also led it to conclude that the only way to satisfy these requirements was to stop playing by the rules of the relational camp. If you belong to the relational camp, please suspend disbelief while I explain how Amazon achieved its ends. You will be in a better position to pass judgment on NoSQL technology once you understand each Amazon innovation.
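To make the “primary-key only interface” from the excerpts above concrete, here is a minimal sketch in Python. It is a toy in-memory illustration, not Amazon's implementation: the class name KeyValueStore and the key format "cart:alice" are assumptions made up for this example, and the get and put operations described in the Dynamo paper also carry versioning context that is omitted here.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory store (hypothetical) that allows access by primary key only."""

    def __init__(self) -> None:
        # One opaque value per key; no secondary indexes, joins, or queries.
        self._items: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Store or overwrite the value held under this key.
        self._items[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Return the value for this key, or None if the key is absent.
        return self._items.get(key)


store = KeyValueStore()
store.put("cart:alice", b'["book", "lamp"]')  # the whole cart is one value
print(store.get("cart:alice"))
print(store.get("cart:bob"))
```

The point of the sketch is what is missing: the only way to reach a shopping cart is by its key, so the store never has to interpret the value or coordinate a query across several records.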