The Twelve Days of NoSQL: Day Three: Functional Segmentation
On the third day of Christmas, my true love gave to me
Three French hens.
Amazon’s pivotal design decision was to break its monolithic enterprise-wide database service into simpler component services such as a best-seller list service, a shopping cart service, a customer preferences service, a sales rank service, and a product catalog service. This avoided a single point of failure. In an interview for the NoCOUG Journal, Amazon’s first database administrator, Jeremiah Wilton, explains the rationale behind Amazon’s approach:
“The best availability in the industry comes from application software that is predicated upon a surprising assumption: The databases upon which the software relies will inevitably fail. The better the software’s ability to continue operating in such a situation, the higher the overall service’s availability will be. But isn’t Oracle unbreakable? At the database level, regardless of the measures taken to improve availability, outages will occur from time to time. An outage may be from a required upgrade or a bug. Knowing this, if you engineer application software to handle this eventuality, then a database outage will have less or no impact on end users. In summary, there are many ways to improve a single database’s availability. But the highest availability comes from thoughtful engineering of the entire application architecture.”
As an example, the shopping cart service should not be affected if the checkout service is unavailable or not performing well.
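The idea that one service keeps working while another is down can be sketched in a few lines. This is only an illustration, not Amazon’s implementation: the class names (CartService, CheckoutService), the ServiceUnavailable exception, and the try_checkout helper are all hypothetical, and in-memory objects stand in for the separate databases.

```python
class ServiceUnavailable(Exception):
    """Raised by a hypothetical service when its own database is down."""


class CartService:
    """Hypothetical shopping cart service with its own (in-memory) store."""

    def __init__(self):
        self._carts = {}

    def add_item(self, customer_id, sku):
        self._carts.setdefault(customer_id, []).append(sku)

    def items(self, customer_id):
        return list(self._carts.get(customer_id, []))


class CheckoutService:
    """Hypothetical checkout service backed by a separate database."""

    def __init__(self, available=True):
        self.available = available  # simulates the state of its database

    def place_order(self, customer_id, skus):
        if not self.available:
            raise ServiceUnavailable("checkout database is down")
        return f"order for {customer_id}: {len(skus)} item(s)"


def try_checkout(cart, checkout, customer_id):
    """Attempt checkout; if it fails, the cart is left untouched."""
    try:
        return checkout.place_order(customer_id, cart.items(customer_id))
    except ServiceUnavailable:
        return "checkout unavailable; your cart has been saved"
```

Because the cart never depends on the checkout database, a customer can keep adding items during a checkout outage and complete the purchase once checkout recovers.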
I said that this was the pivotal design decision made by Amazon. I cannot emphasize this enough. If you resist functional segmentation, you are not ready for NoSQL. If you miss the point, you will not understand NoSQL.
Note that functional segmentation results in simple hierarchical schemas. Here is an example of a simple hierarchical schema from Ted Codd’s 1970 paper on the relational model, meticulously reproduced in the 100th issue of the NoCOUG Journal. This schema stores information about employees, their children, their job histories, and their salary histories.
- employee (man#, name, birthdate)
- children (man#, childname, birthyear)
- jobhistory (man#, jobdate, title)
- salaryhistory (man#, jobdate, salarydate, salary)
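To make the word “hierarchical” concrete, here is one way to picture the same schema as a single nested record per employee. This is a sketch under assumptions: the nesting of salary history under job history is inferred from the keys above (salaryhistory carries both man# and jobdate), the Python class names are invented for illustration, and man# is spelled man_no because # is not valid in a Python identifier. The flatten function shows that the nested record carries the same information as the four flat relations.

```python
from dataclasses import dataclass, field


@dataclass
class Salary:
    salarydate: str
    salary: int


@dataclass
class Job:
    jobdate: str
    title: str
    salaryhistory: list = field(default_factory=list)  # list of Salary


@dataclass
class Child:
    childname: str
    birthyear: int


@dataclass
class Employee:
    man_no: int  # stands for man# in the schema above
    name: str
    birthdate: str
    children: list = field(default_factory=list)    # list of Child
    jobhistory: list = field(default_factory=list)  # list of Job


def flatten(emp):
    """Return the four flat relations as lists of tuples for one employee."""
    employee = [(emp.man_no, emp.name, emp.birthdate)]
    children = [(emp.man_no, c.childname, c.birthyear) for c in emp.children]
    jobhistory = [(emp.man_no, j.jobdate, j.title) for j in emp.jobhistory]
    salaryhistory = [
        (emp.man_no, j.jobdate, s.salarydate, s.salary)
        for j in emp.jobhistory
        for s in j.salaryhistory
    ]
    return employee, children, jobhistory, salaryhistory
```

Everything about one employee hangs off a single root key, which is why a functional segment with a schema like this can be stored and retrieved as one unit.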
Functional segmentation is the underpinning of NoSQL technology, but it does not present a conflict with the relational model; it is simply a physical database design decision. Each functional segment is usually assigned its own standalone database. The collection of functional segments could be regarded as a single distributed database. However, distributed transactions are verboten in the NoSQL world. Functional segmentation can therefore result in temporary inconsistencies if, for example, the shopping cart data is not in the same database as the product catalog. Occasionally, an item that is present in a shopping cart may go out of stock or be repriced. The problems can be resolved when the customer decides to check out, if not earlier. As an Amazon customer, I occasionally leave items in my shopping cart but don’t complete a purchase. When I resume shopping, I sometimes get a notification that an item in my shopping cart is no longer in stock or has been repriced. This technique is called “eventual consistency,” and the application is responsible for ensuring that inconsistencies are eventually corrected. Randy Shoup, one of the architects of eBay’s ecommerce platform, explains how:
“At eBay, we allow absolutely no client-side or distributed transactions of any kind – no two-phase commit. In certain well-defined situations, we will combine multiple statements on a single database into a single transactional operation. For the most part, however, individual statements are auto-committed. While this intentional relaxation of orthodox ACID properties does not guarantee immediate consistency everywhere, the reality is that most systems are available the vast majority of the time. Of course, we do employ various techniques to help the system reach eventual consistency: careful ordering of database operations, asynchronous recovery events, and reconciliation or settlement batches. We choose the technique according to the consistency demands of the particular use case.” (Scalability Best Practices: Lessons from eBay)
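The shopping cart case described earlier, where an item goes out of stock or is repriced while it sits in the cart, could be reconciled by the application along the following lines. This is a minimal sketch and not eBay’s or Amazon’s code: the reconcile_cart function, the dictionary layout, and the notice wording are assumptions made for illustration, and prices are integers (cents) to keep comparisons exact.

```python
def reconcile_cart(cart_items, catalog):
    """Compare cart snapshots against the current catalog.

    cart_items: list of dicts with 'sku' and 'price' as seen when added.
    catalog: dict mapping sku -> {'price': int, 'in_stock': bool}.
    Returns (valid_items, notices) without modifying the inputs.
    """
    valid, notices = [], []
    for item in cart_items:
        current = catalog.get(item["sku"])
        # The cart and the catalog live in different databases, so the
        # item may have disappeared or sold out since it was added.
        if current is None or not current["in_stock"]:
            notices.append(f"{item['sku']} is no longer in stock")
            continue
        # The price stored in the cart may be stale; adopt the current one.
        if current["price"] != item["price"]:
            notices.append(
                f"{item['sku']} price changed from {item['price']} to {current['price']}"
            )
            item = {**item, "price": current["price"]}
        valid.append(item)
    return valid, notices
```

Running a check like this when the customer returns to the cart or checks out is one way the application, rather than a distributed transaction, can bring the two databases back into agreement.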
The eventual consistency technique receives a lot of attention because it is supposedly in conflict with the relational model. We will return to this subject later in this series and argue that eventual consistency is not in conflict with the relational model.