The Twelve Days of NoSQL: Day Four: Sharding

Home > NoSQL, Oracle, Physical Database Design, SQL > The Twelve Days of NoSQL: Day Four: Sharding

The Twelve Days of NoSQL: Day Four: Sharding

December 28, 2013 Iggy Fernandez Leave a comment Go to comments

On the fourth day of Christmas, my true love gave to me
Four colly birds.

(Yesterday: Functional Segmentation)(Tomorrow: Replication and Eventual Consistency)

Let’s recap what we have covered so far.

Day One: NoSQL technology consists of “disruptive innovations” which present a dilemma for established players.
Day Two: The goals of NoSQL technology are extreme performance, extreme scalability, and extreme availability.
Day Three: The underpinning component of NoSQL technology is functional segmentation which results in simple hierarchical schemas.

Amazon’s next design decision was “sharding” or horizontal partitioning of all the tables in a hierarchical schema. Hash-partitioning is typically used. Each table is partitioned in the same way as the other tables in the schema and each set of partitions is placed in a separate database referred to as a “shard.” The shards are independent of each other; that is, there is no clustering (as in Oracle RAC) or federation (as in IBM DB2).

Note that the hierarchical schemas that result from functional segmentation are always shardable; that is, hierarchical schemas are shardable by definition.

Returning to the example from Ted Codd’s 1970 paper on the relational model:

employee (man#, name, birthdate) with primary key (man#)
children (man#, childname, birthyear) with primary key (man#, childname)
jobhistory (man#, jobdate, title) with primary key (man#, jobdate)
salaryhistory (man#, jobdate, salarydate, salary) with primary key (man#, jobdate, salarydate)

Note that the jobhistory, salaryhistory, and children tables have composite keys. In each case, the leading column of the composite key is the man#. Therefore, all four tables can be partitioned using the man#.

Sharding is an essential component of NoSQL designs but it does not present a conflict with the relational model; it too is simply a physical database design decision. In the relational model, the collection of standalone databases or shards can be logically viewed as a single distributed database.

Also see: The Twelve Days of SQL: Day Four: The way you write your query matters

Categories: NoSQL, Oracle, Physical Database Design, SQL Tags: NoSQL, Oracle, Physical Database Design, SQL

Comments (0) Trackbacks (0) Leave a comment Trackback

No comments yet.

No trackbacks yet.

	Define, explain in d… on Dilbert likes Hadoop clus…
	Define, explain in d… on Dilbert likes Hadoop clus…
	Define, explain in d… on Dilbert likes Hadoop clus…
	Define, explain in d… on Dilbert likes Hadoop clus…
	Define, explain in d… on Dilbert likes Hadoop clus…

So Many Oracle Manuals, So Little Time

The Twelve Days of NoSQL: Day Four: Sharding

Leave a comment Cancel reply

Recent Posts

Recent Comments

Categories

So Many Oracle Manuals, So Little Time

The Twelve Days of NoSQL: Day Four: Sharding

Share this:

Related

Leave a comment Cancel reply

Recent Posts

Recent Comments

Categories