The Hitchhiker’s Guide to the EXPLAIN PLAN: The story so far (Part 11–22)
On the Toad World site, I’m writing a series of blog posts and Wiki articles on the subject of EXPLAIN PLAN. I’m using EXPLAIN PLAN as a motif to teach not just SQL tuning but also relational theory, logical database design, and physical database design. In a year’s time, I hope to have enough material for a self-published book.
Part 11—Abandon All Hope: Query plans can and will change dramatically because SQL is a non-procedural language. At best, reviewing a query plan can only establish that the query plan is reasonably good. But there is no guarantee that the plan will be consistently used. In addition to goodness of query plans, we need robustness (applicability to a wide range of bind variables) and stability (the guarantee that the plan of choice will be used) but they entail a lot of work. There is hope only if we are willing to do the work that is entailed. There is no hope if you are unwilling to take the time to understand the principles of robustness and stability and do what it takes to achieve them. There is no Easy Button like in Staples stores.
Part 12—Throw Away that Execution Plan: Instead of focusing on EXPLAIN PLAN at the outset, focus on logical database and physical database design, give the optimizer the information it needs to do a good job, and write well-structured SQL statements that the optimizer can easily understand.
Part 13—The Great Pretender: An enduring Oracle Database myth is that EXPLAIN PLAN and AUTOTRACE show the execution plan. This may have been true in early versions of Oracle but no longer. As Tom Kyte said: “It ain’t so much the things we don’t know that get us into trouble. It’s the things you know that just ain’t so or just ain’t so anymore or just ain’t always so.”
Part 14—Damn the Cost, Full Speed Ahead: The cost displayed in query plans is a misleading and useless piece of information. It is an estimate that is more likely to be wrong than correct. By default, it is computed using particular values of bind variables and therefore does not apply to subsequent executions with different values. And, it is measured in units of SREADTIM, not clock seconds.
Reviewing Execution Plan History: When a SQL statement is performing poorly, the conventional wisdom is to examine its query execution plan. If a query is executed more than once, its query plan can be different the second time around for a variety of reasons. Bind variable peeking and adaptive query optimization are only two of the reasons. Other reasons include changes in statistics, cursor_sharing=similar in Oracle Database 10g, adaptive cursor sharing in Oracle Database 11g Release 1, and cardinality feedback in Oracle Database 11g Release 2. At any given point in time, different users may be using different execution plans for the same SQL query. All this means that a query does not have a single execution plan but an execution plan history. Therefore EXPLAIN PLAN is practically useless; at best it is a teaching tool. One must look at the history of query execution plans for a particular SQL statement, not the output of EXPLAIN PLAN.
Part 15—Oracle-in-the-box: The OTN Developer Day VM is the greatest thing since sliced bread and I use it in all my experiments and demonstrations. It uses Linux as the operating system and is pre-loaded with a fully-functional Oracle 12c database with all the options. I don’t have to go through any installation headaches, my laptop is not messed up, and—best of all—I can take a snapshot of a known good state and revert to it at will. And, of course, you don’t need any licenses because “all [Oracle] software downloads are free, and most come with a Developer License that allows you to use full versions of the products at no charge while developing and prototyping your applications, or for strictly self-educational purposes.”
Part 16—Practice Exam I: The deep-left tree clearly illustrates the fundamental strategy by which a SQL query is converted into a series of join operations: first a “driving” table is selected from amongst the tables involved in the query and then the remaining tables are joined to it using the nested-loop method.
Part 17—Practice Exam II: The opposite of the deep-left tree is the deep-right tree. Each qualifying row in the driving table is checked against again in-memory hash-lookup tables.
Part 18—Practice Exam III: Every time the optimizer chooses to join two data sets using the hash-join method, it has to decide which of the two data sets will be the in-memory lookup table. The smaller of the two data sets is the logical choice. The lookup table is always placed on the left hand side of the join node. This can result in zig-zag trees.
Part 19—Practice Exam IV: Instead of evaluating all join possibilities by brute force, the optimizer adopts reasonable strategies in the hope that they will lead to reasonably efficient query plans. The oldest strategy in the book is to pick a “driving” table and then keep picking another table with which to join until all tables have been joined. However, this depends on the ability to perform “view merging” if the query contains views and “subquery unnesting” if the query contains subqueries. If a view cannot be merged or a subquery cannot be nested, then the optimizer is forced to generate a “bushy” tree.
Reviewing Query Execution Plans: You should stop reviewing query execution plans of queries that you have not yet executed. You must first execute your query and then use DBMS_XPLAN.DISPLAY_CURSOR to review the query execution plan that was used by the Oracle engine. Here DISPLAY_CURSOR has some obvious deficiencies. It does not indicate the order in which the lines are actually executed nor the parent of each line. It does not indicate which lines are the most expensive in terms of logical reads and phusical reads. The extra information we need can be obtained from the V$SQL_PLAN_STATISTICS_ALL view.
Part 20—The Final Exam! (Part I): According to Ralph Kimball, there is only one basic evaluation approach—along with a “bail out” option—in a star schema: evaluate the constraints on the dimension tables, prepare a list of composite keys of the fact table, and use a composite index on the fact table to fetch the data from the fact table. When you notice that the list of composite candidate keys is too long, you bail out to a full-table scan in which you look at every fact record without the help of indexes. For the final exam, we will exercise Kimball’s theories using the Sales History (SH) sample schema provided by Oracle. The Sales “fact table” and its five dimensions—Products, Customers, Times, Channels, and Promotions—form the classic star pattern that Kimball refers to.
Part 21—The Final Exam! (Part II): In addition to the basic evaluation approach of Kimball and the bail-out option, Oracle Database offers a lightning-fast option which uses “bitmap” indexes. A bitmap index is composed of multiple “bitmaps” instead of primary key and ROWID pairs like conventional indexes. Each bitmap in a bitmap index is an array of bits (0s and 1s) and tracks exactly one of the values that occurs in the indexed column. The 1s in a bitmap indicate that the corresponding records in the table have the value being tracked by the bitmap while the 0s indicate that the corresponding records do not have that value. Bitmaps are highly effective in solving “COLUMN IN (LIST)” and “COLUMN_1 IN (LIST_1) AND COLUMN_2 IN (LIST_2).”
Part 22—The Final Exam! (Part III We Don’t Use Databases; We Don’t Use Indexes): Back in the early days of the relational era, the creator of relational theory, Dr. Edward Codd married relational theory with transactional database management systems (a.k.a. ACID DBMS) and the Relational Database Management System (RDBMS) was born. He authored two influential ComputerWorld articles—“Is your DBMS really relational?” (October 14, 1985) and “Does your DBMS run by the rules?” (October 21, 1985)—that set the direction of the relational movement for the next quarter century. Today, the full declarative power of “data base sublanguages” (the term used by Dr. Codd) such as Structured Query Language (SQL) is only available within the confines of a transactional database management system. However “external tables” are the half-way house between jail and freedom.