Home > DBA, NoSQL, Oracle, SQL > Day 2: The Twelve Days of SQL: SQL is a non-procedural language

Day 2: The Twelve Days of SQL: SQL is a non-procedural language


On the second day of Christmas, my true love gave to me
Two turtle doves.

Day 2: SQL is a non-procedural language (Day 1: SQL is based on relational calculus and relational algebra) (Day 3: There isn’t always a single optimal query plan for a SQL query)

The creators of SQL intended it for the use of for the use of “accountants, engineers, architects, and urban planners” who, “while they are not computer specialists, would be willing to learn to interact with a computer in a reasonably high-level, non-procedural [emphasis added] query language” (http://faculty.cs.tamu.edu/yurttas/PL/DBL/docs/sequel-1974.pdf). Hence, the primary goal of SQL was ease of use; it allowed database users to request data without requiring or allowing them to specify the sequence of relational operations to be used by the database to satisfy the request.

Users, developers, database administrators, and managers will often complain about SQL performance. Often, the root cause is that the query plan has changed or is not particularly efficient. However, SQL is a non-procedural language and does not come with an automatic guarantee of performance and predictability. In another part of this series, I will show you how to keep track of execution plans and execution metrics of critical SQL statements and what you can do to promote performance and predictability.

Here are some common performance complaints:

  • The query is running slowly.
  • The query ran fast yesterday but is running slowly today.
  • The query ran fast a few minutes ago but is running slowly now.
  • The query runs fast in the QA database but runs slowly in the production database.
  • The query runs fast in other production databases but not in this production database.
  • Oracle is not using the indexes we created.
  • The query plan is not the one we expected.

I was once asked the following questions in an interview (changed to use the Human Resources sample schema): “Find the managers who earn less than one or more of their employees. Which is the best way to write the query if there is only one manager and many employees? Which is the best way to write the query if there is only one employee and many managers?”

In the first case, the intuitive answer is to use the relational calculus method because the form of the query suggests that Oracle will drive from managers to employees.

select m.employee_id, m.first_name, m.last_name
from employees m
where m.employee_id = 100 -- Steven King
and exists (
  select * from employees e
  where m.employee_id = e.manager_id
  and m.salary < e.salary
);

In the second case, the intuitive answer is to use the relational algebra method because the form of the query suggests that Oracle will drive from employees to managers

select distinct m.employee_id, m.first_name, m.last_name
from employees e join employees m on (m.employee_id = e.manager_id)
where e.employee_id = 102 -- Lex de Haan (1954-2006)
and m.salary < e.salary;

However, in both cases, Oracle Database is perfectly capable of driving from either employees or managers depending on the data distribution suggested by the optimizer statistics. Here are the results of my tests using Oracle Database 11g Release 2 and the Developer Days Virtual Machine.

Test 1: One manager and many employees (relational calculus method)

Oracle drives from managers to employees. Step 3 in the query plan is the first to be executed and checks the condition “m.employee_id = 100.”

select m.employee_id, m.first_name, m.last_name
from employees m
where m.employee_id = 100 -- Steven King
and exists (
  select * from employees e
  where m.employee_id = e.manager_id
  and m.salary < e.salary
);

select * from table(dbms_xplan.display_cursor);

Plan hash value: 389357184

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |       |       |     1 (100)|          |
|   1 |  NESTED LOOPS SEMI           |                |     1 |    78 |     1   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES      |     1 |    52 |     0   (0)|          |
|*  3 |    INDEX UNIQUE SCAN         | EMP_EMP_ID_PK  |     1 |       |     0   (0)|          |
|*  4 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES      |     1 |    26 |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | EMP_MANAGER_IX |     1 |       |     0   (0)|          |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("M"."EMPLOYEE_ID"=100)
   4 - filter("M"."SALARY"<"E"."SALARY")
   5 - access("E"."MANAGER_ID"=100)
       filter("M"."EMPLOYEE_ID"="E"."MANAGER_ID")

Test 2: One employee and many managers (relational calculus method)

Oracle drives from employees to managers. Step 3 in the query plan is the first to be executed and checks the condition “e.employee_id = 102.”

select m.employee_id, m.first_name, m.last_name
from employees m
where exists (
  select * from employees e
  where e.employee_id = 102 -- Lex de Haan (1954-2006)
  and m.employee_id = e.manager_id
  and m.salary < e.salary
);

select * from table(dbms_xplan.display_cursor);

Plan hash value: 60640716

-----------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |       |       |     1 (100)|
|   1 |  NESTED LOOPS                |               |     1 |    91 |     0   (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES     |     1 |    39 |     0   (0)|
|*  3 |    INDEX UNIQUE SCAN         | EMP_EMP_ID_PK |     1 |       |     0   (0)|
|*  4 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES     |   409 | 21268 |     0   (0)|
|*  5 |    INDEX UNIQUE SCAN         | EMP_EMP_ID_PK |     1 |       |     0   (0)|
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("E"."EMPLOYEE_ID"=102)
   4 - filter("M"."SALARY"<"E"."SALARY")
   5 - access("M"."EMPLOYEE_ID"="E"."MANAGER_ID")

Test 3: One manager and many employees (relational algebra method)

Oracle drives from managers to employees. Step 4 in the query plan is the first to be executed and checks the condition “m.employee_id = 100.”

select distinct m.employee_id, m.first_name, m.last_name
from employees e join employees m on (m.employee_id = e.manager_id)
where m.employee_id = 100 -- Steven King
and m.salary < e.salary;

select * from table(dbms_xplan.display_cursor);

Plan hash value: 3383941080

------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                |       |       |     3 (100)|          |
|   1 |  HASH UNIQUE                  |                |     1 |    78 |     3  (34)| 00:00:01 |
|   2 |   NESTED LOOPS                |                |     1 |    78 |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| EMPLOYEES      |     1 |    52 |     0   (0)|          |
|*  4 |     INDEX UNIQUE SCAN         | EMP_EMP_ID_PK  |     1 |       |     0   (0)|          |
|*  5 |    TABLE ACCESS BY INDEX ROWID| EMPLOYEES      |     1 |    26 |     2   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN          | EMP_MANAGER_IX |    14 |       |     0   (0)|          |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("M"."EMPLOYEE_ID"=100)
   5 - filter("M"."SALARY"<"E"."SALARY")
   6 - access("E"."MANAGER_ID"=100)

Test 4: One employee and many managers (relational algebra method)

Oracle drives from employees to managers. Step 3 in the query plan is the first to be executed and checks the condition “e.employee_id = 102.” This is the same query plan as in Test 2.

select distinct m.employee_id, m.first_name, m.last_name
from employees e join employees m on (m.employee_id = e.manager_id)
where e.employee_id = 102 -- Lex de Haan (1954-2006)
and m.salary < e.salary;

select * from table(dbms_xplan.display_cursor);

Plan hash value: 60640716

-----------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |       |       |     1 (100)|
|   1 |  NESTED LOOPS                |               |     1 |    91 |     0   (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES     |     1 |    39 |     0   (0)|
|*  3 |    INDEX UNIQUE SCAN         | EMP_EMP_ID_PK |     1 |       |     0   (0)|
|*  4 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES     |   409 | 21268 |     0   (0)|
|*  5 |    INDEX UNIQUE SCAN         | EMP_EMP_ID_PK |     1 |       |     0   (0)|
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("E"."EMPLOYEE_ID"=102)
   4 - filter("M"."SALARY"<"E"."SALARY")
   5 - access("M"."EMPLOYEE_ID"="E"."MANAGER_ID")

Day 1: SQL is based on relational calculus and relational algebra

Day 3: There isn’t always a single optimal query plan for a SQL query

Categories: DBA, NoSQL, Oracle, SQL
  1. Ravi
    December 1, 2011 at 11:46 am

    Iggy,

    There is an easier way to find all the managers who earn less than one of their subordinates. Oracle could just talk to them and list all the sullen and sulky ones. I bet that is more optimized under any circumstances 😉

    -RK

  1. No trackbacks yet.

Leave a comment