Home > Oracle, SQL > When Two Joins Are Better Than One

When Two Joins Are Better Than One


I was asked to tune a “Top N” query; a simplified version is shown below. Table t1 is first filtered by group (WHERE group_id = :group_id) then partitioned by subgroup (PARTITION BY subgroup_id) and sorted by timestamp (ORDER BY timestamp DESC). The record with the most recent timestamp in each partition is the one required (WHERE rank = 1).

SELECT
  t1.*,
  t2.*
FROM
  (
    SELECT
      t1.*,
      RANK () OVER (PARTITION BY subgroup_id ORDER BY timestamp DESC) as rank
    FROM t1
    WHERE group_id = :group_id
  ) t1
  LEFT JOIN t2 ON t2.id = t1.t2_id
WHERE rank = 1;

The query plan showed that most of the time was spent retrieving 192,000 records from table t1 using index lookups (TABLE ACCESS BY INDEX ROWID). However, only 479 of those records were retained after partitioning and sorting.

----------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name   | Starts | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------
|*  1 |  HASH JOIN OUTER                |        |      1 |    479 |00:00:16.66 |   16051 |  16033 |
|*  2 |   VIEW                          |        |      1 |    479 |00:00:16.33 |   15424 |  15413 |
|*  3 |    WINDOW SORT PUSHED RANK      |        |      1 |    192K|00:00:16.06 |   15424 |  15413 |
|*  4 |     FILTER                      |        |      1 |    192K|00:00:14.23 |   15424 |  15413 |
|   5 |      TABLE ACCESS BY INDEX ROWID| T1     |      1 |    192K|00:00:14.03 |   15424 |  15413 |
|*  6 |       INDEX RANGE SCAN          | T1_IX1 |      1 |    192K|00:00:04.80 |    1021 |   1021 |
|   7 |   TABLE ACCESS FULL             | T2     |      1 |  27142 |00:00:00.26 |     627 |    620 |
----------------------------------------------------------------------------------------------------

The solution was to create an index t1_ix2 on the triplet (group_id ASC, subgroup_id ASC, timestamp DESC). Note that the timestamp values are stored in descending order in this index instead of the default ascending order. An extra join was then introduced at the outset of the query. It used the new index to collect only the ROWIDs of the 479 records but not any data from table t1. As a bonus, sorting was no longer required (WINDOW NOSORT) for the “Top N” calculation since the new index was sorted in the required order. In the next join, we retrieved the required 479 records from table t1 using the ROWIDs collected in the first join (TABLE ACCESS BY USER ROWID). We did have to use several SQL hints (LEADING, USE_NL, USE_HASH, and NO_SWAP_JOIN_INPUTS) to make the optimizer see things our way but the modified query was five times faster than the original (3.11 seconds v/s 16.6 seconds).

SELECT
  /*+ LEADING(temp t1 t2) USE_NL(t1) USE_HASH(t2) NO_SWAP_JOIN_INPUTS(t2) */
  t1.*,
  t2.*
FROM
  (
    SELECT
      /*+ NO_MERGE */
      ROWID AS t1_rowid,
      RANK () OVER (PARTITION BY subgroup_id ORDER BY timestamp DESC) as rank
    FROM t1
    WHERE group_id = :group_id
  ) temp
  LEFT JOIN t1 ON (t1.ROWID = temp.t1_rowid),
  LEFT JOIN t2 ON (t2.id = t1.t2_id)
WHERE temp.rank = 1;

 

----------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name   | Starts | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------
|*  1 |  HASH JOIN OUTER                |        |      1 |    479 |00:00:03.11 |    2129 |   1802 |
|   2 |   NESTED LOOPS OUTER            |        |      1 |    479 |00:00:02.82 |    1502 |   1182 |
|*  3 |    VIEW                         |        |      1 |    479 |00:00:02.37 |    1023 |   1016 |
|*  4 |     WINDOW NOSORT               |        |      1 |    192K|00:00:02.14 |    1023 |   1016 |
|*  5 |      INDEX RANGE SCAN           | T1_IX2 |      1 |    192K|00:00:01.89 |    1023 |   1016 |
|   6 |    TABLE ACCESS BY USER ROWID   | T1     |    479 |    479 |00:00:00.45 |     479 |    166 |
|   7 |   TABLE ACCESS FULL             | T2     |      1 |  27142 |00:00:00.23 |     627 |    620 |
----------------------------------------------------------------------------------------------------

P.S. I learned the ROWID-to-ROWID trick from SQL Tuning by Dan Tow.

Categories: Oracle, SQL
  1. Narendra
    December 7, 2009 at 2:56 am
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: