Results of the Second International NoCOUG SQL Challenge
Follow @Oratweets
As published in the 100th issue of the NoCOUG Journal (November 2011)
Also see Results of the First International NoCOUG SQL Challenge
The Second International NoCOUG SQL Challenge was published on 2/13/11 in the February 2011 issue of the NoCOUG Journal. SQL commands to create the data were provided at http://bit.ly/g58WVn. The challenge was to find the secret message hidden in a seemingly random collection of words. The winners are Andre Araujo (Australia), Rob van Wijk (Netherlands), and Ilya Chuhnakov (Russia.) Each winner will receive an Amazon Kindle from contest sponsor Pythian and the August Order of the Wooden Pretzel in keeping with the pronouncement of Steven Feuerstein that “some people can perform seeming miracles with straight Es-Cue-El, but the statements end up looking like pretzels created by somebody who is experimenting with hallucinogens.”
The first reaction to the challenge was one of puzzlement. Van Wijk wrote on 2/14/11 on his blog: “Unfortunately, I don’t understand what needs to be done. Is it forming a sentence? Three sentences? Do all words need to be used? If so, lots of sentences can be made; how do I know which is the right one? I’m afraid I don’t think it is a *SQL* Challenge, but I may be missing something.” However, the puzzle quickly fell to the combined onslaught of the international database community. At 6:11 AM PST on 2/15/11 we received a solution from Araujo. He had realized that the words formed a binary tree and used “recursive common table expressions” to decode the secret message (the winning solution to a contest conducted by columnist Marilyn vos Savant in which contestants had to write a sensible paragraph of one hundred unique words).
“TRYING TO TYPE ONE HUNDRED DISTINCT WORDS IN A SINGLE PARAGRAPH IS REALLY TOUGH IF I CANNOT REPEAT ANY OF THEM THEN PROBABLY THOSE WITH MANY LETTERS SHOULD BE USED MAYBE SOME READERS WILL UTILIZE DICTIONARIES THESAURUSES THESAURI OR POSSIBLY EVEN ENCYCLOPEDIAS BUT MY PREFERENCE HAS ALWAYS BEEN THAT GRAY MATTER BETWEEN YOUR EARS SERIOUSLY MARILYN CHALLENGES SUCH AS THIS REQUIRE SKILLS BEYOND MATH SCIENCE AND PHYSICS SO WHAT DO YOU ASK READING COMPREHENSION WRITING ABILITY GOOD OLD FASHIONED ELBOW GREASE SCIENTISTS DONT CARE ABOUT STRUCTURE THEY WANT RESULTS HEY LOOK ONLY ELEVEN MORE LEFT FIVE FOUR THREE TWO DONE”
Araujo posted a detailed analysis of the problem at http://www.pythian.com/news/20757/nocoug-sql-challenge-entry-2/. He admitted that—even though he had successfully decoded the secret message—his solution would not work for all binary trees. At 1:08 PM PST the same day, van Wijk sent us a recursive CTE solution that works for all binary trees. Here is the solution with some modifications for extra clarity.
-- Assign an ordering string to each node
WITH CTE(word1, word2, word3, ordering) AS
(
-- This is the anchor member of the recursive CTE
-- Identify the root of the binary tree
SELECT
r.word1, r.word2, r.word3,
-- The ordering string for the root node is '1'
cast('1' AS VARCHAR2(4000) ) AS ordering
FROM riddle r
WHERE NOT EXISTS (
SELECT * FROM riddle r2
WHERE r.word2 IN (r2.word1, r2.word3) )
UNION ALL
-- This is the recursive member of the recursive CTE
-- Identify the left and right nodes if any
SELECT
r.word1, r.word2, r.word3,
-- Compute the ordering string for this node
CASE
-- Handle the case of a left node
WHEN r.word2 = CTE.word1
-- Change the last digit to '0' and then append '1'
THEN replace(CTE.ordering, '1', '0') || '1'
-- Handle the case of a right node
WHEN r.word2 = CTE.word3
-- Change the last digit to '2' and then append '1'
THEN replace(CTE.ordering, '1', '2') || '1'
END AS ordering
FROM riddle r JOIN CTE
ON r.word2 IN (CTE.word1, CTE.word3)
)
-- Sort the words using the ordering string
SELECT word2 FROM CTE ORDER BY ordering;
A recursive CTE consists of an “anchor” member and one or more “recursive” members. The anchor member generates seed data while the recursive member generates additional data. Any additional data is fed right back to the recursive member and the process continues until no more data is found. To help understand van Wijk’s solution, store the words of the sentence “Quick brown Fox jumps over the lazy dog” in a binary tree as shown in Figure 1.
CREATE TABLE riddle
(
word1 VARCHAR2(32),
word2 VARCHAR2(32) NOT NULL,
word3 VARCHAR2(32)
);
INSERT INTO RIDDLE VALUES (NULL, 'Quick', NULL);
INSERT INTO RIDDLE VALUES ('Quick', 'brown', NULL);
INSERT INTO RIDDLE VALUES ('brown', 'Fox', 'dog' );
INSERT INTO RIDDLE VALUES (NULL, 'jumps', NULL);
INSERT INTO RIDDLE VALUES ('jumps', 'over', NULL);
INSERT INTO RIDDLE VALUES ('over', 'the', 'lazy' );
INSERT INTO RIDDLE VALUES (NULL, 'lazy', NULL);
INSERT INTO RIDDLE VALUES ('the', 'dog', NULL);
The sentence can be reconstructed by traversing the tree in “in-order” fashion which involves performing the following operations recursively at each node starting with the root node: first traverse the left sub-tree (if any), then process the node itself, and finally traverse the right sub-tree (if any.) Recursion can be achieved in SQL queries using “recursive common table expressions” (recursive CTEs). However, recursive CTEs only permit “pre-order” traversal (parent, left sub-tree, right sub-tree) not “in-order” traversal. Van Wijk worked around the problem by using a two-phase approach. An ordering string is generated for each node during the pre-order traversal of the tree (Figure 2) and the results are then sorted using the ordering string.
On 2/16/11 (incorrectly stated as 3/16/11 in the NoCOUG Journal, refer to http://iggyfernandez.wordpress.com/2011/02/13/hot-off-the-press-the-nocoug-journal-and-the-second-international-nocoug-sql-challenge%E2%80%8F/#comment-916), Chuhnakov submitted two solutions. The first used the MODEL clause which—in his words—works “automagically.” Columns are classified into “dimension” and “measure” arrays where the two terms have the same meaning as for fact tables in data warehouses. The Word2 column is the obvious dimension array while Word1 and Word3 are measure arrays. Chuhnakov creates another measure array called Text to store messages contained in sub-trees. Each Text value is recursively defined in terms of other Text values by concatenating the left sub-tree with the current node and the right sub-tree. Note that the CV function returns the current value of its argument.
SELECT MAX(text)
KEEP (DENSE_RANK LAST ORDER BY length(text) ) AS text
FROM
(
SELECT * FROM riddle
MODEL
DIMENSION BY (word2)
MEASURES
(
word1,
word3,
CAST(NULL AS VARCHAR2(4000) ) AS text
)
RULES AUTOMATIC ORDER
(
text [ word2 ] = trim ( text [ word1 [ CV ( word2 ) ] ] )
|| ' ' || CV ( word2 )
|| ' ' || trim ( text [ word3 [ CV ( word2 ) ] ] )
)
);
Chuhnakov’s second solution was similar to van Wijk’s solution but used the CONNECT BY method.
-- Assign an ordering string to each node
WITH CTE AS
(
SELECT
r.word2,
-- Compute the ordering string for this node
replace(sys_connect_by_path(
CASE
WHEN r.word2 = PRIOR word1 THEN '0'
WHEN r.word2 = PRIOR word3 THEN '2'
END, '/' ), '/' ) || '1' AS ordering
FROM riddle r
-- Identify the root of the binary tree
START WITH NOT EXISTS (
SELECT * FROM riddle r2
WHERE r.word2 IN (r2.word1, r2.word3) )
-- Identify the left and right nodes if any
CONNECT BY r.word2 IN (PRIOR r.word1, PRIOR r.word3)
)
-- Sort the words using the ordering string
SELECT word2 FROM CTE ORDER BY ordering;
Other solutions using techniques similar to the ones already described were subsequently received.
Addendum
The reference solution used by the judges involved “threading” the binary tree by identifying the successor and predecessor of each node. The successor of each node is the leftmost node in its right subtree while its predecessor is the rightmost node in its left subtree.
WITH
-- Identify successors of those nodes that have a right child
-- Nodes that do not have a right child do not have any successors
-- The successor of each node is the leftmost node in its right subtree
successors(root, lvl, word1, word2, word3, comments) AS
(
SELECT
word2 AS root,
0 AS lvl,
word1,
word2,
word3,
'Identify successors of "' || word2 || '"' AS comments
FROM riddle
WHERE word3 IS NOT NULL
UNION ALL
SELECT
suc.root,
suc.lvl + 1 AS lvl,
rid.word1,
rid.word2,
rid.word3,
'"' || rid.word2 || '" is a level ' || (suc.lvl + 1) || ' successor of "' || suc.root || '"' AS comments
FROM
successors suc,
riddle rid
WHERE rid.word2 = CASE WHEN suc.lvl = 0 THEN suc.word3 ELSE suc.word1 END
)
SEARCH DEPTH FIRST BY word2 SET seq#,
-- Identify predecessors of those nodes that have a left child
-- Nodes that do not have a left child do not have any predecessors
-- The predecessor of each node is the rightmost node in its left subtree
predecessors(root, lvl, word1, word2, word3, comments) AS
(
SELECT
word2 As ROOT,
0 AS lvl,
word1,
word2,
word3,
'Identify predecessors of "' || word2 || '"' AS comments
FROM riddle
WHERE word1 IS NOT NULL
UNION ALL
SELECT
pre.root,
pre.lvl + 1 AS lvl,
rid.word1,
rid.word2,
rid.word3, '"' || rid.word2 || '" is a level ' || (pre.lvl + 1) || ' predecessor of "' || pre.root || '"' AS comments
FROM
predecessors pre,
riddle rid
WHERE rid.word2 = CASE WHEN pre.lvl = 0 THEN pre.word1 ELSE pre.word3 END
)
SEARCH DEPTH FIRST BY word2 SET seq#,
-- Create a linked list of nodes and successors
-- If a node has more than one successor then choose the farthest one
LinkedList(node, successor) AS
(
SELECT
root AS node,
word2 AS successor
FROM successors suc
WHERE lvl = (SELECT max(lvl) FROM successors WHERE root = suc.root)
UNION ALL
SELECT
word2 AS node,
root AS successor
FROM predecessors pre
WHERE lvl = (SELECT max(lvl) FROM predecessors WHERE root = pre.root)
),
-- Create the ordered list of nodes and successors
-- Start with the node that is not a successor of any node
OrderedList(lvl, node, successor) AS
(
SELECT
1 AS lvl,
node,
successor
FROM LinkedList
WHERE node NOT IN (SELECT successor FROM LinkedList)
UNION ALL
SELECT
ord.lvl + 1 AS lvl,
lin.node,
lin.successor
FROM
OrderedList ord,
LinkedList lin
WHERE lin.node = ord.successor
)
-- List the nodes in the right order
-- Don't forget the node that does not have a successor
SELECT node FROM
(
SELECT
lvl,
node
FROM OrderedList
UNION ALL
SELECT
null,
successor
FROM LinkedList
WHERE successor NOT IN (SELECT node FROM LinkedList)
)
ORDER by lvl NULLS LAST;
Here are the results of each of the common table expressions in the above listing.
-- Identify successors of those nodes that have a right child
-- Nodes that do not have a right child do not have any successors
-- The successor of each node is the leftmost node in its right subtree
ROOT LVL WORD1 WORD2 WORD3 COMMENTS SEQ#
----- ------ ----- ----- ----- -------------------------------------------------- ------
Fox 0 brown Fox dog Identify successors of "Fox" 1
1 the dog "dog" is a level 1 successor of "Fox" 2
2 over the lazy "the" is a level 2 successor of "Fox" 3
3 jumps over "over" is a level 3 successor of "Fox" 4
4 jumps "jumps" is a level 4 successor of "Fox" 5
the 0 over the lazy Identify successors of "the" 6
1 lazy "lazy" is a level 1 successor of "the" 7
-- Identify predecessors of those nodes that have a left child
-- Nodes that do not have a left child do not have any predecessors
-- The predecessor of each node is the rightmost node in its left subtree
ROOT LVL WORD1 WORD2 WORD3 COMMENTS SEQ#
----- ------ ----- ----- ----- -------------------------------------------------- ------
Fox 0 brown Fox dog Identify predecessors of "Fox" 1
1 Quick brown "brown" is a level 1 predecessor of "Fox" 2
brown 0 Quick brown Identify predecessors of "brown" 3
1 Quick "Quick" is a level 1 predecessor of "brown" 4
dog 0 the dog Identify predecessors of "dog" 5
1 over the lazy "the" is a level 1 predecessor of "dog" 6
2 lazy "lazy" is a level 2 predecessor of "dog" 7
over 0 jumps over Identify predecessors of "over" 8
1 jumps "jumps" is a level 1 predecessor of "over" 9
the 0 over the lazy Identify predecessors of "the" 10
1 jumps over "over" is a level 1 predecessor of "the" 11
-- Create a linked list of nodes and successors
-- If a node has more than one successor then choose the farthest one
NODE SUCCE
----- -----
Fox jumps
the lazy
brown Fox
Quick brown
lazy dog
jumps over
over the
-- Create the ordered list of nodes and successors
-- Start with the node that is not a successor of any node
LVL NODE SUCCE
------ ----- -----
1 Quick brown
2 brown Fox
3 Fox jumps
4 jumps over
5 over the
6 the lazy
7 lazy dog
-- List the nodes in the right order
-- Don't forget the node that does not have a successor
NODE
-----
Quick
brown
Fox
jumps
over
the
lazy
dog
Results of the First International NoCOUG SQL Challenge
Download the 100th issue of the NoCOUG Journal




Recent Comments