The First International NoCOUG SQL Challenge: Nine Ways To Change a Lightbulb
The First International NoCOUG SQL Challenge sponsored by Apress is now closed. Thanks to Chen Shapira for helping to publicize the challenge. The winner will be announced in the next issue of the NoCOUG Journal.
The challenge was to use a table of probabilities to compute the probabilities of obtaining different sums in N throws of an unbiased die—N being a substitution variable or bind variable. Since the value of N is indeterminate, this apparently calls for the table to be joined to itself an indeterminate number of times—a seemingly impossible feat. However, there were nine very original solutions—including one Postgres solution—from seven different countries. The modus operandi was for each participant to publish their solution in a blog post; this preserved their claim on the solution and generated additional publicity for the challenge.
The first solution was quickly found by Laurent Schneider from Switzerland who knew that a table could be joined to itself an indeterminate number of times using Oracle’s proprietary CONNECT BY clause. He then constructed expressions for the required probabilities using the SYS_CONNECT_BY_PATH function. All that remained was to evaluate the expressions which he accomplished using the XMLQUERY function. Laurent is the author of Advanced Oracle SQL Programming: The Expert Guide to Writing Complex Queries.
The fourth solution was found by Vadim Tropashko from the USA who observed that recursive common table expressions—already available in DB2 and SQL Server and expected in Oracle Database 11gR2—could be used to perform the required joins. Vadim is the author of SQL Design Patterns: Expert Guide to SQL Programming.
The fifth and sixth solutions were found by Alberto Dell’Era from Italy who used advanced mathematical techniques to calculate the required probabilities for extremely high values of N. He recognized that the table of probabilities defined a mathematical function and that the process of joining the table with itself and grouping the results is the so-called “convolution” of this function with itself. Throwing the die N times is therefore equivalent to performing N – 1 convolutions. Convolutions have obvious advantages over an N-way Cartesian join because they keep the size of intermediate results in check. The question is how to compute the required N convolutions within a single SQL query if the value of N is indeterminate. Alberto solved the problem using a “Discrete Fourier Transform;” a certain new function derived from the original function. Fourier transforms have the interesting property that the transform of the convolution of functions is the simple product of the individual transforms; this is called the convolution theorem. Therefore, the Fourier transform of the N-way convolution of the function with itself is the N-th power of the Fourier transform of the function. The Fourier transform in this case is easy to compute and the N-th power of the Fourier transform is also easy to compute. At this point, Alberto has the Fourier transform of the N-way convolution instead of the N-way convolution itself. To obtain the N-way convolution itself, he computes the “inverse Fourier transform” of the Fourier transform that he has already derived. Alberto provided an even more advanced version of the above method using a mathematical technique called the Fast Fourier Transform which he implemented using the MODEL clause.
The eighth solution was found by a blogger named Cd-MaN from Romania who demonstrated the use of pipelined table functions and recursion in a Postgres database; the solution can be modified to work in an Oracle database.