Weird SQL Results
Chris Taylor raised a good question.
Here is the interesting part (to me) – Sometimes it returns 1 row, sometimes it returns more than 1 row, and sometimes it returns no rows. To my mind, “trunc(dbms_random.value(1,500)” should return a random value between 1 & 500 every time and just one value. Is this weird, or is it just me???
Basically he’s wondering why querying a table on a randomly-chosen unique identifier does not always return exactly 1 row.
At first glance it seems nonsensical and looks like a bug, but really, it’s not – in fact, Oracle is doing exactly the right thing (anecdotally, a colleague has told me that SQL Server gets this particular behaviour wrong).
A simple test case: the inner query generates exactly 100 rows, with the numbers 1 to 100. The outer query then queries this result set, with the intention of picking one of those rows only:
select * from ( select rownum r from dual connect by level <= 100 ) where r = trunc(dbms_random.value(1,100));
However, what happens here is that Oracle knows that dbms_random.value is not deterministic; and the predicates (WHERE clause) must be evaluated once for each row returned from the FROM clause. Since dbms_random.value is not deterministic, Oracle knows it must re-evaluate it for each row, separately – which means it is comparing a different number for each row returned.
So, for example, Oracle looks at row 1, with the value 1. It generates a random number, e.g. 12, and so the predicate evaluates to FALSE, and the row is not returned. Oracle then looks at row 2, which has the value 2. It generates a random number, e.g. 2, and so the predicate evaluates to TRUE, and the row is returned. It does this for each row until the 100th is evaluated and then the query stops.
If none of the 100 random numbers happen to coincide with the values returned from the table, then the query will return no rows.
Quiz question: what is the theoretical maximum number of rows the above query could ever return?
To fix it so that Oracle only evaluates the selection criteria once, we just need to wrap it in another subquery:
select * from ( select rownum r from dual connect by level <= 100 ) where r = (select trunc(dbms_random.value(1,100)) from dual);
Gary
29 April 2010 - 1:56 pm
The returned value must be LESS than the high_value supplied to dbms_random, so you won’t get a row where r=100. So I’ll opt for 99, though the chances of that happening are pretty astronomical. Maybe calculating the odds will be the followup question.
PS. The behaviour becomes more obviously correct if your query is like
SELECT * FROM table
WHERE col_1 = TRUNC(DBMS_RANDOM.VALUE(col_2, col_3))
I can’t think where you would use that, though possibly
UPDATE table
SET col_1 = TRUNC(DBMS_RANDOM.VALUE(col_min, col_max))
might have a use case.
Jeffrey Kemp
29 April 2010 - 2:36 pm
I think the odds of getting the sequence 1..6 in that order, with a perfectly random die, in exactly six rolls, would be 1/6*1/6*1/6*1/6*1/6*1/6 = 1/6**6 ~= 2.14334e-5.
Therefore, the odds of getting the sequence 1..99 in that order should be = 1/99**99 ~= 2.70467e-198, which is pretty much impossible, for all practical purposes 🙂
BTW those are excellent sample queries, they illustrate clearly why dbms_random.value must be called for each row.