WITH With an IN; or, A Reason to Refactor
A work colleague needed to make a change to a report and came up against a brick wall. He knew what he wanted to express in SQL, but Oracle wouldn’t accept his syntax.
The original query used a WITH clause like this (note, I’ve removed a great deal of irrelevant detail here, the actual query had a lot of other stuff going on, but this will do to illustrate this point):
WITH q AS
(SELECT expensive_function1(:id) idpart1
,expensive_function2(:id) idpart2
FROM DUAL)
SELECT t.*
FROM q, some_table t
WHERE q.idpart1 = t.idpart1
AND q.idpart2 = t.idpart2;
This works well because the two expensive functions are run only once in the query, and the results are used to probe some_table on its compound key.
The requirements had changed, however, and my colleague needed to change it so that instead of calling the expensive functions, it now got multiple keys from another table. His first cut looked like this:
SELECT t.*
FROM some_table t
WHERE (t.idpart1, t.idpart2) IN
(SELECT idpart1, idpart2
FROM driving_table
WHERE id = :id);
This worked fine, but my colleague felt that the WITH clause should be retained, e.g.:
WITH q AS
(SELECT idpart1, idpart2
FROM driving_table
WHERE id = :id)
SELECT t.*
FROM some_table t
WHERE (t.idpart1, t.idpart2) IN q;
However, Oracle doesn’t like this and raises ORA-00920: invalid relational operator when the statement is parsed.
To my mind this alternative formulation was not required, but I was interested to know what this error message meant, and why can’t we use a WITH subquery as the target for an IN statement?
As it is, however, the Oracle documentation explains why this formulation is not allowed – look up the “SELECT” command in the SQL Reference 10g Release 2 (10.2), search for “WITH”, and under “subquery_factoring_clause” we read:
“Restrictions on Subquery Factoring: This clause is subject to the following restrictions:
…
In a compound query with set operators, you cannot use the query_name for any of the component queries, but you can use the query_name in the FROM clause of any of the component queries.”
In other words, we can refer to a WITH subquery in a FROM clause but not as part of a set operator like IN.
In the case of my colleague, I recommended he use this much simpler formulation, which works just fine for this report:
SELECT t.*
FROM some_table t, driving_table d
WHERE t.idpart1 = q.idpart1
AND t.idpart2 = q.idpart2;
This shows how, when maintaining existing code, sometimes you have to go back to basics and consider whether the change to the requirements (in this case, a seemingly minor change) means that the code can be refactored to give an appreciable benefit. Caveat: this is not always the case: making large changes to existing code always carries the risk of introducing more defects.