Report query with optional parameters
I received a question today from a developer who wanted to write a single static SQL query that could handle multiple optional parameters – i.e. the user might choose to leave one or more of the parameters NULL, and they’d expect the query to ignore those parameters. This is a quite common requirement for generic reporting screens, and there are two different methods commonly used to solve it.
Their sample query, using bind variables (natch), never returned any rows if any of the bind variables were null:
SELECT * FROM emp WHERE job = :P_JOB AND dept = :P_DEPT AND city = :P_CITY
This is expected, of course, because “x = null” always evaluates to “unknown”, and this causes the rows to be omitted.
Option 1: add “OR v NOT NULL”, e.g.
SELECT * FROM emp WHERE (job = :P_JOB OR :P_JOB IS NULL) AND (dept = :P_DEPT OR :P_DEPT IS NULL) AND (city = :P_CITY OR :P_CITY IS NULL)
Option 2: use NVL, e.g.
SELECT * FROM emp WHERE job = NVL(:P_JOB, job) AND dept = NVL(:P_DEPT, dept) AND city = NVL(:P_CITY, city)
If the columns in the table do not have NOT NULL constraints on them, Option #2 will fail to return rows that have NULL in the relevant column – regardless of whether the user parameter is null or not. This is because “job = job” will always be “unknown” if job is null. In this case, Option #1 must be used.
If the columns do have NOT NULL constraints on them, then both Option #1 and Option #2 will work just fine. However, given the choice I would use Option #2 in order to take advantage of the potential performance optimisation that Oracle 12 can do with these types of NVL queries. There is a 3rd option, which is identical to Option #2 except that it uses the COALESCE function instead of NVL – but I would avoid this option as it will not get the performance optimisation.
On the other hand, if any of the attributes is the result of a costly operation (e.g. a function call), I would always use Option #1 (“OR NULL”) instead, because the NVL does not use short-circuit evaluation to avoid multiple function calls.
If there is a mix of columns that have NOT NULL constraints and others that don’t, I don’t really see any problem with mixing the two methods, e.g. in the case where dept has a NOT NULL constraint but job and city don’t:
SELECT * FROM emp WHERE (job = :P_JOB OR :P_JOB IS NULL) AND dept = NVL(:P_DEPT, dept) AND (city = :P_CITY OR :P_CITY IS NULL)
Here’s a question for you to think about. What if the business rule states that the report should omit records where a column is null (i.e. the column may have nulls but they don’t want those records to ever appear in the report)? You may as well use NVL, e.g. in the case where dept has a NOT NULL constraint, but job and city don’t, but the report should omit records where job is null:
SELECT * FROM emp WHERE job = NVL(:P_JOB, job) AND dept = NVL(:P_DEPT, dept) AND (city = :P_CITY OR :P_CITY IS NULL)
You might argue that future developers might be confused by the above query; it’s not exactly clear whether the developer intended to omit the records with null jobs, or if they made a mistake. Code comments might help, but alternatively you might choose to make the rule explicit, e.g.:
SELECT * FROM emp WHERE job = NVL(:P_JOB, job) AND job IS NOT NULL AND dept = NVL(:P_DEPT, dept) AND (city = :P_CITY OR :P_CITY IS NULL)
If you feel strongly about this one way or another, please leave your comments below 🙂
This topic is a reminder that when there are multiple possible solutions to a problem, the choice should not be taken arbitrarily; and we should avoid enshrining one choice in any standards document as the “one true way”. This is because the answer is often “it depends” – different options may be valid for different scenarios, and have advantages and disadvantages that need to be taken into account.
Further Reading
Jorge Rimblas
15 February 2018 - 10:55 am
Nice blog post. Very good points, I like being purposeful and coding the intent.
Very interesting how this NVL and COALESCE discussion is evolving.
Juno Tasli
2 June 2020 - 5:02 am
Thank you, this was helpfull. What if the optional parameters are coming from an LOV that let’s the user select multiple values from the list?
Jeffrey Kemp
3 June 2020 - 7:31 am
Hi Juno,
Typically the result from a multi-select list item are stored as a colon-delimited list, e.g.
ITEM1:ITEM2:ITEM3
. You could parse this into a table of values to get best performance, but typically I just use a simple string search to find matches, e.g.:Jeff
Debraj
5 June 2020 - 10:30 pm
My insight
Long back I tried this approach on a huge table and the performance was awful. Then I tried dynamic sql based reporting., that only adds to the condition which have values and the performance upgraded enormously.
Conclusion : This approach may be fine with small tables, but surely not good on large tables. Apply dynamic sql based report then.
Jeffrey Kemp
5 June 2020 - 10:37 pm
Thanks Debraj, that’s true. I’ve seen the same thing.
Just recently I tried dynamically generating the query for this purpose. It made quite a good improvement for one report, but made it worse for another report (the reports both had complex but different queries and somewhat different data volumes).
Only issue it might add in some cases is the cost of parsing. I think results will vary depending on the complexity of the query.
Jeff