Home SQL Question: why can’t the optimizer do better with these?

Question: why can’t the optimizer do better with these?

I’m scratching my head over this one. I thought the cost-based optimizer would be smart enough to eliminate certain predicates, joins and sorts automatically, and pick a cheaper plan accordingly; but in my tests it doesn’t seem to. Can you shed any light on this?

First, the setup for my test case:

select * from v$version;
-- Oracle Database 11g Enterprise Edition
--     Release 11.2.0.1.0 - 64bit Production
-- PL/SQL Release 11.2.0.1.0 - Production
-- CORE  11.2.0.1.0  Production
-- TNS for Linux: Version 11.2.0.1.0 - Production
-- NLSRTL Version 11.2.0.1.0 - Production

create table parent_table
(parent_id number(12)     not null
,padding   varchar2(1000)
);

create table child_table
(child_id  number(12)     not null
,parent_id number(12)     not null
,padding   varchar2(1000)
);

-- I find I need at least 100,000 rows for the
-- differences of costs to become significant

insert into parent_table
select rownum, lpad('x',1000,'x')
from dual connect by level <= 100000;

insert into child_table
select rownum, rownum, lpad('x',1000,'x')
from dual connect by level <= 100000;
alter table parent_table add (
  constraint parent_pk primary key (parent_id) );
alter table child_table add (
  constraint child_pk primary key (child_id)
 ,constraint fk foreign key (parent_id)
  references parent_table (parent_id) );
create index child_table_parent_i on child_table (parent_id);
begin dbms_stats.gather_table_stats(ownname => USER
, tabname => 'PARENT_TABLE'
, estimate_percent => 100
, method_opt => 'for all columns size auto'
, cascade => TRUE); end;
begin dbms_stats.gather_table_stats(ownname => USER
, tabname => 'CHILD_TABLE'
, estimate_percent => 100
, method_opt => 'for all columns size auto'
, cascade => TRUE); end;

Case 1

explain plan for
select count(*)
from child_table;

-- index fast full scan (unique) on child_pk - cost 58 - great

explain plan for
select count(*)
from child_table
where parent_id is not null;

-- index fast full scan on child_table_parent_i - cost 62

Q.1: Why can’t this query with the NOT NULL predicate on a NOT NULL column eliminate the predicate – and use the pk index instead?

Case 2

explain plan for
select count(*)
from parent_table inner join child_table
on (parent_table.parent_id = child_table.parent_id);

-- index fast full scan (unique) on child_pk - cost 58 - great

explain plan for
select count(*)
from parent_table inner join child_table
on (parent_table.parent_id = child_table.parent_id)
order by parent_table.padding;

-- hash join
--    index fast full scan on child_table_parent_i
--    full table scan parent_table
-- - cost 9080

Q.2: The first query eliminated the join; why can’t it eliminate the ORDER BY as well, since it’s irrelevant to the results?

Jeffrey Kemp 14 April 2011

dba-for-the-day / SQL /

About Jeffrey Kemp

Application Designer & Developer at Oracle specialising in Oracle APEX (Application Express), Oracle SQL and PL/SQL. Oracle ACE Alumni ♠️. Piano player, father, husband, Christian. The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Comments

barfo rama
15 April 2011 - 4:22 am

Maybe http://jonathanlewis.wordpress.com/2006/12/21/constraints/ has some clues.
Connor
15 April 2011 - 9:05 am

Why not take a squizz at the 10053 trace and see what you can deduce ?
Connor
15 April 2011 - 9:17 am

And I should have added – it *can* eliminate things, but perhaps just not in your particular case. Here’s an example of where it can:

SQL> set autotrace traceonly explain tabtcmc@qbeta1 SQL> select count(*) 2 from 3 ( select ticket_seq 4 from tote.ticket 5 order by 1 ) 6 /
Execution Plan ---------------------------------------------------------- Plan hash value: 3887145247
-------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | Pstart| Pstop | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 195K (1)| 00:39:09 | | | | 1 | SORT AGGREGATE | | 1 | | | | | | 2 | PARTITION HASH ALL | | 233M| 195K (1)| 00:39:09 | 1 | 8 | | 3 | INDEX FAST FULL SCAN| TICKET_PK | 233M| 195K (1)| 00:39:09 | 1 | 8 | --------------------------------------------------------------------------------------------
Gary
15 April 2011 - 6:30 pm

I wonder how Fine Grained Access Control comes into play.
Even with the constraint, a not-null column can appears a NULL to a particular user.

http://download.oracle.com/docs/cd/B28359_01/network.111/b28531/vpd.htm#i1014682

That said, I’m still cautious about optimizer smarts in ‘edge’ cases.
Iudith Mentzel
20 April 2011 - 6:26 pm

Regarding the ORDER BY, as far as I remember, in an early version of Oracle (I think Oracle5) such an ORDER BY on an aggregated query, ordering by a column that is not at the same aggregation level
would have caused an Oracle error, just like
the ORDER BY was not accepted in a view’s
definition query.
Things seem to have changed since then.

However, at least as per your execution plan, it looks like it DOES eliminate the SORT AGGREGATE step, it only performs the HASH JOIN, which, of course could have been eliminated as well, like in the first join query.

Maybe it is a matter of the order in which the optimizer makes its decisions, something like: first deciding NOT to eliminate the join because of the presence of the parent table column in the ORDER BY, and then in fact NOT performing the ORDER BY effectively, because of the aggregation.

Question: why can’t the optimizer do better with these?

Case 1

Case 2

About Jeffrey Kemp

Comments

Leave a Reply Cancel reply