Skip to content

Question: why can’t the optimizer do better with these?

April 14, 2011

I’m scratching my head over this one. I thought the cost-based optimizer would be smart enough to eliminate certain predicates, joins and sorts automatically, and pick a cheaper plan accordingly; but in my tests it doesn’t seem to. Can you shed any light on this?

First, the setup for my test case:

select * from v$version;
-- Oracle Database 11g Enterprise Edition
--     Release 11.2.0.1.0 - 64bit Production
-- PL/SQL Release 11.2.0.1.0 - Production
-- CORE  11.2.0.1.0  Production
-- TNS for Linux: Version 11.2.0.1.0 - Production
-- NLSRTL Version 11.2.0.1.0 - Production

create table parent_table
(parent_id number(12)     not null
,padding   varchar2(1000)
);

create table child_table
(child_id  number(12)     not null
,parent_id number(12)     not null
,padding   varchar2(1000)
);

-- I find I need at least 100,000 rows for the
-- differences of costs to become significant

insert into parent_table
select rownum, lpad('x',1000,'x')
from dual connect by level <= 100000;

insert into child_table
select rownum, rownum, lpad('x',1000,'x')
from dual connect by level <= 100000;
alter table parent_table add (
  constraint parent_pk primary key (parent_id) );
alter table child_table add (
  constraint child_pk primary key (child_id)
 ,constraint fk foreign key (parent_id)
  references parent_table (parent_id) );
create index child_table_parent_i on child_table (parent_id);
begin dbms_stats.gather_table_stats(ownname => USER
, tabname => 'PARENT_TABLE'
, estimate_percent => 100
, method_opt => 'for all columns size auto'
, cascade => TRUE); end;
begin dbms_stats.gather_table_stats(ownname => USER
, tabname => 'CHILD_TABLE'
, estimate_percent => 100
, method_opt => 'for all columns size auto'
, cascade => TRUE); end;

Case 1

explain plan for
select count(*)
from child_table;

-- index fast full scan (unique) on child_pk - cost 58 - great

explain plan for
select count(*)
from child_table
where parent_id is not null;

-- index fast full scan on child_table_parent_i - cost 62

Q.1: Why can’t this query with the NOT NULL predicate on a NOT NULL column eliminate the predicate – and use the pk index instead?

Case 2

explain plan for
select count(*)
from parent_table inner join child_table
on (parent_table.parent_id = child_table.parent_id);

-- index fast full scan (unique) on child_pk - cost 58 - great

explain plan for
select count(*)
from parent_table inner join child_table
on (parent_table.parent_id = child_table.parent_id)
order by parent_table.padding;

-- hash join
--    index fast full scan on child_table_parent_i
--    full table scan parent_table
-- - cost 9080

Q.2: The first query eliminated the join; why can’t it eliminate the ORDER BY as well, since it’s irrelevant to the results?

About these ads

From → SQL

5 Comments
  1. barfo rama permalink

    Maybe http://jonathanlewis.wordpress.com/2006/12/21/constraints/ has some clues.

  2. Connor permalink

    Why not take a squizz at the 10053 trace and see what you can deduce ?

  3. Connor permalink

    And I should have added – it *can* eliminate things, but perhaps just not in your particular case. Here’s an example of where it can:


    SQL> set autotrace traceonly explain
    tabtcmc@qbeta1
    SQL> select count(*)
    2 from
    3 ( select ticket_seq
    4 from tote.ticket
    5 order by 1 )
    6 /

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3887145247

    --------------------------------------------------------------------------------------------
    | Id | Operation | Name | Rows | Cost (%CPU)| Time | Pstart| Pstop |
    --------------------------------------------------------------------------------------------
    | 0 | SELECT STATEMENT | | 1 | 195K (1)| 00:39:09 | | |
    | 1 | SORT AGGREGATE | | 1 | | | | |
    | 2 | PARTITION HASH ALL | | 233M| 195K (1)| 00:39:09 | 1 | 8 |
    | 3 | INDEX FAST FULL SCAN| TICKET_PK | 233M| 195K (1)| 00:39:09 | 1 | 8 |
    --------------------------------------------------------------------------------------------

  4. I wonder how Fine Grained Access Control comes into play.
    Even with the constraint, a not-null column can appears a NULL to a particular user.

    http://download.oracle.com/docs/cd/B28359_01/network.111/b28531/vpd.htm#i1014682

    That said, I’m still cautious about optimizer smarts in ‘edge’ cases.

  5. Iudith Mentzel permalink

    Regarding the ORDER BY, as far as I remember, in an early version of Oracle (I think Oracle5) such an ORDER BY on an aggregated query, ordering by a column that is not at the same aggregation level
    would have caused an Oracle error, just like
    the ORDER BY was not accepted in a view’s
    definition query.
    Things seem to have changed since then.

    However, at least as per your execution plan, it looks like it DOES eliminate the SORT AGGREGATE step, it only performs the HASH JOIN, which, of course could have been eliminated as well, like in the first join query.

    Maybe it is a matter of the order in which the optimizer makes its decisions, something like: first deciding NOT to eliminate the join because of the presence of the parent table column in the ORDER BY, and then in fact NOT performing the ORDER BY effectively, because of the aggregation.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 195 other followers

%d bloggers like this: