An easy way to crash an Oracle 10.2 session

This query crashes the session whenever it is run or even just parsed.
I haven’t checked whether it affects other versions.


SQL> conn scott/tiger@orcl
Connected.

SQL> select * from v$version;
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
PL/SQL Release 10.2.0.3.0 - Production
CORE 10.2.0.3.0 Production
TNS for Solaris: Version 10.2.0.3.0 - Production
NLSRTL Version 10.2.0.3.0 - Production

SQL> explain plan for with q as
2 (select 'x' d from dual
3 union all
4 select 'y' from dual)
5 select d from q
6 union all
7 select d from q
8 order by dbms_random.value;
explain plan for with q as (select 'x' d from dual
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

The crash happened when I added the “order by dbms_random.value”. The problem seems to be in the optimiser, since I get the same result whether I run the query or just explain it. Possibly something to do with the optimiser wrongly assuming the random function is deterministic?

There’s a simple workaround, so I’m not too worried. I just change it to this and it works fine:


with q as
(select 'x' d from dual
union all
select 'y' from dual)
select d, dbms_random.value o from q
union all
select d, dbms_random.value o from q
order by o;


Do you know the difference between USER_TAB_COLS & USER_TAB_COLUMNS?

I just noticed the difference – I was wondering why I couldn’t see the virtual columns (i.e. the hidden columns that support function-based indexes) on a table, and it was because I was querying USER_TAB_COLUMNS. Once I used USER_TAB_COLS instead, there they were.

The difference is noted in the documentation (Oracle 10g Release 2 (10.2)) for USER_TAB_COLS and ALL_TAB_COLS but not for USER_TAB_COLUMNS or ALL_TAB_COLUMNS, unfortunately.
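
A quick way to see the difference for yourself is to create a function-based index and then compare the two views. This is just a sketch (DEMO_FBI and its index are made-up names):

create table demo_fbi (surname varchar2(100));

create index demo_fbi_ix on demo_fbi (upper(surname));

-- shows SURNAME plus a hidden SYS_NC...$ column for the index expression
select column_name, hidden_column
from user_tab_cols
where table_name = 'DEMO_FBI';

-- shows only SURNAME
select column_name
from user_tab_columns
where table_name = 'DEMO_FBI';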


Generic Audit

Looking at the column comments on this table, I can sympathise with the poor soul who, after painstakingly describing 100 columns, decided it just wasn’t worth it any more…

table audit_history (column comments in quotes):

id number(18) “Unique identifier”
aud_table_name varchar2(30) “audit record table”
audit_action varchar2(50) “audit action e.g. update”
aud_timestamp date “timestamp of the audit”
aud_user_id varchar2(100) “user performing the action”
aud_col_1 varchar2(4000) “audit column 1”
aud_col_2 varchar2(4000) “audit column 2”
aud_col_3 varchar2(4000) “audit column 3”
… (etc.) …
aud_col_99 varchar2(4000) “audit column 99”
aud_col_100 varchar2(4000) “audit column 100”
aud_col_101 varchar2(4000)
aud_col_102 varchar2(4000)
… (etc.) …
aud_col_139 varchar2(4000)
aud_col_140 varchar2(4000)

Tip: Don’t let anyone even think about using this kind of design for their change history auditing requirements. It might look elegant to them, but it is not good. Just don’t.


Generating test data that matches existing data

I’ve had to create test data a number of times, including data for tables that had mandatory foreign keys to existing tables. It was not feasible to simply create new master rows for my test data; instead, I wanted to refer to a random sample of existing data. The code that generates the test data also had to perform reasonably well, even though it had to pick out random values from a very large table.

Solution? A combination of the SAMPLE clause and DBMS_RANDOM. To illustrate:


(create a “very large table”)
SQL> create table t as
2 select rownum n, dbms_random.string('a',30) v
3 from all_objects;

Table created.

SQL> select count(*) from t;

COUNT(*)
———-
40981

(get a random sample from the table)
SQL> select n, substr(v,1,30) from t sample(0.01)
2 order by dbms_random.value;

N SUBSTR(V,1,30)
———- ——————————
11852 xSsdmFtGqkymbKCFoZwUzNxpJAPwaV
8973 RGyNjqMfVayKdiKFGvLYuAFYUpIbCw
25295 eJJtoieSWtzUTIZXCbOLzmdmWHHPOy
297 hiTxUPYKzWKAjFRYTTfJSSCuOwGGmG
1924 yZucJWgkFviAIeXiSCuNeUuDjClvxt
40646 wMTumPxfBMoAcNtVMptoPchILHTXJa

6 rows selected.

SQL> set serveroutput on

(Get a single value chosen at random)
SQL> declare
2 cursor cur_t is
3 select n from t sample(0.01)
4 order by dbms_random.value;
5 l number;
6 begin
7 open cur_t;
8 fetch cur_t into l;
9 dbms_output.put_line(l);
10 close cur_t;
11* end;
SQL> /
21098

PL/SQL procedure successfully completed.


My test code would open the cursor, fetch as many values as it needed, and then close it. If the cursor ran out of values (e.g. the sample was too small for the desired amount of test data, which varied), my code just re-opened the cursor to fetch another set of random values from the large table.

The reason I sort the sample by dbms_random.value is so that if I only want one value, it is not weighted towards rows found nearer the start of the table.

Note: If I didn’t really care about the sample being picked at random from throughout the table, I could have just selected from the table “where rownum < n”.
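
For what it’s worth, the fetch-and-re-open logic described above might look something like the following. This is only a rough sketch, with a hard-coded target count for illustration; production code would also guard against the sample coming back empty:

declare
  cursor cur_t is
    select n from t sample(0.01)
    order by dbms_random.value;
  type num_tab is table of number;
  l_vals     num_tab := num_tab();
  l_required constant pls_integer := 100; -- how many random values we want
  l_n        number;
begin
  open cur_t;
  while l_vals.count < l_required loop
    fetch cur_t into l_n;
    if cur_t%notfound then
      -- sample exhausted; re-open to draw a fresh random sample
      close cur_t;
      open cur_t;
    else
      l_vals.extend;
      l_vals(l_vals.count) := l_n;
    end if;
  end loop;
  close cur_t;
  dbms_output.put_line('collected ' || l_vals.count || ' values');
end;
/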



Quick Quiz – Sequence NEXTVAL and CURRVAL

Answer this in your head before trying it out or looking it up. Assume the following script is run in a single session on an Oracle database:

CREATE SEQUENCE seq START WITH 1 INCREMENT BY 1;

CREATE TABLE t (a INTEGER, b INTEGER);

-- Statement #1
INSERT INTO t VALUES (seq.NEXTVAL, seq.NEXTVAL);

-- Statement #2
INSERT INTO t VALUES (seq.CURRVAL, seq.NEXTVAL);

-- Statement #3
INSERT INTO t VALUES (seq.CURRVAL, seq.CURRVAL);

Which of the following is/are true?

  1. The inserted rows will be {1,2}, {2,3} and {3,3}.
  2. The inserted rows will be {1,1}, {2,2} and {2,2}.
  3. The inserted rows will be {1,1}, {1,2} and {2,2}.
  4. Statements #2 and #3 will each raise “ORA-08002: sequence SEQ.CURRVAL is not yet defined in this session”.
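
Once you’ve committed to an answer, checking it in a scratch schema is as simple as running the script above and then:

SELECT * FROM t;

-- clean up afterwards
DROP TABLE t;
DROP SEQUENCE seq;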

Oracle 101: The (Very) Basics

Oracle Database SQL Reference 10g Release 1 (10.1): Datatype Comparison Rules – Numeric Values

“A larger value is considered greater than a smaller one.”

“All negative numbers are less than zero and all positive numbers.”

“-1 is less than 100; -100 is less than -1.”

On first reading these statements seem obvious. The more you read them, however, the more they take on a deeper meaning, a deeper structure and beauty that transcends this, sublimates that and begins subtly but inexorably to change the way you view the world. You start to question their simplicity – are they perhaps over-simplifications of a more complex reality? Well, perhaps not.

I guess they had to include these statements for completeness, since later they get into the more complicated cases of comparison rules for dates and strings, which are not (necessarily) so obvious. For example, I haven’t come across anyone who thinks 12 April 1961 is greater than 20 July 1969, but I’m sure there are some.


And you thought OR was commutative…

I came across this at dbdebunk (ON THE NOTHING THAT’S WRONG WITH NULLS with Hugh Darwen, Fabian Pascal). I couldn’t let that go, so I tested it under 9i (9.1) and 10g (10.2) and got identical results. The last two statements are logically equivalent, but give different results!

SQL> create type point as object (x real, y real);
  2  /

Type created.

SQL> create table t (p point);

Table created.

SQL> insert into t values (point(null, null));

1 row created.

SQL> insert into t values (point(1, null));

1 row created.

SQL> insert into t values (point(1, 2));

1 row created.

SQL> select * from t;

P(X, Y)
——————————————————————————–
POINT(NULL, NULL)
POINT(1, NULL)
POINT(1, 2)

SQL> select * from t where p=p or not p=p;

P(X, Y)
——————————————————————————–
POINT(1, 2)

SQL> select * from t where not p=p or p=p;

P(X, Y)
——————————————————————————–
POINT(1, NULL)
POINT(1, 2)

According to CM, it looks like a bug. The filter predicates seem to be applied incorrectly by the optimiser:

where ( p=p ) or ( not p=p );

filter("T"."SYS_NC00003$"="T"."SYS_NC00003$" AND
"T"."SYS_NC00002$"="T"."SYS_NC00002$" OR T."P"<>T."P")

where ( not p=p ) or ( p=p );

filter(T."P"<>T."P" OR "T"."SYS_NC00002$"="T"."SYS_NC00002$")



Simplify your SQL – Avoid Unnecessary Joins

I can’t count how many times I’ve seen SQL like the following:

SELECT b.some_column
FROM a, b
WHERE a.id = :p_id
AND b.id = a.id;

(where table a is unique on a.id, and table b has a referential integrity constraint to a, i.e. b.id references a.id)
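
In other words, the setup is something like this (a minimal sketch; the constraint names and extra columns are invented):

create table a (
  id           number constraint a_pk primary key,
  a_data       varchar2(100)
);

create table b (
  id           number constraint b_a_fk references a (id),
  some_column  varchar2(100)
);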

I’ve seen many variations on this theme, where tables are unnecessarily referenced in a query. In this case, the query should have been formulated as:

SELECT b.some_column
FROM b
WHERE b.id = :p_id;

To be fair, this can often be blamed on maintenance by multiple programmers over time: maybe at some point some data from table a was needed and was only later removed, or it’s a cut-and-paste artifact from a less experienced programmer.

The main cost here is code complexity (generally, the simpler the code, the easier it is to understand and therefore maintain); it will not usually be a performance issue. On Oracle, as long as you have a unique constraint on a.id, the optimiser will generally infer that :p_id can be applied directly to table b, although it will probably still do a quick index unique scan on the a.id index (technically, it shouldn’t even need to do that, given the foreign key constraint from b.id to a.id and the unique constraint on a.id).

[Aside: Of course, if those constraints did NOT exist, then the two queries above would be logically different, and only one or the other would be correct for a given situation. The programmer must in either case know why they are or are not querying any particular table.]

Here’s a simple example using the SCOTT sample schema (run on Oracle 10.1.0.2, but should be pretty much the same on any version). I’ll issue two queries that are logically equivalent; the first one will unnecessarily access DEPT; the second one will access EMP directly.

==========================================================

SQL> select emp.ename from emp, dept where emp.deptno = dept.deptno and dept.deptno = 10;

ENAME
———-
CLARK
KING
MILLER

Elapsed: 00:00:00.01

Execution Plan
———————————————————-
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3 Card=5 Bytes=60)
1 0 NESTED LOOPS (Cost=3 Card=5 Bytes=60)
2 1 INDEX (UNIQUE SCAN) OF 'PK_DEPT' (INDEX (UNIQUE)) (Cost=0 Card=1 Bytes=3)
3 1 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=5 Bytes=45)

Statistics
———————————————————-
0 recursive calls
0 db block gets
9 consistent gets
0 physical reads
0 redo size
442 bytes sent via SQL*Net to client
508 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed

SQL> select emp.ename from emp where emp.deptno = 10;

ENAME
———-
CLARK
KING
MILLER

Elapsed: 00:00:00.01

Execution Plan
———————————————————-
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3 Card=5 Bytes=45)
1 0 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=5 Bytes=45)

Statistics
———————————————————-
1 recursive calls
0 db block gets
8 consistent gets
0 physical reads
0 redo size
442 bytes sent via SQL*Net to client
508 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed

==========================================================

As you can see, the second query is much simpler. The performance is pretty much identical.

What I haven’t included above are the predicates for the plans; if I had, you’d see that the optimiser does actually apply the “deptno = 10” predicate directly to EMP (thanks to Connor for pointing this out).
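
If you want to see them for yourself, DBMS_XPLAN includes a Predicate Information section that the older-style AUTOTRACE plan output above doesn’t. Something like this (just a sketch) should do it:

explain plan for
  select emp.ename
  from emp, dept
  where emp.deptno = dept.deptno
  and dept.deptno = 10;

select * from table(dbms_xplan.display);

-- the Predicate Information section should show the deptno = 10
-- filter applied directly to EMP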

In these simple queries it may not seem like much of an issue; but for queries involving many tables with many predicates, every bit of unnecessary code makes it that much harder to read, understand and maintain.