Lesson learned today

Comment out all “DROP TABLE” commands in my scripts.

(I accidentally hit F5 when the focus was in the wrong window – which happened to contain a “DROP TABLE / CREATE TABLE” script – and my Toad session happily dropped the table into which I’d been gradually accumulating statistics for the past 3 days. And no, there’s no FLASHBACK TABLE in 9i.)

At least I kept all my scripts – rerunning them all now…


How to Change the ORDER BY When You Can’t Modify the SQL

Is it possible to change the ORDER BY on a query, if the SQL is defined in a wrapped PL/SQL package? A colleague asked me this interesting question today – and the answer was quite simply, Yes and No.

If the SQL already has an ORDER BY, then the answer is No. Fortunately, in our case, the SQL didn’t have any ORDER BY clause:

SELECT a.addressid
FROM address a, addresstype t
WHERE a.locationid = :locationid
AND a.typeid = t.typeid
AND t.typecode = 'STREET';

The plan currently being picked by the CBO (Oracle 9.2.0.8) is:

NESTED LOOPS
   TABLE ACCESS BY ROWID addresstype
      INDEX RANGE SCAN pkx_addresstype
   INDEX RANGE SCAN ix_address

For background, here are the relevant schema details:

TABLE address
(addressid NUMBER PRIMARY KEY
,locationid NUMBER NOT NULL
,typeid NUMBER NOT NULL --FOREIGN KEY to addresstype
);

INDEX ix_address (locationid, typeid, addressid);

TABLE addresstype
(typeid NUMBER PRIMARY KEY
,typecode VARCHAR2(30) NOT NULL
);

typecode is actually unique, but no constraint has been declared to the database.
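
Declaring it would be a one-liner (the constraint name here is made up) – but for this exercise the schema stays as it is:

ALTER TABLE addresstype ADD CONSTRAINT addresstype_typecode_uk UNIQUE (typecode);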

Because the first step in the plan (the TABLE ACCESS on addresstype) only ever gets one row (in practice), the index range scan on ix_address causes this query to return the rows in ascending order of addressid.

The process that runs this query gets rows one at a time, does some work to “check” it against some other data, and once it finds a “match”, it stops getting more rows.

This design probably seemed like a good idea at the time, but now the business would like this process changed so that, in the (rare) event that more than one row would “match”, the process favours a more recent address over an old one. For our purposes, “recent” may be approximated by assuming that bigger addressids are more recent than smaller ones.

We can’t change the code without a lengthy and costly call to the application vendor; it would be better if we could change it ourselves. We cannot change the PL/SQL package in which this query is defined – so what can we do? Can we change the order in which rows are returned from this query?

The answer is Yes. Since the query has no ORDER BY, the order of rows is determined only by the query plan; and if we change the query plan – which we can do with a stored outline – we can change the order in which the rows are returned.

So in this case, all we have to do is create a stored outline to add an INDEX_DESC hint:

SELECT /*+INDEX_DESC(a)*/ a.addressid
FROM address a, addresstype t
WHERE a.locationid = :locationid
AND a.typeid = t.typeid
AND t.typecode = 'STREET';

This causes the following plan to be used:

NESTED LOOPS
   TABLE ACCESS BY ROWID addresstype
      INDEX RANGE SCAN pkx_addresstype
   INDEX RANGE SCAN DESCENDING ix_address

With the stored outline, the query now returns the rows in descending order of addressid.
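
For reference, one way to wire this up in 9i (a sketch only – the outline and category names are made up, and the exact steps may differ from what we actually did) is to create outlines on both the original and the hinted statements, then swap their hints using the well-known trick of updating OUTLN.OL$HINTS:

-- outline on the original (unhintable) statement, exactly as issued by the package
CREATE OR REPLACE OUTLINE addr_orig FOR CATEGORY addr_fix ON
SELECT a.addressid
FROM address a, addresstype t
WHERE a.locationid = :locationid
AND a.typeid = t.typeid
AND t.typecode = 'STREET';

-- outline on the hinted version of the same statement
CREATE OR REPLACE OUTLINE addr_hinted FOR CATEGORY addr_fix ON
SELECT /*+INDEX_DESC(a)*/ a.addressid
FROM address a, addresstype t
WHERE a.locationid = :locationid
AND a.typeid = t.typeid
AND t.typecode = 'STREET';

-- swap the hints between the two outlines (connected as OUTLN or similarly privileged)
UPDATE outln.ol$hints
SET    ol_name = DECODE(ol_name, 'ADDR_ORIG','ADDR_HINTED', 'ADDR_HINTED','ADDR_ORIG')
WHERE  ol_name IN ('ADDR_ORIG','ADDR_HINTED');
COMMIT;

-- activate outlines in the category
ALTER SYSTEM SET use_stored_outlines = addr_fix;

Outline matching is by exact SQL text, so the statement in the first CREATE OUTLINE must match the text issued by the package character for character; and on some versions the hint counts in OUTLN.OL$ also need to be kept in sync after the swap.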

NOTE:
If you want to use this approach, it is highly dependent on being able to find a query plan that produces the ordering you want. This may mean creating a suitable index designed strategically for this purpose, if one doesn’t already exist. Even then, creating indexes won’t guarantee that you’ll find a plan that returns the rows in the order you want; generally, I’d say this approach will only be feasible for relatively simple queries.


Create Excel file from SQL query using Python

This will create a file called “emp.xls” with a nicely formatted spreadsheet based on the query specified. The header row is based on the column names, and is frozen. The column widths are set according to the max width of the data.

import cx_Oracle
from table2Excel import write_cursor_to_excel

orcl = cx_Oracle.connect('scott/tiger')
curs = orcl.cursor()
curs.execute("""
  SELECT e.ename "Employee",
         e.job "Job",
         e.hiredate "Hire Date",
         e.sal "Salary",
         e.comm "Commission",
         d.dname "Department",
         (SELECT ename FROM emp WHERE emp.empno = e.mgr) "Manager"
  FROM   emp e, dept d
  WHERE  e.deptno = d.deptno
""")
write_cursor_to_excel(curs, 'emp.xls', 'Employees')

You can get the source for table2Excel from here [table2Excel.py.txt]. Feel free to adapt it to your needs.




Add colours to your Shuttle item


I wanted to allow users to select one or more colours from a list, and to control the order of the colours, so I’ve used a Shuttle item.

I wanted to have a different background colour for each value in the list, so I started here for help. I’m not very strong with javascript (yet), but with a bit of looking around and playing I ended up with what I wanted.

I created an item on the page with the following attributes:

Display As: Shuttle

List of values definition: STATIC2:Yellow;#FFFF00, Green;#00FF00, Turquoise;#00FFFF, Pink;#FF90FF, Blue;#9090FF, Purple;#FF00FF, Red;#FF9090

Post Element Text:
<script type="text/javascript">
(function() {
  // colour both generated select lists: the "_2" list (left) and the base list (right)
  var i, list;
  list = $x("#CURRENT_ITEM_NAME#_2");
  for (i = 0; i < list.length; i++) {
    list[i].style.backgroundColor = list[i].value;
  }
  list = $x("#CURRENT_ITEM_NAME#");
  for (i = 0; i < list.length; i++) {
    list[i].style.backgroundColor = list[i].value;
  }
})();
</script>

Notes:

  • the value of each item in the list is an HTML colour code. This colour code is used to set the background colour of the item in the list.
  • the shuttle item actually involves two select lists in the generated page. If the item name is P1_SHUTTLE, the generated items will be P1_SHUTTLE_2 (the left-hand list) and P1_SHUTTLE (the right-hand list). These are referenced in the javascript via #CURRENT_ITEM_NAME# and #CURRENT_ITEM_NAME#_2.
  • the $x function returns the select list DOM object, which supports the “length” attribute – this returns the count of items in the list
  • the select list index starts at 0 and goes up to length-1

A small problem is that when the “reset” button is clicked, the colours disappear. They reappear if the page is refreshed, however.


Prefix searches, Wildcard characters and LIKE

Want to run a prefix search using the LIKE operator, but your search pattern includes wildcards like _ or %? A common example is searching the v$parameter view for parameters starting with the underscore (_) character. So, why doesn’t this work:

SELECT * FROM v$parameter WHERE name LIKE '_%';

This returns ALL the rows in the view, because the “_” in the pattern matches any single character. There are all sorts of ways to get around this:

Option 1. Use TRANSLATE to modify the underscores to some other character

SELECT * FROM my_table
WHERE TRANSLATE(my_column, '_', '#') LIKE '#%';

No good if you’re after a general solution – e.g. what if the original string already contains some “#” characters? Also, even if the table has an ordinary index on the column, the query cannot use it.

Option 2. Use SUBSTR so that we can use an equality condition instead of LIKE

SELECT * FROM my_table
WHERE SUBSTR(my_column,1,1) = '_';

Easy, simple. Except that an ordinary index on the column will still not be used (of course, an appropriate function-based index could be used).

Option 3. Use a regular expression

SELECT * FROM my_table
WHERE REGEXP_LIKE(my_column, '^_');

or

SELECT * FROM my_table
WHERE REGEXP_INSTR(my_column, '^_') > 0;

Rather simple, once you know how to write regular expressions. Still can’t use an ordinary index, though – although, again, a very specific (and otherwise generally useless) function-based index could be used (see the sketch below).
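
For completeness, the function-based indexes alluded to in options 2 and 3 might look something like this (a sketch only – the index names are made up, and only the REGEXP_INSTR form of option 3 can be indexed this way):

-- supports option 2: WHERE SUBSTR(my_column,1,1) = '_'
CREATE INDEX my_table_substr_fbi ON my_table (SUBSTR(my_column,1,1));

-- supports option 3: WHERE REGEXP_INSTR(my_column, '^_') > 0
CREATE INDEX my_table_regexp_fbi ON my_table (REGEXP_INSTR(my_column, '^_'));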

Option 4. Read the manual and learn something about the LIKE operator

SELECT * FROM my_table
WHERE my_column LIKE '\_%' ESCAPE '\';

Easy, simple; and if there’s an index on the column, the optimiser may use it. No need for any silly function-based indexes or new-fangled regular expressions.
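
Applied to the original v$parameter example, that becomes:

SELECT * FROM v$parameter WHERE name LIKE '\_%' ESCAPE '\';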


Add a “Who’s Online Now” box to your APEX app

Something to file under “pointless fun”…

Allow your users to feel like they’re part of a community by letting them know who else is using the app at the same time.

  1. Open APEX Application Builder, and open the page you want to add this to.
  2. Click the Create Region icon.
  3. Choose Report, then SQL Report.
  4. Enter a title, e.g. “Who’s Online Now”. Click Next.
  5. Copy the following for the SQL Query (sorry about the poor formatting):
    SELECT apex_user || ' (' || minutes_ago
           || DECODE(minutes_ago,1,' min ago)',' mins ago)')
    FROM (
      SELECT DISTINCT
             apex_user,
             FIRST_VALUE(TRUNC(seconds_ago/60))
             OVER (PARTITION BY apex_user ORDER BY seconds_ago)
             AS minutes_ago
      FROM apex_workspace_activity_log
      WHERE application_id = :APP_ID
      AND seconds_ago < 3600
      AND apex_user NOT IN (:APP_USER, 'nobody')
      ORDER BY 2);
    
  6. Click Next. Click Create Region.

You can customise the region as much as you like. For example, I use:

  • Template (under Region Definition) = “Sidebar Region”
  • Report Template (under Report Attributes) = “template: 16. One Column Unordered List”
  • Pagination Scheme = “- No Pagination Selected -”
  • Headings Type = “None”

The query is based on the APEX view apex_workspace_activity_log which is supplied with the product. It looks at all session activity within the last hour by users other than the current user, and returns a list showing their most recent activity.

The only slight problem is that it doesn’t detect when someone logs out – they’ll stay in the report for up to an hour.
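
If an hour feels too generous for “online now”, you can tighten the window by adjusting the seconds_ago filter in the query above – for example, to show only activity within the last five minutes (a tweak, not what I use myself):

AND seconds_ago < 300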


LIKE with wildcard (%) at start CAN use an index

A tester for our client raised a problem where searches like ‘ABC%’ would perform satisfactorily, but searches like ‘%ABC’ would time out. Of course, the reason is that the index on the column cannot be used if a wildcard is at the start of the string, so a full table scan is always performed.

I think we’re going to accept the slow response for now, but a possible solution that we might consider in future is to use a function-based index on the REVERSE() function. Its drawback is that it requires modifying the code.

First, some test data:

CREATE TABLE jka AS
SELECT ROWNUM id, dbms_random.string('x',10) v
FROM dual CONNECT BY LEVEL <= 10000;

Now, create an ordinary index:

CREATE INDEX jka_normal ON jka (v);

BEGIN
  dbms_stats.gather_table_stats
    (USER
    ,'JKA'
    ,estimate_percent=>100
    ,cascade=>TRUE);
END;
/

This query can use a range scan on jka_normal:

SELECT * FROM jka WHERE v LIKE 'ABC%';

But this query will use a full table scan (can’t use the index):

SELECT * FROM jka WHERE v LIKE '%ABC';

Now, create a function-based index (not to be confused with a REVERSE INDEX):

CREATE INDEX jka_reverse ON jka(REVERSE(v));

BEGIN
  dbms_stats.gather_table_stats
    (USER
    ,'JKA'
    ,estimate_percent=>100
    ,cascade=>TRUE);
END;
/

This query can use a range scan on jka_reverse, because REVERSE('%ABC') evaluates to 'CBA%', which turns the condition into a prefix search on the indexed expression:

SELECT * FROM jka WHERE REVERSE(v) LIKE REVERSE('%ABC');

Edit: it looks like Richard Foote beat me to it, in a discussion of reverse indexes (which unfortunately don’t contribute anything useful to this particular problem). I did do a quick search on Google beforehand, without results – Foote’s article must not have been indexed yet, or was too far down the list.


Will Oracle use my regexp function-based index?

One of the speakers at the AUSOUG conference mentioned that he couldn’t get the database to use a function-based index based on the regexp functions. I thought this was a little strange, so I decided to try it for myself.


SQL> select * from v$version;
Oracle Database 10g Express Edition Release 10.2.0.1.0

Create a test table and gather stats on it:

SQL> create table testtable as
     select rownum rn,
            dbms_random.string('a',10) string10,
            rpad('x',2000,'x') padding
     from all_objects
     where rownum <= 5000;

SQL> exec dbms_stats.gather_table_stats(user, 'TESTTABLE', cascade=>TRUE);

This is our test query using regexp_instr:

SQL> explain plan for
     select string10, regexp_instr(string10, 'XE')
     from testtable
     where regexp_instr(string10, 'XE') > 0;

| Id  | Operation          | Name      | Rows | Bytes | Cost |
|   0 | SELECT STATEMENT   |           |  243 |  475K |  465 |
|*  1 |  TABLE ACCESS FULL | TESTTABLE |  243 |  475K |  465 |

   1 - filter( REGEXP_INSTR ("STRING10",'XE')>0)
   - dynamic sampling used for this statement

How many rows are actually returned by this query?

SQL> select string10, regexp_instr(string10, 'XE')
     from testtable
     where regexp_instr(string10, 'XE') > 0;

STRING10   REGEXP_INSTR(STRING10,'XE')
---------- ---------------------------
fwXEKwoDhG 3
rSmdOXEkeu 6
WXEbrrjXcW 2
qWEThNXEBO 7
XEQtOwjOCW 1
MRXEoAicUQ 3
IpECtZbjXE 9
LdmXjyePXE 9
gXEaiGrjSX 2
PoqtvdGcXE 9
ZunAgePXXE 9
evXEFhBpzX 3
ZxkXstXEwJ 7
yVpjHzXECY 7
FkaPMpXEgR 7
RuXUnXEQSO 6
OooRCjXXEK 8
XzceiWPXEr 8
XEZlpdNMhG 1
jJjzfXEdDw 6
CowyXEuHDm 5

21 rows selected

Hmmm… there should be a better way.
Let’s try a simple function-based index to speed things up:

SQL> create index idx_regexp on testtable
     (regexp_instr(string10, 'XE'));

Will this query use the index?

SQL> explain plan for
     select string10, regexp_instr(string10, 'XE')
     from testtable
     where regexp_instr(string10, 'XE') > 0;

| Id  | Operation          | Name      | Rows | Bytes | Cost |
|   0 | SELECT STATEMENT   |           | 4444 | 48884 |  465 |
|*  1 |  TABLE ACCESS FULL | TESTTABLE | 4444 | 48884 |  465 |

   1 - filter( REGEXP_INSTR ("STRING10",'XE')>0)

Why not? Because the calculated cost of using the index turns out to be higher than the cost of the full table scan, as we can see by forcing it with a hint:

SQL> explain plan for
     select /*+ index(testtable idx_regexp) */ string10, regexp_instr(string10, 'XE')
     from testtable
     where regexp_instr(string10, 'XE') > 0;

| Id  | Operation                    | Name       | Rows | Bytes | Cost |
|   0 | SELECT STATEMENT             |            | 4444 | 48884 | 1512 |
|   1 |  TABLE ACCESS BY INDEX ROWID | TESTTABLE  | 4444 | 48884 | 1512 |
|*  2 |   INDEX RANGE SCAN           | IDX_REGEXP | 4444 |       |   10 |

   2 - access( REGEXP_INSTR ("STRING10",'XE')>0)

Why was the cost higher? Let’s look at what would be in that index…

SQL> select regexp_instr(string10, 'XE') from testtable;

REGEXP_INSTR(STRING10,'XE')
---------------------------
0
0
0
0
0
1
0
0
0
0
… (lots of zeroes and the occasional positive integer) …
0
0
0

5000 rows selected

The index has an entry for every row in the table, because REGEXP_INSTR returns 0 (not NULL) when there is no match; and with no histogram on the expression, the optimiser doesn’t know that most of the index is zeroes.
What if we just want stuff from the index?

SQL> explain plan for
     select regexp_instr(string10, 'XE')
     from testtable
     where regexp_instr(string10, 'XE') > 0;

| Id  | Operation             | Name       | Rows | Bytes | Cost |
|   0 | SELECT STATEMENT      |            | 4444 | 48884 |    4 |
|*  1 |  INDEX FAST FULL SCAN | IDX_REGEXP | 4444 | 48884 |    4 |

   1 - filter( REGEXP_INSTR ("STRING10",'XE')>0)

Yes, that uses the index. So how do we get the rest of the data from the table? Let’s try something else.

SQL> drop index idx_regexp;

We can take advantage of the fact that entirely-NULL keys are not stored in a B-tree index, by converting any zeroes (i.e. where the regular expression didn’t match) to NULL:

SQL> create index idx_regexp_better on testtable
     (CASE WHEN regexp_instr(string10, 'XE') > 0
           THEN regexp_instr(string10, 'XE')
           ELSE NULL END);

SQL> exec dbms_stats.gather_table_stats(user, 'TESTTABLE', cascade=>TRUE);

Will our query use the new improved index?

SQL> explain plan for
     select string10, regexp_instr(string10, 'XE')
     from testtable
     where regexp_instr(string10, 'XE') > 0;

| Id  | Operation          | Name      | Rows | Bytes | Cost |
|   0 | SELECT STATEMENT   |           |  250 |  2750 |  465 |
|*  1 |  TABLE ACCESS FULL | TESTTABLE |  250 |  2750 |  465 |

   1 - filter( REGEXP_INSTR ("STRING10",'XE')>0)

No. Even if we try to force it with a hint, the optimiser won’t use the index: it doesn’t know that it can transform my predicate into the expression that was used to build the index.

SQL> explain plan for
     select /*+ index(testtable idx_regexp_better)*/
            string10, regexp_instr(string10, 'XE')
     from testtable
     where regexp_instr(string10, 'XE') > 0;

| Id  | Operation          | Name      | Rows | Bytes | Cost |
|   0 | SELECT STATEMENT   |           |  250 |  2750 |  465 |
|*  1 |  TABLE ACCESS FULL | TESTTABLE |  250 |  2750 |  465 |

   1 - filter( REGEXP_INSTR ("STRING10",'XE')>0)

What if we copy the expression from the index into the query? Yes, that’s better.

SQL> explain plan for
     select string10, regexp_instr(string10, 'XE')
     from testtable
     where CASE WHEN regexp_instr(string10, 'XE') > 0
                THEN regexp_instr(string10, 'XE')
                ELSE NULL END > 0;

| Id  | Operation                    | Name              | Rows | Bytes | Cost |
|   0 | SELECT STATEMENT             |                   |   21 |   231 |   22 |
|   1 |  TABLE ACCESS BY INDEX ROWID | TESTTABLE         |   21 |   231 |   22 |
|*  2 |   INDEX RANGE SCAN           | IDX_REGEXP_BETTER |   21 |       |    1 |

   2 - access(CASE WHEN REGEXP_INSTR ("STRING10",'XE')>0 THEN REGEXP_INSTR("STRING10",'XE') ELSE NULL END >0)

Brilliant!
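
A follow-on thought (a sketch only, not part of the test above): if you don’t want every query to repeat the CASE expression verbatim, you could hide it behind a view, so that a simple filter on the view column merges back into the indexed expression:

SQL> create or replace view testtable_xe as
     select t.*,
            CASE WHEN regexp_instr(string10, 'XE') > 0
                 THEN regexp_instr(string10, 'XE')
                 ELSE NULL END AS xe_pos
     from testtable t;

SQL> select string10, xe_pos from testtable_xe where xe_pos > 0;

Because the view is simple enough to be merged, the predicate on xe_pos should become the same CASE expression the index was built on, so the range scan on idx_regexp_better should still be on offer.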