AUSOUG Perth 2010 – Day 1

After a leisurely sleep-in (following a weekend away at my parents-in-law’s farm) I made my way to Burswood for the first day of the AUSOUG Perth Conference 2010.

After Roland Slee’s keynote (“consolidate consolidate consolidate!”), I headed upstairs for Steven Feuerstein’s “Golden Rules for Developers” – webinar edition. Unfortunately due to technical issues it started late (no fault of Steven’s) but I think he got the important points across.

Following that was Penny Cookson with “Meet the CBO in Version 11g”. She explained a number of improvements in the Cost-Based Optimizer that came with 11g, including a detailed demonstration of adaptive cursor sharing.

After lunch I decided to take in a DBA session – Guy Harrison spoke about how Oracle runs on VMware, which had some very interesting info about the difference between Full Virtualisation, Paravirtualisation, and Hardware-Assisted Virtualisation. A lot of it went over my head but I got a slightly better picture of what’s going on when I run an OS in a VM, as well as how proper memory and CPU allocation can make a huge difference to the performance of Oracle in a virtual environment.

I lost count of how many great tips Scott Wesley gave in his “‘n’ Methods to Improve APEX Performance” presentation – many of them simple and easy to implement, and capable of making a big difference to the performance of your APEX applications.

It was great to see the level of interest in APEX Themes and Templates – if you’d like to look through the bits that I skipped over, feel free to download my presentation from here.


“It works, just not for the reason you think”

A somewhat mistaken attempt to format a number left-padded with up to six zeroes made me cringe a little:

SELECT LPAD(number_column, 6, 000000) FROM ...

Yes, it does work (the domain of number_column is limited to positive integers), but not, I suspect, for the reason the writer thought.

(Hint: what is the data type of the third parameter to LPAD?)
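
If you’re wondering why it works at all, here is an illustrative experiment (the literal 42 is just a stand-in for number_column):

SELECT LPAD(42, 6, 000000) AS padded_by_number_literal
     , LPAD(42, 6, '0')    AS padded_by_string_literal
FROM dual;

Both expressions return '000042' – the numeric literal 000000 is simply the number 0, which gets implicitly converted to the string '0' and then, quite by accident, used as the pad character.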


Count All Rows in a Schema

I wrote this simple SQL*Plus script to loop through all the user’s tables and report how many rows were in them at this point in time. Normally I might estimate this information based on the table statistics, but in this case I had to know if any rows had been inserted since the last stats were gathered. Also, I preferred not to just run the stats gathering job. Obviously, it doesn’t pick up any uncommitted data.

It doesn’t list the tables which have zero (0) rows, but it reports how many tables it counted, and the total number of rows across the entire schema.

If run on an 11g database, it will skip the count on tables which have had no segment created yet.

WARNING: this script is suitable for development environments only. Use at your own risk.

SET SERVEROUT ON FEEDBACK OFF
DECLARE
  c  INTEGER;
  tc INTEGER := 0;
  tr INTEGER := 0;
BEGIN
$IF DBMS_DB_VERSION.VER_LE_10 $THEN
  DBMS_OUTPUT.enable(1000000);
$ELSE
  DBMS_OUTPUT.put_line('Only tables with segment created are counted.');
$END
  FOR r IN (
    SELECT table_name
    FROM   user_tables
$IF DBMS_DB_VERSION.VER_LE_10 $THEN
$ELSE
    WHERE  segment_created = 'YES'
$END
    ORDER BY table_name
  ) LOOP
    tc := tc + 1;
    BEGIN
      EXECUTE IMMEDIATE REPLACE(
        'SELECT COUNT(*) FROM "#TABLE#"'
        ,'#TABLE#',r.table_name)
        INTO c;
      tr := tr + c;
      IF c > 0 THEN
        DBMS_OUTPUT.put_line(
          RPAD(r.table_name,31,' ')
          || TO_CHAR(c,'9G999G999G990'));
      END IF;
    EXCEPTION
      WHEN OTHERS THEN
        DBMS_OUTPUT.put_line(r.table_name);
        DBMS_OUTPUT.put_line(DBMS_UTILITY.format_error_stack);
    END;
  END LOOP;
  DBMS_OUTPUT.put_line('counted '
    || tc || ' table(s)');
  DBMS_OUTPUT.put_line('total rows '
    || TO_CHAR(tr,'9G999G999G990'));
END;
/
SET SERVEROUT OFF FEEDBACK ON

Example output:

SQL> @count_all_tables
Only tables with segment created are counted.
counted 1 table(s)
total rows              0

This tells me that there are no rows yet, and of all the tables, only one has had a segment created for it (e.g. perhaps an INSERT was attempted but rolled back, or the row was deleted). I’m sure you’ll be able to adapt the script to suit your obsessive record-counting needs.


Difference Between Two Timestamps

Easy, right? Perhaps not as straightforward as you’d think.

Method #1: use EXTRACT

extract(day from (x-y))*24*60*60
+ extract(hour from (x-y))*60*60
+ extract(minute from (x-y))*60
+ extract(second from (x-y))

Method #2: use CAST

( CAST( x AS DATE ) - CAST( y AS DATE ) ) * 86400

The difference? Method #2 is faster (my tests indicate by about a factor of 3), but it does not include fractional seconds, because DATE values have no fractional-second precision. Method #1 is a bit slower, but includes fractions of a second. See the Stack Overflow discussion for details.
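
A quick worked example (the two timestamp literals are made up; substitute your own x and y):

WITH t AS (
  SELECT TIMESTAMP '2010-10-28 10:00:00.25' AS x
       , TIMESTAMP '2010-10-28 09:00:00.00' AS y
  FROM   dual
)
SELECT extract(day from (x-y))*24*60*60
     + extract(hour from (x-y))*60*60
     + extract(minute from (x-y))*60
     + extract(second from (x-y))                        AS method_1 -- 3600.25
     , ( CAST( x AS DATE ) - CAST( y AS DATE ) ) * 86400 AS method_2 -- 3600
FROM   t;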


Strange cheque-reading code in form

This is just a story about a really weird bit of code. It was the type of code you look at and say, “that’s obviously wrong” and delete straight away.

One of the requirements was to rebuild an Oracle form, a data-entry form for receipts (cash, cheque, credit card, etc). Some of the cashiers use cheque readers to read the numbers off the MICR band, and with the first release of the new form they reported a defect – “the cheque readers don’t work”.

I didn’t have access to a cheque reader when I was developing the form, but assumed that it would be a normal cheque reader – you put the cheque in, it reads it, it sends a series of digits to the computer. The form should work no differently than if the operator keyed in the numbers manually (which they have to do anyway, if the cheque reader doesn’t work for a particular cheque).

So to investigate the defect I requisitioned a cheque reader, along with some test cheques; after some difficulty (it turns out these things don’t work alongside my USB keyboard, so I had to get a PS/2 keyboard), I had it working.

It didn’t take long to discover that the cheque reader was sending the cheque number and BSB in the wrong order as far as the form was concerned, which is why the validation was failing.

I opened up the old form again, and had a good hard look at the fields. It turns out that what I missed originally was a custom KEY-NEXT-ITEM trigger on the bank code field (which is where the operator puts the focus before using the cheque reader). It looks something like this:

DECLARE
  v_data VARCHAR2(50) := REPLACE(REPLACE(:rct.bak_code
                         ,' ','')
                         ,CHR(9),'');
BEGIN
  IF LENGTH(v_data) > 4 THEN
    IF LENGTH(v_data) < 14 THEN
       NULL;
    ELSE
      :rct.cheque_no := SUBSTR(v_data,1,6);
      :rct.bak_code := SUBSTR(v_data,7,3);
      :rct.branch := SUBSTR(v_data,10,3);
      go_field('RCT.CHEQUE_TYPE');
    END IF;
  ELSE
    go_field('RCT.BRANCH');
  END IF;
END;

It turns out that:

(a) the nested REPLACE calls that remove spaces and tab characters (CHR(9)) are redundant, since the cheque reader never sends spaces; and when it sends a TAB, Oracle Forms doesn’t put a CHR(9) into the field anyway – instead it fires the KEY-NEXT-ITEM trigger

(b) if the length of the data is between 5 and 13, the KEY-NEXT-ITEM trigger does nothing, which means the focus stays in the bak_code field

It’s (b) that is the reason this worked. The trigger fires three times when the cheque reader is used; the third time the trigger fires, it’s got enough digits lumped together in the bak_code field, which it then splits apart, and moves the relevant bits to the cheque_no and branch fields.

A normal, sane person building this form would have designed it to accept the cheque number, bank code and branch in the same order that they are read from the cheque reader; that way, no special code is required – the cheque reader just tabs through the fields, filling them in as it goes.

Oh well – it’s too late to do a screen redesign now, so I’ve had to pretty much replicate the same behaviour in the new form; except that my new code is a little bit smarter – it can also read money orders, which I’ve been told will make the cashiers very happy.


INSERT-VALUES vs. INSERT-SELECT-FROM-DUAL

There’s no difference between the effects of the following two statements, is there?

INSERT INTO mytable (col1, col2) VALUES ('hello','world');
INSERT INTO mytable (col1, col2) SELECT 'hello', 'world' FROM DUAL;

Well, as it turns out, it is possible for the first statement to succeed where the second statement would fail – in the presence of a suitably crafted Before Insert trigger, the second will raise “ORA-04091 table is mutating, trigger/function may not see it”:

http://oraclequirks.blogspot.com/2010/09/ora-04091-table-stringstring-is.html
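
For illustration, a minimal sketch of the sort of trigger that exposes the difference (the table and trigger names are invented; all that matters is that the trigger queries the table it is defined on):

CREATE TABLE mytable (col1 VARCHAR2(10), col2 VARCHAR2(10));

CREATE OR REPLACE TRIGGER mytable_bi
  BEFORE INSERT ON mytable
  FOR EACH ROW
DECLARE
  n INTEGER;
BEGIN
  -- querying the table being inserted into makes it a "mutating" table
  SELECT COUNT(*) INTO n FROM mytable;
END;
/

INSERT INTO mytable (col1, col2) VALUES ('hello','world');
-- 1 row created: a single-row VALUES insert is exempt from the restriction

INSERT INTO mytable (col1, col2) SELECT 'hello', 'world' FROM DUAL;
-- ORA-04091: table is mutating, trigger/function may not see it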


To Exist or Not To Exist

An interesting discussion on the PL/SQL Challenge blog here has led me to change my mind about “the best way” to loop through a sparse PL/SQL associative array.

Normally, if we know that an array has been filled with no gaps in its indices, we would use a simple FOR LOOP:

DECLARE
  TYPE t IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
  a t;
BEGIN
  SELECT x BULK COLLECT INTO a FROM mytable;
  FOR i IN a.FIRST..a.LAST LOOP
    -- process a(i)
  END LOOP;
END;

If, however, the array may be sparsely filled (i.e. there might be one or more gaps in the sequence), this was “the correct way” to loop through it:

Method A (First/Next)

DECLARE
  TYPE t IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
  a t;
  i BINARY_INTEGER;
BEGIN
  ...
  i := a.FIRST;
  LOOP
    EXIT WHEN i IS NULL;
    -- process a(i)
    i := a.NEXT(i);
  END LOOP;
END;

Method A takes advantage of the fact that an associative array in Oracle is implemented internally as a linked list – the fastest way to “skip over” any gaps is to call the NEXT operator on the list for a given index.

Alternatively, one could still just loop through all the indices from the first to the last; the problem with this approach is that referencing an index that is not present in the array raises the NO_DATA_FOUND exception. Well, Method B simply catches the exception:

Method B (Handle NDF)

DECLARE
  TYPE t IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
  a t;
BEGIN
  ...
  FOR i IN a.FIRST..a.LAST LOOP
    BEGIN
      -- process a(i)
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        NULL;
    END;
  END LOOP;
END;

This code effectively works the same as Method A (with one important proviso*). The difference is relative performance: Method B is much faster than Method A if the array is relatively dense, whereas if the array is relatively sparse, Method A is faster.

* It must be remembered that the NO_DATA_FOUND exception may be raised by a number of different statements in a program: if you use code like this, you must make sure that the exception was only raised by the attempt to access a(i), and not by some other code!
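
One way to keep the handler scoped to the array access alone (just a sketch, using the 11g CONTINUE statement) is to copy the element into a local variable first, and do the real processing outside the handler:

  FOR i IN a.FIRST..a.LAST LOOP
    DECLARE
      v NUMBER;
    BEGIN
      BEGIN
        v := a(i);   -- the only statement the handler guards
      EXCEPTION
        WHEN NO_DATA_FOUND THEN
          CONTINUE;  -- skip missing indices
      END;
      -- process v here; a NO_DATA_FOUND raised by this code is not swallowed
      NULL;
    END;
  END LOOP;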

A third option is to loop through as in Method B, but call the EXISTS method on the array to check if the index is found, instead of relying on the NO_DATA_FOUND exception.

Method C (EXISTS)

DECLARE
  TYPE t IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
  a t;
BEGIN
  ...
  FOR i IN a.FIRST..a.LAST LOOP
    IF a.EXISTS(i) THEN
      -- process a(i)
    END IF;
  END LOOP;
END;

The problem with this approach is that it effectively checks the existence of i in the array twice: once for the EXISTS check, and if found, again when actually referencing a(i). For a large array which is densely populated, depending on what processing is being done inside the loop, this could have a measurable impact on performance.

Bottom line: there is no “one right way” to loop through a sparse associative array. But there are some rules-of-thumb about performance we can take away:

  1. When the array is likely to be very sparsely populated, with a large index range, use Method A (First/Next).
  2. When the array is likely to be very densely populated, with a large number of elements, use Method B (Handle NDF). But watch how you catch the NO_DATA_FOUND exception!
  3. If you’re not sure, I’d tend towards Method A (First/Next) until performance problems are actually evident.

You probably noticed that I haven’t backed up any of these claims about performance with actual tests or results. You will find some in the comments to the afore-mentioned PL/SQL Challenge blog post; but I encourage you to log into a sandpit Oracle environment and test it yourself.
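
If you want a starting point for your own tests, here is the sort of rough harness you could use (the array size and sparseness are arbitrary – adjust them to match your situation, and add a Method B variant if you like):

SET SERVEROUT ON
DECLARE
  TYPE t IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
  a  t;
  i  BINARY_INTEGER;
  n  NUMBER;
  t0 NUMBER;
BEGIN
  -- build a sparse array: only every 10th index is populated
  FOR j IN 1..100000 LOOP
    a(j*10) := j;
  END LOOP;

  -- Method A (First/Next)
  n := 0; t0 := DBMS_UTILITY.get_time;
  i := a.FIRST;
  LOOP
    EXIT WHEN i IS NULL;
    n := n + a(i);
    i := a.NEXT(i);
  END LOOP;
  DBMS_OUTPUT.put_line('First/Next: ' || (DBMS_UTILITY.get_time - t0) || ' cs');

  -- Method C (EXISTS)
  n := 0; t0 := DBMS_UTILITY.get_time;
  FOR j IN a.FIRST..a.LAST LOOP
    IF a.EXISTS(j) THEN
      n := n + a(j);
    END IF;
  END LOOP;
  DBMS_OUTPUT.put_line('EXISTS:     ' || (DBMS_UTILITY.get_time - t0) || ' cs');
END;
/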


Oracle OpenAustralia

The draft programme is out for the AUSOUG National Conference 2010.

If you’ll be in Perth in November I recommend you register and attend – a number of excellent papers will be presented, some of which I had the privilege of hearing when I was in Melbourne – you’ll learn new things, relearn old things you’d forgotten, and meet some giants in the Oracle world who will have travelled great distances to get here.

Some highlights, in no particular order:

  • Steven Feuerstein – “Golden Rules for Developers”
  • Penny Cookson – “Meet the CBO in Version 11g”
  • Guy Harrison – “Optimizing Oracle databases on VMware”
  • Mogens Nørgaard – “Licensing – Tales from the Trenches and other thoughts on Oracle”
  • Tom Kyte – “The Best Way”
  • Frank Bommarito – “Outlines, Profiles and SQL Plan Baselines”
  • Connor McDonald – “A Better Way of Managing Optimizer Statistics”
  • Mark Lancaster – “Building Advanced APEX 4.0 UIs with ExtJS”

I’ll be presenting my “Apex Themes and Templates” paper, which I presented in Melbourne last month – however it will be updated with a few additional bits and pieces that I’ve learned since then.


Top 10 Reasons to Develop in a VM

The cost of providing Virtual Machines to all your developers can be quite high, especially in terms of initially setting it all up. For example, a typical developer may require two VMs running concurrently (one for the database server, one for the app server); their desktops will need enough grunt to run these while they also run their dev tools, or they’ll need at least two computers on their desks; and you’ll need a means of scaling down the data volumes if your database is too big. In return, though, you will gain a whole lot more productivity, sanity, happiness and love from your developers.

10. Fail safe.

If everything goes terribly wrong, simply restore to snapshot (duh, obviously).

9. Handle shifting conditions outside your project.

As often as I can (e.g. after a few weeks, if a lot of development has gone on), I re-run my scripts on a VM based on a fresh copy of Prod – any changes that anyone else has made without my knowledge, and which affect my scripts adversely, get picked up early.

8. Upgrade scripts never raise an error (unless something unexpected occurs).

It is normal for a typical upgrade script to raise many errors (e.g. “object not found” when running a standard “DROP x, CREATE x” script). However, I wrap any command like this with a custom exception handler that swallows the errors that I know are expected or benign. That way, when I hand over my upgrade scripts to the DBA, I can say, “if you get any error messages, something’s gone wrong”, which is a bit better than handing over a list of error messages they can safely ignore. Even better, I can add WHENEVER SQLERROR EXIT to the top of my upgrade script, so it exits straightaway if the upgrade fails at any point.
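
As a sketch of what I mean (the table name is made up; -942 is “ORA-00942: table or view does not exist”, the expected error when dropping a table that isn’t there yet):

WHENEVER SQLERROR EXIT FAILURE

BEGIN
  EXECUTE IMMEDIATE 'DROP TABLE my_new_table';
EXCEPTION
  WHEN OTHERS THEN
    -- swallow only the expected "table does not exist" error
    IF SQLCODE != -942 THEN
      RAISE;
    END IF;
END;
/

CREATE TABLE my_new_table (id NUMBER);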

7. Sanity Restore.

You’ve been beating your head against the wall for five minutes, and no-one’s around to add a second eye to your problem; you’re starting to wonder if the bug was something you introduced, or has always been there; and you can’t just log into production to test it. VM to the rescue – undo all your changes by restoring to an earlier snapshot, then see if your problem was a pre-existing issue.

6. Other Developers.

Let’s face it. Things would go a lot smoother if not for all the efforts of other developers to impede your progress by making random changes in the dev environment. Am I right? Well, with your private VM this is no longer a problem. Of course, with a private VM, if anything goes wrong, it’s incontrovertibly your fault now…

5. “Did you turn it off and on again?”

Finally, no need to nag the DBA to bounce the database server, or flush the shared pool, or do anything else that requires more access than you’d usually get as a lowly developer. Need to increase that tablespace? Drop a couple of hundred tables? No problem.

4. Real size estimates.

This works really well when you’re working with relatively small databases (i.e. where an entire copy of prod can be practically run in a VM). Run your upgrade script, which fails halfway through with a tablespace full error. Restore to snapshot, resize the appropriate datafile, re-run the upgrade. Rinse, repeat until no more “out of space” errors; now you know with a high degree of confidence how much extra space your upgrade requires, and for which datafiles.
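
The resize itself is a one-liner (the datafile path and size here are, of course, made up):

ALTER DATABASE DATAFILE '/u01/oradata/dev/users01.dbf' RESIZE 2000M;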

3. Reduced down-time.

Dev server down? Being upgraded? Been appropriated by another department? No worries – your dev team can continue working, because they’ve got their VMs.

2. Did I say “Fail Safe” already?

I can’t emphasize this one enough. Also, the other side of “Other Developers” is that you are an Other Developer; and the mistakes you will, inevitably, make will never see the light of day (and draw everyone’s ire) if they’re made in your private VM.

1. Smug.

My last deployment to Test of a major release of an application executed perfectly, 100%, correct, first time. I couldn’t believe my eyes – I was so accustomed to my scripts failing for random reasons (usually from causes outside my control, natch). It was all thanks to my use of VMs to develop my fault-tolerant upgrade scripts. I was smug.


Generate DML/DDL/QUERY from SQL – the easy way

This is a comment on jagatheesh.ramakrishnan@oracle.com’s article “Generate DML/DDL/QUERY from SQL” – comments are not enabled on their blog so here’s my addendum.

The following query is offered as a means of generating a script to ONLINE all datafiles in a database:

select 'ALTER DATABASE DATAFILE '''|| name || ''' online ;'
from v$datafile;

I suggest an alternative method, which is both easier to write and easier to maintain:

select REPLACE(q'[
ALTER DATABASE DATAFILE '#NAME#' online ;
]','#NAME#',name) from v$datafile;

This way, the syntax within the DDL is unmuddied by the syntax required by the query to generate it. It’s immediately obvious that only single quotes will surround the name of the datafile in the generated DDL.

If you’re on a pre-10g database, you can still use this method, but you’ll need to revert to the old quote-escape:

select REPLACE('
ALTER DATABASE DATAFILE ''#NAME#'' online ;
','#NAME#',name) from v$datafile;