Category: PL/SQL

Top 10 Confusing Things in Oracle

Every mature language, platform or system has little quirks, eccentricities, and anachronisms that afficionados just accept as “that’s the way it is” and that look weird, or outlandishly strange to newbies and outsiders. The more mature, and more widely used is the product, the more resistance to change there will be – causing friction that helps to ensure these misfeatures survive.

Oracle, due to the priority placed on backwards compatibility, and its wide adoption, is not immune to this phenomenon. Unless a feature is actively causing things to break, as long as there are a significant number of sites using it, it’s not going to change. In some cases, the feature might be replaced and the original deprecated and eventually removed; but for core features such as SQL and PL/SQL syntax, especially the semantics of the basic data types, it is highly unlikely these will ever change.

So here I’d like to list what I believe are the things in Oracle that most frequently confuse people. These are not necessarily intrinsically complicated – just merely unintuitive, especially to a child of the 90’s or 00’s who was not around when these things were first implemented, when the idea of “best practice” had barely been invented; or to someone more experienced in other technologies like SQL Server or Java. These are things I see questions about over and over again – both online and in real life. Oh, and before I get flamed – another disclaimer: some of these are not unique to Oracle – some of them are more to do with the SQL standard; some of them are caused by a lack of understanding of the relational model of data.

Once you know them, they’re easy – you come to understand the reasons (often historical) behind them; eventually, the knowledge becomes so ingrained, it’s difficult to remember what it was like beforehand.

Top 10 Confusing Things in Oracle

  1. Empty strings are NULL

  2. DATEs are not dates

  3. Nothing equals NULL

  4. ROWNUM … ’nuff said?

  5. Use VARCHAR2, not CHAR or VARCHAR

  6. Double-quotes delimit identifiersSingle quotes delimit strings

  7. No intrinsic sort orderYou can’t insert a record at the “top”, “middle” or “end” of a table

  8. You don’t add a column in the “middle” of a table

  9. Don’t create temporary tables at runtime

  10. Every User is/has a Schema, every Schema is/has a User

Got something to add to the list? Drop me a note below.

More resources:

Which packages might raise “ORA-04068 existing state of package has been discarded”?

PhotoSpinOffice Imagery© 2001 PhotoSpinwww.photospin.comI recently saw this question on StackOverflow (“Is there any way to determine if a package has state in Oracle?”) which caught my attention.

You’re probably already aware that when a package is recompiled, any sessions that were using that package won’t even notice the change; unless that package has “state” – i.e. if the package has one or more package-level variables or constants. The current value of these variables and constants is kept in the PGA for each session; but if you recompile the package (or modify something on which the package depends), Oracle cannot know for certain whether the new version of the package needs to reset the values of the variables or not, so it errs on the side of caution and discards them. The next time the session tries to access the package in any way, Oracle will raise ORA-04068, and reset the package state. After that, the session can try again and it will work fine.

Side Note: There are a number of approaches to solving the ORA-04068 problem, some of which are given as answers to this question here. Not all of them are appropriate for every situation. Another approach not mentioned there is to avoid or minimize it – move all the package variables to a separate package, which hopefully will be invalidated less often.

It’s quite straightforward to tell whether a given package has “state” and thus has the potential for causing ORA-04068: look for any variables or constants declared in the package specification or body. If you have a lot of packages, however, you might want to get a listing of all of them. To do this, you can use the new PL/Scope feature introduced in Oracle 11g.

select object_name AS package,
       name AS variable_name
from user_identifiers
where object_type IN ('PACKAGE','PACKAGE BODY')
and usage = 'DECLARATION'
and type in ('VARIABLE','CONSTANT')
and usage_context_id in (
  select usage_id
  from user_identifiers
  where type = 'PACKAGE'

If you have compiled the packages in the schema with PL/Scope on (i.e. alter session set plscope_settings='IDENTIFIERS:ALL';), this query will list all the packages and the variables that mean they will potentially have state.

Before this question was raised, I hadn’t used PL/Scope for real; it was quite pleasing to see how easy it was to use to answer this particular question. This also illustrates a good reason why I like to hang out on Stackoverflow – it’s a great way to learn something new every day.

Code I Regret: Refactoring as Penance

Recently I refactored some PL/SQL for sending emails – code that I wrote way back in 2004. The number of “WTF“‘s per minute has not been too high; however, I’ve cringed more times than I’d like…

1. Overly-generic parameter types

When you send an email, it will have at least one recipient, and it may have many recipients. However, no email will have more than one sender. Yet, I wrote the package procedure like this:

TYPE address_type IS RECORD
 (name          VARCHAR2(100)
 ,email_address VARCHAR2(200)
TYPE address_list_type IS TABLE OF address_type
 (i_sender     IN address_list_type
 ,i_recipients IN address_list_type
 ,i_subject    IN VARCHAR2
 ,i_message    IN VARCHAR2

Why I didn’t have i_sender be a simple address_type, I can’t remember. Internally, the procedure only looks at i_sender(1) – if a caller were to pass in a table of more than one sender, it raises an exception.

2. Functional programming to avoid local variables

Simple is best, and there’s nothing wrong with using local variables. I wish I’d realised these facts when I wrote functions like this:

FUNCTION address
 (i_name          IN VARCHAR2
 ,i_email_address IN VARCHAR2
 ) RETURN address_list_type;
FUNCTION address
 (i_address       IN address_list_type
 ,i_name          IN VARCHAR2
 ,i_email_address IN VARCHAR2
 ) RETURN address_list_type;

All that so that callers can avoid *one local variable*:

 (i_sender     => EMAIL_PKG.address('joe','')
 ,i_recipients => EMAIL_PKG.address(
                  'jill', '')
                 ,'bob', '')
 ,i_subject    => 'hello'
 ,i_message    => 'world'

See what I did there with the recipients? Populating an array on the fly with just function calls. Smart eh? But rather useless, as it turns out; when we need to send multiple recipients, it’s usually populated within a loop of unknown sized, so this method doesn’t work anyway.

Go ahead – face your past and dig up some code you wrote 5 years ago or more. I think, if you don’t go “WTF!” every now and then, you probably haven’t learned anything or improved yourself in the intervening years. Just saying :)

Alexandria: May Cause Addiction

Ever since I downloaded the Alexandria PL/SQL library, I haven’t been able to put it down. Just recently I decided I wanted to serve up a whole lot of media files directly from Amazon’s S3 simple storage service, instead of serving them from within my EC2 (elastic compute) instance. They were just wasting my linux server’s time responding to http requests.

So, I quickly wrote the following code to transfer them:

  l_blob BLOB;
  /* initialise my AWS session */
    ( 'yyy-my-aws-id-yyy'
    , 'xxx-not-telling-xxx'
    , p_gmt_offset => -8);
  FOR rec IN (
    SELECT id, filename, mime_type, location
    FROM myfiles
    WHERE location = 'http://myserver/media/'
  ) LOOP
    /* read the file from its current location */
    l_blob := ALEX.http_util_pkg.get_blob_from_url
      (rec.location || rec.filename);
    IF DBMS_LOB.getLength(l_blob) > 0 THEN
      /* upload the file to Amazon S3 */
        ( 'mybucket'
        , rec.filename
        , l_blob
        , rec.mime_type
        , ALEX.amazon_aws_s3_pkg.g_acl_public_read);
      UPDATE myfiles
      SET location = ''
      WHERE id =;
    END IF;

After a short while, all the files had been copied across to my bucket on S3, and my table updated so that my web site now points people’s browsers to the new location for those files.

Of course, I could have used UTL_FILE to read the files from disk, but then I’d have to first create a directory, and write a loop to read the file in chunks into the BLOB. Why bother with all that when I can just call http_util_pkg.get_blog_from_url and get it all in one go?

That’s the trouble with powerful utilities like Alexandria: they’re too easy to use, make tasks like this trivial, and you start finding all sorts of uses for them. All of a sudden, Alexandria is your hammer, and the world is full of nails.

See also: this quick intro to using Alexandria’s API for Amazon S3.

Generating unique identifiers with “SELECT MAX(id) + 1”

Normally, when you see code like this in a production system, you should duck your head and run:

SELECT NVL( MAX( id ), 0 ) + 1
INTO   :new_id
FROM   mytable;

What’s wrong with this code?

I hope the first answer that rolls off your tongue has something to do with concurrency – i.e. two sessions that run this around the same time will not see uncommitted rows from each other, and so are likely to try to insert rows with conflicting identifiers.

I hope the second answer that you might mention has to do with performance – even considering there’s a unique index on the column, this code will still need to read at least one index block to get the latest ID (assuming the query optimiser chooses to do a MIN/MAX index scan so that it doesn’t have to scan the entire index before returning a result). In a high load system this cost might be unacceptable.

Of course, the first problem (concurrency) could be solved by serializing access to the “get the next ID” function, e.g. with a DBMS_LOCK. We all know, however, that there’s no sane reason to serialize this when Oracle already provides a perfectly good mechanism for generating unique IDs, with virtually no serialization – sequences.


Sequences have the benefits of guaranteeing uniqueness, and if their “cache” setting is set appropriately, will add a negligible amount of overhead for serialization.

Problem solved. Easy, right? I bet you’re wondering why I added the word “Normally” to my first sentence in this post….

Question: When might using “SELECT MAX(id) + 1” ever be an acceptable source of unique identifiers?

Answer: Global Temporary tables.

If I’ve inserted any rows into a global temporary table, by definition no other session can see my data, so the first consideration, concurrency, is not an issue.

Also, if I’m not expecting to ever insert many rows into my global temporary table, I can be reasonably confident that performance will not be an issue either. Plus, if I put an index on the ID column, that query will be quite inexpensive.

Conclusion: if you are using global temporary tables, you don’t have to use sequences to generate unique identifiers for them. I’m not saying you shouldn’t, of course – a sequence may be faster, and may even lead to simpler code in some cases – but in other cases you might decide to forego a sequence – one less object, with perhaps its role grants and synonyms, to deploy.

Now, of course, you have to ask yourself, why query the table at all? Why not store that latest ID in a private global variable in a package? In fact, we can create a simple package to replace the sequence, e.g.:

END my_table_pkg;
    g_latest_id := NVL(g_latest_id, 0) + 1;
    RETURN g_latest_id;
  END next_id;
END my_table_pkg;

Well, now you know what to do. Whenever you need to generate a unique set of identifiers for a global temporary table, you’ve got a choice of options: sequence, package variable, or a “max(id)+1” query.