Workaround for ORA-04063: view “x” has errors

When you deploy a procedure, function or package that has a compilation error, the object is still created, and you can still apply grants on it. This is convenient when deploying a large number of objects, since you don’t have to create them all in dependency order. After deploying your schema, you can simply recompile the invalid objects.

Unfortunately, this doesn’t work for views. Normally, if you create a view with compilation errors, the view is not created at all; in a deployment script, however, you can use CREATE FORCE VIEW, which means the view will be created anyway (but marked invalid).

Let’s say you have a view that depends on a table that doesn’t exist yet – and won’t exist until much later in your deployment scripts. So you create the view with the FORCE option – success. Then, you apply the GRANTs for the view, and get this:

ORA-04063: view "x" has errors

Why? If you try to grant on a procedure, function or package that has errors, it works fine. For views, apparently, this is not allowed.

Obviously, to solve this you could do the hard work and reorder your deployment scripts so that they create every object in the correct order, avoiding compilation errors entirely. If you have a large number of objects to deploy, though, this might be more trouble than it’s worth. Well, there is a workaround:

1. Create the view, minus the bit that causes a compilation error.

2. Apply the grant.

3. Recreate the view, compilation error and all. The grant will remain.

Why might this be useful? In my case, we have two databases connected by database links (in both directions), and we need to deploy a large number of objects to both instances. The instances are managed by different teams, so we want to be able to deploy the changes to each independently. For the most part, the objects on “our side” compile fine, except for some views that refer to objects on the other side of the database link – but those won’t exist until the other team deploys their changes. We could even have a chicken-and-egg problem, where their views refer to objects on our instance; either way, some of the objects cannot be created error-free until both deployments have completed.

As it stands, we have two options: (a) deploy everything as best we can, then once both deployments have completed, recompile the invalid objects and apply the view grants; or (b) use this workaround.

Demonstration

TEST CASE #1: cannot grant on a view with errors

SQL> create or replace force view testview as
     select 1 as col from bla;

Warning: View created with compilation errors.

SQL> grant select on testview to someone;

ORA-04063: view "USER.TESTVIEW" has errors

SQL> select grantee, privilege from user_tab_privs
     where table_name = 'TESTVIEW';

no rows selected

TEST CASE #2: the workaround – grant first, then break the view

SQL> create or replace view testview as
     select 1 as col from dual;

View created.

SQL> grant select on testview to someone;

Grant succeeded.

SQL> create or replace force view testview as
     select 1 as col from bla;

Warning: View created with compilation errors.

SQL> select grantee, privilege from user_tab_privs
     where table_name = 'TESTVIEW';

GRANTEE PRIVILEGE
======= =========
SOMEONE SELECT

The above has been tested on Oracle 10gR2 and 11gR2. Does this mean that Oracle shouldn’t really raise ORA-04063 in this case? I think so.


Non-Overlapping Dates Constraint

If you have a table that represents time-varying info, e.g. with From and To date/time columns, you have a few options with regard to the problem of overlapping date ranges:

1. Check for overlapping dates in the application layer.

2. Use an off-the-shelf product to generate the appropriate triggers, e.g. Oracle CDM*RuleFrame or Toon Koppelaars’ RuleGen.

3. Roll your own, in the database.

4. Use a different data model that can use a unique constraint.

5. Forget about it.

One reason it’s difficult is that this is an example of a cross-row constraint, i.e. one that cannot merely be checked for the current row by itself. Oracle supports a few cross-row constraints, i.e. Primary Key, Unique and Foreign Key constraints; but it doesn’t natively support arbitrary assertions, which would allow us to easily declare this sort of constraint.

The real challenge comes from the fact that Oracle is a multi-user system and concurrent sessions cannot see the uncommitted data from other sessions; so some form of serialization will be required to ensure that when one session wishes to insert/update the data for a particular entity, no other session is allowed to start working on the same entity until the first session commits (or issues a rollback).

The problem is not new; it’s been around for a long time, and has tripped up many a new (and old) programmer.

There are two problems with option #1 (code in the application layer): firstly, you have to repeat the code for each different type of client (e.g. you might have a Java UI on one side, as well as some batch script somewhere else); secondly, usually the programmer involved will not understand the concurrency problem mentioned above and will not take it into account.

Option #2 is probably the best, most of the time. The solution is implemented at the database level, and is more likely to work correctly and efficiently.

Option #4 (change the data model) involves not storing the From and To dates, but instead dividing all time ranges into discrete chunks, with each row representing a single chunk of time. This solution is valid if bookings fall on predictable discrete boundaries, e.g. daily. You can then use an ordinary unique constraint to ensure that each chunk of time is booked by only one entity at any one time. This is the solution described here, and sketched below.
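
For illustration, a minimal sketch of this model (the table and column names are made up), assuming daily chunks:

CREATE TABLE room_booking_day
  (room_no     NUMBER NOT NULL
  ,booked_date DATE   NOT NULL
  ,CONSTRAINT room_booking_day_uk UNIQUE (room_no, booked_date)
  );

A three-night booking becomes three rows, one per day; the unique constraint then makes a double-booking impossible, with no triggers and no serialization logic needed.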

Option #5 (forget about it) is also a viable option, in my opinion. Basically it entails designing the rest of the application around the fact that overlapping date ranges might exist in the table – e.g. a report might simply merge the date ranges together prior to output.

Option #3, where you implement the triggers yourself in the database, has the same advantage as Option #2: no matter which application the data comes from, the constraint will hold true. However, you have to be really careful, because concurrency makes this much easier to get wrong than to get right.

I hear you scoffing, “Triggers?!?”. I won’t comment, except to refer you to this opinion, which puts it better than I could: The fourth use-case for Triggers.

There is another Option #3 using a materialized view instead of triggers; I’ll describe this alternative at the end of this post.

So, here is a small example showing how an overlapping date constraint may be implemented. It is intentionally simple to illustrate the approach: it assumes that the From and To dates cannot be NULL, and its rule for detecting overlapping dates requires that the dates not overlap at all, to the nearest second.

  1. Create the tables
    CREATE TABLE room
      (room_no NUMBER NOT NULL
      ,CONSTRAINT room_pk PRIMARY KEY (room_no)
      );
    
    CREATE TABLE room_booking
      (room_no NUMBER NOT NULL
      ,booked_from DATE NOT NULL
      ,booked_to DATE NOT NULL
      ,CONSTRAINT room_booking_pk
         PRIMARY KEY (room_no, booked_from)
      ,CONSTRAINT room_booking_fk
         FOREIGN KEY (room_no) REFERENCES room (room_no)
      );
    

  2. Create the validation trigger (note – I’ve used an Oracle 11g compound trigger here, but it could easily be rewritten for earlier versions by using two triggers + a database package):
    CREATE OR REPLACE TRIGGER room_booking_trg
      FOR INSERT OR UPDATE OF room_no, booked_from, booked_to
        ON room_booking
      COMPOUND TRIGGER
    
      TYPE room_no_array IS TABLE OF CHAR(1)
        INDEX BY BINARY_INTEGER;
    
      room_nos room_no_array;
    
    PROCEDURE lock_room (room_no IN room.room_no%TYPE) IS
      dummy room.room_no%TYPE;
    BEGIN
      SELECT r.room_no
      INTO dummy
      FROM room r
      WHERE r.room_no = lock_room.room_no
      FOR UPDATE;
    END lock_room;
    
    PROCEDURE validate_room (room_no IN room.room_no%TYPE) IS
      overlapping_booking EXCEPTION;
      dummy CHAR(1);
    BEGIN
      -- check for overlapping date/time ranges
      BEGIN
        SELECT 'X' INTO dummy
        FROM room_booking rb1
            ,room_booking rb2
        WHERE rb1.room_no = validate_room.room_no
        AND rb2.room_no = validate_room.room_no
        AND rb1.booked_from != rb2.booked_from
        AND (
             rb1.booked_from BETWEEN rb2.booked_from
                                 AND rb2.booked_to
             OR
             rb1.booked_to BETWEEN rb2.booked_from
                               AND rb2.booked_to
            )
        AND ROWNUM = 1;
        RAISE overlapping_booking;
      EXCEPTION
        WHEN NO_DATA_FOUND THEN
          -- good, no constraint violations
          NULL;
      END;
    EXCEPTION
      WHEN overlapping_booking THEN
        RAISE_APPLICATION_ERROR(-20000,
          'Overlapping booking for room #' || room_no);
    END validate_room;
    
    PROCEDURE validate_rooms IS
      room_no room.room_no%TYPE;
    BEGIN
      room_no := room_nos.FIRST;
      LOOP
        EXIT WHEN room_no IS NULL;
        validate_room (room_no);
        room_no := room_nos.NEXT(room_no);
      END LOOP;
      room_nos.DELETE;
    EXCEPTION
      WHEN OTHERS THEN
        room_nos.DELETE;
        RAISE;
    END validate_rooms;
    
    BEFORE EACH ROW IS
    BEGIN
      -- lock the header record (so other sessions
      -- can't modify the bookings for this room
      -- at the same time)
      lock_room(:NEW.room_no);
      -- remember the room_no to validate later
      room_nos(:NEW.room_no) := 'Y';
    END BEFORE EACH ROW;
    
    AFTER STATEMENT IS
    BEGIN
      validate_rooms;
    END AFTER STATEMENT;
    
    END room_booking_trg;
    /
    

That’s all you need. The trigger locks the header record for the room, so only one session can modify the bookings for a particular room at any one time. If you don’t have a table like “room” in your database that you can use for this purpose, you could use DBMS_LOCK instead (similar to the approach proposed in the OTN forum discussion here).

It would not be difficult to adapt this example for alternative requirements, e.g. where the From and To dates may be NULL, or where the overlapping criteria should allow date/time ranges that coincide at their endpoints (e.g. so that the date ranges (1-Feb-2000 to 2-Feb-2000) and (2-Feb-2000 to 3-Feb-2000) would not be considered to overlap). You’d just need to modify the comparison in the query in validate_room to take these requirements into account.
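
For example, here is a sketch of how the overlap test inside validate_room could be rewritten so that ranges merely touching at their endpoints are allowed (this would replace the two BETWEEN conditions; untested, so treat it as a starting point):

        AND rb1.booked_from < rb2.booked_to
        AND rb1.booked_to   > rb2.booked_from

With this version, (1-Feb-2000 to 2-Feb-2000) and (2-Feb-2000 to 3-Feb-2000) would not be considered to overlap, because the first range ends exactly where the second begins.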

Test case #1

INSERT INTO room (room_no) VALUES (101);
INSERT INTO room (room_no) VALUES (201);
INSERT INTO room (room_no) VALUES (301);

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (101, DATE '2000-01-01', DATE '2000-01-02' - 0.00001);

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (101, DATE '2000-01-02', DATE '2000-01-03' - 0.00001);

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (201, DATE '2000-02-01', DATE '2000-02-05' - 0.00001);

Expected: no errors

Test case #2

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (201, DATE '2000-02-01', DATE '2000-02-02' - 0.00001);

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (201, DATE '2000-02-02', DATE '2000-02-04' - 0.00001);

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (201, DATE '2000-02-03', DATE '2000-02-05' - 0.00001);

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (201, DATE '2000-02-03', DATE '2000-02-06' - 0.00001);

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (201, DATE '2000-01-31', DATE '2000-02-01');

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (201, DATE '2000-01-31', DATE '2000-02-02' - 0.00001);

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (201, DATE '2000-01-31', DATE '2000-02-06' - 0.00001);

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (201, DATE '2000-02-05' - 0.00001, DATE '2000-02-06' - 0.00001);

UPDATE room_booking SET booked_to = DATE '2000-01-02' - 0.00001
WHERE room_no = 101 AND booked_from = DATE '2000-01-02';

Expected: constraint violation on each statement

Test case #3

in session #1:

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (301, DATE '2000-01-01', DATE '2000-02-01' - 0.00001);

in session #2:

INSERT INTO room_booking (room_no, booked_from, booked_to)
VALUES (301, DATE '2000-01-15', DATE '2000-01-16' - 0.00001);

Expected: session #2 will wait until session #1 issues a COMMIT or ROLLBACK. If session #1 COMMITs, session #2 will then report a constraint violation. If session #1 rolls back, session #2 will complete without error.

The No-Trigger option #3: Materialized View

This is similar to a solution proposed by Rob Van Wijk. It uses a constraint on a materialized view to stop overlapping date ranges.

So, instead of the trigger, you would do something like this:

CREATE MATERIALIZED VIEW LOG ON room_booking WITH ROWID;

CREATE MATERIALIZED VIEW room_booking_date_ranges
  REFRESH FORCE ON COMMIT
  AS SELECT 'X' AS dummy
     FROM room_booking rb1
         ,room_booking rb2
     WHERE rb1.room_no = rb2.room_no
     AND rb1.booked_from != rb2.booked_from
     AND (
          rb1.booked_from BETWEEN rb2.booked_from
                              AND rb2.booked_to
          OR
          rb1.booked_to BETWEEN rb2.booked_from
                            AND rb2.booked_to
         );

ALTER TABLE room_booking_date_ranges
  ADD CONSTRAINT no_overlapping_dates_ck
  CHECK ( dummy = 'Z' );

The nice thing about this solution is that it is simpler to code, and seems more “declarative” in nature. Also, you don’t have to worry about concurrency at all.

The constraint is checked at COMMIT-time when the materialized view is refreshed; so it behaves like a deferred constraint, which may be an advantage for some scenarios.

I believe it may perform better than the trigger-based option when large volumes of data are inserted or updated; however, it may perform worse than the trigger-based option when you have lots of small transactions. This is because, unfortunately, the query here cannot be refreshed with REFRESH FAST ON COMMIT (if you know how it could be changed into a REFRESH FAST MV, please let me know!).

What do you think? If you see any potential issues with the above solutions please feel free to comment.

EDIT 30/8: added some more test cases


Which packages might raise “ORA-04068 existing state of package has been discarded”?

I recently saw this question on StackOverflow (“Is there any way to determine if a package has state in Oracle?”), which caught my attention.

You’re probably already aware that when a package is recompiled, any sessions that were using that package won’t even notice the change – unless that package has “state”, i.e. one or more package-level variables or constants. The current values of these variables and constants are kept in the PGA for each session; but if you recompile the package (or modify something on which the package depends), Oracle cannot know for certain whether the new version of the package needs the values of the variables to be reset or not, so it errs on the side of caution and discards them. The next time the session tries to access the package in any way, Oracle raises ORA-04068 and resets the package state. After that, the session can try again and it will work fine.
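
A quick demonstration of the behaviour (stateful_pkg is a made-up example):

-- session A:
CREATE OR REPLACE PACKAGE stateful_pkg AS
  g_counter NUMBER := 0;
END stateful_pkg;
/

-- session B instantiates the package state:
BEGIN stateful_pkg.g_counter := 1; END;
/

-- session A recompiles the package:
ALTER PACKAGE stateful_pkg COMPILE;

-- session B's next reference raises ORA-04068:
BEGIN stateful_pkg.g_counter := 2; END;
/
-- ...run it again, and it succeeds (the state has been reset)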

Side Note: There are a number of approaches to solving the ORA-04068 problem, some of which are given as answers to this question here. Not all of them are appropriate for every situation. Another approach not mentioned there is to avoid or minimize it – move all the package variables to a separate package, which hopefully will be invalidated less often.
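
For example, a sketch of that approach (names invented for illustration):

-- rarely-changed package holding only the state:
CREATE OR REPLACE PACKAGE my_app_state AS
  g_last_run   DATE;
  g_debug_mode BOOLEAN := FALSE;
END my_app_state;
/

-- frequently-changed package, now stateless – recompiling it
-- will no longer cause ORA-04068 in other sessions:
CREATE OR REPLACE PACKAGE my_app AS
  PROCEDURE run;
END my_app;
/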

It’s quite straightforward to tell whether a given package has “state” and thus has the potential for causing ORA-04068: look for any variables or constants declared in the package specification or body. If you have a lot of packages, however, you might want to get a listing of all of them. To do this, you can use the new PL/Scope feature introduced in Oracle 11g.

select object_name AS package,
       type,
       name AS variable_name
from user_identifiers
where object_type IN ('PACKAGE','PACKAGE BODY')
and usage = 'DECLARATION'
and type in ('VARIABLE','CONSTANT')
and usage_context_id in (
  select usage_id
  from user_identifiers
  where type = 'PACKAGE'
);

If you have compiled the packages in the schema with PL/Scope on (i.e. alter session set plscope_settings='IDENTIFIERS:ALL';), this query will list all the packages that will potentially have state, along with the variables and constants responsible.
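
For example, to gather the identifier data for one package (my_app is a placeholder name):

alter session set plscope_settings='IDENTIFIERS:ALL';
alter package my_app compile; -- recompiles spec and body, collecting PL/Scope data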

Before this question was raised, I hadn’t used PL/Scope for real; it was quite pleasing to see how easily it answered this particular question. This also illustrates a good reason why I like to hang out on StackOverflow – it’s a great way to learn something new every day.


Alexandria: May Cause Addiction

Ever since I downloaded the Alexandria PL/SQL library, I haven’t been able to put it down. Just recently I decided I wanted to serve up a whole lot of media files directly from Amazon’s S3 (Simple Storage Service), instead of serving them from within my EC2 (Elastic Compute Cloud) instance. They were just wasting my Linux server’s time responding to HTTP requests.

So, I quickly wrote the following code to transfer them:

DECLARE
  l_blob BLOB;
BEGIN
  /* initialise my AWS session */
  ALEX.amazon_aws_auth_pkg.init
    ( 'yyy-my-aws-id-yyy'
    , 'xxx-not-telling-xxx'
    , p_gmt_offset => -8);
  FOR rec IN (
    SELECT id, filename, mime_type, location
    FROM myfiles
    WHERE location = 'http://myserver/media/'
  ) LOOP
    /* read the file from its current location */
    l_blob := ALEX.http_util_pkg.get_blob_from_url
      (rec.location || rec.filename);
    IF DBMS_LOB.getLength(l_blob) > 0 THEN
      /* upload the file to Amazon S3 */
      ALEX.amazon_aws_s3_pkg.new_object
        ( 'mybucket'
        , rec.filename
        , l_blob
        , rec.mime_type
        , ALEX.amazon_aws_s3_pkg.g_acl_public_read);
      UPDATE myfiles
      SET location = 'http://mybucket.s3-ap-southeast-1.amazonaws.com/'
      WHERE id = rec.id;
      COMMIT;
    END IF;
  END LOOP;
END;

After a short while, all the files had been copied across to my bucket on S3, and my table updated so that my web site now points people’s browsers to the new location for those files.

Of course, I could have used UTL_FILE to read the files from disk, but then I’d have to first create a directory object, and write a loop to read each file in chunks into the BLOB. Why bother with all that when I can just call http_util_pkg.get_blob_from_url and get it all in one go?

That’s the trouble with powerful utilities like Alexandria: they’re too easy to use, make tasks like this trivial, and you start finding all sorts of uses for them. All of a sudden, Alexandria is your hammer, and the world is full of nails.

See also: this quick intro to using Alexandria’s API for Amazon S3.


Generating unique identifiers with “SELECT MAX(id) + 1”

Normally, when you see code like this in a production system, you should duck your head and run:

SELECT NVL( MAX( id ), 0 ) + 1
INTO   :new_id
FROM   mytable;

What’s wrong with this code?

I hope the first answer that rolls off your tongue has something to do with concurrency – i.e. two sessions that run this around the same time will not see uncommitted rows from each other, and so are likely to try to insert rows with conflicting identifiers.

I hope the second answer that you might mention has to do with performance – even considering there’s a unique index on the column, this code will still need to read at least one index block to get the latest ID (assuming the query optimiser chooses to do a MIN/MAX index scan so that it doesn’t have to scan the entire index before returning a result). In a high load system this cost might be unacceptable.

Of course, the first problem (concurrency) could be solved by serializing access to the “get the next ID” function, e.g. with a DBMS_LOCK. We all know, however, that there’s no sane reason to serialize this when Oracle already provides a perfectly good mechanism for generating unique IDs, with virtually no serialization – sequences.

CREATE SEQUENCE my_id_seq;
SELECT my_id_seq.NEXTVAL INTO :new_id FROM DUAL;

Sequences have the benefits of guaranteeing uniqueness, and if their “cache” setting is set appropriately, will add a negligible amount of overhead for serialization.
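
For example (a cache of 1000 is just an illustration; size it to suit your insert rate):

CREATE SEQUENCE my_id_seq CACHE 1000;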

Problem solved. Easy, right? I bet you’re wondering why I added the word “Normally” to my first sentence in this post….

Question: When might using “SELECT MAX(id) + 1” ever be an acceptable source of unique identifiers?

Answer: Global Temporary tables.

If I’ve inserted any rows into a global temporary table, by definition no other session can see my data, so the first consideration, concurrency, is not an issue.

Also, if I’m not expecting to ever insert many rows into my global temporary table, I can be reasonably confident that performance will not be an issue either. Plus, if I put an index on the ID column, that query will be quite inexpensive.

Conclusion: if you are using global temporary tables, you don’t have to use sequences to generate unique identifiers for them. I’m not saying you shouldn’t, of course – a sequence may be faster, and may even lead to simpler code in some cases – but in other cases you might decide to forego the sequence: one less object (plus, perhaps, its grants and synonyms) to deploy.
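
To illustrate, a minimal sketch (the GTT and its columns are made up):

CREATE GLOBAL TEMPORARY TABLE my_gtt
  (id      NUMBER NOT NULL
  ,payload VARCHAR2(100)
  ,CONSTRAINT my_gtt_pk PRIMARY KEY (id)
  ) ON COMMIT DELETE ROWS;

-- safe here: no other session can see my rows, and the
-- primary key index keeps the MAX inexpensive
SELECT NVL(MAX(id), 0) + 1 INTO :new_id FROM my_gtt;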

Now, of course, you have to ask yourself, why query the table at all? Why not store that latest ID in a private global variable in a package? In fact, we can create a simple package to replace the sequence, e.g.:

CREATE OR REPLACE PACKAGE my_table_pkg IS
  FUNCTION next_id RETURN my_table.id%TYPE;
END my_table_pkg;
/
CREATE OR REPLACE PACKAGE BODY my_table_pkg IS
  g_latest_id my_table.id%TYPE;
  FUNCTION next_id RETURN my_table.id%TYPE IS
  BEGIN
    g_latest_id := NVL(g_latest_id, 0) + 1;
    RETURN g_latest_id;
  END next_id;
END my_table_pkg;
/
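
Usage is then as simple as (assuming my_table also has, say, a name column):

INSERT INTO my_table (id, name)
VALUES (my_table_pkg.next_id, 'some value');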

Well, now you know what to do. Whenever you need to generate a unique set of identifiers for a global temporary table, you’ve got a choice of options: sequence, package variable, or a “max(id)+1” query.


Constrain a table to only 1 row

I needed a table that could only ever have one row – if anyone tried to insert a second row, they’d get an error. A unique index on a constant expression does the trick, since every row gets the same key value:

CREATE UNIQUE INDEX only_one_row_allowed ON mytable (1);

Testing it:

INSERT INTO mytable VALUES ('x');

1 row created.

INSERT INTO mytable VALUES ('x');

ORA-00001: unique constraint (SCOTT.ONLY_ONE_ROW_ALLOWED) violated


Quick and simple frequency analysis

I use this simple query quite often when exploring the data in a table in any Oracle database (from Oracle v8 onwards):

select q.*, 100 * ratio_to_report(c) over () rtr
from (select distinct v, count(*) over (partition by v) c
      from (select MYCOLUMN v from MYTABLE)
     ) q
order by c desc;

Just substitute the table name for “MYTABLE” and the column you’re interested in for “MYCOLUMN”. This gives a frequency analysis of values, e.g.:

V         C       RTR
========  ======  =============
INACTIVE  401001  92.9254049544
ACTIVE    30529   7.0745950455

V is the value from the column. C is the count of how many times that value appeared. RTR is the % ratio to the total. The first row indicates the most popular value.

If it’s a very large table and you want quicker results, you can run the analysis over a smaller sample easily, just by adding the SAMPLE keyword:

...
select MYCOLUMN v from MYTABLE SAMPLE(1)
...
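
For what it’s worth, the same frequency analysis can also be written with a plain GROUP BY, which some may find easier to read:

select MYCOLUMN v
      ,count(*) c
      ,100 * ratio_to_report(count(*)) over () rtr
from MYTABLE
group by MYCOLUMN
order by c desc;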


Add business days

It starts out as a fairly simple, innocent business requirement. Create a report to list records meeting some criteria, one of which is:

“List only records where today’s date is more than 35 business days after the due date from the record.”

When you delve deeper you find that querying the table with “DUE_DATE + 35 < SYSDATE” is not going to cut it – “business days” do not include weekends. You might start with something similar to this. But even that’s not good enough, because business days should not include public holidays. How do you code that?

So, here’s my solution.

1. We need to know what days are public holidays for the region. In our case this application is only applicable for a single region, so we use a simple table:

CREATE TABLE holidays (holiday_date DATE PRIMARY KEY);

We create a simple form for users to enter new holidays every year, and give someone the job of making sure it’s up-to-date every year when the public holidays are announced.

2. Create a view that lists all non-business days – i.e. list all weekends and public holidays. To retain reasonable performance, we limit our solution to dates in the years 2000 to 2050.

CREATE VIEW non_business_days AS
SELECT TO_DATE('01012000','DDMMYYYY') + (ROWNUM - 1) * 7
       AS day -- Saturdays 2000 to 2050
FROM DUAL CONNECT BY LEVEL <= 2661
UNION ALL
SELECT TO_DATE('02012000','DDMMYYYY') + (ROWNUM - 1) * 7
       AS day -- Sundays 2000 to 2050
FROM DUAL CONNECT BY LEVEL <= 2661
UNION ALL
SELECT holiday_date FROM holidays;

3. Now, when we need to take a date (p_date) and add x business days to it, we query this view to find all the non-business days that are applicable, e.g.:

SELECT day
      ,COUNT(*) OVER (ORDER BY day
                      ROWS BETWEEN UNBOUNDED PRECEDING
                      AND CURRENT ROW)
       AS count_so_far
      ,(day - p_date) AS base_days
FROM   NON_BUSINESS_DAYS
WHERE  day > p_date;

If you run this query and examine each row in order of day: for each non-business day, base_days minus count_so_far is the number of business days between p_date and that day. Take the last row where this is still less than x; the shortfall, x - (base_days - count_so_far), is the number of days to add to that non-business day to give us the answer. You’ll find this logic in the function below.

In our final solution, we’ll also need to UNION in the date parameter itself, for the case where there are no non-business days between the starting date and the result.

Here’s our function to take any date (at least, any date between 2000 and 2050) and add x business days (positive or negative):

FUNCTION add_working_days (p_date IN DATE, p_working_days IN NUMBER)
RETURN DATE IS
  l_date DATE;
BEGIN

  IF p_date IS NULL OR p_working_days IS NULL THEN
    RETURN NULL;
  END IF;

  IF p_working_days != TRUNC(p_working_days) THEN
    RAISE_APPLICATION_ERROR(-20000,
      'add_working_days: cannot handle fractional p_working_days ('
      || p_working_days || ')');
  END IF;

  IF p_working_days > 0 THEN

    SELECT MAX(day + p_working_days - (base_days - count_so_far))
    INTO l_date
    FROM (SELECT day
                ,COUNT(*) OVER (ORDER BY day
                                ROWS BETWEEN UNBOUNDED PRECEDING
                                AND CURRENT ROW)
                 AS count_so_far
                ,(day - p_date) AS base_days
          FROM NON_BUSINESS_DAYS
          WHERE day > p_date
          UNION
          SELECT p_date, 0, 0 FROM DUAL
         )
    WHERE base_days - count_so_far < p_working_days;

  ELSIF p_working_days < 0 THEN

    SELECT MIN(day - (ABS(p_working_days) - (base_days - count_so_far)))
    INTO l_date
    FROM (SELECT day
                ,COUNT(*) OVER (ORDER BY day DESC
                                ROWS BETWEEN UNBOUNDED PRECEDING
                                AND CURRENT ROW)
                 AS count_so_far
                ,(p_date - day) AS base_days
          FROM NON_BUSINESS_DAYS
          WHERE day < p_date
          UNION
          SELECT p_date, 0, 0 FROM DUAL
         )
    WHERE base_days - count_so_far < ABS(p_working_days);

  ELSE

    l_date := p_date;

  END IF;

  RETURN l_date;
END add_working_days;

Test cases (these are some public holidays in Western Australia):

insert into holidays values (to_date('27/12/2010','DD/MM/YYYY'));
insert into holidays values (to_date('28/12/2010','DD/MM/YYYY'));
insert into holidays values (to_date('03/01/2011','DD/MM/YYYY'));
insert into holidays values (to_date('26/01/2011','DD/MM/YYYY'));

-- Expected: 06/01/2011

select cls_util.add_working_days(to_date('13/12/2010','DD/MM/YYYY')
                                ,15) from dual;

-- Expected: 31/01/2011

select cls_util.add_working_days(to_date('25/01/2011','DD/MM/YYYY')
                                ,3) from dual;

-- Expected: 13/12/2010

select cls_util.add_working_days(to_date('06/01/2011','DD/MM/YYYY')
                                ,-15) from dual;

-- Expected: 25/01/2011

select cls_util.add_working_days(to_date('31/01/2011','DD/MM/YYYY')
                                ,-3) from dual;


SPOD for a query

I have two queries that need to be executed by a PL/SQL program, and I want a single point of definition (SPOD) for their common logic. Both queries are quite complex, and both have a large section that is identical, because they are different views of the same underlying source data.

One option is to expand out both queries in full, e.g.:

Query 1:

 SELECT <complicated expressions>
 FROM (
       <large complicated query>
      ), <other tables>
 WHERE <complicated predicates>;

Query 2:

 SELECT <different complicated expressions>
 FROM (
       <large complicated query>
      ), <other different tables>
 WHERE <different complicated predicates>;

I don’t like the fact that my <large complicated query> is repeated in full in both cursor definitions. I’d rather have one place where that subquery is defined, because it should remain the same for both queries, since they are supposed to be different views of the same underlying data.

Another option is to create a view on the <large complicated query>, and refer to that in both queries. This is a perfectly acceptable option, and one which I often use. The only downside is if there are any parameters that need to be “fed” to the view. One way is for the view to expose the parameter as a column in the view, and for the calling query to simply query it on that column. This is not always the most efficient method, however, depending on the complexity of the view and how well Oracle can “push down” the predicate into the view at execution time. Another solution to the parameter problem is to use a user-defined context as described here.
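
A sketch of the context approach (all names here are invented):

CREATE OR REPLACE CONTEXT my_ctx USING my_view_pkg;

-- in my_view_pkg, before opening the query:
DBMS_SESSION.set_context('MY_CTX', 'START_DATE',
  TO_CHAR(p_start_date, 'YYYY-MM-DD'));

-- and in the view definition:
-- WHERE t.the_date >= TO_DATE(SYS_CONTEXT('MY_CTX','START_DATE')
--                            ,'YYYY-MM-DD')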

The other downside which I don’t like for this case is that the view moves the query away from the package – I’d prefer to have the definitions close together and maintained in one location.

The solution which I used in this case is a pipelined function. For example:

 FUNCTION large_complicated_query
   RETURN source_data_table_type
   PIPELINED IS
   rc source_data_type;
 BEGIN
   FOR r IN (<large complicated query>) LOOP
     rc.col1 := r.col1;
     rc.col2 := r.col2;
     -- etc.
     PIPE ROW (rc);
   END LOOP;
   RETURN;
 END;

Now, the two queries in my package can re-use it like this:

 SELECT <complicated expressions>
 FROM TABLE(my_package.large_complicated_query)
      ,<other tables>
 WHERE <complicated predicates>;

In the package spec I have:

 -- *** dev note: for internal use only ***
 TYPE source_data_type IS
   RECORD (col1 col1_data_type, etc....);
 TYPE source_data_table_type IS TABLE OF source_data_type;
 FUNCTION large_complicated_query
   RETURN source_data_table_type PIPELINED;
 -- *** ******************************* ***

Because the pipelined function is going to be called by SQL (in fact, two queries defined in the same package), its declaration must also be added to the package spec.

In the package body, I use private global variable(s) to hold the parameter for the large complicated query.

When the queries are run, the global variable(s) must first be set to the required parameter. The queries are run, then the global variables are cleared.
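
In sketch form (g_param being the hypothetical private global read by large_complicated_query):

 PROCEDURE run_both_queries (p_param IN NUMBER) IS
 BEGIN
   g_param := p_param;
   FOR r IN (SELECT <complicated expressions>
             FROM TABLE(large_complicated_query)
                  ,<other tables>
             WHERE <complicated predicates>) LOOP
     <process the row>
   END LOOP;
   g_param := NULL;
 EXCEPTION
   WHEN OTHERS THEN
     g_param := NULL;
     RAISE;
 END run_both_queries;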

The pipelined function is deliberately not useful to other processes – if a developer tried to call it, they’d get no results because they can’t set the parameters (since they are declared as private globals).

A downside to this approach is that the optimizer will not be able to optimize the entire queries “as a whole” – it will execute the entire query in the pipelined function (at least, until the calling queries decide to stop fetching from it). For my case, however, this is not a problem. The entire process runs in less than a second – and this is 10 times faster than it needs to be. In other words, in this case maintainability is more important than performance.

There may be other ways to do this (in fact, I’m quite sure there are), but this way worked for me.


Handling unique constraint violations by Hibernate

A particular table in our system is a M:M link table between Bonds and Payments, imaginatively named BOND_PAYMENTS; and to make the Java devs’ jobs easier it has a surrogate key, BOND_PAYMENT_ID. Its structure, therefore, is basically:

BOND_PAYMENTS
  (BOND_PAYMENT_ID,
   BOND_NUMBER,
   PAYMENT_ID)

This is a very simple design, quite common in relational database designs. There is a Primary key constraint on BOND_PAYMENT_ID, and we’ve also added a Unique constraint on (BOND_NUMBER, PAYMENT_ID) since it makes no sense to have more than one link between a Bond and a Payment.

The application allows a user to view all the Payments linked to a particular Bond; and it allows them to create new links, and delete existing links. Once they’ve made all their desired changes on the page, they hit “Save”, and Hibernate does its magic to run the required SQL on the database. Unfortunately, this was failing with ORA-00001: unique constraint violated.

Now, the way this page works is that it compares the old set of payments for the bond with the new target set, and Hibernate works out which records need to be deleted, which need to be inserted, and leaves the rest untouched. Unfortunately, in its infinite wisdom it does the INSERTs first, then it does the DELETEs. Apparently this order can’t be changed.

This is the cause of the unique constraint violation – if the user deletes a link to a payment, then changes their mind and re-inserts a link to the same payment, Hibernate quite happily tries to insert it then delete it. Since these inserts/deletes are running as separate SQL statements, Oracle validates the constraint immediately on the first insert.

We had only a few options:

  1. Make the constraint deferrable
  2. Remove the unique constraint

Option 2 was not very palatable, because the constraint provides excellent protection from nasty application bugs that might allow inconsistent data to be saved. We went with option 1.

ALTER TABLE bond_payments ADD
  CONSTRAINT bond_payment_uk UNIQUE (bond_number, payment_id)
  DEFERRABLE INITIALLY DEFERRED;

This solved it – and no changes were required to the application. If a bug in the application were to cause it to try to insert a duplicate row, it would fail with ORA-02091 (transaction rolled back) and ORA-00001 (unique constraint violated) when the session COMMITs.
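
To illustrate (with made-up values):

INSERT INTO bond_payments (bond_payment_id, bond_number, payment_id)
VALUES (100, 'B123', 456); -- duplicates an existing link: succeeds, for now

COMMIT;
-- ORA-02091: transaction rolled back
-- ORA-00001: unique constraint (BOND_PAYMENT_UK) violated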

The only downside is that the index created to police this constraint is now a non-unique index, so it may be somewhat less efficient for queries. We decided this was an acceptable trade-off in this particular case.

If you know of any other options that we should have considered, let me know 🙂