JSON_MERGEPATCH is sneaky

The Oracle JSON functions are very useful for generating JSON from a query, but developing with them requires an understanding of the size limits of the string data types they return.

Unless otherwise specified, they return a VARCHAR2 with a maximum of 4000 bytes. If your query might return more than this, you must either specify a larger length, e.g. RETURNING VARCHAR2(32767), or request a CLOB, e.g. RETURNING CLOB.

If the data exceeds the limit, calls to JSON_OBJECT, JSON_OBJECTAGG, JSON_ARRAYAGG, and JSON_TRANSFORM will fail at runtime with the following exception:

select
    json_object(
        'name-is-twenty-chars' : rpad('x',3974,'x')
    )
from dual;

ORA-40478: output value too large (maximum: 4000)

The error occurs here because the representation of the entire JSON object requires more than 4000 bytes. No-one likes to see errors, but it’s better than the alternative because it is more likely to alert you to the problem so you can fix it.
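
For comparison, here is the same query with RETURNING CLOB added, which allows it to succeed:

select
    json_object(
        'name-is-twenty-chars' : rpad('x',3974,'x')
        returning clob
    )
from dual;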

You may have noticed I missed one of the JSON functions from the list above – JSON_MERGEPATCH. By default, this function does not raise an exception if the size limit is exceeded. Instead, it merely returns NULL at runtime. This behaviour can cause confusion when debugging a complex query, so it’s something to be aware of.

select
    json_mergepatch(
        json_object(
            'part1' : rpad('x',3973,'x')
            returning clob
        ),
        json_object(
            'part2' : rpad('x',3973,'x')
            returning clob
        )
    )
from dual;

(NULL)

Note that even though both of the inner JSON_OBJECT calls specify RETURNING CLOB, the clause was not applied to JSON_MERGEPATCH itself; its output is therefore limited to the default 4000 bytes, causing it to return NULL. The fix is to add RETURNING CLOB to the JSON_MERGEPATCH call:

select
    json_mergepatch(
        json_object(
            'part1' : rpad('x',3973,'x')
            returning clob
        ),
        json_object(
            'part2' : rpad('x',3973,'x')
            returning clob
        )
        returning clob
    )
from dual;

{"part1":"xxx...xxx","part2":"xxx...xxx"}

If you don’t like this option, there are others: the TRUNCATE and ERROR ON ERROR clauses give you more control over what happens when the size limit is exceeded.
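
These are hedged sketches based on the 19c SQL Language Reference (verify the exact syntax against your release). ERROR ON ERROR raises ORA-40478 instead of silently returning NULL; TRUNCATE returns as much of the document as fits in the declared length.

select
    json_mergepatch(
        json_object('part1' : rpad('x',3973,'x') returning clob),
        json_object('part2' : rpad('x',3973,'x') returning clob)
        error on error
    )
from dual;

select
    json_mergepatch(
        json_object('part1' : rpad('x',3973,'x') returning clob),
        json_object('part2' : rpad('x',3973,'x') returning clob)
        returning varchar2(4000) truncate
    )
from dual;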


Comparing Timestamps with Time Zone

If you break out into a sweat reading the title, it probably means that like me, you have had too little exposure to working with timestamps in Oracle.

(meme pretending that Gordon Ramsay says "I hate timezones ... in all timezones")

Until recently I never really had much to do with time zones because I lived in the (now even more so, due to covid) insular state of Western Australia. In WA most places pretend that there is no such thing as time zones – so our exposure to Oracle data types is limited to simple DATEs and TIMESTAMPs, with nary a time zone in sight. We just set the server time zone to AUSTRALIA/Perth and forget it.

Now I’ve helped build a system that needs to concurrently serve the needs of customers in any time zone – whether in the US, in Africa, or here in Australia. We therefore set the server time zone to UTC and use data types that support time zones, namely:

  • TIMESTAMP WITH TIME ZONE – for dates and times that need to include the relevant time zone;
    and
  • TIMESTAMP WITH LOCAL TIME ZONE – for dates and times of system events (e.g. record audit data) that we want to always be shown as of the session time zone (i.e. UTC), and we don’t care what time zone they were originally created in.
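
For example, a table supporting both patterns might look something like this (a minimal sketch; the table and column names are invented for illustration):

create table appointments
  (appointment_id   number primary key
  ,appointment_time timestamp with time zone not null  -- kept in the customer's own time zone
  ,created_dt       timestamp with local time zone
                    default systimestamp not null      -- audit column, always shown in the session time zone
  );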

A colleague came to me with the following issue: a business rule needed to check an appointment date/time against the current date; if the appointment was for the prior day, an error message should be shown saying that they were too late for their appointment. A test case was failing and they couldn’t see why.

Here is the code (somewhat obfuscated):

if appointment_time < trunc(current_time) then
    :p1_msg := 'This appointment was for the previous day and has expired.';
end if;

We had used TRUNC here because we want to check if the appointment time was prior to midnight of the current date, from the perspective of the relevant time zone. The values of appointment_time and current_time seemed to indicate it shouldn’t fail:

appointment_time = 05-MAR-2021 07.00.00.000000 AM AUSTRALIA/Perth
current_time     = 05-MAR-2021 06.45.00.000000 AM AUSTRALIA/Perth

We can see that the appointment time and current time are in the same time zone, and the same day – so the tester expected no error message would be shown. (Note that the “current time” here is computed using localtimestamp at the time zone of the record being compared)

After checking that our assumptions were correct (yes, both appointment_time and current_time are TIMESTAMP WITH TIME ZONEs; and yes, they had the values shown above) we ran a query on the database to start testing our assumptions about the logic being run here.

select
    to_timestamp_tz('05-MAR-2021 07.00.00.000000 AM AUSTRALIA/Perth') as appt_time,
    to_timestamp_tz('05-MAR-2021 06.45.00.000000 AM AUSTRALIA/Perth') as current_time
from dual

APPT_TIME    = '05-MAR-2021 07.00.00.000000000 AM AUSTRALIA/PERTH'
CURRENT_TIME = '05-MAR-2021 06.45.00.000000000 AM AUSTRALIA/PERTH'

So far so good. What does an ordinary comparison show for these values?

with q as (
    select
        to_timestamp_tz('05-MAR-2021 07.00.00.000000 AM AUSTRALIA/Perth') as appt_time,
        to_timestamp_tz('05-MAR-2021 06.45.00.000000 AM AUSTRALIA/Perth') as current_time
    from dual)
select
    q.appt_time,
    q.current_time,
    case when q.appt_time < q.current_time then 'FAIL' else 'SUCCESS' end test
from q;

APPT_TIME    = '05-MAR-2021 07.00.00.000000000 AM AUSTRALIA/PERTH'
CURRENT_TIME = '05-MAR-2021 06.45.00.000000000 AM AUSTRALIA/PERTH'
TEST         = 'SUCCESS'

That’s what we expected; the appointment time is not before the current time, so the test is successful. Now, let’s test the expression actually used in our failing code, where the TRUNC has been added:

with q as (
    select
        to_timestamp_tz('05-MAR-2021 07.00.00.000000 AM AUSTRALIA/Perth') as appt_time,
        to_timestamp_tz('05-MAR-2021 06.45.00.000000 AM AUSTRALIA/Perth') as current_time
    from dual)
select
    q.appt_time,
    q.current_time,
    trunc(q.current_time),
    case when q.appt_time < trunc(q.current_time) then 'FAIL' else 'SUCCESS' end test
from q;

APPT_TIME           = '05-MAR-2021 07.00.00.000000000 AM AUSTRALIA/PERTH'
CURRENT_TIME        = '05-MAR-2021 06.45.00.000000000 AM AUSTRALIA/PERTH'
TRUNC(CURRENT_TIME) = '03/05/2021'
TEST                = 'FAIL'

Good: we have reproduced the problem. Now we can try to work out why it is failing. My initial suspicion was that an implicit conversion was causing the issue – perhaps the appointment date was being converted to a DATE prior to the comparison, and was somehow being converted to the UTC time zone, which was the database time zone?

with q as (
    select
        to_timestamp_tz('05-MAR-2021 07.00.00.000000 AM AUSTRALIA/Perth') as appt_time,
        to_timestamp_tz('05-MAR-2021 06.45.00.000000 AM AUSTRALIA/Perth') as current_time
    from dual)
select
    q.appt_time,
    q.current_time,
    cast(q.appt_time as date),
    cast(q.current_time as date)
from q;

APPT_TIME                  = '05-MAR-2021 07.00.00.000000000 AM AUSTRALIA/PERTH'
CURRENT_TIME               = '05-MAR-2021 06.45.00.000000000 AM AUSTRALIA/PERTH'
CAST(APPT_TIME AS DATE)    = '03/05/2021 07:00:00 AM'
CAST(CURRENT_TIME AS DATE) = '03/05/2021 06:45:00 AM'

Nope. When cast to a DATE, both timestamps still fall on the same date. Then I thought, maybe when a DATE is compared with a TIMESTAMP, Oracle first converts the DATE to a TIMESTAMP?

with q as (
    select
        to_timestamp_tz('05-MAR-2021 07.00.00.000000 AM AUSTRALIA/Perth') as appt_time,
        to_timestamp_tz('05-MAR-2021 06.45.00.000000 AM AUSTRALIA/Perth') as current_time
    from dual)
select
    q.appt_time,
    q.current_time,
    cast(trunc(q.current_time) as timestamp with time zone),
    case when q.appt_time < trunc(q.current_time) then 'FAIL' else 'SUCCESS' end test
from q;

APPT_TIME                              = '05-MAR-2021 07.00.00.000000000 AM AUSTRALIA/PERTH'
CURRENT_TIME                           = '05-MAR-2021 06.45.00.000000000 AM AUSTRALIA/PERTH'
CAST(TRUNC(CURRENT_TIME) AS TIMESTAMP) = '05-MAR-2021 12.00.00.000000 AM +00:00'
TEST                                   = 'FAIL'

Ah! Now we can see the cause of our problem. TRUNC has converted the timestamp to a DATE (with no time zone information); since Oracle needs to implicitly convert this back to a TIMESTAMP WITH TIME ZONE for the comparison, it simply slaps the UTC time zone on it. The comparison is then effectively done in UTC: the appointment time, 7am in Perth, is 11pm UTC on the previous day, which is earlier than midnight UTC on 5 March – so the test fails.

Our original requirement was only to compare the dates involved, not the time of day; if the appointment was on the previous day (in the time zone relevant to the record), the error message should appear. We therefore need to ensure that Oracle performs no implicit conversion, by first converting the appointment time to a DATE:

with q as (
    select
        to_timestamp_tz('05-MAR-2021 07.00.00.000000 AM AUSTRALIA/Perth') as appt_time,
        to_timestamp_tz('05-MAR-2021 06.45.00.000000 AM AUSTRALIA/Perth') as current_time
    from dual)
select
    q.appt_time,
    q.current_time,
    case when cast(q.appt_time as date) < trunc(q.current_time) then 'FAIL' else 'SUCCESS' end test
from q;

APPT_TIME                              = '05-MAR-2021 07.00.00.000000000 AM AUSTRALIA/PERTH'
CURRENT_TIME                           = '05-MAR-2021 06.45.00.000000000 AM AUSTRALIA/PERTH'
TEST                                   = 'SUCCESS'

Our logic therefore should be:

if cast(appointment_time as date) < trunc(current_time) then
    :p1_msg := 'This appointment was for the previous day and has expired.';
end if;

It should be noted that if the tester had done this just an hour later in the day, they would not have noticed this problem – because Perth is +08:00, and both timestamps in the test data were prior to 8am Perth time (i.e. prior to midnight UTC).

Lesson #1: in any system that deals with timestamps and time zones it’s quite easy for subtle bugs to survive quite a bit of testing.

Lesson #2: when writing any comparison code involving timestamps and time zones, make sure that the data types are identical – and if they aren’t, add code to explicitly convert them first.
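
Another way to satisfy Lesson #2 is to avoid the DATE conversion entirely and build “midnight in the record’s time zone” explicitly as a TIMESTAMP WITH TIME ZONE, so there is no implicit conversion at all. A sketch (the hard-coded time zone is for illustration only; in practice it would come from the record):

with q as (
    select
        to_timestamp_tz('05-MAR-2021 07.00.00.000000 AM AUSTRALIA/Perth') as appt_time,
        to_timestamp_tz('05-MAR-2021 06.45.00.000000 AM AUSTRALIA/Perth') as current_time
    from dual)
select
    case
        when q.appt_time < from_tz(cast(trunc(q.current_time) as timestamp), 'Australia/Perth')
        then 'FAIL' else 'SUCCESS'
    end as test
from q;

TEST = 'SUCCESS'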


Saved Interactive Reports, Lost and Found

Sometimes we can too quickly say “No” when we should take a moment to think about whether we can actually say “Yes”. This was one of those times for me.

We had just gone live with a fairly big system move + upgrade – a suite of APEX applications shifted from one database instance running APEX 5 to another instance running version 19.1. The migration went smoothly. After the new instance was released to the business to do some final shakedown testing, they noticed one problem:

“All our saved reports are gone.”

Oops. We had built the new instance by simply importing each application from source control; since these scripts don’t include the Saved Report formats that users had customised in many of the Interactive Reports in prod, they didn’t get migrated to the new database.

When they asked if we can transfer the saved reports across, I initially replied, “Sorry, we can’t” – thinking that redoing the migration from scratch with a full export from the old database, followed by re-doing all the app changes, was going to take too much time and effort.

A minute later, I sent another email. “Hold that thought – there is a way.” I’d forgotten that my script for exporting the applications uses an APEX API with some parameters that until now I’d rarely used, but which would solve our problem. My script included this:

wwv_flow_utilities.export_application_to_clob (
    p_application_id => 100,
    p_export_ir_public_reports => 'N',
    p_export_ir_private_reports => 'N',
    p_export_ir_notifications => 'N');

NOTE: on more recent versions this API became supported as apex_export.get_application.
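
For reference, on those more recent versions the same export (including saved reports) can be requested via the supported API. A rough sketch only – the parameter names are from memory of the apex_export documentation, so check them against your release before relying on this:

declare
    l_files apex_t_export_files;
begin
    l_files := apex_export.get_application (
        p_application_id          => 100,
        p_with_ir_public_reports  => true,
        p_with_ir_private_reports => true,
        p_with_ir_notifications   => true);
    -- l_files(1).contents holds the export script; write it to a file
    -- using whatever CLOB-to-file routine you normally use.
    dbms_output.put_line(l_files(1).name);
end;
/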

In order to restore all the user saved reports, I created a new version of my export script, but changed all the “N”s into “Y”s:

wwv_flow_utilities.export_application_to_clob (
    p_application_id => 100,
    p_export_ir_public_reports => 'Y',
    p_export_ir_private_reports => 'Y',
    p_export_ir_notifications => 'Y');

I call this the “Yes” script. Using this “Yes” script, we performed the following steps:

  1. Log into the schema on the old database
    (we hadn’t shut it down yet, thankfully – we’d just locked the schemas and set all the applications to “Unavailable”)
  2. Run the “Yes” script.
    Rename the resulting export files, e.g. f100_ir.sql, f110_ir.sql, etc.
  3. Log into the new database.
  4. Run the IR scripts (f100_ir.sql, etc.).
    This reverted all the applications to their old versions, but they included all the users’ saved reports.
  5. Re-import the new versions of the applications from source control.
    This upgraded all the applications, leaving the users’ saved reports intact.

Problem solved.


Many happy birthdays

Last year I *(not pictured) celebrated my 42nd circuit around the sun. In accordance with time-honoured tradition, it has been celebrated some time around the same day each September with variations on the following theme:

  • a get-together with friends and/or family
  • my favourite meal (usually lasagne)
  • my favourite cake (usually a sponge coffee torte, yum)
  • a gift or two
  • the taking of photographs, to document how much I’ve grown since the last one

Each year, determining the date this anniversary should fall on is a simple calculation combining the current year with the month and day-of-month. So, in the example of the special, but somewhat disadvantaged, people whose birthday falls on Christmas day (if you are among this select group, you have my sympathies), we could calculate their birthdays using a simple SQL expression like this:

with testdata as (
select date'2000-12-25' as d1
from dual)
select rownum-1 as age
      ,extract(day from d1)
       || '-' || to_char(d1,'MON')
       || '-' || (extract(year from d1)+rownum-1) as d1
from testdata connect by level <= 12;

AGE D1
0   25-DEC-2000
1   25-DEC-2001
2   25-DEC-2002
3   25-DEC-2003
4   25-DEC-2004
5   25-DEC-2005
6   25-DEC-2006
7   25-DEC-2007
8   25-DEC-2008
9   25-DEC-2009
10  25-DEC-2010
11  25-DEC-2011

Of course, as you should well know, this is wrong. It assumes that every year has every day that the anniversary might fall on. If a person is in that very special group of people who were born on the leap day of a leap year, our algorithm produces invalid dates in non-leap years:

with testdata as (
select date'2000-12-25' as d1
      ,date'2000-02-29' as d2
from dual)
select rownum-1 as age
      ,extract(day from d1)
       || '-' || to_char(d1,'MON')
       || '-' || (extract(year from d1)+rownum-1) as d1
      ,extract(day from d2)
       || '-' || to_char(d2,'MON')
       || '-' || (extract(year from d1)+rownum-1) as d2
from testdata connect by level <= 12;

AGE D1          D2
0   25-DEC-2000	29-FEB-2000
1   25-DEC-2001	29-FEB-2001 **INVALID**
2   25-DEC-2002	29-FEB-2002 **INVALID**
3   25-DEC-2003	29-FEB-2003 **INVALID**
4   25-DEC-2004	29-FEB-2004
5   25-DEC-2005	29-FEB-2005 **INVALID**
6   25-DEC-2006	29-FEB-2006 **INVALID**
7   25-DEC-2007	29-FEB-2007 **INVALID**
8   25-DEC-2008	29-FEB-2008
9   25-DEC-2009	29-FEB-2009 **INVALID**
10  25-DEC-2010	29-FEB-2010 **INVALID**
11  25-DEC-2011	29-FEB-2011 **INVALID**

This is because we are constructing a string which may or may not represent a real date in our crazy calendar system. So any self-respecting Oracle developer will know that the “correct” way of calculating this sort of thing is to use the ADD_MONTHS function, generously gifted to all of us for free:

with testdata as (
select date'2000-12-25' as d1
      ,date'2000-02-29' as d2
from dual)
select rownum-1 as age
      ,add_months(d1,12*(rownum-1)) as d1
      ,add_months(d2,12*(rownum-1)) as d2
from testdata connect by level <= 12;

AGE D1          D2
0   25-DEC-2000	29-FEB-2000
1   25-DEC-2001	28-FEB-2001
2   25-DEC-2002	28-FEB-2002
3   25-DEC-2003	28-FEB-2003
4   25-DEC-2004	29-FEB-2004
5   25-DEC-2005	28-FEB-2005
6   25-DEC-2006	28-FEB-2006
7   25-DEC-2007	28-FEB-2007
8   25-DEC-2008	29-FEB-2008
9   25-DEC-2009	28-FEB-2009
10  25-DEC-2010	28-FEB-2010
11  25-DEC-2011	28-FEB-2011

Hurrah, we now have valid dates (this is guaranteed by ADD_MONTHS), and those poor souls born on a leap day can still celebrate their birthday, albeit on the 28th of the month in non-leap years. Some of the more pedantic of these might wait until the following day (the 1st of March) to celebrate their birthday, but for the purpose of our calculation here the 28th is a little simpler to work with.

We package up our software and deliver it to the customer, it passes all their tests and it goes into Production where it works quite fine – for about a week or so.

Someone notices that SOME people, whose birthday did NOT fall on a leap day but on the 28th of February, are being assigned the 29th of February as their day of celebration in every leap year. However, not everyone has this problem: other people whose birthday is also on the 28th of February are being correctly calculated as the 28th of February whether it’s a leap year or not.

Obviously there’s a bug in Oracle’s code, somewhere. Maybe. Well, not a bug so much – this is due to the way that ADD_MONTHS chooses to solve the problem of “adding one month” when a “month” is not defined as a constant number of days. ADD_MONTHS attempts to satisfy the requirements of most applications: if you start from the last day of one month, the result of ADD_MONTHS will also be the last day of its month. So add_months(date'2000-06-30', 1) = date'2000-07-31', and add_months(date'2000-05-31', 1) = date'2000-06-30'.
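
A quick demonstration of this end-of-month behaviour (results per the documented ADD_MONTHS semantics):

select add_months(date'2000-06-29', 1)  as not_month_end  -- 29-JUL-2000: day of month preserved
      ,add_months(date'2000-06-30', 1)  as month_end      -- 31-JUL-2000: last day maps to last day
      ,add_months(date'2000-05-31', 1)  as shorter_month  -- 30-JUN-2000: clipped to the last day of June
      ,add_months(date'2001-02-28', 36) as leap_year      -- 29-FEB-2004: the surprise behind our bug
from dual;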

Let’s have a look at those dates. There’s one person whose birthday was 28 Feb 2000 and our algorithm is setting their anniversary as the 28th of February regardless of year. That’s fine. There’s another person who was born a year later on 28 Feb 2001, and our algorithm is setting their “gimme gimme” day to the 29th of February in each subsequent leap year. That’s not what we want.

with testdata as (
select date'2000-12-25' as d1
      ,date'2000-02-29' as d2
      ,date'2000-02-28' as d3
      ,date'2001-02-28' as d4
from dual)
select rownum-1 as age
      ,add_months(d1,12*(rownum-1)) as d1
      ,add_months(d2,12*(rownum-1)) as d2
      ,add_months(d3,12*(rownum-1)) as d3
      ,add_months(d4,12*(rownum-1)) as d4
from testdata connect by level <= 12;

AGE D1          D2          D3          D4
0   25-DEC-2000	29-FEB-2000 28-FEB-2000 28-FEB-2001
1   25-DEC-2001	28-FEB-2001 28-FEB-2001 28-FEB-2002
2   25-DEC-2002	28-FEB-2002 28-FEB-2002 28-FEB-2003
3   25-DEC-2003	28-FEB-2003 28-FEB-2003 29-FEB-2004 **D4 is INCORRECT**
4   25-DEC-2004	29-FEB-2004 28-FEB-2004 28-FEB-2005
5   25-DEC-2005	28-FEB-2005 28-FEB-2005 28-FEB-2006
6   25-DEC-2006	28-FEB-2006 28-FEB-2006 28-FEB-2007
7   25-DEC-2007	28-FEB-2007 28-FEB-2007 29-FEB-2008 **D4 is INCORRECT**
8   25-DEC-2008	29-FEB-2008 28-FEB-2008 28-FEB-2009
9   25-DEC-2009	28-FEB-2009 28-FEB-2009 28-FEB-2010
10  25-DEC-2010	28-FEB-2010 28-FEB-2010 28-FEB-2011
11  25-DEC-2011	28-FEB-2011 28-FEB-2011 29-FEB-2012 **D4 is INCORRECT**

Edge cases. Always with the edge cases. How shall we fix this? We’ll have to pick out those especially special people who were born on the 28th of February in a non-leap year and add some special handling.

with
  function birthday (d in date, age in number) return date is
  begin
    if to_char(d,'DD/MM') = '28/02'
    and to_char(add_months(d,age*12),'DD/MM') = '29/02'
    then
      return add_months(d,age*12)-1;
    else
      return add_months(d,age*12);
    end if;
  end;
select * from (
  with testdata as (
  select date'2000-12-25' as d1
        ,date'2000-02-29' as d2
        ,date'2000-02-28' as d3
        ,date'2001-02-28' as d4
  from dual)
  select rownum-1 as age
        ,birthday(d1,rownum-1) as d1
        ,birthday(d2,rownum-1) as d2
        ,birthday(d3,rownum-1) as d3
        ,birthday(d4,rownum-1) as d4
  from testdata connect by level <= 12
);

LiveSQL

AGE D1          D2          D3          D4
0   25-DEC-2000 29-FEB-2000 28-FEB-2000 28-FEB-2001
1   25-DEC-2001 28-FEB-2001 28-FEB-2001 28-FEB-2002
2   25-DEC-2002 28-FEB-2002 28-FEB-2002 28-FEB-2003
3   25-DEC-2003 28-FEB-2003 28-FEB-2003 28-FEB-2004
4   25-DEC-2004 29-FEB-2004 28-FEB-2004 28-FEB-2005
5   25-DEC-2005 28-FEB-2005 28-FEB-2005 28-FEB-2006
6   25-DEC-2006 28-FEB-2006 28-FEB-2006 28-FEB-2007
7   25-DEC-2007 28-FEB-2007 28-FEB-2007 28-FEB-2008
8   25-DEC-2008 29-FEB-2008 28-FEB-2008 28-FEB-2009
9   25-DEC-2009 28-FEB-2009 28-FEB-2009 28-FEB-2010
10  25-DEC-2010 28-FEB-2010 28-FEB-2010 28-FEB-2011
11  25-DEC-2011 28-FEB-2011 28-FEB-2011 28-FEB-2012

Now that’s what I call a happy birthday, for everyone – no matter how special.

p.s. that’s a picture of Tony Robbins at the top of this post. He’s a leap day kid. I don’t know when he celebrates his birthdays, but as he says, “the past does not equal the future.”


Unique constraint WWV_FLOW_WORKSHEET_RPTS_UK violated

If your APEX application import log shows something like this:

...PAGE 73: Transaction Lines Report
declare
*
ERROR at line 1:
ORA-00001: unique constraint (APEX_040200.WWV_FLOW_WORKSHEET_RPTS_UK)
violated
ORA-06512: at "APEX_040200.WWV_FLOW_API", line 16271
ORA-06512: at line 6

(this is on an Apex 4.2.4 instance)

This is due to a Saved Report on an Interactive Report that was included in the export, which conflicts with a different Saved Report in the target instance. The log will, conveniently, tell you which page the IR is on.

The solution for this problem is simple – either:

(a) Export the application with Export Public Interactive Reports and Export Private Interactive Reports set to No;
OR
(b) Delete the Saved Report(s) from the instance you’re exporting from.

You can find all Saved Reports in an instance by running a query like this:

select workspace
      ,application_id
      ,application_name
      ,page_id
      ,application_user
      ,report_name
      ,report_alias
      ,status
from APEX_APPLICATION_PAGE_IR_RPT
where application_user not in ('APXWS_DEFAULT'
                              ,'APXWS_ALTERNATIVE');

You can delete Saved Reports from the Application Builder by going to the page with the Interactive Report, right-click on the IR and choose Edit Saved Reports, then select the report(s) and click Delete Checked.
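
If there are a lot of saved reports to clean up, option (b) can also be scripted with the APEX_IR API. This is a rough sketch only – the application id is an example value, and the API needs to run in the right workspace context (e.g. set apex_util.set_security_group_id first if connecting as the schema owner), so test it somewhere safe first:

begin
  for r in (select report_id
            from   apex_application_page_ir_rpt
            where  application_id = 100
            and    application_user not in ('APXWS_DEFAULT','APXWS_ALTERNATIVE')) loop
    apex_ir.delete_report(p_report_id => r.report_id);
  end loop;
  commit;
end;
/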


SQL Problem Solved: Merge History Tables

Yesterday I published a SQL problem (that I had solved – or at least, thought I had solved!) and invited readers to submit their own solutions.

SPOILER ALERT: if you haven’t had a chance to have a go yourself, go back, don’t read the comments (yet), but see if you can solve it yourself.

I was very pleased to see two excellent submissions that were both correct as far as I could tell, and are (in my opinion) sufficiently simple, elegant and efficient. Note that this was not a competition and I will not be ranking or rating or anything like that. If you need to solve a similar problem, I would suggest you try all of these solutions out and pick the one that seems to work the best for you.

Solution #1

Stew Ashton supplied the first solution which was very simple and elegant:

with dates as (
  select distinct deptid, dte from (
    select deptid, from_date, to_date+1 to_date from t1
    union all
    select deptid, from_date, to_date+1 to_date from t2
  )
  unpivot(dte for col in(from_date, to_date))
)
select * from (
  select distinct a.deptid, dte from_date,
  lead(dte) over(partition by a.deptid order by dte) - 1 to_date,
  data1, data2 from dates a
  left join t1 b on a.deptid = b.deptid and a.dte between b.from_date and b.to_date
  left join t2 c on a.deptid = c.deptid and a.dte between c.from_date and c.to_date
)
where data1 is not null or data2 is not null
order by 1,2;

SQL Fiddle

Stew’s solution does a quick pass over the two tables and generates a “date dimension” table in memory; it then does left joins to each table to generate each row. It may have a disadvantage due to the requirement to make two passes over the tables, which might be mitigated by suitable indexing. As pointed out by Matthias Rogel, this solution can be easily generalised to more than two tables, which is a plus.

Solution #2

The second solution was provided by Kim Berg Hansen which was a bit longer but in my opinion, still simple and elegant:

with v1 as (
   select t.deptid
        , case r.l
             when 1 then t.from_date
             when 2 then date '0001-01-01'
             when 3 then t.max_to + 1
          end from_date
        , case r.l
             when 1 then t.to_date
             when 2 then t.min_from -1
             when 3 then date '9999-01-01'
          end to_date
        , case r.l
             when 1 then t.data1
          end data1
     from (
      select deptid
           , from_date
           , to_date
           , data1
           , row_number() over (partition by deptid order by from_date) rn
           , min(from_date) over (partition by deptid) min_from
           , max(to_date) over (partition by deptid) max_to
        from t1
     ) t, lateral( 
         select level l
           from dual
         connect by level <= case t.rn when 1 then 3 else 1 end
     ) r
), v2 as (
   select t.deptid
        , case r.l
             when 1 then t.from_date
             when 2 then date '0001-01-01'
             when 3 then t.max_to + 1
          end from_date
        , case r.l
             when 1 then t.to_date
             when 2 then t.min_from -1
             when 3 then date '9999-01-01'
          end to_date
        , case r.l
             when 1 then t.data2
          end data2
     from (
      select deptid
           , from_date
           , to_date
           , data2
           , row_number() over (partition by deptid order by from_date) rn
           , min(from_date) over (partition by deptid) min_from
           , max(to_date) over (partition by deptid) max_to
        from t2
     ) t, lateral( 
         select level l
           from dual
         connect by level <= case t.rn when 1 then 3 else 1 end
     ) r
)
select v1.deptid
     , greatest(v1.from_date, v2.from_date) from_date
     , least(v1.to_date, v2.to_date) to_date
     , v1.data1
     , v2.data2
  from v1
  join v2
      on v2.deptid = v1.deptid
     and v2.to_date >= v1.from_date
     and v2.from_date <= v1.to_date
 where v2.data2 is not null or v1.data1 is not null
 order by v1.deptid
        , greatest(v1.from_date, v2.from_date)

Since it takes advantage of 12c’s new LATERAL feature, I wasn’t able to test it on my system, but I was able to test it on Oracle’s new Live SQL and it passed with flying colours. I backported it to 11g and it worked just fine on my system as well. An advantage of Kim’s solution is that it should only make one pass over each of the two tables.

Oracle Live SQL: merge-history-kimberghansen (this is the 12c version – Oracle login required)

SQL Fiddle (this is my attempt to backport to 11g – you can judge for yourself if I was faithful to the original)

Solution #3

When my colleague came to me with this problem, I started by creating a test suite including as many test cases as we could think of. Developing and testing the solution against these test cases proved invaluable and allowed a number of ideas to be tested and thrown away. Big thanks go to Kim Berg Hansen for providing an additional test case, which revealed a small bug in my original solution.

with q1 as (
select nvl(t1.deptid, t2.deptid) as deptid
      ,greatest(nvl(t1.from_date, t2.from_date), nvl(t2.from_date, t1.from_date)) as gt_from_date
      ,least(nvl(t1.to_date, t2.to_date), nvl(t2.to_date, t1.to_date))            as lt_to_date
      ,t1.from_date AS t1_from_date
      ,t1.to_date   AS t1_to_date
      ,t2.from_date AS t2_from_date
      ,t2.to_date   AS t2_to_date
      ,min(greatest(t1.from_date, t2.from_date)) over (partition by t1.deptid) - 1 as fr_to_date
      ,max(least(t1.to_date, t2.to_date)) over (partition by t2.deptid)        + 1 as lr_from_date
      ,t1.data1
      ,t2.data2
      ,row_number() over (partition by t1.deptid, t2.deptid order by t1.from_date) as rn_fr1
      ,row_number() over (partition by t1.deptid, t2.deptid order by t2.from_date) as rn_fr2
      ,row_number() over (partition by t1.deptid, t2.deptid order by t1.to_date desc) as rn_lr1
      ,row_number() over (partition by t1.deptid, t2.deptid order by t2.to_date desc) as rn_lr2
from   t1
full outer join t2
on     t1.deptid = t2.deptid
and    (t1.from_date between t2.from_date and t2.to_date
        or t1.to_date between t2.from_date and t2.to_date
        or t2.from_date between t1.from_date and t1.to_date
        or t2.to_date between t1.from_date and t1.to_date)
order by 1, 2
)
select deptid
      ,gt_from_date AS from_date
      ,lt_to_date   AS to_date
      ,data1
      ,data2
from   q1
union all
select deptid
      ,t1_from_date as from_date
      ,fr_to_date AS to_date
      ,data1
      ,'' AS data2
from   q1
where  fr_to_date between t1_from_date and t1_to_date
and    rn_fr1 = 1
union all
select deptid
      ,t2_from_date as from_date
      ,fr_to_date AS to_date
      ,'' AS data1
      ,data2
from   q1
where  fr_to_date between t2_from_date and t2_to_date
and    rn_fr2 = 1
union all
select deptid
      ,lr_from_date as from_date
      ,t2_to_date AS to_date
      ,'' AS data1
      ,data2
from   q1
where  lr_from_date between t2_from_date and t2_to_date
and    rn_lr2 = 1
union all
select deptid
      ,lr_from_date as from_date
      ,t1_to_date AS to_date
      ,data1
      ,'' AS data2
from   q1
where  lr_from_date between t1_from_date and t1_to_date
and    rn_lr1 = 1
order by deptid, from_date

SQL Fiddle

This solution is probably the ugliest of the three, and though it only requires one pass over the two tables, it materializes all the data in a temp table so probably requires a fair amount of memory for large data sets.

I’m so glad I posted this problem – I received help in the form of two quite different solutions, which help to show different ways of approaching problems like this. Not only that, but it highlighted the value of test-driven development – instead of groping in the dark to come up with a quick-and-dirty solution (that might or might not work), starting with comprehensive test cases is invaluable for both developing and validating potential solutions.


SQL Problem: Merge record history tables

A colleague asked me if I could help solve a problem in SQL, and I was grateful because I love solving problems in SQL! So we sat down, built a test suite, and worked together to solve it.

I will first present the problem and invite you to have a go at solving it for yourself; tomorrow I will post the solution we came up with. I think our solution is reasonably simple and elegant, but I’m certain there are other solutions as well, some of which may be more elegant or efficient.

I have two “record history” tables, from different data sources, that record the history of changes to two sets of columns describing the same entity. Here are the example tables:

T1 (deptid, from_date, to_date, data1)
T2 (deptid, from_date, to_date, data2)

So T1 records the changes in the column “data1” over time for each deptid; and T2 records the changes in the column “data2” over time for each deptid.

Assumptions/Constraints on T1 and T2:

  • All columns are not Null.
  • All dates are just dates (no time values).
  • All date ranges are valid (i.e. from_date <= to_date).
  • Within each table, all date ranges for a particular deptid are gap-free and have no overlaps.

We want to produce a query that returns the following structure:

T3 (deptid, from_date, to_date, data1, data2)

Where the dates for a particular deptid overlap between T1 and T2, we want to see the corresponding data1 and data2 values. Where the dates do not overlap (e.g. where we have a value for data1 in T1 but not in T2), we want to see the corresponding data1 but NULL for data2; and vice versa.

Below is the setup and test cases you can use if you wish. Alternatively, you can use the SQL Fiddle here.

create table t1
  (deptid number not null
  ,from_date date not null
  ,to_date date not null
  ,data1 varchar2(30) not null
  ,primary key (deptid, from_date)
  );

create table t2
  (deptid number not null
  ,from_date date not null
  ,to_date date not null
  ,data2 varchar2(30) not null
  ,primary key (deptid, from_date)
  );

--Test Case 1 (4001)
--T1  |-------------||-------------|
--T2         |-------------||-------------||-------------|

insert into t1 values (4001, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t1 values (4001, date'2015-02-01', date'2015-02-28', 'feb data 1');
insert into t2 values (4001, date'2015-01-15', date'2015-02-14', 'jan-feb data 2');
insert into t2 values (4001, date'2015-02-15', date'2015-03-14', 'feb-mar data 2');
insert into t2 values (4001, date'2015-03-15', date'2015-04-14', 'mar-apr data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4001    01-JAN-2015  14-JAN-2015  jan data 1
--4001    15-JAN-2015  31-JAN-2015  jan data 1      jan-feb data 2
--4001    01-FEB-2015  14-FEB-2015  feb data 1      jan-feb data 2
--4001    15-FEB-2015  28-FEB-2015  feb data 1      feb-mar data 2
--4001    01-MAR-2015  14-MAR-2015                  feb-mar data 2
--4001    15-MAR-2015  14-APR-2015                  mar-apr data 2

--Test Case 2 (4002)
--T1         |-------------||-------------||-------------|
--T2  |-------------||-------------|

insert into t1 values (4002, date'2015-01-15', date'2015-02-14', 'jan-feb data 1');
insert into t1 values (4002, date'2015-02-15', date'2015-03-14', 'feb-mar data 1');
insert into t1 values (4002, date'2015-03-15', date'2015-04-14', 'mar-apr data 1');
insert into t2 values (4002, date'2015-01-01', date'2015-01-31', 'jan data 2');
insert into t2 values (4002, date'2015-02-01', date'2015-02-28', 'feb data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4002    01-JAN-2015  14-JAN-2015                  jan data 2
--4002    15-JAN-2015  31-JAN-2015  jan-feb data 1  jan data 2
--4002    01-FEB-2015  14-FEB-2015  jan-feb data 1  feb data 2
--4002    15-FEB-2015  28-FEB-2015  feb-mar data 1  feb data 2
--4002    01-MAR-2015  14-MAR-2015  feb-mar data 1
--4002    15-MAR-2015  14-APR-2015  mar-apr data 1

--Test Case 3 (4003)
--T1  |-------------|
--T2  |-------------|

insert into t1 values (4003, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t2 values (4003, date'2015-01-01', date'2015-01-31', 'jan data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4003    01-JAN-2015  31-JAN-2015  jan data 1      jan data 2

--Test Case 4 (4004)
--T1  |-------------|
--T2      |-----|

insert into t1 values (4004, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t2 values (4004, date'2015-01-10', date'2015-01-20', 'jan data int 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4004    01-JAN-2015  09-JAN-2015  jan data 1
--4004    10-JAN-2015  20-JAN-2015  jan data 1      jan data int 2
--4004    21-JAN-2015  31-JAN-2015  jan data 1

--Test Case 5 (4005)
--T1      |-----|
--T2  |-------------|

insert into t1 values (4005, date'2015-01-10', date'2015-01-20', 'jan data int 1');
insert into t2 values (4005, date'2015-01-01', date'2015-01-31', 'jan data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4005    01-JAN-2015  09-JAN-2015                  jan data 2
--4005    10-JAN-2015  20-JAN-2015  jan data int 1  jan data 2
--4005    21-JAN-2015  31-JAN-2015                  jan data 2

--Test Case 6 (4006)
--T1  ||------------|
--T2  |-------------|

insert into t1 values (4006, date'2015-01-01', date'2015-01-01', 'jan data a 1');
insert into t1 values (4006, date'2015-01-02', date'2015-01-31', 'jan data b 1');
insert into t2 values (4006, date'2015-01-01', date'2015-01-31', 'jan data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4006    01-JAN-2015  01-JAN-2015  jan data a 1    jan data 2
--4006    02-JAN-2015  31-JAN-2015  jan data b 1    jan data 2

--Test Case 7 (4007)
--T1  |-------------|
--T2  ||------------|

insert into t1 values (4007, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t2 values (4007, date'2015-01-01', date'2015-01-01', 'jan data a 2');
insert into t2 values (4007, date'2015-01-02', date'2015-01-31', 'jan data b 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4007    01-JAN-2015  01-JAN-2015  jan data 1      jan data a 2
--4007    02-JAN-2015  31-JAN-2015  jan data 1      jan data b 2

--Test Case 8 (4008)
--T1  |-------------|
--T2    |---||---||---|
 
insert into t1 values (4008, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t2 values (4008, date'2015-01-09', date'2015-01-17', 'jan data a 2');
insert into t2 values (4008, date'2015-01-18', date'2015-01-26', 'jan data b 2');
insert into t2 values (4008, date'2015-01-27', date'2015-02-04', 'jan-feb data 2');
 
--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4008    01-JAN-2015  08-JAN-2015  jan data 1
--4008    09-JAN-2015  17-JAN-2015  jan data 1      jan data a 2
--4008    18-JAN-2015  26-JAN-2015  jan data 1      jan data b 2
--4008    27-JAN-2015  31-JAN-2015  jan data 1      jan-feb data 2
--4008    01-FEB-2015  04-FEB-2015                  jan-feb data 2

--Test Case 9 (4009)
--T1    |---||---||---|
--T2  |-------------|
 
insert into t1 values (4009, date'2015-01-09', date'2015-01-17', 'jan data a 1');
insert into t1 values (4009, date'2015-01-18', date'2015-01-26', 'jan data b 1');
insert into t1 values (4009, date'2015-01-27', date'2015-02-04', 'jan-feb data 1');
insert into t2 values (4009, date'2015-01-01', date'2015-01-31', 'jan data 2');
 
--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4009    01-JAN-2015  08-JAN-2015                  jan data 2
--4009    09-JAN-2015  17-JAN-2015  jan data a 1    jan data 2
--4009    18-JAN-2015  26-JAN-2015  jan data b 1    jan data 2
--4009    27-JAN-2015  31-JAN-2015  jan-feb data 1  jan data 2
--4009    01-FEB-2015  04-FEB-2015  jan-feb data 1

EDIT 15/10/2015: added Kim Berg Hansen’s test cases

UPDATE 15/10/2015: solutions (spoiler alert!)


Static File not updating in runtime APEX environment

The UAT environment is a runtime APEX installation (4.2.4.00.08) and all deployments are done via SQL scripts. My application uses a small number of static files that for convenience we are serving from APEX (at least for now); to deploy changes to these static files, I export f100.sql and static_file.sql from APEX in dev and we run them in the target environment after setting a few variables, like this:

declare
  v_workspace CONSTANT VARCHAR2(100) := 'MYWORKSPACE';
  v_workspace_id NUMBER;
begin
  select workspace_id into v_workspace_id
  from apex_workspaces where workspace = v_workspace;
  apex_application_install.set_workspace_id (v_workspace_id);
  apex_util.set_security_group_id
    (p_security_group_id => apex_application_install.get_workspace_id);
  apex_application_install.set_schema('MYSCHEMA');
  apex_application_install.set_application_id(100);
end;
/

@f100.sql
@static_file.sql

Many months after this application went live, and after multiple deployments in all the environments, we suddenly had an issue where the static files being served from one instance (UAT) were an older version. The logs showed the correct files had been deployed, and re-deploying into DEV seemed to work fine. I got the DBA to temporarily change the schema password in UAT so I could log in and see what was going on.

When I ran this query in DEV, I got the expected two records:

select * from apex_workspace_files
where file_name in ('myapp.css', 'myapp.js');

When I ran it in UAT, I got four records – two copies of each file, and the BLOB contents showed that the older copies were the ones being served to the clients. I have no idea how the extra copies got created in that environment. It must have been due to a failed deployment but the deployment logs didn’t seem to show any errors or anomalies.

Killing the Zombie Static File

I tried editing the static_file.sql script to remove the files (as below), but it only ever removed the new files that were created; re-running it never caused it to drop the old file copies.

...
declare
  l_name    varchar2(255);
begin
  l_name := 'myapp.css';
  wwv_flow_html_api.remove_html(
    p_html_name => l_name,
    p_flow_id   => nvl(wwv_flow.g_flow_id, 0) );
end;
/
...

Next thing I tried was something I picked up from here:

NOTE: run this at your own risk! It is not supported by Oracle.

declare
  v_workspace CONSTANT VARCHAR2(100) := 'MYWORKSPACE';
  v_workspace_id NUMBER;
begin
  -- *** WARNING: DO NOT RUN THIS UNLESS YOU KNOW WHAT YOU ARE DOING ***
  select workspace_id into v_workspace_id
  from apex_workspaces where workspace = v_workspace;
  apex_application_install.set_workspace_id (v_workspace_id);
  apex_util.set_security_group_id
    (p_security_group_id => apex_application_install.get_workspace_id);
  delete from wwv_flow_files where filename like 'myapp.%';
  commit;
end;
/

That seemed to do the trick. Thankfully this problem only occurred in a test environment – I would be uncomfortable running this in Prod.


“Smart quotes” showing as “?” in emails

When some of my users were using my system to send emails, they’d often copy-and-paste their messages from their favourite word processor, but when my system sent the emails they’d have question marks dotted around, e.g.

“Why doesn’t this work?”

would get changed to

?Why doesn?t this work??

Simple fix was to detect and replace those fancy-pants quote characters with the equivalent html entities, e.g.:

function enc_chars (m in varchar2) return varchar2 is
begin
  return replace(replace(replace(replace(m
    ,chr(14844060),'&#8220;')/*left double quote*/
    ,chr(14844061),'&#8221;')/*right double quote*/
    ,chr(96)      ,'&#8216;')/*left single quote*/
    ,chr(14844057),'&#8217;')/*right single quote*/
    ;
end enc_chars;
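
For example (assuming the function is compiled somewhere callable from your email code, and an AL32UTF8 database character set, since the chr() values above are UTF-8 byte sequences):

select enc_chars('“Why doesn’t this work?”') as msg from dual;

MSG = '&#8220;Why doesn&#8217;t this work?&#8221;'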

P.S. Stupid wordpress keeps mucking around with my code, trying to replace the html entities with the unencoded versions. In case this doesn’t work, here’s an image of what the above code is supposed to look like:
enc_chars


Help for your keyboard users

APEX’s Blue Responsive Theme 25 is a great theme for building a user-friendly website, and unlike many other themes which make the item labels clickable to get help, it renders little question-mark icons to the right of each item that has a help message defined.

theme-25-help-icon

One issue with this design, however, is that a keyboard-savvy user (such as myself) hates to switch between keyboard and mouse – so they Tab between each field as they fill in a form. With this theme, however, those little question-mark icons are in the tab order, so the user has to Tab twice between each field. If they’re not looking at the page they might forget; and if one item doesn’t happen to have a Help icon, they have to remember to only Tab once. All of this adds up to a poor user experience.

To fix this, we simply tell the browser to move the Help icons out of the tab order – and yet again, jQuery comes to the rescue as we can simply pick out all the elements with the class itemHelpButton and set their tabindex to “-1”:

$(".itemHelpButton").attr('tabindex',-1);

Put this code in the page attribute Execute when Page Loads – when the page loads, this will set the tabindex on all the help icons on the page, so now the user can tab through the page without interruption.