Category: SQL

# SQL Problem Solved: Merge History Tables

Yesterday I published a SQL problem (that I I had solved – or at least, thought I had solved!) and invited readers to submit their own solutions.

SPOILER ALERT: if you haven’t had a chance to have a go yourself, go back, don’t read the comments (yet), but see if you can solve it yourself.

I was very pleased to see two excellent submissions that were both correct as far as I could tell, and are (in my opinion) sufficiently simple, elegant and efficient. Note that this was not a competition and I will not be ranking or rating or anything like that. If you need to solve a similar problem, I would suggest you try all of these solutions out and pick the one that seems to work the best for you.

Solution #1

Stew Ashton supplied the first solution which was very simple and elegant:

```with dates as (
select distinct deptid, dte from (
select deptid, from_date, to_date+1 to_date from t1
union all
select deptid, from_date, to_date+1 to_date from t2
)
unpivot(dte for col in(from_date, to_date))
)
select * from (
select distinct a.deptid, dte from_date,
lead(dte) over(partition by a.deptid order by dte) - 1 to_date,
data1, data2 from dates a
left join t1 b on a.deptid = b.deptid and a.dte between b.from_date and b.to_date
left join t2 c on a.deptid = c.deptid and a.dte between c.from_date and c.to_date
)
where data1 is not null or data2 is not null
order by 1,2;
```

SQL Fiddle

Stew’s solution does a quick pass over the two tables and generates a “date dimension” table in memory; it then does left joins to each table to generate each row. It may have a disadvantage due to the requirement to make two passes over the tables, which might be mitigated by suitable indexing. As pointed out by Matthias Rogel, this solution can be easily generalised to more than two tables, which is a plus.

Solution #2

The second solution was provided by Kim Berg Hansen which was a bit longer but in my opinion, still simple and elegant:

```with v1 as (
select t.deptid
, case r.l
when 1 then t.from_date
when 2 then date '0001-01-01'
when 3 then t.max_to + 1
end from_date
, case r.l
when 1 then t.to_date
when 2 then t.min_from -1
when 3 then date '9999-01-01'
end to_date
, case r.l
when 1 then t.data1
end data1
from (
select deptid
, from_date
, to_date
, data1
, row_number() over (partition by deptid order by from_date) rn
, min(from_date) over (partition by deptid) min_from
, max(to_date) over (partition by deptid) max_to
from t1
) t, lateral(
select level l
from dual
connect by level <= case t.rn when 1 then 3 else 1 end
) r
), v2 as (
select t.deptid
, case r.l
when 1 then t.from_date
when 2 then date '0001-01-01'
when 3 then t.max_to + 1
end from_date
, case r.l
when 1 then t.to_date
when 2 then t.min_from -1
when 3 then date '9999-01-01'
end to_date
, case r.l
when 1 then t.data2
end data2
from (
select deptid
, from_date
, to_date
, data2
, row_number() over (partition by deptid order by from_date) rn
, min(from_date) over (partition by deptid) min_from
, max(to_date) over (partition by deptid) max_to
from t2
) t, lateral(
select level l
from dual
connect by level <= case t.rn when 1 then 3 else 1 end
) r
)
select v1.deptid
, greatest(v1.from_date, v2.from_date) from_date
, least(v1.to_date, v2.to_date) to_date
, v1.data1
, v2.data2
from v1
join v2
on v2.deptid = v1.deptid
and v2.to_date >= v1.from_date
and v2.from_date <= v1.to_date
where v2.data2 is not null or v1.data1 is not null
order by v1.deptid
, greatest(v1.from_date, v2.from_date)
```

Since it takes advantage of 12c’s new LATERAL feature, I wasn’t able to test it on my system, but I was able to test it on Oracle’s new Live SQL and it passed with flying colours. I backported it to 11g and it worked just fine on my system as well. An advantage of Kim’s solution is that it should only make one pass over each of the two tables.

Oracle Live SQL: merge-history-kimberghansen (this is the 12c version – Oracle login required)

SQL Fiddle (this is my attempt to backport to 11g – you can judge for yourself if I was faithful to the original)

Solution #3

When my colleague came to me with this problem, I started by creating a test suite including as many test cases as we could think of. Developing and testing the solution against these test cases proved invaluable and allowed a number of ideas to be tested and thrown away. Big thanks go to Kim Berg Hansen for providing an additional test case, which revealed a small bug in my original solution.

```with q1 as (
select nvl(t1.deptid, t2.deptid) as deptid
,greatest(nvl(t1.from_date, t2.from_date), nvl(t2.from_date, t1.from_date)) as gt_from_date
,least(nvl(t1.to_date, t2.to_date), nvl(t2.to_date, t1.to_date))            as lt_to_date
,t1.from_date AS t1_from_date
,t1.to_date   AS t1_to_date
,t2.from_date AS t2_from_date
,t2.to_date   AS t2_to_date
,min(greatest(t1.from_date, t2.from_date)) over (partition by t1.deptid) - 1 as fr_to_date
,max(least(t1.to_date, t2.to_date)) over (partition by t2.deptid)        + 1 as lr_from_date
,t1.data1
,t2.data2
,row_number() over (partition by t1.deptid, t2.deptid order by t1.from_date) as rn_fr1
,row_number() over (partition by t1.deptid, t2.deptid order by t2.from_date) as rn_fr2
,row_number() over (partition by t1.deptid, t2.deptid order by t1.to_date desc) as rn_lr1
,row_number() over (partition by t1.deptid, t2.deptid order by t2.to_date desc) as rn_lr2
from   t1
full outer join t2
on     t1.deptid = t2.deptid
and    (t1.from_date between t2.from_date and t2.to_date
or t1.to_date between t2.from_date and t2.to_date
or t2.from_date between t1.from_date and t1.to_date
or t2.to_date between t1.from_date and t1.to_date)
order by 1, 2
)
select deptid
,gt_from_date AS from_date
,lt_to_date   AS to_date
,data1
,data2
from   q1
union all
select deptid
,t1_from_date as from_date
,fr_to_date AS to_date
,data1
,'' AS data2
from   q1
where  fr_to_date between t1_from_date and t1_to_date
and    rn_fr1 = 1
union all
select deptid
,t2_from_date as from_date
,fr_to_date AS to_date
,'' AS data1
,data2
from   q1
where  fr_to_date between t2_from_date and t2_to_date
and    rn_fr2 = 1
union all
select deptid
,lr_from_date as from_date
,t2_to_date AS to_date
,'' AS data1
,data2
from   q1
where  lr_from_date between t2_from_date and t2_to_date
and    rn_lr2 = 1
union all
select deptid
,lr_from_date as from_date
,t1_to_date AS to_date
,data1
,'' AS data2
from   q1
where  lr_from_date between t1_from_date and t1_to_date
and    rn_lr1 = 1
order by deptid, from_date
```

SQL Fiddle

This solution is probably the ugliest of the three, and though it only requires one pass over the two tables, it materializes all the data in a temp table so probably requires a fair amount of memory for large data sets.

I’m so glad I posted this problem – I received help in the form of two quite different solutions, which help to show different ways of approaching problems like this. Not only that, but it highlighted the value of test-driven development – instead of groping in the dark to come up with a quick-and-dirty solution (that might or might not work), starting with comprehensive test cases is invaluable for both developing and validating potential solutions.

# SQL Problem: Merge record history tables

A colleague asked me if I could help solve a problem in SQL, and I was grateful because I love solving problems in SQL! So we sat down, built a test suite, and worked together to solve it.

I will first present the problem and invite you to have a go at solving it for yourself; tomorrow I will post the solution we came up with. I think our solution is reasonably simple and elegant, but I’m certain there are other solutions as well, some of which may be more elegant or efficient.

I have two “record history” tables, from different data sources, that record the history of changes to two sets of columns describing the same entity. Here are the example tables:

```T1 (deptid, from_date, to_date, data1)
T2 (deptid, from_date, to_date, data2)
```

So T1 records the changes in the column “data1” over time for each deptid; and T2 records the changes in the column “data2” over time for each deptid.

Assumptions/Constraints on T1 and T2:

• All columns are not Null.
• All dates are just dates (no time values).
• All date ranges are valid (i.e. from_date <- to_date).
• Within each table, all date ranges for a particular deptid are gap-free and have no overlaps.

We want to produce a query that returns the following structure:

`T3 (deptid, from_date, to_date, data1, data2)`

Where the dates for a particular deptid overlap between T1 and T2, we want to see the corresponding data1 and data2 values. Where the dates do not overlap (e.g. where we have a value for data1 in T1 but not in T2), we want to see the corresponding data1 but NULL for data2; and vice versa.

Below is the setup and test cases you can use if you wish. Alternatively, you can use the SQL Fiddle here.

```create table t1
(deptid number not null
,from_date date not null
,to_date date not null
,data1 varchar2(30) not null
,primary key (deptid, from_date)
);

create table t2
(deptid number not null
,from_date date not null
,to_date date not null
,data2 varchar2(30) not null
,primary key (deptid, from_date)
);

--Test Case 1 (4001)
--T1  |-------------||-------------|
--T2         |-------------||-------------||-------------|

insert into t1 values (4001, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t1 values (4001, date'2015-02-01', date'2015-02-28', 'feb data 1');
insert into t2 values (4001, date'2015-01-15', date'2015-02-14', 'jan-feb data 2');
insert into t2 values (4001, date'2015-02-15', date'2015-03-14', 'feb-mar data 2');
insert into t2 values (4001, date'2015-03-15', date'2015-04-14', 'mar-apr data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4001    01-JAN-2015  14-JAN-2015  jan data 1
--4001    15-JAN-2015  31-JAN-2015  jan data 1      jan-feb data 2
--4001    01-FEB-2015  14-FEB-2015  feb data 1      jan-feb data 2
--4001    15-FEB-2015  28-FEB-2015  feb data 1      feb-mar data 2
--4001    01-MAR-2015  14-MAR-2015                  feb-mar data 2
--4001    15-MAR-2015  14-APR-2015                  mar-apr data 2

--Test Case 2 (4002)
--T1         |-------------||-------------||-------------|
--T2  |-------------||-------------|

insert into t1 values (4002, date'2015-01-15', date'2015-02-14', 'jan-feb data 1');
insert into t1 values (4002, date'2015-02-15', date'2015-03-14', 'feb-mar data 1');
insert into t1 values (4002, date'2015-03-15', date'2015-04-14', 'mar-apr data 1');
insert into t2 values (4002, date'2015-01-01', date'2015-01-31', 'jan data 2');
insert into t2 values (4002, date'2015-02-01', date'2015-02-28', 'feb data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4002    01-JAN-2015  14-JAN-2015                  jan data 2
--4002    15-JAN-2015  31-JAN-2015  jan-feb data 1  jan data 2
--4002    01-FEB-2015  14-FEB-2015  jan-feb data 1  feb data 2
--4002    15-FEB-2015  28-FEB-2015  feb-mar data 1  feb data 2
--4002    01-MAR-2015  14-MAR-2015  feb-mar data 1
--4002    15-MAR-2015  14-APR-2015  mar-apr data 1

--Test Case 3 (4003)
--T1  |-------------|
--T2  |-------------|

insert into t1 values (4003, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t2 values (4003, date'2015-01-01', date'2015-01-31', 'jan data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4003    01-JAN-2015  31-JAN-2015  jan data 1      jan data 2

--Test Case 4 (4004)
--T1  |-------------|
--T2      |-----|

insert into t1 values (4004, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t2 values (4004, date'2015-01-10', date'2015-01-20', 'jan data int 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4004    01-JAN-2015  09-JAN-2015  jan data 1
--4004    10-JAN-2015  20-JAN-2015  jan data 1      jan data int 2
--4004    21-JAN-2015  31-JAN-2015  jan data 1

--Test Case 5 (4005)
--T1      |-----|
--T2  |-------------|

insert into t1 values (4005, date'2015-01-10', date'2015-01-20', 'jan data int 1');
insert into t2 values (4005, date'2015-01-01', date'2015-01-31', 'jan data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4005    01-JAN-2015  09-JAN-2015                  jan data 2
--4005    10-JAN-2015  20-JAN-2015  jan data int 1  jan data 2
--4005    21-JAN-2015  31-JAN-2015                  jan data 2

--Test Case 6 (4006)
--T1  ||------------|
--T2  |-------------|

insert into t1 values (4006, date'2015-01-01', date'2015-01-01', 'jan data a 1');
insert into t1 values (4006, date'2015-01-02', date'2015-01-31', 'jan data b 1');
insert into t2 values (4006, date'2015-01-01', date'2015-01-31', 'jan data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4006    01-JAN-2015  01-JAN-2015  jan data a 1    jan data 2
--4006    02-JAN-2015  31-JAN-2015  jan data b 1    jan data 2

--Test Case 7 (4007)
--T1  |-------------|
--T2  ||------------|

insert into t1 values (4007, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t2 values (4007, date'2015-01-01', date'2015-01-01', 'jan data a 2');
insert into t2 values (4007, date'2015-01-02', date'2015-01-31', 'jan data b 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4007    01-JAN-2015  01-JAN-2015  jan data 1      jan data a 2
--4007    02-JAN-2015  31-JAN-2015  jan data 1      jan data b 2

--Test Case 8 (4008)
--T1  |-------------|
--T2    |---||---||---|

insert into t1 values (4008, date'2015-01-01', date'2015-01-31', 'jan data 1');
insert into t2 values (4008, date'2015-01-09', date'2015-01-17', 'jan data a 2');
insert into t2 values (4008, date'2015-01-18', date'2015-01-26', 'jan data b 2');
insert into t2 values (4008, date'2015-01-27', date'2015-02-04', 'jan-feb data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4008    01-JAN-2015  08-JAN-2015  jan data 1
--4008    09-JAN-2015  17-JAN-2015  jan data 1      jan data a 2
--4008    18-JAN-2015  26-JAN-2015  jan data 1      jan data b 2
--4008    27-JAN-2015  31-JAN-2015  jan data 1      jan-feb data 2
--4008    01-FEB-2015  04-FEB-2015                  jan-feb data 2

--Test Case 9 (4009)
--T1    |---||---||---|
--T2  |-------------|

insert into t1 values (4009, date'2015-01-09', date'2015-01-17', 'jan data a 1');
insert into t1 values (4009, date'2015-01-18', date'2015-01-26', 'jan data b 1');
insert into t1 values (4009, date'2015-01-27', date'2015-02-04', 'jan-feb data 1');
insert into t2 values (4009, date'2015-01-01', date'2015-01-31', 'jan data 2');

--Expected Output:
--DEPTID  FROM_DATE    TO_DATE      DATA1           DATA2
--======  ===========  ===========  ==============  ==============
--4009    01-JAN-2015  08-JAN-2015                  jan data 2
--4009    09-JAN-2015  17-JAN-2015  jan data a 1    jan data 2
--4009    18-JAN-2015  26-JAN-2015  jan data b 1    jan data 2
--4009    27-JAN-2015  31-JAN-2015  jan-feb data 1  jan data 2
--4009    01-FEB-2015  04-FEB-2015  jan-feb data 1
```

EDIT 15/10/2015: added Kim Berg Hansen’s test cases

# Trivia question: Why does the 2nd Drop Work?

I have an Oracle database (10g or 11g, doesn’t matter) with an empty schema. I am the only user on the system and I only have one session open.

I create a single table (FOO) and run a few simple SQL commands. Then, I run this, twice in a row without any intermediate steps:

`DROP TABLE foo;`

The first time, some errors are raised and the table is not dropped.

The second time, it succeeds – the table is dropped without error.

Q. What did I do to cause this behaviour (which is 100% reproducible) – and what errors were reported?

A. Congratulations to Kirill Leontiev, Matthias Rogel, Charles Hooper and Mette who all found the answer!

# Workaround for ORA-04063: view “x” has errors

When you deploy a procedure, function or package that has a compilation error, the object is still created, and you can still apply grants on them. This is convenient when deploying a large number of objects, meaning you don’t have to get them all in the right order. After deploying your schema, you can just recompile the invalid objects.

Unfortunately, this doesn’t work for views. Now, normally if you create a view with compilation errors, the view will not be created at all; for a deployment script, however, you could use CREATE FORCE VIEW that means the view will be created (but marked invalid).

Let’s say you have a view that depends on a table that doesn’t exist yet – and won’t exist until much later in your deployment scripts. So you create the view with the FORCE option – success. Then, you apply the GRANTs for the view, and get this:

```ORA-04063: view "x" has errors
```

Why? If you try to grant on a procedure, function or package that has errors, it works fine. For views, apparently, this is not allowed.

Obviously, to solve this you might do the hard work and reorder your deployment scripts so that they create every object in the perfectly correct order, avoiding compilation errors entirely. If you have a large number of objects to deploy, this might be more trouble than you want. Well, there is a workaround:

1. Create the view, minus the bit that causes a compilation error.

2. Apply the grant.

3. Recreate the view, compilation error and all. The grant will remain.

Why might this be useful? In my case, we have two databases connected by database links (on both sides); and we need to deploy a large number of objects to both instances. They are managed by different teams, so we want to be able to deploy the changes to each independently. For the most part, the objects on “our side” compile fine, except for some views that refer to objects on the other side of the database link; but they won’t exist until the other team deploys their changes. We could even have a chicken-and-egg problem, when their views refer to objects on our instance; either way, some of the objects cannot be created error-free until both deployments have been completed.

As it stands, we have two options: deploy everything as best we can, then afterwards (when both deployments have completed), recompile the invalid objects and apply the view grants. An alternative is to use this workaround.

Demonstration

TEST CASE #1: cannot grant on a view with errors

```SQL> create or replace force view testview as
select 1 as col from bla;

Warning: View created with compilation errors.

SQL> grant select on testview to someone;

ORA-04063: view "USER.TESTVIEW" has errors

SQL> select grantee, privilege from user_tab_privs
where table_name = 'TESTVIEW';

no rows selected
```

TEST CASE #2: grant on a view with errors

```SQL> create or replace view testview as
select 1 as col from dual;

View created.

SQL> grant select on testview to someone;

Grant succeeded.

SQL> create or replace force view testview as
select 1 as col from bla;

Warning: View created with compilation errors.

SQL> select grantee, privilege from user_tab_privs
where table_name = 'TESTVIEW';

GRANTEE PRIVILEGE
======= =========
SOMEONE SELECT
```

The above has been tested on Oracle 10gR2 and 11gR2. Should this mean that Oracle should not really raise ORA-04063 in this case? I think so.

# Top 10 Confusing Things in Oracle

Every mature language, platform or system has little quirks, eccentricities, and anachronisms that afficionados just accept as “that’s the way it is” and that look weird, or outlandishly strange to newbies and outsiders. The more mature, and more widely used is the product, the more resistance to change there will be – causing friction that helps to ensure these misfeatures survive.

Oracle, due to the priority placed on backwards compatibility, and its wide adoption, is not immune to this phenomenon. Unless a feature is actively causing things to break, as long as there are a significant number of sites using it, it’s not going to change. In some cases, the feature might be replaced and the original deprecated and eventually removed; but for core features such as SQL and PL/SQL syntax, especially the semantics of the basic data types, it is highly unlikely these will ever change.

So here I’d like to list what I believe are the things in Oracle that most frequently confuse people. These are not necessarily intrinsically complicated – just merely unintuitive, especially to a child of the 90’s or 00’s who was not around when these things were first implemented, when the idea of “best practice” had barely been invented; or to someone more experienced in other technologies like SQL Server or Java. These are things I see questions about over and over again – both online and in real life. Oh, and before I get flamed – another disclaimer: some of these are not unique to Oracle – some of them are more to do with the SQL standard; some of them are caused by a lack of understanding of the relational model of data.

Once you know them, they’re easy – you come to understand the reasons (often historical) behind them; eventually, the knowledge becomes so ingrained, it’s difficult to remember what it was like beforehand.

# Top 10 Confusing Things in Oracle

10. ## Every User is/has a Schema, every Schema is/has a User

Got something to add to the list? Drop me a note below.

More resources: