Skip to content

A fascinating SQL problem

July 29, 2010

Can you solve this problem in SQL – i.e. without resorting to a procedural solution?

SQL combine multiple identifiers to create a group id for duplicate records

“I have a data extract with three different identifiers: A, B, C
Each identifier may appear in more than one row, and each row may have one or more of these three identifiers (i.e the column is populated or null).
I want to be able to group all records that have any combination of either A, B or C in common and assign them the same group id.
Extract table showing what the eventual groups should be:

A    | B    | C    | Group
====   ====   ====   =====
p      NULL   NULL   1
p      r      NULL   1
q      NULL   NULL   2
NULL   r      NULL   1
NULL   NULL   s      2
q      NULL   s      2

So, the input data is a table with three columns (A, B, and C), some of which are NULL. The output is a third column, “Group”, which will be assigned a number which classifies the row into a “group”. Each group will be distinct in that none of its members will have a value in A, B or C that appears in any row in any other group.

This question is fascinating because it cannot be solved, I believe, without some form of iteration. If I get the row (p) along with (q), they are in two different groups; but, if I add the row (p,q), all of a sudden my original rows are now in the same group along with the new row.

The solution will probably have to examine each row in consideration with the entire record set – an operation of O(n^2), if my understanding of CS theory is correct. I suspect a solution using at least a CTE and/or the MODEL clause will be required.

UPDATE:
An elegant solution, using a hierarchical query and Oracle’s CONNECT_BY_ROOT function, has been posted by Vincent Malgrat.

About these ads

From → SQL

Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 200 other followers

%d bloggers like this: