Why Kyte Tolerates Poor Questions

I have Tom Kyte’s “Ask Tom” in my feed reader, and every now and then it opens for questions and the flood pours in. Quite often there is an interesting question, but more often than not there are quite a lot of “poor” questions: some just poorly worded, some too-easily-googleable, others are could-have-answered-from-the-docs questions. It’s not dissimilar to StackOverflow, which doesn’t suffer so much because of the army of people who work together to either improve or close these types of questions. Sometimes I think “Why did you waste Tom’s time with that? You could have googled it, searched the docs, or opened SO and you’d probably find a good answer there without even having to ask it.”

However, my impression is wrong, I think, because my feed reader only shows me the initial question, and not the detailed follow-up by Tom and others. I was reminded today of how a poor question can still lead to enlightenment, by this excellent quote by Tom:

As things grow and change over time – engineers need to use different approaches. The planes I fly in have wings that go straight out from the plane. That is because I fly at about 500 mph. A fighter jet at mach 2 would be going 1,500 mph. The wings on a fighter jet are significantly different in their architecture – because if they weren’t it wouldn’t work. We have different types of bridges in order to cross different types of chasms. We have different building styles for different types of builds (stick and brick for a 2 story house, steel girders in a frame for 30 story buildings).

As you scale up, the solutions that worked for trivial amounts of data will not necessarily work for large amounts of data.

If you are in a place that says “make it work but do not change code” – expect to have a bad day.

(Source)

And, if you’re wondering how he keeps up with all this:

After 14 years on asktom and 20 years total (usenet news groups last century) answering tens of thousands of questions, I took a break from social media stuff.

(Source)

We could all do with a break from time to time, I think 🙂


Which packages might raise “ORA-04068 existing state of package has been discarded”?

PhotoSpinOffice Imagery© 2001 PhotoSpinwww.photospin.comI recently saw this question on StackOverflow (“Is there any way to determine if a package has state in Oracle?”) which caught my attention.

You’re probably already aware that when a package is recompiled, any sessions that were using that package won’t even notice the change; unless that package has “state” – i.e. if the package has one or more package-level variables or constants. The current value of these variables and constants is kept in the PGA for each session; but if you recompile the package (or modify something on which the package depends), Oracle cannot know for certain whether the new version of the package needs to reset the values of the variables or not, so it errs on the side of caution and discards them. The next time the session tries to access the package in any way, Oracle will raise ORA-04068, and reset the package state. After that, the session can try again and it will work fine.

Side Note: There are a number of approaches to solving the ORA-04068 problem, some of which are given as answers to this question here. Not all of them are appropriate for every situation. Another approach not mentioned there is to avoid or minimize it – move all the package variables to a separate package, which hopefully will be invalidated less often.

It’s quite straightforward to tell whether a given package has “state” and thus has the potential for causing ORA-04068: look for any variables or constants declared in the package specification or body. If you have a lot of packages, however, you might want to get a listing of all of them. To do this, you can use the new PL/Scope feature introduced in Oracle 11g.

select object_name AS package,
       type,
       name AS variable_name
from user_identifiers
where object_type IN ('PACKAGE','PACKAGE BODY')
and usage = 'DECLARATION'
and type in ('VARIABLE','CONSTANT')
and usage_context_id in (
  select usage_id
  from user_identifiers
  where type = 'PACKAGE'
);

If you have compiled the packages in the schema with PL/Scope on (i.e. alter session set plscope_settings='IDENTIFIERS:ALL';), this query will list all the packages and the variables that mean they will potentially have state.

Before this question was raised, I hadn’t used PL/Scope for real; it was quite pleasing to see how easy it was to use to answer this particular question. This also illustrates a good reason why I like to hang out on Stackoverflow – it’s a great way to learn something new every day.



Difference Between Two Timestamps

Easy, right? Perhaps not as straightforward as you’d think.

Method #1: use EXTRACT

extract(day from (x-y))*24*60*60
+ extract(hour from (x-y))*60*60
+ extract(minute from (x-y))*60
+ extract(second from (x-y))

Method #2: use CAST

( CAST( x AS DATE ) - CAST( y AS DATE ) ) * 86400

The difference? Method #2 is faster (my tests indicate faster by a factor of 3), but does not include fractional seconds. Method #1 is a bit slower, but includes fractions of a second. See the SO link for details.


Tag wikis on StackOverflow

StackOverflow now allows the creation of wiki articles and tag synonyms.

I’ve gone ahead and started a few articles about a few topics dear to my heart:

These articles are not intended to be replacements for the documentation or Wikipedia, but primarily as a guide for people choosing tags when asking questions.

If you click on the “oracle” tag you’ll see a large number of “related tags”, some of which could probably benefit from additional articles.


A fascinating SQL problem

Can you solve this problem in SQL – i.e. without resorting to a procedural solution?

SQL combine multiple identifiers to create a group id for duplicate records

“I have a data extract with three different identifiers: A, B, C
Each identifier may appear in more than one row, and each row may have one or more of these three identifiers (i.e the column is populated or null).
I want to be able to group all records that have any combination of either A, B or C in common and assign them the same group id.
Extract table showing what the eventual groups should be:

A    | B    | C    | Group
====   ====   ====   =====
p      NULL   NULL   1
p      r      NULL   1
q      NULL   NULL   2
NULL   r      NULL   1
NULL   NULL   s      2
q      NULL   s      2

So, the input data is a table with three columns (A, B, and C), some of which are NULL. The output is a third column, “Group”, which will be assigned a number which classifies the row into a “group”. Each group will be distinct in that none of its members will have a value in A, B or C that appears in any row in any other group.

This question is fascinating because it cannot be solved, I believe, without some form of iteration. If I get the row (p) along with (q), they are in two different groups; but, if I add the row (p,q), all of a sudden my original rows are now in the same group along with the new row.

The solution will probably have to examine each row in consideration with the entire record set – an operation of O(n^2), if my understanding of CS theory is correct. I suspect a solution using at least a CTE and/or the MODEL clause will be required.

UPDATE:
An elegant solution, using a hierarchical query and Oracle’s CONNECT_BY_ROOT function, has been posted by Vincent Malgrat.


If at first you don’t succeed… it’s impossible.

How many times have you tried something, got either an error or unexpected results, and decided what you were trying to do was not possible? Have you later on discovered someone quietly doing the impossible?

I think this phenomenon is a form of the “correlation-implies-causation” fallacy.

Unfortunately, this seems to happen too often, if the kind of questions I see quite often are any guide. A recent example is: “Why cannot I select from more than one table in Oracle?”. Here, the author seems to have followed the following thought process:

  1. “SELECT * FROM table1” returns some rows.
  2. “SELECT * FROM table1, table2” returns no rows.
  3. Therefore, you can’t query more than one table in one SQL statement in Oracle.

In this case, the writer had not realised that table2 had no rows in it; what complicated things somewhat was that in one session, the second query was returning rows – because he’d inserted some rows into table2 in that session but hadn’t issued a COMMIT, so those rows were not visible by other sessions.

For a person inexperienced in SQL or Oracle, this sort of mistake is forgivable; but I suspect we all make this sort of mistake quite often. I know I have!

When trying something new, it takes diligent research and testing to determine whether one’s approach is simply wrong, or if unrelated factors (e.g. getting the syntax wrong, or the environment is not set up correctly) are causing failure. This gets more tiresome and frustrating (a “gumption trap”, in Persig‘s parlance) when one was halfway through solving some other problem, and this unexpected problem gets in the way.

Sometimes you just have to go to bed and see if it becomes clearer the next day. If the problem persists, ask a question on StackOverflow!

P.S. if a Google search reveals “doing X is impossible”, ask “Why?”


Unit Test != Requirements

Are one or more usage examples enough to specify the requirements for something? For example:

rtrim('123000', '0'); would return '123'

No, as can be seen here: Oracle 8, SQL: RTRIM for string manipulation is not working as expected (Stackoverflow)

When I read that question I thought of TDD (Test Driven Development), something I think I should be doing more of. As said here, however, “Are tests sufficient documentation?  Very likely not, but they do form an important part of it.”

I’ve seen unit test cases used as a form of documentation. Generally they could be useful for this – to tell part of the story – but if they only consist of “enter this, expect that”, they will never be good enough to replace requirements documentation.

Footnote: How about the source code – is that sufficient as documentation? In one sense, yes – the source code is the best documentation of what the system does now. What’s lacking, however, is documentation of the business requirements – and this gap can be huge (see e.g. Agile Development and Requirements Management).



OracleOverflow?

I’ve enjoyed reading and participating in StackOverflow for over a year now.

With the introduction of Area51, there is now the possibility of starting a new Q&A site for all things Oracle – as pointed out by Rob Van Wijk and Gary Myers. Sure there are other fora such as OTN, but they don’t have the kinds of features that make SO fun and self-moderating. SO is more like a cross between a Q&A forum and a wiki, with the addition of a democratic system of reputation points that allow good questions and answers to bubble up to the top.

If you agree, please head over to Area 51 – Oracle Databases and Follow it to voice your agreement.