Python: it’s been done before

I’ve learned that with Python, if something takes more than a few lines to write, there’s a very good chance that there’s another way to do it but with one or two lines of code. Chances are someone else has already come across a similar problem, and written a module to solve it – and it’s just a matter of importing their module and reusing it.

The other day I blogged about some functions that query a database and return a list or dictionary. Today I was browsing further in the python docs and learned a much simpler way of achieving the same effect, using named tuples:

>>> from collections import namedtuple

>>> def tableTuples(source, connection, name = None):
        curs = connection.cursor()
        curs.execute('SELECT * FROM (%s)' % (source))
        query_fields = [desc[0] for desc in curs.description]
        Row = namedtuple(name or source, query_fields)
        return [Row(*row) for row in curs]

>>> emp = tableTuples('EMP', orcl)

>>> emp[0]
EMP(EMPNO=7369, ENAME='SMITH', JOB='CLERK', MGR=7902,
HIREDATE=datetime.datetime(1980, 12, 17, 0, 0),
SAL=800.0, COMM=None, DEPTNO=20)

>>> emp[1].ENAME
'ALLEN'

One limitation is that the query must provide names for each column (easy enough with aliases). This limitation is surmountable in Python 2.7 and 3.1, wherein namedtuples support a rename attribute.


Empythoning Oracle

Quite a few months ago I read Wrapping Your Brain Around Oracle + Python and was somewhat interested but too busy to look into it in depth.

Recently I’ve had more free time than I’d like, and wanted to make the most of it by learning a new language. As they say, “Master a few [programming languages]”. I thought about a few languages, remembering several articles that cast Python in a good light. In the end it was a choice between C# on .Net, which seems to be quite popular around this area (at least if job ads are anything to go by), or Python. Python won.

I must say I am very happy with this choice. I love this little language! It has everything, plus much more, and then some. Object-oriented programming, functional programming, multi-threaded programming, list processing, you name it, it’s built in or just an “import x” away. I’ve learned a new concept – duck typing.

Monday – day one.
From www.python.org, downloaded and installed Python 2.6.2 for Windows, and cx_Oracle 5.0.2 since I had no intention of wasting time with Python unless it integrated well with Oracle. Followed the aforementioned article by Catherine Devlin line by line, and was happy to find that everything worked exactly as stated, saw a lot of promise in terms of integration with Oracle, and enjoyed it too.

Tuesday – day two.
Started going through the Python Tutorial. Every few minutes I’d come across some little feature or quirk that just delighted. Started raving about it to whoever happened to come within shouting distance of my cubicle.

Wednesday – day three.
Finished the tutorial. Started browsing through the Standard Library, and got sidetracked down a lot of alleys, trying different ideas out. Got a basic HTTP server running with a few lines of code. Heh.

Thursday – day four.
Decided I need to get a better handle on the various list data types and their syntax. Got a better handle. Also, accidentally flooded one of the CPUs and a whole lot of memory when I queried a table full of large BLOBs, and then multiplied it by some integer (100, from memory)… just to see what would happen.

Friday – day five.
A lot of playing around with Oracle integration. Wrote two functions, tableList and tableDict. I don’t know if these are useful or not, but I learned a lot by writing them.

tableList accepts a table/view name or SQL query and a connection to an Oracle database, and returns a list, each entry is a row, represented as a Python “dictionary”. e.g.

>>> import cx_Oracle
>>> import mystuff
>>> orcl = cx_Oracle.connect('scott/tiger@orcl')
>>> dept = mystuff.tableList('DEPT', orcl)
>>> for row in dept: print row
{'DNAME': 'ACCOUNTING', 'LOC': 'NEW YORK', 'DEPTNO': 10}
{'DNAME': 'RESEARCH', 'LOC': 'DALLAS', 'DEPTNO': 20}
{'DNAME': 'SALES', 'LOC': 'CHICAGO', 'DEPTNO': 30}
{'DNAME': 'OPERATIONS', 'LOC': 'BOSTON', 'DEPTNO': 40}
>>> print dept[0]['LOC']
NEW YORK

tableDict is similar to tableList except that it returns a dictionary instead of a list. It assumes that the first column is a unique identifier, and uses it as the index into the dictionary. Each entry is also a dictionary, each of which represents a row (as for tableList). e.g.:

>>> dept = mystuff.tableDict('DEPT', orcl)
>>> for key, row in dept.iteritems(): print key, '=', row
40 = {'DNAME': 'OPERATIONS', 'LOC': 'BOSTON', 'DEPTNO': 40}
10 = {'DNAME': 'ACCOUNTING', 'LOC': 'NEW YORK', 'DEPTNO': 10}
20 = {'DNAME': 'RESEARCH', 'LOC': 'DALLAS', 'DEPTNO': 20}
30 = {'DNAME': 'SALES', 'LOC': 'CHICAGO', 'DEPTNO': 30}
>>> print dept[20]
{'DNAME': 'RESEARCH', 'LOC': 'DALLAS', 'DEPTNO': 20}
>>> print dept[20]['LOC']
DALLAS

For the morbidly curious, here’s the source code [mystuff.py] (WARNING: this is example code for fun, it does not protect against SQL injection!).

Fun, isn’t it? What I find especially great about Python with cx_Oracle is that good habits like using bind variables and bulk binding are easy and intuitive.

Saturday – day six.
Got the monthly email from the missions team at church, with a list of all the “prayer requests” in a roughly consistent format that I need to upload to the church’s web site (some time ago I tried to give them a web-based data entry screen to enter them in, without success). I think, “hey, this is a perfect candidate for automation using Python.” – in the past I’ve formatted it into INSERT statements by hand, which takes about half an hour or so. A few hours later and I’ve written a short script (81 lines*) that does the job very nicely. That’ll save me a bit of time every month from now on, plus I learned even more about Python in the process. Could I have done this with PL/SQL? Yes, but the code would have been quite a bit more complex – which is why I’ve never bothered to automate it before.

* not including comments, doc strings and error-checking code

Sunday – day seven.
Day of rest.

Now, for some things that worry me a bit about this language:

  1. Class members are all public. My instinct is to want to hide all the details of my classes so that all callers are forced to only use my public interface. In Python it’s all to easy to bypass the “standard” API and access anything at all.
  2. Member attributes and methods can be added, modified or removed. Any time, anywhere. This is even worse than point #1, I think – I might write a class called Employee, which exposes a function called promote(). This function might implement some checks and balances, but they can just be overridden by the caller – by simply writing a new function, myBadPromote(), then simply Employee.promote = myBadPromote! Sure, a good dev team will have standards, but I’m just not sure what benefit this particular feature has. It has a “GOTO” smell about it.

At this early stage I’ll assume that I don’t understand it all well enough yet, I have a feeling that I’ll eventually learn these are not problems at all, but make prefect sense for this language. Maybe.

Next Steps: Try out Build a Rapid Web Development Environment for Python Server Pages and Oracle, and read How to Write a Spelling Corrector until I understand it perfectly.