What’s the biggest clue you can give that your database is vulnerable to SQL injection? When your list of “forbidden words” looks suspiciously like a sample of SQL / PL/SQL keywords:
I notice that they haven’t forbidden BEGIN, CREATE, MERGE, or TRUNCATE …
Congressman Peters, your IT staff are doing it wrong.
The final day at Burswood was just as enjoyable as day one. Well done to all the AUSOUG committee!
I started with two APEX talks – first, Scott Wesley on APEX 4.1 Security. Personally I very much enjoyed the unique presentation style. You can experience it for yourself here. After that, Mark Lancaster from Queensland gave his analysis of the changes from APEX 4.0 to 4.1, and commented on a number of new features that have been added or improved.
Just before lunch I caught “Tips and Best Practices for DBAs” by Francisco Munoz Alvarez, who spoke less about actual DBA tasks (as I was expecting) but more about the “soft” skills – attitude, professionalism, working in a team, delegating tasks, and automating everything.
After lunch Vinod Patel moderated a discussion panel comprising Debra Lilley, Tim Hall, Connor McDonald, Penny Cookson, Chris Muir, and a guy from Oracle (whose name escapes me for the moment) – and they were plied with many questions about the art of presenting. It was encouraging to hear what they had to say, both about their success and their failure stories. I think I got away with taking this photo without them noticing 🙂
I took in Graham Wood‘s final presentation, a live demo of Exadata. He demonstrated how blazingly fast it is for loading huge amounts of data in a very short time (e.g. 500GB in about 10 minutes IIRC) and running horrible queries even faster (e.g. multiple full table scans with self joins, running in mere seconds). It was very impressive, although it did highlight that to get the full benefit of Exadata, some queries may need to be rewritten. For example, a big report you’re running now might get a modest x10 or x20 speed improvement on Exadata, but after rewriting you could get on the order of x100 to x200 speed improvements! If you don’t believe me, go ask Graham yourself 🙂
The day ended with Connor McDonald‘s talk, A Year in Purgatory – Diary of an 11.2 RAC Upgrade. It held a lot of promise, but unfortunately I was called away to an emergency at work so I missed most of it. I was quite disappointed to miss that one. By the way, Connor is now blogging – at connormcdonald.wordpress.com. Finally!
I’ve enjoyed each AUSOUG conference since 2000, and this year was one of the best in my opinion. It was great to catch up with colleagues, network with Oracle nerds and professionals, and get inspired by a variety of talks on topics I’m interested in.
In addition, the last few years I’ve also presented. This has been a good experience which I intend to continue. I hope that with practice I’ll get much better at it.
I had a most enjoyable* day today at the Oracle conference at Burswood, Perth.
* This is significant in light of the fact that the start of the day was quite hectic, having had to drop the car off to get a new alternator (the electrics died on the way home from work yesterday, so this morning I called RAC out to get the car started and charge the battery up a bit. I managed to drive the car within 100m of Ultratune, then it died completely. So I left the car in the traffic and walked the rest of the way, borrowed some kind of block box that was like a CPR machine for cars.), thereafter going with Rosalie to drop the eldest at her school, so she could then gift me a lift to Burswood.
I got to the conference in time to hear Graham Wood from Oracle talk about DB Time-based Oracle Performance Tuning, which was excellent, then Tim Hall on Edition-Based Redefinition which led to some excellent discussions with both Tim and colleagues from my current client.
My talk on a simple ETL method using just SQL seemed to go well and there were some good questions, comments and suggestions. If you missed something it was probably because I talk too fast – if you want to look something up from the talk you can look at the slides here, or have a play with a full SQL*Plus demo [ETL-by-SQL-sqlplus-demo.zip]. I also blogged about this technique back in February.
I finished the day with Graham Wood’s second talk, on Oracle Hidden Features – which indeed included info on some Oracle features that I’d either not yet come across, or had forgotten about. Doug Burns blogged about this presentation earlier. The bit about external tables, especially the new 11g feature for running the source file through a preprocessor, was particularly interesting.
I’m looking forward to the second day tomorrow.
Disclaimer: if you’re looking for an Amazon-basher, you’ll have to look elsewhere.
I had two web sites running on a server at Amazon’s Virginian data center when it suffered the recent issues. The first I heard of it was when one of my primary users called on 20 April to let me know that the web site was down – it was consistently returning 500 Internal Server errors. I wasn’t able to connect to the web site, to Apex, to the database, or even by SSH. Next I checked the AWS Management Console, which reported “Instance connectivity, latency and error rates” for the US-East region – which happens to be the region where our site was running.
Immediately, I calmly went into panic mode. First thing I’d thought I’d try is to reboot the instance. That took a while, but then it wouldn’t come back up again. Uh oh. Next thing I thought I’d do is try to detach the volume from the instance, re-attach it to a new instance, and hope I could bring it back up. AWS, however, seemed unable to detach the volume (probably because it needed to wait for outstanding writes to the disk to be written) – it was stuck in “detaching” mode.
There is an option to “force detach” a volume, but it comes with warnings that it might leave the volume in a corrupted state (since it might not have every change written). I decided to leave it alone for now.
Meanwhile, the web site is down, and my first priority is to get something up and running – so this is a chance to demonstrate that my restore procedure works. Yes, I’ve tested my restore process several times, so I was confident it would work, assuming the Amazon infrastructure is working ok.
The last backup I’d taken was a week ago – the morning of 16 April – stored as a snapshot in the US-East region. Generally, the data center was working reasonably well – it was mainly existing EBS instances that were suffering. Everything was extremely slow, and I got frequent errors. For example, I’d select my snapshot, go “create volume”, and it would report “sorry, an error stopped the volume being created”. So I’d try again, in various availability zones, same error every time. I’d wait for a while, then go back in and find half a dozen volumes in various states of Pending or Available status in various availability zones – it seems it was able to create the volumes after all.
So I delete all but one of them, then start up a new instance. By this time, Amazon have started putting more info on their status page, and it seems the problems are now limited to one availability zone – probably the one our site was in. Now, attach the backup volume to the new instance, and follow my restore procedure to get the site up and running. It took a bit longer than usual (hours instead of minutes), but finally it was all up and running. Switch the Elastic IP over to the backup instance, test the web site – all looks good, except (of course) that all data inserts/changes since 16 April have been lost. So I flip a few switches to stop emails and SMSs being sent (I don’t want people being confused by out-of-date notices), and set the Global Notification string on some of the key applications to let people know what’s going on.
Next step is to see if I can access the stuck volume. It took about two days, but finally the volume was detached, and my instance was terminated. So I start up another instance, get it set up, then attach the volume to it. It immediately was set to “attaching” status, and I couldn’t mount it – and it turned out to be another 24 hours (give or take 12 hours) before it was attached.
In the meantime, since I was just waiting, I thought it would be a good time to finally do what I’d been meaning to for a while – move the site to the Singapore data center, and take a long-overdue offline backup.
The purpose of my first goal was basically to see if being closer physically translates to lower latency to here in Perth. Also, I wanted to work out how best to move data between Amazon regions, so that I’d get even better redundancy. It was good that I had been taking weekly backups, but they were all stored in one data center on the US East coast, and I felt it would be a good idea to move a monthly backup set to a different region in case they irretrievably lost an entire data center.
Amazon don’t provide any option to just move a snapshot, volume, instance or anything between regions. It’s easy to move between availability zones within a region (by taking a snapshot, then creating a new volume from the snapshot targetting a different zone), but you can’t move from one region to another. Therefore, the only option I could find to do this was to bring up a new instance in the Singapore data center, create a volume with matching size (15GB), then transfer the data within the instance using dd and scp. This I did – it took more than 6-7 hours (I’m not sure, because I got tired of waiting and went to bed) – and I had to run fsck on the volume after loading the data, because some of the directories seemed to have lots of inaccessible files showing “? ? ? ?” in the directory listing. After that, however, the site was up and running and my quick smoke test seemed to show up no issues. So I allocated an Elastic IP to the instance, then changed my DNS settings to point to it.
If the entire Amazon EC2 ecosystem were to be blasted off the map in some cataclysmic fireball, I doubt many people would care that my two little sites were gone. More likely, of course, is that I might suffer reduced access to one or more Amazon regions, and if I cannot access them I cannot restore from backup – unless I have one in my care. So I downloaded the dd’ed snapshot. That took about 3 hours to download the 7GB file.
Meanwhile, back on the ranch in Virginia, today (23 April) the old volume was marked as “attached” to my new instance, and I was able to log in to it and mount the volume. I ran fsck on it but it reported no issues. Got the database started up without any issues, and immediately took a complete export. scp’ed the export across to the new instance in Singapore. Imported the data to a new schema. There were only a few tables that would have had critical data created or updated in the past week, so I just compared them – it looked like no-one had made any changes to anything in the replacement instance, so I just truncated them and re-populated them from the data taken from Virginia.
All done – a bit of stress, lots of patience required, but in the end it was relatively simple to get things up and running again. In hindsight, it would have worked out even simpler and easier to just leave the stuck instance alone and wait – I suspect, eventually, it would have come up again no problems. I could have just brought up the backup instance while I waited, maybe set it to read-only. Oh well, lesson learned.
There’s a whole lot of FUD and anger directed at Amazon and cloud services in general, but to my mind, this would have been a darn sight more difficult (or even impossible) if we’d been self-hosting. For example, I didn’t have to leave home to do any of this. I haven’t had to worry about hardware failures or network connectivity at all. If the service goes down once a year like this, it’s not the end of the world. As long as I have a backup to go back to, all is not lost. Amazon never promised 100% uptime or even 100% data retention. So far, for me personally it’s been about 99.6% (over the past two years), and 100% data retention.
And that’s a whole lot better than I could do by myself.