Using compound triggers to boost your journal table performance
If your schemas are like those I deal with, almost every table has a doppelgänger which serves as a journal table; an “after insert, update or delete” trigger copies each and every change into the journal table. It’s a bit of a drag on performance for large updates, isn’t it?
I was reading through the docs (as one does) and noticed this bit:
Scenario: You want to record every change to hr.employees.salary in a new table, employee_salaries. A single UPDATE statement will update many rows of the table hr.employees; therefore, bulk-inserting rows into employee.salaries is more efficient than inserting them individually.
Solution: Define a compound trigger on updates of the table hr.employees, as in Example 9-3. You do not need a BEFORE STATEMENT section to initialize idx or salaries, because they are state variables, which are initialized each time the trigger fires (even when the triggering statement is interrupted and restarted).
The example shows how to use a compound trigger to not only copy the records to another table, but to do so with a far more efficient bulk insert. Immediately my journal table triggers sprang to mind – would this approach give me a performance boost?
The answer is, yes.
My test cases are linked below – emp1 is a table with an ordinary set of triggers, which copies each insert/update/delete into its journal table (emp1$jn) individually for each row. emp2 is a table with a compound trigger instead, which does a bulk insert of 100 journal entries at a time.
I ran a simple test case involving 100,000 inserts and 100,000 updates, into both tables; the first time, I did emp1 first followed by emp2; in the second time, I reversed the order. From the results below you’ll see I got a consistent improvement, shaving about 4-7 seconds off of about 21 seconds, an improvement of 19% to 35%. This is with the default value of 100 for the bulk operation; tweaking this might wring a bit more speed out of it (at the cost of using more memory per session).
Of course, this performance benefit only occurs for multi-row operations; if your application is only doing single-row inserts, updates or deletes you won’t see any difference in performance. However, I still think this method is neater (only one trigger) than the alternative so would recommend. The only reason I wouldn’t use this method is if my target might potentially be a pre-11g database, which doesn’t support compound triggers.
Here are the test case scripts if you want to check it out for yourself:
Connected to: Oracle Database 11g Enterprise Edition Release 18.104.22.168.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options insert emp1 (test run #1) 100000 rows created. Elapsed: 00:00:21.19 update emp1 (test run #1) 100000 rows updated. Elapsed: 00:00:21.40 insert emp2 (test run #1) 100000 rows created. Elapsed: 00:00:16.01 update emp2 (test run #1) 100000 rows updated. Elapsed: 00:00:13.89 Rollback complete. insert emp2 (test run #2) 100000 rows created. Elapsed: 00:00:15.94 update emp2 (test run #2) 100000 rows updated. Elapsed: 00:00:16.60 insert emp1 (test run #2) 100000 rows created. Elapsed: 00:00:21.01 update emp1 (test run #2) 100000 rows updated. Elapsed: 00:00:20.48 Rollback complete.
And here, in all its glory, is the fabulous compound trigger:
CREATE OR REPLACE TRIGGER emp2$trg FOR INSERT OR UPDATE OR DELETE ON emp2 COMPOUND TRIGGER FLUSH_THRESHOLD CONSTANT SIMPLE_INTEGER := 100; TYPE jnl_t IS TABLE OF emp2$jn%ROWTYPE INDEX BY SIMPLE_INTEGER; jnls jnl_t; rec emp2$jn%ROWTYPE; blank emp2$jn%ROWTYPE; PROCEDURE flush_array (arr IN OUT jnl_t) IS BEGIN FORALL i IN 1..arr.COUNT INSERT INTO emp2$jn VALUES arr(i); arr.DELETE; END flush_array; BEFORE EACH ROW IS BEGIN IF INSERTING THEN IF :NEW.db_created_by IS NULL THEN :NEW.db_created_by := NVL(v('APP_USER'), USER); END IF; ELSIF UPDATING THEN :NEW.db_modified_on := SYSDATE; :NEW.db_modified_by := NVL(v('APP_USER'), USER); :NEW.version_id := :OLD.version_id + 1; END IF; END BEFORE EACH ROW; AFTER EACH ROW IS BEGIN rec := blank; IF INSERTING OR UPDATING THEN rec.id := :NEW.id; rec.name := :NEW.name; rec.db_created_on := :NEW.db_created_on; rec.db_created_by := :NEW.db_created_by; rec.db_modified_on := :NEW.db_modified_on; rec.db_modified_by := :NEW.db_modified_by; rec.version_id := :NEW.version_id; IF INSERTING THEN rec.jn_action := 'I'; ELSIF UPDATING THEN rec.jn_action := 'U'; END IF; ELSIF DELETING THEN rec.id := :OLD.id; rec.name := :OLD.name; rec.db_created_on := :OLD.db_created_on; rec.db_created_by := :OLD.db_created_by; rec.db_modified_on := :OLD.db_modified_on; rec.db_modified_by := :OLD.db_modified_by; rec.version_id := :OLD.version_id; rec.jn_action := 'D'; END IF; rec.jn_timestamp := SYSTIMESTAMP; jnls(NVL(jnls.LAST,0) + 1) := rec; IF jnls.COUNT >= FLUSH_THRESHOLD THEN flush_array(arr => jnls); END IF; END AFTER EACH ROW; AFTER STATEMENT IS BEGIN flush_array(arr => jnls); END AFTER STATEMENT; END emp2$trg;
Just to be clear: it’s not that it’s a compound trigger that impacts the performance; it’s the bulk insert. However, using the compound trigger made the bulk operation much simpler and neater to implement.
UPDATE 14/08/2014: I came across a bug in the trigger which caused it to not flush the array when doing a MERGE. I found I had to pass the array as a parameter internally.