There were some high-priority customer problems in the morning, and I got called into some meetings. The customer had an idea to shut off a type of transaction until we figured out what was wrong. I said that would work. Managers and executives from the customer organization spent a lot of time deliberating over what to do.
At the end of the day, my boss asked me to join another conference call. They wanted the transactions halted. My boss thought we could just put some trigger on the table containing the transaction requests. I was tasked with figuring this out and shipping it immediately.
This seemed easy enough. I wrote an after insert trigger for each row. In the trigger I moved the inserted row to another table. Then I deleted the row from the table on which the trigger fired. Boom. You can't do that. You get an error that the table is mutating, and the trigger cannot operate on that data.
I looked around. The net had some ideas: store the data from the row-level trigger in some structure, then have a statement-level trigger gather that data and operate on it. Shoot. I did not have time for this nonsense. But I know a bit about how this data gets inserted. Every row is inserted by its own statement. Therefore there is a one-to-one correspondence between the row-level and statement-level triggers. Bamm.
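For reference, here is a rough sketch of what the collect-then-process idea from the net can look like, assuming Oracle 11g or later (the mutating-table error suggests Oracle) and using a compound trigger as the "structure". The table and column names (txn_request, txn_request_hold, request_id, txn_type, payload), the trigger name, and the 'HALTED_TYPE' value are all made up for illustration; the real ones are not shown here.

```sql
-- Hypothetical names throughout; a sketch, not the production script.
CREATE OR REPLACE TRIGGER txn_request_hold_cmp
FOR INSERT ON txn_request
COMPOUND TRIGGER
  -- Collection that lives for the duration of one triggering statement.
  TYPE id_list IS TABLE OF txn_request.request_id%TYPE;
  g_ids id_list := id_list();

  AFTER EACH ROW IS
  BEGIN
    -- Row level: only remember the key. Touching txn_request here
    -- would raise the mutating-table error.
    IF :NEW.txn_type = 'HALTED_TYPE' THEN
      g_ids.EXTEND;
      g_ids(g_ids.COUNT) := :NEW.request_id;
    END IF;
  END AFTER EACH ROW;

  AFTER STATEMENT IS
  BEGIN
    -- Statement level: the table is no longer mutating, so the
    -- remembered rows can be copied out and then deleted.
    FOR i IN 1 .. g_ids.COUNT LOOP
      INSERT INTO txn_request_hold (request_id, txn_type, payload)
        SELECT request_id, txn_type, payload
          FROM txn_request
         WHERE request_id = g_ids(i);
      DELETE FROM txn_request WHERE request_id = g_ids(i);
    END LOOP;
  END AFTER STATEMENT;
END txn_request_hold_cmp;
/
```

This handles multi-row inserts correctly, which is the extra machinery the one-row-per-statement shortcut avoids.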
I moved everything into the statement-level trigger and was off to the races. I polished up my scripts and sent them over to test. The test team took a long time to do their tests. Our database team built a database release. It got sent out to production in the middle of the night. I was up until 3am to make sure things went smoothly. They did. This morning the customer was relieved to find the transactions halted.
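A minimal sketch of a statement-level trigger along these lines, assuming Oracle. A statement-level trigger has no :NEW row to look at, so this version assumes the halted rows can be picked out by a transaction type column. That filter, the table and column names (txn_request, txn_request_hold, request_id, txn_type, payload), the trigger name, and the 'HALTED_TYPE' value are illustrative guesses, not the actual script.

```sql
-- Hypothetical names throughout; a sketch under the assumptions above.
-- No FOR EACH ROW clause, so this fires once per INSERT statement,
-- after the statement has finished changing the table.
CREATE OR REPLACE TRIGGER txn_request_hold_stmt
AFTER INSERT ON txn_request
BEGIN
  -- Copy the halted requests to the holding table...
  INSERT INTO txn_request_hold (request_id, txn_type, payload)
    SELECT request_id, txn_type, payload
      FROM txn_request
     WHERE txn_type = 'HALTED_TYPE';

  -- ...then remove them from the live table. The same DELETE inside a
  -- row-level trigger is what raises ORA-04091 (table is mutating).
  DELETE FROM txn_request
   WHERE txn_type = 'HALTED_TYPE';
END;
/
```

Because each insert arrives in its own statement, the trigger only ever has one new row to sweep up per firing.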
Now we need to investigate the root cause of the problem. Somebody else got assigned that task. As for me, I worked only about half a day today. I was up all night. No need to put in any more hours for now.
Reproducing a Race Condition
-
We have a job at work that runs every Wednesday night. All of a sudden, it has
aborted two weeks in a row. This caused some critical data to be late. The
main ...