This year my team was tasked with making some complex changes in our system. These changes needed to be made deep in our PL/SQL packages that run as a part of our nightly process. Since I am the one who has been around the longest, and I am also one of the few who actually know how our nightly process works, I got assigned this task.
Our requirements team documented the needed changes as best they could, and I questioned them mercilessly until they got all the details from the customer. Then I went off and coded up a number of changes to meet the specification as I understood it. Unfortunately, I made some incorrect assumptions about the values set by other parts of the system. As soon as this code hit production, the trouble tickets started coming in.
As this was my code, I jumped in, determined what was wrong, and coded up a PL/SQL fix. Our process requires us to conduct unit testing, so I doctored up some data to cover all the test cases. There was only one problem. The code I implemented is only supposed to run on Wednesdays, and today is a Thursday. No problem. I just modified the code to run every day while I did my unit tests.
Another part of our process is to document all changes so that they can be peer reviewed. Oftentimes this seems like unnecessary overhead. But today it saved the day. As I was generating the documentation, highlighting the portions that I changed, I noticed that the code was still written to run every day. Oops. I had never put the Wednesday check back after my unit tests. Luckily I caught it before peer review, and more importantly, before we shipped this code to production. If that had gone through, there would have been all kinds of fireworks. We do have other safeguards, such as independent testing. But I am not sure they would have caught this given the rush on the fix.
I can safely say that I was saved today by our standard development process. Maybe I also need to look into other ways to test "Wednesday only" functionality without hacking up the code. What do you think?
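One option, sketched here under some assumptions, is to stop reading the clock inside the logic and pass the run date in as a parameter that defaults to SYSDATE. The nightly caller stays unchanged, and a unit test can hand in any date it likes. Everything named below (nightly_demo, is_wednesday, weekly_step) is made up for illustration and is not the real package.

```sql
-- Hypothetical package: all names here are illustrative assumptions.
CREATE OR REPLACE PACKAGE nightly_demo AS
    -- TRUE when p_run_date falls on a Wednesday.
    FUNCTION is_wednesday (p_run_date IN DATE) RETURN BOOLEAN;

    -- p_run_date defaults to SYSDATE, so the nightly job can call
    -- weekly_step with no arguments, exactly as before.
    PROCEDURE weekly_step (p_run_date IN DATE DEFAULT SYSDATE);
END nightly_demo;
/

CREATE OR REPLACE PACKAGE BODY nightly_demo AS
    FUNCTION is_wednesday (p_run_date IN DATE) RETURN BOOLEAN IS
    BEGIN
        -- Pin the date language so the check does not depend on
        -- the session's NLS settings.
        RETURN TO_CHAR(p_run_date, 'DY', 'NLS_DATE_LANGUAGE=ENGLISH') = 'WED';
    END is_wednesday;

    PROCEDURE weekly_step (p_run_date IN DATE DEFAULT SYSDATE) IS
    BEGIN
        IF NOT is_wednesday(p_run_date) THEN
            RETURN;  -- not a Wednesday: do nothing
        END IF;

        -- Placeholder for the real Wednesday-only work.
        DBMS_OUTPUT.PUT_LINE('Weekly logic ran for '
            || TO_CHAR(p_run_date, 'YYYY-MM-DD'));
    END weekly_step;
END nightly_demo;
/

-- Unit test that can run on any day of the week.
BEGIN
    nightly_demo.weekly_step(DATE '2024-01-03');  -- a Wednesday: runs
    nightly_demo.weekly_step(DATE '2024-01-04');  -- a Thursday: skipped
END;
/
```

With server output enabled (SET SERVEROUTPUT ON in SQL*Plus), only the first call should print a line. The day-of-week check never has to be edited for testing, so there is nothing to forget to change back.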
Reproducing a Race Condition
-
We have a job at work that runs every Wednesday night. All of a sudden, it
has aborted two weeks in a row. This caused some critical data to be late. The
main ...