Oracle Development: July 2008

Select From Dual

I am designing some changes to the application suite my project supports. There are some requirements that need me to add new tables to the database schema. So I requested the DBAs to add them. Our project has some Oracle consultants which review the database change requests. One of my new tables was grouped in with some other needs into a global table. The consultants had a lot of ideas on this new table and its access. That sounded good to me. One consultant wrote a PL/SQL package that managed the table, including things such as auditing.

Before I go on vacation, I wanted to make sure the development staff had what they needed to implement the code changes. So I tested out the PL/SQL package written by the consultant. I logged into SQL*Plus. There were a bunch of functions exported by the PL/SQL package. I executed a SELECT FROM DUAL for one of the getter functions. It worked fine. Then I tried to SELECT FROM DUAL one of the functions that modified the underlying database table. The function returned an error to me. So I pushed it back to the consultant because I thought maybe he had a bug.

The consultant came back to me with an answer. The function was trapping an ORA-14551: cannot perform a DML operation inside a query. Of course that made sense. The SELECT FROM DUAL cannot change a value in the database. Why didn’t I think about that. The way to test out this stuff is to write some PL/SQL code, and call the functions from there. That is how the developers are going to have to implement this. The consultant also cautioned me against testing this directly from SELECTs in SQL*Plus because the functions are autonomous transactions. Apparently things get screwy when you call autonomous transactions like that.

The bottom line is that you cannot issue a SELECT FROM DUAL on a function that modified the database. There is a rule in Oracle that prohibits this. I have actually run into this before. However since I did not write the implementation for this package, I did not see what was going on under the hood. Developers beware.

Recycle Bin

Next year we are going to Oracle 10g on the back end. Currently production is using an Oracle 9i database. Our development database has been upgraded to 10g so we can be ready for next year. Developers have curiously found a bunch of tables in our schemas with names like “BIN$TktvMTcFRA3gRAADunAMgQ==$0”. However we are unable to drop these tables. A little research revealed that this is a new feature for Oracle 10g called the Recycle Bin.

I want to give a shout out to Natalka Roshah from Oracle FAQ. Her article explained the mysteries of the Recycle Bin to me. Apparently given the right 10g option being set, Oracle shall not actually drop a table when you issue the DROP command. Instead it renames the table so you can restore it easily in the future. A little research shows that the table does disappear from the USER_TABLES view. However it still exists in the USER_OBJECTS views with a type of “TABLE”.

Our project actually wants to truly drop the table when we issue a drop. So we will have our DBA team disable the Recycle Bin option. However for now we have resorted to issuing the following command to drop a specific Recycle Bin table:

PURGE TABLE “BIN$TktvMTcFRA3gRAADunAMgQ==$0”;

In my case, I wanted to get rid of all Recycle Bin tables in my schema. I had amassed 313 of them (which were taking up space). So I got rid of them all with the following command:

PURGE RECYCLEBIN;

The real funny thing about this is that some experienced developers who were not familiar with this Oracle 10g feature were sure these were not real tables. This included dudes who used to work for Oracle. I do not know about anybody else. But I think it is time for my company to send me to Oracle 10g training. Don’t you?

Lost Connection

Our testing team reported that all application would error out after 30 minutes of non-use. They got an error message stating the connection had been lost. Initially I picked up this problem and researched it. Most people immediately discounted this as a database server issue. However I had some evidence that this might not be so. This year we moved from the Oracle 8 client to the Oracle 10g client. Thread based connections seemed to behave differently in 10g. So I found some instances where you needed to do some explicit connection management between threads using database contexts. If not, I recall getting a lot of connection lost error messages.

I was able to duplicate the problem easy enough. Just log in and wait 30 minutes without doing anything. Sure enough I made the problem occur. However after reviewing the code involved with just logging in, I determined this was not a thread coding issue. I hit Google search to see if I could find any knowledge on the subject. Most of the web sites I visited recommended that you modify the timeout parameter on the server’s sqlnet.ora file. Armed with this information, I forwarded the problem to the DBA Team.

The next time I saw the lead DBA, I asked him about our timeout problem. He said that he encountered the problem as well. It was happening even in SQL*Plus. I told him I read a lot of articles about the timeout value in sqlnet.ora. However the DBA insisted that the timeout was not set on our server. Instead he believed this to be a network issue. So he forwarded the problem to the network administration team. This did not immediately resolve the problem. But it was due to the problem not receiving the priority it required. A couple calls later from the DBA, and it seems the problems have been resolved.

A lot of my experience with this problem is that you cannot just trust your instinct with any given problem. Nor can you entirely defer to the knowledge gained by Oracle. You need to dig in to the specifics of the problem and methodically identify the true cause. This is not to say that you need to discount any hunch you have. However you need to identify the hunches as theories, and test out those theories. Agent Moulder said it best when he recommend you trust nobody.

Report Server Removal

Recently I had some problems with Oracle reports working on my machine. So I decided to uninstall and reinstall Oracle Reports Server 6i. There is no option to remove the Reports Server from Windows Add/Remove Programs. So I ran the Oracle Installer that came with Oracle Forms and Reports. It is nice that Oracle provides an installer that can uninstall its software. However I found this program a bit confusing and ineffective. The installer lists all installed software (that the installer can remove). So I need select the software I want removed. I selected all the software, assuming that this would cover all of the Report Server. This is where my problems began.

The Oracle installer produced a number of error message on my machine. Some messages seemed clear enough. For example, a service was running that needed to be manually shut down before Oracle could proceed with the removal. Too bad the error message did not tell me how to shut down the service. I tried going to Windows Services. That did not seem to resolve the issue. Then there were other cryptic error messages. They seemed to imply that the selected software was either uninstalled of in a bad state. That does not help me much.

When I went as far as I could, I was hoping the Report Server was gone. No such luck. There was still a directory where the Report Server was installed. A lot of the files and subdirectories had been removed. But many more remained. I tried to delete the whole directory with Windows Explorer. This did not work. Some program or service was locking the executables in the bin directory. I resorted to Google for help. Turns out you can rename the binaries in the bin directory. But you just cannot delete them. I wanted to reboot Windows in safe mode, then delete the pesky files. However I was in a hurry so I skipped it.

Another place where I saw remnants of the Report Server after uninstalling was the Windows registry. The data is stored in HKLM\SOFTWARE\Oracle. I continued to see an Oracle home entry for the Report Server. Well at least I have full control over the registry with the Windows Regedit program. The only danger is that you need to know what you are doing or else you could make matter worse by hacking the registry. I just deleted the Oracle Home registry entry (mine was HOME1). I also decided to remove ORACLE_HOMES\ORACLE1. And I figured that I should decrement the HOME_COUNTER in ALL_HOMES. I really do not recommend this level of registry hacking. But I was desperate.

Finally I got to reinstalling the Reports Server. At first I brought the software up to patch 3b which mimics ours Production configuration. But I later found I needed patch 18 for everything to work correctly. What a nightmare. I sincerely hope the never version of Report Server work better during removal.

Count or Continue

At work we have a large system coded mainly in the C++ programming language. The back end is an Oracle 10g database. Users have numerous requirements for configurable reports. We implement these using Oracle Reports. Frequently the application will create temporary tables in the user’s schema. The table names are passed to the report. This allows long running reports to essentially work on snapshots of the data. It also simplifies some of the data gathering performed in the actual reports.

Each time a new reported is selected, the application shall typically delete the prior temporary tables. The application then recreates the temporary tables based on user selections and current data in the database. This usually works fine since the same users normally run the same reports over and over. The only tricky part is that sometimes a user will choose a report for the first time. And in that scenario, there is no existing temporary table to delete.

My instinct for this scenario is to first detect whether the table exists or not. This can be a query from the USER_TABLES view. Once I find that the count is not zero, the code executes SQL to drop the old table. I was surprised to find another technique used in some code I recently debugged. It always attempted to drop the old table even if it did not exist. It preceded this code with a EXEC SQL WHENEVER SQLERROR DO CONTINUE.

Logically the ignore errors method works just as well as the count with optional drop of table. I started to wonder which way was better though. One would think the ignore errors method was better for the normal case. Only one SQL statement was issued. I would imagine this to run faster. But maybe there is heavy penalty for trying to drop a table that does not exist. In the end I left the code alone to ignore errors and always try to drop the table which might not even exist.

This whole discovery got started by a tester finding a scenario where the DROP failed and then a following CREATE TABLE failing as well. But that story is a subject for a future post.

Password Policy

Our system has stringent requirements for database passwords. One of the many rules is that old passwords cannot be reused for a very long time. This includes both a long number of days, and a large number of different passwords in between reuses. Developers and testers frequently have a lot of different accounts to conduct testing. The password policy makes it difficult to keep track of all the different passwords that have to keep changing.

Recently a tester asked a DBA to reset her passwords back to the ones she normally uses. The DBA attempted to do this, but was halted by the implementation of the security standards. It was too soon to reuse the old password. So the DBA chose a new password for the tester. Later I spoke with the DBA about this problem. Both he and I knew that it was the database profile that tied the user to the password verify function which prevented quick password reuse. I said we could just temporarily switch the user to a different database profile. Then the password rules would not be enforced. Oracle would not recheck the password when the original database profile was restored. This seemed like the ultimate plan to skirt the database password rules. In fact I coded up a script to do this for all my database accounts.

The DBA I spoke with knew about the profile swap trick. But he was hesitant to do this for the testers. He likes to do things by the book. The trick would be violating the security policy. I told him he was right. However we are just talking about a test environment. In reality, it is impossible to keep track of all the changing password for the multitude of accounts that we require. So people write down their passwords. The security rules get violated anyway. But at least the system was set up to discourage illegal password reuse.

I think the DBA has the right attitude. If you are always serious about security, you will most likely be safe always as well. It is a slippery slope to skirt security rules sometimes. There are times when the urgency of the moment requires some rules to be bent. But this decision should not be taken lightly. However I have not thrown out my script which resets my passwords by first switching up the profiles. In that way I am able to remember all my well known passwords and not write them down. That is my compromise.

Oracle Development