SQL Review and Native Compilation

I just started reading a new web site on PL/SQL programming. It is actually hosted by Oracle. Today I read a review of the different ways to execute SQL in PL/SQL. They are embedded SQL, Native Dynamic SQL, and the DBMS_SQL API.

Embedded SQL is easy. You just code up the SQL like normal. There are some caveats like specifying the variable into which you select some data. But it is very natural. Here is a hint. Make sure you define aliases for the table you select from. Then use these aliases to qualify the columns you select.
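
For example, here is a minimal sketch using a hypothetical employees table. Note the alias on the table and the INTO clause naming the target variable:

DECLARE
  v_name employees.last_name%TYPE;
BEGIN
  SELECT e.last_name
    INTO v_name
    FROM employees e
   WHERE e.employee_id = 100;
END;
/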

Native Dynamic SQL is when you run the commands with an EXECUTE IMMEDIATE. You should exercise care to protect against SQL injection. I have written about this before. This style is faster than DBMS_SQL. Note that DBMS_SQL is the old way of executing dynamic SQL. It is not entirely obsolete. If you do not know the number or types of the columns you are selecting until run time, it is the only option. But most times you can skip it.
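
A minimal sketch of the Native Dynamic SQL form, again against a hypothetical employees table:

DECLARE
  v_count PLS_INTEGER;
BEGIN
  EXECUTE IMMEDIATE
    'SELECT COUNT(*) FROM employees WHERE department_id = :dept'
    INTO v_count
    USING 20;
END;
/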

Now let’s talk briefly about native compilation. PL/SQL is normally interpreted. It gets compiled down to M-code, which is executed by the PL/SQL Virtual Machine (PVM). However for speed purposes you can go native. In the Oracle 9i and 10g days, native compilation would transform your PL/SQL to code in the C programming language. This C code would be compiled by a C compiler on the operating system. The result was a library that could run the code very fast on the machine.

Things change up a little bit for native compilation in Oracle 11g. But let’s leave that for a future blog post.

Pro*C Mystery Revealed

Our customer said there were too many problems with our system. And development was not resolving problems fast enough. The customer shared their concerns with management. Management put the heat on team leads. Now my team lead wants more and more bugs fixed faster. I had to push back and say problems will be resolved when they get resolved. However I did attempt to be diligent in problem resolution. There was one last problem to resolve. The fix sounded simple when I determined the cause. But I was encountering some strange Pro*C behavior. Here is the code I was trying to use:

void getType(char *pszType)
{
EXEC SQL BEGIN DECLARE SECTION;
char *pszLocalType = pszType;
EXEC SQL END DECLARE SECTION;

EXEC SQL
SELECT type
INTO :pszLocalType
FROM myTable;
}

No matter how many times I ran the code, I always ended up with an empty string. However I got an actual value when I ran the SQL manually from SQL*Plus. What gives? Can you spot the problem? It took a while for me to realize what I was doing wrong. The variable into which I was trying to retrieve the data was defined as a pointer to the memory passed into the function. I really needed to have some space defined within the DECLARE section to store the SQL results. Then I could copy the value to the calling function. Here is a slightly updated result that worked:

void getType(char *pszType)
{
EXEC SQL BEGIN DECLARE SECTION;
char szLocalType[10+1];
EXEC SQL END DECLARE SECTION;

EXEC SQL
SELECT type
INTO :szLocalType
FROM myTable;

strcpy(pszType, szLocalType);  /* copy the fetched value back to the caller's buffer */
}

SQL Injection

SQL injection is an attack whereby a user tricks your code into executing commands of their choosing. You can prevent such behavior by using bind variables for arguments to SQL. You should also validate all inputs.
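
Here is a minimal sketch of the difference, using made up names (p_name, v_salary, and an employees table):

-- Vulnerable: user input is concatenated straight into the statement
EXECUTE IMMEDIATE
  'SELECT salary FROM employees WHERE last_name = ''' || p_name || ''''
  INTO v_salary;

-- Safer: the input is passed as a bind variable
EXECUTE IMMEDIATE
  'SELECT salary FROM employees WHERE last_name = :name'
  INTO v_salary
  USING p_name;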

There are some other techniques which limit the ability for a SQL injection attack. Limit the privs you give to any given user. Also use invoker rights via AUTHID CURRENT_USER to ensure that commands that do get through run with the caller's privileges rather than the schema owner's.

Limit the number of routines that are exposed in a package specification. This narrows the attack surface that a SQL injection hack can penetrate. Stay away from dynamic SQL where you can. Get rid of debug messages and tracing. You don't want a hacker to know anything about the system you have developed.

You can employ fuzzing to test for SQL injection vulnerabilities. This will not prove you are secure, but can uncover some weaknesses. Test each parameter to your functions separately.

Curse of the Trigger

A tester found a problem in our application. I traced it back to some test configuration data in the database. That seemed like an easy fix. I wrote a one-time PL/SQL script to delete the test data. I had our DBA team implement the fix in the tester's database.

Luckily our DBA did a little quality assurance himself. He inspected the contents of my script. Then he checked whether the test records got deleted. They did not. It seems a trigger fired and prevented the delete. Doh. I should have checked that.

I like writing PL/SQL code. So I was not too upset that promoting my fix turned out to be a failure. That just meant I could write more code. I researched the logic of the trigger. Then I added some more code to detect child records of the test data. I wrote code to move those records to depend on another foreign key. Bingo. I even stayed late at work knocking out the updated fix.

Tracing and Profiling

You can employ an Oracle trace to view the execution path of your PL/SQL. This can be enabled either by a session parameter or a compile switch within your code. The data is accessible through some Oracle views. You must run the "tracetab.sql" file in order to set up these views.
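
Here is a minimal sketch of turning tracing on and off with the DBMS_TRACE built in package (this assumes tracetab.sql has already been run):

BEGIN
  DBMS_TRACE.SET_PLSQL_TRACE(DBMS_TRACE.trace_all_calls);  -- start tracing calls
END;
/
-- run the PL/SQL you want to trace, then:
BEGIN
  DBMS_TRACE.CLEAR_PLSQL_TRACE;  -- stop tracing
END;
/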

A related topic is that of profiling. This is where you find out which portions of your code get executed the most. You can use this to choose which code you need to optimize. Profiling gets turned on and off with the DBMS_HPROF built in package.
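
A minimal sketch of a profiling run, assuming a directory object named PROF_DIR already exists:

BEGIN
  DBMS_HPROF.START_PROFILING(location => 'PROF_DIR',
                             filename => 'run1.trc');
END;
/
-- run the code you want to profile, then:
BEGIN
  DBMS_HPROF.STOP_PROFILING;
END;
/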

Just like with trace, you need to run the "dbmshprof.sql" script to set up the tables for profiling. There is even an option to view the profiling output in HTML form. Next time I am going to talk about the hot topic of SQL injection attacks. Until then, be good.

Use the Source Luke

PL/SQL starts out as ASCII text files. Then you compile that source code into the database. Once it is there you can inspect what has been compiled. The old school way to do this is to inspect the USER_SOURCE view. Specifically you can query the TEXT column. It has your source code.
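
For example, here is how you could pull the source of a hypothetical package body back out:

SELECT text
  FROM user_source
 WHERE name = 'MY_PACKAGE'
   AND type = 'PACKAGE BODY'
 ORDER BY line;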

There are more advanced ways to check up on the code in the database. The SQL Developer tool has a function to search PL/SQL source code. You can also check out the ALL_ARGUMENTS view. It has parameter information for packages.

One setting that helps control your ability to analyze source code is PLSCOPE_SETTINGS. This can be set at either the session or system level. It allows you to analyze identifiers. You access this data using views such as ALL_IDENTIFIERS. Finally there is the DBMS_METADATA built in package. You can use it to get object definitions. The output is in XML format. Enjoy.
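
Here is a quick sketch of both, with made up object names:

ALTER SESSION SET plscope_settings = 'IDENTIFIERS:ALL';
-- recompile your code, then query the collected identifiers
SELECT name, type, usage, line
  FROM all_identifiers
 WHERE object_name = 'MY_PACKAGE';

-- pull an object definition with DBMS_METADATA
SELECT DBMS_METADATA.GET_DDL('TABLE', 'MY_TABLE') FROM dual;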

Result Cache

The result cache is a place in memory where Oracle stores results. Subsequent queries can then obtain these results quickly. This cache encompasses both SQL queries and PL/SQL functions. Note that the cache can become invalid when the source data of the queries changes.

The SQL query result cache is stored in the SGA. The results can be from just fragments of a query. You use the hint /*+ result_cache */ to advise Oracle that you would like to make use of the cache. Check out the DBMS_RESULT_CACHE built-in package to clear or inspect the contents of the cache.

The PL/SQL function result cache stores data across sessions. It also is stored in the SGA. When defining a function, use the RELIES_ON clause to identify tables whose changes may invalidate the results of the cache.
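
A minimal sketch of a result cached function, assuming a hypothetical rates table:

CREATE OR REPLACE FUNCTION get_rate(p_code VARCHAR2)
  RETURN NUMBER
  RESULT_CACHE RELIES_ON (rates)
IS
  v_rate NUMBER;
BEGIN
  SELECT rate INTO v_rate FROM rates WHERE code = p_code;
  RETURN v_rate;
END;
/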

Next time I will start by explaining how you can extract your PL/SQL source code from the database.

Keeping It Simple

My boss told me to help another team out on our project. They are behind schedule. And just recently, a review of their code showed they were missing some important stuff. Their team was clueless about this. Since I knew what was going on, I decided to help.

First I helped the requirements team figure out what changes needed to be made. Then I went into design mode. The existing code is in a PL/SQL package that implements a ton of business logic. We needed to store parameters for the logic in the database instead of hard coding a lot of the values.

How does one get this done? This package gets called a lot. In other words, it needs to be fast. I can't have the code look up the values each time it needs them. I could be fancy and have the package load the values on initialization. But I want to make this as simple as possible.

Therefore I created a new record type that stores all the lookup values. Then I created a new procedure which loads the lookup values from the database. Then I pass this record around to the different functions that need the values. This should be simple enough for any developer to follow. I just want to code the stuff. I don't want to inherit another team's code. Good enough for now. The real fun comes when I need to thoroughly test this stuff.
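
Here is a rough sketch of the shape of the solution. All the names, including the one row app_params table, are made up:

CREATE OR REPLACE PACKAGE lookup_pkg IS
  TYPE lookup_rec IS RECORD (
    max_retries   PLS_INTEGER,
    batch_size    PLS_INTEGER,
    default_state VARCHAR2(2)
  );
  PROCEDURE load_lookups(p_rec OUT lookup_rec);
END lookup_pkg;
/
CREATE OR REPLACE PACKAGE BODY lookup_pkg IS
  PROCEDURE load_lookups(p_rec OUT lookup_rec) IS
  BEGIN
    -- one round trip; the caller hands the record to whoever needs the values
    SELECT max_retries, batch_size, default_state
      INTO p_rec
      FROM app_params;
  END load_lookups;
END lookup_pkg;
/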

Bulk Binding and other Tuning Tricks

As promised, here is a discussion on bulk binding and other misc performance tuning tips. The goal of bulk binding is to reduce the number of times Oracle has to do a context switch between the PL/SQL and SQL engines.

There are two directions that bulk binding operates in. The first is when you want to put the data from a collection into the database. To do this you employ the FORALL statement. It will make the whole collection of elements be transferred to the database with one SQL context switch.

The converse bulk binding technique is when you have data in the database you want to extract into a collection. This is where you use BULK COLLECT INTO a collection. Again the PL/SQL engine only has to make one switch to the SQL engine.
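
Here is a minimal sketch of both directions, assuming the usual employees sample table:

DECLARE
  TYPE id_tab IS TABLE OF employees.employee_id%TYPE;
  v_ids id_tab;
BEGIN
  -- one context switch to pull all the rows into the collection
  SELECT employee_id BULK COLLECT INTO v_ids FROM employees;

  -- one context switch to push all the updates back
  FORALL i IN 1 .. v_ids.COUNT
    UPDATE employees
       SET salary = salary * 1.1
     WHERE employee_id = v_ids(i);
END;
/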

Here are some other ways to make PL/SQL run fast. Take advantage of the short circuit evaluation of expressions. If you know one part of an IF statement is sufficient for the whole decision, and this part occurs often, put that as the first check in your IF statement. Oracle will skip the rest of the checks for you. Another feature you can enable is inline subroutines. Instead of having your code jump to a separate subroutine, Oracle can place the full code of the subroutine right in your function/procedure. This can be controlled globally with the PLSQL_OPTIMIZE_LEVEL parameter, or done a line at a time with PRAGMA INLINE.
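
A minimal sketch of the pragma form (inlining also requires PLSQL_OPTIMIZE_LEVEL of 2 or higher):

DECLARE
  total NUMBER := 100;
  FUNCTION add_bonus(p NUMBER) RETURN NUMBER IS
  BEGIN
    RETURN p + 10;
  END;
BEGIN
  PRAGMA INLINE (add_bonus, 'YES');  -- request inlining of the call below
  total := add_bonus(total);
END;
/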

Tuning Tips

Here are some good tips to make your PL/SQL fast. Do not rely on implicit data type conversions. Also choose the smallest data type that will fit your data. The PLS_INTEGER data type is good. The SIMPLE_INTEGER type is better still, since it skips the null checks and overflow checks. Smaller types are better.

Writing less code per module yields a small gain. The big boost will come from using a SQL statement to do the work instead of a whole procedure worth of PL/SQL. I struggle with this. Our data sets are huge. I am sometimes afraid to do something huge with one SQL statement. The thing may never complete. Or it may bomb in the middle and get rolled back. At least my PL/SQL routines can commit often and avoid redoing work on errors.

I thought I could fit all of this into one post. But I have more to say. Look for me to cover bulk binding next time, along with some other misc tips.

Interpreted and Native Compilation

There are two types of PL/SQL compilation. The first is interpreted compilation, which is the default. The code gets compiled to M-code, which is interpreted at run time. It is good for development.

The other type of compilation is native compilation. It has better performance. Therefore it is better suited for a production setting. Compiled code is stored in the SYSTEM schema. You can enable this by setting PLSQL_CODE_TYPE to NATIVE on a session or system level.
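
A minimal sketch of switching a session over and recompiling one unit (my_proc is a made up name):

ALTER SESSION SET plsql_code_type = NATIVE;
ALTER PROCEDURE my_proc COMPILE;

-- confirm how a unit was compiled
SELECT name, plsql_code_type
  FROM user_plsql_object_settings
 WHERE name = 'MY_PROC';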

Next time I will review some PL/SQL tuning techniques.

LOBs

The SecureFile LOB is a new data type for the Oracle 11g database. It supports compression of the data. And it sports better performance. You get one when you define a LOB column with the SECUREFILE keyword.

Even with compression, LOBs may take up a lot of space. That is why Oracle has a feature called deduplication. This will detect when there is more than one copy of a LOB in the database. It will then cause these to reference just one copy of the LOB. You enable this feature by modifying the column using the DEDUPLICATE LOB keywords.

Oracle now provides encryption which is done at the block level. You get to choose your encryption algorithm such as 3DES or AES. Note that you can specify the key length of the encryption (128, 192, or 256 bits). You enable encryption on a column by modifying it and using the ENCRYPT USING keywords.
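
Here is a sketch of a table that uses all three features. The names are made up, and encryption assumes a wallet is already configured:

CREATE TABLE documents (
  doc_id NUMBER,
  body   CLOB
)
LOB (body) STORE AS SECUREFILE (
  COMPRESS
  DEDUPLICATE
  ENCRYPT USING 'AES192'
);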

There is a promotion strategy to get old large objects to use the new Oracle 11g type. This option keeps the table with the old types available while the promotion is taking place. The promotion is accomplished with calls to the DBMS_REDEFINITION built in package.

ORA-01403 Even With NVL

It was the night before the big release to our customer. Our internal test team was still finding some bugs in our software. Ouch that’s painful. One such error being reported was an ORA-01403 when they brought up a new screen. This was perplexing. I thought every single column in the SELECT statement was wrapped with the NVL function. And I knew there was a record in the database that we were selecting from. What the heck was going on?

Well the best way to debug weird bugs like this is to try things out. I knew there were some NULL column values in the record the testers were selecting. So I updated some test data and filled in some values. The problem went away. Was I losing my mind? Or perhaps the NVL was broken. I did not really write this code from scratch. I copy and pasted it from another application. It should have worked.

That’s when it hit me. The guilty column was defined as evilCol CHAR(1). However the column being selected was written as NVL(evilCol, 'NA'). Do you see the problem? The default for NULL values was two characters, too large to fit in the column given its definition. Oracle won’t replace a NULL with something that is too large to fit in the source column. Thus I got a SQL warning that we had a NULL issue.

Here is the real shame about this bug. I had previously fixed a bad NVL default for another column being selected in this SQL statement. However I did not go through every column to check whether there were other bad defaults. Shame on me. Luckily I went through all the columns and their NVL defaults this time. The hard lesson was learned. Be careful of the size (and type) of defaults you return using the NVL function. They might not match the column type you are selecting from.

The Microsoft Connection

I used to be an exclusive Microsoft developer. That meant I normally interacted with Microsoft SQL Server. Yes I know. It is sacrilege. Just recently I subscribed to the MSDN magazine to read up on the latest Microsoft technologies. There were a lot of articles that referenced programming APIs to access databases. I thought I would share some of them here to mix things up.

The first topic is ADO.NET. We use it in some of the newer applications in our system. This is included with the .NET framework from Microsoft. It is used to access a relational database like Oracle. The technology started as an evolution of Active X Data Objects (ADO) from Microsoft. A data provider provides access to a data source. In the case of Oracle, the data provider is the Oracle Data Provider for .NET.

A related topic is EF, which stands for Entity Framework. That is short for the ADO.NET Entity Framework. This is a set of data access APIs. It is similar to the Java Persistence API. An entity has a primary key. An entity data model maps data to a database schema. The programmer then uses either Entity SQL or LINQ to Entities to access the entity data model. This abstraction allows the programmer to access the data independent of where the data is actually stored.

Next we have DTO, which stands for Data Transfer Objects. These were previously called VO, or Value Objects. They implement a simple design pattern. They are used with DAO, which is short for Data Access Objects, another Microsoft technology. The main idea behind DTO is that they just provide accessors and mutators. There is no functionality provided other than get and set.

Finally we have POCO, which stands for Plain Old CLR Objects. The CLR is the Common Language Run time from Microsoft. This term is a play on POJO, or Plain Old Java Objects. POCO pokes fun at the complexity of accessing data using Java. POCO are simple access objects compared to hard to use ORM objects.

LOBs

I want to start by talking about the LONG data type in Oracle. You can have at most one column of type LONG per table. This data is stored inline (which means it is in the table, not referenced by the table). This column can store up to 2G of data.

A LOB is a large unstructured data type. An example of a LOB is an image. Use the DBMS_LOB built-in package to deal with this data type.

A related data type is the BFILE. This represents an external LOB. You initialize columns of type BFILE using the BFILENAME function to point to the actual file that is backed by the BFILE.
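
A minimal sketch, assuming a directory object named PHOTO_DIR already points at the file system location:

CREATE TABLE photo_refs (
  photo_id NUMBER,
  photo    BFILE
);

INSERT INTO photo_refs
VALUES (1, BFILENAME('PHOTO_DIR', 'sunset.jpg'));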

Finally let me mention temporary LOBs. They last as long as your session. These types of LOBs are not associated with tables. They are good for transforming data. Next time I will continue to discuss LOBs, with emphasis on performance and encryption.

Fine Grained Access

Oracle allows you to set general access to functions in the database. You can also take the access privs all the way down to the row level. This is fine grained access (FGA). When it is in place, SQL statements get dynamically modified to restrict the results.

Setting up conditional access requires the use of an application context and a policy. An application context is a set of attribute and value pairs. These can be set with the SET_CONTEXT procedure of the DBMS_SESSION package. You then define a policy on a table using the DBMS_RLS package. The policy grants or revokes access to table data based on one or more attributes of the application context.
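
Here is a rough sketch with made up names. Note that SET_CONTEXT only works when called from the package the context was created with, and the policy function must return a WHERE clause predicate:

-- set an attribute in the application context (from its trusted package)
BEGIN
  DBMS_SESSION.SET_CONTEXT('hr_ctx', 'dept_id', '20');
END;
/

-- attach a policy that appends a predicate to queries on the table
BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'HR',
    object_name     => 'EMPLOYEES',
    policy_name     => 'dept_filter',
    function_schema => 'HR',
    policy_function => 'dept_predicate');
END;
/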

On Packages

How do you test a function from the SQL*Plus command line? Well first you must define a variable. Then you can run the function, assigning its value to the variable as such:

EXECUTE :my_var := my_proc(param);

Note the colon in front of the variable name. When writing the function, you must remember that you can only use SQL types. They cannot be PL/SQL specific types. You must also use positional notation for parameters. Finally you must be logged in as the owner, or be granted EXECUTE privilege on the function.
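
Put together, a session might look like this (my_proc is the made up function from above):

SQL> VARIABLE my_var VARCHAR2(30)
SQL> EXECUTE :my_var := my_proc('some input');
SQL> PRINT my_var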

Here is a rule that is not specific to calling functions from SQL*Plus. The functions called from the SELECT statement are not allowed to do any DML. That includes anything called by the function, including triggers. This is just like another rule enforced by Oracle. You cannot query the table that is changing in an UPDATE/DELETE statement.

Now let's talk a little more about variables. A package body can have its own variables. These are private to the package. Variables exported in the specification are automatically initialized to NULL by Oracle.

I will end with dependencies. Suppose procedure A depends on procedure B. Also assume that procedure B depends on procedure C. Then we say that procedure A indirectly depends on procedure C. You can use "utldtree.sql" to set up the environment to trace dependencies.

There is still a lot more to learn before I become an Oracle Certified Professional. Come along as I bring you on my journey.

All About Cursors

I am trying to study up to be an Oracle Certified Professional. It is difficult with all the craziness at work these days. To tell the truth, I am also finding that just reading training materials is not enough. I am not learning that much. It might be time to attend some more in person training.

On to the subject for today. That is cursors. You can use a FOR LOOP to make cursor use really easy. However if you do not, and make use of explicit cursors, just make sure you close your cursors when you are done with them.
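
The FOR LOOP form opens and closes the cursor for you. A minimal sketch against a hypothetical employees table:

BEGIN
  FOR r IN (SELECT last_name FROM employees) LOOP
    DBMS_OUTPUT.PUT_LINE(r.last_name);
  END LOOP;  -- cursor is closed automatically here
END;
/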

There are two types of REF CURSORs. These are the strong and weak cursors. Strong cursors return a specific type. Weak cursors can be used with any query. Both REF CURSOR types can be passed to functions and procedures. You can think of the cursor as a pointer.
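
Here is a sketch of declaring and opening one of each, again using a hypothetical employees table:

DECLARE
  TYPE emp_cur_t IS REF CURSOR RETURN employees%ROWTYPE;  -- strong
  TYPE any_cur_t IS REF CURSOR;                           -- weak
  v_emp emp_cur_t;
  v_any any_cur_t;
BEGIN
  OPEN v_emp FOR SELECT * FROM employees;
  OPEN v_any FOR SELECT table_name FROM user_tables;
  CLOSE v_emp;
  CLOSE v_any;
END;
/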

Let me close by briefly mentioning subtypes. A subtype is a subset of an existing type with some constraints applied. It can be used any place the more general type it is based on can be used. Here is an example of a subtype:
SUBTYPE small_num_type IS NUMBER(4) NOT NULL;

Types of Collections

I continue to read up on advanced PL/SQL features. This is to get to the next level of Oracle certification. A recent chapter I read was about collections. These include associative arrays, nested tables, and varrays.

An associative array is faster than a database table. You declare one as a TABLE OF <some type> INDEX BY PLS_INTEGER. You can loop through it with the FIRST and NEXT functions. Access individual records and columns like this: my_var(i).my_field.
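
A minimal sketch of declaring one and walking it with FIRST and NEXT:

DECLARE
  TYPE name_tab IS TABLE OF VARCHAR2(30) INDEX BY PLS_INTEGER;
  v_names name_tab;
  i PLS_INTEGER;
BEGIN
  v_names(1) := 'Smith';
  v_names(5) := 'Jones';  -- the indexes do not have to be dense
  i := v_names.FIRST;
  WHILE i IS NOT NULL LOOP
    DBMS_OUTPUT.PUT_LINE(v_names(i));
    i := v_names.NEXT(i);
  END LOOP;
END;
/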

A nested table is a table within a table. This does seem a bit weird at first. You make a column in the outer table of type TABLE OF <some type>. The outer table must be created with the NESTED TABLE storage clause. You can insert many records in the sub table. This is done at the time of inserting records in the outer table.

Finally a VARRAY is created in a two stage process. First you define a type such as CREATE TYPE m_type AS VARRAY(10) OF <some type>. Then you can create a table with a column of your newly defined VARRAY type.

There are a couple ways to set the data in these collections. You can use a constructor. Maybe I will give you an example of that later. You can also get the data from the database. Finally you can assign one collection to another one.
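
Actually, here is a quick sketch of the varray route, constructor included (all names made up):

CREATE TYPE phone_list AS VARRAY(10) OF VARCHAR2(20);
/
CREATE TABLE contacts (
  contact_id NUMBER,
  phones     phone_list
);

-- the constructor populates the collection
INSERT INTO contacts VALUES (1, phone_list('555-0100', '555-0101'));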

Get ready as next time I am going to cover cursors. This includes REF CURSORs.

Java and C Integration with Oracle

I am brushing up to eventually take the exams to become an Oracle Certified Professional. One thing I need to learn is the techniques to mix C and Java code with my PL/SQL.

There are two ways a cross language call can go. One is the callout, where PL/SQL calls a C or Java function. The converse is the callback, where C/Java calls PL/SQL.

An extproc is one process per session that manages such calls. The extproc needs to be configured with the listener for it all to work correctly. I am just speaking at a high level now and will not delve into the details.

Let me briefly focus on Java here as it seems to be more popular. Java is stored and runs in the Oracle database. You use the loadjava command at the operating system level. You can specify either the Java class or source file with loadjava.
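
For example, loading a class and publishing one of its methods to PL/SQL might look like this (the class and method names are made up):

loadjava -user scott/tiger MyUtils.class

-- publish the Java method with a call spec
CREATE OR REPLACE FUNCTION to_upper_java(p VARCHAR2) RETURN VARCHAR2
AS LANGUAGE JAVA
NAME 'MyUtils.toUpper(java.lang.String) return java.lang.String';
/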

Next time I will briefly review good stuff like associative arrays, nested tables, and varrays.

Materialized View

The final topic I read about in my Oracle 11g book was materialized views. These were originally referred to as snapshots. They improve performance. They are like views in that they are built using a query of one or more tables. However they retain the query results from a specific moment in time. Unlike normal views, they do not get refreshed automatically every time you access the view.

We use materialized views on my project. There is a lot of data to roll up that our applications cannot do in real time. However the customers do not need an up to the minute accurate rollup. They want a reasonable approximation. Therefore we have two separate materialized views. The application will use one or the other, depending on which one is fresher. We alternate refreshing these two materialized views twice a day. It really works.

I am contemplating going for my Oracle Certified Professional certification. That will require a lot of learning. Specifically I want to be certified in Advanced PL/SQL. So I will be sharing what I learn with you.

ASM and Partitioning

I have to confess that I skipped a lot of chapters in the last Oracle 11g book that I read. This is because I am not a DBA. Most topics I defer to our actual DBA team. But it is good to know a thing or two about Oracle database internals.

One example is ASM. It stands for Automatic Storage Management. This is a file and disk manager for Oracle files. Note that you don't need Real Application Clusters (RAC) to take advantage of ASM.

Partitioning is not a new topic. We have been using table partitioning in my project for many years now. This is the process of breaking a table into smaller chunks to improve performance. There are different types of partitioning such as range, list, hash, composite, reference, and more.

Range partitioning is where you choose a column, and the values in the column determine which partition the records are placed. You define a range of values that the column takes on for each partition.
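
A minimal sketch of a range partitioned table, with made up names:

CREATE TABLE orders (
  order_id   NUMBER,
  order_date DATE
)
PARTITION BY RANGE (order_date) (
  PARTITION p_2008 VALUES LESS THAN (DATE '2009-01-01'),
  PARTITION p_2009 VALUES LESS THAN (DATE '2010-01-01'),
  PARTITION p_rest VALUES LESS THAN (MAXVALUE)
);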

Important Users

I am not a DBA. So I have never logged into a database as SYS or SYSTEM. And I probably never will. That does not mean I should not know anything about them.

In Oracle 11g, both the SYS and SYSTEM accounts have the DBA role. In addition, SYS has the SYSDBA privilege. Did that help you? Me neither. Time to get certified as an Oracle DBA.

Here is something I can relate to. Oracle 11g has a technology called streams. This is used to replicate data. You have the database data in a server and it needs to get into another remote location. Stream it over there. Obviously there is more to it than that. But this is the idea.

Another area of weakness in my Oracle knowledge is shared server processes. Oracle recommends that you have one shared server process for every 10 connections into the database. We have a lot of users (connections). Therefore we will most likely want a lot of shared server processes. Luckily we have a lot of hardware to back the database.

Next time I will provide an overview of ASM.

Passing My Test

We interview a lot of candidates for developer positions at my company. I only look for three technical competencies: C++ programming, Windows framework programming, and SQL skills. Let me focus a little on the third trait.

Now I don't need an Oracle expert for my team. We already have that. But I do need somebody who knows how to construct basic SQL statements. I need to see evidence of hands-on experience in this area. So I asked about the clauses in a SELECT statement.

I want to hear that a candidate knows that GROUP BY collapses the selected rows into one row per group. I also get impressed if the candidate can explain that HAVING restricts the results of the GROUP BY.
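
Something like this is all I am looking for (using a hypothetical employees table):

SELECT department_id, COUNT(*) AS headcount
  FROM employees
 GROUP BY department_id
HAVING COUNT(*) > 5;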

With that I have given away one third of the keys to our kingdom. Study up your C++ and Microsoft Foundation Classes programming, and you have yourself a job.

Oracle Certified Professional

Today I started reading up to study to be an OCP. My plan is to be an Oracle Certified Professional in PL/SQL programming. I read the first and second chapters in the Oracle training for this exam. I will be sharing what I learned later.

For now I continue discussing some things I learned reading a book on Oracle 11g. Triggers are associated with a single table. The trigger gets dropped automatically if the table is dropped.

An Oracle user has some other names. It is called an account. It is also called a schema. This means the user has a password for authentication. The user potentially owns some database tables, as well as other database objects.

Next time I will discuss some standard power user accounts in Oracle, as well as the clauses in a SELECT statement.

11g Background Processes

The Oracle 11g database has familiar processes running in the background. Most of these processes were there in prior versions of the database. However it seems like there are new names for these well known processes. Let's go over them.

There is the database writer called dbw0. Previously this was known as dbwr. There is also the log writer (lgwr) which writes to the redo logs. Pmon is the process monitor. Finally the archiver (arc0) copies filled online redo logs to secondary storage (disk).

Obviously there are a lot of other background processes that make up the database instance. I am not as interested in them as I am a PL/SQL programmer, not a DBA. Next time I will go over simple ideas such as triggers and users.

Oracle Database 11g

I recently picked up a book from the library titled Oracle Database 11g A Beginner's Guide. I was excited because I thought I could learn the new PL/SQL features in 11g. Well this book did not specialize in PL/SQL. It also did not specialize in just 11g technologies. The book covered everything in Oracle 11g. That is just too broad. But I did learn a thing or two.

Oracle first ran on a Digital VAX system. That is strange. I assumed it first ran on some flavor of UNIX. An instance is a set of processes in memory. I frequently hear about the Redo Logs from DBAs. These are essentially transaction logs.

Another common term DBAs use is the tablespace. That is a fancy way of describing the disk files the database uses. Another file of interest is the Spfile. It is the server parameter file. This file has startup parameters in it.

I will be blogging about this 11g book for a little bit. Next time I will cover the new names for the background processes in Oracle 11g.

Five Become One

I got a call from my team lead yesterday. He said he was swamped with a task that is due in a few days. He needed help writing a stored procedure to create a temporary table. The query was apparently all ready. It just needed to be done in PL/SQL. Although I was busy, I jumped at the opportunity.

Later I got the specs for the task. It was more complicated than I thought. There were a total of 5 queries. This was tricky because each of the 5 queries produced different columns of results. However they all needed to be combined into a big table.

I decided there was no way to CREATE a table AS SELECT FROM because there were multiple queries. So I determined the list of columns, and manually created the table with code. The create statement was so huge that I had to break the table creation down into one CREATE TABLE and multiple ALTER TABLE dynamic statements.

It was quite a chore to handle all the cases where a row in the resultant big table was sparse. Then I ran into more trouble when my dynamic statements to update pieces of the resultant table choked on some NULL inputs. That required some custom logic. By the time I was done, my script was around 1000 lines long. I did not leave work until midnight. What a chore.

Back To Basics

We have a new developer on our team. He is smart. He has a PhD in computer science. But there were some concepts he was not getting. So I had to break it down for him.

There is a text file that contains the source code for an Oracle PL/SQL package. You need to log in as the schema owner. Then you can compile the source code into the database. Once that is done, there is no relation between the source code file and the stored procedure in the database. The code gets stored within the database at compile time.

My peer thought that he could somehow step through the source code file on the hard drive. That's not how you do things. I recommended he just add some logging to his new functions. He later came back saying he encountered problems. It did not take long to diagnose the problem.

I mentioned earlier that the source code upon compilation generates a package in the schema of the user logged in during compilation. We have users who access tables and packages in the main schema by use of synonyms. However you need to log in as the schema owner to modify that package. The code is not written to explicitly reference the schema owner. I have been doing these things for a long time and take them all for granted. Sometimes you need to step back and understand another person's perspective to help them see the light.

Oracle 11g Release 2

I have read some blurbs about the Oracle 11g Release 2 of the database. So I thought I would mention some things I have learned that are going to be in this release.

You will be able to run 11g on a cluster of non-Unix machines. For example you could string together a couple Intel machines and spend less money. Conversely you can run 11g RAC on a single machine, and still take advantage of RAC features.

11g R2 has new and improved data compression. Anything that saves disk space is a good thing in my book as long as it works correctly. Finally you will be able to do a hot application upgrade in 11g without shutting down the system. That's hot.

Database Design

We are getting down to the physical design of the new changes in our system. Here is where we need to figure out which database tables get the new columns. We may also create a new table or two.

I am working with another team on a back end piece of the system. We will tie the front end to the back with some new database columns. The back end developers just thought we would add a column to an existing table.

This approach normally makes sense. You add the column to the table that represents the object that is getting a new property. However you need to look deeper to see how the new data will be used. You might want to separate some data by creating a new table. It might make your queries easier and the system faster. We are going to have the design debate later this week. This is just a specific instance of the greater problem of good database design.

Speedy GROUP BY

We had a big problem with our production system today. The customer determined we had not been transferring some files for a couple weeks. And those are some of our most crucial files. We had a big TO DO list to work this problem. I signed up to analyze the damage that was done to the business.

At first I ran some queries on a pretty large set of data. Our server is a massive one with tons of CPUs and memory. However these queries took a long time. I published the results. The managers wanted more details. They were looking for a way to spin the data to do the least amount of damage to our image. I told them this would take a long time. So I went home and brought my computer with me.

Here was the surprising thing. When I ran the SQL and divided up the result with a GROUP BY clause, the queries seemed to run faster. Perhaps the database was able to split the pieces of the query amongst the CPUs better. Or maybe my prior query results were cached, and the database could take advantage of the result set. Whatever the reason, I was happy that my homework was not taking too long. I was able to knock out the tasks and move on to the next emergency. I even got time to eat dinner and watch a movie this evening. Thanks Oracle 10g.

Back End Development

My project at work seems to be loaded with too much to do and too few resources. Yes I know that happens to everybody. But there is no way we are going to even be close to completing everything on time. My team leader has been tasked with trying to come up with a plan to get everything done. Today he had a brilliant idea. We could carve out the pieces of the newest changes that are for the back end. Then we could assign those to a junior PL/SQL developer that we have.

I gave this idea some thought. At first I thought there was maybe 10% of the work that needed to be done in PL/SQL packages and database triggers. However on second thought, the number might be closer to 15-20%. Our PL/SQL developer has some work to do. However she could help out with this latest development task.

Here is the difficulty in this arrangement. The actual PL/SQL coding is not difficult. The hard part is understanding the business. This is why we cannot give this part to some new developers on the team. The business of our customer takes a couple years to understand. I guess we could give the junior developer some hints, and let her code away. We shall see how this would pan out. I was hoping that I would get to do the back end work. I like writing PL/SQL code. And I know the business inside out.

Definer and Invoker Rights

In our system, we have the stored procedures owned by a central schema. Users have their own database accounts (schemas) where we stored temporary tables. The stored procs frequently create these tables. One developer reported that a rewrite of some stored procedures broke this pattern. The stored procedure, owned by the central schema, was creating temp tables in the central schema. That is no good, as each user needs their own copy of the temp table.

When I was asked for help, I said that we have the technology to make this work. We have it set up in the old stored procedures. However I referred the developer to our DBA Team. The lead DBA told us to set up invoker’s rights for the procedure. Normally we seemed to use whatever the default was. And it seemed to work before. We tried all kinds of debugging, but could not get this to work without using invoker’s rights.

Definer’s rights is the default in Oracle. Whoever owns the stored procedure (the definer) has the credentials that determine what other code the stored proc can call, and what tables and objects are used by default. This is not what we want on our project.

You have the ability to override the default behavior and specify invoker’s rights. This is done by specifying the AUTHID CURRENT_USER keywords when creating the stored procedure. This causes the context of the current user to be used when a stored procedure is called. Sure enough when we made this change, the temporary tables started getting created in the calling users’ schemas. Now we got some new stored proc code to update.
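
A minimal sketch of the fix. The procedure and table names are made up, and creating a table from PL/SQL also requires the CREATE TABLE privilege granted directly to the caller:

CREATE OR REPLACE PROCEDURE create_temp_table
  AUTHID CURRENT_USER
IS
BEGIN
  -- with invoker's rights this DDL runs as the caller,
  -- so the table lands in the calling user's schema
  EXECUTE IMMEDIATE
    'CREATE TABLE my_temp (id NUMBER, payload VARCHAR2(100))';
END;
/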

Automatic Documentation

How can you create an Entity Relationship diagram automatically from the structure of the Oracle database? Other databases have tools to do this such as PostgreSQL Autodoc and MySQL Workbench.

I got some advice from some people for products to try. One of them is Schema Spy. Another is Enterprise Architect. A final one is DB Visualizer. Right now I use tools to execute SQL. Somebody else on the database team generates documentation from the database. However I don’t think they use any specialized tools. I think they either use Oracle tools, or some custom written scripts to do the work.

Our documentation is mainly text information on the database. This is normally based on the data dictionary. But it would be nice to have some fancier ER diagrams to use and show off. It might be time to ask for some new software to be purchased.

Oracle Alert Log

The other day, our customer reported that some functions in our system were slow. Others were just plain hung. This issue went to our DBA Team. They asked me about some of the functionality that was involved. I provided them the overall flow, as well as the major database tables involved. Then I went back to my normal work.

Our manager held a conference call to work the issue. I share a room with the manager, so I overheard some of the talk. My manager was interested in the DBAs looking at the Oracle Alert Log. Now I have heard this term before. But I did not know exactly what it was. Time for a little research.

The alert log is a text file containing messages and errors. Specifically it will contain internal errors (ORA-00600) and deadlock errors. It also contains database startup and shutdown events. The filename itself is alert_<SID>.log. The location of the file is governed by the background_dump_dest parameter. Go figure.

Apparently the alert log is the first place a DBA should look when there are database problems. In fact, a proactive DBA will be monitoring this file a couple times a day. One interesting fact is that you can write your own messages to the alert log using the ksdwrt procedure in the dbms_system package.
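
For example, a sketch of writing your own message. Keep in mind ksdwrt is undocumented; the first argument of 2 routes the text to the alert log, and you need access to SYS.DBMS_SYSTEM:

BEGIN
  SYS.DBMS_SYSTEM.KSDWRT(2, 'Nightly load finished OK');
END;
/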

If you have Oracle 11g, the alert log can be directly queried using the X$DBGALERTTEXT fixed table. However you must connect as SYSDBA to access this table like you would any other fixed table. Finally I should mention that Oracle itself computes a number of metrics based on the contents of the alert log. Now I am at the beginning of my journey to becoming a DBA. I just don’t know if that is where I want to go.

Disabling Constraints

My boss told me to go help a teammate out in replicating a problem found by our customer. I worked with her to look at the code. We needed to set up some very specific data to make the condition occur. I told her we should execute some SQL to generate the new records required.

At first we wanted to clone an existing record. She told me that some triggers were preventing that. So I had her disable all the triggers on a few tables. The syntax for that command was straight forward:

ALTER TABLE my_table DISABLE ALL TRIGGERS;

Then it came to disable the constraints. She tried a similar command. However we could not figure out the syntax to disable all constraints in a single command. Later I found out that the command does not exist. You need to disable each constraint manually, one at a time. Or you could write a script that finds all the constraints and disables them one at a time automatically. Come on Oracle. Can't you make it easy for us? We were in a hurry.
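
Something like this sketch would do it (watch out for referenced primary keys, which may need the CASCADE option or a particular ordering):

BEGIN
  FOR c IN (SELECT table_name, constraint_name
              FROM user_constraints
             WHERE table_name = 'MY_TABLE') LOOP
    EXECUTE IMMEDIATE 'ALTER TABLE ' || c.table_name ||
                      ' DISABLE CONSTRAINT ' || c.constraint_name;
  END LOOP;
END;
/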

Stored Procedures

Recently I read two blog posts about stored procedures. Personally I like writing stored procs. However I am sometimes hesitant to do so because of the difficulty in releasing the software to clients. It is simple to write all the code in C++. We have a single installer which deploys the application on the workstation. If I include part of my solution in a stored procedure, I need to make sure the client and back end stored procedures are in sync. That is one extra headache that makes me choose the easier route more times than not.

But let’s get back to the debate on stored procedures. One author stated that placing code in a stored procedure does not give you a performance advantage per se. That is because normal queries from a client get cached and have equal performance. I am not sure if I agree. A stored procedure which has been compiled into the database is going to have some inherent benefits for performance, especially if there is a lot of SQL involved.

Stored procedures help stop injection attacks. They can also enforce data integrity if you choose to use them that way. I recall from my Oracle programming class that you should use constraints to enforce integrity, not stored procs. However that might be a personal decision. One author said that you should not code business logic within a stored procedure. Again I don’t think I agree. Your business layer can be at the stored procedure level.

When you do code logic in a stored procedure, you are normally locked into a particular database vendor. That’s not a problem on my current project. We are an Oracle shop, and will most likely be so until the end of time. One benefit with stored procedures are that you can have fine control over the permissions on who can execute a stored procedure. That is one ability that I like.

Export Import

Our testing team found some problems with our latest software delivery. I needed to fix those problems quickly. The first step to correcting a problem is to replicate it. Unfortunately the testing team does all their work in their own database. Luckily I have an account in that database.

I wanted to replicate a bug discovered by a senior tester. So at first I spied on the data for the transaction. My eye could not detect anything unusual. Therefore I decided to copy the data into my own development database. I figured the best way to do this was to export the data of interest as SQL insert scripts, which later could be run.

My PL/SQL Developer tool supported the export of data as SQL script. I selected the tables I wanted. After inputting the WHERE clause, PL/SQL Developer generated SQL insert scripts for me. However when I tried to execute them with Oracle SQL*Plus, I kept getting errors. Apparently there is a 256 character limit per line of script using SQL*Plus. PL/SQL Developer had created scripts with very long lines. I manually broke those lines up into short lines to placate SQL*Plus.

There has to be an easier way than that. Perhaps I could execute the SQL inserts from select statements right in SQL*Plus. Normally I don't do things that affect more than one database at a time. However I can do things through database links I guess. There should be an option within PL/SQL Developer that controls output line length, right? Let me know if you have any other ideas for this problem. It stinks to have to manually format script files.
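
For example, if a database link to the test database existed, a single insert-from-select could replace the whole script dance (the link name and predicate are made up):

INSERT INTO my_table
SELECT *
  FROM my_table@test_db
 WHERE trans_id = 12345;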

Oracle Reports 6i

I am working on a big update to our software system. This update includes two new reports. At the same time, the rest of the team is working on redoing how the application produces reports.

We currently use Oracle Reports 6i. That is some old but stable technology. The customer has said we cannot upgrade to Oracle Reports 10g as it requires a server which they do not wish to deal with right now. So we are rolling our own reports with custom code.

I had to integrate the new reports with the new changes I am making to the application. Right now they do not have the new report technology ready. So I figured I had better code in some stubs where the new reports will go. I decided to use Reports 6i for this.

To my surprise, it was very simple to use the Oracle Reports wizard to knock out some quick report prototypes. Things got a little trickier when I tried to proceed with some layout modifications. However I decided the wizard based output was good enough for stubs. Maybe it is a bad idea to abandon Oracle Reports technology.

XML DB

Today I listened to a pitch from one of our team members. The topic was the technologies available to deal with input files we will be getting in XML format. One of the recommended approaches was to use XML DB.

To tell you the truth, I knew little to nothing about XML DB. Apparently it is a part of the Oracle database. It is supposed to allow native XML processing. Oracle says it is a high performance solution. They give you different options to store the XML within the database. Some options have better performance than others.

Once the XML is stored in a repository within the database, you can get to it using languages such as Java, PL/SQL and C. You access it via the DOM API. There is also a new XMLType. You can use this type in table columns or PL/SQL variables. One of our DBAs cautioned us that it might be slow to put big files of XML data in the database. We are going to do a proof of concept to see what kind of performance we will encounter.
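
A minimal sketch of the XMLType column route, with made up names:

CREATE TABLE xml_docs (
  doc_id NUMBER,
  doc    XMLTYPE
);

INSERT INTO xml_docs
VALUES (1, XMLTYPE('<order><item qty="2">widget</item></order>'));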

Triggers

One of the last things I learned at my instructor led Oracle training class was triggers. Now I had worked a little with triggers before. However I learned more about them in training. A theme I got from my instructor was that derived data is bad. That means you should not use triggers to then update other tables based on data that just changed in the table with a trigger. Oooops. That’s exactly what we do with some of the triggers on my project. Then again, I never said we were not bad.

You can write either before or after triggers. After triggers are good for auditing. That is what we mostly use them for. Perhaps my project is not all that bad. You have the ability to change the new values for an update in a before trigger. We also do that on our project. However it usually feels kind of like a hack.
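
Here is a sketch of the auditing kind, with a made up audit table:

CREATE OR REPLACE TRIGGER emp_audit_trg
  AFTER UPDATE OF salary ON employees
  FOR EACH ROW
BEGIN
  INSERT INTO emp_audit (employee_id, old_salary, new_salary, changed_on)
  VALUES (:OLD.employee_id, :OLD.salary, :NEW.salary, SYSDATE);
END;
/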

Here is another theme I got from my instructor. You should not use triggers to enforce integrity constraints. That is what constraints themselves are for. Oooops again. We do some integrity constraint checking within some of our triggers.

The source code for the triggers you compile into your database is stored in DBA_TRIGGERS. That sounds logical. It is best if the source for the triggers is very small. If you need to do complex work, have the trigger call another procedure that has the meat in it.

In retrospect, my company shelled out a lot of money for me to attend a week of PL/SQL training. This training alone did not prepare me for the certification test that I subsequently took. Nonetheless this training was very valuable. The material in the class was adequate. However the instructor’s knowledge was outstanding. I was able to ask a whole lot of questions. And I received excellent consultant style answers. Your mileage may vary. But I am one satisfied customer, even with the hefty price tag. I plan to attend some more training as soon as I can find some time within my busy schedule.

Rights, Hints, and Bulk Collect

I am back to recounting what I learned at my hands on instructor led Oracle training. The first topic deals with who is authorized to execute database objects like functions or procedures. The default method is called definer’s rights. This means that the rights of the user who created (compiled) the function or procedure are used when determining what the code can do. This authorization is checked at the function and procedure level.

On a totally different topic, you can provide Oracle with a NOCOPY hint when specifying parameters to a function or procedure. This is truly a hint that Oracle may end up ignoring. This hint implies that the developer recommends the parameters be passed by reference. The benefit is a speed increase especially when the parameters are large. However there is a danger when using this. If your program blows up in the procedure or function, the state of your data may be undefined.
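
A sketch of the hint on a collection parameter (using the real DBMS_SQL.VARCHAR2_TABLE type so the example stands alone):

CREATE OR REPLACE PROCEDURE fill_big_list(
  p_list IN OUT NOCOPY DBMS_SQL.VARCHAR2_TABLE)
IS
BEGIN
  -- NOCOPY asks Oracle to pass the (potentially large) collection
  -- by reference instead of copying it in and out
  FOR i IN 1 .. 10000 LOOP
    p_list(i) := 'row ' || i;
  END LOOP;
END;
/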

Finally I want to briefly mention the BULK COLLECT mechanism. I have seen top notch Oracle developers use this feature. It has always mystified me. Perhaps it has something to do with how the words BULK COLLECT sound. Anyway this technique is used when exchanging data between the database and a collection.

Normally you would iterate through a collection one record at a time and exchange the data with the database. This however is a slow technique as you are switching context between the PL/SQL and SQL engines. BULK COLLECT lets you do all the work at once. It is good when you are dealing with a lot of data (a big table). It is also easier to code than to manually iterate through the records yourself.
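
With a really big table you may not want everything in memory at once; the LIMIT clause fetches a batch per context switch. A sketch, again with a hypothetical employees table:

DECLARE
  CURSOR c IS SELECT last_name FROM employees;
  TYPE name_tab IS TABLE OF employees.last_name%TYPE;
  v_names name_tab;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO v_names LIMIT 1000;
    EXIT WHEN v_names.COUNT = 0;
    NULL;  -- process the batch here
  END LOOP;
  CLOSE c;
END;
/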

Dynamic SQL and Such

On the third day of my instructor led Oracle training, we seemed to cover a number of points. We also touched upon the techniques for performing dynamic SQL. So this post may be a bit erratic. I just thought I would get some more info out there for you.

Database Control Language (DCL) is the way to control permissions to database objects. These commands begin with GRANT or REVOKE. On a different topic, you supply bind variables to an EXECUTE IMMEDIATE statement through its USING clause. This allows you to provide positional parameters to the SQL. The names of the bind variables are arbitrary. It is the order that determines which variables are matched with the placeholders in the SQL.

There are two main types of cursors. These are strong and weak cursors. A strong cursor will return data in a record type. The weak cursor will just return unstructured data. There are two major techniques to executing dynamically formed SQL statements. The older technique is to use the DBMS_SQL package. This method has been around a long time. However it is much slower than Native Dynamic SQL (NDS). Note that the OPEN_CURSOR command in DBMS_SQL does not actually open the cursor. It just creates it.
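
For reference, here is the classic DBMS_SQL sequence (the UPDATE is a made up example):

DECLARE
  v_cur  INTEGER;
  v_rows INTEGER;
BEGIN
  v_cur := DBMS_SQL.OPEN_CURSOR;  -- creates the cursor, despite the name
  DBMS_SQL.PARSE(v_cur,
                 'UPDATE employees SET salary = salary * 1.05',
                 DBMS_SQL.NATIVE);
  v_rows := DBMS_SQL.EXECUTE(v_cur);
  DBMS_SQL.CLOSE_CURSOR(v_cur);
END;
/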

Finally I got a peek at the latest Enterprise Manager for the Oracle database. Since I am a programmer and not a DBA, I do not use the Enterprise Manager. This product is now web based. It can be used to assist with SQL tuning. Next time I will probably write about invoker’s and definer’s rights. Until then be well.

SQL Execution

Part of the reason to attend in person training is to pick up the secrets that you cannot read in a book. I got plenty of that at my last Oracle hands on training class. In addition to that, I got some pieces of information that were beyond the scope of the class. However they were nevertheless very interesting and potentially useful. Today I want to discuss a sample of one topic that I have learned. That is what Oracle does when it is working towards executing a SQL statement.

The SQL execution portion is separate from the PL/SQL processor in the database. This SQL execution portion is governed by a common subsystem which handles all SQL requests, whether they come from the result of PL/SQL or a user typing SQL at the SQL*Plus command prompt.

Here is the order of things that happen when Oracle prepares to run a SQL statement. First there are syntax checks. Then there are semantic checks. Third there are checks whether all the permissions are in place for the objects being queried. Then the cost based optimizer (CBO) is consulted on how to best approach the query for performance purposes. The CBO step has a number of sub-steps. Finally the row selector identifies the results.

There are many parts to the CBO step. Indexes are analyzed to determine an access path to the data. Then the join order is evaluated if more than one table is involved. Then the plan costs are counted and evaluated. Obviously Oracle is going to want to choose a plan that has the least costs.

My instructor for PL/SQL programming told me there was a whole separate class for performance tuning of Oracle. In that class he would delve deeper into the SQL execution process. A coworker of mine is encouraging me to attend this class and get tested on it. However I think my heart is with PL/SQL development. Therefore my next class will be an advanced PL/SQL development class. However it is nice to know that there is a whole world of information to learn about within the Oracle universe.

Built In Packages

Oracle has quite a few built in packages available for PL/SQL developers to use. Some packages are optional for the database and must be manually installed. You can find information on the built in packages at Tahiti dot Oracle dot com. Look under Application Development, then SQL and PL/SQL, and finally PL/SQL Packages and Types Reference.

If you are not planning any commercial uses for it, Oracle allows you to download, install, and use any version of their databases. The Express Edition of Oracle is a good choice for basic Oracle database needs. It was created in response to the open source databases available now. It competes with MySQL for example. It has a modest maximum size limit (4 GB of user data in the 10g Express Edition). And it is completely free for any type of use.

Normally the exceptions raised by a package are defined by it. However generic errors sometimes cause a package to raise an exception that it does not define. One sample package is UTL_FILE. As you can imagine, it is used for file input/output. Note that you will probably not want to use this package to output reports. There is other software specialized for that purpose.
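
A minimal sketch of UTL_FILE, assuming a directory object named OUT_DIR exists and you have write access to it:

DECLARE
  f UTL_FILE.FILE_TYPE;
BEGIN
  f := UTL_FILE.FOPEN('OUT_DIR', 'notes.txt', 'w');
  UTL_FILE.PUT_LINE(f, 'hello from PL/SQL');
  UTL_FILE.FCLOSE(f);
END;
/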

Another package available is UTL_MAIL. It works with an SMTP server. The package itself does not actually send email. The SMTP server does that. Note that this package is not installed by default during an Oracle install. You must manually install and configure it to work with your SMTP server.
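
A sketch of the setup and a send, with made up host and addresses:

-- point the database at your SMTP server (done once, as a DBA)
ALTER SYSTEM SET smtp_out_server = 'smtp.example.com' SCOPE = BOTH;

BEGIN
  UTL_MAIL.SEND(sender     => 'app@example.com',
                recipients => 'dba@example.com',
                subject    => 'Nightly job status',
                message    => 'Load completed.');
END;
/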

OCA Certificate

This week I received my certificate from Oracle. I am now an Oracle PL/SQL Developer Certified Associate (OCA). This demonstrates that I did the work and have the capability to do Oracle PL/SQL.

Now Oracle is encouraging me to continue and get an Oracle Certified Professional (OCP) credential. For PL/SQL developers, I believe that you only have to pass one more exam to get to the OCP level. That assumes that you have already received some training. My company supports its employees for such certification. So I think I might go for it.

Since I am officially an OCA, I can display the OCA logo from Oracle. Now I don't think I will go as far as putting it on my business cards. That logo is reserved for my own company. However I might add it to my email signature. I have seen a lot of Oracle DBAs do that when they get their OCP.

I plan to continue to share the interesting facts I learned during my hands on training class. Be on the lookout for future posts with this info.