Big Numbers Cause Big Problems

I was tired this morning. Slept in an extra half hour. When I got in front of my computer, I started working on the task I had planned for today. My team lead calls me up and said there were problems during deployment to test. Okay. We looked at the logs for guidance.

The database error was that the code was trying to fit a large number into a NUMBER(9) column. Seems simple enough. I said we were probably selecting a column that was bigger than precision 9. We looked at the code. Nope. Source columns were also NUMBER(9).

Then I said we must be getting our insert and select order mixed up. This was a large insert statement. We pulled the column names and the values we were pulling into an Excel spreadsheet. Yep. Something got out of order in the select. Just a couple out of order can cause some pain.

A manager wanted to know what was going on. Why did this work in one environment and not another? Well it all depends on what type of data (and how big the numbers are) in the columns we got switched.

Further analysis of the rest of the SQL showed some more SQL statements out of whack.

The Mystery Table

Yes I know. I don't post much here any more. But I work in a PostgresSQL database now. Thought I would recount some challenges I had in the DB today. Needed to make some last minute changes in a function. This function dynamically inserts data into tables. Turns out it was using a mishmash of source columns as input. Needed to handle a couple of scenarios based on whether the primary input was NULL.

It was easy enough. I inspected the catalog to figure out the sets of data I needed to choose from. There is no NVL function in this DB. So instead I had to use CASE expressions. I coded up the changes, compiled them into the database, and executed my function. The good news is I got all the primary data I needed. Unfortunately the copy of production data I had did not exercise all paths.

Okay. I could just clone some records in my own schema, modify them to simulate the second data set, and rock and roll. Unfortunately I don't have my own schema. I just share a schema with my whole team. We usually just make sure all objects are owned by a role. And we all have that role. That works until somebody creates an object and forgets to reassign ownership.

Wouldn't you know it? Somebody created the table I needed in our common schema. But they did not share ownership. I could not even select rows from the table. I needed to modify data in that table. What was my solution? We were running out of time, so I created a similar table with another name. Then I modified my function to use this other table. Hack? Yes. But it worked to test out my logic.

Performance Tricks in Other Databases

I have been working exclusively with the Oracle database for over 15 years. Now I have jumped to a project that uses the Greenplum database. This is a distributed database with PostgresSQL underneath. We are using it to deal with massive amount of data that have a short timeframe for processing.

My new team has informed me that the first rule of thumb is to not do updates. We insert records only. We also do not join too many tables together at once either. Mostly just two tables are joined. Here is a trick they use. The got functions which use temporary tables. They join two tables and stick the result in one temp table. Then they join that temp table with other tables to create a second temp table.

This goes on and on with different functions. They build up a huge table with many columns (1000+) by continually joined two tables together and storing the result in temp tables. Apparently this is very fast in the Greenplum environment. It does not seem intuitive. But hey. If it works, why worry about it?

Roles and Privileges

I have been studying Oracle privileges and roles today. The basics are easy. The details are tedious. There are two types of privileges: system and object. You grant these privileges to either a user or a role. If they are granted to a role, then a user gains access to the privileges when they are granted that role. There is a WITH ADMIN OPTION that allows users that you grant privileges to the ability to themselves grant the privilege to others.

When you drop an object, any privileges on the object that have been granted to roles disappear automatically. You can grant multiple privileges at once if you separate the privileges by commas. You remove privileges by REVOKE FROM. You can remove all privileges with REMOVE ALL PRIVILEGES or simply REVOKE ALL.

PUBLIC is an Oracle account that represents all users. You can grant privileges to PUBLIC. Then all users get that access. If you own an object, you do not need explicit privileges to access it. You get them by default. All grants operate immediately without the need for a COMMIT.

There are a slew of data dictionary views to look at privileges. USER_SYS_PRIVS contains those system privileges related to the user. DBA_SYS_PRIVS has all system privileges.

USER_TAB_PRIVS contains those privileges where you own the object, granted the privilege, or are a grantee of the privilege. ALL_TAB_PRIVS adds privileges through a role or that are granted to PUBLIC and available to you. DBA_TAB_PRIVS has everything.

ALL_TAB_PRIVS_RECD are privileges you have directly, through a role, or that are PUBLIC. There is a SESSION_PRIVS view that I am not too familiar with yet.

There are some old Oracle defined roles that are being phased out. But I think you still need to know them. CONNECT gives you CREATE SESSION. It is for a general user. RESOURCE is a role for application developers. DBA is a role for administrators.

Roles can be granted to roles. There are a number of data dictionary views associated with roles. ROLE_ROLE_PRIVS are the roles that have been granted other roles. ROLE_SYS_PRIVS are system privileges assigned to roles. ROLE_TAV_PRIVS are object privileges granted to roles.

Here is a trivia fact: there are no privileges specifically for indexes. You get the ability to create an index when you have the CREATE TABLE privilege.

Materialized Views

I have been tracing down a customer problem. Sometimes the summary screen shows correct data. Sometimes it is all zeros. Looks like the data comes from one of two materialized views. The views themselves have the right data. What could be wrong? That's the story for another post. Today I want to talk about materialized view technology.

Materialized views (MVs) were previously called snapshots. They are like regular views in that there is some SQL run to get the results. However unlike normal views, MVs store the results in a table. They were introduced in Oracle 8. The query used to populate the MV can go against tables, views, or even other MVs. Since the hard work is done when the MV is created, use of them can really speed up your queries.

Other databases have MVs. In Microsoft SQL Server they are called indexed views. And in DB2 they are called materialized query tables. You get the speed increase when you directly use the MVs. There is also a query rewrite technology where the database can use an MV instead of a base table to get speedup. This query rewrite was introduced in Oracle 8i.

There are different strategies to refresh the data in an MV. You can manually do it. Or you could periodically update. Or you could automatically update them when the underlying tables change. That last change gets captured in materialized view logs. There is a fast refresh option that requires specifics around the underlying query:
  • No aggregate functions
  • No joins other than subqueries
  • No mismatch columns in a UNION
  • Plus some other subquery restrictions
Not only is an MV good for improving query performance, it can help replicate data across databases. You just set up an MV in your database with the underlying tables being remote. Of course any user defined types in the source databases need to be in your own database.

You cannot perform any DDL on an MV. But you can perform DML. It depends on how your MV was set up. You can make it read only. Or you could make it updatable. Do that by specifying FOR UPDATE AS prior to the SELECT. There are even writable MVs. But they are rarely used.

The query under the MV can have an ORDER BY clause. However that only applies to the data from the initial creation. Updates can make the MV out of order. The MV can be built deferred. Then the data can later be added during a refresh. Finally if you have a regular table that acts like an MV, you can turn it into an MV  by using the ON PREBUILT TABLE during MV creation.

Merge and Flashback

I will be honest. I have not used the MERGE statement before. Looking promising though. You can do a lot with just one MERGE. It accomplishes two or more DML operations in one SQL statement. It can do an INSERT, UPDATE, or DELETE.

With flashback, you can look at data at a previous point in time. You can go back to the last DDL that occurred on a table. You are also constrained by the undo retention period specified by the Oracle database parameters. There are a couple types of flashback you can employ:
  1. FQ - Flashback Query
  2. FVQ - Flashback Query Version
  3. FTQ - Flashback Transaction Query
FQ lets you look at data at a specific point in time. An important pseudo column that works with FQ is the System Change Number (SCN). Each commit gets a unique one. You can convert it to time using SCN_TO_TIMESTAMP(). FQ can query prior values of a table with respect to a certain TIMESTAMP. Or you can go back to a specific SCN.

FVQ gives you rows that represent committed versions of data. Note that if you insert/update and then delete rows in a transaction, those operations do not show up in the FVQ. You only see material changes per transaction. You SELECT data whose VERSIONS are between TIMESTAMP values. You can also look between SCN ranges too. FVW does not work on views.

FTQ is running queries against the FLASHBACK_TRANSACTION_QUERY view. Working with this technology involves the global transaction identifier (XID). It is a RAW value tied to the transaction the change was made in. You can look at the value using the RAWTOHEX() function. Basically you want to query the view using some value(s) of the XID.

Subqueries and Such

Today I was studying for my Oracle SQL Expert certification exam. I started out knowing the material I was studying. Then I hit a number of new topics. Some of them I had heard of but did not understand. Others I was in the dark about. Still need to practice up. But I am learning a whole lot.

First up let's talk about the CTAS - Create Table As Select. You create a table and insert data into it all in one SQL statement. Constraints and indexes do not get copied over (except for the NOT NULL constraint). If you SELECT something other than a column, then you need to provide a column alias to define the column name in the new table. Or you could provide the fully qualified list of columns in the CREATE TABLE part.

The CTAS technique can be used to INSERT rows into a table using a subquery. Just SELECT INTO your table. Not need for a VALUES clause. Stick the SELECT subquery right after that.

Multitable insert allows you to, as the name suggests, insert into more than one table with a single SQL statement. This can be an unconditional or conditional insert. For unconditional inserts, you must specify the ALL keyword. Otherwise you can specify the ALL or FIRST keyword for a conditional insert. ALL means all of the WHEN clauses are evaluated. FIRST means only the first WHEN clause to match gets chosen. Multitable inserts do not work on views. Be warned that you cannot put sequences on the subquery for the mutlitable insert. Also watch out because the insert is a single SQL statement, which generates only one sequence value.