Perfomance for Dropping a Column

I am working on some experimental routines to analyze hash functions. The big idea is to generate a lot of hashes. Of course I am storing them in a database table. For a while, I have been going back and forth trying to determine how to identify each hash.

Initially I think I just has a key that was generated from a sequence. Then I moved to multiple keys that were related to the input data. However I am changing how I generate that input data. So I thought I would drop those columns. Oops. It is taking a very long time. I only have 20M rows in the table. What is going on here?

Now I realize that I am going to regenerate all my hashes. So those existing 20M rows are just going to be truncated anyway. I guess I should have truncated first, then dropped the columns. You live, you learn. Still want to figure out why it is taking so long to drop a column here. Yeah there are a bunch of rows, and some of the columns may be big. But come on.

Package Access Mystery Solved

I have been playing around with the DBMS_CRYPTO package, trying to generate a lot of hashes. I want to study the distribution of hashes for different inputs using assorted hash functions. The HASH() function seemed easy enough. Just pass two parameters: the input your want to hash, plus the hash function you want it to use. Unfortunately, I ran into trouble.

The DBMS_CRTPYO package defines six hash types in its packages specification. For example, HASH_MD4. But when I tried to specify MD4 using this variable, I kept getting PL/SQL: ORA-00904: "SYS"."DBMS_CRYPTO"."HASH_MD4": invalid identifier. What was going on here? I was explicitly granted EXECUTE on the DMB_CRYPTO. I should be able to see that variable.

I searched the web and was about to give up. Just hard coded the hash function values. Then I found the answer on TalkAPEX. My problem was that I was trying to use the package variable from SQL*Plus. Not allowed. You can only get access to the package variable from PL/SQL. Aha!

So now I just wrote a procedure to run my hashing. The procedure, via definer's rights, has EXECUTE on the package. And it can access the package spec variables. Now I am good to go. Only thing left to do is feed these hashes a whole lot more inputs.

A Little Bit of Crypto

I have been trying to figure out to "collision resistant" some of these standard hash functions are. It is a tough concept to get my head around. I figure what better way to understand than to do some hashing. And let's put the result in a database table so I can analyze it from different angles.

The first thing I needed to do was grant access to the dbms_crypto package which is owned by SYS. Not a problem. Then I called the hash function. Seemed easy enough. I wanted it to do a SHA256 on my plain text. For some reason, I could not get the function to recognize HASH_SH256. It is supposed to be defined in the dbms_crypto package. But Oracle kept complaining.

I needed to get this project moving. So I just hacked the hash type and passed in a 4. I have it on good authority that 4 means perform a SHA256. Eventually I should figure out what the problem here is. Now I got a bunch of hashes. But they are in raw format. Not too good to look at.

After running the raw hashes through rawtohex() plus a to_char(), I am good to go. I was surprised that all the hashes have the same length. I know the upper bound is supposed to be 256 bits wide (i.e. 64 characters). However I thought some of the hashes might be shorter. Nope. all were the same size.

What input data did I pass to the hash function? Heh. I grabbed a bunch of sentences from the book Pride and Prejudice. That is a story for another blog post.

Jobs in Information Technology

I saw a post on a board asking for some help. The dude wanted to know what kind of skills or certs would be good to get into the Information Technology industry. He also wanted to know what kind of jobs he should shoot for. He thought maybe a help desk position. While that is an option, the pay is at the low band of the industry.

Here is a relevant observation for today: Getting involved with security is a good thing. Network security is even better. These are hot right now. It seems to help if you have a government security clearance. Provides some protection against offshoring. Another general truth is that you always need to be learning in this field. Stuff changes a lot. New stuff pops up.

I heard a rumor that a business analyst position is the sweet spot. Money is really good. Plus the work is "easy". Well that is what I heard. Not sure I agree. Yeah the money is good; it is a lot higher than help desk work. But I doubt the job is an easy one. In fact, it might be one of the harder ones. Let's just say it is hard to be a good business analyst.

Regarding pay, a help desk position may get you $35k and top out around $50 to maybe $60k per year. Decent. But not great, especially in IT. I was surprised that some guys who were gainfully employed turned down contracting rates as high as $90 per hour. That sounds like a lot of money to me. But maybe you could earn more if you are really good.

Moving Forward with Many Unknowns

I used to be part of a big team. Then I got put on a big assignment that lasted over a year. I was on a much smaller sub-team. The new development is over. That project is in maintenance mode. I have been tagged to do general maintenance of that project. My old team leader got wind of that. He assigned me a high priority maintenance task. Unfortunately it dealt with a code base I don't work with on a regular basis.

So I dug in. Traced how the data was getting loaded. I asked the ETL team why there was data missing from a table they populate. The only answer I got was that was what they received. Not too enlightening. The guidance I got was to try to find the data somewhere else. Luckily the analysts had a lead on a source where I could find the data.

With the new source in my hand, the fix should be easy, right? Nope. I got to modify this big system. I decided to just come up with a hypothesis of how I could achieve this. Then I went to research how that could be done. Halfway through, I found out that I was barking up the wrong tree for half of the solution. Okay. Time to regroup.

Once I figured out where I needed to make the changes, I found that I had the wrong version of code to start with. I was working with something that was two years old. I backed up and downloaded all the code from our source code repository. Was able to match up the version of code that was most recently deployed to production. Now I am cooking with gas.

The are a couple good lessons here. Sometimes you got to try something out when you don't know all the answers. Also, by doing some hard work, you can go figure stuff out for yourself without having to bother other busy people. Finally I learned that there is nothing like making changes, running the code, and seeing the results work. Very satisfying.