Setting up Wallet Using Orapki

I have a stored procedure that is using utl_http to retrieve the contents of a web page. Basically I am doing web scraping for information. I have previously set up my ACLs so that Oracle will not block my web access. However, I now encounter an ORA-29024: Certificate validation failure. What do I have to do to get web access?

Turns out I was going to a site that uses secure HTTP (https). And you just cannot access such a site without some setup. Okay. I know Internet Explorer can get to the site. So I click around in my browser to find the root certificate the browser uses to access the https site. That root certificate authority is VeriSign.

I export the certificate to a file. Then I use my new friend orapki. This is a command line program used to set up my wallet. I first create the wallet. Give it a password plus a location on my file system where the wallet will reside. Then I add the VeriSign certificate to the wallet. I have got to be good to go now.
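For the record, the orapki commands look roughly like this. The wallet directory, password, and certificate file name below are placeholders of mine, not the real ones:

    orapki wallet create -wallet /u01/app/oracle/wallet -pwd WalletPwd1 -auto_login
    orapki wallet add -wallet /u01/app/oracle/wallet -trusted_cert -cert /tmp/verisign_root.cer -pwd WalletPwd1

The stored procedure then has to point at that wallet before making the request, typically with utl_http.set_wallet('file:/u01/app/oracle/wallet', 'WalletPwd1').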

Well there is good news and bad. The good news is that Oracle lets me go out to the Internet and download web content. The bad news is that the web site thought I was a bot (I am). So I could not get the actual content I desired. Now I might need to figure out how to trick this web site into believing I am just another web browser.
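I have not tried it yet, but the usual trick is to send a browser-looking User-Agent header along with the request. A rough sketch, reusing my placeholder wallet path and an example URL:

    DECLARE
      req  utl_http.req;
      resp utl_http.resp;
    BEGIN
      utl_http.set_wallet('file:/u01/app/oracle/wallet', 'WalletPwd1');
      req := utl_http.begin_request('https://www.example.com/');
      -- pretend to be a regular desktop browser
      utl_http.set_header(req, 'User-Agent',
          'Mozilla/5.0 (Windows NT 10.0; Win64; x64)');
      resp := utl_http.get_response(req);
      utl_http.end_response(resp);
    END;
    /

Whether that is enough to fool this particular site remains to be seen.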

Access Control Lists

I have a stored procedure that is trying to use utl_http to scrape a web page. Initially I was encountering an ORA-24247: network access error. Apparently the user running the procedure must be explicitly given access to the network before Oracle will let this web request go through.

So I create an ACL by calling the CREATE_ACL procedure. I actually need to name the XML file where the details will be stored. I also pass the user I will be running as in a parameter to the procedure. With that, the ACL has been created.

I also need to call the ASSIGN_ACL procedure. This is where I specify the host and ports. I am not choosing any specifics here. Just want to punch through the network using Oracle. Luckily I can use wildcards and NULL to be generic.
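Roughly what those two calls look like. The ACL file name and the SCRAPER user here are placeholders of mine:

    BEGIN
      dbms_network_acl_admin.create_acl(
        acl         => 'scrape_acl.xml',
        description => 'Let SCRAPER out to the network',
        principal   => 'SCRAPER',
        is_grant    => TRUE,
        privilege   => 'connect');

      dbms_network_acl_admin.assign_acl(
        acl        => 'scrape_acl.xml',
        host       => '*',      -- any host
        lower_port => NULL,     -- any port
        upper_port => NULL);

      COMMIT;
    END;
    /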

After crossing my fingers, I run my custom procedure to access a web page. Now I get an ORA-29024: Certificate validation failure. Does it really have to be this hard? The browsers make it look really easy to retrieve content on the web.

Package utl_http to the Rescue

I have a table full of data that I extracted out of an XML file. That was a good start. However the data is minimal. I need more info. Where can I get it? From the Internet. Of course. I have written scrapers before. But they were client programs. Now I want to do this from inside my Oracle 12c database. Can you say utl_http? That is the package to use.

Now I am going to be doing a bunch of scraping. Don't want the web site to detect that this is being done from code. So I have to use web proxies in the middle. Not a problem. I just call the set_proxy procedure. Someone has thought about this need before.

This is the typical web request/response business. So I have to formulate a request. Then I just read the response, line by line. It is a bit strange that when I read past the end of the HTML body, an exception is thrown. But I can catch that fine. This almost seems too easy.
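A minimal sketch of that request/response loop, assuming a placeholder proxy and URL rather than the real site I am scraping:

    DECLARE
      req  utl_http.req;
      resp utl_http.resp;
      line VARCHAR2(32767);
    BEGIN
      -- route through a web proxy in the middle
      utl_http.set_proxy('proxy.example.com:8080');

      req  := utl_http.begin_request('http://www.example.com/page.html');
      resp := utl_http.get_response(req);

      LOOP
        utl_http.read_line(resp, line, TRUE);  -- TRUE strips the trailing CRLF
        dbms_output.put_line(line);
      END LOOP;
    EXCEPTION
      WHEN utl_http.end_of_body THEN
        utl_http.end_response(resp);
    END;
    /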

When I run my code, I get an ORA-24247: network access error. What? Note that this is a fresh install of Oracle 12c enterprise edition. Apparently it comes out of the box with Internet access disabled. Darn. How can I punch through this limitation? I need to set up some ACLs. Give my user permission to go out to the web site through the ports I am using. Okay.

Have not figured the details out yet. I read a bit and got the feeling that it is a two step process. More on that when I get it working.

Using XML DB

All right. I got my Oracle 12c enterprise database installed. Now I got a big XML file I want to parse. Extract out the data. Stick everything in a database table or two. But how the heck do you do that? I am used to writing client/server programs that could do this. However this time around, I want to do it in the database.

Well I can declare an XMLType variable. Then for a quick and dirty, put the XML text in the constructor. Use the Extract function to dig down into a certain XPath in the data. Then use the ExtractValue function to get the fields. Iterate over the whole XML document. Insert data into a real table. Done. Right?
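In spirit, the quick and dirty version looked something like this. The XML shape, the field names, and the person_stage staging table are all invented here for illustration:

    DECLARE
      x     XMLTYPE := XMLTYPE('<people>' ||
                               '<person><id>1</id><name>Al</name></person>' ||
                               '<person><id>2</id><name>Bo</name></person>' ||
                               '</people>');
      v_cnt NUMBER;
    BEGIN
      -- count the <person> nodes with Extract
      SELECT COUNT(*)
        INTO v_cnt
        FROM TABLE(XMLSEQUENCE(x.EXTRACT('/people/person')));

      -- pull each field with ExtractValue and stick it in a table
      FOR i IN 1 .. v_cnt LOOP
        INSERT INTO person_stage (id, name)
        SELECT EXTRACTVALUE(x, '/people/person[' || i || ']/id'),
               EXTRACTVALUE(x, '/people/person[' || i || ']/name')
          FROM dual;
      END LOOP;

      COMMIT;
    END;
    /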

This hack worked for a small data set of three records. Happy with that success, I jammed 5k records into the XMLType variable. Ran the code. It bombed. Not even sure what the error was, because the whole sqlplus session blew up. Obviously I was doing something wrong here.

Okay. Let's declare a CLOB variable instead. Read the XML file into the CLOB. Then parse this XML. This feels better. It feels right. I can call selectNodes. That gives me a DOMNodeList. Iterating through that list gives me the data I want. No trouble throwing 5k records at that. I now have a table full of data.
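For reference, the DOM-based version, again with invented element names and the same made-up person_stage table. In real life the CLOB gets loaded from the XML file; here it is a tiny literal just so the sketch stands on its own:

    DECLARE
      c      CLOB := '<people><person><id>1</id><name>Al</name></person></people>';
      parser dbms_xmlparser.Parser;
      doc    dbms_xmldom.DOMDocument;
      nodes  dbms_xmldom.DOMNodeList;
      n      dbms_xmldom.DOMNode;
      v_id   VARCHAR2(100);
      v_name VARCHAR2(400);
    BEGIN
      parser := dbms_xmlparser.newParser;
      dbms_xmlparser.parseClob(parser, c);
      doc := dbms_xmlparser.getDocument(parser);

      -- selectNodes hands back a DOMNodeList to iterate over
      nodes := dbms_xslprocessor.selectNodes(dbms_xmldom.makeNode(doc), '/people/person');

      FOR i IN 0 .. dbms_xmldom.getLength(nodes) - 1 LOOP
        n := dbms_xmldom.item(nodes, i);
        dbms_xslprocessor.valueOf(n, 'id/text()',   v_id);
        dbms_xslprocessor.valueOf(n, 'name/text()', v_name);
        INSERT INTO person_stage (id, name) VALUES (v_id, v_name);
      END LOOP;

      dbms_xmldom.freeDocument(doc);
      dbms_xmlparser.freeParser(parser);
      COMMIT;
    END;
    /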

Next up, I need to work that data.

Password Problems

All the members of my old team had their own development schemas. We were given the schema password so we could make changes as we saw fit. These schemas were special in that they all had the same password. That was to make the life of the DBAs easier. This of course was a security violation. But hey. What are you going to do?

Furthermore, the passwords were shorter than the normally required length. These accounts were supposed to be set up to allow that. Recently my password expired for my schema account. This was not supposed to happen. Okay. So I changed the password and tried to reset it to the old one. Multiple problems popped up: password too short, and cannot reuse a recent password.

All right. Need to get around this. I just changed my password a lot of times to get around the recent password reuse limitation. However the password length issue required me to set up a new profile that had a NULL password_verify_function. This seems like a lot of work. But hey. You got to get stuff done. While I was at it, I reset the password for my old team lead so he could log in without further ado.
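What that boils down to, roughly, with placeholder profile, user, and password names:

    -- a profile with no complexity check, so short passwords pass
    CREATE PROFILE dev_legacy LIMIT
      PASSWORD_VERIFY_FUNCTION NULL
      PASSWORD_REUSE_TIME      UNLIMITED
      PASSWORD_REUSE_MAX       UNLIMITED;

    ALTER USER dev_schema PROFILE dev_legacy;
    ALTER USER dev_schema IDENTIFIED BY short1;

In hindsight, setting both reuse limits to UNLIMITED in the profile probably would have spared me from cycling through a pile of throwaway passwords.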

Big Numbers Cause Big Problems

I was tired this morning. Slept in an extra half hour. When I got in front of my computer, I started working on the task I had planned for today. Then my team lead called me up and said there were problems during deployment to test. Okay. We looked at the logs for guidance.

The database error was that the code was trying to fit a large number into a NUMBER(9) column. Seems simple enough. I said we were probably selecting a column that was bigger than precision 9. We looked at the code. Nope. Source columns were also NUMBER(9).

Then I said we must be getting our insert and select order mixed up. This was a large insert statement. We pulled the column names and the selected values into an Excel spreadsheet. Yep. Something got out of order in the select. Just a couple of columns out of order can cause some pain.
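A stripped-down illustration of the failure mode, with made-up table and column names:

    CREATE TABLE src  (emp_id NUMBER(9), salary NUMBER(12));
    CREATE TABLE dest (emp_id NUMBER(9), salary NUMBER(12));

    -- the select list is in the wrong order relative to the insert list
    INSERT INTO dest (emp_id, salary)
    SELECT salary, emp_id       -- swapped!
      FROM src;

This runs fine until a salary shows up that needs more than 9 digits, at which point Oracle raises ORA-01438 (value larger than specified precision allowed for this column).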

A manager wanted to know what was going on. Why did this work in one environment and not another? Well, it all depends on what type of data (and how big the numbers are) ends up in the columns that got switched.

Further analysis of the rest of the SQL showed some more SQL statements out of whack.

The Mystery Table

Yes I know. I don't post much here any more. But I work in a PostgreSQL database now. Thought I would recount some challenges I had in the DB today. Needed to make some last-minute changes in a function. This function dynamically inserts data into tables. Turns out it was using a mishmash of source columns as input. Needed to handle a couple of scenarios based on whether the primary input was NULL.

It was easy enough. I inspected the catalog to figure out the sets of data I needed to choose from. There is no NVL function in this DB. So instead I had to use CASE expressions. I coded up the changes, compiled them into the database, and executed my function. The good news is I got all the primary data I needed. Unfortunately the copy of production data I had did not exercise all paths.
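For reference, the NVL-to-CASE translation, with made-up column names:

    -- Oracle habit:  NVL(primary_input, fallback_input)
    -- PostgreSQL spelled out with CASE:
    SELECT CASE
             WHEN primary_input IS NULL THEN fallback_input
             ELSE primary_input
           END AS chosen_input
      FROM source_table;
    -- (COALESCE(primary_input, fallback_input) does the same job in fewer keystrokes)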

Okay. I could just clone some records in my own schema, modify them to simulate the second data set, and rock and roll. Unfortunately I don't have my own schema. I just share a schema with my whole team. We usually just make sure all objects are owned by a role. And we all have that role. That works until somebody creates an object and forgets to reassign ownership.
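What the convention looks like in practice, with placeholder names for the table and the shared role:

    -- create the object, then immediately hand it to the shared team role
    CREATE TABLE scratch_data (
      id      integer PRIMARY KEY,
      payload text
    );
    ALTER TABLE scratch_data OWNER TO team_role;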

Wouldn't you know it? Somebody created the table I needed in our common schema. But they did not share ownership. I could not even select rows from the table. I needed to modify data in that table. What was my solution? We were running out of time, so I created a similar table with another name. Then I modified my function to use this other table. Hack? Yes. But it worked to test out my logic.