I have a table full of data that I extracted from an XML file. That was a good start. However, the data is minimal. I need more info. Where can I get it? From the Internet, of course. I have written scrapers before, but they were client programs. Now I want to do this from inside my Oracle 12c database. Can you say utl_http? That is the package to use.
Now I am going to be doing a bunch of scraping. I don't want the web site to detect that this is being done from code, so I have to put web proxies in the middle. Not a problem. I just call the set_proxy procedure. Someone has thought about this need before.
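As a rough sketch of that proxy step: utl_http.set_proxy takes the proxy address and an optional list of domains to bypass. The proxy host and port below are placeholders I made up, not anything from my setup.

```plsql
BEGIN
  -- Hypothetical proxy address; substitute a real proxy host and port.
  -- The setting applies to later utl_http requests in this session.
  UTL_HTTP.SET_PROXY(
    proxy            => 'http://proxy.example.com:8080',
    no_proxy_domains => NULL);  -- no hosts bypass the proxy
END;
/
```

To rotate proxies, the same call can be repeated with a different address before the next request.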
This is the typical web request/response business. So I have to formulate a request. Then I just read the response, line by line. It is a bit strange that when I read past the end of the HTML body, an exception (utl_http.end_of_body) is raised. But I can catch that fine. This almost seems too easy.
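The request/response loop described above might look something like this. It is only a sketch: the URL and the User-Agent value are placeholders, and it assumes a client such as SQL*Plus with server output turned on so that DBMS_OUTPUT is visible.

```plsql
SET SERVEROUTPUT ON

DECLARE
  l_req   UTL_HTTP.REQ;
  l_resp  UTL_HTTP.RESP;
  l_line  VARCHAR2(32767);
BEGIN
  -- Placeholder URL; replace with the page to scrape.
  l_req := UTL_HTTP.BEGIN_REQUEST(
             url    => 'http://www.example.com/',
             method => 'GET');

  -- Placeholder header value, set only for illustration.
  UTL_HTTP.SET_HEADER(l_req, 'User-Agent', 'Mozilla/5.0');

  l_resp := UTL_HTTP.GET_RESPONSE(l_req);

  BEGIN
    LOOP
      -- TRUE strips the trailing CR/LF from each line.
      UTL_HTTP.READ_LINE(l_resp, l_line, TRUE);
      DBMS_OUTPUT.PUT_LINE(l_line);
    END LOOP;
  EXCEPTION
    WHEN UTL_HTTP.END_OF_BODY THEN
      -- Reading past the end of the body lands here; close the response.
      UTL_HTTP.END_RESPONSE(l_resp);
  END;
END;
/
```

A sturdier version would also close the response when some other exception is raised part way through the read loop.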
When I run my code, I get ORA-24247: network access denied by access control list (ACL). What? Note that this is a fresh install of Oracle 12c Enterprise Edition. Apparently it comes out of the box with outbound network access disabled. Darn. How can I punch through this limitation? I need to set up some ACLs that give my user permission to reach the web site on the ports I am using. Okay.
I have not figured out the details yet. I read a bit and got the feeling that it is a two-step process. More on that when I get it working.
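For reference, here is an untested sketch of what such a grant might look like in 12c, using DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE. The host name, port, and the SCRAPER user are all placeholders, and the block would have to be run by a suitably privileged account such as a DBA.

```plsql
BEGIN
  -- Hypothetical host, port, and user name; adjust all three.
  DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(
    host       => 'www.example.com',
    lower_port => 80,
    upper_port => 80,
    ace        => xs$ace_type(
                    privilege_list => xs$name_list('connect'),
                    principal_name => 'SCRAPER',
                    principal_type => xs_acl.ptype_db));
END;
/
```

The two-step impression may come from the older CREATE_ACL and ASSIGN_ACL procedures in the same package, which 12c still accepts but marks as deprecated.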
Reproducing a Race Condition
-
We have a job at work that runs every Wednesday night. All of a sudden, it
aborted in each of the last two weeks. This caused some critical data to be late. The
main ...