The Rise of NoSQL

Recently I read about the Cassandra project. It is a distributed database. The goal is to store a large amount of data. It uses the Google BigTable model (more on BigTable later). The databases stores key value pairs.

Cassandra was created by FaceBook. It is used by FaceBook, Twitter, and Digg. It is an open source project managed by Apache. The intended use is large web applications. There are other projects similar to Cassandra. They are part of the NoSQL movement. Are these databases poised to replace the relational ones like Oracle, DB2, and SQL Server? Let’s see what these NoSQL databases have to offer.

NoSQL in general are data stores that do not impose a fixed structure for the data. Access to the NoSQL databases tries to avoid joins. The technical term for these databases is structured storage. They have weak consistency. That means if you update a copy on your server, the update is not guaranteed to be propagated everywhere immediately. These type of databases also have very simple interfaces.

Let’s go over some other NoSQL databases. Hadoop is a project with many subprojects. One such subproject is MapReduce, which is a framework for large data set distributed processing. Then there is Google’s BigTable. It is a distributed storage system that can scale to a large size.

Next we have MemCacheDB. It is a distributed key value storage system. Despite the name, it is not for caching. It is a persistent storage mechanism that uses the memcache protocol to access a Berkeley database on the back end. Another NoSQL database is project Voldemort. It is a distributed database providing key value storage. LinkedIn uses it.

CouchDB is an Apache project. It is a document oriented database. You query it using MapReduce. It has a RESTful JSON API. Note that CouchDB is written in the Erlang functional programming language. A similar offering is MongoDB. The name is a play on humongous. It is also a document oriented database. This one was written in C++. It collects JSON documents and stored them in a binary BSON format.

I have covered a lot of NoSQL implementation. They mostly provided distributed storage. Although they scale very well for large data sets, their functionality is limited. They are not a replacement for relational databases. You still need RDBs for transaction processing. Nonetheless it is good to know a bit about the NoSQL movement, and the problems it tries to solve.

Edition Based Redefinition

Previously you could not compile any PL/SQL that was currently being executed. If you did, the result would be an ORA-04068 “Existing state of packages has been discarded”. This restriction is lifted in Oracle 11g rel 2. You can now perform an online upgrade of an application. In other words you can patch code while users are executing the old version. You can also do some schema changes while users access the old version. The goal of this new capability is to ensure that downtime to users is minimized.

Objects in general fall into two categories. They are either editionable or noneditionable. Editionable objects includes functions, procedures, package, and so on. They can be easily upgraded online. Noneditionable objects are things like database tables. Normally you cannot upgrade them online. However with a new object called an editioning view, you can even simulate changes to database tables while users are on the system running your apps.

An editioning view has specific limitations. You can only select from this view. Furthermore you can only reference one database table (and thus no joins are permitted). This view acts like a synonym in that it projects attributes from the underlying table. This view can have triggers. The trick is that this view can effectively hide columns from the user. This hiding operation is similar to dropping a column a table. The new kind of column drop is a virtual one. The column in the underlying table remains, but access to it has been shut off.

Here is the operational practice to use such editioning views. You first rename the underlying table. Then you create an editioning view that has the same name as the old table. You drop the triggers on the original table. Recreate the triggers on the new view. Revoke any privs on the base table. This is what we call a hot rollover. There are other similar features in Oracle 11g rel 2. Perhaps I will cover them in a future post.

SQL Developer

There is a new release of Oracle SQL Developer available. It is version 2.1. This has two new big features. They are PL/SQL unit testing and a Data Modeler/Viewer.

The PL/SQL unit testing part is a unit test framework. It allows you to share unit tests. It also let’s you run regression tests.

The Data Modeler/Viewer gives you read only access to a database schema. You can view the schema objects graphically. A benefit is the ability to browse the foreign keys between tables.

To tell the truth, there is even a version 2.1.1 patch that is available now. The big v2.1 release is 94M large for the install. It requires the Java Development Kit v1.6 update 11..

This app is still free of charge. You can still edit PL/SQL just like the older versions of the tool. And it still integrates with all kinds of source code control.