Died and Gone to an RDBMS

Like many programmers, I find myself working in a mixed environment. I use "state of the art" object oriented programs in conjunction with "legacy" relational databases. The challenge in a mixed environment is to find the best ways to leverage the strengths of both technologies.

The first thing to realize is that the two theories have different design philosophies, and different strengths. In many ways, the goals of relational theory and object technology are diametrically opposed. The goal of relational theory is to separate the data from the processes that create data. Object technology ties data and process together into a single object. 

A DBA would describe a program simply as a means to gather precious data. The object oriented architect would say that all access to the data should be controlled by the object, and that any attempt to access the raw data directly is a form of heresy that violates the most basic tenet of the theory: encapsulation.

The database modeler celebrates the data, while the object technologist denounces the existence of data. The two camps do not live in harmony. But the two paradigms bring tremendous strengths to the table, and the best system architects will learn how to tie the two competing technologies together.

The Problems of Persistence

The object oriented paradigm breaks large complex systems into self-contained components called objects. Since the objects are self-contained, you can build complex systems from these objects without having to worry about unanticipated dependencies between the components. The object oriented architect can quickly stack objects together, like Legos, and build extremely complex, yet robust systems.

The key to the success of this technique is self-containment. If different components of a program sneak in and change the data at will, then you are likely to develop strange dependencies between the different components. These dependencies are unpredictable, and will not only lead to confusion, but could result in major catastrophes.
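A minimal Python sketch of what self-containment can look like in practice. The Account class and its methods are hypothetical names chosen for illustration, not taken from any particular system. The point is only that the balance changes through the object's own methods, so no other component can quietly put it into an invalid state.

```python
class Account:
    """Hypothetical example object: the balance changes only through methods."""

    def __init__(self, opening_balance=0):
        # Leading underscore: internal state by Python convention.
        self._balance = opening_balance

    @property
    def balance(self):
        # Read-only view; there is no setter, so assignment raises AttributeError.
        return self._balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def withdraw(self, amount):
        if amount <= 0:
            raise ValueError("withdrawal must be positive")
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount


account = Account(100)
account.deposit(50)
account.withdraw(30)
print(account.balance)  # 120
```

Python enforces this by convention rather than by the compiler, so the sketch shows the discipline more than a guarantee.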

Since the object architect seeks self-containment, saving data in a database is considered anathema. It is a violation of the most sacred doctrine of the programmer's covenant with the object. It violates encapsulation. However, computer users, who have not subscribed to the OO doctrines, have a really nasty habit of demanding the ability to save their work (go figure). Object-oriented programmers have labeled this annoying desire of computer users to save their work as "the problem of persistence."

Relational Theory

Relational theory, however, is not about building programs, but about well understood mathematical relations between data sets. Relational theory is code reuse at its finest. A good relational database package will use a core set of programs to manipulate any of the tables within the system. Many relational programs use a standardized query language called SQL. With SQL, you can retrieve, insert, update and delete data from any table in the database.
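As a rough illustration of the four basic SQL operations, here is a short sketch using Python's standard sqlite3 module with an in-memory database. The customer table and its columns are made up for the example; the same four statements would work against any other table.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

# Insert two rows.
conn.execute("INSERT INTO customer (id, name) VALUES (?, ?)", (1, "Acme"))
conn.execute("INSERT INTO customer (id, name) VALUES (?, ?)", (2, "Globex"))

# Update one row and delete the other.
conn.execute("UPDATE customer SET name = ? WHERE id = ?", ("Acme Corp", 1))
conn.execute("DELETE FROM customer WHERE id = ?", (2,))

# Retrieve what is left.
rows = conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
print(rows)  # [(1, 'Acme Corp')]
```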

While the process of creating the tables in a relational database is simple, the process of populating the tables can be quite difficult. An accounting system might have several hundred different tables. Writing a program that puts the right data in the right tables at the right time is hard work. SQL makes it extremely easy to modify the data. It also makes it extremely easy to totally hose a few years' worth of work.

Relational technology can also be inefficient. In a relational database, you normalize the data. To normalize data, you split everything up into flat tables. If you have an invoice with line items, you must store the invoice information in one flat table, and the line items in another flat table. To retrieve the entire invoice, the program needs to read the invoice information from the invoice table, look up the invoice number in an index, then read each of the line items from the line item table. There are times, however, when this extra chopping up and recombining of the data does little more than introduce possible errors and churn up processing time on the server.
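The invoice example can be sketched with Python's sqlite3 module. The table names, column names, and the load_invoice helper are all assumptions made for illustration. The sketch shows the two-step read described above: one lookup for the invoice header, then a second for its line items, recombined in the program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (
        invoice_no INTEGER PRIMARY KEY,
        customer   TEXT NOT NULL
    );
    CREATE TABLE line_item (
        invoice_no INTEGER NOT NULL REFERENCES invoice(invoice_no),
        line_no    INTEGER NOT NULL,
        item       TEXT NOT NULL,
        amount     INTEGER NOT NULL,  -- whole cents, to avoid float rounding
        PRIMARY KEY (invoice_no, line_no)
    );
    INSERT INTO invoice VALUES (1001, 'Acme');
    INSERT INTO line_item VALUES (1001, 1, 'widget', 1250);
    INSERT INTO line_item VALUES (1001, 2, 'gasket', 399);
""")


def load_invoice(conn, invoice_no):
    """Reassemble one invoice from its two flat tables (hypothetical helper)."""
    header = conn.execute(
        "SELECT invoice_no, customer FROM invoice WHERE invoice_no = ?",
        (invoice_no,),
    ).fetchone()
    if header is None:
        return None
    lines = conn.execute(
        "SELECT line_no, item, amount FROM line_item "
        "WHERE invoice_no = ? ORDER BY line_no",
        (invoice_no,),
    ).fetchall()
    return {"invoice_no": header[0], "customer": header[1], "lines": lines}


invoice = load_invoice(conn, 1001)
total_cents = sum(amount for _, _, amount in invoice["lines"])
print(invoice["customer"], total_cents)  # Acme 1649
```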

The Object Relational Layer

So, we find ourselves in a strange world. Object technology is perhaps the best tool for developing programs, and the relational database is one of the best tools for storing data. How can we use both tools to their fullest?

Although it is tempting to simply embed database calls in objects as needed, I find that the gaps between the two paradigms are so great that it is generally best to take a serious look at the logical foundations of the two methods and find the correct area to build a bridge between the two technologies.

The Life and Death of an Object

As mentioned earlier, all objects should be self-contained. They should have a defined start, include methods to control the manipulation of the object, and have a defined end. That's right, the object programmer should define both the beginning and end of the object. This process of designing each object from start to finish is called life cycle analysis.
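One possible reading of a defined start, controlled manipulation, and a defined end, sketched in Python. The Order class, its close method, and the order_item table are hypothetical. The sketch assumes the object writes its final state to the database at the end of its life and refuses further changes after that.

```python
import sqlite3


class Order:
    """Hypothetical object with an explicit start and end to its life."""

    def __init__(self, order_no):
        # Defined start: the object begins empty and open.
        self.order_no = order_no
        self._items = []
        self._closed = False

    def add_item(self, name, quantity):
        # Controlled manipulation: changes are allowed only while the object lives.
        if self._closed:
            raise RuntimeError("order is closed")
        self._items.append((name, quantity))

    def close(self, conn):
        # Defined end: write the final state in one transaction, then stop.
        if self._closed:
            raise RuntimeError("order is already closed")
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO order_item (order_no, name, quantity) VALUES (?, ?, ?)",
                [(self.order_no, name, qty) for name, qty in self._items],
            )
        self._closed = True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_item (order_no INTEGER, name TEXT, quantity INTEGER)")

order = Order(7)
order.add_item("widget", 2)
order.add_item("gasket", 5)
order.close(conn)

saved = conn.execute(
    "SELECT name, quantity FROM order_item WHERE order_no = ? ORDER BY rowid", (7,)
).fetchall()
print(saved)  # [('widget', 2), ('gasket', 5)]
```

Here the database is touched only once, at the end of the object's life, rather than from every method.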

-- to be completed ---



Both relational and object technologies have strengths and weaknesses. Object technology is stronger in modeling complex processes, while relational theory is strongest at storing raw data. The challenge in a mixed environment is to leverage these strengths. The end of an object's life cycle provides a natural boundary between the object and relational worlds.

I hope I have provided some insight into how to integrate object oriented designs with relational technology. Most importantly, I hope I have provided some insight into the deepest metaphysical question of all: "Where do objects go when they die?" The answer, of course, is: "to that big relational database in the sky."