Monday, January 06, 2014

Encapsulation of Database Design

It seems to me like every software developer goes through a phase where they think stored procedures are bad and a phase where they think they are good.  Like me, some alternate through these positions several times before settling on a position that works.

In the end, I discovered that there was never a dichotomy in the first place.  The real question was not whether to use stored procedures but how to use them.  The struggle over whether or not to use stored procedures was never really about stored procedures.  It was always about properly separating concerns and encapsulating designs.

In the Beginning...

I started off loving the idea of stored procedures.  I loved them based on a single argument that, at the time, was possibly relevant: performance.  That pernicious specter that looms over our industry, the myth of proactive optimization, cast its spell over me and I was totally sold.
In the nineties, it might even have been true that there was a compelling performance-based reason to write a stored procedure.  In general, for modern database systems, that is no longer the case.

Flip

As with many people in the pro-stored-procedure camp, I ran into one of the many potential failings of heavy stored-procedure use.  Putting behavior in a stored procedure because it is "faster" or "more appropriate" often drives bad design decisions.

The more logic I added to a database, the harder it was to test.  This was before I stumbled into the ideas in my book but it would still be true, at least to some extent, today.  Also, the fatter I made my database layer, the thinner I made the other layers.  The coup de grace was the fact that, on a regular basis, large stored procedures were actually slower than performing dumb selects and processing the data into meaningful results in a middle-tier service.

Essentially, every time I transferred business logic from somewhere more appropriate to a database, I made my system harder to test, less flexible, and slower.  So I did what humans often do and switch from one extreme to its opposite: no stored procedures of any kind.

Flop

The problem with toggling between extremes is that it reduces the world to binary decisions.  Of course it is true that, in any decision, one is considering a finite set of options which can be reduced to a series of binary decisions.  There's nothing wrong with choosing between two things but there is something wrong with limiting your options early in the decision-making process.

The "no stored procedures" stance didn't work out very well either, although it does work better than "everything in the database" way of working.  Instead of a tightly-wound, change resistant machine that served as the center of an application's universe and ground change to a halt, I was building delicate arrangements of glass work that shattered at the first hint of change which served as the center of an application's universe and ground change to a halt.

The problem with not having any stored procedures is that something has to have an intimate relationship with the structure of a database.  When the stored procedures go away, that tends to become whatever is connected to the database (usually a middle-tier service).

That kind of intricate understanding of implementation details between one module and another is inherently more difficult to enforce that such an understanding residing exclusively within a module.  It drives the cost of change through the roof.

The Other Way

As is often the case, the artificial binary decision was hiding the true nature of the decision in the first place.  The question was never "should stored procedures be used or not?"  The right question to ask is "for what should we use stored procedures?"  Note the "yes" and "no" answers to the former question map to answers in the latter but there are many other potential answers as well.

I don't want to iterate through all the possible ways stored procedures could be used in a database design; mostly because that list is probably infinite in length.  So I'll just cut straight to the one that I think is right: encapsulation.

A while back, someone I was working with, Christopher Taylor, suggested to me that stored procedures were the equivalent of a class's interface for a database.  They are a contract between a database and its clients.  So long as the contract is fulfilled, you are free to revise the implementation details to your heart's desire.  He didn't use these words and I don't recall the exact words he used but it the point is that it was he who turned me on to this idea.

Database Designs Require Encapsulation

Probably the most sapient portion of the observation had little to do with stored procedures, specifically.  Stored procedures are merely a tool we use in recognizing that databases need encapsulation of their designs just the same way that any other object would.

Depending on the features available in your chosen database platform, you could conceivably create encapsulation other ways.  For instance, some platforms allow updateable views.  So long as you could keep satisfying the contract specified by an earlier set of exposed tables, you can change the design of your database behind that wall of encapsulation.

As Scott Bain says, encapsulation is the first principle of design.  It should come as no surprise that this is true for database designs.