In a previous post, I discussed the shape of the class itself, arguing that it should be organized around actual released versions plus the thing you intend to deploy next. This is not revolutionary. It's not ubiquitous but I'm certainly not the only one to suggest that this is how databases should be maintained.
There are several ways to manage how versions are tracked when an instance of a database has been deployed. One way that I've seen done is to keep a text file with a list of the scripts that have been executed. Another way is to fill a directory with scripts that have been executed, with each script living in its own file.
The way that I think works best and that fits most naturally within the concept of a class of databases is to store the state of a database instance within the instance itself. The alignment is so strong, in fact, that it makes the other methods seem bizarre and beyond the realm of consideration.
Dependency
From a practical perspective, storing a database's version information inside the database only makes sense. You don't want to have something outside a database that is required to run or maintain it. What if that thing gets lost?the puzzle stops working when it has a missing piece |
In most environments, the rigorous database maintenance procedures ensure that losing data stored inside a database is many, many, many times less likely than losing a text file or a folder full of scripts. The main way that you might lose said data is if you somehow lost the database itself, along with all its backups, in which case you would probably not care if you also lost the version data for that database.
Management
Another matter of pragmatism is managing databases is that of managing the dependency between them and their version-tracking data. If those data are external to a database instance, then you have to know where they are stored in addition to how to connect to a database.
The most dreadful consequence of this management hassle is that you might accidentally upgrade one database based on the state of another database. Depending on what precautions you take, that could have consequences ranging from a few minutes of aggravation to loss of valuable data.
The most dreadful consequence of this management hassle is that you might accidentally upgrade one database based on the state of another database. Depending on what precautions you take, that could have consequences ranging from a few minutes of aggravation to loss of valuable data.
do you want to feel like this guy? |
Moreover, there is a persistent and non-trivial management cost. Everything you do with a database - every single instance you create - also has to have this manifest of its upgrades stored somewhere else and the association between the two must be tracked for as long as the database lives.
What about all those little databases that only last a few seconds or a few minutes? As your test suite is executing? What about all the development databases? It's not like tracking each of these relationships is or expensive but all those little tasks accumulate to produce an awkward development environment fraught with menial tasks that draw your attention from the things that matter.
Principle
this is not a house not yet |
The wholeness of an object is what grants it its identity; what elevates it from being a mere blob of data that can be accessed by a program into an object. Tracking the upgrade state of a database in said database is part of keeping it whole.