Monday, December 30, 2013

Test-Driven Architecture Article Accepted by InformIT

As you may recall, some friends of mine and I are working on articles and blog entries intended to reform architecture for the modern age.  This is a quick note that the article Mike Brown and I wrote regarding the impact of TDD on architecture has been submitted to the publisher and is in the editorial phase.

I'll post something here when it is published.

Friday, December 27, 2013

The Version Manager

One of the more recent phases in my journey to a class of databases was understanding that the best way to codify a database's design was as the series of revisions that could get you there rather than in terms of specific design elements.  It's harder to find fault in this way of thinking because I still use and recommend it today but it was incomplete; sufficient to enable some level of sustainable database development but not true test-driven database development.

Let me refresh your memory on the building technique to which I am referring.  You store an ordered list of scripts that are used to build a database.  You build some infrastructure that ensures only the right scripts are executed against any given database and in the right order.  So, if you have a version 2 database instance and you want to upgrade it to version 5, the infrastructure will execute the steps to upgrade to version 3, then 4, then 5.
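As a rough illustration of that infrastructure, here is a minimal sketch assuming SQLite and Python's built-in sqlite3 module.  The delta scripts, the table names, and the choice of SQLite's `PRAGMA user_version` as the place to record the version are all assumptions made for the example, not details taken from the book.

```python
import sqlite3

# Hypothetical delta scripts: DELTAS[n] upgrades a database from version n - 1 to n.
DELTAS = {
    1: "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE customers ADD COLUMN email TEXT",
    3: "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)",
    4: "ALTER TABLE orders ADD COLUMN total_cents INTEGER",
    5: "CREATE INDEX idx_orders_customer ON orders (customer_id)",
}


def current_version(conn):
    # PRAGMA user_version is an integer SQLite keeps in the database file;
    # it is 0 for a database that has never been built.
    return conn.execute("PRAGMA user_version").fetchone()[0]


def upgrade(conn, target):
    """Apply each missing delta, in order, and return the versions applied."""
    applied = []
    for version in range(current_version(conn) + 1, target + 1):
        conn.execute(DELTAS[version])
        conn.execute(f"PRAGMA user_version = {version}")
        applied.append(version)
    conn.commit()
    return applied


conn = sqlite3.connect(":memory:")
upgrade(conn, 2)          # build a version 2 instance
print(upgrade(conn, 5))   # applies 3, then 4, then 5 -> [3, 4, 5]
```

Asking for a version the database already has applies nothing, which is the "only the right scripts" half of the requirement.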

Digging the nugget of truth out of this way of doing things is weird because the thing is its own nugget of truth and, at the same time, is not sufficient to support a modern test-driven environment.  The reason for this weirdness is that version-based builds are the best way I know to do things but they aren't enough.  More was required.

Anyway, the grain of truth in this way of thinking was the recognition that the actual revisions applied to real production databases should govern how we organize database build scripts.  That is, while I had not yet discovered the true class of database concept - at least not as I understand it today - I had discovered one of the principles that drives test-driven database development: the historical reality of important database instances must be respected and always trumps our wishes, hopes, and ideals.

Monday, December 23, 2013

Knowledge, Behavior, and Information

I mention this in my book but I thought I might elaborate.  I think a useful way to conceptually divide the parts of a database is into three groups of design elements: information, knowledge, and behavior.

The Three Concepts

Information and knowledge are often confused in everyday language.  So, first, I'll disambiguate those two words.  Not all data are information or knowledge and rarely are the two interchangeable.

Information is a special subclass of data.  I'm sure someone who is an expert in communication theory or some other kind of academic would be glad to correct me and I won't fight them on what the technical definition of the word is.  I'm only interested in what the useful definition for my own purposes is, not the officially right definition.  For the purposes of this blog entry, and of everything I say and write, it is the part of a signal that the recipient did not know in advance.  Simply put, information is data which informs its recipient.

If information could be thought of as facts in transit, then knowledge would be facts at rest.  In essence, knowledge is potential information but it is also a potential driver for action.  That is, the two uses of any given object's knowledge are to inform other objects, thereby adding to their knowledge, and to inform decisions, thereby improving the value of an action taken.

That latter purpose is the perfect opening to briefly introduce the third player in the database design world: behavior.  If knowledge and information are facts at rest and in motion, behavior can be thought of as how something responds to knowledge or information.  For instance, you drive on the correct side of the road because you know you will slam into something if you don't.  Likewise, you yank your hand away from a too-recently poured cup of coffee because you have become informed that the cup is too hot to touch without damaging tissue.

Information as Pertains to Databases

In the database world, information is the set of signals sent or received by a database.  A query and its parameters, the invocation of a stored procedure, the results set, and any errors that occurred are all examples of information as a database sees it.

In essence, information is the "surface" of a database's design.  It is impossible for external parties to access the value of a database except by sending and receiving signals.

Moreover, it is the means by which value is conveyed between a database and its clients.  It is pointless to update a database with information it already knows.  It is useless to query a database for what you already know.  Value is created only by actions that result in one of the entities "learning" something.

Knowledge as Pertains to Databases

Knowledge is the reason why databases cannot be maintained using the simple "blast and rebuild" upgrade path we apply to most software deployment problems.  All the facts stored in a database are knowledge; not all the data, because you can introduce noise into a database's design, but all the facts.

Knowledge is the purpose of a database.  Most software products and components exist to convey facts between parties or to process data and discover new facts.  Some software exists to entertain.  Databases exist to preserve knowledge.  Each production database is a modern day Library of Alexandria, complete with the ability for some asshole to burn it down and, in so doing, to cause irreversible damage.

We have known this for a long time - as close to "forever" as matters.  Databases have always been designed around the knowledge they capture and preserve.  Those design decisions stand as a reflection of our implicit understanding that databases aren't merely data bases, but knowledge bases.

Behavior as Pertains to Databases

So what is the role of behavior in database design?  It's another one of those things that can be put simply or drawn crisply, but can take a lot of work to implement correctly.  The role of behavior in a database is to mediate between knowledge and information.

All the information that a database receives needs to be translated into knowledge and stored for safekeeping.  Why?  So that, later, that knowledge can be translated back into information to help other actors make decisions or discoveries.

You can codify the behavior of a database in many different ways.  At the time of this entry, the most common way is to couple the behavior offered by a database directly to the kinds of knowledge that database can store.  This is accomplished by creating, publicly exposing, and coupling clients to table structures and relationships.

The Relationship Between the Three

I find it useful to divide database design into three parts.  The information layer of design is where the interactions between databases and other objects are defined.  The knowledge layer of design is where the facts you want to store in a database are housed.  The behavior layer is where one codifies the manner in which facts are absorbed or emitted.

[Image: behavior translates between knowledge and information]

I'll post more on each of the specific layers of design with some implementation recommendations later.

Saturday, December 21, 2013

Division of Ideas

We choose how we organize ideas in our heads.  Take the concept of a wrench.  For me, there are two kinds of wrenches: regular wrenches and monkey wrenches.  That distinction exists because one kind of wrench is useful for binding while the other is good for bashing.  For a mechanic, however, I'm sure there are many kinds of wrenches.

I'm not sure either of us is wrong.  Now, I want to be clear: I am not saying that everyone who believes anything is correct because it's what they believe.  I am saying that someone who looks at a room full of people and sees two groups is every bit as right as someone standing right next to him who sees three groups in that there is no "right" way to divide things into groups, only "useful" ways to do so.

This may be obvious to others but it wasn't obvious to me.  I'm documenting it here as much to ensure I'll have the discovery available for myself when I forget it later as to share it with you.

Friday, December 20, 2013

The Database Installer

Another fallacious idea I, and many others, had was to treat database instances like programs that need to be installed.  Again, there are many things wrong with this line of reasoning, but something positive came from it.

You know me... "Mr. Positive."

This particular step in my journey to a class of databases bore what was, at least for me, a pretty subtle value.  Part of the subtlety came from the fact that the installer paradigm looks like it works for longer than a lot of its predecessors, which tended to break down extremely early.  Part of it was my own stubbornness - I was spending so much energy arguing the small improvement that I couldn't see the bigger improvements waiting just around the corner.

I'm pretty sure that's irony: that this way of thinking was so successful kept me from seeing other, more successful, ways of understanding a problem.  Wait.  Maybe that's not irony.  Maybe that's the human condition.  ...or maybe those things do not really oppose one another.

Anyway, the grain of truth in this way of imagining database build technologies is that it recognizes the importance of discrete, tracked, testable deltas in design and highly controlled, repeatable ways of introducing those changes.  That ends up being a pretty fundamental concept.  It serves as the basis for building a testable class of databases, enabling test-driven database development, and ultimately unlocking database agility.

So there you have it.  Another step in the journey.  Another failure to hit the mark.  Another lesson that built to what we know today.

Thursday, December 19, 2013

Next Step: Test-Driven Architecture

Recently, I published an article on InformIT about how we should focus our software-architectural efforts.  I've been blogging about it a little bit since.

As with everything else in the software development industry, architecture can be made better by Test-Driven Development.  So Mike Brown and I have set out to document our ideas on that subject.  Originally, that was going to be a single article on InformIT.

That's not going to cut it.  As one might have guessed (I didn't but, now, I can imagine someone else might), the topic is too rich to fit into a single 3,000 word article.  So, with luck, it will turn into a series of articles written by my friend and myself (and possibly others) that help this industry take another step toward being truly test-driven.

Wednesday, December 18, 2013

The Database Design Applicator

Another step in my journey to a class of databases was believing that a design tool could properly maintain all my database instances for me.  Again, I want to try and dig the nugget of truth out of this belief rather than beat it to death for what is wrong about it.

I think that, in this case, the nugget of truth is actually pretty self-evident.  The idea is to have a document that specifies the current design of a database and a tool that can update any database to have that design.  If you scrape away the part of that sentence that is obviously magical thinking, you are left with this:

"The idea is to have a document that specifies the current design of a database."

The part that is not quite as readily apparent is that the motivation runs just a little bit deeper than that.  The real drive behind a tool that can update any database to a well-understood definition of the most recent design is that you have a document that codifies the current design and a tool that enforces the current design.

That's a noble goal, actually, and an attainable one at that.  Think about it...

Disregarding all the ancient ideas, don't we still have a way of doing that when defining a "regular" object model?  We just do it a different way.  Instead of technical specifications and UML diagrams, we have unit tests.  Instead of a tool that generates code from high-level specifications, we have test runners.

Given a class of databases, using unit tests as the primary specification of design is a discipline that transfers straight over to database development, practically unmodified.

So that's the positive takeaway from the era of magical design-application tool mysticism: that we do in fact need a way to specify and enforce the current design of a class of databases and that, as with Java, .NET, or C++ classes, the way to do so should be automated unit tests that specify how a database behaves and enforce that specification on a regular basis.
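To make that concrete, here is a small sketch of unit tests acting as the specification for a database's behavior.  It assumes SQLite and Python's unittest module; `build_database` is a hypothetical stand-in for whatever builds an instance of a class of databases, and the `customers` table is invented for the example.

```python
import sqlite3
import unittest


def build_database():
    """Hypothetical stand-in for a class of databases: builds a fresh instance."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    return conn


class CustomerStorageSpec(unittest.TestCase):
    """Each test states one thing the database is required to do."""

    def setUp(self):
        self.conn = build_database()

    def tearDown(self):
        self.conn.close()

    def test_remembers_a_customer(self):
        self.conn.execute("INSERT INTO customers (name) VALUES (?)", ("Ada",))
        names = [row[0] for row in self.conn.execute("SELECT name FROM customers")]
        self.assertEqual(names, ["Ada"])

    def test_rejects_a_customer_without_a_name(self):
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute("INSERT INTO customers (name) VALUES (NULL)")
```

Running the suite with `python -m unittest` on every build is what turns the specification into enforcement: a design change that breaks either promise fails immediately.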

Tuesday, December 17, 2013

Wrangling Wild Databases

I've received some interesting questions lately.  One question at my most recent talk for the DAMA group in Portland rang very familiar.  It got me thinking.

There are a lot of database instances out there that were developed before my book was published.  Even if everyone adopted the techniques therein, that would still leave trillions of rows of data in databases that were not in accordance with the book.  Among other things, these databases are generally created with insufficient automated test coverage so, in the spirit of Working Effectively with Legacy Code, I call such databases "legacy databases."

I cover legacy databases a little bit in the book but there is a limited number of scenarios.  This blog seems like a natural place to start addressing other issues.

The Scenario

Imagine the following:

You work for a company that sells an enterprise product; we'll call that product "Calm Cheddar."  As with most enterprise software products, Calm Cheddar has a database back end.  Calm Cheddar has been successful in several applicable markets and has been sold to numerous customers over the course of several years.  In that same time, it has grown as a product.  Along with the overall design of Calm Cheddar, the design of its database has grown and morphed over time.
[Image: manual deployment and a broad customer base lead to deviation in database builds]

The final result is that there is an array of customers with varying versions of the software and each with a different path to their current deployment.  The saving grace is that Calm Cheddar's customers tend to upgrade to the latest version.  They may or may not upgrade often, but they never upgrade to something that is already outdated.

Now let's say you want to start emerging a class of databases in this scenario.  What I've shown in the past and what I teach in the first two thirds of my book do not cover scenarios like this.  This specific scenario is not covered anywhere and I imagine that there are several people in a similar situation.

The Ideal and the Real

The ideal class of databases is expressed as a linear sequence of versions created using a linear series of revisions.  The class of databases is simple, knows how to perform any reasonable upgrade, and is very robust.

Circumstances, however, are rarely ideal.
[Image: just because you want it, doesn't make it so]

There are at least two lessons from Test-Driven Database Development: Unlocking Agility that apply to this scenario.  First: drive variation out of your database build process as much as you possibly can.  Second: above all else, make your database build process reflect the actual transitions that have really been applied to real databases.

These two forces might appear to be contradictory but, in fact, they align perfectly.  You should drive all variation out of your database development process, yes, but that does not mean you will succeed or will have started autonomating your database build mechanism at the very beginning of your very first database's life.  That you should drive all variation from a system does not mean that none will ever be there.

The Reckoning

The key to reconciling these two forces - the impulse to minimize variation in database build paths and the need to recognize the true database upgrade steps that have actually occurred - with the reality that there is a large, diverse population of databases within a given class is in understanding that you must merge these disparate paths together.  What was once a vast array of trickling creeks should, over time, be coalesced into a single coursing river of features.
[Image: there can be only one]

Two techniques must be applied to resolve any differences between the various deployment paths.  One is taming a legacy database.  The other is remediating deviations.

Without getting into the details, the former amounts to creating a new class of databases that has conditional build logic in its very first version to address the potential of a database that has been built prior to the class's creation.  The latter consists of documenting variations in a database's historical construction patterns, then using transition tests to drive conditional logic that reconciles the various "flavors" of a database design.

Sometimes this will take a great deal of time.  Imagine that eighty percent of your database instances are almost exactly alike, fifteen percent fall into a few other distinct categories, and the remainder are "lone wolves" with highly deviant paths.

In such a case, you might want to phase your database wrangling activities, starting with the large body of highly similar databases, moving on to the smaller groups second, and then picking off the lone wolf types on an "as-needed" basis.

This technique works great in a relatively controlled environment where, among other things, all of the databases are roughly the same version.  They don't need to all have exactly the same design to start but it does make things a lot easier if they have approximately the same design.
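The idea of using transition tests to drive conditional reconciliation logic can be sketched roughly as follows.  This assumes SQLite and Python's sqlite3 module; the two "flavors," the `customers` table, and the `remediate` function are hypothetical examples rather than anything prescribed by the book.

```python
import sqlite3


def column_names(conn, table):
    # PRAGMA table_info yields one row per column; the name is the second field.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def remediate(conn):
    """Hypothetical conditional step: reconcile the flavor that never got an email column."""
    if "email" not in column_names(conn, "customers"):
        conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")


def make_flavor(ddl):
    """Model one historical starting point, including a row of pre-existing data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.execute("INSERT INTO customers (name) VALUES ('Ada')")
    return conn


# One entry per documented variation in how the database was historically built.
FLAVORS = [
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)",
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",  # hand-built deviation
]

# Transition tests: every known flavor must converge on one design and keep its data.
for ddl in FLAVORS:
    conn = make_flavor(ddl)
    remediate(conn)
    assert column_names(conn, "customers") == ["id", "name", "email"]
    assert conn.execute("SELECT name FROM customers").fetchall() == [("Ada",)]
```

Each newly discovered flavor becomes another entry in the list and, if the assertions fail for it, another branch in the remediation step.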

In the Calm Cheddar scenario, however, we don't have that luxury.  Remember: we have variation in both the version and the manner of construction.

The Conditioner

A friend of mine, Seth McCarthy, has also come up with an interesting twist on this way of doing things; one that, I think, addresses the extra kind of variation.

He suggested adding a proxy over the simple, linear-step-oriented database builder at the heart of a class of databases.  This proxy's job is twofold.  First and foremost, it detects the version of a database's design then conditions it to look as though the infrastructure for a class of databases has been used to build it.  Typically, that means creating and populating some kind of version registry table.

Naturally, if acting on an empty database, the proxy does nothing but delegate to the core database builder.

Also, the conditioner code is in a position to perform conditional transformations before delegating to the core class of databases.  It can even inject custom transitions in between steps if necessary.
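A minimal sketch of such a conditioner, assuming SQLite and Python's sqlite3 module, might look like the following.  The `version_registry` table, the delta scripts, and the fingerprinting rules in `detect_legacy_version` are all hypothetical; a real conditioner would need rules for each flavor actually found in the field, and this sketch does not show injecting custom transitions between steps.

```python
import sqlite3

# Hypothetical core builder: DELTAS[n] moves a database from version n - 1 to n.
DELTAS = {
    1: "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE customers ADD COLUMN email TEXT",
    3: "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)",
}


def table_exists(conn, name):
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def core_upgrade(conn, target):
    """The simple linear builder; it trusts the version registry completely."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS version_registry (version INTEGER NOT NULL)"
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM version_registry"
    ).fetchone()[0]
    for version in range(current + 1, target + 1):
        conn.execute(DELTAS[version])
        conn.execute("INSERT INTO version_registry (version) VALUES (?)", (version,))
    conn.commit()


def detect_legacy_version(conn):
    """Assumed fingerprinting rules for databases built without the registry."""
    if not table_exists(conn, "customers"):
        return 0
    columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
    if "email" not in columns:
        return 1
    return 3 if table_exists(conn, "orders") else 2


def conditioned_upgrade(conn, target):
    """Proxy: make a hand-built database look class-built, then delegate."""
    if not table_exists(conn, "version_registry"):
        detected = detect_legacy_version(conn)
        if detected > 0:
            conn.execute("CREATE TABLE version_registry (version INTEGER NOT NULL)")
            conn.executemany(
                "INSERT INTO version_registry (version) VALUES (?)",
                [(v,) for v in range(1, detected + 1)],
            )
    core_upgrade(conn, target)


legacy = sqlite3.connect(":memory:")
legacy.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")  # hand-built v1
conditioned_upgrade(legacy, 3)
print(table_exists(legacy, "orders"))  # True
```

On an empty database the proxy detects version 0 and simply delegates, so the core builder stays unaware that legacy instances exist.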

The conditioner represents almost the exact opposite of the linear sequence of database upgrade steps I ordinarily recommend but special circumstances demand special responses.  Ordinarily, one would be adding a new version to a class of databases on a very regular basis and one would have to manage an ever-growing number of possible historical versions as a starting point.  There should be very little variation between one instance and another of any given version.  In that case, having special transformations that go from each version to each other version adds complexity and work.

In this scenario, however, the conditioner proxy exists in the exact opposite context.  There's only one target version with which the proxy is principally concerned.  There are many source versions and there is the potential for some amount of variation between two instances of any given historical version.  So having a special place to handle one or more of the special cases makes perfect sense.

So There You Have It

At a very high level, this should serve as a strategy one could apply to the problem of handling a large, diverse population of databases in various historical states and previously managed with an at least somewhat unreliable process such as being built by hand.

The strategy can be stated simply, though it is not always easy to do.  Capture the historical versions of your class of databases.  Codify those versions, either as conditional logic in the initial version of a simple linear sequence of modification scripts or as a proxy to a similarly sustainable database class format.  Drive each conditional behavior from transition tests that model the variant starting points.  Force all of the variation out of existence in a controlled way.  If you have too much variation to handle all at once, ingest smaller segments of the source database space into your class of databases.

Monday, December 16, 2013

Tracking Versions in your Database

As you may or may not know, I preach creating classes of databases over manipulating the designs of individual databases.  There are many different implementation decisions one could make while elevating design from instance to class.

In a previous post, I discussed the shape of the class itself, arguing that it should be organized around actual released versions plus the thing you intend to deploy next.  This is not revolutionary.  It's not ubiquitous but I'm certainly not the only one to suggest that this is how databases should be maintained.

There are several ways to manage how versions are tracked when an instance of a database has been deployed.  One way that I've seen done is to keep a text file with a list of the scripts that have been executed.  Another way is to fill a directory with scripts that have been executed, with each script living in its own file.

The way that I think works best and that fits most naturally within the concept of a class of databases is to store the state of a database instance within the instance itself.  The alignment is so strong, in fact, that it makes the other methods seem bizarre and beyond the realm of consideration.

Dependency

From a practical perspective, storing a database's version information inside the database only makes sense.  You don't want to have something outside a database that is required to run or maintain it.  What if that thing gets lost?

[Image: the puzzle stops working when it has a missing piece]

Sure, one could make the same argument about the class of databases itself but the odds of a document or set of documents that serve as the source of record for something's design getting lost are extremely low - we have revision control and backup policies to ensure that.

In most environments, the rigorous database maintenance procedures ensure that losing data stored inside a database is many, many, many times less likely than losing a text file or a folder full of scripts.  The main way that you might lose said data is if you somehow lost the database itself, along with all its backups, in which case you would probably not care if you also lost the version data for that database.
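The dependency argument can be illustrated with a small sketch, assuming SQLite and Python's sqlite3 module (the `Connection.backup` method requires Python 3.7 or later).  The `version_registry` table and its columns are hypothetical names chosen for the example.

```python
import sqlite3


def deployed_version(conn):
    """Ask the instance itself which version it is; no outside manifest is needed."""
    return conn.execute("SELECT MAX(version) FROM version_registry").fetchone()[0]


source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE version_registry (version INTEGER NOT NULL, applied_at TEXT NOT NULL)"
)
source.executemany(
    "INSERT INTO version_registry (version, applied_at) VALUES (?, datetime('now'))",
    [(1,), (2,), (3,)],
)
source.commit()

# Copy the whole instance, much as a backup and restore would.
copy = sqlite3.connect(":memory:")
source.backup(copy)

print(deployed_version(copy))  # 3
```

Because the registry travels with the data, any copy made by the ordinary backup process still knows which upgrades it has received.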

Management

Another matter of pragmatism in managing databases is that of managing the dependency between them and their version-tracking data.  If those data are external to a database instance, then you have to know where they are stored in addition to how to connect to a database.

The most dreadful consequence of this management hassle is that you might accidentally upgrade one database based on the state of another database.  Depending on what precautions you take, that could have consequences ranging from a few minutes of aggravation to loss of valuable data. 

Moreover, there is a persistent and non-trivial management cost.  Everything you do with a database - every single instance you create - also has to have this manifest of its upgrades stored somewhere else and the association between the two must be tracked for as long as the database lives.

What about all those little databases that only last a few seconds or a few minutes as your test suite is executing?  What about all the development databases?  It's not like tracking each of these relationships is hard or expensive but all those little tasks accumulate to produce an awkward development environment fraught with menial tasks that draw your attention from the things that matter.

Principle

There are several different supporting arguments and each, to me, is sufficient but all are derivatives of one overriding factor: objects are only objects if they are whole.  In a normal object-oriented environment, access to one object is achieved with a single reference.  One pointer to a spot in memory.  One connection string to a database instance.  Those are good ways to access an object.

[Image: this is not a house, not yet]

Imagine if, instead, you needed two things to make an object work.  What if a string object stored all its data intrinsically except for its length and, in order to properly use it, you had to track its length in another variable?  Could you do it?  Of course.  Would it be insanely complex?  Definitely.

The wholeness of an object is what grants it its identity; what elevates it from being a mere blob of data that can be accessed by a program into an object.  Tracking the upgrade state of a database in said database is part of keeping it whole.

Thursday, December 12, 2013

Why the Linear Chain of Deltas Is the Best Way to Define a Database

There are several ways to organize the revisions in a class of databases.  In fact, there are at least two distinct ways that one's strategy can vary: how one organizes the outcomes of applying a class of databases and how one organizes the implementation of a class of databases.  It is my stance that, in almost every case, the most effective solution is to have a class of databases codify a series of released versions and to use a sequence of delta scripts to get there.

What

The first order of business is to decide what a class of databases describes.  You could have it describe the components of a database.  You could have it describe the current state of a database only.  You could have it describe each released version of a database plus the one you are working on now.

As stated above, I think the last of these is best.  Rather than disprove its competitors and every other possible competitor, I will demonstrate its superiority.

There is at least one database of consequence for most released products.  Usually, that is a production database acting as the source of record for one or more software applications.  Usually, if you were to take a time-lapsed video of that database's design diagram over, say, a decade condensed down to a few minutes, the image in the video would remain almost completely static for many seconds, then change almost instantaneously to a new design before going back to being stable again.  This process would probably repeat for the course of the video.

The real database - the most important one in your product, organization, or design - is almost always expressed as a series of discrete versions.  Its content grows gradually over time but its design alternates between long periods of stasis and short periods of violent change.

Databases Naturally Transition from Version to Version

These long, stable versions of your production database are the natural targets around which to organize your class of databases.  Why build anything that won't produce one of those versions?  Of course, the version you are working on now is a bit of a moving target but, when it gets released, it ceases to be the version you are working on now and becomes another in the series of versions in a deployed database.

How

So I've shown that the allowable targets of a class of databases should be past and future versions of a released database only because the reality of the most important databases is that they transition from one version to another.  Once you are doing that there are a few options for how you do it.

Define Every Possible Transition
One option is to manage every possible transition in design.  That is, if you've developed three versions of your database design already and you want to add a fourth, you would produce and test a script for how to get from each of those previous versions to the latest.

Aside from the fact that this strategy cannot be depicted without violating the cardinal rule of diagramming design (no crossing lines), this is a lot of work.  You've got to test four paths for your fourth version, five paths for your fifth version, six paths for your sixth... you get the idea.

There's another option: only define the transition from the most recently-released version of your database to the version you intend to release next.  That way, when you want to add a ninety-fifth version, you only need to add to your class and test one set of transition scripts, not ninety-five.
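The difference in effort is easy to put numbers on.  The sketch below counts transition scripts under each strategy, using the counting convention from the text above (version k needs k tested paths, which treats an empty database as one of the starting points).  The function names are made up for the example, and the cumulative total assumes every path written for earlier versions still counts as work done.

```python
def scripts_if_every_transition_is_defined(versions):
    """Cumulative paths written and tested when each version gets a path from every start."""
    # Version k needs k paths: one from each of the k - 1 earlier versions
    # plus one from an empty database.
    return sum(range(1, versions + 1))


def scripts_if_only_next_transition_is_defined(versions):
    """Cumulative transition scripts when each version adds exactly one delta."""
    return versions


print(scripts_if_every_transition_is_defined(95))      # 4560
print(scripts_if_only_next_transition_is_defined(95))  # 95
```

The first total grows with the square of the number of versions while the second grows linearly, which is why the gap widens with every release.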

Only Define the Next Transition
How do you get from an older version to the latest?  By applying each of the intervening transitions in the correct order.  Of course, to ensure this is done consistently, you have to build a little bit of infrastructure but that infrastructure is a one time investment and costs almost nothing to write in the first place.  It usually pays for itself in months, not years.

On top of all that, I could make the same argument for how a class of databases grows as I did for what its outputs can be.

The most important, valuable, and rigid database in your life probably transitions from design to design by a sequence of transformations applied to it.  The linear nature of its development is nothing more than a reflection of the linear nature of time (as we are able to interact with it).  You made a series of changes in a particular order so there is a series of changes to be applied in a particular order.

Do It

Unless you are in the two percent of people for whom this way of doing things actually doesn't make sense, you should start managing your database designs this way: a linear chain of deltas allowing you to move between discrete versions of design and managed by a lightweight infrastructure.

Wednesday, December 11, 2013

I'll Be Speaking At DAMA Iowa In May, 2014

This is just a quick note that I will be speaking at the Iowa chapter of DAMA in May of 2014.  I'll announce the official date and place when they have been set.

Tuesday, December 10, 2013

Databases as Monuments

In an earlier post, I mentioned wanting to transfer the focus of database developers from individual databases over to classes of databases.  Focusing on the care and feeding of individual database instances was the starting point for my journey to a class of databases.

I want to drill in to each one of the steps I took in the aforementioned post, including the place where I started.  However, rather than talk about what is wrong with each concept, I'm going to focus on what was right about it.  This is as much to challenge myself as it is to explore the problem in an interesting way.

Thousands of Years Ago, Before the Earth Had Cooled

[Image: you change your course for a monument, not the other way around]

Once, long ago, we thought of databases primarily in terms of the ultimately deployed instance.  This corresponds with the Lean concept of a monument - a piece of equipment that is so much a fixture that it controls your process rather than conforming to it.

Often, individuals were given caretaker positions and these caretakers quickly became responsible for the maintenance of both health and design.  The caretakers became gatekeepers in very short order.

Eyewitnesses have told me that even early on, before agility was formalized, patterns-oriented development was codified, or test-driven development was understood, this was a major impediment to the flow of new features in many software systems.

This is no surprise.  Monuments almost always impede flow.

When I was starting out as a developer, this was still a problem.  When I started capturing knowledge about test-driven database development, nearly a decade ago, this was still a problem.  I observe organizations today where this is still a problem.

The real question should never have been "How do we remove the impediment here?"  It should have been "Why does this problem exist everywhere I look?"

Lifeblood

The answer, as with many of the old ways that we so readily scorn today, is that there was something right about the gate-keeping behavior.  Not that it was completely right - I'm not saying that.  There was, however, one thing right about it: the motivation.

Databases are unlike most other kinds of software.  At least that is true in the case of production databases acting as the source of record for business data, which is what most of us think of when we hear the word "database."

Most kinds of software we write have two properties that are interesting in this post.  First, the designs are very complex.  Despite all our efforts to "keep it simple," we are addressing complex problems that demand sophisticated solutions.  Second, they contain little or no data other than their design.  In the old days there was some weird stuff about storing certain data as resources within binaries but that is mostly gone now.

By contrast, databases tend to have small, simple designs housing large, complex bodies of content.  That content is usually extremely important; like, "lose your job if it goes away" important.  As software becomes a more influential part of business, the contents of databases become more vital.

Cover Your Heart!  Cover Your Heart!

When something is vital, you protect it.  Sometimes, you protect it in a way that is otherwise detrimental to your health.

Imagine you are trapped under water but just barely.  Your wrist is caught in something and it is forcing you to face downward.  If you could just turn over, you could breathe, but you would have to break your arm or dislocate your shoulder to do it.

You might think you're too weak to do anything about it - that you would drown.  That's certainly what we are all taught would happen.  Don't kid yourself: You're an animal.  Somewhere in your brain is a thing that wants to survive even if it hurts.  That thing doesn't care that you heard some words in school telling you that humans are different, special, and weak.

At some point, you'd start thrashing about until something gave and you'd stand a very good chance of flipping over and sucking in some air as a result.  It would hurt.  It would cost you something - maybe even a hand or a thumb - but you would live.

That is how I think of monuments, now.  Databases as monuments create gate-keeping caretakers.  Those people are not malicious in their conservation of power.  They are not incompetents maintaining job security.

They are an organization's "lizard brain" (in a good way): the part of an organization that forces it to do anything - bend over backwards or even sacrifice a body part - in order to avoid the devastating effects of losing critical data.

Keeping the Faith

When I started getting good at "regular" TDD, I went through a phase where I believed that traditional testers with no programming skills were no longer needed.  While we may not need as many of them now as we did in the '80s (I hear), I freely admit that I was wrong in believing we didn't need them at all.

Testers were once essential to the software development process, providing vital feedback that ensured a product was at least worth showing a customer.  While we can get a lot of that feedback from automation, now, it is still difficult for a developer to have the same perspective as a real tester.

It's a skill.  As one of my coworkers at the time of this post would say, you have to be able to imagine what an end user might do.  So traditional testers have found a new place in the modern world.  One in which they are extremely effective; maybe even more so than before people did a lot of automation, because now they aren't stuck doing as many menial tasks.

Similarly, as we work to build a new practice of test-driven development in the database world, we should remember the vital function that the traditional DBA has filled.  That role and the perspective that comes with it should not be lost.

Part of learning how to create and test-drive classes of databases will be providing views into those structures that allow traditional DBAs to contribute, not to satisfy some political requirement, but to harvest the valuable knowledge and experience in their heads.

Monday, December 09, 2013

Do We Need Architects?

I'm still not certain that we need the role of architect but I'm coming around to support the need for the activity of architecture.

The kind of long-range planning an architect is asked to do, which so readily turns into a long-term commitment, triggers a sort of allergic reaction in most agile software developers these days.  I know it did for me.  It kind of still does.

However, I've found a way around that with value-stream-oriented architecture.  At least, I found a way that works for me.  Using a tool like VSOA, you can avoid creating the kinds of plans that become commitments or that interfere with implementation decisions better made by individual teams.

So I'm coming around to accepting the value of someone doing architecture.  The next question is: should it always be the same someone?  Put another way: should there be "an" architect or should architecture be another kind of activity in which an agile team engages?  I'd like to warn you right now that I don't have the answer.

Pros

The most obvious argument on the pro side of the question is that if you have a designated architect, architecture will get done.  Architecture, while a valuable task under certain circumstances, is not usually a task critical to the completion of any given story in an agile environment.  If you don't have someone dedicating some portion of their time to it, then it is likely to fall by the wayside.  The effects of not doing long-term planning and coordination are felt much later and are difficult to trace back to a lack of architectural attention.

Designating an individual as a team's or enterprise's architect ensures that time will be spent on the task.  Most agile teams have designated product owners for precisely this reason: someone needs to be thinking about the next steps.  In fact, my friend Mike Brown likes to refer to architects as product owners of technical value.

Just like a product owner, this also means that if coordination must be done with other sites or organizations, you already know who is going to head out there to do that.  While that may be distracting for the architect, it allows the rest of his team/enterprise to focus on creating valuable products.

It is also possible that concentrating the architectural tasks into one person allows them to gain a perspective on the whole that they otherwise would not have.  That perspective would then allow them to create more coherent visions for future development, better interact with product ownership, and better drive developers toward that vision.

Cons

The most obvious argument against having a designated architect is that putting someone in that kind of position is tantamount to handing them a bunch of dynamite and they might take themselves and their teams out just as easily as they could blast out a quarry.

It's too true to ignore.  In some respects, I wonder if the notion that someone who is interested in politics ought to be banned from politics also applies to architecture.  Most people who want the power of being an architect will not use it to create value.  Few architects today are architects because they wanted the influence.

The fact that an architect can be like a kind of product owner carries both edges of the product owner sword.  Sure, only one person has to travel but that means that the one person responsible for those kinds of decisions and that kind of learning is often not available for questioning.  Sure, they might get a better perspective than they otherwise may have had but it is still a single perspective and thus more vulnerable to confirmation bias.

Widening Our Options

In Decisive, it is said that dichotomies are a decision-making smell.  I've observed that for myself several times since having read the book.  So I'm going to add one more option.

The team that I was working on went through similar gyrations with product owners (POs) for several years.  These issues were exacerbated by the POs' very aggressive travel schedule.  What we settled on was a blend of the "have a PO" and "share PO responsibilities throughout the team" options.

We have product owners and they do own product design but they do not make all of the product design decisions.  They are responsible for the product design decisions and, for that reason, they can countermand one they don't like.  However, we don't sit on our thumbs waiting for all product innovation to flow from the product owner, as someone preaching a very strict, very academic version of scrum might suggest.

Instead, the team as a whole contributes to product design.  The product owners guide that effort with the knowledge and perspective they have gained and they set direction by making big decisions and performing long-range planning.

Sound familiar?

So the other option I would like to put forth is that you can have a designated architect who is responsible for making sure that long-range technical planning is done and teams are coordinated and you can have architectural work being shared throughout a team or enterprise by all of the developers.

No Answer

I stand by my earlier statement that I have no answer to this question.  For one thing, I didn't have an answer when I started writing this entry.  For another, I'm not sure I do even now.  Certainly, the idea of decoupling ownership of architectural issues from execution of architectural tasks is an interesting one.  The success my current team has had by employing that strategy with product ownership makes it a tempting one.

So tempting I'm going to try it but I still don't have an answer.  Not yet.

Sunday, December 08, 2013

A Solution to the Knockout Game

You can't stop this "knockout game" with words.  At least, you can't do it with words alone and not with kind or reasoned words of any kind.  It must be met with violence.  First, violence of the mind.  Then violence of the body.  Here's the plan.

Step 1: Dehumanize
The first thing we need to do is start denigrating the people who play this game.  We need a national media campaign to paint them as the animals they are.  These aren't young punks trying to deal with surplus adrenaline by picking a fight with another male of fighting age.  These are worthless little pukes picking on women and the elderly.

There should be nowhere for them to run from the message.  Nowhere to hide.  Everywhere they go, there should be billboards, marquees, and television advertisements shaming them and reinforcing one message: you are nothing - not even the lowest level of person - if you play this game against a woman, a child, or an old person.

The goal is not just to demonize these people in the eyes of others but to dehumanize them in their own mind.  People who play the knockout game in its present day form should be allowed no ego nor any measure of self-esteem.  They need to be put in a place of utter despair.  They need to be willing to do anything - anything at all - in order to prove that they are not animals but men.

Step 2: Retaliate
Since kids - and it appears most people in general - are pretty stupid and easily swayed by the media, step 1 should turn their behavior a little.  Some of them will probably still choose the wrong targets but, hopefully, the culture of shame and humiliation will rob them of their will to live.  I'd like to see less kids killing themselves because they are gay and more killing themselves because they played the knockout game.

Some, however, will still want to play the game and will feel they have something to prove.  They will attack other men.  This is where the hard work starts.  I know it sounds crazy but, when they attack us because we are men of fighting age, we have to actually be men.

People who assault others on the street for no apparent reason must be dealt with.  That doesn't mean the cops get called.  That doesn't mean a gun gets pulled.  It also doesn't mean a single punch to the face or a little smacking around.

We need anchormen frozen with the chill of what they see.  We need mothers on the news collapsing to their knees and weeping when they discover what their child did and what happened to them as a result.  We need follow up stories months after attacks about the occasional knockout game player beating the odds and being transferred from the hospital into the county jail to await trial.

We need to teach the people playing this game that they still have something to lose and that the pain of playing the game is far greater than the pain of whatever fleeting angst they might be experiencing.

Step 3: Reprogram
The final step is to start teaching children to police themselves.  Juvenile humans are vicious, especially to other children.  We simply need to point that viciousness in another direction.  We need to re-align the culture of our schools so that people who attack random strangers are seen as different.  There's nothing that brings down the righteous fury of the masses like being different.

When I was a kid, schools made a practice of singling out children with certain attributes.  If you were exceptionally bright or artistically talented, there were structures in place to make sure everyone knew you were different.  This led to years of torture and many children tried to conform in order to fit in to the crowd.  Not me, but a lot of kids.

While this property of schools has traditionally been used for evil - as nothing more than a mechanism for small people to manipulate other small people into punishing the exceptional - I believe it can be used for good.  By the end of phase 2, acts of random violence should be cut down pretty close to the minimum.  Only the really broken kids would still be conducting them - the real "seed of evil" types.

The final stage is to identify these people early and sic the masses on them while they are young.  The pressure to fit in should drive many of them to try and conform which, in this case, means not playing the knockout game.  Those few who still engage in the behavior will serve as reminders of what happens when you threaten the fabric of society.  Better still: it's the truly evil kids being sacrificed to this cause, which is no real sacrifice at all.

If this plan is executed, the knockout game will cease to be a game.  It will lose its name.  It will instead be called what it really is: random acts of violence by misguided and defective teens.

Of course, it never will, so none of this really matters but I enjoyed writing it and, if even one person scolds a knockout game player as they run by because of this post, then I've made a difference.

Friday, December 06, 2013

I'll Be Speaking At DAMA SoCal in February

This is just a quick announcement that I will be giving the same talk I mentioned earlier again at DAMA SoCal on February 24th, 2014.

Click here for more details.

Thursday, December 05, 2013

Robbery on the Road as a Sign of the Times

My father-in-law bravely thwarted a robbery a while back.

It was taking place on the sidewalk.  A large man and a small man were struggling over a backpack.  My father-in-law got out of his car and asked them what was happening.

The large man said "I'm trying to take his stuff but he won't let me!"

My father-in-law began calling the police, at which time the large man trundled off into the city.

I think this is emblematic of the thinking in America.  Nobody is concerned with what is fair or right anymore.  Everybody is concerned with what they can get regardless of how they get it.

People like that large man have been robbing people like us forever.  It's the kind of robbery we all know and it's what people imagine when you say "robbery."  But the thinking which underlies that behavior is flung a lot further and a lot wider than just hoodlums and ruffians.

You see this thinking in many welfare recipients, both individual and corporate.  I know more than a few people on "disability" who could easily work if they so desired.  Instead, because their easiest path is to steal from the taxpayer, they lounge around in their free apartment, watch their free cable, and eat their free food at my (and possibly your) expense.

Likewise, rather than earn a profit by offering a valuable product that people are willing to buy, corporations are becoming increasingly dependent on government contracts.  Why build something useful when you can just suckle at the teat of the U.S. government?  Why cover your risks when you know the government will just bail you out because you're too big to fail?  Why learn how to get people from point A to point B at a reasonable price when you know that the government will keep subsidizing your airline?

Another great example is the "false fee" industry.  In business schools, some have taught that adding fees onto an agreed-upon account is a "growth industry" and will represent some ridiculous amount of money (tens of billions per year) by the end of the decade.  Phone companies are notoriously bad about this.  They get you to agree to one price, then they tack on a bunch of extra fees and "taxes," some of which are legitimate and many of which are just plain old-fashioned fraud.

I've also personally experienced a credit union stealing my money by charging me a fee for not using my account.  When I confronted them about this, they looked at me like I was crazy and said "We're just trying to fee down all the inactive accounts so we can close them."

In every one of these cases, the thief must justify his actions and his own existence; at least to him- or herself.  The welfare consumer must argue that they are "down on their luck" or otherwise unable to take care of themselves and that justifies, in their minds, the use of money taken at gunpoint.  The corporate welfare recipient is probably every bit as unenlightened in its thinking... it's probably something like "I've got a lot of jobs and shareholders to which I must answer so I've gotta do what I've gotta do."

People who perpetrate false fees also must justify their actions and probably say something like "I'm just trying to make a buck" to themselves and others who confront them about their nefarious acts.  Ten years ago, the credit union manager told me straight to my face, as though I was the one in the wrong, that they were just trying to take my money.

What's the big deal, right?

All of these people are about the same as our thug on the street.  They all want more than they have or could gain on their own by fair means.  They all want to take something from someone else; something that doesn't belong to them.  They all believe that they are in the right for "just trying to get ahead."

I've got no idea what to do about it.  It seems like the only corrective course of action is to radically change the makeup of the people who are deciding how the country I live in works.  There are lots of ways to do that.

You can increase the population of smart and thinking people, possibly by education and definitely by reproduction.  That takes time and a lot of energy.  Who wants to wait fifty years for a problem like this to get solved?

You can decrease the population of stupid or unthinking people but it's difficult and requires unethical activities.  A pogrom against the stupid and evil has one major roadblock: there are a whole lot of stupid and evil people out there and none of them want to be pogrommed.

An increasingly viable impulse, for me, is exodus.  Instead of making more good people and instead of rounding up the bad ones, just round up the few rational people left in this country and leave to make a new one.

That's probably very nearly as impractical as restoring freedom and reason to America but it seems unlikely that it would be any harder and it's probably at least a little bit easier.

Wednesday, December 04, 2013

My Journey to a Class of Databases

If you know anything about me or you read my article on InformIT, Ten Tips for Constructing an Agile Database Development Environment that Works, you know that I think the foundation of database TDD lies in creating a class of database.  That means transferring focus from the design of individual databases onto something that makes databases.  You want the ability to create or upgrade as many instances as you like and know that they all have exactly the same design.

Exactly the same design.

There are many possible ways to implement a concept like that, though.  When I began developing these concepts nearly a decade ago, before I even knew that I wanted a class of databases, I was just focused on controlling database creation.  I experimented with a lot of different kinds of infrastructure and a bunch of different patterns of database growth.

Since I started being a professional software developer, I've tried a bunch of different ways to control how databases grow and transform.  I think a lot of people have had similar thoughts at various times in their lives.  These are the ways and times I thought about these problems.

The Fool's Errand (pre-2005)

The most naive solution is the idea that you can specify what you want the design to be right now and have some tool that will update an existing database to have the new design.  Sometimes, it is a tool that compares two databases.  Sometimes, it is a diagramming tool that will inspect a database and figure out how to make it comply with a drawing.

magic will transmit design changes!
The problem is that this doesn't work.  It doesn't work for the same reason that you can't unscramble an egg.  There is no way for a software system to look at the current design, look at a new design, and figure out how to get from point A to point B.

At least, it's not possible to do that every time and with current technology.  Maybe, one day when we have computer systems that can infer intent, it will be possible.  Right now, however, that's too complex a task for a computer.
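
To make the egg-scrambling problem concrete, here is a small sketch, assuming Python and SQLite purely for convenience.  The `person` table and every function name in it are hypothetical, invented for this illustration.  Two different intentions produce exactly the same final design, so a tool that only compares "before" and "after" has nothing to tell them apart, yet they do very different things to the data.

```python
import sqlite3


def build_old(conn):
    """Create the starting design and load some content into it."""
    conn.execute("CREATE TABLE person (name TEXT)")
    conn.execute("INSERT INTO person (name) VALUES ('Ada'), ('Grace')")


def rename_column(conn):
    """Intent A: 'name' was renamed to 'full_name', so existing data must survive."""
    conn.execute("CREATE TABLE person_new (full_name TEXT)")
    conn.execute("INSERT INTO person_new (full_name) SELECT name FROM person")
    conn.execute("DROP TABLE person")
    conn.execute("ALTER TABLE person_new RENAME TO person")


def replace_column(conn):
    """Intent B: 'name' was retired and an unrelated 'full_name' was introduced."""
    conn.execute("DROP TABLE person")
    conn.execute("CREATE TABLE person (full_name TEXT)")


def design(conn):
    """Describe the design of 'person' as a list of (column, type) pairs."""
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(person)")]


def contents(conn):
    """Return the stored names in a stable order."""
    return [row[0] for row in conn.execute("SELECT full_name FROM person ORDER BY full_name")]


renamed = sqlite3.connect(":memory:")
build_old(renamed)
rename_column(renamed)

replaced = sqlite3.connect(":memory:")
build_old(replaced)
replace_column(replaced)

# Same design either way, but only one path kept the data.
same_design = design(renamed) == design(replaced)
same_contents = contents(renamed) == contents(replaced)
```

In this sketch `same_design` comes out true and `same_contents` comes out false.  The difference between the two paths lives entirely in what the person making the change meant, which is the one thing a comparison tool cannot see.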

The Installer Fallacy (2005-2006)

Another way of thinking about the problem is the way we imagine installers.  Databases have components.  Components have dependencies.  You ask the installer to make sure the features you want are there and it ensures the dependencies are satisfied.

The problem is that there is always a meltdown.  In this case, I'm using that term a little less figuratively than usual.  In a healthy database design, things are changing.  Tables are splitting and recombining into newer, better shapes all the time.

The features all melt together more quickly than you could imagine.  Pretty soon, it's difficult to tell why you are creating separate features and components at all.  Eventually, all the components blend together and you wonder why you ever divided components in the first place.

Rise of the Versions (2006-2008)

After about my third database "feature" that depended on exactly one feature, which in turn depended on only one feature, I started to get the message.  I realized that the forces in the database world are telling us to organize around time, rather than around features.

It turns out that there is usually at least one database instance that has an extremely linear path of transformation, and it happens to be the absolute most important kind of database there is: a source of record database in production.

Production databases tend to metamorphose over time in a series of discrete transitions from one design to another.  At the same time, production databases are the most indispensable and long-lived databases of all.

Everything else (e.g.: test databases or development databases) tends to have a little more flexibility.  At the very least, nothing else has less flexibility.  So why shouldn't the most important and least flexible kind of database define how all databases of a particular kind are built?

Have a Little Class (2008-present)

When I started formulating these thoughts into something that I could start evangelizing, I realized there was more to this than just regulating the flow of design changes from a development environment out into a production environment.  That's an important feature but it's just an implementation detail of a much more critical shift in mindset.

a path of confidence
What really matters is having uniformity of design between all the different database instances filling the same role.  If you have that, tests executed against one instance allow you to make predictions about how another instance will behave.

That mechanism - that way of thinking - serves as a critical underpinning for test-driven development in the database world.
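
A small sketch may help show what uniformity of design buys you.  It assumes Python and SQLite for convenience, and the `DELTAS` chain, the `account` table, and the function names are all hypothetical.  One instance plays the role of a long-lived production database that was upgraded in two separate steps with data in it.  The other is a throwaway test instance built from scratch in one go.  Because both were built from the same chain, their designs can be compared directly.

```python
import sqlite3

# Hypothetical chain of deltas: DELTAS[i] moves an instance from version i to i + 1.
DELTAS = [
    "CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT)",
    "ALTER TABLE account ADD COLUMN balance REAL",
]


def build(conn, from_version, to_version):
    """Apply the deltas that carry an instance from one version to another."""
    for delta in DELTAS[from_version:to_version]:
        conn.execute(delta)


def design_of(conn):
    """Describe a design as a list of (table, column, type) triples."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    return [
        (table, column[1], column[2])
        for table in tables
        for column in conn.execute('PRAGMA table_info("%s")' % table)
    ]


# A "production" instance: built at version 1, used for a while, upgraded later.
production = sqlite3.connect(":memory:")
build(production, 0, 1)
production.execute("INSERT INTO account (owner) VALUES ('existing customer')")
build(production, 1, 2)

# A "test" instance: built straight to version 2 with no history at all.
test_instance = sqlite3.connect(":memory:")
build(test_instance, 0, 2)

designs_match = design_of(production) == design_of(test_instance)
```

When `designs_match` holds, a test that passes against the test instance is evidence about how the production instance will behave, which is the whole reason to care about a class of databases rather than any single instance.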

Tuesday, December 03, 2013

A Class of Databases for Non-Middle-Tier-Developers?

As I previously mentioned, you can use the sample code for my book as a starting point to establish a class of databases and you can also inexpensively write one for yourself.

However, if you don't want to do any non-SQL coding and you don't want to use the infrastructure I built and made freely available, it might get a little stickier.  At least it would for me because the test frameworks and expressiveness available for SQL-only solutions are kind of weak.

One option is to write something yourself.  If you do that, I'd love to hear about it and I bet a lot of other people would too.

Another option is to wait for the release of DataClass 3.  Maybe you could even participate in its design to make sure it meets your needs.  Right now, it is being made by a software developer for software developers.

If you need something now, though, you'll need to invest in developing something yourself or try downloading and compiling the sample code to see if it will work for you out of the box, which it easily could do.

Monday, December 02, 2013

Pivoting a Little on DataClass 3

I've found a smaller deliverable for DataClass 3 and will be releasing that first.  As it is a side project, I cannot commit to a date but it shouldn't be too long.  I don't want to get into the details but, basically, I'm going to rewrite the examples for my book incrementally.

The first deliverable should cover about a third of the book.  From there, I can add support for each practice independently.  After that, I will add the higher-level language and refine each of the examples as I go.

The initial output will still be C# 4.0 and the initial database platform will still be SQL Server 2012.  Then I'll widen support from that point as demanded by my readers & followers.