The case against Static SQL


The older readers of this blog may remember that, back in the day, when databases were relational and resided on big iron, there was one commandment you simply couldn’t ignore: THOU SHALT NOT USE DYNAMIC SQL!

There were several reasons behind this, and they sounded convincing to me. The first was query execution speed. Having written a few COBOL programs in my early career (yes, that’s been roughly 20 years now…) I was always impressed by how fast an MVS mainframe was at sorting and processing large files, or at querying and processing a few million records in DB2. You usually waited longer for your job to start than it actually took to run.

The second was system stability: one programmer’s badly crafted statement could conceivably stop the whole machine and thus bring the company to a halt. To me this always sounded a bit theoretical, because there were operators who would kill any job that took more CPU than it should or blocked too many resources, but yes, there was a chance something could go wrong.

Just the other day I stumbled across this post by Craig Mullins.

His point is that there still might be good reasons to statically bind programs and use a pre-defined access plan to process a very well-defined set of records really fast.

On the other hand, it is almost impossible to optimize database access this way for the needs of more dynamic applications. Additionally, the runtime optimizers of today’s relational databases are so much better than they were in the nineties, and many applications today simply don’t fit the static model well.

In fact, one customer I’ve worked for over the last few years introduced an architecture that required every database access to be pressed into a COBOL program with static SQL. They built enormous machinery around access processes that do little more than query a few records from the DB, and the whole development process slowed almost to a complete halt. If the web front-end guys needed a few more attributes, they had to ask the mainframe guys to change the associated COBOL process, which consisted of a couple of programs and other artifacts, and so every tiny change took weeks or months where a simple change to the SQL would have sufficed, had dynamic SQL been allowed.

What looked like an easy change cost a few thousand euros and had to pass through so many hands that people started building the craziest workarounds, calling two of these processes and combining their unrelated results in Java. Changes to one of these processes suddenly affected places in the application that seemed completely unrelated. I’m sliding off-topic here, but the point I want to make is that this organization replaced the danger of one SQL statement running mad with a tremendous organizational effort and almost unmaintainable software that tends to be very brittle. And you can imagine that the strange workarounds didn’t help in making the overall system more performant.

So while the basic idea of the architecture was to increase stability and performance, it achieved the complete opposite.
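
To make the contrast concrete, here is a minimal sketch of what the dynamic-SQL side of such a change might look like. The table, the column names and the repository class are hypothetical, and the snippet assumes nothing more than a plain JDBC connection; the point is only that giving the web front end another attribute becomes an edit to the statement text and the row mapping, instead of a change request against a chain of COBOL programs.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical read-side access using dynamic SQL over plain JDBC. */
public class CustomerQuery {

    // Adding another attribute for the front end is a change to this string
    // (plus the mapping below) -- no rebind of a statically bound package,
    // no change request against a chain of COBOL programs.
    private static final String SELECT_CUSTOMERS =
        "SELECT CUST_ID, NAME, CITY "          // e.g. add ", EMAIL" here
      + "FROM CUSTOMER "
      + "WHERE COUNTRY = ? "
      + "FETCH FIRST 100 ROWS ONLY";

    public List<String> findByCountry(Connection connection, String country) throws SQLException {
        List<String> result = new ArrayList<>();
        // The statement is prepared at runtime; the database chooses the
        // access plan when it sees the text, not at a separate bind step.
        try (PreparedStatement stmt = connection.prepareStatement(SELECT_CUSTOMERS)) {
            stmt.setString(1, country);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    result.add(rs.getString("CUST_ID") + " " + rs.getString("NAME"));
                }
            }
        }
        return result;
    }
}
```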

This is where Craig Mullins’ post comes into play: maybe dynamic SQL isn’t as bad a choice these days as it used to seem. The problem of bad SQL eating up CPU and I/O may still be there, but perhaps it is better to look for ways to find and kill such statements than to build enormous machinery just to prevent that one statement from ever being issued.
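
As a hedged illustration of what “finding such statements” can mean in practice: DB2 for Linux/UNIX/Windows, for example, exposes its package cache through the MON_GET_PKG_CACHE_STMT table function (other databases have comparable views, and DB2 on z/OS has its own statement-cache tooling). The little helper below is an assumption-laden sketch, not a description of the customer’s setup — it simply lists the most CPU-hungry dynamic statements so somebody can go after them instead of banning dynamic SQL altogether.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical monitoring helper: list the most expensive dynamic statements. */
public class ExpensiveStatementReport {

    // Assumes DB2 for LUW, where MON_GET_PKG_CACHE_STMT exposes the package
    // cache; 'D' restricts the result to dynamic statements, -2 means all members.
    private static final String TOP_STATEMENTS =
        "SELECT NUM_EXECUTIONS, TOTAL_CPU_TIME, STMT_TEXT "
      + "FROM TABLE(MON_GET_PKG_CACHE_STMT('D', NULL, NULL, -2)) "
      + "ORDER BY TOTAL_CPU_TIME DESC "
      + "FETCH FIRST 10 ROWS ONLY";

    public void printTopStatements(Connection connection) throws SQLException {
        try (Statement stmt = connection.createStatement();
             ResultSet rs = stmt.executeQuery(TOP_STATEMENTS)) {
            while (rs.next()) {
                System.out.printf("%d executions, total CPU time %d: %s%n",
                    rs.getLong("NUM_EXECUTIONS"),
                    rs.getLong("TOTAL_CPU_TIME"),
                    rs.getString("STMT_TEXT"));
            }
        }
    }
}
```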

So Craig’s post is well worth reading and thinking about for a while. I only hope his arguments find their way into the heads of system architects over the next few years, so that people start looking at the “problem at hand” from a different angle and searching for better solutions.