I have been posting examples of defensive database programming
for some time now. I am by no means done with this topic, there is much more to
it. Yet this time I would like to skip concrete examples and write up a long
overdue introduction.
Defensive programming is an approach that is designed to ensure
high quality of software by eliminating existing bugs and avoiding potential
ones. The goal of defensive programming to make sure that software robustly and
gracefully handles cases of unintended use. Programming defensively includes
the following techniques:
- ·
Explicitly list the assumptions that you made
during development
- ·
Whenever feasible, remove assumptions that are
not essential
- ·
Ensure that the essential assumptions always
hold
- ·
Use all your imagination to come up with cases
of unintended use, trying to break your module. Incorporate these cases into
your testing suite.
- ·
Lay out your code in short, fully testable, and
fully tested modules
- ·
Reuse your code whenever feasible
Although at the time of this writing apparently the most
popular example of defensive database programming is the prevention of SQL
injection attacks, there is much, much more to it. Also unit testing is another
very powerful approach in ensuring high quality of software. Defensive programming
and unit testing are very closely intertwined and are frequently used together;
this is why there are several references to unit testing in this post.
If you drive defensively, it frequently means driving
slowly. You may need more time to get to your destination, but you are less
likely to get into an accident. Similarly, if you program defensively, it frequently means delivering
slower becasue of all the precautions you are taking. Whether you need to program defensively depends on your situation. I am not saying that the techniques which I demonstrate need to be used in all the cases.
Identifying your hidden or implicit assumptions.
This is extremely important. This is never easy. I have
already posted several examples, such as:
Rewriting queries with NOT IN():
Defensive
database programming: rewriting queries with NOT IN().
A correlated subquery with an unqualified column
with an implicit assumption that a table does not have a column:
Defensive
database programming: qualifying column names.
One very commonly used but rarely recognized assumption is
the absence of concurrency. There are quite a lot of stored procedures that
work perfectly well when invoked only from one connection, yet intermittently
fail in high concurrency environments. The post on IF ELSE logic demonstrates
that.
Removing assumptions that are not essential.
Examples of this technique are the same as in the previous section.
Ensuring that essential assumptions always hold.
Whenever possible, separate such checks from the code that
actually runs on the server, so that they do not slow down the execution. For instance,
you can use unit tests to make sure that some condition is true. I provided an
example in my previous post on defensive database programming: http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/25/defensive-database-programming-set-vs-select.aspx
However, in some cases you do need to make sure that your
assumption holds at execution time. For example, if your stored procedure only
works correctly if there is no concurrent execution of the same procedure from
other connections, you can only guarantee that at execution time, as Tony
Rogerson demonstrated in his post “Assisting Concurrency by creating your own
Locks (Mutexs in SQL)”:
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/06/30/855.aspx
Coming up with examples of unintended use.
Try to come up with cases that could break your module right
before you have started developing it. This is a very good time to do so,
because you might come up with better. For example, if you are submitting a
comma-separated list of zip codes (on 2005 or earlier), and expect to get back
a list of customers for those zip codes, make sure to submit imperfect lists
such as “,12345,12346”, “12345,,12346”, “12345,12346,,”, and so on. Try to
break your module again right when you have just developed it, because the
implementation details are still fresh in your mind. For instance, if you used
a query with LIKE clause in your implementation, you may later realize that you
have not expected patterns such as “[OT]%”. This is demonstrated in the following
post:
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/11/08/defensive-database-programming-adding-escape-clauses.aspx
Also ask somebody else to break your module – that might yield
more results.
Conclusion
As you have seen, defensive database programming includes
several different techniques applied on case-by-case basis. I still have a lot
of examples to share, but I felt that it is necessary to summarize what I have
written so far. That done, I can get back to writing up examples, one at a
time.
This post continues my series on defensive database programming.Here is my next post:
Here are
the previous posts from the series:
Defensive database programming: SET vs. SELECT.
Stress
testing UPSERTs
Defensive
database programming: fun with UPDATE.
Defensive
database programming: eliminating IF statements.
Defensive
database programming: fun with changing column widths.
Avoid
mixing old and new styles of error handling.
Defensive
database programming: adding ESCAPE clauses.
Defensive
database programming: qualifying column names.
Defensive
database programming: rewriting queries with NOT IN().