In this post we shall see how to incorporate a typical manual test in an automated test harness with minimal effort.
Because our goal is
to benefit from unit testing, and not to develop as many features as we can think of, we shall concentrate on the most common use cases. In fact, just one use case is so common that is happens more often all all others combined together, at least in my practice. Naturally
this whole "the most common" thing is very subjective, and your perspective may be very different
from mine. Anyway, the following example demonstrates what I think is the most
common scenario in my practice. Supporting this very common scenario allows me to gain about 80% of possible benefits by implementing about 5% or less of possible
features (naturally, my estimates are very subjective).
This post continues the series on unit testing, the previous posts are
How to Benefit from Unit Testing T-SQL: choosing what not to test
How to Benefit from Unit Testing T-SQL: choosing what to test
How to Benefit from Unit Testing T-SQL. Part One.
Developing a stored procedure
So here is my example of the most common scenario in my day-to-day development
activity.
Suppose that we need to implement this requirement: "For a given sender
name, and a range of DATETIME values, select all the messages sent by the sender between these
two times, ordered by the time they were sent. "Between" must mean the way SQL interprets it, with both ends
included. One or both ends of the interval can be omitted, in which cases we must return the following:
-
If DateFrom is omitted, return all the messages from the sender sent no later
than DateTo,
-
If DateTo is omitted, return all the messages from the sender sent no earlier
than DateFrom,
-
If both DateFrom and DateTo are omitted, return all the messages from this
sender"
We have already discussed that we prefer to test complex modules, so let us
assume that this is a challenging task, and that developing this module fascinates us. In other words, this module is complex enough and eventually we need unit tests for it.
Here is the procedure that implements these requirements:
CREATE PROCEDURE dbo.SelectMessagesBySenderNameAndTimeRange
@SenderName VARCHAR(30),
@DateFrom DATETIME = NULL,
@DateTo DATETIME = NULL
AS
BEGIN
SELECT Subject, Body, SentAt
FROM dbo.Messages
WHERE SenderName = @SenderName
AND SentAt>=COALESCE(@DateFrom, SentAt)
AND SentAt<=COALESCE(@DateTo, SentAt) ;
END
Note: Maybe this
implementation is not performant enough, and at some later time we shall need to speed it up. Right now we are making sure that this module work
correctly. Maybe, however, this procedure will perform well enough for the time we need it, so
let us not over-engineer it just yet, let us not optimize it before we know we need
optimization. For now, let us concentrate on the correctness of this procedure.
Developing manual tests
Even if there is considerable pressure to get things done fast, we absolutely
want to test the module as it is being developed, manually or automatically. For
thorough testing, we need enough test data to cover all the cases. It does not really matter too
much whether we generate test data with the help of some tool, or just type some insert commands manually - there are many
possible ways to accomplish this task. Yet it does matter very much that the test data
represents all the cases.
Suppose that we have already come up with the procedure and all the needed test
data, and that we want to test the procedure. I find it easier and faster to first test my
procedure manually, in SSMS, as shown in the following script:
-- no messages from jb12345, should return nothing
EXEC dbo.SelectMessagesBySenderNameAndTimeRange
@SenderName = 'jb12345';
-- must return all messages from jbrown
EXEC dbo.SelectMessagesBySenderNameAndTimeRange
@SenderName = 'jb12345';
-- must return all messages from jbrown sent no later than 20101004
EXEC dbo.SelectMessagesBySenderNameAndTimeRange
@SenderName = 'jb12345',
@DateTo = '20101004';
-- must return all messages from jbrown sent no earlier than 20101001
EXEC dbo.SelectMessagesBySenderNameAndTimeRange
@SenderName = 'jb12345',
@DateFrom = '20101001';
-- must return all messages from jbrown sent no earlier than 20101001
-- and no later than 20101004
EXEC dbo.SelectMessagesBySenderNameAndTimeRange
@SenderName = 'jbrown',
@DateFrom = '20101001',
@DateTo = '20101004';
Suppose that these test cases cover all the possible
situations we know of (maybe I am missing something, but I don't know it at this time), and that when we run this manual
test script in SSMS against our test data, we are sure all the results are correct. Now
that we have a working procedure, we need to come up with unit tests equivalent to the script
above. Of course, we also need to document the module, and to add all the results of our
work to version control.
Needless to say, we want this process of coming up with unit tests to be very
efficient, because we are going to repeat it many times a day.
From manual tests to automated without writing code
So I have come up with the following simple process: to indicate that my manual test should be incorporated into my test harness, I just save it into a folder named Tests. I have a little utility
which executes the manual test and saves all the output from it in a
separate XML file, in the same tests folder, with the same name as the sql test script.
When I have my expected results persisted in a file, I can run another utility
which executes my test script and matches its results against the XML file with my
expected results. Note, however, that I don't have to write a single line of code
neither for expected results, nor to match the actual results against expected.
In the past I was writing trivial glue C# code to invoke my T-SQL modules and to match the actual results against expected ones, as we described in Close those Loopholes - Testing Stored Procedures
This
worked very well for me, but surely not having to write any code at all, having
a tool do everything for us, is more efficient than having to write trivial
glue code. So here we are, eliminating fluff, getting rid of unnecessary
complications.
So let me summarize the process described above: I will spend as much time as
needed,
making absolutely sure that I have enough test data and enough test cases. I will not cut any corners and will not compromise the quality of my code until
I know that my code works.
However, when I know that my code works, I will save my manual test in Tests folder, to indicate that this is a
test. I will generate my expected results, and save them as a file in the same Tests folder.
and I do not need to write a
single line of code to invoke my tests. A utility will run this test for me.
Generating expected results is faster and less prone to errors, than doing it manually
The process of reading a test script from a file, executing it against some
test database, and matching the results against the expected ones is highly repetitive. As
such, it can be fully automated. This – the
fact that expected results are generated and not manually developed - may be considered by some professionals as a very serious
compromise.
Some, probably many,
will not approve the fact that I generate my expected results, some will say that
it violates some kind of basic principle of unit testing or TDD. Maybe so. Yet
this approach saves me a huge amount of time, and allows me to concentrate on what really
matters, what really makes the difference: building a comprehensive set of test cases.
In this case, should we choose to stick to pure theoretical principles, we have
nothing to gain and a lot of time and effort to lose. More to the point, coding the expected results manually is a very mundane and time-consuming process. As such, it is very prone to errors. In addition to the
time spent manually defining the expected results, a lot of time is usually spent fixing
bugs in our test harness. When we generate expected results, we eliminate the source of
such bugs and save a lot of time.
Next steps
Note that our test script can and should be also used as documentation, which
saves even more time. We shall discuss it a later post.
Another important point here is this: when the test script is stored separately
from its expected results, this can simplify maintenance. We shall talk about it more in
some later post.
One last note: if at some later time our unit test fails, we need to expose the
discrepancies between actual and expected results in a very clear way. When the expected results are stored in a separate file, this is very easy to accomplish. This
is described in my next post,
How to Benefit from Unit Testing T-SQL: Exposing Failure in a User-Friendly Way