I don’t particularly enjoy writing tests, but I noticed long ago that I enjoy debugging insufficiently tested systems even less. If you have a suite of old tests that you can run whenever you think you’ve gotten something new working, it can save you a ton of trouble later by showing you right away that you’ve broken something, while you still know that something in those 40 lines you just mucked with must have caused the problem.
The traditional way of writing tests looks something like this:
- Have the system under test do something.
- Capture some result that you can use to determine whether what you tried worked.
- Check that result in various ways, raising a flag if it’s wrong.
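For contrast with what follows, here is a minimal sketch of one such traditional test in Python. The `add_days` function is a hypothetical stand-in for a date library routine (it is not the library described in this post), and the expected value had to be worked out by hand before the test could be written.

```python
from datetime import date, timedelta


def add_days(start: str, days: int) -> str:
    """Hypothetical stand-in for the library routine under test."""
    return (date.fromisoformat(start) + timedelta(days=days)).isoformat()


def test_add_days_crosses_year_boundary() -> None:
    # Step 1: have the system under test do something, and
    # step 2: capture the result.
    result = add_days("1988-12-29", 5)
    # Step 3: check the result against an answer computed by hand
    # (Dec 30, Dec 31, Jan 1, Jan 2, Jan 3).
    assert result == "1989-01-03", f"expected 1989-01-03, got {result}"


test_add_days_crosses_year_boundary()
```

Every test written this way needs its own hand-computed expected value and its own checking code, which is the cost the lazy approach avoids.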
Eons ago I was writing a date/time library and realized that I would need hundreds of tests; it was worth my effort to make writing them as simple as possible. I created a simple CLI to invoke the routines in the library and a test harness that used the CLI to run a bunch of tests. Each test was just a single line that made the library do something, e.g. convert a date to Julian; the harness did all the rest.
“Wait a minute,” you complain — “how did the harness know whether the test passed? A test must have done more than just make the system do something!”
But it did not. A test really did just look something like
addDays 1988-12-29 5
So how did the test harness know whether the test passed or failed? Well, the first time the test was run, the harness did not know — it had to ask whether the output was correct. If I said yes, it saved the output as the correct result. The tests were “lazy” inasmuch as the correct results were not established until the tests were run the first time.
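The first-run behavior can be sketched roughly as follows. This is an illustration in Python, not the original harness: the JSON file of saved results, the function names, and the injected `run` and `confirm` callables are all assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Callable


def run_lazy_test(
    test_line: str,
    run: Callable[[str], str],
    golden_path: Path,
    confirm: Callable[[str, str], bool],
) -> bool:
    """Run one one-line test and return True if it passes."""
    # Load previously approved outputs, keyed by the test line itself.
    golden = json.loads(golden_path.read_text()) if golden_path.exists() else {}
    actual = run(test_line)

    if test_line not in golden:
        # First run: there is no expected result yet, so ask a human.
        if not confirm(test_line, actual):
            return False
        # Approved: this output becomes the expected result from now on.
        golden[test_line] = actual
        golden_path.write_text(json.dumps(golden, indent=2, sort_keys=True))
        return True

    # Later runs: pass only if the output still matches what was approved.
    return golden[test_line] == actual


def ask_on_terminal(test_line: str, actual: str) -> bool:
    """One possible confirm callable: show the output and ask yes/no."""
    print(f"{test_line}\n{actual}")
    return input("Is this output correct? [y/n] ").strip().lower() == "y"
```

Passing `confirm` in as a parameter keeps the sketch usable both interactively (with `ask_on_terminal`) and in an unattended run, where a callable that always returns `False` would treat every unapproved test as a failure.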
This approach proved extremely convenient: I could create a new test in a few seconds. And while the regression tests usually just passed without incident, there were in fact many times when I had inadvertently broken something and the tests saved my bacon. Without those tests I would have discovered far later, perhaps even at a customer site, that something somewhere wasn’t working, and would have had to laboriously trace it back to those 40 lines. And it might have taken me a while to fully remember what those 40 lines were about.
The point of doing tests this lazy way is that it is generally a lot easier to verify a result reported by the test harness than it is to figure out the right answer yourself beforehand and write the code to check for it. This is especially true if the right answer is, as is often the case for me, a dump of some tree structure, dozens of lines long. I can look at such output and pretty quickly say “yes, that’s right,” but if I had to explicitly code such a tree structure into each test, you sure wouldn’t see me writing many of them!
Furthermore, if I deliberately change something that causes the tests to fail, I don’t have to go back and fix all of the affected tests manually. The harness stops at every failure, shows me a side-by-side diff of what was expected and what we actually got, and asks me whether I want to accept the new output as the “good” output. If I say yes, the harness fixes the test for me. I can even tell the harness that if further tests fail with the same diff, they should be automatically updated without asking.
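A rough sketch of that review loop in Python is below. It is not the original harness: it shows a plain line diff from the standard `difflib` module rather than a side-by-side view, it treats "the same diff" as "the same set of removed and added lines", and the `yes`/`no`/`all` answers and function names are assumptions made for the example.

```python
import difflib
from typing import Callable, Dict, List, Set, Tuple

Signature = Tuple[str, ...]


def change_signature(expected: str, actual: str) -> Signature:
    """Return only the removed ("- ") and added ("+ ") lines of a line diff."""
    diff = difflib.ndiff(expected.splitlines(), actual.splitlines())
    return tuple(line for line in diff if line.startswith(("- ", "+ ")))


def review_failures(
    golden: Dict[str, str],
    actuals: Dict[str, str],
    ask: Callable[[str, str], str],
) -> List[str]:
    """Walk the failing tests, updating golden in place as changes are accepted.

    ask(test_line, diff_text) must return "yes", "no", or "all"; "all" also
    accepts any later failure whose changed lines are identical.
    Returns the test lines whose new output was rejected.
    """
    auto_accept: Set[Signature] = set()
    rejected: List[str] = []
    for test_line, actual in actuals.items():
        expected = golden.get(test_line, "")  # a missing entry diffs against ""
        if expected == actual:
            continue  # test passed, nothing to review
        signature = change_signature(expected, actual)
        if signature not in auto_accept:
            answer = ask(test_line, "\n".join(signature))
            if answer == "no":
                rejected.append(test_line)
                continue
            if answer == "all":
                auto_accept.add(signature)
        # Accepted (explicitly or automatically): the new output becomes "good".
        golden[test_line] = actual
    return rejected
```

With this shape, a deliberate change that alters the same line in fifty test outputs needs one answer of `all` instead of fifty separate approvals.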
In some scenarios this approach presumes the existence of a CLI. These days I write server apps with REST APIs, and I always create CLIs for them. “We don’t have a requirement for a CLI,” a manager told me recently, thinking we would save time by not bothering with one. “You’re getting one anyway,” I responded. I always port or write a shell-based CLI, which gives us a very powerful and handy way to control and script the server. Then I port or write a lazy regression test harness (a couple of hundred lines of shell, currently) that drives the CLI, and begin writing lots of one-line tests.
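The glue between a one-line test and a CLI can be very small. The harness described above is written in shell; the following is only a Python sketch of the same idea, and the `run_cli` name, the 30-second timeout, and the choice to fold the exit code and stderr into the captured result are assumptions for illustration.

```python
import shlex
import subprocess
from typing import Sequence


def run_cli(cli: Sequence[str], test_line: str) -> str:
    """Invoke the CLI with the words of one test line and capture what it prints."""
    completed = subprocess.run(
        [*cli, *shlex.split(test_line)],  # e.g. cli + ["addDays", "1988-12-29", "5"]
        capture_output=True,
        text=True,
        timeout=30,  # assumed limit so a hung server cannot stall the suite
    )
    # Include the exit code and stderr so that error behavior is also
    # part of what gets approved and compared on later runs.
    return f"exit={completed.returncode}\n{completed.stdout}{completed.stderr}"
```

Here `cli` is whatever command starts the CLI (a hypothetical example would be `["./servercli"]`), and the returned string is what a lazy harness would show for approval on the first run and compare against afterwards.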
And suddenly testing is not such a drag.
UPDATE: Eric Torreborre would like to see support for lazy test development in his highly regarded “specs2” unit test framework for Scala code. That would be a fantastic feature.
UPDATE: I discovered from Bill Venners (who wrote ScalaTest) that somebody has created a facility for doing this with unit tests, called ApprovalTests. Unfortunately it seems tightly bound to JUnit, so interfaces to specs2 and ScalaTest are unlikely.