
July 20, 2004

PART ONE


Don't Let Yourself Get Unitized

Somewhat miraculously, I had never seriously used JUnit in my career until very recently. Oh, I've played with it and dealt with code bases that used it, but it had never been anything central to my day-to-day project activities. But in the past few weeks the miracle has ended, and looking more deeply into unit tests and JUnit has become a fact of life for me. Unit tests have always been there, of course, but in my own special style, and not relying on broadly known tools. But times change, and we all have to change with them or risk obsolescence.

Concomitant with my own personal motivations, I see that a few others have been talking about unit tests (again!) in the past week or so, including Martin Fowler and Cedric Beust. So it seemed time to fire up Vim and put in my own two cents on the subject. The remainder of this entry is a collection of notes and observations on the state of unit testing in the Java world today, and some comments on recent articles on the topic.


What's with the obsession with JUnit?

Cedric notes in a recent blog entry:

I think JUnit got traction because there was nothing else when it came out. It was simple, lightweight, easy to use, so everybody adopted it. Fair enough. Then its limitations started showing and since its development was pretty much stopped then, people started building tools to work around its limitations.

He goes on to note JUnit's deficiencies in greater detail, and to pitch his own TestNG and how it addresses those deficiencies.

I happen to agree with Cedric, but I'm a bit surprised he hasn't stated his case more forcefully. Maybe he's just a nice guy, or BEA's had a mellowing influence on him over the years, or he's been hanging out too long with Cameron, but my own views mirror his, only with much more intensity. In my own opinion:

  • JUnit is constraining. It forces a narrow view of unit testing on the user.
  • JUnit is simplistic. Let's be honest, folks: it's a toy, not a serious tool.
  • JUnit offers developers nothing they couldn't bang together themselves in a day.

Like Cedric says, JUnit was nice in its day. It filled a void, and certainly lived up to the "lightweight" moniker much more successfully than most projects have. But once the developers tackled the low-hanging fruit, and were into what I'd consider only the first 10% of development, they just stopped. Strangely enough, the developers and a huge number of users seem to feel JUnit is good enough. Frankly, it's not. It's a simplistic tiny little framework that gets in your way more than helps you.
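To ground the "toy" claim for readers who haven't looked: here is essentially the entire programming model of the JUnit 3.x of the day. You extend TestCase, name your methods testXxx so reflection picks them up, and call the inherited asserts. A minimal sketch, with made-up class names rather than code from any real project:

```java
import junit.framework.TestCase;

// The whole JUnit 3.x programming model in one class: subclass
// TestCase, prefix test methods with "test", use inherited asserts.
public class StackTest extends TestCase {
    private java.util.Stack stack;

    // called by the framework before each testXxx method
    protected void setUp() {
        stack = new java.util.Stack();
    }

    public void testPushThenPop() {
        stack.push("hello");
        assertEquals("hello", stack.pop());
    }

    public void testPopOnEmptyStackThrows() {
        try {
            stack.pop();
            fail("expected EmptyStackException");
        } catch (java.util.EmptyStackException expected) {
            // the exception is the correct behavior here
        }
    }
}
```

That's more or less the whole framework surface: a naming convention, a setUp/tearDown pair, and a pile of assert methods.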

Given what JUnit actually is and what (little) it does for you, it's amusing to see the accolades that have been heaped upon it over the years. The Fowlers and Eckels and (of course) Becks of the world write article after article on this tiny little piece of rather sucky code, and we're supposed to fall over ourselves with gratitude. It's amazing. Take a look at the code, people. Drag your ass over to Sourceforge and browse the CVS tree. I know I could replicate it and come up with something far more useful in a couple of days, and many people could probably do even more in less time. Thank God Cedric is looking to inject a bit of competition back into this space; let's hope even more people realize what a pain JUnit is and stop genuflecting to the toy God.

It's really funny that the Captains of XP dogma think JUnit is more than sufficient for their needs. To me, the fact that such a toy fills their testing requirements says loads about the complexity of the problems they're trying to solve. Looking deeply into JUnit's capabilities and its underlying code underscores how brain dead the XP movement really is. Wake up, people: these are the fruits of the best-known XP Captain's labor. Seriously, take 20 minutes out of your busy schedules and look through the actual JUnit code. Just try not to bust a gut laughing in the process.


Fowler's at it Again

Shaking off JUnit for the moment, let's turn our gaze towards Martin Fowler's latest missive from on high. In a recent entry in his Blog/Wiki thing, Mocks Aren't Stubs, Martin once again tries to convince the world to spend tens of thousands of dollars on his ThoughtWorks consultants - and to expect 90% of that money to go to the brutally difficult problem of writing unit testing code. If you're lucky, you might even get the extra 10% that actually does something useful.

In this article Martin tells us that Mock objects aren't stubs; they're something different and much more grand. He says:

But as often as not I see mock objects described poorly. In particular I see them often confused with stubs - a common helper to testing environments. I understand this confusion - I saw them as similar for a while too, but conversations with the mock developers have steadily allowed a little mock understanding to penetrate my tortoiseshell cranium. [...] Mockers often refer to this difference as the difference between state-based testing and interaction-based testing. My purpose in this article is to illustrate these two approaches and to explore the consequences of these differences. The confusion between mocks and stubs is merely a surface symptom of these differences.

Wow, sounds impressive and very important, now doesn't it? All this time I've thought of Mocks as a cool XP-word for stubbing out external code, but Martin tells me it ain't so. In fact, Mocks encapsulate interaction-based testing. Apparently all along my concept of mocks has been mired in the backwards world of state-based testing. By extension I assume that testing state is bad, and testing the call-flow of the moment is much more important. Oooh, I feel a chill! Do you have goosebumps too?

Reading into the article, the idea of calling into objects and then reaping their state is WRONG. It's a dumb practice. Unit tests of the Mock persuasion shouldn't be testing state (hmm, mebbe - Martin hedges on this later on). State's lame. They should really be testing the intricate cross-object contracts. In other words, rather than tracking state, you should track and test the flow of information across objects. In plain words, it's critical to capture today's implementation details; the end result (e.g. the state) is rather immaterial. Martin must've been having a bad day when he wrote this, because he missed the obvious marketing buzz-phrase crossover that captures it: The Journey is the Reward. Or maybe he didn't use it because he's pissed that Steve Jobs beat him to the punch (or was that Buddha?).
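For the record, here's the actual distinction stripped of the grandeur, as a hand-rolled sketch. The Order/Warehouse names echo the flavor of Fowler's example, but this is my own made-up code, not his:

```java
import junit.framework.TestCase;

// Made-up domain code for illustration only.
interface Warehouse {
    boolean hasInventory(String product, int quantity);
    void remove(String product, int quantity);
}

class Order {
    private final String product;
    private final int quantity;
    private boolean filled;

    Order(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }

    void fill(Warehouse warehouse) {
        if (warehouse.hasInventory(product, quantity)) {
            warehouse.remove(product, quantity);
            filled = true;
        }
    }

    boolean isFilled() { return filled; }
}

public class OrderFillTest extends TestCase {

    // State-based style: a stub feeds canned answers to the Order,
    // and the test asserts on the Order's resulting *state*.
    public void testOrderIsFilledWhenInventoryExists() {
        Warehouse stub = new Warehouse() {
            public boolean hasInventory(String p, int q) { return true; }
            public void remove(String p, int q) { /* don't care */ }
        };
        Order order = new Order("widget", 50);
        order.fill(stub);
        assertTrue(order.isFilled());
    }

    // Interaction-based style: a hand-rolled mock records every call,
    // and the test asserts on the *sequence of calls* instead.
    public void testOrderRemovesStockFromWarehouse() {
        final StringBuffer calls = new StringBuffer();
        Warehouse mock = new Warehouse() {
            public boolean hasInventory(String p, int q) {
                calls.append("hasInventory(" + p + "," + q + ");");
                return true;
            }
            public void remove(String p, int q) {
                calls.append("remove(" + p + "," + q + ");");
            }
        };
        new Order("widget", 50).fill(mock);
        assertEquals("hasInventory(widget,50);remove(widget,50);",
                     calls.toString());
    }
}
```

Note what the second test couples itself to: not what the Order ends up as, but exactly which methods it called and in what order. Keep that in mind for what follows.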

The sharp reader may have noticed my qualifier a couple of paragraphs back (Martin hedges on this later on). As it turns out, the latter half of the article features Fowler fighting an epic battle with himself, on the one hand praising this interaction-based testing technique, and then turning around and criticizing it. It's rather confusing. He even goes off on a tangent to tell us that method chaining is apparently another of his smells (yes, dear reader, in his opinion getThis().getThat().getTheOther() is somehow violating some Greek goddess in a most rude and graphic fashion).
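For those who haven't run across this particular smell: the complaint is the classic "train wreck" / Law of Demeter argument, which in code looks something like this (hypothetical names, a sketch of the style rather than anything from Fowler's article):

```java
class Address {
    private String city = "Austin";
    String getCity() { return city; }
}

class Customer {
    private Address address = new Address();
    Address getAddress() { return address; }
}

class Order {
    private Customer customer = new Customer();
    Customer getCustomer() { return customer; }

    // the Demeter-friendly alternative: Order hides its internals
    // and callers ask it directly for what they want
    String getShippingCity() {
        return customer.getAddress().getCity();
    }
}

public class ChainDemo {
    public static void main(String[] args) {
        Order order = new Order();
        // the "train wreck" style that gets flagged as a smell:
        System.out.println(order.getCustomer().getAddress().getCity());
        // the delegating style it's supposed to be refactored into:
        System.out.println(order.getShippingCity());
    }
}
```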

I don't know about you, but in the end I found Martin's piece a towering tribute to the ultimate forms of Consultant gibberish. He tells us:

  • Mocks are not stubs; mocking is this interactive thingy.
  • The interactive thingy is good (but the reasoning behind this is somewhat murky).
  • The interactive thingy is bad; you're coupled to your detailed implementation, you're fragile, and you may end up writing yet more unit testing code.
  • If you're using Mock Objects, by definition you're using the interactive thingy. It's apparently impossible to do so-called state-based testing using Mocks.
  • Stubs are for losers. :-)
  • Thou shalt not violate goddesses with your chains.

What really sucks is that, like many people, I got sucked in by this article and initially missed the kicker that invalidates the whole thing (read: I've wasted my time, again). It was only on subsequent read-throughs that I noticed this gem near the top:

I first came across the term "mock object" a few years ago in the XP community. Since then I've run into mock objects more and more. Partly this is because many of the leading developers of mock objects are colleagues of mine in the ThoughtWorks London office. Partly it's because I see them more and more in the XP-influenced testing literature.

<Groan>. The MoFo got me again. Running this through my High Priced Consultant Translator reveals the truth behind this entire article:

Our bodies in London are saying that Mock Objects are generating a lot of revenue for us. Let's write a bliki article telling everyone that their whole idea about Mocks is wrong, and imply they should really hire smart people like us to do their development. Throw in a few hedging statements to cover our ass. Oh, and make sure to mention that "the leading developers of mock objects are colleagues of mine". That way we show our leadership, and we can refer to "leading developers" in the rest of the text to make it sound like this is an industry consensus, not just our own bullshit to generate revenue.

Really, this article is a tour de force of Martin's style. He's showing how his ThoughtWorks people are smarter than you. As with IoC, he tries to hijack a well-known term and replace it with one of his own. He throws in a few hedges so clients can't scream at, rant at, or sue his company. And he tops it off with the charming self-deprecation.

But that's not the really important part. What's important is that yet again a high-priced gun like Fowler keeps writing about and refining a trivial subject like unit tests. Wake up and smell what you're shovelling, Martin - unit tests are dead simple to write, only catch simple errors, and at best address maybe 1% of people's software development issues. Just like Eckel, Fowler attempts to mystify unit testing and make it seem complex, almost scary, and to imply that you should hire smart boys like themselves to deal with this horrendous bugaboo for you. What they really want is for you to go out and pay their firms a few hundred grand for 30,000 lines of unit tests and 2,000 lines of actual application code. All I can say is thank God people like Cedric are injecting a bit of reality and pragmatism into this space and are depicting unit testing as it should be - a tiny piece of the development pie, for which it would be nice to have some decent tools and not toys like JUnit.


Once Again, with Feeling

Now that I've been exposed even more heavily in recent weeks to comprehensive unit testing, my original opinions on the subject remain unchanged. Unit tests are a nice little tool in software development, but only a small one. Yeah, they do good things, and you should use 'em because they're so easy to write and do provide value. But don't fall into the obsession that the Thought Leaders are pushing. The fact is that unit tests solve a very small set of problems, and are only a tiny piece of the software development puzzle. By all means embrace the practice, but for God's sake don't make it the central practice in your projects. Focus on unit testing with the mania that a Fowler or an Eckel does, and you too will end up with a toy app that breaks under any serious load. Keep your priorities straight, and allocate your precious time in an intelligent manner. By all means write those unit tests, but if you're smart you'll spend significantly more time on things that have a much larger impact - design, integration testing, performance and failure/recovery testing, writing code that solves your customers' problems. Don't obsess over the easy low-hanging fruit like unit tests and let the truly difficult problems hit you over the head later in your development cycle like a fastball from a professional pitcher. And if yet another Consultant or Thought Leader gets in your face telling you that unit tests are the keys to success, don't waste your money on them - just politely show 'em the door.


PART TWO


Limits of Unit Testing

An anonymous poster wrote a comment on my previous blog entry, Don't Let Yourself Get Unitized, which I think deserves a bigger response than a mere comment would allow. He said, in part:

Nowhere in the article did you give any support to "The fact is that Unit tests solve a very small set of problems, and are only a tiny piece of the software development puzzle.".

[...]

I understand, that you disagree with the idea of "unit tests are everything" but you clearly did not do your research or put in any real substance to back up your claim. Unlike you, Martin Fowler and the rest of the so-called "Thought leaders" do put in their research. If you compare his articles to yours, you will notice there is logic and reason to it rather than emotions.



Well, Mr. Anon, when I wrote the Unitized post I believed the small-time nature of unit testing was rather self-evident, so I didn't go into much detail on it. But perhaps I was in error, so I'll explain my thoughts in more detail here.

Consider the goals and tenets of unit testing:

  • Very small "units" are tested
  • Individual components are almost always tested in isolation from other components
  • Mocking strengthens the isolation aspect
  • The code and the tests are almost always written by the same person

Taken together, this means that unit tests exercise the lowest-level pieces of your code, each in turn and in isolation from all other pieces, and that both the tests and the code are written by the same person.

This sort of testing catches what I consider "low hanging fruit". It catches problems-in-the-small. It'll find individual methods or classes which don't match what the unit tests say should happen.

This is a good thing and provides very valuable feedback on the correctness of your code. But keep in mind it _only_ catches low-hanging fruit. By design, unit testing is supposed to be easy, and to consider individual small pieces of a system in isolation. Because of this, by its very nature, unit testing does not consider the _composition_ of a system, only its individual parts. Unit tests never check the interconnections of an application; they never check how its pieces are wired together.

In my experience, the interconnections and "wiring" of an app are where most of the complexity of the application lies. The wiring defines your design, and if considered at a high enough level it can even be said to capture your architecture. How information flows across many software layers and between many components really defines what an application does. And the very definition of unit testing is that it does not test these aspects of an application. Unit testing ignores information flow across software layers and components, and ignores how classes and objects are interrelated and put together into larger designs and architectures. This means that unit tests can catch simple errors in individual pieces of code, but say nothing whatsoever about your system's design or architecture.

And what makes or breaks an application really is the overall design and architecture. The design and architecture capture your system's performance, its memory use, the "end-to-end" correctness from the user's inputs out to whatever servers you might be using, and the round trip back again. How all the wiring interconnects shows the true system behavior, and it is in this area where the toughest bugs and problems lie, and where people sweat blood to get things right. Writing individual components in isolation is easy. It's hooking them together into a cohesive whole that's hard - and unit tests only pass judgement on the individual parts in isolation, not the whole.

Getting one component to act "correctly" in a system is almost always a pretty trivial exercise. Writing one component in isolation is not the difficult part of computer programming. Any single small component of a system is generally easy to code. The hard part of development comes in getting all of the components of a system to work together - in getting the wiring right. Unit tests can verify that each of your individual components does what you, the developer, think it should do. But by its very definition, unit testing cannot check the more complex "wiring" - and the wiring is where most of our design, development, and debugging time goes.

On top of all this, as I mentioned in the intro to this entry, unit tests are written by the same people writing the code. This means that if a component passes a test, the component does what the programmer thought it should do (and generally, what the programmer thought it should do in isolation). Said tests say nothing whatsoever about how the component will work with other components. Nor do such tests indicate whether the programmer's intent and ideas of "correctness" are accurate in the first place. The fact is that by the very definition of unit testing you could have 5,000 classes that all pass their unit tests with flying colors, and yet the larger application and "wiring" is hideously broken and unusable. This is a critical point that many people miss - a body of code can be completely unusable and horribly broken by any reasonable standard, and at the same time still pass all of its unit tests.
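A tiny, contrived sketch of how that happens (made-up classes, but the pattern will be familiar): two components each pass their own unit tests with flying colors, yet the pair is broken the moment you wire them together, because each author baked in a different assumption.

```java
import junit.framework.TestCase;
import java.text.SimpleDateFormat;

// Component A: writes dates month-first.
class ReportWriter {
    String stamp(java.util.Date d) {
        return new SimpleDateFormat("MM/dd/yyyy").format(d);
    }
}

// Component B: written by someone else, reads dates day-first.
class ReportReader {
    java.util.Date parse(String s) throws java.text.ParseException {
        return new SimpleDateFormat("dd/MM/yyyy").parse(s);
    }
}

public class RoundTripTest extends TestCase {
    // Each component passes its own unit test...
    public void testWriterStampsMonthFirst() {
        java.util.Calendar c = java.util.Calendar.getInstance();
        c.clear();
        c.set(2004, java.util.Calendar.JANUARY, 2);
        assertEquals("01/02/2004", new ReportWriter().stamp(c.getTime()));
    }

    public void testReaderParsesDayFirst() throws Exception {
        java.util.Calendar c = java.util.Calendar.getInstance();
        c.clear();
        c.set(2004, java.util.Calendar.FEBRUARY, 1);
        assertEquals(c.getTime(), new ReportReader().parse("01/02/2004"));
    }

    // ...but wire them together and January 2nd silently becomes
    // February 1st. No unit test of either class in isolation will
    // ever catch this.
}
```

Two green bars, one broken system - which is exactly the gap that testing the wiring, not the parts, exists to fill.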

Let's face facts for a moment. A bunch of interns with 3 months of Java experience can code almost any individual component you could possibly name. Where the interns run into trouble is hooking all of those individual components together into a cohesive whole.

What this all means to me is that unit tests verify that the very low-level pieces of your application do what the developers think they should. But unit tests say nothing about how those pieces work together. They do not indicate whether end-to-end processing works. They say nothing about performance. Nothing about memory usage. Nothing about usability. Nothing about whether the code does what the users want. They will not catch multi-threading bugs, subtle interaction errors, or misunderstandings about how "external" APIs and subsystems should be used. This does not mean that unit testing is bad or should be avoided - but it does mean that unit testing only gives you a limited return on investment. Given that we as developers do not have infinite time to develop our code, or an infinite number of bodies to write said code, we have to decide intelligently where to apply our efforts. We have to constantly compromise, and decide what effort will give us the best bang for our buck. In most software development efforts I've been involved in, this means that unit tests are written to cover a great deal of code, but far more effort is put into areas such as:

  • The application design itself. You should spend more time on your design, in an iterative, realistic fashion, than on your unit tests, because a good design will pay far greater returns than any amount of unit testing ever will.
  • Integration tests. Integration tests exercise features on an end-to-end basis, and by their design prove that your individual components can work together (a sketch of one follows this list). By way of example, in an SOA sort of design, a passing integration test that shows a server can receive a request, get the data, process it, and return the right response is far more valuable than individual unit tests that peck at each little bit in the chain in isolation. Passing an integration test gives you much higher confidence that your system works in an end-to-end fashion, as opposed to individual objects floating about in isolation.
  • Functional tests and regression tests. Functional tests verify that the system does not just what the developers think it should, but what the users demand it should. Regression tests verify that high-level functionality remains unchanged and correct as new features are added and underlying code is changed.
  • Non-functional tests. The code as a whole runs within acceptable runtime requirements. Requests are processed in an acceptably timely fashion. Your server does not blow out memory when 3 users make calls that generate large result sets. The system as a whole responds correctly when faults are injected at critical external junctures. Recovery processes actually recover your system state correctly.
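To make the integration-test bullet concrete, here's a minimal sketch in the same JUnit style. Every class name here is hypothetical, and a real integration test would hit a real server and database rather than in-memory stand-ins, but the shape is the same: real components hand-wired together, one request pushed through the whole chain, assertions on the end-to-end result.

```java
import junit.framework.TestCase;
import java.util.HashMap;
import java.util.Map;

// Hypothetical components, kept in-memory so the sketch is runnable.
class InMemoryInventory {
    private final Map stock = new HashMap();

    void add(String product, int n) {
        stock.put(product, new Integer(n));
    }

    int count(String product) {
        return ((Integer) stock.get(product)).intValue();
    }

    boolean remove(String product, int n) {
        int have = count(product);
        if (have < n) return false;
        stock.put(product, new Integer(have - n));
        return true;
    }
}

class OrderService {
    private final InMemoryInventory inventory;
    OrderService(InMemoryInventory inventory) { this.inventory = inventory; }
    boolean place(String product, int quantity) {
        return inventory.remove(product, quantity);
    }
}

class OrderFacade {
    private final OrderService service;
    OrderFacade(OrderService service) { this.service = service; }
    // parses a raw request like "ORDER widget 50" and routes it
    String handleRequest(String request) {
        String[] parts = request.split(" ");
        boolean ok = service.place(parts[1], Integer.parseInt(parts[2]));
        return ok ? "OK" : "REJECTED";
    }
}

public class PlaceOrderIntegrationTest extends TestCase {
    // one request driven through the real wiring, no mocks anywhere;
    // the assertions check the end-to-end outcome, not any single piece
    public void testPlaceOrderEndToEnd() {
        InMemoryInventory inventory = new InMemoryInventory();
        inventory.add("widget", 100);
        OrderFacade facade = new OrderFacade(new OrderService(inventory));

        assertEquals("OK", facade.handleRequest("ORDER widget 50"));
        assertEquals(50, inventory.count("widget"));

        assertEquals("REJECTED", facade.handleRequest("ORDER widget 500"));
        assertEquals(50, inventory.count("widget"));
    }
}
```

The point isn't the toy domain; it's that these assertions only pass if every link in the chain cooperates.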

Doing the above is much harder than unit testing, but at the same time it generates far higher returns on the investment in time and effort. The Thought Leaders go on and on about the almost trivial exercise of unit testing, but are almost completely silent on the subjects of integration tests, functional tests, regression tests, load tests, etc. To use a really bad analogy, the Thought Leaders have written massive treatises and waxed eloquent on "Best Practices" for walking to the corner deli with your spouse, and are eerily silent on how to coordinate a wedding with 200 people and all of the logistical nightmares it involves. Most people can figure out how to walk to the deli on their own just fine; where we all need help is on the more complex issues of the day, the ones that involve many interactions and a multitude of dependencies. You don't plan your route and strategy for getting to the corner deli with your spouse for days on end, you just do it - yet a Thought Leader wants you to believe that you need an army of consultants, MapQuest, a GPS receiver and several other bells and whistles and doodads to accomplish this feat. But ask them to plan your wedding and they'll stare at you without a clue of how to proceed.

As for Mr. Anon's other criticism:

Finally, your direct attack on his character was uncalled for. I don't see how bringing in that cheap shot was needed. That tactic killed your ethos for the rest of the article.

Perhaps I am just a hopeless cynic, but the Eckels and Fowlers of the world do not do what they do out of the goodness of their hearts. These people are fundamentally driven to sell services and to promote their books and consulting companies. This doesn't mean that their advice and opinions are wrong, but it does mean that you should keep in mind that the person speaking is trying to sell you something at the same time, and you should keep that fact lurking in the back of your head. When Martin Fowler writes about Mock objects, he isn't doing it solely to inform the world of some new technique with zero benefit to himself. Perhaps he is trying to further the art of software development, but this is not his sole motivation. He's also trying to justify his company and his consultants, and he's trying to get you to buy his company's services. In this particular case it's rather blatant, as Fowler talks about the fundamentals of this radical idea being proposed by "leaders" in the industry - all of whom happen to work for his company's London office. :-) Maybe the man is explaining ideas that he believes in, but at the same time remember he makes money off of this. If you happen not to believe in his view of Mock objects (or in using them quite so aggressively as ThoughtWorks employees advocate), then most likely you will not be paying ThoughtWorks consultants on a project of yours. But if he convinces you that the idea is good, he's enhancing the chance that at some time in the future you just might start forking some hefty dollars his way.

Look at it this way. If Fowler worked in some company's IT department directly developing solutions for that company, then none of this argument would be valid. Even if he were working for some company that actually produced something and was writing books on the side, it would still be a poor argument - people just don't make all that much off of technical books. But, in fact, Fowler is not developing solutions for his employer - his reason for existence, and his employer's as well, is to sell services to other companies. The same is true of a number of other Thought Leaders - they are not people working on their own solutions and writing about them in their spare time. Instead, they're consultants who use their articles as one more tool to sell their wares and validate their ideas.

This doesn't mean consultants are evil, or lie about everything, or do everything with ulterior motives. Consultants are just plain people, some good, some bad, most indifferent. But it does mean that they have more at stake in what they say and what they write than is true for non-consultants.

To bring this back to more concrete issues: when Fowler wrote his article on "Dependency Injection", he did not do it solely to help consolidate current thought on the topic. Part of the reason was to associate ThoughtWorks directly with the IoC craze - and in redefining IoC as "dependency injection" he made ThoughtWorks much more prominent in that area than it was before. The same is true of the Mock objects article - he's writing about Mock objects, but at the same time he's steering public attention to himself and his company.

So in my mind I was not attacking Fowler's character; instead, I was pointing out explicitly some of his motivations. It is a fact that much of his writing is financially motivated. That's not an attack on his character - it's a cold, hard reality that developers should keep firmly in mind.



About the author

Mike Spille mike@krisnmike.com
Blog: http://jroller.com/page/pyrasun

Mike Spille is an enterprise developer who has been living in the development world for a long time. He has written very interesting articles on topics such as distributed transactions.
