Possible improvement for SUnit: yellow by default

This is just a small idea, but I’d like to share and discuss it: A test method that contains no assertions should turn yellow by default.

Why on earth?

Well, you know, in theory, everybody starts writing tests with assertions. And as we all know, we all write tests first and all that stuff.

In practice, the grass is a little less green and I admit freely that if I do test-first (or at least test-together-eith-evolving-code), I often start by writing a test method that simply contains a method send to the real method, just to see if it runs at all, or if I have some very stupid logic error in there. Sometimes, I even don’t really know what I’d expect as a result from the error, I just work myself towards the logic of looping over a stream or whatever. The approach of using a TestCase for the first experimental runs of an eveolving method is a good thing for me, because I can run my code and collect all that’s necessary to feed into it for its runs as setup code in the TestCase from the very start. I’ll have to implement that setup later anyways, so this saves me time and makes writing tests a very natural experience (just swap out the workspace for a test case).

Okay, so what?

So I start writing the method (most of the times, I must admit, I start with writing a draft method before I write the test method, and then start with a test driver for it to see if my idea for the method’s logic is any good) and then write a test method that simply calls my method. That’s all good and well, and I like the way this helps me write code.

BUT far too often, I jump away from this test method and work on something else, because I like the results of my method and since I have a test that’s green, all is perfect.

But wait: it’s green, but this doesn’t mean it meets any requierements, because I’ve not formulated any. Remember, my test  method has no assert: deny: or anything like that, it simply runs my method. So it doesn’t really test anything other than whether the method crashes, because that would turn the test result to red.

It has happened to me more than once now that I was proud to have green tests and having done everything so much better than most developers ever do, just to find out that even though I had a bunch of green tests, the end result simply was wrong, because the green tests simply were green because they didn’t test anything.

But hey, if I did it right, this would not be a problem!

That’s true. If I always started by formulating a (probably useless or sure-to-fail) assertion around my call to the method, the test would fail from the very start and I couldn’t forget to turn the test green before shipping the code.

But there are several major problems with this approach:

  1. I am bad in breaking habits – good or bad
  2. Having to think about the result of a method before I even know how the logic of the method will be, and even before I know if it’s going to be broken up into several methods in just a few methods, interrupts my train of thought. But if this assertion should always fail anyways, why can’t SUnit fail for me (see also next argument) ;-)
  3. Isn’t a tool made for making life easier? So shouldn’t it check whether it’s checked anything, because if it didn’t it simply lies to me when it turns a test green!
  4. I guess I would write an assertion to the end of each test method that makes no sense, just to turn the test yellow. This would only help me develop a new bad habit instead of support my in my good intentions. And: SUnit can do that for me and thus enable me to concentrate on assertions that make sense while it reminds me I still need to write an assertion.

So what am I asking for ?

In fact, I am asking for a default state for a test to be yellow until a first assertion is made (or an error occurs, of course). Because green is a very dangerous color for a test method that didn’t check anything. It gives me the illusion that I am on the right track, and this is dangerous and can be expensive

So what do others think? Should we change SUnit or am I the only one with this problem?

Using the new Monticello Importer from Instantiations

I just wanted to try the new Monticello Importer Beta that Instantiations just released yesterday and thought it wouldn’t hurt to describe the installation and use a little. Not that it’s hard, but not everybody might have seen monticello before.

Installation

Monticello’s .mcz files basically are zip files with a defined structure of ST source files and descriptions in them. Continue reading

Yet another Mock Object “framework” for VA Smalltalk

I’m currently preparing a training/workshop for a new customer where we want to find ways and techniques to improve their use of unit tests in a legacy Smalltalk project. They’ve been using VA Smalltalk for a couple of years successfully and provide a very respected family of products in their field. In fact, they say they are a market leader in the very business. After a few years of trying to ignore the fact they successfully use Smalltalk in their product, they’ve come back to a point where they accept it as a useful technology that has its place and benefits – even if it is not considered mainstream.

So they – like many other Smalltalk projects – are faced with the fact that they missed the train for unit tests and code improvements in their Smalltalk pool while most other teams adopted these techniques years ago. So now they need to find a way to get their product under test without breaking it. Not that this can’t be done, it’s just a question of where to start and how to get all team members into the same boat while accepting that there is no chance to get a significant percentage of test coverage any time soon.

But back to what I wanted to write about. I am going through notes and code snippets of projects I’ve worked on to find useful stuff that we can use as starting points or discussion base for their specific way to test land. I stumbled upon a little Mock Object implementation that I had implemented back when I was young, so much younger than today… (wait, isn’t that some line from an old Beatles song, that’s even older than that?).

It simply consists of one class that can be configured to answer messages with defined results and interview it how it went after it was used. The whole thing was inspired by an old blog post by Sean Malloy and is really a neat, tiny and feature-poor simplest thing that could possibly work. Nevertheless, I couldn’t find a ready-made Mock Object implementation for VAST on G, so I thought it probably won’t hurt if I uploaded mine.

So there it is on VASTGoodies, ready for you to explore, use, extend and post a better version back to VASTGoodies. Feel free to like it or port another one to VA ST, there are several better ones available for VisualWorks, Squeak and maybe more.

There really isn’t much to say about the tool, other than that it’s easy to setup and use, you can always take a loot at the TestCase in the MiniSMockTests Application that comes with MiniSMock.

Here’s a mini-tutorial for MiniSMock:

You set up a Mock Object like this:

mock := MockObject new.
mock answer: #test with: [Date today].
mock answer: #sayHelloTo: 
  with: [:aPerson| Transcript show: 'Hello, ', aPerson asString;
    cr. aPerson].
mock answer: #sayHelloTo:and:
with: [:aPerson1 :aPerson2| Transcript show: 'Hello, ',
   aPerson1 asString, ' and ', aPerson2 asString; cr. aPerson2].

and you can then send messages to it. Just like this:

mock sayHelloTo: 'Joachim'.

In a Testcase you can then ask a MockObjects a few questions:

mock receivedMessage: #sayHelloTo:.
mock receivedMessage: #sayHelloTo: withArguments: #('Joachim').

And that’s all folks.
But don’t underestimate the power of Mock Objects when it comes to introducing unit tests in a legacy project!

SUnit extension uploaded to VASTGoodies.com

I’ve just uploaded my little extension to the SUnit Framework to VASTGoodies. If you’d like to try it, I suggest loading the map z.ST: SUnit Testing.

What does it do?

It simply keeps the text that were defined in test assertions like assert:description: stored in an individual TestCase when the test fails, or the name of an exception thrown when a TestCase ended in an error. This is achieved with an additional instance variable in TestCase and a tiny change to TestCase>>performTest.

What is that good for?

One thing that I’ve been doing for quite a while now was helping Smalltalk Legacy projects to get back to the speed and agility that Smalltalk enables, but these projects never reached due to their corporate state of being legacy and about to be switched off any time soon now. Knowledge and Motivation was lost to other projects and therefor the code quality and project techniques are sometimes in a very sad state.
The most unwanted job in such project teams is the packager’s and code manager’s, because it takes a lot of time and is a mine field  in most projects: rotten code structures, undocumented procedures for packaging and deployment preparation and lack of knowledge, because the guy who used to do it left the team some years ago and never kept records of what to do and why.
In comes continuous integration and tools like hudson/jenkins that are excellent in performing automatic tasks like running tests, packaging, preparing a deployment directory and such.

A big shortcoming of SUnit in combination with hudson is that it doesn’t keep the description texts of assumptions anywhere. You can only see them if you debug a TestCase in the Test Browser or a workspace snippet, and only if you run that very test individually. Running a big TestSuite means losing these texts completely.
This is especially bad in combination with a tool that can keep a list of all failed tests for statistical reasons and more importantly for all team members to check every morning. It’s really annoying if you have to rerun a single t

So I wanted to add the descriptive Text to some kind of list that I can use to provide useful feedback.

So what is different now with failureTexts?

You can now run a TestSuite and inspect the TestResult. TestResult keeps a list of passes, failures and errors, which are the TestCases themselves (an instance of TestCase with the name of the individual test method it ran in the variable testSelector). Now a TestCase has an additional variable named failureText which holds a descriptive text of what failed or error’d (If you used assert:description:).

So if you write an assertion like

self assert: (1+1=3) description: ‘Smalltalk knows that 1+1 is not 3′.

You will find the TestCase in the failures list of the TestResult, and its failureText variable will contain the text ’Smalltalk knows that 1+1 is not 3′.

That’s all folks. A little change that can help a lot!

What does it help?

In our projects we have a little exporter tool for SUnit test result in a jUnit compatible XML file. This can be imported by hudson and kept as one of many statistics of a build. So now we have a list of test results that each team member can browse each morning and see if their tests failed and what exactly failed in it.

Can I use it in my dialect?

I haven’t tried yet, but the code change is very small and I suspect it is completely portable. One of the tests in SUnit Testing references an exception class by name (ZeroDivide) and that may have to change in other dialects, but other than that I see no reason why it shouldn’t be portable. I’m happy to provide file out instead of a .dat to anybody who’d like to try the change in their favorite smalltalk. I’d also be happy to hear from you if you tried it and what you think of it.

Little Addition to SUnit: keep a failure text with errors and failures

I’ve posted about a problem I’m having with SUnit before: a TestResult does not hold the description Strings of failures and errors, so it is not easy to log unit test results with bare SUnit.

I need this for our Hudson Build Server integration on our project. While some developers have solved this problem by using their own Test runner implementation that writes a log file in jUnit Format right during the run of a TestSuite, I’d prefer to first run the tests and harvest all the Results from the TestResult afterwards. It may be a mere question of Taste but I think mixing test running code with XML generation has a certain smell.

For quite a while we simply ignored the problem and fed an XML file back to hudson which contained the text “The test failed – please rerun it in the SUnit Browser for details”. Which is of course exactly what you have to do when you want to fix that test or the code it tests: rerun it in the Test Browser and maybe debug or step it to see what’s wrong. But it felt strange. We had spent so much effort into automated build tools and hudson integration batch files and stuff just to see a bloody placeholder text in our statistics. It just feels like being a whining coward, not a redneck programmer. But what’s even worse is that the developers tend to not add descriptions to their assertions, because they’re useless anyways. So this little glitch in SUnit may lead to tests that tend to be cryptic. Part of my consulting job is to convince long-time Smalltalkers about the usefulness of unit testing in legacy projects. There’s not much that’s more depressing than to see how people finally use unit tests and suddenly realize they give up on descriptive assertions for that reason.

So there’s always been the idea of “fixing” SUnit to make it keep the description Strings for failed assertions. It turned out that it’s neither easy nor elegant, at least none of the ideas I tried.

The best and most natural place for an extension first seemed to be TestResult>>#sunitAnnounce:toResult:  which was introduced in SUnit 4. Here’s what the release notes for SUnit 4  say:

TestResult now double-dispatches via the exception (see #sunitAnnounce:toResult:). This makes it easier for users to plugin specific behaviour.

Unfortunately, there are two problems with this method:

  1. The exception thrown or failed assertion is not handed over to this method, so you can’t store it anywhere
  2. If you simply create and instance of a TestCase and send it #runCase to run it, there is no TestResult involved in running it

Another problem is that even if we mostly run our tests in the form of a TestSuite which always produces a TestResult, the collections #failures and #errors do not store some kind of test result object, but simply the TestCase instance that failed/error’d.

So the simplest thing that could possibly work is to add an instance variable to TestCase to hold a String (well, it can of course hold anything, we’re using Smalltalk after all). This variable would be set to an empty String just before a test is run and be filled with either an exception’s description if test execution results in an error, or with the description text of a failed assertion.

This gives us the ability to not change much about the way TestResult works, not break the SUnit Browser or any other SUnit-related tool, but keep the error texts.

It turns out this worked quite well for our project, and there was very little I needed to change in SUnit:

  1. Add an instance variable #failureDescription to TestAsserter (with getter/setter)
  2. Change TestCase performTest to:
    performTest
      "Implemented and tested on VA Smalltalk 8.5, but should be portable"
        [
          self failureDescription: ''.
          self perform: testSelector sunitAsSymbol]
            sunitOn: TestResult failure , TestResult error
            do: [:ex |
                self failureDescription: ex description.
                ex pass]

And that’s it. I can now iterate over the #failures or #errors of a TestResult and harvest their decriptions to do whatever I want with them. In my case I want to add them as XML tags into our jUnit-xml for the hudson server. So far we’ve not had any problems with this change.

I’m well aware that this is neither an extraordinarily clever nor galactically elegant solution, especially since it involves changes to the SUnit code base. But I like the effects I see so far. The feedback from the team is: “I can now tell right from the hudson log what my bug is in many cases”.

Before I am putting this change up to VASTGoodies.com, I’d like to hear what other people think about this change. Is it going in the wrong direction? Too invasive into the concepts of SUnit? Would people rather put this kind of thing into the logging portion of SUnit (Be aware that logFailure* methods are only executed for failures, not errors!) or do they like this approach? Any better ideas out there? How did you solve this problem?

A little update on sUnit and should:/shouldnt:

A few days after my post sUnit and should:/shouldnt: vs. assert:/deny: I received a mail from Niall Ross who maintains sUnit. I was quite busy the last weeks, so I am a bit late with this update…

Niall pointed out that should: and shouldnt: both are deprecated, so it is not a good idea to use them anyways. If I had taken a closer look at the category names in TestCase, I could have found out myself.

He also confirmed that the current behaviour of shouldnt:raise: is a bug. We still have to work out if the method should fail if the Block raises another exception than the one mentioned or if this means the test is passed. I think shouldnt:raise: is really meant to say: throw whatever exception you like, as long as it’s not the one(s) I name here! I’m planning to try and fix this one this week if I find the time and nerve.

Along with Niall, quite a few people made a strong point about my suggestion to make sUnit accept Blocks as a parameter to assert: .

I’m not sure it buys anything to make assert: take a block.  When not checking for the raising or non-raising of errors, why use a block?  (By all means give me counterexamples).

Even stronger opinions have been expressed about my suggestion to do this by implementing the value method on Object. I tend to not agree to most of them, but it was again Niall who came up with the strongest argument: as long as Object>>value is not standard on all Smalltalk implementations, sUnit shouldn’t introduce it. I must admit I agree to this one. Porting code from one Smalltalk dialect to another can be hard without this one, and even if implementing Object>>value is not much of a deal, it should be avoided.

So I am now happy to write

self assert: (myBlock value = true).

in my future tests.

This is probably necessary in most cases anyways, because the Blocks we use most often do not evaluate to a Boolean. So checking their result in an assert: rather than in a (deprecated) should: is the way to go.

Thanks to all commenters and to Niall for taking time to think about my post and comment.

sUnit and should:/shouldnt: vs. assert:/deny:

[Update]Be sure to read the comments if you’re interested in the topic. There’s a lot of interesting info to find there[/Update]

Last month I’ve given a fast-start Smalltalk training session to a new team member at a Customer’s project and since that person was an experienced developer we not only covered Smalltalk basics but also some special areas like sUnit. As I’ve mentioned before, I found a few errors in my material and I am currently working on corrections and extensions of the slides. So now I am taking a fresh look at sUnit – especially the variants of #should: – and found a few interesting things that I’d like to share with whoever is interested.

Let’s kill should:!

Let me first explain what should: is good for. In essence, it is the same as assert:, but it accepts a zero-argument Block. So the following two assertion statements are completely equivalent:

self assert: 1 < 3. self should: [1 < 3].

The only difference between the two is that assert: expects a Boolean parameter and should: expects a Block which it will evaluate before delegating to assert:. The same is true for deny: and shouldnt:.

[Update] To be a bit more precise: assert: means “I expect this expression to be true” while should: means “I expect this Block to evaluate to true”. In effect, both means the same thing: the result of the parameter is expected to be true.
[/Update]

So my first thought here is: why not eliminate should: completely and implement assert: in a way that it can work with both Booleans and Blocks. While this sounds complicated, it really isn’t: since in many Smalltalk dialects there is an implementation of value in Object (so that Objects return themselves and Blocks the result of their evaluation*), the only thing to do is to send value to the parameter of assert:.

[Update] The Implementation of TestAsserter>>should: is in fact nothing more than

should: aBlock  self assert: aBlock value

While assert: looks like this:

assert: aBoolean aBoolean ifFalse: [self logFailure: 'Assertion failed'. TestResult failure sunitSignalWith: 'Assertion failed'].

In the end, the implementation of our new assert: is easy:

assert: aBooleanOrBlock aBooleanOrBlock value ifFalse: [self logFailure: 'Assertion failed'. TestResult failure sunitSignalWith: 'Assertion failed'].

[/Update]

This would mean there’s no need to memorize a different method for making assertions about Blocks. We could simply write

self assert: [a < b].

So let’s move on to should:raise: and kill it also

The idea of should:raise: is to formulate an assertion about whether a piece of code will raise a certain exception. Like this:
self should: [5/0] raise: ZeroDivide
description: 'Division by zero was not signalled'.

This will work like a Clockwork: The exception is raised, so the test case will pass, because we expected it to.

My first critique here is again the naming: why not call it assert:raises: instead of should:raise:? I find should:raise: is ambiguous. Does it mean the Block should run and not raise an exception (because #should: means the Block is expected to evaluate to true, and therefor run correctly), or does it mean evaluating the Block should raise the exception? In fact the latter is the case, but I find it hard to memorise.
A second argument for assert:raises: would be that it is in line with method names like assert:equals: as they were backported from jUnit.
Together with the above-mentioned value trick, this would mean that you can even test every piece of code (not only Blocks) whether it raises an exception or not. The following two assertions would be equivalent, while there is currently no way of achieving the second one:

self assert: [5/0] raises: ZeroDivide description: 'Division by zero was not signalled'.

self assert: (5/0) raises: ZeroDivide description: 'Division by zero was not signalled'.

[Update]Giving this an even deeper look, there still is somethng wrong with all this should: stuff. While should: means “I expect this Block to evaluate to true”, should:raise: means “This Block should raise the following exception, no matter what it evaluates to if it doesn’t”. So the meaning of should: and should:raise: is far more different than their names suggests. Not sure what exactly that means for my naming idea yet, but I’m still meditating on it ;-) .
Also, I am not yet decided if I like assert:doesntRaise: more than deny:raises: because the latter one suggests I want the parameter (namely Block or expression) to be false. Well, I guess I like the *doesntRaise: variant more…[/Update]

So, back to shouldnt:raise:: I found a bug!

Last, I found out that while should:raise: can test whether an expected exception gets thrown, the opposite case is only partially testable:

self shouldnt: [5/1] raise:  ExError description: 'An Error was signalled unexpectedly'.

Can test whether an exception is not thrown. Unfortunately, this only works if no exception is thrown at all in the Block. The case of an explicit exclusion of an exception that should not be thrown by a Block is not covered by sUnit. What I mean is that I want to formulate an assertion that means something like

No matter what kind of exception this piece of code may throw, I want to be sure it is not an exception of Kind X.

"This doesn't work!!! - results in an error instead of passing" 
self shouldnt: [5/0] raise:  ExHalt description: 'A Halt was signalled unexpectedly'.

I’d be interested in hearing what other sUnit users think about my suggestion of replacing should: with assert: and whether there are any reasons for keeping them that I haven’t thought of. Also I’d like to learn what others think about the “Exception Exclusion bug”.

*) The implementation of value in Object is a brilliant example of how useful polymorphism can be. I like the value trick very much and find it extremely helpful in making object protocols lightweight and elegant (just like in this example of sUnit). Being able to pass either an object or a Block as a parameter into a message without a performance penalty or code overhead other than sending value to a method argument is extremely powerful. The whole topic is probably worth a blog post on itself…

Please note: the code examples were tried on VA Smalltalk 8.0.

Another note: I’ve been wanting to post these ideas to some forum that is read by Smalltalkers of all flavors, but since I think the relevance of comp.lang.smalltalk is somewhat extinct (and my carrier even won’t let me use Usenet any more) and I know of no suitable mailing list for the topic, It’s only here…

What sUnit could learn from jUnit [UPDATE]

The complete family of xxxUnit testing frameworks have their roots in Kent Beck’s sUnit testing Framework for Smalltalk.

While jUnit (the Java flavor) and sUnit were very similar for a long time, jUnit “lately” added a lot of stuff that is helpful in Java, but quite useless in Smalltalk (like Java Annotations), mostly with jUnit Version 4.

There is, however, one difference between sUnit and jUnit that I think should be overcome. I am speaking of the TestFailure Class in jUnit that has not been added to sUnit. I don’t know exactly when the jUnit implementation got TestFailure, but I miss it.

Why do I miss TestFailure? Because its non-existence makes post-testing reporting very hard, unless you write your own TestRunner.

But let’s start from the beginning.

I’ve played with the Hudson Build Server (which will soon be known as Jenkins, but that’s another ugly Oracle-related story) lately in order to have a nice frontend for automatic builds of the VA Smalltalk projects of a customer (you can learn more about the beginnings of this project here) .

One thing that I wanted to do is use my TestResults and report them back to Hudson. Which is quite easy once you find out about a suitable xml format for TestResult. The gold standard here is again an artefact that’s coming out of the jUnit project.

Since my build machine works like a batch processor, and since I wanted to simply use the TestResult of a Suite to create an xml file that can be read by Hudson, I was hoping for an easy way to marshal TestResult to XML. So the first idea was to add a step to my batch that accepts the TestResult from the Test runner step and simply convert the results to XML.

Unfortunately, sUnit does not keep the output of failed tests anywhere, it just keeps the test class and test selector of a failed test.

So if you have an assertion like

self assert: (1>2) description: ’1 should be greater than two’.

and the assertion fails, you will not find the description anywhere in the TestResult. The testwill be recorded as failed, but there will be no way of finding out what went wrong. You’ll have to rerun the test in the sUnit Browser and take a look at the Exception that gets thrown.

I took a look at Lukas’ implementation of his Hudson build integration for Pharo. He wrote his own TestRunner that immediately outputs the jUnit XML file, but I think this is not the right solution. The right solution is what jUnit does: keep an instance of TestFailure for each failed test that includes the description. (you can take a look at TestFailure API documentation – it’s lean and easy)

Why do I think so? Because writing your own TestRunner will probably decouple you from future sUnit development, and because writing an xml output is probably not the only place in which you might need the description.

What do others think? Should we change sUnit and add a TestFailure Class?

[UPDATE]: I just found out that sUnit V4 has a TestFailure class, as a subclass of Exception. From the comments on the SUnit Config Map in VAST 8.02:

- The major 4.0 change is to dispatch on the exception (see #sunitAnnounce:toResult:).  This allows
user-defined subclasses of TestFailure or Error to plugin different behaviour in specific cases.  (For
robust behaviour in the face of old code using pre-ANSI-error-raising, Signal also responds to it.)

So I now have to take a closer look at V4 to see whether and/or how I can collect them in a TestResult.

Still I think the design and implementation of #sunitAnnounce:toResult: is wrong or insuffcicient for reporting purposes. #sunitAnnounce:toResult: adds the TestCase to the list of failures, but what I’d really want is some extra infor to which I can add the error description so that it is accessible from the TestResult.

One approach (which seems obvious) would be to add the TestCase to the TestFailure Exception and add the exception to the failures collection. Thus the failures list could access all info that’s needed for Reporting.

I’ll have to Play with this a little, and hope to find some time in the next few days…