Nose and Decorators: be careful!

One theory of unit testing is that your tests should all be entirely independent. I’ve heard that Google actually has a test runner that randomizes the order of the tests on each run, so independence is guaranteed: any test that depends on another test having run first will fail.

I want my test files to be independent of each other, but within a file I find that having one test lead into another is often a good way to reuse a fixture without making it more complicated than it needs to be.

Nose (and py.test) make this really easy, because they run tests in the order in which they appear in the file. (Note that if you’re using Nose to run unittest.TestCases, those tests still run alphabetically, as they do under Python’s built-in test runner.) Running tests in file order is very natural and leads to few surprises. It’s easy to write a string of tests that build up a nice, complicated fixture, testing little pieces along the way. I’ll mention that most of my tests will actually run just fine in any order, but I like being able to count on the order when it’s handy to do so.
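To make that concrete, here’s the shape of a test chain like the ones I mean (a sketch only; Workspace is a made-up stand-in for whatever your real fixture is):

    state = {}

    def test_create_workspace():
        # Runs first because it appears first in the file.
        state['ws'] = Workspace('/tmp/demo')    # Workspace is hypothetical
        assert state['ws'].exists()

    def test_add_file():
        # Relies on test_create_workspace having already run.
        state['ws'].add('hello.txt')
        assert 'hello.txt' in state['ws']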

All of that is a lead-in to my fun from today. I’m using Fuzzyman’s Mock module to drop in stand-ins for things that are annoying to test. Mock comes with a handy @patch decorator that will replace the callable of your choice with a Mock on the way in, and restore it automatically on the way out. It works great and is really easy to use.
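If you haven’t seen it, usage looks roughly like this (myapp and cleanup are made-up names for illustration; patch hands the replacement Mock to the test as an argument):

    from mock import patch

    @patch('os.remove')
    def test_cleanup_removes_file(mock_remove):
        from myapp import cleanup    # made-up module under test
        cleanup('/tmp/scratch.txt')
        # os.remove was a Mock for the duration of this test
        assert mock_remove.called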

I was thrown off for a bit today because I put a @patch on one of my test functions, and it started failing because a file that it expected to see did not exist! Removing the decorator brought the file back. That file was created by a test function earlier in the same file.

If you’re nodding knowingly at this point, you’ve probably worked with decorators a fair bit before. I’ve been known to use more than my fair share of decorators, and I’ve found that while the syntax is great, some of the side effects can be really annoying. That’s why Paver’s decorators aren’t true decorators: they just register behavior rather than replacing the function object.

So, what’s the relationship between using a decorator and the file-based ordering of tests? To sort by file order, Nose looks at the test function’s func_code.co_firstlineno. In the case of my test today, that was around line 250. However, when I applied the patch decorator, the function was no longer my original function… it was the function that Mock’s decorator returned (the one that does the nifty swapping in and out of the Mock). That function was defined around line 97 of mock.py.
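You can watch the line numbers move with a toy decorator (nothing Mock-specific about it):

    def wrapping(func):
        def wrapper(*args, **kw):     # the line of this 'def' is what
            return func(*args, **kw)  # co_firstlineno reports after decoration
        return wrapper

    def plain():
        pass

    @wrapping
    def decorated():
        pass

    print(plain.func_code.co_firstlineno)      # the line of 'def plain'
    print(decorated.func_code.co_firstlineno)  # the line of 'def wrapper'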

When Nose went to sort the functions by func_code.co_firstlineno, after decoration my test function was thought to be at line 97. I naively thought that I’d just change func_code.co_firstlineno. Nope, read only. Then I tried sticking an object in place of func_code that returned the value I wanted for co_firstlineno. Nope, func_code has to be a ‘code’ object.

Luckily, Nose itself has a solution to this problem. The function object can have a ‘compat_co_firstlineno’ attribute on it, and that attribute will be used instead of func.func_code.co_firstlineno. A one-line change to mock.py was all it took.
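The change amounts to something like this inside the decorator (sketched here with a stand-in decorator rather than Mock’s actual patch code):

    def patcher(func):
        def wrapper(*args, **kw):
            # ... swap the Mock in, run the test, swap it back out ...
            return func(*args, **kw)
        wrapper.__name__ = func.__name__
        # Tell Nose to sort this test by the original function's
        # position in the file, not by where wrapper was defined.
        wrapper.compat_co_firstlineno = func.func_code.co_firstlineno
        return wrapper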

6 thoughts on “Nose and Decorators: be careful!”

  1. Hi Kevin. Interesting note. I was experimenting the other day with mocking in py.test. It reinforces my current idea of attaching attributes to a test function object and making the test machinery aware of them instead. cheers, holger

  2. Your post reinforces my belief that relying on the order in which tests are run to make them run correctly is a REALLY BAD IDEA :).

    Some reasons why:

    – you can’t run individual test cases reliably if they depend on other tests to do the correct thing

    – setup and teardown are what the setup and teardown fixtures are FOR

    – you’re violating commonly held expectations of unittest – for better or for worse, people EXPECT test cases to be independent

    – running tests in random order is a great way to find hidden dependencies in the code you are testing. Such hidden dependencies are a major source of bugs, in my experience.

    The real question is this, tho: to re-use test fixtures, why not have the appropriate setup/teardown functions attached to a class, and group the tests by classes? (There are plenty of other options in both nose and py.test for doing this, but I tend to just use classes.)
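    Concretely, something like this (make_connection is a stand-in for whatever expensive fixture is involved; nose runs setup/teardown around each test method):

        class TestWithConnection(object):
            def setup(self):
                # make_connection is hypothetical
                self.conn = make_connection()

            def teardown(self):
                self.conn.close()

            def test_query(self):
                assert self.conn.query('SELECT 1') is not None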

  3. @Holger: that strategy has definitely worked nicely with Paver. I’d go for that rather than true decorators any day.

    @Titus: I generally don’t care about running individual test cases. Any single test module of mine runs quickly enough that it’s not a problem.

    And, it only takes a few seconds to communicate to others on my team that the test cases are set up to run in order. (For a public project, there is a difference there…)

    I actually don’t tend to run into many bugs that would be caught by reordering my tests…

    That said, when I switched to py.test/nose/testido (the thing I made before nose came into existence), I stopped using unittest.TestCases because of how annoying they were to work with. Really, though, the annoyance is only in test collection. You’re right that unittest.TestCases are plenty convenient when working within nose, and I’ll consider using them the next time I have more fixture to build up and tear down.

    Just because I don’t have to use unittest.TestCase doesn’t mean I shouldn’t 🙂

  4. “I actually don’t tend to run into many bugs that would be caught by reordering my tests…” — this raises a red flag for me. Absence of evidence is not evidence of absence!

    Anecdote: I once found a bug in my Cartwheel project (probably my biggest public project) that I detected because I could run tests in any order. The bug was severe but basically invisible; I don’t think I would have found it any other way until it had caused data corruption in my database.

    More generally, order dependency of tests allows hidden linkages between disparate parts of your program. These are sources of additional complexity and (therefore) often sources of bugs…

    but I’ll stop brow-beating you now 🙂

  5. Well, sure, absence of evidence is not evidence of absence… That’s the argument made by functional programming types who want *provably* correct software (and never ship anything 🙂)

    I’ll tell you what, though: I will concede that the cost of not depending on test ordering is just a few lines of code. I’ll also agree that that cost is small enough that the win from random ordering and individual test runs outweighs it.

    This does raise an interesting question… has anyone released a nose plugin that does random test ordering?
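    The heart of such a plugin would just be shuffling a flattened suite. In plain unittest terms (not wired into nose’s plugin machinery), something like:

        import random
        import unittest

        def iter_tests(suite):
            # Flatten nested TestSuites into individual tests.
            for test in suite:
                if isinstance(test, unittest.TestSuite):
                    for t in iter_tests(test):
                        yield t
                else:
                    yield test

        def shuffled(suite):
            tests = list(iter_tests(suite))
            random.shuffle(tests)
            return unittest.TestSuite(tests)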

  6. FWIW, I also think that randomized running of tests is a good idea (in some cases, running tests concurrently would also be worthwhile). With py.test this implicitly already happens when you use the experimental “distribute tests to multiple machines or processes” or “loop on failing test set” modes. Having a more explicit option, though, would be nice.
