Why frozen test fixtures are a problem on large projects and how to avoid them
Posted by amalinovic 22 hours ago
Comments
Comment by matsemann 20 hours ago
If you're able to decouple the ORM from your application with a separate layer, and instead pass plain objects around (not fat DB-backed models), you're much freer to write code that's "pure": this input gives that output. For tests like these you only need to create whatever data structure the function desires, and then verify the output. Worst case, verify that it called some mocks with x, y, z.
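A minimal sketch of that style in Ruby (names here are illustrative, not from the article):

```ruby
# Domain logic takes a plain value object, not a DB-backed model.
Order = Struct.new(:subtotal_cents, :discount_cents, keyword_init: true)

def total_cents(order)
  order.subtotal_cents - order.discount_cents
end

# A test only needs to build the input and verify the output:
order = Order.new(subtotal_cents: 10_000, discount_cents: 1_500)
raise "unexpected total" unless total_cents(order) == 8_500
```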
Comment by radanskoric 20 hours ago
In reality, that is also not free. It imposes some restrictions on the code. Sometimes being pragmatic and backing off from the ideal leads to faster development and quicker delivery of value to the users. Rails is big on these pragmatic tradeoffs. The important thing is that we know when and why we're making the tradeoff.
Usually I go with Rails defaults and usually it's not a problem. Sometimes, when the code is especially complex and perhaps on the critical path, I turn up the purity dial and go down the road you describe exactly for the benefits you describe.
But when I decide that sticking to the defaults is the right tradeoff, I want to get the most out of it and use Fixtures (or Factories) in the optimal way.
Comment by axelthegerman 20 hours ago
No language or abstraction is perfect, but if someone prefers pure functional coding, Rails and Django are just not it; don't try to make them be. Others like 'em just as they are.
Comment by antonymoose 20 hours ago
Comment by jstanley 20 hours ago
Comment by antonymoose 19 hours ago
Nevertheless I’ve found far more God classes that could be refactored into clean layers than the other way around, specifically in the context of the Rails-style web app GP is discussing. Batteries included doesn’t necessarily require large tangled God classes. One can just as well compose a series of layers into a strong default implementation that wraps complex behavior while allowing one to bail out and recompose with necessary overrides, for example reasonable mocks in a test context.
Of course this could then allow one to isolate and test individual units easily, and circle back with an integration test of the overall component.
Comment by onionisafruit 20 hours ago
Still, most of us work on code bases with design issues either of our own making or somebody else’s.
Comment by bluGill 19 hours ago
Fixtures done right ensure that everyone starts with a good standard setup. The question is WHAT state the fixture sets up. I have a fixture that sets up a temporary data directory with nothing in it - you can set up your own state, but everything will read from that temporary data directory.
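In Ruby's minitest, a sketch of such a fixture might look like this (class name and env var are illustrative):

```ruby
require "minitest/autorun"
require "tmpdir"
require "fileutils"

class DataDirTest < Minitest::Test
  def setup
    # Every test starts from an empty temporary data directory.
    @data_dir = Dir.mktmpdir
    ENV["APP_DATA_DIR"] = @data_dir # code under test reads from here
  end

  def teardown
    FileUtils.remove_entry(@data_dir)
  end
end
```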
Unit tests do have a place, but most of us are not writing code that has a strong, well-defined interface that we can't change. As such they don't add much value, since changes to the code also imply changes to the code that uses it. When some algorithm is used in a lot of places, unit test it well - you wouldn't dare change it anyway. But when the algorithm is specific to the one place that calls it, there is no point in a separate test for it even though you could. (There is a lot of grey area in the middle where you may write a few unit tests but trust the comprehensive integration tests.)
> Worst case verify that it called some mocks with x,y,z.
That is the worst case, to be avoided if at all possible (sometimes it isn't possible). That a function is called is an implementation detail; nobody cares. I've seen too many tests fail because I decided to change a function signature and now there is a new parameter A that every test needs to be updated to expect. Sometimes this is your only choice, but mock-heavy tests are a smell in general, and that is really what I'm against. Don't test implementation details; test what the customers care about is my point, and everything else follows from that (and where you have a different practice that follows from that, it may be a good thing I want to know about!)
Comment by abhashanand1501 19 hours ago
Foofactory() will automatically set up all the foreign key dependencies.
It can also generate fuzzy data, although having fuzzy data has its own issues in terms of brittle tests (if not done correctly).
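The same idea in Ruby's FactoryBot (factory and column names here are illustrative): asking for one record transparently builds the rows it depends on.

```ruby
FactoryBot.define do
  factory :user do
    name { "user-#{rand(10_000)}" }
  end

  factory :project do
    # The foreign key dependency is satisfied automatically.
    association :owner, factory: :user
    name { "project-#{rand(10_000)}" }
  end
end

project = FactoryBot.create(:project) # also creates project.owner
```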
Comment by orwin 19 hours ago
[edit] Though in my case we have one fixture that loads a JSON representation of our dev DynamoDB into moto, so we mock internal data, but this data is still read through our data models; it doesn't really replace internal code, only internal "mechanics".
Comment by bluGill 19 hours ago
Comment by swader999 20 hours ago
Comment by ozim 20 hours ago
As easy as it is to curb duplication in application code, for test code it is really hard to get people to understand that all this duplication that should be there in tests is GOOD.
Comment by disgruntledphd2 19 hours ago
There'll always be some duplication, but too much makes it harder to see the important stuff in a test.
Comment by ozim 3 hours ago
Comment by bluGill 19 hours ago
I have lots of test fixtures each responsible for about 10 tests. It is very common to have 10-20 tests that share a startup configuration and then adjust it in various ways.
Comment by radanskoric 20 hours ago
I'm not sure what you mean by inheritance in tests, but DRY is criminally overused in tests. That could be a whole separate article, but the tradeoffs are very different between test and app code, and repetition in test code is much less problematic and sometimes even desirable.
Comment by swader999 13 hours ago
Comment by dkarl 19 hours ago
"Generators" for property-based testing might be similar to what the author is calling "factories." Generators create values of a given type, sometimes with particular properties, and can be combined to create generators of other types. (The terminology varies from one library to another. Different libraries use the terms "generators," "arbitraries," and "strategies" in slightly different and overlapping ways.)
For example, if you have a generator for strings and a generator for non-negative integers, it's trivial to create a generator for a type Person(name, age).
Generators can also be filtered. For example, if you have a generator for Account instances, and you need active Account instances in your test, you can apply a filter to the base generator to select only the instances where _.isActive is true.
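A minimal sketch of both ideas in plain Ruby, without assuming any particular PBT library (all names are illustrative):

```ruby
# Base generators are just zero-argument callables.
string_gen = -> { Array.new(8) { ('a'..'z').to_a.sample }.join }
nat_gen    = -> { rand(0..120) }

# Combine base generators into a generator for a composite type.
Person = Struct.new(:name, :age)
person_gen = -> { Person.new(string_gen.call, nat_gen.call) }

# Filtering: resample until the predicate holds (fine for cheap predicates).
def filtered(gen, &pred)
  -> { loop { value = gen.call; break value if pred.call(value) } }
end

adult_gen = filtered(person_gen) { |p| p.age >= 18 }
adult_gen.call # => a Person with age >= 18
```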
Once you have a base generator for each type you need in your tests, the individual tests become clear and succinct. There is a learning curve for working with generators, but as a rule, the test code is very easy to read, even if it's tricky to write at first.
Comment by radanskoric 19 hours ago
The problem arises when they're used to generate Database records, which is a common approach in Rails applications. Because you're generating a lot of them you end up putting a lot more load on the test database which slows down the whole test suite considerably.
If you use them to generate purely in-memory objects, this problem goes away, and then I also prefer to use factories (or generators, as you describe them).
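With FactoryBot, for example, the difference is just which build strategy you pick (assuming a :project factory exists):

```ruby
FactoryBot.create(:project)        # INSERTs into the test database (slow)
FactoryBot.build(:project)         # in-memory, unsaved (fast)
FactoryBot.build_stubbed(:project) # in-memory, pretends to be persisted (fastest)
```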
Comment by tclancy 19 hours ago
Comment by radanskoric 15 hours ago
Unfortunately, I'm not aware of a good property based testing library in Ruby, although it would be useful to have one.
Even so I'm guessing that property based testing in practice would be too resource intensive to test the entire application with it? You'd probably only test critical domain logic components and use regular example tests for the rest.
Comment by dkarl 19 hours ago
Comment by strehldev 19 hours ago
My rule was to randomize every property by default. The test needs to specify which property needs to have a certain value. E.g. set the address if you're testing something about the address.
So it was immediately obvious which properties a test relied on.
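A sketch of that rule with FactoryBot and Faker (model and attribute names are illustrative):

```ruby
FactoryBot.define do
  factory :customer do
    # Everything is randomized by default...
    name    { Faker::Name.name }
    address { Faker::Address.full_address }
  end
end

# ...so a test that cares about the address must say so explicitly:
customer = FactoryBot.create(:customer, address: "1 Main St")
```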
Comment by dkarl 19 hours ago
A clarification on terminology: the "property" in "property-based testing" refers to properties that the code under test is supposed to obey. For example, in the author's Example 2, the property the test is checking is that the returned collection is sorted.
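A plain-Ruby sketch of checking such a property over many random inputs (`my_sort` is a hypothetical function under test; no PBT library assumed):

```ruby
100.times do
  input  = Array.new(rand(0..20)) { rand(1_000) }
  result = my_sort(input)
  # The property: the output is the sorted permutation of the input.
  raise "property violated for #{input.inspect}" unless result == input.sort
end
```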
Comment by FuckButtons 19 hours ago
Comment by japhyr 19 hours ago
Comment by dkarl 19 hours ago
Comment by RHSeeger 21 hours ago
> This test has just made it impossible to introduce another active project without breaking it, even if the scope was not actually broken. Add a new variant of an active project for an unrelated test and now you have to also update this test.
The article then goes on to test that the known active projects are indeed included in what the call to Project.active returns.
However, that doesn't test that "active scope returns active projects". Rather, it tests that
- active scope returns _at least some of the_ active projects.
And it does not test that
- active scope returns _all_ of the active projects
- active scope does not return non-active projects
Which, admittedly, is only different because the original statement is ambiguous. But the difference is that the test will pass if it returns non-active projects too, which probably is not the expected behavior.
I prefer to set things up so that my test fixtures (test data) are created as close to the test as possible, and then test it in the way the article is saying is wrong (in some cases)... i.e., test that the call to Project.active returns _only_ those projects that should be active.
Another option would be to have 3 different tests that test all those things, but the second one (_all_ of the active projects) is going to fail if the test fixture changes to include more active projects.
Comment by jon-wood 20 hours ago
Comment by radanskoric 20 hours ago
The "doesn't include non-active projects objections is easy", please check the Example 1 test again, there's a line for that:
```ruby
refute_includes active_projects, projects(:inactive)
```
Hm, if you missed it, perhaps I should have emphasised this part more, maybe add a blank line before it ...
Regarding the fact that the test does not check that the scope returns "all" active projects, that's a bit more complex to address, but let me tell you how I'm thinking about it:
The point of tests is to validate expected behaviours and prevent regressions (i.e. breaking old behaviour when introducing new features). It is impossible for tests to do this 100%. E.g. even if you test that the scope returns all active projects present in the fixtures, that doesn't guarantee that the scope always returns all active projects for any possible list of active projects. If you want 100% validation your only choice is to turn to formal proof methods, but that's a whole different topic.
You could always add more active project examples. When you write a test checking that active projects A, B, and C are returned, that is the same test as if your fixtures contained ONLY active projects A, B, and C and you then tested that all of them are returned. In either case it is up to you to make sure that the projects are representative.
So, by rewriting the test to check that (1) these example projects are included and (2) these other example projects are excluded, you can write a test that is equally powerful as if you restricted your fixtures to just those example projects and then made an absolute comparison. You're not losing any testing power. Except you're making the test easier to maintain.
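As a sketch, the rewritten test might look like this (fixture names :active_a, :active_b, :inactive are illustrative):

```ruby
test "active scope includes active and excludes inactive projects" do
  active_projects = Project.active

  assert_includes active_projects, projects(:active_a)
  assert_includes active_projects, projects(:active_b)
  refute_includes active_projects, projects(:inactive)
end
```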
Does that make sense? Let me know which part is still confusing and I'll try to rephrase the explanation.
Comment by RHSeeger 20 hours ago
> The "doesn't include non-active projects objections is easy", please check the Example 1 test again, there's a line for that:
You're correct; I totally missed that.
> In either case it is up to you to make sure that the projects are representative.
That's fair, but that's also the point you're trying to address / make more robust by how you're trying to write tests (what the article is about). Specifically
- The article is about: how to make sure your tests are robust against test fixtures changing
- That comment says: It's up to you to make sure your test fixtures don't change in a way that breaks your tests
> You can write a test that is equally powerful as if you restricted your fixtures just to those example projects and then made an absolute comparison. You're not loosing any testing power. Expect you're making the test easier to maintain.
By restricting your fixtures to just the projects (that are relevant to the test), you're making _the tests_ easier to maintain; not just the one test but the test harness as a whole. What I mean is that you're reducing "action at a distance". When you modify the data for your test, you don't need to worry about what other tests, somewhere else, might also be impacted.
Plus you do gain testing power, because you can test more things. For example, you can confirm it returns _every_ active project.
All that being said, what I'm talking about relies on creating the test data local to the tests. And doing that has a cost (time, generally). So there's a tradeoff there.
Comment by radanskoric 19 hours ago
> Plus you do gain testing power, because you can test more things. For example, you can confirm it returns _every_ active project.
Imagine this:
1. You start with some fixtures. You crafted the fixtures and you're happy that the fixtures are good for the test you're about to write.
2. You write a test where you assert the EXACT collection that is returned. This is, as you say, a test that "confirms the scope returns _every_ active project".
3. You now rewrite the test so that it checks that the collection includes ALL active projects and excludes all inactive projects.
Do you agree that nothing changed when you went from 2 to 3? As long as you don't change the fixtures, those 2 versions of the test will behave exactly the same: if one passes so will the other, and if one fails so will the other. As long as fixtures don't change, they have exactly the same testing power.
If you agree on that, now imagine that you added another project to the fixtures. Has the testing power of the tests changed just because fixtures have been changed?
Comment by RHSeeger 19 hours ago
No, _but_ (and this is a big _but_) you're not testing the contract of the method, which (presumably) is to return all and only active projects.
Testing that it returns _some_ of the active projects is useful, but there are cases where it won't point out an issue. For example, imagine:
- Over time, more tests are added "elsewhere" that use the same fixtures
- More active projects are added to the fixture to support those tests
- The implementation in the method is changed to be faster, and an off-by-one error is introduced; so the last project in the list isn't returned
In that ^ case, testing that _some_ of the active projects are returned will still pass; the bug won't be noticed.
Not directly related to the above, but I'll note that I would also split 2/3 into different tests.
- Make sure all projects returned are active
- Make sure projects returned includes all active projects
I think that's more of a style thing, but I _try_ to stick to each test testing one and only one thing. I don't always do that, but it's a rule of thumb for me.
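A sketch of that split (assuming a `status` column; everything here is illustrative):

```ruby
test "all returned projects are active" do
  assert Project.active.all? { |p| p.status == "active" }
end

test "every active project is returned" do
  expected = Project.where(status: "active").to_a
  assert_empty expected - Project.active.to_a
end
```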
Comment by radanskoric 15 hours ago
Regarding the fact that I'm not fully testing the contract of the method, you're absolutely correct. But also, no example based test suite is fully doing that. As long as the test suite is example based it is always possible to find a counter-case where the contract is violated but the test suite misses it.
These counter-cases will be more contrived and less likely the better the test suite. So all of us at some point decide that we've done enough and that more contrived cases are so unlikely and the cost of mistake is so small that it's not worth it to put in the extra testing effort. Some people don't explicitly think about it but that decision is still made one way or another.
This is a long way of saying that I both agree with you but that also, in most cases, I would still take the tradeoff and go for more maintainable tests.
Comment by sceptic123 18 hours ago
Comment by radanskoric 15 hours ago
Comment by jillesvangurp 20 hours ago
This means that my test can't depend on the database being in some known state or assume it has exclusive access to that database, and it can't, for example, modify anything that might be used by another test. Tests can only modify things that are specific to that test.
Most of my tests work around this limitation by either just creating their own teams, users, and other objects they need with randomized ids, or in some cases deferring their execution until some bit of logic guarded by a lock has created some shared data that is then never modified.
Instead of hard-coded IDs, I tend to use randomized ids (UUIDs typically). I have a person data generator that gives me human-readable names, email addresses, etc. Randomized data like this avoids tests modifying each other's data.
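A sketch of such a generator in Ruby (helper and model names are illustrative):

```ruby
require "securerandom"

# Each test creates its own rows under random UUIDs, so parallel tests
# never collide on shared data.
def create_test_user(prefix: "user")
  handle = "#{prefix}-#{SecureRandom.hex(4)}"
  User.create!(
    id:    SecureRandom.uuid,
    name:  handle,
    email: "#{handle}@example.test"
  )
end
```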
As an example, we have a few tests for an analytics dashboard that lock on a bit of expensive code that creates a lot of content via our APIs to do analytics on. The scenario is quite elaborate and uses a few factories, known timestamps, etc. If I refactor my data model, my factories are also refactored. Using a lock ensures that data is initialized only once. Once that is done, there are a bunch of tests that test different queries against that data.
You might think that all this is slow. It's not. I have about 380 integration tests like this that run in under 30 seconds on my laptop (which has a lot of CPU cores). Having this as a safety net is very empowering. I've been on teams that had fewer tests where running them took ten or more minutes. This I can do quickly before committing.
Testing like this has many advantages, one of which is easy-to-maintain tests. I put some effort into usable test data factories. The "when" part of a BDD-style integration test is usually most of the work. So, by making that as easy as I can, I lower the barrier for writing more tests. And using all my CPU cores minimizes the impact new tests have on execution time, to the point where I don't worry about it.
Another is that for big structural changes my tests usually continue to work if I just fix their shared factories to do the right thing.
Comment by tclancy 19 hours ago
The answer is most definitely, 100%, with no room for argument, to not speak so assuredly, acknowledge other people have the right to think differently, and find synthesis and/or a set of heuristics that apply for given cases.
But this is the Internet, and we need to be arguing PS2 vs X-Box for the rest of our lives, so have at it.
(Me? Factories are great until they aren't, which may not happen if a project or a team is small enough. Generators are great but do have some footguns and I would love to hand over everything to property-based testing, but I _feel_, without any experimenting or trying, they resist anything other than the purest of pure unit tests and can't help with integration tests that much.)
Comment by jrochkind1 21 hours ago
Comment by radanskoric 20 hours ago
Btw, I also have an article with some of my learnings using factories and I make a remark on how it helps with test speed: https://radanskoric.com/articles/test-factories-principal-of...
Comment by jrochkind1 20 hours ago
While I see the pros (and cons) of fixtures, one thing I do _not_ like is Rails' ordinary way of specifying fixtures, in YAML files. It gets especially terrible for associations.
It's occurred to me there's no reason I can't use FactoryBot to create what are actually fixtures -- as they will be run once, at test boot, etc. It would not be that hard to set up a little harness code to use FactoryBot to create objects at test boot and store them (or logic for fetching them, rather) in, I dunno, $fixtures[:some_name] or what have you, for later reference. And that seems much preferable to me, as I consider switching to/introducing fixtures.
But I haven't seen anyone do this or mention it or suggest it. Any thoughts?
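A sketch of what such a harness might look like (all names here are hypothetical):

```ruby
# In test_helper.rb: build "fixtures" with FactoryBot once at test boot,
# then look them up by name, roughly like Rails fixtures.
$fixtures = {}

def register_fixture(name, factory, **attrs)
  $fixtures[name] = FactoryBot.create(factory, **attrs)
end

register_fixture(:active_project,   :project, status: "active")
register_fixture(:inactive_project, :project, status: "inactive")

# In a test:
#   assert_includes Project.active, $fixtures[:active_project]
```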
Comment by onionisafruit 20 hours ago
Read-only tests only need to run the bootstrap code if their particular fixture hasn’t been created on that machine before. Same with some tests that write data but can be encapsulated in a transaction that gets rolled back at the end.
Some more complex tests need an isolated db because their changes can’t be contained in a db transaction (usually because the code under test commits a db transaction). These need to run the fixture bootstrap every time. We don’t have many of these so it’s not a big deal that they take a second or two. If we had more we would probably use separate, smaller fixtures for these.
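For comparison, the transaction-rollback variant is what Rails gives you out of the box; the relevant (real) setting is:

```ruby
class ActiveSupport::TestCase
  # Wrap each test in a DB transaction that is rolled back afterwards,
  # so fixture data survives untouched between tests.
  self.use_transactional_tests = true
end
```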
Comment by radanskoric 20 hours ago
So you can definitely use FactoryBot to create them. However, the reason I think that's rarely done is that you're pretty likely to start recreating a lot of the features of Rails fixtures yourself. And perhaps all you need to do is to dynamically generate the yaml files. Rails yaml fixtures are actually ERB files, so you can treat a fixture file as an ERB template and generate its code dynamically: https://guides.rubyonrails.org/testing.html#embedding-code-i...
If that is flexible enough for you, it's a better path since you'll get all the usual fixture helpers and association resolving logic for free.
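As a sketch, a dynamically generated fixture file could look like this (column names are illustrative; ERB-in-fixtures is the documented Rails feature linked above):

```yaml
# test/fixtures/projects.yml
<% 3.times do |i| %>
active_<%= i %>:
  name: Project <%= i %>
  status: active
<% end %>
```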
Comment by jrochkind1 19 hours ago
I feel like I don't _want_ the association resolving logic, really; that's what I don't like! And if it's live Ruby instead of YAML, it's easy to refer to another fixture object by just looking it up as a fixture like normal? (I guess there are order-of-operation issues though, hm.)
And the rest seems straightforward enough, and better to avoid that "compile to yaml" stage for debugging and such.
We'll see, maybe I'll get around to trying it at some point, and release a perversely named factory_bot_fixtures gem. :)
Comment by yxhuvud 19 hours ago
Comment by mnutt 20 hours ago
Comment by jrochkind1 19 hours ago
Then you just refer to the fixture in your factory definitions? Seems very reasonable.
Comment by mijoharas 20 hours ago
(Now maybe that's what you used to see what was causing the slowdown, but mentioning it for others to help them identify the bottlenecks.)
Comment by erdaniels 20 hours ago
100% agree with "Test only what you want to test".
Comment by stephen 19 hours ago
The article doesn't mention what I hate most about fixtures: the noise of all the other crap in the fixture that doesn't matter to the current test scenario.
I.e. I want to test "merge these two books" -- great -- but now when stepping through the code, I have 30, 40, 100 other books floating around the code/database b/c "they were added by the fixture" that I need to ignore / step through / etc. Gah.
Factories are the way: https://joist-orm.io/testing/test-factories/
Comment by radanskoric 15 hours ago
Personally, I even slightly prefer to use Factories and I also previously wrote about a better way to use them: https://radanskoric.com/articles/test-factories-principal-of...
Comment by perlgeek 19 hours ago
You can also supply defaults and name schemes for individual columns.
For business logic, I prefer to have it structured in a way that it doesn't need the database for testing, but loading and searching stuff from the DB also needs to be tested, and for those, mixer strikes a really good balance. You only need to specify the attributes that are relevant for the test, and you don't need shared fixtures between many tests.
Comment by onionisafruit 20 hours ago
They are also nice because I don’t have to think so much about assertions. They automatically assert the response is exactly the same as before.
Comment by radanskoric 20 hours ago
But how would you do snapshot testing for behaviour? I'm approaching the problem primarily from the backend side and there most tests are about behaviour.
Comment by onionisafruit 19 hours ago
Comment by radanskoric 19 hours ago
But, I spend very little or no time on API endpoints since I don't work on projects where the frontend is an SPA. :)
Comment by Fire-Dragon-DoL 18 hours ago
The data created by the fixtures shouldn't be touched; otherwise factories should be used, like the author suggested.
Comment by Tknl 21 hours ago
Comment by orwin 19 hours ago
We have a solution. Not sure if it is elegant, but use it as an inspiration: it works.
When our project runs its tests, it will generate its database JSON representation itself (only using its models) from a file that contains fake/test data. That database representation will be loaded in the dev environment, and also in the database fixture that then runs our tests. If our tests pass and we have an issue in dev, that means our tests missed something (which happens waaaaaay more often than I like to admit) and we have to add them.
Forcing every test to use this representation also forces us to have a dev environment that contains enough items to run the tests, and we can't forget to generate an item in the dev database, since that would mean our new feature isn't tested.
Comment by radanskoric 21 hours ago
Comment by immibis 20 hours ago
"assert_equal names, names.sort" is a wrong answer. It would accept an empty collection.
Comment by bluGill 19 hours ago
I have a fixture that sets our database to the initial install state. This works for me because we ship an embedded system, and every month we ship a bunch more new systems, so code needs to handle that initial install state. If we change the initial state (which we do all the time) and a test breaks, we want to know and fix that, since customers will see that situation.
However, if you run on a server in a data center, I could well believe you will never again see any specific state, and so a fixture probably isn't right. Maybe ideally every test would take a snapshot of your current production database and test against that (with whatever additional data you add for the test) - if a customer enters data that breaks a test, that is an "all hands on deck" to fix the code before customers hit that code path. Maybe - I don't work in this space and so I'm just speculating about what you need.