When testing doesn't help

Written by Aaron Fowles 15th November 2022


I spend most of my professional life implementing more tests, getting better observability and monitoring in place, and automating as much of that as possible. So it might seem strange that I’m writing an article about testing being a bad thing. Testing is almost always a very worthwhile activity and you should do it as much as you can. Except when someone tells you to do it.

When testing adds value

Ask yourself - why are you testing? If your answer can’t easily be traced back to delivering accurate, complete and timely data to decision-makers, then something is wrong. Also, make sure that you match the tests to the type of risk you are trying to mitigate. For example:

  • A codebase for user-defined functions (UDFs) containing key business logic - get that unit-test coverage up! Integration tests are probably more effort than they are worth here, because generating an exhaustive set of inputs for them is cumbersome.

  • Tables for exploratory advanced analytics - computing summary statistics asynchronously and making them available on a dashboard is probably a good play here. Your Data Scientists will want to know the shape of the data and any issues with it, but shutting down the pipeline over a few dupes is probably overkill.

  • Tables used for mission-critical decision-making - bake quality and integrity tests into every step of your pipeline. Do not be afraid to fail fast (and loudly) when something goes wrong here.
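As a rough sketch of how those three risk profiles might translate into code, here is some plain Python using only the standard library. Every name in it (`normalise_postcode`, `profile_column`, `assert_unique_keys`, `DataQualityError`) is hypothetical and stands in for whatever your own UDF codebase and pipeline framework provide. It is an illustration of matching the check to the risk, not a recommended implementation.

```python
from collections import Counter


class DataQualityError(Exception):
    """Raised to stop a mission-critical pipeline step (hypothetical name)."""


# 1. UDFs with business logic: cover them with plain unit tests.
def normalise_postcode(raw):
    """Hypothetical UDF: upper-case the value and put one space before the last three characters."""
    compact = "".join(raw.split()).upper()
    return f"{compact[:-3]} {compact[-3:]}"


def test_normalise_postcode():
    assert normalise_postcode(" sw1a1aa ") == "SW1A 1AA"
    assert normalise_postcode("m1 1ae") == "M1 1AE"


# 2. Exploratory tables: describe the data, never block on it.
def profile_column(rows, column):
    """Return summary statistics that a dashboard could display."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(counts),
        "duplicated_values": sum(1 for n in counts.values() if n > 1),
    }


# 3. Mission-critical tables: fail fast (and loudly).
def assert_unique_keys(rows, key):
    """Raise DataQualityError if any key value appears more than once."""
    counts = Counter(row[key] for row in rows)
    dupes = sorted(k for k, n in counts.items() if n > 1)
    if dupes:
        raise DataQualityError(f"duplicate {key} values: {dupes}")
    return rows
```

The same duplicate shows up in both the second and third functions. What differs is the response: the profile reports it and carries on, while the integrity check stops the step.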

Keep the value to the decision-maker in view when elaborating any required development work. Give your Engineers a stake in that success. Integrate testing into Data Engineering directly using approaches like Great Expectations’ Airflow support.

When testing doesn’t add value

I’ve seen plenty of examples where programmes have tried to rush things through to get ahead of schedule, only for decision-makers to find the data is messy, late and incorrect. Then come the inevitable quality gates crashing down, forbidding anything from going live without lots of reviews and manually produced test evidence that is out of date the moment someone ticks the box and waves it through. By that point, the horse has bolted.

Tests that allow a piece of development to move from “in progress” to “done” are generally useless post-deployment. Your organisation’s data is constantly changing, so you need tests and quality checks that run alongside the flow of data as it happens, not way back in the development process.
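To make “running alongside the flow of data” a little more concrete, here is a minimal Python sketch in which every batch is checked for completeness and freshness as it passes through, on every run. The function names, the one-hour freshness threshold and the shape of a batch (a load timestamp plus a list of row dictionaries) are all assumptions made for illustration. A real pipeline would send the issues to its monitoring rather than just yielding them.

```python
from datetime import datetime, timedelta, timezone


def check_batch(rows, required, loaded_at, now, max_age=timedelta(hours=1)):
    """Return a list of human-readable issues for one batch (empty means healthy)."""
    issues = []
    if not rows:
        issues.append("batch is empty")
    for column in required:
        missing = sum(1 for row in rows if row.get(column) is None)
        if missing:
            issues.append(f"{missing} row(s) missing {column}")
    age = now - loaded_at
    if age > max_age:
        issues.append(f"batch is stale by {age - max_age}")
    return issues


def monitored(batches, required, clock=lambda: datetime.now(timezone.utc)):
    """Yield (rows, issues) for each batch so the checks run with the data, not before it."""
    for loaded_at, rows in batches:
        yield rows, check_batch(rows, required, loaded_at, clock())
```

Because the checks sit inside the data flow, they keep reporting long after the ticket that introduced them has been closed.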

Should you test?

Always test if the value gained is provable. Never test because there is a column on your team’s board that a programme manager put there. If you can’t take the leap of trusting and empowering your team to make the right decisions about testing, then at least make sure their goals are aligned with the value you are delivering. For example, if your team is rewarded for the burndown of lift-and-shift migration tickets, moved to the “done” column for some other team to support, then they are not incentivised to make data quality monitoring their problem. And remember - engineering teams love solving problems! Don’t tell them to test - make testing a problem that they are invited to apply their creativity to solve.

So, I hope you’ll never test because someone tells you to do it.