This articulates the problem I’m having right now in an interesting way. I’m fine writing unit tests that validate business logic requirements or bug fixes, but writing tests that validate implementations to the point that they reimplement the same logic is a bit much.
I want to figure out how to count the number of times a test has had to change with updated requirements vs how many defects they’ve prevented (vs how much wall clock time / compute resources they’ve consumed in running them).
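A rough sketch of what that tally could look like, assuming you record the counts by hand (or from commit messages and CI logs). `SuiteRecord`, its fields, and the sample numbers are made up for illustration, not taken from any real tool:

```python
from dataclasses import dataclass


@dataclass
class SuiteRecord:
    """Hypothetical per-test bookkeeping; every field is an assumption."""
    name: str
    requirement_edits: int    # times the test was rewritten because requirements changed
    regressions_caught: int   # times a failure pointed at a real defect
    total_runtime_s: float    # cumulative wall clock seconds spent running it


def summarize(records):
    """Add up edits, catches and runtime across all recorded tests."""
    return {
        "edits": sum(r.requirement_edits for r in records),
        "caught": sum(r.regressions_caught for r in records),
        "runtime_s": sum(r.total_runtime_s for r in records),
    }


def catches_per_edit(record):
    """Regressions caught per requirement-driven edit; None if never edited."""
    if record.requirement_edits == 0:
        return None
    return record.regressions_caught / record.requirement_edits


# Illustrative numbers only.
records = [
    SuiteRecord("test_invoice_total", requirement_edits=4, regressions_caught=1, total_runtime_s=120.0),
    SuiteRecord("test_login_flow", requirement_edits=0, regressions_caught=2, total_runtime_s=30.0),
]
print(summarize(records))
```

The hard part is not the arithmetic but deciding honestly whether a given red build was a real regression or just a requirement change.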
Brilliant distillation of this insight. I've never heard it put in those words before, but it's perfect. It cuts both ways too: if you have lots of tests but most of them aren't really exercising the external API, then you're worse off.
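To make the distinction concrete, here is a minimal sketch with a made-up `shipping_cost` function (the pricing rule is hypothetical). The first test restates the implementation; the second only pins down what a caller of the external API is promised:

```python
def shipping_cost(weight_kg: float) -> float:
    """Hypothetical rule: flat 5.00 up to 2 kg, then 1.50 per additional kg."""
    if weight_kg <= 2:
        return 5.0
    return 5.0 + (weight_kg - 2) * 1.5


def test_mirrors_implementation():
    # Re-derives the expected value with the same formula as the code under test.
    # It has to change whenever the formula changes, and a bug copied into
    # both places would go unnoticed.
    w = 4
    expected = 5.0 if w <= 2 else 5.0 + (w - 2) * 1.5
    assert shipping_cost(w) == expected


def test_states_requirement():
    # Fixed examples taken from the (hypothetical) requirement, independent
    # of how the function computes them.
    assert shipping_cost(1) == 5.0
    assert shipping_cost(4) == 8.0
```

Both pass today, but only the second one tells you anything if someone rewrites the function internals.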
> I want to figure out how to count the number of times a test has had to change with updated requirements vs how many defects they’ve prevented
I did the same some years back in a project that had both a unit test suite with pretty high code coverage and an end-to-end suite.
The results for the unit test suite were abysmal. The number of times it caught an actual regression over a couple of months was close to zero. However, the number of times it failed simply because code was changed due to new business requirements was huge. In other words: it provided close to zero value while at the same time having high maintenance costs.
The end-to-end suite did catch a regression now and then. Its drawback was the usual one: it was very slow to run, and maintaining it could be quite painful.
The moral of the story could have been to drastically cut down on writing unit tests. Or maybe to write them while implementing a new ticket or fixing a bug, but throw them away once the change went live. But of course this didn't happen. It sort of goes against human nature to throw away something that you just put a lot of effort into.