Basically it's like this: the more constraints you have, the more freedom you ha...

delta_p_delta_x · 2025-08-04T06:38:52 1754289532

> if someone actually built AI for writing tests, catching bugs and iterating 24/7

This is called a nightly CI/CD pipeline.

Run a build, all tests, and all coverage at midnight. Failed or regressed tests and reduced coverage are automatically filed as new tickets for managers to review and assign.
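
A minimal sketch of the ticketing step, assuming each nightly run is reduced to per-test pass/fail results plus a coverage percentage. `NightlyRun` and `tickets_for` are hypothetical names for illustration, not any particular CI system's API, and a real pipeline would hand these strings to its own issue tracker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NightlyRun:
    """Hypothetical summary of one nightly run."""
    passed: dict[str, bool]  # test name -> did it pass?
    coverage: float          # line coverage, in percent


def tickets_for(previous: NightlyRun, current: NightlyRun) -> list[str]:
    """Return one ticket title per failure, regression, or coverage drop."""
    tickets = []
    for name, ok in sorted(current.passed.items()):
        if ok:
            continue
        # "regressed" means it passed in the previous run and fails now;
        # anything else that fails (new or already failing) is "failed".
        kind = "regressed" if previous.passed.get(name) else "failed"
        tickets.append(f"Test {kind}: {name}")
    if current.coverage < previous.coverage:
        tickets.append(
            f"Coverage dropped: {previous.coverage:.1f}% -> {current.coverage:.1f}%"
        )
    return tickets
```

Comparing against the previous run is what separates a fresh regression from a test that has been red for a week, which is usually what the person triaging the tickets wants to know first.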

jelder · 2025-08-04T12:10:00 1754309400

"Nightly?"

Iteration speed can never be faster than the testing cycle.

Unless you're building something massive (like Windows or Oracle, maybe), nobody is relying on "nightly" integration tests anymore.

progbits · 2025-08-04T08:25:06 1754295906

Post-merge tests, once a day?

Who does that? We're not in the 90s anymore.

Run all the tests and coverage on every PR, block merge on it passing. If you think that's too slow then you need to fix your tests.

justacrow · 2025-08-04T11:28:18 1754306898

We go through maybe 10k CPU hours in our nightly pipeline. Doing that for every PR in a team of 70 people is unsustainable from a cost standpoint.

The existing tests aren't optimal, but it's not going to be possible to cut the cost by 1-2 orders of magnitude by "fixing the tests".

We obviously have smaller pre-merge tests as well.

klibertp · 2025-08-04T13:06:02 1754312762

> We obviously have smaller pre-merge tests as well.

This. I feel like trying to segregate tests into "unit" and "integration" tests (among other kinds) did a lot of damage in terms of prevalent testing setups.

Tests are either fast or slow. Fast ones should be run as often as possible, with really fast ones every few keystrokes (or on file save in the IDE/editor), normal fast ones on commit, and slow ones once a day (or however often you can afford, etc.). All these kinds of tests have value, so going without covering both fast and slow cases is risky. However, there's no need for the slow tests to interrupt day-to-day development.
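
A rough sketch of what that could look like, assuming you already have a measured duration per test. The tier names and the 0.1 s / 5 s cutoffs are made-up placeholders, not a recommendation; pick whatever your editor loop and commit hook can tolerate.

```python
# Hypothetical cutoffs, in seconds; tune these to your own tolerance.
ON_SAVE_MAX = 0.1
ON_COMMIT_MAX = 5.0


def tier_for(seconds: float) -> str:
    """Pick a tier from how long a test takes, not from what kind of test it is."""
    if seconds < ON_SAVE_MAX:
        return "on_save"
    if seconds < ON_COMMIT_MAX:
        return "on_commit"
    return "nightly"


def plan(durations: dict[str, float]) -> dict[str, list[str]]:
    """Group test names by the tier they should run in."""
    tiers: dict[str, list[str]] = {"on_save": [], "on_commit": [], "nightly": []}
    for name, seconds in sorted(durations.items()):
        tiers[tier_for(seconds)].append(name)
    return tiers
```

Note that nothing here asks whether a test is a "unit" or an "integration" test: a quick integration test lands in `on_commit`, and a slow unit test gets pushed to `nightly` like everything else that slow.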

I seem to remember seeing something like a `<slowTest>` pragma in GToolkit test suites, so at least a few people seem to have had the same idea. The majority, however, remains fixated on the unit/integration categorization and ends up with (a select few) unit tests taking "1-2 orders of magnitude" too long, which actually diminishes the value of those tests since they're now run less often.

EGreg · 2025-08-04T14:40:19 1754318419

Pssht, so little? With AI you're supposed to have a huge data center and pay them thousands of dollars to process many, many tokens. That way you are doing it right, 24/7.

How else are we going to cover these costs? https://www.youtube.com/watch?v=cwGVa-6DxJM