But that requires code that is runnable and testable in isolation; otherwise there are all sorts of issues with that approach (aside from the obvious one of scalability).
It also assumes they "understand" enough to be able to extract the correct output to test against.