That's how I do it, too. I give students a script that compiles their code together with the test code (which is provided, if they want to look at it) and then runs it inside GDB, in case it segfaults or throws an exception. All of the tests are documented and, on failure, each one prints an error message describing what was expected and what was actually returned, plus a URL linking to a fuller description of what might be going on.
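For anyone curious, here is a rough sketch of what a script like that could look like. It is not the actual script described above, just one way to do it, and it assumes a C++ course built with g++. The file names (`student.cpp`, `tests.cpp`, `run_tests`) and the function names are made up for illustration.

```python
import subprocess


def compile_command(student_src, test_src, out="run_tests"):
    """Build the g++ command that compiles student code with the test code.

    Assumes g++ and C++17; -g keeps debug info so GDB backtraces are readable.
    """
    return ["g++", "-std=c++17", "-g", "-Wall", student_src, test_src, "-o", out]


def gdb_command(binary):
    """Build a GDB batch command: run the binary, then print a backtrace.

    If the program crashes, 'bt' shows where; if it exits normally,
    GDB just reports that there is no stack.
    """
    return ["gdb", "-batch", "-ex", "run", "-ex", "bt", "--args", binary]


def run_tests(student_src, test_src, out="run_tests"):
    """Compile, then run the tests under GDB. Returns a process exit code."""
    build = subprocess.run(compile_command(student_src, test_src, out))
    if build.returncode != 0:
        return build.returncode  # compilation failed, nothing to run
    return subprocess.run(gdb_command("./" + out)).returncode
```

Calling `run_tests("student.cpp", "tests.cpp")` would build the test binary and run it under GDB, so a segfault produces a backtrace instead of a bare crash.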
In the future I'm hoping to run the tests inside Valgrind to check for memory leaks, enable the various sanitizers, etc. I also need to figure out a way to check for conformance with my code formatting standard. For now, though, the script lets students know exactly what grade they will get when they submit.
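As a possible starting point for those extra checks, the sketch below builds the relevant command lines. It assumes g++, Valgrind, and clang-format (version 10 or later, with a `.clang-format` file describing the formatting standard) are installed; the function and file names are hypothetical. Note that an AddressSanitizer build generally cannot be run under Valgrind, so the two checks would need separate binaries.

```python
def sanitizer_compile_command(sources, out="run_tests_san"):
    """g++ command for a build with AddressSanitizer and UBSan enabled."""
    return [
        "g++", "-std=c++17", "-g",
        "-fsanitize=address,undefined",
        "-fno-omit-frame-pointer",  # gives cleaner sanitizer stack traces
        *sources, "-o", out,
    ]


def valgrind_command(binary):
    """Valgrind command that exits with status 1 if leaks or errors are found.

    The binary should come from a plain (non-sanitizer) build.
    """
    return ["valgrind", "--leak-check=full", "--error-exitcode=1", binary]


def format_check_command(sources):
    """clang-format command that fails if any file deviates from the style.

    --dry-run reports problems without rewriting files; --Werror turns
    those reports into a nonzero exit status.
    """
    return ["clang-format", "--dry-run", "--Werror", *sources]
```

Each list can be passed to `subprocess.run`, and a nonzero return code then signals a failed check.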
I’m a student, and in our CS department professors can test our code with Valgrind and check its formatting using Gradescope. I’m not sure how it works in practice, but it’s an option if you are interested.