When fixing bug it makes sense. But when writing new code?

mpweiher · on Jan 21, 2023

> [R->G] But when writing new code?

Absolutely, and most prominently when writing new code. The red-green transition is absolutely essential for new code.

You should not write any production code except to make a red test green.

Think of the tests as the specification of your system.

If all tests are green, your system meets the specification. Thus there is no need to write production code.

So in order to make the system do something new, you need to first change the specification. So you add a test. When you add this test, it will almost certainly fail. After all, you haven't written the code to implement the feature.

Then you make the test green, and now the system once again matches the (now updated) specification. Commit/Refactor/Commit.

Having the test that is red also validates your tests. If your tests are always green, how do you know that you're actually testing something?

In fact, it sometimes happens that you write a test that you think should be red, because you haven't implemented the feature yet, but then it starts of as green. Meaning you inadvertently already built the feature. This can be very confusing... :-)

garethrowlands · on Jan 21, 2023

Yes, new code too. If the test doesn't fail at first, you have a different problem than you thought.

jfengel · on Jan 21, 2023

There is a saying that coding is debugging a blank piece of paper. ("Piece of paper" tells you how old this saying is.)

"New code" just means "I want a program to do a thing, and it doesn't do it yet. That's a bug." The difference between "bug" and "new feature" is more a matter of perspective than actual development effort.

smaudet · on Jan 22, 2023

Bugs are features which were not expected.

lowbloodsugar · on Jan 21, 2023

If you write the code first and then the test, how do you know your test works? You've only ever run it against working code.

Like if there was a virulent disease for which there was a 100% cure, but you can't get the cure unless you test positive for the disease. I give you a test and say "Yep, test says you are healthy". Ok. What if the test always says people are healthy? "Have you ever tested an unhealthy person and the test detected that they were unhealthy?" "Oh, no, we've done this test a thousand times and it always says people are healthy!"

You write the test first, because your code does not yet have the feature that you are testing. Your current code is a perfect test for the test. Anyone who has done TDD for even a short amount of time has written a test that should have failed but instead it passed. Sometimes the was just a simple error in the test. You fix the test so it can detect what you are looking for (i.e. the test now fails). Other times a fundamental misconception was discovered that blows everyone's mind.

spc476 · on Jan 22, 2023

I'm replying to this one, but really, it applies to each of the replies to tester756. I want to say you all are insane, but I'm wondering if it's just that I do a different type of programming, or that the language I'm using doesn't lend itself well to TDD. Example, I have a function, in C [1], that runs another program and returns true or false. The prototype is:

    extern bool run_hook(char const *tag,char const **argv); // [2]

tag is informational; argv is an array where argv[0] is the program name, the rest are arguments. Okay, how would you write a test first for this? You literally can't, because the function has to exist to even link it to a test program. Please tell me, because otherwise, this has to be the most insane way to write software I've come across.

[1] LEGACY CODE!

[2] I go more into testing this function here: <https://boston.conman.org/2022/12/21.1>. The comments I've received about that have been interesting, including "that's not a unit test." WTF?

Jtsummers · on Jan 22, 2023

What's the problem here? You would write the test (if you're doing TDD, which you wouldn't always do anyways because you recognize it's a tool and not all tools are appropriate for all situations). The test would fail because it doesn't compile, so you make it compile. Not compiling counts as a failed test, even fools get that.

Unless of course you're only working no trivial programs (based on your write up, not the case) or an absolute genius you must have at some point or another encountered a failed compilation and used that as feedback to change the code. This is no different.

spc476 · on Jan 22, 2023

Then I'm not even a fool, because I didn't get that "not compiling counts as a failed test." It feels like the goal posts get shifted all the time.

Yes, I've gotten failed compilations, and every time it's because of a typo (wrong or missing character) that is trivial to fix, no test needed (unless you count the compiler as a "testing facility"). That is different from compiles that had warnings, which are a different thing in my mind (I still fix the majority of them [1]).

But I'm still interested in how you would write test cases for that function.

[1] Except for warnings like "ISO C forbids assignment between function pointer and `void *'", which is true for C, but POSIX allows that, and I'm on a POSIX system.

MiyamotoAkira · on Jan 22, 2023

I just want to point about that sentence of "goal posts get shifted". In his book Test Driven Development by Example (2003), Kent Beck, on the preface, page X says: "Red-Write a little test that doesn't work, and perhaps doesn't even compile at first"

There is no goal post moving.

More likely, as we transmit information, we don't do it correctly, and knowledge/data gets lost. I found quite enlightening to always go back to the source.

Jtsummers · on Jan 22, 2023

I think you've mistaken me for a unit testing (your restricted definition) and TDD zealot. I wouldn't aim for 100% code coverage using the restricted definition of unit testing you've provided, or any other testing mechanism. The lines you're missing coverage for, in particular, are ones where syscalls fail. Those are hard to force in general, and I'm not going to be bothered to mock the whole OS and standard library. I do see a way to cause `open` to fail, though, you can change the permissions to /dev/null, but that doesn't get you your desired version of a unit test that doesn't touch the file system.

At some point, I, and probably most people, operate under the assumption that we don't need to test (ourselves) that syscalls will do what they say they will do. Until they actually fail to act correctly and then I'd investigate it, and write tests targeting it to try and reliably replicate the failure for others to address since I'm not a Linux kernel developer.

spc476 · on Jan 22, 2023

My post is a reaction to my previous job, where management cared more for tests than the actual product, and I was tasked with proving a negative. I was asking for a definition of "unit testing" and I always get different responses, so I've taken to asking using my own projects.

It may seem cynical, but I assumed that anyone into "testing" (TDD, unit testing, what have you) wouldn't bother with testing that function, or with limited testing of that function (as I wrote). You aren't the first to answer with "no way am I testing that function to that level," but no where have I gotten an answer to "well then, what level?"

This may seem like a tangent to TDD, but in every case, I try to see how I could apply, in this case, TDD, to code I write, and it never seems like it's a good match. What I'm doing isn't a unit test (so what's a unit? Isn't a single function in a single file enough of a unit?). I'm not doing TDD because I have to write code first (but then, the testing code fails to compile, so there's not artifact to test).

People are dogmatic about this stuff, but there's no discussion about the unspoken assumptions surrounding this stuff. Basically, the whole agile, extreme, test driving design seems to have fallen out of the enterprise area, where new development is rare and updating code bases that are 10, 20, 40 years old are the norm and management are treating engineers like assembly line workers, each one easily replaceable because of ISO 9000 documentation. And "agile consultants" are making bank telling management what they want to hear, engineers be damned because they don't pay the bills (they're a cost center anyway).

Jtsummers · on Jan 22, 2023

I had written more, but my cat decided pounding on the keyboard was a good idea and got lucky with F5.

Anyways, you never asked me "well then, what level"? and I thought I did answer it but here's an answer anyways (to your unasked question): I'd test it to the point that made sense. I wouldn't follow some poorly considered hard and fast rule (morons do that, we're not morons, we are humans with brains and a capacity to exercise judgement in complex situations). A hard and fast 70% code coverage rule is stupid, as is 100%, even a strict 1% rule is stupid (though for other reasons, like that it's trivially achieved with useless tests for almost every program). If I'm writing code and 90% of it is handling error codes from syscalls, then you'll likely end up getting 10% code coverage from tests (of various sorts, not just unit) out of me. I'm not going to mock all those syscalls to force my code to execute those paths, and I'm not going to work out some random incantation that somehow causes fork to fail for this one program or process but also doesn't hose my entire test rig. Especially not when the error handling is "exit with this error code or that error code". If it were more complex (cleanly disconnects from the database, closes out some files) then I'd find a way to exercise it, but not by mocking the whole OS. That's just a waste of time.

To reiterate my take: We have brains, we have the opportunity to use them. Use the appropriate techniques to the situation, and don't waste time doing things like mocking every Linux syscall just because your manager is a moron. Educate them, explain why it would be a waste of time and money and demonstrate other ways to get the desired results they want (in a situation like your example, inspecting the code since it's so short should be fine).

orbital223 · on Jan 22, 2023

> The test would fail because it doesn't compile, so you make it compile.

But in this situation the test will fail regardless of what you wrote in the test code. So the supposed usefulness of the test failure showing that you are actually testing what you mean to be testing is inexistent and the exercise of making it fail before making it pass is pointless.

Jtsummers · on Jan 22, 2023

Then don't do it that way? I don't get why people are hung up on this. As I've said in other comments, you have a brain. Use it. Exercise judgement. Think.

If this is actually the one thing that trips you up on TDD, then don't do this one thing and try the rest. This is the easiest part of TDD to skip past without losing anything.

MiyamotoAkira · on Jan 22, 2023

I like that write up on [2]. I have not really been exposed to C in a very long time, and that has been quite informative.

I also like that Set of Unit Testing Rules. That is basically correct, external systems are a no-no on unit testing.

Usually, you deal with mocks through indirections, default arguments, and other stuff like that so you can exclusively test the logic of the function, which is more difficult in C, from what I've seen on your write up, than in other languages. But if you care about not having that on your code for performance reasons, then more likely than not, you will not be able to unit test. And that is fine. You have an integration test (because you are using outside systems). You can still do integration test first, as long as they help you on capturing the logic and flow. The issue is that they tend to be far more involved, and far more brittle (as they depend on those outside systems).

lowbloodsugar · on Jan 22, 2023

I’d say be pragmatic rather than pedantic? And thirdly, “The pirates’ code is more what you’d call ‘guidelines’ than actual rules.”

Jtsummers · on Jan 21, 2023

It does make sense with new code, yes. Whether for all code or not, or whether unit tests or integration or end to end tests though is up to your judgement. No one but you, who knows your system and your capabilities and knowledge, can decide for you. Only fools expect a technique to replace thinking. I assume you're not a fool.

For new code, the reason it makes sense is that your system is bugged. It does not do what it's intended to do yet because you haven't written the code to do it (or altered the existing code to add the new capability). So the test detects the difference between the current state of the system and the desired state, and then you implement the code and now the test detects that you have achieved your desired state (at least as far as the test can detect, you could still have other issues).

lolinder · on Jan 21, 2023

The red step is how you test your test: if you don't get red at first, your test is wrong.

If I only see green for a given test, I have no way of knowing if it is asserting anything at all, much less if it's testing what I thought it was.