Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I worked with some Rails folks a while back who were utterly convinced that comments were to be avoided because it meant your code was not clear.

I agree that comments explaining 'what' is happening are mostly useless. Write clearer code. But 'why' comments are good. Most things can be coded up different ways. Tell me why you chose this way. Tell me about constraints that might not be immediately obvious. Let me know if indeed, it could be cleaner, but you were in a hurry (this is a thing that happens in the real world), or if it's that way for a specific reason.



Many people think their code is clear. The problem is it is clear often only for the author and sometimes for a limited time even for the author. If anyone else start to working with the code questions usually arise. And you can prepare comments for at least some of such questions.

In my practice I often see code where useful comments are missing and hours of research required to learn something author for sure knew. Code which has too much comments to me is an imaginary problem - I've seen it at most a couple times in my career and one of them was the code written by a junior developer in a style one can expect in a tutorial or a book. But usually people quickly learn to not add unnecessary comments (and eventually start adding too little of them).


>Many people think their code is clear.

This is true, but if you just get these same people to write more comments, they will also think their comments are clear when they're, in fact, not. I think anyone who is capable of writing clear comments is also equally capable of writing clear code.

If I was to suggest investing time in learning some skill, I would suggest learning how to make your code actually clear by getting the right people (with relevant domain expertise, but without experience with the code) to critique it for clarity rather than learning how to make your comments actually clear.

Why questions are best answered in architecture documentation, the code itself should be self explanatory (given the architecture documentation) for the most part with specific why comments carefully placed in places where it doesn't make sense to use the architecture documentation.

I've read code from various codebases from various companies, large and small. I very rarely see a comment which is genuinely both well placed and useful. From the bits of the linux kernel that I have worked on, I have generally seen pretty good why comments, although I think a lot of them could be shifted into architecture documentation.

All in all, when reading code, my default stance is to ignore comments unless I get lost reading the code, then I attempt to read the comments. It has been incredibly rare that I've found unclear code to have clear comments.


> I very rarely see a comment which is genuinely both well placed and useful

Your experience is different than mine: I write software which interacts with cloud (Amazon AWS, Google Compute Platform, VMware vSphere), and a well-placed comment describing the quirks of whatever platform I'm working with helps immeasurably, e.g.

> VMware NSX will throttle us with an HTTP Status 429 "Too Many Requests" if > 100 API calls/second. In that case, we back off and retry rather than fail.


Agreed. I’ve found over the years it is helpful to prefix comments like that with something like “HACK: “ as it makes it clear you are working around a limitation and not adding a feature - plus it stands out when browsing the code.


I’m a fan of comments, but in this case I would reach for an enum with descriptively-named members, e.g. HTTPStatus.TOO_MANY_REQUESTS


Making it strictly worse. TOO_MANY_REQUESTS? How many is too much? In what scenarios?

The code comment in the GP is pithy, explanatory, and reads in plain English. Yours is ripe for misunderstanding.


My bad for not being clear, I was really talking about the comment saying 429 means “too many requests”. By all means comment to say that service X has a rate limit of N req/min but a comment saying 429 means too many requests is exactly one of those things that’s better expressed in an enum.

301; 302; 401; 429; 504; I shouldn’t need comments to tell me what they are, nor should it be expected that anyone touching the code knows them off the top of their head.


Unless the API itself is poorly designed, 429 _means_ "you have reached a rate limit". That might not be so self explanatory for someone who isn't familiar with HTTP error codes, but that person maybe shouldn't be attempting to read a codebase which deals with making HTTP requests in the first place (e.g. my keyboard firmware doesn't have a comment explaining how a keyboard matrix works). It may be worth noting what the rate limit is somewhere, but likewise, I'm not sure how useful that is. If the rate limits change then your comment is now incorrect, but the code to handle backing off due to a rate limit should probably still stay, and should work independently of what the actual rate limit is. If someone questions why that code was included (first off, not sure who would question what seems like a perfectly sensible design decision) then that person can refer to the VCS blame (e.g. git blame) logs to find what _should_ be a detailed commit message explaining all of the 'in the moment' reasoning for why and how the code was written.

I don't mean to imply that your code is poorly written as I clearly can't see it and maybe it's a lot more complex than I am imagining it, but I don't really understand how your code was written such that it became unclear enough what code handling a 429 was doing such that you felt you needed a comment to explain it, but I can envision a codebase which handled rate limiting where comments weren't needed to explain the reasoning behind it.

I think the best way to think about it is this: What information is this comment conveying?

- "HTTP Status 429" means "Too Many Requests" - You can put this information in a comment every time you refer to 429, or you can simply use an enum.

- "429 Too Many Requests" means "a rate limit has been hit" - You can put this information in a comment every time you refer to the enum, or maybe it's best placed on a website like MDN[0] where it can be surrounded with far more detail and where the information has a much smaller chance of becoming outdated.

- The rate limit is 100 API calls per second - You can put this information in a comment every time you talk to this particular API, or it's best placed in a document which describes the API itself such that if you ever need to update it, you can update it once. Maybe you can demand that the vendor of the API provides and maintains this document as part of whatever contract you have with them. If necessary, the module of your code which deals with interacting with this API may benefit from a reference to this document.

- That the code should back off and re-try instead of failing - Okay, now we're getting to the core of something important. And I think the reality is more complex than the comment even lets on. Does the code re-try indefinitely? Does it have an option to disable retries in some contexts? I'd assume most likely a no for the first question and possibly a yes for the second. In that case, the names of variables within the code the name of the function, and the names of the parameters of the function should probably be enough to convey this meaning entirely.

That being said, I think that as far as comments go, there is still a benefit of having short descriptive comments for WHAT a function does for every function in a codebase. I've found that they're much more useful (as proper editor tooling can show you them whenever you make use of a function) and by being short and ONLY describing the _what_ and not the _why_ or _how_, the comments have a better survival rate and encourage people to avoid trying to make individual functions too complex.

In this case, your function may benefit from a comment such as:

// Request foo from bar, optionally backing off and re-trying <retries> times

The function signature itself might look something like:

foo(..., retries: Optional[int]) -> ...

Or alternatively you could do something similar with an explicit timeout instead.

The only missing piece I can think of is that if you have some form of back-off you probably need some initial delay before retrying, in this case this should probably be made a constant and this constant (without making any explicit references to its value or whatever value it is based on) may benefit from some reference to the method by which it was chosen. If you do intend to make it depend on the actual rate limit (e.g. 100 requests per second) then that should itself be a constant. But I think generally code like this should be avoided due to its fragility.

In any case, it is again difficult to give too specific of a recommendation given that I haven't seen the code but I don't think the comment you gave is a particularly good example of something I would personally ever recommend or want to see in the middle of a block of code.

[0]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429


> then that person can refer to the VCS blame (e.g. git blame) logs to find what _should_ be a detailed commit message explaining all of the 'in the moment' reasoning for why and how the code was written

Agreed, they "_should_" write the reason why, but many committers unfortunately write the "what" rather than the "why".

I like comments in code, and they're the first thing I read when, say, browsing the kernel or the Golang standard libraries/packages. And I appreciate that you don't find them as useful--more power to you!


People have different ideas of value. I see a method that sounds like that and a 429 and I have a pretty good idea of what is going on without someone taking up more lines to describe the method than execute it.


> I see a method that sounds like that and a 429 and I have a pretty good idea of what is going on

Yes, you're a seasoned HTTP developer. The comment isn't meant for someone like you. It's meant for the other developers on a shared codebase who may not know much beyond the standard HTTP status codes (200, 400, 403, 404, 503).

> taking up more lines to describe the method than execute it

It's a one-line comment. the Ruby code with the exponential backoff, eight lines. It does not take up more lines than the method.


You split developers in to two categories, but how do you know those are correct categories? What if there's another type of engineer that only knows about 200 and 400?

My approach is that conventional knowledge doesn't need to be documented in comments. Attempts at guessing how much conventional knowledge someone is lacking are futile. Those comments are similar to:

  int a = 4 // set a equal to four
If someone doesn't know what 429 is, they should get confused, do their own research, and learn. Comments should not be turned into educational material about open standards, they should be about your business. There are much better places to learn about standards.


>if you just get these same people to write more comments, they will also think their comments are clear when they're, in fact, not. I think anyone who is capable of writing clear comments is also equally capable of writing clear code.

Think of it more like two lines of bearing. You are far more likely to get an accurate position from two lines of bearing. If you need to understand what past self was thinking, a comment is written in a different mindset than the code. Two different lines of reasoning.

Also, I rarely think about writing code for someone else. I write for my future self. And I really want my future self to like my past self. So I strive for clean code and write detailed comments, complete with references.


> Think of it more like two lines of bearing. You are far more likely to get an accurate position from two lines of bearing.

I mean in theory yes, but in practice I've usually found comments in such situations to either be missing, outdated or misleading.

>If you need to understand what past self was thinking, a comment is written in a different mindset than the code.

I don't think I agree with this at all, why would the mindset be different when they were both written at the same time?

Like I said, writing good comments requires just as much if not maybe more self awareness of what you're likely to forget in the future as writing clear code. I think if you're at that level where you can successfully muster that, you should spend your time just making your code clearer.

In reality I think it's even difficult to muster that requisite introspection skill in the moment and it's usually best left until after you solve the problem (let's say, the next day) to go over the code and check how clear it really is.


I have no idea what clear code without comments mean. I think I’ve actually once or twice seen that. The first was an erlang codebase. The second was a golang codebase (the letsencrypt codebase!). I’ve read a shit ton of code in my life and these were the only time I found something uncommented that I could follow easily. And even these codebases would have been clearer with comments.


I don't think your experience is uncommon, I agree that the vast majority of code out there isn't very clear. That being said, you have found code which was clear without comments, so clearly you do have an idea of what clear code without comments means.

Regarding why I don't think that clear code should have comments, it's very simple, comments are not type checked or syntax checked or even tested (unless, I guess, you hire someone to read them). It's very hard to ensure that over time comments don't deteriorate and become unhelpful. This is a much larger maintenance burden than the maintenance of clear code itself.

I think a lot of the issue boils down to people assuming that because all the code they've ever read was unclear, that code itself is inherently unclear.


> I think anyone who is capable of writing clear comments is also equally capable of writing clear code.

Imo, first, they are two different skills. And second, sometimes it is easy to write down what I am trying to achieve, but hard to achieve it with clear code. And other times it is vice versa - easy to write the code and hard to put it into words.


It's never easy to achieve something with clear code (unless you are writing a hello world program) but it is a skill you can practice and which will pay off more than any comment. At the end of the day, a comment which clearly explains your reasoning and intentions won't help prevent bugs which have an increased likelihood of occurring in unclear code.


The problem with this argument is that it's implying that there is no such thing as clear code. Sure some people think their code is clear when it is not, but that shouldn't stop you from striving to write clearer code.

A long time ago, I had a tendency to write comments to explain code that could be simplified. Refactoring it usually made the comments redundant.

Comments are still very useful when the why is unclear.


There is a culture that the l33ter the code, the more senior I am. So, many people will never refactor code to make it simpler to read. It affects their ego. In fact, I've seen the opposite: simple code refactored into more complex (not by the original author). The mindset: if it's more complex, it must be better.

Java example: rewriting non-stream collections code to use lots of chained streams. Using lambda functions when not needed. Rewriting case labels to use case lambdas. etc.


and don’t forget the implicit corollary - if you don’t understand it that just means your IQ is too low, find a job somewhere else noob…


> Refactoring it usually made the comments redundant

That sounds like a nice story


The fact that you don't believe me blows my mind a little, not going to lie. Maybe this will convince you a little more? https://refactoring.guru/smells/comments


Don't be too hard on them, they're only a baby after all.

More seriously, I agree with you 100%. Many, many, times I've stopped halfway through a writing an explanatory comment and refactored the to make the comment unnecessary.

This habit was formed by realizing that most people (including me) don't read comments half as carefully as they read code. So if you want something to be noticed and understood, you better put it in the code.


Many people think their subjective comments are clear! Code doesn't lie, comments do.


So train people to write better comments, because although code doesn't lie (questionable as that is since the intent behind the code and what it does doesn't always align) it can be literally the worst implantation of an algorithm imaginable, and so we teach people how to code.

Teaching people how to comment is a skill that's just as important and nothing turns me off a project faster than the "code is its own documentation" mantra.


This is misunderstanding the point. Code cannot lie. Reading code (correctly) gives you a correct understanding of the state of the system, no matter how clear or unclear it is.

Reading comments only tells you what the person who wrote the comment believed. It does not tell you anything in particular about how the system behaved, you have to trust the other human beings (in general a long succession of them) to have understood it correctly. And any one of you can mess that up.

Saying good comments have value is fine. But their ability to lie is unique; code doesn't have that misfeature. You can't make comments lie-free by fiat, for the same reason that you can't train your developers not to write bugs.

Given that, IMHO comments have limited value. Don't be a zealot in either direction, but when debugging hard problems, train yourself to ignore the lying comments.


While code gives you a correct understanding of the state of the system, it doesn't give you any understanding of why the system is in that state. That's the point from the OP. Code tells you what a system is doing, not why it was designed to be that way. Comments that try and explain what a system is doing can absolutely be wrong and therefore problematic, but there's no way for code to convey the context in which it was implemented a particular way, which is what comments are for.

Code without comments only gives half the story. It gives you the what, not the why.


> Code cannot lie

Sure, but it can take you on a freaking trip. Comment your code bro.

This whole argument makes me think of C coders who say C is perfectly safe as long as you don’t write bugs.


The point is not to fetishize comments. Write a comment if you feel something needs explanation, but don't freak out about "uncommented code" as a smell or whatever, when reading the code should have been your first step anyway.

And to repeat: when debugging, train yourself to ignore the comments. They will absolutely lie to you.


Imo comments are like formatting. If you don’t enforce it in CI it’s going to be bad


Forcing developers to write comments[1] just to get code merged seems like the best possible way to ensure those comments are lies.

[1] Which, it's important to point out since you used the term "CI", are fundamentally untestable. There's absolutely no way to ensure a comment is correct. A CI smoke test at the very least verifies code builds and doesn't break pre-existing cases. No such validation is even theoretically possible for comments.


I’ve seen it done before and no, it was beautiful, everything was commented


This is a good point, and I'm not arguing with it, but I don't think the words "code cannot lie" are the right way to express it.

Code can deceive even if, definitionally, the code states what the computer will do. So code most certainly can lie.

[By analogy, you'd still feel lied to if someone told you something that is technically correct but very misleading: "(Me) It's going to rain tomorrow." → tomorrow comes → "(You) It isn't raining today!" → "(Me) It is raining, in Japan. I didn't say it would rain here."]

I suppose it depends on whether you're considering what the code communicates to the machine or what it communicates to a person.


How about "code does what it does but comments say what it might have once, and possibly still might do?"

You can have your pipeline regression test code, but not comments. Just recently my mentee found some of my code where I had changed the code but not the comment. I'm a horrible human and wasted his time. If that comment hadn't existed the code might have taken a moment to understand but as it was, he wasn't sure which was the intent.


+10

It sounds like we have had similar experiences and have come to similar conclusions.

Learning to code, learning to comment and learning to log are all ways to learn to communicate and represent a means to risk-reduction and importantly, to cost-reduction.


How much do you charge to run training sessions on how to write better comments?


Here's a free method that's twice as much work but produces great results:

immediately after composing, for each step and nested step, write a line or two of what its place in the code is for. Write it as though the code is broken and you're following the imaginary line threading through, explained as if to your rubber duck.

Then, having written out the business logic map, look at each written step and see if they're just a description of the logic "iterates through file, passes hits onto nextFunc" and you can safely delete those. They're just glue, really, holding processes together.

What you'll have left is skeletal comments that are restricted to "we did this because this stackoverflow post gave the solution" as well as those mental maps of the solution in your head, which is really what comments are for, future programmers to grok your state of mind and thus better implement their code changes.


Comments can lie, but so can code...

  function Add(int a, int b) { return a - b; }
Code can lie in so, so many ways that are much more obscure and confusing than this simple example.


And also, the code can be clear and tell the truth along with the comment, while still being unhelpful:

  // Add with adjustment factor
  function AddWithAdj(int a, int b) { return a + b + 12345 }
When the function was written, everyone probably knew what the adjustment factor was for and how it was determined, but years later, someone's going to look at that code and have no idea.


>When the function was written, everyone probably knew what the adjustment factor was for and how it was determined, but years later, >someone's going to look at that code and have no idea

Well, to be fair that adjustment factor could be refactored to use a usefully named constant, with a helpful comment.

static const int ZEN_ADJUSTMENT = 12345; //When I wrote this only myself and God knew what this value meant, now...

int AddWithAdj(int a, int b) { return a + b + ZEND_ADJUSTMENT; }


  static const int ZEN_ADJUSTMENT = 12345; //When I wrote this only myself and God knew what this value meant, now...

  int AddWithAdj(int a, int b) { return a + b + ZEND_ADJUSTMENT; }

And now the maintainer is going to wonder why the originally programmer defined ZEN_ADJUSTMENT just above this function, but actually used ZEND_ADJUSTMENT (which is apparently defined somewhere else in the code). By design or typo!? :-)


// Adds two numbers together function Add(int a, int b) { return a - b; }

is even less clear.


Sure, but I don't think anyone would call this a useful comment - it's exactly the kind of comment which should be avoided.

Comments don't need to describe what the code does, but if there's an unexplicable line which handles an obscure edge case you better add a comment or even you won't remember why that line exists 3 months later.


Because comment is out-of-date. That’s technical debt. The comment is not the problem: the developer not updating it is.

You wouldn’t leave out-of-date code in a system, it would cause bugs. Why would you leave out-of-date comments? Oh, because you don’t like to write. Your strength is math and code, not writing. Now we get to the heart of the matter and not some ruse like “code is self-documenting.”


Not really. You clearly see that the intent at some point was to add two numbers, and not some fiction about ADD the diagnosis or what ever.


So which one is right, the name of the function (the comment), or it's implementation?


Rethorical question? You can't know that without looking on how the function is used.

You at least know something is strange from the comment.


i mean, this is actually perfectly clear—in the sense it’s clearly and obviously a mistake either with the comment and function name or it’s implementation. the example isn’t great because it’s devoid of context that might indicate what specifically is wrong with it.

there are many standard means that should catch this: code reviews, unit tests, etc.

this stuff can obviously still sneak through, but i don’t think this is a good example of what folks are generally talking about here.


+10

Comment review is as important as code review.


Exactly.

Here is another example where a comment can help.

  // Do does x, y, and z, in that order.
  func Do(x, y, z func()) { 
      x()
      z()
      y() 
  }
Sometimes, as here, the comment correctly describes what the code should do. But the implementation does not agree with the comment. Relying on the comment, a developer can confidently fix the bug, or, at least, notice a discrepancy between the intentions, as described by the comments, and reality, as implemented by the lines of code. This is particularly important in complex code. Think: HTTP keep-alive handling, in which many factors, with varying priorities, have to be taken into account (e.g. client hint in the request header, current server load, overall server configuration, per handler configuration).


It’s also plausible that the function used to do x,y,z in that order, but then a new requirement came to do z before y. The code was changed, maybe a test was even added, but the comment was never fixed.

That’s the problem with comments (and all documentation really) - it gets stale and has no guarantee of being correct.

A better way is to write a test which would break if the order of x,y,z was wrongly changed. This way the order is guaranteed to stay correct forever, regardless of any comments.


I agree with your points regarding writing a test. It is one of the best ways, and the right way, to handle the issue.

That said, I think my toy example wasn't the best way to show some of the points I was getting at. Consider a real example, such as this function from file src/net/http/transport.go in the Go source:

        // rewindBody returns a new request with the body rewound.
        // It returns req unmodified if the body does not need rewinding.
        // rewindBody takes care of closing req.Body when appropriate
        // (in all cases except when rewindBody returns req unmodified).
        func rewindBody(req *Request) (rewound *Request, err error) {
                if req.Body == nil || req.Body == NoBody || (!req.Body.(*readTrackingBody).didRead && !req.Body.(*readTrackingBody).didClose) {
                        return req, nil // nothing to rewind
                }
                if !req.Body.(*readTrackingBody).didClose {
                        req.closeBody()
                }
                if req.GetBody == nil {
                        return nil, errCannotRewind
                }
                body, err := req.GetBody()
                if err != nil {
                        return nil, err
                }
                newReq := *req
                newReq.Body = &readTrackingBody{ReadCloser: body}
                return &newReq, nil
        }

When you work with code in or around this function, the existence of the comments tell you what the function guarantees, for example, regarding the closing of the request body. If at some point, there is a bug, it is more likely to be noticed because of the discrepancy between the intentions in the comment and the implementation in the code. Without the descriptive comment, an unfamiliar developer working on this code will likely not be able to tell if something is off in the function's implementation.

Also, as a user of code, it's nice to first know descriptively what guarantees a complex internal function promises from its comment. Then, if necessary, be able to verify those guarantees by reading the function body.

Of course, in an ideal scenario, a unit test is the best way to address things. But this this is a 22 line function unexported function in at 2900 line file, and sadly it appears this function isn't unit tested.

So, given how things are in practice, I argue that the presence of a comment, like here, can help with correctness of code.


Comments shouldn't be there if they are not maintained in same way code is, meaning there isn't shared understanding in the team about their importance. And important they will be if you actually maintain your code, or somebody else is. Adding few comments means you show respect for their time (and sanity) so they can focus on delivering changes or fixes instead of just grokking your crap.

Every time I see 0 comment in complex logic I attribute it to laziness of coder. And of course every time excuse is the same - its self-documenting, you see what its doing etc. But why, what are the effects elsewhere in the system and outside, what part of business needs this is covering, what SLA it tries to cover, what are overall goals etc. can be even impossible to grok from just code itself. World is bigger than just code.


I wouldn’t use the word ‘lie’ but plenty of Rails code I’ve seen has mystery effects and hard to follow coupling due to monkey patching and the like. Comments please.


> Code doesn't lie, comments do

    pi = 3


You can't attach a debugger to the comments and figure out why your code is calculating that the circumfrence of a circle with radius 1 is 6 rather than 6.28...

You're proving the point of the person you are responding to.

pi = 3 isn't a lie, it's the truth of the program regardless of what any comment says.


But that is a lie. It's deceptive and misleading. The code states what the computer will do; nobody is contesting that or claiming otherwise. They were pointing out that, although part of what `pi = 3` communicates to a person (that the machine will assign the value 3 to the variable named "pi") is correct, another part of what it communicates (that 3 is close enough to the ratio between a circle's circumference and diameter for the task at hand) may be untrue – and thus a lie.

The programs that people work with in their mind are not the programs (the code) that the machine runs. What the machine does is definitely more important in the moment, but the models that live in the heads of people are more important in the long term. The code that the machine runs is just one implementation, a shadow cast by the real thing from one of many angles.


> Code doesn't lie, comments do

Code doesn't tell the full story either. And code can contain bugs so one cannot fully trust the code too.

If code/comments are out of sync it is likely that someone changed the code and forgot to update comments (which is possible to verify using VCS logs) or it may be that the comment stated the intention right but the code has a bug. Code+comments like a parity check to me - if they agree it is good, if not - I have to investigate if the comment or the code is correct.


Yet, code might not tell the whole story. It's a pitty.


Code can lie too. Example: poor naming of classes, functions and variables. A name can tell you it's doing one thing but it can be doing something different.


It may not lie, but it can certainly misdirect.


I agree with you. It’s like having too much doc, or doc so outdated it’s harmful. These are fairy tails. I’d take too much doc or outdated doc any time of the day over no doc/comments.


Too much documentation, outdated documentation or just plain wrong documentation leads to people simply ignoring it altogether, even the good parts.


Who cares if they ignore it. The moment where they get stuck on something then they can check the part that refers to that


IMO, it is better to err on the side of too much comments, as comments are easier to delete than code


Having worked with Rails folk for many years, they do have some weird strong opinions, this being one. The argument that comments shouldn't be made because they'll "become stale" is pretty lame IMO. It's either not a big deal or the end of the fucking world for some reason lol. Maintaining code is just... work. You just do it. That includes comments.


> The argument that comments shouldn't be made because they'll "become stale" is pretty lame IMO

Better not write tests either, because those will become stale too.

And any sort of user documentation/support/etc. It'll all just become stale.

It's almost as if code is just part of a larger 'thing' that has to be maintained.


When tests go stale the build goes red, and you know to fix things.

Comments transition from truth to lie without any indication.


This was true before git blame. Now I can see when the comment was created and the original code it described if I have any doubts about the veracity of the comments.


If you have git blame, you can see a detailed log messages pertaining to the commits which touch some lines of code. You don't need it to be littered by comments.

You seem to think that git invented blame. This appeared in previous version control systems, e.g. "cvs annotate" whose synonym is "cvs blame".


Put the comments in git commit messages! Ha, that’s a good laugh. There are so many shortcomings to this, I don’t even know where to begin.


Nick checks out?


The obsession people seem to have with squashing commits in git (particularly on merge) means they're right, you can't really rely on commit messages still containing relevant information.


Squashing commits in this way would not have been allowed anywhere I've worked in at least 20 years.

This is a strawman argument.

Let's assume best CM practices (no vandalism of change sets when moving changes among branches), best practices for commit messages and best practices for commenting.

Commit messages win.

Code comments should be like road signs. E.g. how high can your truck be to pass under the upcoming bridge. Not blueprints for the bridge, or paragraphs about why it was built there, what communities it serves. (That information must exist and be available, just not on the bridge.)


I don't consider comments litter. Commit messages rarely have the amount of context needed for a specific block of code.


I write GNU-ChangeLog-style commit messages which mention every function and other named entity touched, added or deleted, with details about why. Between that and the diff, you can figure out exactly what was done and why, matching the comment with specific code blocks.

I've never been disappointed when digging back in history to figure out why the heck something was done.

It would be a strawman of my position to say that I'm favoring zero comments. For instance, a /* fallthrough */ comment in a C switch statement cannot be in a commit comment, not the least reason for which being that compilers look that that nowadays.

Comments should be like road signs. I need a warning about a dangerous curve ahead, not a discussion of why it was built that way.


yeah, if you can do commits per function that would be fine. Most projects don't allow that, everyone squashes commits these days.


If that's an argument so is this: most coders don't like commenting, nobody bothers, comments are haphazard and unmaintained. Just skip them.

You don't seem to have understood my remarks; It's not "commit per function", but cover each function that was touched (or other entity: class, global variable, type, object member) in your commit comment.

When someone does "git log" they should be able to search for the name of a function of interest and see all commits which touch it, explaining what was done to it and why.

Banish squash commits; it's a poor CM practice. If you're not able to get your developers to commit to a good practice, then that's your main problem. You have to fix that before engaging "comment versus commit".


Only if you don't maintain your code.

Guess what: Comments are part of the code. Like all other documentation!


Snark aside, I think we can agree that there is nothing automatic telling you when a comment is out of date.

If you have a strong discipline of manually checking every possibly relevant comment with each review, I can see how that can mostly work (manual human checks are never perfect).

I've never seen anything close to that process, so it's hard to know how it would work. It seems to require a lot of discipline and effort.


> Comments transition from truth to lie without any indication.

Only if you don't do code reviews. The reviewer should be checking for this.


Why stop there? The code will become stale, too. Best to not write any of that either.


I don't think your comment tries to paint the best case for the point of view you are opposed to or show an understanding of the philosophy you reject.

Tests represent truth in the same way code represents truth. In many ways the tests represent truth more than the code represents truth. Tests state "this is what I want" and then interrogate the code to verify that the code itself does what I want.

Tests by definition can not be stale (unless they are never called).

The ruby community is where I first learned about test driven development which is philosophically the idea that tests are the highest level of semantic importance.

There are DSL's in ruby land for writing tests like you might write comments: https://semaphoreci.com/community/tutorials/getting-started-...

If tests describe what you want, and code describes how it happens, then the major piece left is why it is the way it is, which almost everyone in this thread agrees is the most important thing to comment.


(well written) tests tend to fail (or fail to compile) if they're 'stale' though, so they force you to maintain them just like other code.

(I'm not arguing against writing comments though - the 'they'll get stale' thing is not enough to make them not valuable)


I don't see a difference.

If you write good tests and use a tool (CI) that detects them going stale, tests will never be stale.

If you write good comments and use a tool (code review) that detects them going stale, comments will never be stale.

In order for tests to not get stale, you need perfect (well written) tests.

In order for comments to not get stale, you need perfect code reviews.

Neither of those exist in real world: code reviews are done by humans, tests are written by humans.


> Better not write tests either, because those will become stale too

Unit tests are easier to maintain when incorporated to a CI/CD pipeline. Then every pushed commit (or Merge Request) will trigger the test suite to run and in case one of the tests fails the team will know.


> Better not write tests either, because those will become stale too.

They do go stale, but tests were invented exactly so that your documentation is able to alert you when it has gone stale. Their existence rests on being an attempt to solve to this problem.


The difference between tests and comments is that comments become stale silently.


Only under assumption that you are writing perfect tests.

But if you are capable of writing perfect tests, why haven't you written perfect software in the first place?


> Maintaining code is just... work. You just do it.

There is an argument about reducing the amount of work.

On the staleness of comments, it's a real thing. In particular, on comments explaining design decisions, they will often touch on aspects of the system that are outside of the specific method they are attached to, and nobody will go back to them to rewrite all that prose.

We had a project with a ton of documentation written in the first 2 years, and as engineer count grew, these comments just disappeared as again and again they were causing misunderstandings, and actual documentation was already written in the internal wiki as each refactorings and new features were discussed and designed.

It's not just a rails thing, and after a while people stop trusting comments altogether, which make updating them a chore more than anything.


> actual documentation was already written in the internal wiki

I think that's a reasonable approach if it's easy to find the relevant section of the wiki that describes the code you're looking at by searching. Or just have a comment in the code that points to the wiki. The important thing is to document the "why's".


Rails developers (myself included) are lazy by design, and thus eschew anything requiring redundant work. Most operate under the principle that "it was hard to write, it should be hard to read"


[flagged]


Personal attacks aren't kosher at Hacker News.


It’s not an attack. It’s an observation based on his lack of team awareness. One of the marks of a senior engineer is someone who is aware he’s part of a group and strives to contribute to a group, rather than a soloist mindset. Call is social maturity if you like.


If this study is to be believed. Citation-> https://cacm.acm.org/magazines/2017/10/221326-a-large-scale-...

The 5 programming languages with the least amount of bugs are:

    Haskell
    Ada
    Scala
    Rust
*Drum roll please*

    Ruby
Ruby is nothing like the other languages above in design, but that community has fantastic conventions around code clarity, code quality, testing, continuous integration, etc.

Comments are for things like, `Todos`, Noting a specific algorithm, maybe a link to a white paper, noting a violation of a convention for speed, security, or some other performance reason.

Comments are for things that are __impossible__ to tell from the code.


I wish I could get this point across to my team. They will happily write:

db.insert(record); // save the record to the database

But when they write some crazy off-the-wall code (that's usually because they didn't understand the proper way to do something) I have to check the logs to see who wrote it and ask them why.

Just the other day I was debugging an application that had a 6 second sleep at the end of the main() function. I just figured it was some dumb thing left in for debugging and deleted it, because there's no good reason to do that. The next day, the dev who put it in messaged me and said he put it there because the application was exiting before it had logged it's completion to our logging system. So I explained that the proper way to do this is to flush the log, not just hang the application for 6 seconds so that the last messages just happen to go through.

If there was a comment explaining why there was a 6 second sleep, I could have just fixed it and educated the developer without causing any grief.


> I just figured it was some dumb thing left in for debugging and deleted it, because there's no good reason to do that.

Look up Chesterton’s Fence. :)

(I made similar errors more than once.)


You should be very careful applying Chesterton’s Fence to software engineering. Oftentimes the knowledge why fence exists is more valuable than the fence itself. And the best way to (re-)obtain this knowledge is to remove the fence and see what breaks.

I've seen people get stuck for months being afraid to change a complex piece of code because nobody understands how it works anymore. The best course of action was to admit that knowledge is lost forever and the fastest way to gain it again is to repeat past mistakes.

Of course, don't do this if your software is responsible for landing airplanes. But most of us are not landing airplanes here.


When you look at the original quote [0], it doesn’t necessarily disagree. You’re free to test what happens when removing the code, and when it then breaks you’ll know why the code was there, and can then decide what to do. However, when everything still seems to work after removing it, that is dangerous, because you’re likely to overlook some edge case or some unconsidered use case that may come back to bite you later.

Of course, Chesterton’s Fence is just a guideline. You can weigh the risks against the benefits in each case. It’s just a reminder that there may well be very good reasons why some logic is in place. If it is at all possible to find out those reasons, then that would generally be preferable, in order to make an informed decision.

[0] https://en.wikipedia.org/wiki/G._K._Chesterton#Chesterton's_...


> that is dangerous, because you’re likely to overlook some edge case or some unconsidered use case that may come back to bite you later.

I know, and that's exactly why I don't like it. In my experience, lack of understanding bites you way harder than lack of a single fence.

I once worked at a company that had a very complicated patented (!) algorithm to calculate a certain value. Not a single person in the company knew why it was so complicated, and nobody ever questioned it. We would constantly struggle to introduce new rules to it, because they would conflict with the old rules, and it wasn't clear how to resolve those conflicts.

Since we didn't understand what value those magical rules provided, there was absolutely no way we could resolve conflicts in a way that retains the original value (if there was any).

Eventually I was able to convince everyone and just removed all the rules we didn't understand. Everyone sighed with a relief.

In hindsight, I believe the sole reason algorithm was so complicated was that having a patented algorithm would sound more sexy to investors, and you can't patent something stupidly simple.


+10

Throwaway commenting often makes people feel as if they’re adding value, whilst fulfilling their obligations.

Of course such comments have no value and worse, can distract from what ought to be commented.

Perhaps ChatGPT has found a use, if it is asked to add comments to code, although I somehow doubt it…


thats a legit case of, "its impossible to gleam that fact from the code and the code alone." (although I would have left that as a comment for the reviewer in the PR) so if theres a conversation around that decision, it would have been captured there.)


The discussion in the PR does not help the next guy…

All why answers belong into the code, as comments.


Interesting paper, thanks for posting. The paper goes on to say:

> One should take care not to overestimate the impact of language on defects. While the observed relationships are statistically significant, the effects are quite small. Analysis of deviance reveals that language accounts for less than 1% of the total explained deviance


Where are Ada and Rust mentioned in that link? The top listed down to Ruby in that link by general failure seem to be: Scala, Perl, Clojure/PHP (tied), Haskell/Go (tied), TypeScript, Ruby.


The Ruby community has a very good culture of testing application and framework code, but in my experience is quite poor at properly specifying behaviour or writing truly robust code.

Many functions in the standard library behave subtly differently from there documentation or have behaviour that is not documented but depended on by applications, and I’ve certainly seen lots of concurrency bugs in both code and tests because the GVL makes it hard to provoke the worst case behaviour.

The unit tests are really useful for Ruby language implementations because they help give us confidence we’re replicating all the required behaviour, but from that point of view I wish the underlying things were better specified so those tests weren’t quite so necessary.

Don’t get me wrong. I love Ruby as a language and loved working on TruffleRuby, but I don’t think the community should be too smug about code quality and testing.


Agree with most of what you said, except on the __impossible__ part.

You can tell everything with code, given enough time and resources. Nothing is "impossible". But there's a point where you hit diminishing returns. It's about efficiency.

Other devs, or your future self, will have to time trying to build the same "castle in your head" as you did. Can a comment shorten that time? If a one-line comment saves 15 minutes of investigation in aggregate, that's a no-brainer. You should add it. Conversely, a line that says "add 5 to x" is just wasting everyone's time.

As a general rule, once I have finally understood a particularly hairy piece of code, I don't want to do it again in 6 months. I have even done ASCII diagrams in the comments explaining how a particularly hairy piece of code worked, when I hated it enough.


ASCII diagram comments are surprisingly common in scientific computing, FWIW. It's not so much that it's always hairy code, more that it helps to add some visualizations of what's going on. Math vs intuitive understanding are often very different things.

It's something I wish was more common than it is. Sometimes you really want that ASCII pseudo graph explaining how minimizing this thing relates to that other thing that we actually care about.

Taken a step further, I really wish we had better systems for managing this type of information alongside a codebase. E.g. the whitepaper, or presentation, or figures, or whatever is often very necessary to understand the code, and ideally it would be under version control and live neatly alongside things. Nicely rendered docs definitely help with this (e.g. latex in comments), but I'm always surprised there aren't more people focusing on this aspect of information management.


In my experience, one man’s “waste of time” is another man’s castle-in-the-head-building-time-saver. At my current company, if I have reasonable, self explanatory code, such as a function with the signature “read_page(page_num: u32) -> PageData”, they’ll gather it reads a page with the given page number and extracts the data. At my last company, if I wrote the same code, I’d receive a comment on my PR complaining I did not document it will enough. I would then add a comment that somehow only I was able to detect was snarkily written, something like “// reads the given page number and returns its data” and they would be happy as clams.

I always assumed that behavior was dogmatic until, over many PRs, multiple members of the team commented that if I commented my code, it would be much easier to read. And comments of the above quality assuaged them.


What I would miss, in the above, is information about:

- where the page is being read from

- what page numbers the function accepts

- what happens if an invalid page number is specified (or if that’s deliberately undefined, stating that fact)

- other possible error conditions, if any

(I’ll assume that PageData already documents what “page” refers to in that context, because for me it’s not “self explanatory” just from the function signature you provided.)


> You can tell everything with code, given enough time and resources.

That's just not true.

There are infinite many ways to write the exact same thing.

Your code can never say why it's like it is, why this approach was chosen and not another.

> […] the comments explaining how a particularly hairy piece of code worked […]

This kind of comments are the bad ones actually.

Your code should be clear enough that it's easy to see how it works and what is happening.

The rule is actually very simple, and I don't get it why so many people have issues with following it.

"Code says what and how, comments say why."


> There are infinite many ways to write the exact same thing.

I agree with that. I don't think that makes things impossible. Only more difficult.

> Your code can never say why it's like it is, why this approach was chosen and not another.

If the why is important, given enough time and resources, someone will eventually complain that something is not working and someone will be tasked to "fix it". It may take them weeks, but they will eventually figure out the why.

> Code says what and how, comments say why.

If we are considering only quality vs quality, I agree. But there's also quantity.

When the how is complicated enough, it is indistinguishable from a why. You will hear the reviewers ask outloud: "But WHY is this done this way?". Even if the problem is merely algorithmic/technical and the code is doing exactly what it's supposed to do.

Then you have two options: spend a potentially significant time refactoring things (which is always the right choice, given enough time and resources) or spend 10 minutes writing a comment with what you learned, and moving on to the next problem.


> The rule is actually very simple

Why does there have to be rules about when to comment? I comment frequently about whatever the hell I want. Don’t like it? Then don’t read the comments. Very simple, just like your rule.


Useless comments are like useless code: They create an unnecessary (and therefore costly) maintenance burden.

If you think you're the smartest person on the planet so "rules" (or better said best practices) don't apply to anything you do I hope nobody ever needs to collaborate with you.

All the "rules" are there for a reason!

No, you're not free do to whatever you like in a professional setting. For example when you're operating machines you need to follow safety regulations. Otherwise you could even end up in jail quite quickly.


> All the "rules" are there for a reason!

Except in this case, you've made up the rule. It is not an axiomatic rule like gravity. I want different rules. So unless we work together, your rules don't apply to me.


Comments are critical at creating beginners to the code base, something rarely the expert class think about accelerating.

For my code? I agree

Code for others to participate in? The simplest possible way for the greatest number of developers to understand as quickly as possible. Not about my best practice or preference but for the greater good.

Coding standards for the elite devs are only so effective at growing that growing quick enough.


The expert will be helped as well when coming back to the code after a few months.


100%.

Comments are a critical way to more quickly re-grok the mental model of what's going on for your future self as well.

I think there was a post in the past few weeks about a single developer who maintained hundreds of his own tools, and for him it came down to process and documentation first so it was easy to come back to months or years later.


The best comments I've seen go something like this:

    # this is going to loop 4x and call foo_bar on the 4th time because when you calculate this number it needs the 4th time to calculate the difference from the sales tax. weird, I know, but the business has special rules about how taxes work in California


Shed a tear for how little our industry values conciseness, and then consider:

  # Call foo_bar only on the 4th loop iteration to handle the weirdness of calculating taxes for California.
(I've just joined a company where everyone's constantly writing giant docs, and no one seems to grok the idea of a "living doc" that gets edited and refined. I think it's slowly making me allergic to verbosity.)


These are both terrible comments (IMHO) because they tantalise the reader (the coder knew why they did this mysterious thing!) but don't actually really explain why this is necessary or how this mysterious thing actually addresses the messy reality involved. The problem might just be that it's an unrealistic example.


Could be elaborated but this is 1000x better than nothing. The reader can look up what's unusual with enough hints from this comment.


If you're going to call something terrible, it's a little more productive to offer a suggestion for something better.


Okay instead of

# Call foo_bar only on the 4th loop iteration to handle the weirdness of calculating taxes for California

# Call foo_bar only on the 4th iteration because in California widget tax # (1st iteration) doesn't apply at all, foo tax (2nd iter.) only applies # for B2B transactions (this is not one), bar tax (3rd iter.) only applies # for alcohol and this subsystem is not called for alcohol sales. But VAT # (4th iteration) does apply.


HN hasn't respected my line breaks.


I believe you need to have multiple line breaks between lines, I also appreciate the comment. I originally agreed with the other poster, but your comment is way better. Especially for having no context :D


I get your point (yours is written much better), but having comments that feel more like inner monologue is actually a good thing.


This is overly expository and could be much shorter. It also shows the programmer had no interested in actually understanding what their code was doing. The comment makes absolutely no sense whatsoever. "It needs the 4th time". What needs it? I absolutely guarantee you you do not need to loop 4 times to "calculate the difference from the sales tax". "Weird, I know" -- I guarantee you it's not weird if you actually understand anything about the calculation you're supposed to be implementing. This comment barely passes as an English sentence, much less someone who knows anything about taxes would ever say. This is such a great example of an absolutely terrible comment belying lazy, thoughtless programming.

I'm not sure I've seen all the "comment everything" advocates in this thread provide a single example of a good comment.


func handleFourTimesForCaliforniaTaxes() {

   for i in 0..<4 {  

      if i == 3 {  

        foo()  

      }  
      else {  

         foo_bar()  

      }  

   }  

}


I worked at a late-stage Rails-based startup that had "dont write comments" as an engineering principle!

The python folks before me used this to avoid writing docstrings for modules/classes/methods, so code could only be understood by reading the entirety of it. Throw in some deep class hierarchies and it was very hard to onboard there.


TBF, deep class hierarchies will hard to onboard whatever you do.

If you're relying on comments to help with that, you'll have to share the same mindset as the person writing the comments, except there's probably multiple years of understanding gap between you and them.


I went from a rails developer to iOS; Apple and iOS developers in general are pretty good at comments, and I now value them more. Comments at the API level mean I don’t have to go diving however deep to find out what a method does. If they’re really good, they can also be used to distinguish the specification from what’s actually happening when there are bugs.

In the simplest cases, sure you don’t need to write “getName returns a user’s first name and last name separated by a space character,” but I’m guessing the method requestPurchase will have a lot of nuance and I hope it’s documented (yes, even the “what happens”)


> “getName returns a user’s first name and last name separated by a space character,”

This would be actually a quite interesting comment as it indicates that the implementation of `getName` is buggy in regard to internalization.

Here's the classic post about this topic:

https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...


I'm at the point where I want to see docs on any non-private properties/fields, methods, parameters, classes, etc.

I think there's nearly always value to be added:

* Explaining required fields, if null/blank/0 is allowed, where an ID comes from (db vs app-generated), if string values are formatted or require formatting (credit card, phone numbers), max length or other restrictions

* If a class/method is thread-safe or not (when not obvious and misuse is dangerous), error conditions, timeouts.

* Any external or indirect dependencies for use (config, packages, etc)

* Links to other docs/wikis or tickets is hugely useful.

When you're writing/working on the code you already know all this stuff, and it takes only a couple minutes to document. When someone else (or you, 6 months later) comes along to use or modify it, figuring everything out from scratch can take hours -- or worse, bug reports from QA or customers. Docs shave this down to seconds and directly avoid bugs.

The other huge benefit I often experience is through trying to write docs for something I realize there's a better, more obvious name that makes it easier to use and requires less explanation (less docs). This happens on easily 5-10% of the things I write docs for.


What question will the next engineer have about your choice?

1. Rewrite your code to make the answer obvious.

2. If that's not possible use a comment.


3. Be aware that programmers tend to be bad at judging what is “obvious” about their code.


Document the why, not the what.


A lot of code is WTF enough that a comment explaining what it is attempting to do might be helpful. The real world is full of absolutely shit code, but it still earns the company money.


Hmm... I think you might be missing the concept a bit. What I mean is that there is a lot of documentation that is like this:

  /*
   The phone number
  */
  private static String phoneNumber;
That is considered 'the what', the documentation doesn't explain 'why' the phone number is a field on the class. This might be obvious because the class is called 'Address'... so people will argue that it is self documenting, and it is.

The WTF code is 'the why'...


In something like your example, the what is obvious. But I've seen a lot of code where the what is not at all obvious, and a comment succinctly summing or breaking it down up would help drive the understanding.

The way is useful in addition to that, and often IME far more important than the what. IMO you should document both, as necessary, and that's the hard part/skill. You have to treat it like you won't remember it in 6 months, because in 6 months, you won't. A lot of devs don't do that.

(A fair number of devs won't follow the style guide / idiomatic patterns of a language if there isn't a CI check forcing them to, and will argue to all ends with a human as to why it ought not apply to them, to the great detriment of their code's readability to other experience practitioners of the language. Oddly CI enforcement usually works better, why I don't know.)


> I've seen a lot of code where the what is not at all obvious

So then the question that needs answering is:

"why" is the what not obvious?

Document the "why."


The comment is missing everything useful. If it was public, I'd definitely ask for docs on it:

What is the phone number for? When I read it, is it ready for display as-is or should I be formatting it? Same question when I set it. Can it be null or empty? Does or can the system use this number for anything - such as sending SMS messages?

Being static is a huge red flag, though, and even though it's private I'd still ask. Why is it static? Is this thread-safe (and how)? What context does it get set in, and when can I use it?


Yea, that code was definitely written to throw people off. ;-)


That code isn't WTF. Its completely obvious what it is doing, and the comment is unnecessary.


Read gregmac's comment.


I always find the 'self documenting code' concept to be a bit misleading. Because intuitively the idea is that you'd be able to read the actual code and understand what it does, without having to rely on potentially dated or otherwise invalid comments. But in reality we're not really talking about code, but about things like function names - which suffer all of the potential woes as comments, even if to a lesser degree in practice.

For instance, this [1] is a clean implementation of quicksort, which I offer as an example of any non-trivial algorithm. You can't really write it in a way that would make the idea clear to anybody who wasn't already familiar with the algorithm, because the idea itself is non-trivial. So the way intent is made clear and documented is by making sure the function is called QuickSort. And that is indeed 'self documenting', but really in a way that has nothing to do with the code itself.

[1] - https://www.w3resource.com/csharp-exercises/searching-and-so...


    I worked with some Rails folks a while back who 
    were utterly convinced that comments were to be 
    avoided because it meant your code was not clear.
One of the few things that makes me want to reach a management position is my burning desire to alter this widespread, toxic, and absolutely bizarre belief.

As an IC, even a senior IC, it's difficult to effect this change.


There were similar discussions in the PHP community, with arguments that were sensible and rather stupid on both sides of the argument.

I'm honestly a bit torn by it. Yes, your code absolutely should be written in a way that keeps it tidy and easy to understand. But you've always got complex situations which arent always going to be easy or obvious for someone fresh to the codebase.

There should be a good middle ground. I dont need to see comments saying "this is a loop that gets all the users". If its got a variables called users, calling a methog called "getUsers()" then thats pretty damn obvious whats happening.

However if you then go on to do something weird like loop over each user and calculate a score based on the number of posts they've made then theres undoubtedly going to be some logic in there that even just a simple one sentence comment will help someone understand.

It's a fine line, and getting it right is a skill in itself.


Right, it's about the "why". For instance with your loop that calculates a score, I might want to know why it wasn't done in SQL. Or if it could be, but whoever was writing it just wanted to hurry along. Is there a TODO to get rid of an N+1 problem associated with it?


Arkency, a very well-known Ruby consulting company (at least in Europe), are the ones I've seen shout about this 'rule' the loudest: https://twitter.com/arkency/status/1254784379190038534


> assumptions - we’re a small, cohesive team of experienced & responsible engineers

I found that pretty much any methodology works fine in a team like that.

"Don’t comment the code" still sounds pretty crazy to me, but I can believe that it works for them.


My time at Pivotal Labs was similar. "The tests are the documentation!"


I have worked with a few developers who shared this motto. I didn't agree with them.

While it's anecdotal, my understanding is that we were simply working on very, very different codebases. I was refactoring multi-million lines C++ codebases, implementing concurrent algorithms in which dependencies cannot be expressed through language constructs and I had to make non-trivial choices to prioritize performance, or safety, or security at the expense of readability.

On the other hand, these (few) developers were working on writing fresh code, with less footgunny languages, with simpler data and control dependencies and didn't need to make hard choices due to perf/safety/security.


I’ve surprisingly heard it said on a c++ project.


That's... surprising indeed!

What kind of domain was it?


Simulation


Oh? Their tests have 100% coverage of every feature, and every meaningful, non-obvious interaction between features?


Pretty much yes, look them up. They are huge on TDD and pair programming.


I'll believe it when I see their codebase. In my experience (10 years in test infra for ~50 to 300-dev products)[1], for any non-trivial product, there's always a colossal testing gap between what I would consider to be thorough coverage and actually attained coverage. The combinatoric complexity I alluded to is one of the reasons for it.

And this is despite significant and on-going investments in test cases, test infra, integration testing infra, and good overall dev culture around quality.

Also, pair programming has nothing to do with any of this. Mentioning it is like saying that their tests are better because they use IntelliJ instead of Eclipse. It's a very interesting fact, but irrelevant to this.

[1] I will, however, point out, that there have been many times in my career that I have said 'This comment should be a test'. Tests are incredible. [2] But there is no circumstance where I would ever agree with a blanket policy of 'Every comment should be a test'.

[2] It's entirely feasible, and highly desirable to get a small project to the point where just running the tests makes you feel 100% confident in the correctness of your changes. It is highly desirable to drive large products towards such a point, but those efforts can, at best, approach it asymptotically, and cover some enumerated, but limited set of use-cases and data-flows. [3]

[3] And don't even get me started on verifying adherence to security best-practices through testing. It's an utter miracle that security gets two thoughts from a test-writer, and those thoughts are inevitably "These checks are annoying, how do I disable them to get my test to work?"


Wow, sounds like you’ve consistently worked only in some really shitty environments, sorry for you. The worst companies I've ever worked for, had no testing.

Seriously, PL is all about TDD, I'm not joking. PP actually does matter because two people sign off on commits that they worked on together and it is quite an achievement to convince your coworker that you're sitting next to, to not write a test. Consider it a sort of checks and balances. PL's version of pair programming is 100% of the time, it works.

I have been running an open source project for a few years now with 40k downloads a month on NPM. This project has a heavy dependency on two other projects. Testing isn’t 100%, but it is enough to rely on it for releases. I also started out with TDD as well. Nothing is perfect, in fact we just had a release yesterday with a contributed fix in it, that came with a test, and still ended up with a bug... but out of 118 tags, a few here and there isn't bad. I routinely upgrade all the dependencies and never hear a peep out of people who depend on it.

Most recently, I wrote some software that ran on 20k servers... it was extremely well tested because if it failed, it could have caused massive amounts of downtime.

I've worked for a number of other companies where we had fantastic testing infrastructure, CI/CD... mostly because we started from day one with that. Bolting it on after the fact, never happens to the degree it should.

I just saw Hashicorp release Hermes today... and was saddened to see it had zero testing in it. I just sighed and closed the browser window. It just isn't worth it.


> Consider it a sort of checks and balances.

So, just like a proper code review process.

> I have been running an open source project for a few years now with 40k downloads a month on NPM. This project has a heavy dependency on two other projects.

Are they dependencies on libraries, or are they dependencies on services?

Library dependencies are 'easy' to test. Service dependencies are very, very difficult. Tractable, but very difficult. That's just one of the difference between a small project, and a large product.

Just because we have vastly more than two dependencies doesn't mean that I'm working in a shitty environment. It just means I'm working on a large problem.

> I've worked for a number of other companies where we had fantastic testing infrastructure, CI/CD... mostly because we started from day one with that. Bolting it on after the fact, never happens to the degree it should.

I'm not sure why you believe that all of these things weren't with us at day 1, either.


> So, just like a proper code review process.

Except it happens in real time, with a real person, sitting next to you.

> Are they dependencies on libraries, or are they dependencies on services?

In this case, other libraries... but I honestly don't see the difference. Libraries and services have APIs. They either work or they don't. Services have the additional complexity of network failures and runtime errors, but these are things that can be dealt with and tested in the primary project.

> Just because we have vastly more than two dependencies doesn't mean that I'm working in a shitty environment.

You made the wrong equation there. The environment is shitty because you said that you're in an environment where people have created products so complex that it is very difficult to achieve adequate testing. You also shrugged off Eclipse vs. IDEA... where I'd actually say that matters. You also talk about disabling tests for security... that seems absurd.

> It's entirely feasible, and highly desirable to get a small project to the point where just running the tests makes you feel 100% confident in the correctness of your changes.

I've built multiple companies that have achieved significant revenue on that confidence. Not small projects. I know it is possible to do.

> I'm not sure why you believe that all of these things weren't with us at day 1, either.

So you started off good and ended poorly? Another reason to believe that the environment is shitty.


Testing library and service dependencies is similar in theory. The difference between theory and practice is that in theory, there isn't one.

I assure you, I am a real person, generally sitting adjacent to people whose code I'm reviewing, or who review my code.

The projects are complex because the business needs are complex. This criticism is the 'I could build Twitter in a weekend, why does it have five thousand people working on it, anyways?'

I, like most people, have plenty of opinionated preferences on the subject of Intellij and Eclipse, but I'm not going to drive any broad conclusions regarding their impact on a product's test coverage. And I will postulate that most attempts to do so reduce down to cargo-culting. It was an example of a largely irrelevant technical decision, and I think it's telling that you are latching onto it.

And it's not disabling tests for security, it's disabling security for tests, as a distressingly common example of a routinely undertested interaction between two complicated spaces.


> Testing library and service dependencies is similar in theory.

Not really. The theories around testing are all well established and trodden at this point. People who've been around long enough, have this experience.

> The projects are complex because the business needs are complex.

Complex business needs are separate from the functional testing of individual components. Testing business logic is well... a matter of writing a test, seeing it fail and then writing the business logic (however complex), to make that test pass. Books and PhD's have been written about TDD. The stuff isn't rocket science.

> generally sitting adjacent to people whose code I'm reviewing

Pivotal Pair programming is literally sitting next to the person, sharing a single computer, two mirrored monitors, two keyboards, two mice. Two people independently working together to control a single computer. 8 hours a day (with breaks, of course). Is that what you do?

> I think it's telling that you are latching onto it.

I think it's telling that you dismiss it. Tools are important. It is like telling a car mechanic that every wrench is the same and unimportant. There is no debate on this, IDEA 'eclipsed' Eclipse, especially for Java development (and honestly, everything else), over a decade ago. The final straw for me was when Eclipse crashed for the 1000th time and torched my workspace to the point that I had to delete Eclipse to get it running again.

> it's disabling security for tests

Which says to me that the 'frameworks' used for testing aren't set up well or that the developers you're working with don't have an understanding of or care for how to do that. No thanks. Get it right from the start and make it friendly to the people you're working with so that they aren't so annoyed they have to disable tests. It also says to me that the way things are coded are so intertwined with the security layer that they can't be easily tested independently... which is also a fundamental design issue. Obviously, there are exceptions, but the underlying issue still exists if developers are disabling tests because of this.


Chiming in on this point:

> In this case, other libraries... but I honestly don't see the difference. Libraries and services have APIs. They either work or they don't. Services have the additional complexity of network failures and runtime errors, but these are things that can be dealt with and tested in the primary project.

In my personal experience, there are considerable differences.

From the top of my head:

1. Typically, your libraries don't shift under you without any action from your part. External services do.

2. In CI, you can either use the external service (which may require piercing holes in your CI's security and may break due to no change of your part), replicate it (which is more stable but can be extremely painful and resource-consuming) or mock it (which is even more stable but limits the realism of your tests).

3. And of course, as you mention, services have all sorts of failure modes, including but not limited to network failures. Normally, your tests should cover all these failure modes. I don't think I've ever seen code that does that, in part because service errors are very often notoriously under-documented.

YMMV


> 1. Typically, your libraries don't shift under you without any action from your part. External services do.

External services should have versioning, just like libraries.

> 2. ...

They don't need to be 'or'.

> 3. ...

I write tests for this. If I'm using a networking library/function (let's use `fetch` as an example) to talk to a 3rd party external service, I write a wrapper around it to deal with error cases. I then write tests for that wrapper. Anything that uses that wrapper is then inherently tested at the networking level. I then write tests for my own code which uses the wrapper, to deal with the thrown exceptions and how that code should deal with them (retries, throw again, etc).

You don't do that? Do you just assume that all calls to `fetch` succeed?


You don't need 100% to be better than comments.


You don't need to be at 100% to be better than some comments.

Mind you, no serious project is anywhere close to 100% coverage. It may have 100% unit-test coverage, but that is a far cry from 100% coverage.

And the valuable comments are the ones that describe integration interactions.


No serious project is anywhere close to "100% comment coverage" either.


I'm not pitching comments as a replacement for all tests, like some are pitching tests as a replacement for all comments.

I'm pitching comments as a compromise when the test that encodes their meaning is impractical.


Combining tests and comments is still not going to get you to 100% coverage though. So for comments to be worthwhile the marginal gain from using them has to outweigh the cost. IME it's usually better to put that effort into better tests/tooling/etc..


I'm curious, who is pitching for that?


I tend to be the one who comments code the most on my team. I’ve learned how often and what to comment via my side projects. I tend to stop and go with them, sometimes going a year or two before picking them back up again. You figure out what you wish you’d left yourself as comments.


Rails guy turned rustacean here. If I have a piece of code that looks gnarly, I'll write a comment above it explaining what it does. Elegance != readability always. This can often happen with complicated map/filter/etc chains (both in ruby and in rust!) that are compact but are complicated enough that your eyes glaze over when you read them. Anything eye-glaze-over-ey is probably worth a comment in my opinion.

Other times to comment include when there is some sort of dirty hack, TODO annotations, etc.


The why should be clearly explained in the Jira ticket associated with the commit. Comments are useful only to explain non obvious corner cases and invariant. You can write whatever you want in your Jira instead of polluting the code for everyone.


> I worked with some Rails folks a while back who were utterly convinced that comments were to be avoided because it meant your code was not clear.

Obviously, unclear code without comments is better then unclear code with comments.


Small, well named functions have served me well in this regard, eg.

if (theWhy()){ theWhat() }


Sounds good in theory. In practice, you get stuff like https://github.com/git/git/blob/master/cache-tree.c#L246


I can't tell what's wrong with that — could you elaborate?


It's fine. but it's not exactly small, either. It does a lot of different things.


This is something many people don’t do. They just put some weird statements that will change with time and deciphering what they meant is pain


As a "rails folk", we aren't all like that :)


> I worked with some Rails folks a while back who were utterly convinced that comments were to be avoided because it meant your code was not clear.

What this means is that if your Ruby code is clear it should read like natural language. Injecting subtitles into the middle of the expression

# I decided to phrase it this way because I was in a rush and it was the first thing that came out. With more time I could no doubt articulate this in a more communicative way, but for the sake of a random post on the internet I'm not terribly worried.

interrupts the flow when one is reading the code, which makes for a much less pleasant experience. Ruby may not be the only language that is like this, but it is relatively unique in this regard. Comments in other languages don't seem to cause the same interruption.

# That's all I've got. Time to clean up.

I'm not sure this means don't provide additional information to future readers that may be beneficial, but be discriminating in where you put it. Or do whatever you want. Who cares what someone else thinks?


> should read like natural language

The history of programming is full of that notion and it never really works out. See: COBOL, SQL, and so on.

I think we can all agree that code that is easy to read is better, but sometimes the 'why' needs spelling out.


Totally agree. I even said so at the end. You would have noticed if the comments didn't get in the way. :)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: