
> I always thought of race conditions as corrupting the data or deadlocking. I never thought it could cause performance issues. But it makes sense, you could corrupt the data in a way that creates an infinite loop.

Food for thought. I often think to myself that any error or strange behavior or even warnings in a project should be fixed as a matter of principle, as they could cause seemingly unrelated problems. Rarely is this accepted by whoever chooses what we should work on.
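To make the quoted point concrete, here is a minimal Java sketch (class name and sizes invented) of the kind of unguarded concurrent writes that can corrupt a TreeMap's internal structure; whether a given run actually ends up spinning in a later lookup or traversal depends entirely on timing:

    import java.util.Map;
    import java.util.TreeMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ThreadLocalRandom;
    import java.util.concurrent.TimeUnit;

    public class UnsafeTreeMapDemo {
        public static void main(String[] args) throws InterruptedException {
            Map<Integer, Integer> map = new TreeMap<>(); // not thread-safe
            ExecutorService pool = Executors.newFixedThreadPool(4);

            // Unsynchronized writes from several threads can leave the
            // red-black tree in an inconsistent state; later puts, gets or
            // traversals may then loop instead of terminating.
            for (int t = 0; t < 4; t++) {
                pool.execute(() -> {
                    for (int i = 0; i < 100_000; i++) {
                        map.put(ThreadLocalRandom.current().nextInt(), i);
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
            System.out.println("size=" + map.size()); // often wrong even when it returns
        }
    }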




It's a decent rule of thumb, but it definitely needs some pragmatism. Squashing any error, strangeness and warning can be very expensive in some projects, much more expensive than paying for the occasional seemingly unrelated problem.

But of course it's quasi-impossible to know in advance the likelihood of a given error participating in a future problem, and whether it's cheaper to fix this error ahead or let the problem happen. So it becomes an art more than a science to decide what to focus on.

"fix nothing" is certainly a horrible approach, "fix everything" is often impractical. So you either need some sort of decision framework, or a combination of decent "instinct" (experience, really) and trust from your stakeholder (which comes from many places, including good communication and track record of being pragmatic over dogmatic)


> Squashing any error, strangeness and warning can be very expensive in some projects

Strongly disagreed. Strange, unexpected behaviour of code is a warning sign that you have fallen short in defensive programming and you no longer have a mental model of your code that corresponds with reality. That is a very dangerous place to be in. You can very quickly end up stuck in quicksand not too far afterwards.


Depends a lot on the project, I think, as the parent comment suggests.


I feel like these categories are different. Warnings should generally be treated as errors in my book, and all errors should be corrected. But "strangeness" is much more open ended. Sometimes large systems don't behave quite as expected and it might make sense to delay a "fix" until something is actually in need of it. If none of your tests fail then does it really matter?


> If none of your tests fail then does it really matter?

Yes. Absolutely.

You don't believe your software is correct because your tests don't fail. You believe your software is correct because you have a mental model of your code. If your tests are not failing but your software is not behaving correctly, that means your mental model of your code is broken.


I agree for small systems. But as they get larger you often can't keep track of every last piece simultaneously. It can also become quite involved to figure out why a relatively obscure thing happened in a particular case.

Consider something like Unreal Engine for example. It's not realistic to expect to have a full mental image of the entire system in such a case.

At least in theory the tests are supposed to cover the observable behavior that matters. So I figure if the tests pass all is well. If I still find something broken then I need to add a test case for it.


> But as they get larger you often can't keep track of every last piece simultaneously

Sure, but then you divide the larger system into smaller components, where each team is responsible for one or a few of these individual pieces and the chief architect is responsible for making sure the pieces are put together correctly.

> At least in theory the tests are supposed to cover the observable behavior that matters. So I figure if the tests pass all is well. If I still find something broken then I need to add a test case for it.

But you sure as hell hope that the engineer implementing your database has a decent mental model of the thread safety of his code and doesn't introduce subtle concurrency bugs just because his tests are still green. You also hope that he understands that he needs to call fsync to actually flush the data to disk instead of going yolo (systems never crash and disks never fail). How are you supposed to cover the user-observable behavior in this case? You cut the power supply to your system / unplug your disk while writing to the database and assert that all the statements that got committed actually persisted? And how many times do you repeat that test to really convince yourself that you are not leaving behind a bug that will only happen in production systems, say, once every three years?

I am only giving database and multithreading as examples because they are the most obvious, but I think the principle applies more generally. Take the simplest piece of code everyone learns to write first thing in uni, quicksort. If you don't have a sufficient mental model for how that algorithm works, how many tests will you write to convince yourself that your implementation is correct?
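To make the fsync point concrete, a small Java sketch (file name invented); FileChannel.force is roughly the fsync-style call being described, and skipping it leaves durability at the mercy of the OS page cache:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    public class DurableWrite {
        public static void main(String[] args) throws IOException {
            Path journal = Path.of("journal.log"); // hypothetical file
            try (FileChannel ch = FileChannel.open(journal,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                ch.write(ByteBuffer.wrap("commit 42\n".getBytes(StandardCharsets.UTF_8)));
                // Without this, the bytes may still sit in the OS page cache
                // and be lost on power failure, even though write() returned.
                ch.force(true);
            }
        }
    }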


> then you divide the larger system into smaller components where each team is responsible for one or few of these individual pieces and the chief architect is responsible for making sure of how the pieces are put together

And then you have ravioli code in the large. It is not going to make it easier to understand the bigger system, but it will make it harder to debug.

https://en.wikipedia.org/wiki/Spaghetti_code#Ravioli_code

If you can reproduce an error, you can fix it. Do that.

If you cannot reproduce it after a day of trying and it doesn’t happen often, don’t fix it.


Off topic but when I first saw ravioli code it was in a positive light, as a contrast to lasagna code. But then somewhere along the line people started using it in a negative manner.

There is some optimal level of splitting things up so that it's understandable and editable without overdoing it on the abstraction front. We need a term for that.


Pizza code? Central components are mostly clearly distinguishable and glued together in a fairly consistent manner.


Probably not the best examples.

Sqlite famously has more test related code than database code.

Multithreading correctly is difficult enough that multiple specialized modeling languages exist, and those are cumbersome enough that most people don't use them. In practice you avoid bugs there by strictly adhering to certain practices regarding how you structure your code.

You mention fsync but that famously does not always behave as you expect it to. Look up fsyncgate just for starters. It doesn't matter how well you think you understand your code if you have faulty assumptions about the underlying system.

Generally you come across to me as overconfident in your ability to get things right by carefully reasoning about them. Of course it's important to do that but I guarantee things are still going to break. You will never have a perfect understanding of any moderately large system. If you believe you do then you are simply naive. Plan accordingly (by which I mean, write tests).


I highly doubt anyone has a mental model of all the code they're working with. You very often work with code that you kind of understand, but not fully.


I obviously meant the code that you own and are responsible for.


Same thing there.


And/or so are your tests.


No, they are not. They are a cheap way of verifying that something hasn't gone wrong, not a proof of correctness.

Tests failing implies the code is incorrect. Tests not failing does not imply that the code is correct.


> Tests not failing does not imply that the code is correct.

I don't think that's what's being suggested. Tests not failing when your code does implies that you are missing test cases. In other words things are underspecified.

Haskell is the extreme example of this. If it successfully compiles then it most likely does exactly what you intended but it might be difficult to get it to compile in the first place.


>Tests not failing when your code does implies that you are missing test cases. In other words things are underspecified.

I am really confused. Have you guys never written any multithreaded code? You can write the most disgusting thread-unsafe code without a single lock and be perfectly green on all your tests. And who in the world can write tests to simulate all possible timing scenarios to test for race conditions?

I give multithreading as just the most egregiously obvious example that this "tests can prove correctness" idea is fundamentally broken, but I think it applies more generally.
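A minimal sketch of that point, assuming JUnit 5 (class names invented): the single-threaded test stays green no matter how thread-unsafe the code is.

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import org.junit.jupiter.api.Test;

    class HitCounter {
        private int hits;
        void record() { hits++; }      // read-modify-write, not atomic
        int total() { return hits; }
    }

    class HitCounterTest {
        @Test
        void countsHits() {
            HitCounter c = new HitCounter();
            for (int i = 0; i < 1_000; i++) c.record();
            assertEquals(1_000, c.total());  // always passes single-threaded
        }
        // Under concurrent use record() silently loses updates, and no fixed
        // set of tests can enumerate every interleaving that would expose it.
    }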

>Haskell is the extreme example of this. If it successfully compiles then it most likely does exactly what you intended but it might be difficult to get it to compile in the first place.

Absolutely 100% of the safety of Haskell comes from the mental model (functional programming, immutable data structures, etc.) and none from the test cases (although their community appears to do even testing slightly better than others).


My Haskell comment was regarding specification of the overall system, not tests specifically. It was a reference to the incredible type system.

> this "tests can prove correctness" idea

You are the only one putting forward such an idea. It's not that I think tests passing proves correctness. It's that I know from experience that I don't fully understand the system. The code that breaks is always a surprise to me because if it wasn't then I would have fixed it before it broke.

So if my code breaks and tests don't catch it then I figure that before I fix it I should add a test for it.

Of course there are some categories such as multithreading that you generally can't test. So you take extra care with those, and then inevitably end up sinking time into debugging them later regardless.


>You are the only one putting forward such an idea

This was the very first comment you made that started this thread:

>> If none of your tests fail then does it really matter?


"it doesn't matter" != "code is correct"

Sometimes it's just not cost effective to solve every last bit of jank.

If you write software that controls safety critical systems then obviously that statement does not apply. But if you write webapps or games or ...


>"it doesn't matter" != "code is correct"

Fair. Despite the lengthy argument, I don't think our stances are all that drastically different. I am not saying that you shouldn't write tests, just that the mental model comes first and that the tests are informed by it (in an ideal world the tests are an automatically executable specification of your mental model).

Still I don't understand why you insist that "there exists an automated test for it" has to be the definition of whether something matters.

> But if you write webapps or games or ...

In which case it might just be fine to YOLO it without tests.

People have been delivering valuable software without automated tests for decades. Check out the code of the Linux kernel from circa 2005 and see how many tests there are.


I'm not saying that passing tests proves the code is correct, I'm saying that if you find a problem with the code that your tests don't pick up, then you should add a test for it.


100% this. If our product does something unexpected, finding out why is top priority. It might be that everything’s fine and this is just a rare edge case where this is the correct behaviour. It might be a silly display bug. Or it might be the first clue about a serious underlying issue.


I mean, it is expensive. It’s just that the alternative might be more so.


> might be

Yes.

Or it might be cheaper.


I remember a project to get us down from something like 1500 build warnings to under 100. It took a long time, generated plenty of bikeshedding, and it was impossible to demonstrate its value.

I, personally, was mostly just pissed we didn't get it to zero. Unsurprisingly the number has climbed back up since


Could you propose to fail the build based on the number of warnings to ensure it doesn't go up?

I did something similar with SpotBugs. There were existing warnings I couldn't get time to fix, so I configured Maven to fail if the count exceeded the level at which I enabled it.

This has the unfortunate side effect that if it drops and no one adjusts the threshold then people can add more issues without failing the build.


> This has the unfortunate side effect that if it drops and no one adjusts the threshold then people can add more issues without failing the build.

Our tests are often written with a list of known exceptions. However, such tests also fail if an exception is no longer needed - with a congratulatory message and a notice that this exception should be removed from the list. This ensures that the list gets shorter and shorter.
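A rough JUnit sketch of that pattern (class names and the violation check are placeholders; a real test would read them from a linter report or similar):

    import java.util.Map;
    import java.util.Set;
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.fail;

    class KnownOffendersTest {
        // Placeholder for "does this class still violate the rule?"
        private static final Map<String, Boolean> CURRENT_VIOLATIONS = Map.of(
                "legacy.OrderImporter", true,
                "legacy.ReportWriter", true);

        // Grandfathered offenders tolerated for now.
        private static final Set<String> KNOWN_OFFENDERS =
                Set.of("legacy.OrderImporter", "legacy.ReportWriter");

        @Test
        void offenderListOnlyShrinks() {
            CURRENT_VIOLATIONS.forEach((name, violates) -> {
                if (violates && !KNOWN_OFFENDERS.contains(name)) {
                    fail("New violation in " + name + ": fix it rather than extending the list.");
                }
                if (!violates && KNOWN_OFFENDERS.contains(name)) {
                    fail("Congratulations, " + name
                            + " is clean now. Remove it from KNOWN_OFFENDERS.");
                }
            });
        }
    }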


I did this with a project that was worked on by a large team (50+ devs, many, many, many kloc) when we first added linting to the project (this was very early 2000s) - we automatically tracked the number of errors and warnings at each build, persisted them, and then failed the build if the numbers went up. So it automatically adjusted the threshold.

It worked really well to incrementally improve things without forcing people to deal with them all the time. People would from time to time make sure they cleaned up a number of issues in the files they happened to be working on, but they didn't have to do them all (which can be a problem with strategies that, for example, lint only the changed files but require 0 errors). We threw a small line chart up on one of our dashboards to provide some sense of overall progress. I think we got it down to zero or thereabouts within a year or so.


If you can instead construct a list of existing instances to grandfather in, that doesn't suffer from this problem. Of course many linting tools do this via "ignore" code comments.

That feels less arbitrary than a magic number (because it is!) and I've seen it work.


We used this approach to great effect when we migrated a huge legacy project from JavaScript to TypeScript. It gives you enough flexibility in the in-between stages so you're not forced to change weird code you don't know right away, while enforcing enough structure to eventually make it out alive in the end.


Absolutely could!

However, management felt kinda burned because it took a bunch of time and, unsurprisingly, nobody was measurably more productive afterwards (it turns out those were just shitty code tidbits, not highly correlated with the areas where it is miserable to make changes). Some of the over-refactorings probably made things harder.

It was a lovely measurable metric, making it an easy sell in advance. Which maybe was the problem idk.


It depends on the language. It's very easy to generate warnings in C/C++; in other languages they are rare or easy to avoid.


I guess I've been lucky to work at companies in the past ten or so years where compiler warnings were treated as errors in the CI. I've also been really lucky to use an editor which always highlights warnings and lists them, so I can fix those easily.


Step 1: get it to zero

Step 2: -Werror

Step 3: there is no step 3


Fixing everything is impractical, but I'd say a safer rule of thumb would be to at least understand small strangenesses/errors. In the case of things that are hard to fix - e.g. design/architectural decisions that lead to certain performance issues or what have you - it's still usually not too time consuming to get a basic understanding of why something is happening.

Still better to quash small bugs and errors where possible, but at least if you know why they happen, you can prevent unforeseen issues.


Sometimes it can take a serious effort to understand why a problem is happening, and I'll accept an unknown blip that can be corrected by occasionally hitting a reset button when dealing with third-party software. From my experience my opinion aligns with yours though: it's also worth understanding why an error happens in something you've written. The times we've delayed dealing with mysterious errors that nobody in the team can ascribe a cause to, we've ended up with a much larger problem when we've finally found the resources to deal with it.

Nobody wants to triage an issue for eight weeks, but one thing to keep in mind is that the more difficult it is to triage an issue the more ignorance about the system that process is revealing - if your most knowledgeable team members are unable to even triage an issue in a modest amount of time it reveals that your most knowledgeable team members have large knowledge gaps when it comes to comprehending your system.

This, at least, goes for a vague comprehension of the cause. There are times you'll know approximately what's going wrong but may get a question from the executive suite about the problem (i.e. "Precisely how many users were affected by the outage that caused us to lose our access_log?") that might take weeks or months or be genuinely nigh-on-impossible to answer. I don't count questions like that as part of issue diagnosis, and if it's a futile question you should be highly protective of developer time.


That's very fair - at least with third party software, it can be nigh impossible to track down a problem.

With third party libraries, I've too-often found myself reading the code to figure something out, although that's a frustrating enough experience I generally wouldn't wish on other people.


This. Understand it at least to a level where you can make an effort vs. risk/impact trade-off. Ideally, eliminate all low-effort issues and mitigate high-risk or high-impact issues. But eliminating them all is not usually practical. And besides, most of the high-impact/high-effort application risk resides in design, not in warnings that come from logs or the compiler.


If there are any warnings I'm supposed to ignore, then there are effectively no warnings.

There's nothing pragmatic about it. Once I get into the habit of ignoring a few warnings, that effectively means all warnings will be ignored.


At my job we treat all warnings as errors and you can't merge your pull requests unless all automatically triggered CI pipelines pass. It requires discipline, but once you get it into that state it's a lot easier to keep it there.


The last point is the key.

It then creates immense value by avoiding a lot of risk and uncertainty for little effort.

Getting from "thousands of warnings" to zero isn't a good ROI in many cases, certainly not on a shortish term. But staying at zero is nearly free.

This is even more so with those "fifteen flickering tests" or those 23 tests that have been failing and been ignored or skipped for years.

It's also why I commonly set up a CI, testing systems, linters, continuous deployment before anything else. I'll most often have an entire CI and guidelines and build automation to deploy something that will only say "hello world". Because it's much easier to keep it green, clean and automated than to move there later on


That's because it moves from being a project to being a process. I've tried to express this at my current job.

They want to take time out to write a lot of unit tests, but they're not willing to change the process to allow/expect devs to add unit tests along with each feature they write.

I'll be surprised if all the tests are still passing two months after this project, since nobody runs them.


That’s why TDD (Test-Driven Development) has become a trend. I personally don’t like TDD’s philosophy of writing the tests first, then the code (probably because I prefer to think of solutions more linearly), but I do absolutely embrace the idea and practice of writing tests alongside the code and having minimum coverage thresholds. If you build that into your pipeline from the very beginning, you can blame the “process” when there aren’t enough tests.


The switch that flipped for me, making me practice something TDD-adjacent, was replacing most manual verification with writing a test. Once I got into the habit I found it so much faster and more consistent, and now I have lasting tests to check in!

I don't typically write tests first so it's not true TDD but it's been a big personal process improvement and quality boost.


> to allow/expect devs to add unit tests

For me such gigs are a red flag and an immediate turn-down (I'm a freelancer with enough opportunities, a luxury position, I know).

I would consider it really weird if management dictates exactly what tools and steps a carpenter must take to repair a chair. Or when the owner of a hotel tells the chef what steps are allowed when preparing fish. We trust the carpenter or chef to know this best. To know best how to employ their skills given the context.

If management doesn't trust the experts they hire to make the right choice in how they work, what tools they use, what steps they take, etc. that's a red flag: either they are hiring the wrong people (and the micromanaging is an attempt to fix that) or they don't think the experts are expert enough to make decisions on their own.

For me, this goes for tools (e.g. management dictates I must work on their windows machine with their IDE and other software) for processes (management forbids tests, or requires certain rituals around merges etc) and for internals (management forbidding or requiring certain abstractions, design patterns etc)

To be clear: a team, through or via its management, should have common values and structures and such. And it makes perfect sense for management to define the context (e.g. this is a proof of concept, no need for rigid quality here; or we must get these features out of the door before Thursday, nothing else matters). It's when management dictates how teams or experts must achieve this that it becomes a red flag to me.

I haven't been wrong in this. These red-flags almost always turned out to hint at underlying, deeply rooted cultural problems that caused all the technical troubles.


> I'll be surprised if all the tests are still passing two months after this project, since nobody runs them.

Wouldn't they just run as part of the build? At least for Java, JUnit tests run as part of the build by default.


> At my job we treat all warnings as errors and you can't merge your pull requests unless all automatically triggered CI pipelines pass. It requires discipline, but once you get it into that state it's a lot easier to keep it there.

Sounds like what we used to call "professionalism." That was before "move fast, break things and blame the user" became the norm.


It very much depends on the nature of your work.

If manual input can generate undefined behavior, you depend on a human making a correct decision, or you're dealing with real-world behavior using incomplete sensors to generate a model... sometimes, the only reasonable target is "fail gracefully". You cannot expect to generate right outputs from wrong inputs. It's not wrong to blame the user when economics, not just laziness, says that you need to trust the user not to do something unimaginable.

I think this is the kind of situation where a little professionalism would have prevented the issue: Handling uncaught exceptions in your threadpool/treemap combo would have prevented the problem from happening.


> That was before "move fast, break things and blame the user" became the norm.

When VCs only give you effectively 9 months of runway (3 months of coffees, 9 months of actual getting work done, 3 months more coffees to get the next round, 3 more months because all the VCs backed out because your demo wasn't good enough), move fast and break things is how things are done.

If giving startups 5 years of runway was the norm, then yeah, we could all be professional.


There's basically no proof that software used to be more "professional". Sure the process was more formal, but I've not seen any proof (and I'm not talking about peer reviewed stuff here, but even just anecdotal examples) of the "end result" of those processes being better, more robust or even less buggy than what we get out of what some may call "move fast and break stuff" development.


> professionalism." That was before "move fast, break things

I think you're professing a false dichotomy. Is it unprofessional to "move fast, break things"?

I'm a slow-moving yak shaver, partly due to conscious intention. I admire some outcomes from engineers that break things like big rockets.

I definitely think we learn fast by breaking things: assuming we are scientific enough to design to learn without too much harm/cost.


> I admire some outcomes from engineers that break things like big rockets.

I work in unmanned aerospace. It started with 55lb quadcopters and got… significantly bigger from there.

I’ve thought a ton about what you’re saying over the last 5-6 years. I have broken things. My coworkers and underlings have broken things. We’ve also spent a bunch of time doing design review, code review, FEA review, big upfront design, and all those expensive slow things.

Here’s, for me, the dividing line: did we learn something new and novel? Or did we destroy kilobux worth of hardware to learn something we could have learned from a textbook, or doing a bit of math, or getting a coworker to spend a few hours reviewing our work before flying it?

And I 100% agree with your last statement: you can “move fast and break things for science” professionally. But… if something breaks when you weren’t expecting it to, the outcome of the post-mortem really should be surprising new knowledge and not “this made it to prod without review and was a stupid mistake”


> Is it unprofessional to "move fast, break things"?

Most of the time it is. The sorry state of software in general is a testament to that.


> At my job we treat all warnings as errors

Pretty sure what you actually mean is that you treat some warnings as errors, and disable the others. I would find it hard to believe you're building with literally all the compiler warnings enabled all the time.


We do have a couple disabled by default in the config, but it's still a lot of warnings:

    -Wall,-Wextra,-Werror,-Wno-type-limits,-Wno-attributes,-Wno-deprecated-declarations
And of course some are suppressed with a pragma for short, specific segments of the code where you can make the argument during code review that it's appropriate, but those pragmas stick out like a sore thumb.


> We do have a couple disabled by default in the config, but it's still a lot of warnings

A lot, yes, but there are definitely a lot more than two that you still don't have enabled. Might be worth looking into them if you haven't already -- you will definitely disable some of them in the process.

(And I assume you already have clang-tidy, but if not, look into that too.)


Yep, plus static analysis in CI. We also run ubasan, tsan, and valgrind on our unit tests.


-Wall + -Wextra doesn't actually enable all warnings, though. There are quite a few others that you still have to enable manually.


The other thing is don't catch and ignore exceptions. Even "catch and log" is a bad idea unless you specifically know that program execution is safe to continue. Just let the exception propagate up to where something useful can be done, like return 500 or displaying an error dialog.


I agree, and I think the OP does as well. Really, the Executor framework used here was to blame by not setting the uncaught exception handler to catch and propagate those exceptions by default.

But having shared, mutable state without locks or other protection should never have passed code review in the first place. I would look at that commit to try to determine how it happened, then try to change the team's processes to prevent that.
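For anyone unfamiliar with that gotcha, a small Java sketch: submit() stores a task's exception in the returned Future, so nothing is reported unless someone calls get(), while execute() lets it reach the thread's uncaught exception handler.

    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class SwallowedExceptions {
        public static void main(String[] args) throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(2);

            // submit(): the exception is stored in the Future and stays
            // invisible unless get() is called.
            Future<?> f = pool.submit((Callable<Void>) () -> {
                throw new IllegalStateException("lost unless get() is called");
            });
            try {
                f.get();
            } catch (ExecutionException e) {
                System.err.println("task failed: " + e.getCause());
            }

            // execute(): the exception reaches the worker thread's
            // UncaughtExceptionHandler (by default it prints a stack trace).
            pool.execute(() -> {
                throw new IllegalStateException("reaches the uncaught exception handler");
            });

            pool.shutdown();
        }
    }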


Most commonly you would have a non-concurrent use of TreeMap, and then for performance reasons, someone else would come in and introduce threads in a couple of places, without ensuring that all data access (esp write) is properly guarded.

The way people structure code, it might even be non-obvious that there's a use of something like TreeMap, as it will be abstracted away into an "addNode" method.

Still a red flag, since the process when introducing threads should be "ensure data structures allow for concurrent write and read, or guard them otherwise", but when the task says "performance improvement" and one's got 14x or 28x improvement already, one might skip that other "bit" due to sheer enthusiasm.
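A sketch of what "guard them otherwise" might look like in Java (names invented): either serialize access to the TreeMap or swap in a sorted map built for concurrent use.

    import java.util.Collections;
    import java.util.NavigableMap;
    import java.util.SortedMap;
    import java.util.TreeMap;
    import java.util.concurrent.ConcurrentSkipListMap;

    class NodeIndex {
        // Option 1: keep TreeMap but funnel every access through one lock.
        // (Iteration still needs an explicit synchronized block.)
        private final SortedMap<Long, String> byIdLocked =
                Collections.synchronizedSortedMap(new TreeMap<>());

        // Option 2: a lock-free sorted map designed for concurrent writers.
        private final NavigableMap<Long, String> byIdConcurrent =
                new ConcurrentSkipListMap<>();

        // The map is hidden behind these methods, which is exactly why someone
        // adding threads later may not realize what is underneath.
        void addNode(long id, String label) {
            byIdConcurrent.put(id, label);
        }

        void addNodeLocked(long id, String label) {
            byIdLocked.put(id, label);
        }
    }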


Yes, but…I suppose you have to pick your battles. There was recently a problem vexing me about a Rails project I maintain where the logs were filled with complaints about “unsupported parameters”, even though we painstakingly went through all the controllers and allowed them. It’s /probably/ benign, but it adds a lot of noise to the logs. Several of us took a stab at resolving it, but in the end we always had much higher priorities to address. Also it’s hard to justify spending so many hours on something that has little “business value”, especially when there is currently no functional impact.

It’s a nuisance issue, sorta like hemorrhoids. Do you get the surgery and suffer weeks of extreme pain, or do you just deal with it? Hemorrhoids are mostly benign, but certainly have the potential to become more severe and very problematic. Maybe this phenomenon should be called digital hemorrhoids?


As someone with pretty bad hemorrhoids, I’m hesitant to ask my doctor about surgery because I’ve been told the hemorrhoids will come back, without question. So it’s even still just a temporary fix…


Surgery for hemorrhoids sounds like trying to cure the symptom.

There is something going wrong in your body that has hemorrhoids as a downstream effect. Surgery can't fix the root cause.

If you have constipation then consider the following: the large intestine has bacteria that process your undigested food and this can have many nasty consequences. What is going wrong in the small intestine that leads to this?


Couldn’t you just run a debugger to find all of the incidents of that issue?


We’ve been down many paths on this. In some cases we know exactly where it’s happening, but despite configuring everything correctly, it still complains. It might just be a bug in the Rails code or a fault in the way parameters are passed in (some of the endpoints take a lot of parameters, some of them optional). We could “fix” the issue by simply allowing all parameters, but of course this opens a security risk. This is a 10+ year old code base and I am told it has been a thorn in their side for a long time. It’s one of those battles that I suppose we are not going to try fighting unless we get really bored and have nothing else to work on.


Also, shouldn't the stack trace show you everything you need to know to fix this, or am I missing something? (No experience with Ruby.)

Otherwise, I see cleanups and refactoring as part of the normal development process. There is no need to put such tasks in Jira; they should be done as preparation for the regular tasks. I can imagine that some companies take agile too seriously and want to micromanage every little task, but I guess lack of time for refactoring is not the biggest problem here.


"why don't you just" comments are easy :) (I made one)

Debugging in codebases with a lot of magic (Rails) is hard. It can be very difficult to follow calls around unless you're quite the expert. Certain styles of programming really frustrate me, but then again I program like a scientist, so the kinds of things I'm prone to do frustrate software engineers (for loops nested 8 deep, a preference for single-character variables, etc.).


You’re spot-on with the “magic” of Rails. While it can be very powerful and feature-rich, it can feel like a black box at times and stack traces aren’t always accurate or helpful. To be fair, this happens with a lot of library-heavy frameworks though, Spring Boot being a prime example of this.


> > race conditions […] I never thought it could cause performance issues. But it makes sense, you could corrupt the data in a way that creates an infinite loop.

Even without corruption a race condition can cause significant performance issues by causing the same work to be done many times with only one result being kept.
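A small Java illustration of that failure mode (class and method names invented): the map itself stays consistent, but the expensive work can run once per racing thread, with only one result kept.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    class ReportCache {
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        // Racy check-then-act: two threads can both see a miss and both run
        // the expensive render; one result silently overwrites the other.
        String getRacy(String key) {
            String v = cache.get(key);
            if (v == null) {
                v = expensiveRender(key);
                cache.put(key, v);
            }
            return v;
        }

        // computeIfAbsent runs the render at most once per key.
        String getOnce(String key) {
            return cache.computeIfAbsent(key, this::expensiveRender);
        }

        private String expensiveRender(String key) {
            return "report for " + key; // stand-in for the costly computation
        }
    }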

> Food for thought. I often think to myself that any error or strange behavior or even warnings in a project should be fixed as a matter of principle

For warnings: at least explained in a comment, where it has been decided irrelevant (preferably with a pragma to turn off the warning as locally as possible).

Strange behaviour I prefer to get rid of. I've found code marked (by me at least once!) "not sure why this works, but" which very much no longer works, and had to rewrite in a rush where if it had been addressed earlier there would have been more time to be careful.


There are so many of these in some projects that starting to fix them kills the project.

The only time to fix one is right after it is discovered; otherwise it mostly never happens, because it becomes expensive to build up the context in your mind again.


> Rarely is this accepted by whoever chooses what we should work on.

You need to find more disciplined teams.

There are still people out there who care about correctness and understand how to achieve it without it being an expensive distraction. It's a team-culture factor that mostly just involves staying on top of these concerns as soon as they're encountered, so there's not some insurmountable and inscrutable backlog that makes it feel daunting and hopeless or that makes prioritization difficult.


Most teams are less disciplined than they should be. Also, job/team mobility is very low right now. So the question becomes, how do you increase discipline on the team you're on?


For very small teams, exploring new platforms and/or languages that complement correctness is an option. Using a statically typed language with explicitly managed side effects has made a huge difference for me. Super disruptive the larger the team is, though, of course.


It's also a function of the project size.

Switching that 500k-LOC Ruby or JavaScript project over to Rust ain't happening quickly in either a small or a big team.


I was thinking Haskell not Rust :) But yes, for sure, you'd have to tackle it bit by bit with some sort of bridging if you decided it was worth it.


> Food for thought. I often think to myself that any error or strange behavior or even warnings in a project should be fixed as a matter of principle, as they could cause seemingly unrelated problems. Rarely is this accepted by whoever chooses what we should work on.

I agree. I hate lingering warnings. Unfortunately, at the time of this bug I did not have static analysis tools to detect these code smells.


Another problem with lingering warnings is that it's easy to overlook that one new warning that's actually important amongst floods of older warnings.


> Rarely is this accepted by whoever chooses what we should work on.

I get that YMMV based on the org, but I find that more often than not, it’s expected that you are properly maintaining the software that you build, including the things like fixing warnings as you notice them. I can already feel the pushback coming, which is “no but really, we are NOT allowed to do ANYTHING but feature work!!!!” and… okay, I’m sorry, that really sucks… but maybe, MAYBE, some of that is you being defeatist, which is only adding to this culture problem. Oh well, now’s not the time to get fired for something silly like going against the status quo, so… I get it either way.


And from a security perspective, the "might cause a problem 0.000001% of the time" flaws can often be manipulated into becoming a problem 100% of the time.


All security issues are a subclass of bugs. Security is a niche version of QA.


I've also felt lonely on this hill I'm dying on.

But for some solace https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Dev...


>Rarely is this accepted by whoever chooses what we should work on.

This is the kind of thing you should probably just do, not seek buy-in for from whoever chooses what you should work on. Unless it is going to take some extreme amount of time.


A coworker once told me: "Undefined behavior means the code could theoretically order a pizza"

Hyperbole sure, but made me chuckle and is a nice reminder to check all assumptions.


Not everyone agrees. The SQLite team, for example, famously refuses to fix warnings reported by users.

I myself do try to fix most warnings -- some are just false positives.





