I think the pushback is against certain recipes that seem too absolutist. "Clean Code" is the title of a famous book that defines clean (among other things) as:
"No Duplication
Duplication is the primary enemy of a well-designed system. It represents additional work, additional risk, and additional unnecessary complexity."
So according to this definition, removing duplication is synonymous with simplification, which is simply incorrect.
Removing duplication is the introduction of a dependency. If this dependency is a good model of the problem then this deduplication is a good abstraction and may also be a simplification. Otherwise it's just compression in the guise of abstraction.
[Edit] Actually, this quote is a reference to Kent Beck’s Simple Design that appears in Clean Code.
Actually, when writing highly optimized code, cut-and-paste duplication is not unusual. We may also eschew methods, in favor of static FP-like functions (with big argument lists), in order to do things like keep an executable and its working space inside a lower-level cache.
But also, we could optimize code that doesn't need to be optimized. For example, we may spend a bunch of time refactoring a Swift GUI controller struct, that saves 0.025 seconds of UI delay, avoiding doing the same for the C++ engine code, that might save 30 seconds.
I find "hard and fast rules" to be problematic, but they are kind of necessary, when most of the staff is fairly inexperienced. Being able to scale the solution is something that really needs experience. Not just "experience," but the right kind of experience. If we have ten years' experience with hammers, then all our problems are nails, and we will hit them, just right.
I tend to write code to be maintained by myself. It is frequently far-from-simple code, and I sometimes have to spend time, head-scratching, to figure out what I was thinking, when I wrote the code, but good formatting and documentation[0] help, there. Most folks would probably call my code "over-engineered," which I have come to realize is a euphemism for "code I don't understand." I find that I usually am grateful for the design decisions that I made, early on, as they often afford comprehensive fixes, pivots, and extension, down the road.
TBH, I don't know if I will ever reach the transcendence stage, but my two cents are (after 20 years of professional programming) that breaking the rules is sometimes desirable, but not when the person has not even reached the first stage. The example from OP, as explained in the post, seems a clear case of never reaching stage one while following the simple rule of no duplication.
The conclusion in the article is kind of good and bad: it seems the OP has reached stage two, but for all the wrong reasons, and there is no telling whether they now actually possess the knowledge or whether they just discarded one rule to follow another, either of which can be wrong or right depending on the situation.
That's not just a view or opinion, it's how learning works. Thanks for the link!
When starting to learn a domain, you lack deep understanding of it, so you cannot make sound decisions without rules.
But rules are always generalizations, they never fully encompass the complexity they hide.
Through experience, you should become able to see the underlying reason behind the rules you've been given. Once full understanding behind a rule is ingrained, the rule can be discarded and you can make your own decisions based on its underlying principles.
Then you become an expert through deep understanding of several aspects of a domain. You are able to craft solutions based off intricate relationships between said aspects.
Let's take `goto` as an example. The rule you will commonly see is "don't use it". But if you're experienced enough you know that's unnecessarily constraining, e.g. there's nothing wrong with a `goto` in C for the purpose of simplifying error handling within a function. It only becomes an issue when used liberally to jump across large swathes of code. But your advice to juniors should still just be "don't use it", otherwise your advice will have to be "use `goto` only when appropriate", which is meaningless.
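A minimal C sketch of the error-handling idiom referred to above (the function and its resource choices are illustrative, not from the original post):

```c
#include <stdio.h>
#include <stdlib.h>

/* goto concentrates cleanup in one place, so each failure path
   doesn't have to repeat the free()/fclose() calls. The jumps only
   go forward, to the unwind section at the end of the function. */
int process_file(const char *path)
{
    int ret = -1;               /* assume failure until proven otherwise */
    char *buf = NULL;
    FILE *f = fopen(path, "r");
    if (!f)
        goto out;

    buf = malloc(4096);
    if (!buf)
        goto close_file;

    if (fread(buf, 1, 4096, f) == 0)
        goto free_buf;

    ret = 0;                    /* success */

free_buf:
    free(buf);
close_file:
    fclose(f);
out:
    return ret;
}
```

This is the "good" `goto`: short forward jumps into a single cleanup chain, rather than jumps across large swathes of code.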
Also, that "unnecessary optimization" thing, can be a form of "bikeshedding."
Maybe the app is too slow, and the junior engineer knows Swift, so they spend a bunch of time, doing fairly worthless UI optimization, when they should have just gone to their manager, and said "The profiler shore do spend a mite of time in the engine. Maybe you should ask C++ Bob, if he can figure out how to speed it up."
I love this part: "It represents additional work, additional risk, and additional unnecessary complexity", because it could be "refactored" into "additional work, risk, and complexity". I assume it hasn't been, because (in the author's opinion) it communicates the intended meaning better - which might be the case with code, too. "Well-designed" is subjective.
Good design has objective and subjective elements. Or... it might be more accurate to say that it is entirely objective, but some/many elements are context-sensitive.
For example, a style of writing that is difficult to follow but rewarding to parse for the dedicated and skilled reader may be considered good. It is good at being an enjoyable reading puzzle. But from an accessibility standpoint, it's not a clear presentation of information, so it's not good.
Mostly we call things that are increasingly accessible well designed. But we're using a specific criterion of accessibility. It's a great criterion and it's one we should generally prioritize. But it's not the only facet of design.
In code, we generally could categorize high quality design as accessibility. Most engineers probably think of themselves as not really needing accessibility features (although how many are undiagnosed neurodivergent?), but writing code that is easy to read and parse and follow is an accessibility feature and an aspect of good design.
I'm not really sure I know what you mean by "compression in the guise of abstraction". Re-usable code is a great way to isolate a discrete piece of logic / functionality to test and use in a repeatable manner. A sharable module is the simplest and often smallest form of abstraction.
Reusing a function C in functions A and B makes A and B dependent on C. If the definition of C changes, the definitions of A and B also change.
So this is more than reusing some lines of code. It's a statement that you want A and B to change automatically whenever C changes.
If this dependency is introduced purely out of a desire to reuse the lines of code that make up C, then I'm calling it compression. In my view, this is a bad and misleading form of abstraction if you can call it that at all.
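A tiny C sketch of the A/B/C relationship described above (the pricing functions are hypothetical, chosen only to make the dependency visible):

```c
/* Shared function "C": both callers below now depend on it. */
static long add_tax(long cents)
{
    return cents + cents / 10;   /* 10% tax; change this and BOTH callers change */
}

/* Caller "A" */
long invoice_total(long cents)
{
    return add_tax(cents);
}

/* Caller "B" */
long receipt_total(long cents)
{
    return add_tax(cents);
}
```

If invoices and receipts are genuinely governed by the same tax rule, this coupling is the point, and the deduplication is a good abstraction. If they merely happened to contain the same arithmetic, editing `add_tax` for one caller silently changes the other.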
> Reusing a function C in functions A and B makes A and B dependent on C. If the definition of C changes, the definitions of A and B also change.
To pile onto this example, in some cases the mastermind behind these blind deduplication changes doesn't notice that the somewhat similar code blocks reside in entirely different modules. Refactoring these code blocks into a shared function ends up introducing a build-time dependency where previously there was none, and as a result at best your project takes longer to build because independent modules are now directly dependent, or at worst you just introduced cyclic dependencies.
Unrelated modules often superficially look like each other at a point in time, I think that’s what the parent is referring to. An inexperienced developer will see this and think “I need to remove this duplication”. But then as the modules diverge from each other over time you end up with a giant complex interface that would be better off as 2 separate modules.
So deduplicating unnecessarily compresses the code without adding anything to code quality.
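A hypothetical C sketch of that divergence: two once-identical code paths were merged into one shared function, which then accreted flags as the call sites drifted apart. Every name and rule here is invented for illustration.

```c
#include <stdbool.h>

/* What began as one shared helper now carries the union of both
   call sites' needs. Each flag exists for exactly one caller, so
   every caller must reason about options that don't apply to it. */
long shipping_cost(long weight_g, bool is_export, bool is_fragile,
                   bool legacy_rounding)
{
    long cost = weight_g / 10;
    if (is_export)
        cost += 500;                 /* customs fee: export caller only */
    if (is_fragile)
        cost += 200;                 /* padding: fragile caller only */
    if (legacy_rounding)
        cost = (cost / 100) * 100;   /* old caller expects truncation */
    return cost;
}
```

Two separate, partly duplicated functions would likely be simpler than this shared one, which is the parent's point: similarity at a point in time is not the same as a shared underlying rule.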
> Removing duplication is the introduction of a dependency. If this dependency is a good model of the problem then this deduplication is a good abstraction and may also be a simplification. Otherwise it's just compression in the guise of abstraction.
I think you're referring to coupling. Deduplicating code ends up coupling together code paths that are entirely unrelated, which ends up increasing the complexity of an implementation and the cognitive load required to interpret it.
This problem is further compounded when duplicate code is extracted to abstract and concrete classes instantiated by some factory, because some mastermind had to add a conditional to deduplicate code and they read somewhere that conditionals are for chumps and strategy patterns are cleaner.
Everyone parrots the "Don't Repeat Yourself" (DRY) rule of thumb and mindlessly claims duplicate code is bad, but those who endured the problems introduced by the DRY principle ended up coining the Write Everything Twice (WET) rule of thumb, for good reason, to mitigate those problems. I lost count of all the shit-tier technical debt I had to endure because some mastermind saw two code blocks resembling the same shape and decided to extract a factory with a state pattern, turning two code blocks into 5 classes. Brilliant work not repeating yourself. It just required 3 times the code and 5 times the unit tests. Brilliant tradeoff.
Yeah, this is the crux of it. What exactly is duplicated code? Humans are pattern matching machines, we see rabbits in the clouds. Squint at any 4 lines of code and something might look duplicated.
On the other hand, code bases that do have true duplication (100s of lines that are duplicated, large blocks of code that are exactly duplicated 16 different times), multiple places & ways to interact with database at differing layers - that's all not fun either.
It is a balance & trade-off, it goes bad at either extreme. Further, there is a level of experience and knowledge that needs to be had to know what exactly is a "duplicate block of code" (say something that pulls the same data out of database and does the same transform on it, is 20 lines long and is in 2, 3 or more places) vs things that just look similar (rabbits in the clouds, they are all rabbits).
> Deduplicating code ends up coupling together code paths that are entirely unrelated, which ends up increasing the complexity
Code paths that may be unrelated. If they are related, then deduplicating is most definitely a good idea. If they're trying to do the same thing, it makes sense that they call the same function to do it. If they do completely different things that currently happen to involve some of the same lines of code, but they could become different in the future, then deduplication makes no sense.