Some of those formatting conventions are written in blood. The clarity of a blueprint is a big deal when people are using it to convey safety critical information.
I don’t think code formatting rises anywhere close to that level, but it’s also trying to reduce cognitive load which is a big deal in software development. Nobody wants to look at multiple lines concatenated together, how far beyond that you take things runs into diminishing returns. However at a minimum formatting changes shouldn’t regularly complicate doing a diff.
I 100% agree. The problem is that after a half a century, software engineering discipline has been unable to agree on global conventions and standards. I recently had an experience where a repair crew was worried about the odd looking placement of a concrete beam in my house. I brought over the blueprints, and the technician found the schedule of beams and columns within seconds, pinpointed the beam and said, "Ah, that's why. We just need to <solution I won't go into>". Just then it struck me how we can't do this in software engineering, even when the project is basically a bog-standard business app: CRUD API backed by an RDBMS.
That’s because the few hard rules you have to comply with have workarounds and matters rarely. In house construction, you have to care about weight, material degradation, the code, etc… there’s no such limitation on software so you can get something to work even if it’s born out of a LSD trip.
But we do have some common concepts. But they’re theoretical, so only the people that read the books knows the jargon.
The rules are what make it flexible. The rules let me understand what the heck is going on in the code you wrote so I can change it. Code that is faster to rewrite from scratch isn’t flexible.
Construction and civil engineering have been unable to agree on global conventions and standards, and they have a multi-millenia head start over software engineering. The US may claim to follow the "International Building Code", but it's just called that because a couple of small countries on the Americas have adopted it. For all intents and purposes it's a national standard. Globally we can't even agree on a system of units and measurements, never mind anything more consequential than that.
I’d say that globally we have agreed on a system of units and measurements. It’s just the US and a handful of third world countries that don’t follow that system.
> I brought over the blueprints, and the technician found the schedule of beams and columns within seconds
Is that really an example of the standardization you want? It shows that the blueprint was done in a way that the technician expected it to be, but I am not sure that these blueprints are standardized in that way globally. Each country has its standards and language.
If an architect from a different country did that blueprint, I would bet that it would be significantly different from the blueprint you have.
Software Engineering doesn't have a problem with country borders, but different languages would require different standards and conventions. Unless you can convince everyone to use the same language (which would be a bad idea; CRUD apps and rocket systems have different trade-offs), I doubt there could be an industry-wide standard.
But I can't look at the design from my desk-mate and hope to understand it quickly. We wall love to invent as much as possible ourselves, and we lack a common design language for the spaces we are problem solving in. Personally I don't entirely think it's a problem of discipline of software engineering, but a reflection of the fact that the space of possible solutions is so high for software, and the [opportunity] cost of missing a great solution because it is too different from previous solutions is so high (difference between 120 seconds application start and 120 milliseconds application start, for instance).
> The problem is that after a half a century, software engineering discipline has been unable to agree on global conventions and standards.
It can't, and it won't, as long as we insist on always working directly on the "single source of truth", and representing it as plaintext code. It's just not sufficient to comprehensibly represent all concerns its consumers have at the same time. We're stuck in endless fights about what is cleaner or more readable or more maintainable way of abstracting / passing errors / sizing functions / naming variables... and so on, because the industry still misses the actual answer: it depends on what you're doing at the moment. There is no one best representation, one best function size. In fact, one form may be ideal for you now, and the opposite of it may be ideal for you five minutes later, as you switch from e.g. understanding the module to debugging a specific issue in it.
We've saturated the expressive capability of our programming paradigm. We're sliding back and forth along the Pareto frontier, like a drunkard leaning against a wall, trying to find their way back to a pub without falling over. No, inventing even more mathematically complex category of magic monads won't help, that's just repackaging complexity to reach a different point on the Pareto frontier.
Hint: in construction, there is never one blueprint everyone works with. Try to fit all information about geometry, structural properties, material properties, interior arrangement, plumbing, electricity, insulation, HVAC, geological conditions, hydrological conditions, and tax conditions onto a single flat image, and... you'll get a good approximation of what source code looks like in programming as it is today.
I think we have to think of software like books and writing. It is about conveying information, and while there are grammatical rules to language and conventions around good and bad writing, we're generally happy to leave it there because too many rules is so constructive as to remove the ability to express information in the way we feel we need to. We just have to accept that some are better writers than others, or we like the style of some authors better.
That would make sense if code was written solely for the enjoyment of its readers, but it isn't.
Code uses text, sometimes even natural-language prose, but isn't like "books and writing". It has to communicate knowledge, not feels. It also ultimately have to be precise enough to give unambiguous instructions to a machine.
In this sense, code is like mathematical proofs and like architectural blueprints, which is a different category to drawings, paintings and literature. One is about shared, objective, precise understanding of a problem. The other is about sharing subjective, emotional experiences; having the audience understand the work, much less everyone understand it the same way, is not required, usually not possible, and often undesirable.
I don't know that we'll ever be able to agree on "global standards" though. Software is too specialized for that.
The only software standard that I'm reasonably familiar with is https://en.wikipedia.org/wiki/IEC_62304 which is specific to Medical Devices. In a 62304-compliant project, we might be able to do something like your example, but it could take a while. OTOH, I'm told that my Aviation counterparts following DO-178 would almost certainly be able to do something comparable.
It is going to be very industry dependent, where "industry" means aviation, or maritime, or railroad, or Healthcare, etc... Just "software" is too general to be meaningful these days.
Yes, of course. My point is that those standards apply only to specific domains where software is applied, not to the software development industry as a whole.
It's telling I think that the discussion becomes about standards, which software people like, rather than say, clear communication to technical stakeholders beyond the original programmer. I have a list somewhere I made of everyone that might eventually need to read an electronics schematic. Particularly to emphasize clarity to younger engineers and (ex-)hobbyists that think schematics are a kind of awkward source code or data entry for layout. It's not short: test, legal, quality control, manufacturing, firmware, sustaining/maintenance, sometimes even purchasing, etc.
Whoever drew your blueprints knew they would be needed beyond getting the house up. What would the equivalent perspective and effort for software engineering be?
That is largely due to a difference in complexity.I would say that the level of complexity of a blueprint of a house is on par with a 20-30 line python solution of a leet code easy excercise to a programmer. If the one reading the blue print is an engineer and the one reading the python code is a programmer. A crud app is more like the blue prints for a vacuum cleaner or something like that.
This was such a relieve for us.
Looking back its unbelievable how much combined time we wasted complaining about and fixing formatting issues in code reviews and reformatting in general.
With clang-format & co. on Save plus possibly a git hook this all went away.
It might not always be perfect (which is subjective anyway) but its so worth it.
Maybe a new paradigm for code formatting could be local-only. Your editor automatically formats files the way you like to see them, and then de-formats back to match the codebase when pushing, making your changes match the codebase style.
It's a decent idea, but it's weird reviewing code you wrote in saying GitHub, it looks totally different. Imo not a show stopper but a side effect you have to get used to.
This is disastrously easy to implement with just a few filters on git (clean & smudge).
I highly recommend it though, especially if you worked for a long time at one company and are used to a specific way of writing code. Or if you like tabs instead of spaces...
This is pretty common now. At least my Vim/git combo does this, where I always open source code with my preferred formatting but by the time it's pushed to the server it's changed to match the repo preferences.
Software, being arbitrary logic, just doesn't have the same physical properties and constraints that have allowed civil engineering and construction code to converge on standards. It's often not clear what the load bearing properties of software systems are since the applications and consequences of errors are so varied. How often does a doghouse you built in your backyard turn into a skyscraper serving the needs of millions over a few short years?
> However at a minimum formatting changes shouldn’t regularly complicate doing a diff.
If the code needs to be reformatted, this should be done in a separate commit. Fortunately, there are now structural/semantic diff tools available for some languages that can help if someone hasn't properly split their formatting and logic changes.
Reformatting comitts still leaves the issue that you deprive yourself of any possibility to reliably diff across such commits (what changes from there to there?) Or attribute a line of code to a specific change (why did we introduce this code?).
What we should have instead is syntax-aware diffs that can ignore meaningless changes like curly braces moving into another line or lines getting wrapped for reasons.
> What we should have instead is syntax-aware diffs that can ignore meaningless changes like curly braces moving into another line or lines getting wrapped for reasons.
These diffs already exist (at least for some languages) but aren't yet integrated into the standard tools. For example, if you want a command line tool, you can use https://github.com/Wilfred/difftastic or if you are interested in a VS Code extension / GitHub App instead, you can give https://semanticdiff.com a try.
This is the best argument I've ever heard for editorconfig, commiting a standardised format to the repos, and viewing it in whatever way you want to view it on your own machine
Some of those formatting conventions are written in blood. The clarity of a blueprint is a big deal when people are using it to convey safety critical information.
I don’t think code formatting rises anywhere close to that level, but it’s also trying to reduce cognitive load which is a big deal in software development. Nobody wants to look at multiple lines concatenated together, how far beyond that you take things runs into diminishing returns. However at a minimum formatting changes shouldn’t regularly complicate doing a diff.