Let’s Stop Bashing C (h2co3.org)
140 points by ingve on Dec 2, 2016 | 328 comments


I couldn't care less about integer division and minor details of syntax, but that's not going to stop me from being grumpy about C's continuing popularity.

Every C program I have ever worked on has been riddled with integer overflows, undefined behavior, buffer overflows and bugs. This was perfectly acceptable for a VAX or a Sun workstation in the 80s, and still forgivable in the mid-90s, when plenty of developers only had 8 MB of RAM and wasted half of that on Netscape Navigator.

But we live in an age of Pwn2Own and ClusterFuzz, where even the tiniest of bugs can be used as step 4 in a 6-step exploit, and where C compilers reserve the right to completely and silently break your program at the faintest whiff of undefined behavior.

I know my limitations: I'm not smart enough to write C code that deals with hostile data off the network. Neither are the authors of 99% of the C code that I've seen, with the possible exceptions of djb and a few OpenBSD contributors. It's long past time to invent and use better systems languages. Choose what you want, but choose something safer.


Alan Kay in his famous OOPSLA talk mentions how the architecture itself leads to these language choices.

https://www.youtube.com/watch?v=oKg1hTOQXoY

He mentions how the Xerox PARC machines with tagged architectures from the 80s were only 1/50th the speed of x86 systems in '97, in contrast to the roughly 50,000x increase in raw computing power over the same time frame.


Can anyone recommend any resources on older architectures? Alan also talks about the Burroughs CPUs, but I don't know how one would go about learning about those these days.


I'm at work, but these two should be a nice start:

https://www.smecc.org/The%20Architecture%20%20of%20the%20Bur...

http://homes.cs.washington.edu/~levy/capabook/

Bitsavers and Wikipedia are good resources. The Wikipedia summaries of most of these machines are accurate enough, with references that lead to more authoritative information. I found a lot of it that way.


> C compilers reserve the right to completely and silently break your program at the faintest whiff of undefined behavior

<rant> Please, stop pushing this wrong rhetoric forward! As soon as there is undefined behavior, it's a bug in the code. There is a problem, and the problem is on the developer's side. I appreciate the effort of compilers to help me get the most performant machine code I can, and I've had enough of people asking them to stop being smart in order to accommodate crappy code. </rant>


A major problem with undefined behaviour and compilers exploiting it is how unforgiving they are: there's no hint to the programmer (unless they explicitly opt in to a sanitiser, which they should, but usually don't) that they have written a bug, the compiler just happily "assumes" that the programmer is flawless and hence the code is exactly what was intended. Blaming the developer for mistakes seems far less productive than understanding that they are inevitable and investing in defences, whether in the form of better tooling for C or languages that fundamentally avoid or restrict the problems C has.
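As a concrete (if minimal) illustration of the silence, here is the classic signed-overflow check; an optimizing compiler is entitled to assume the overflow never happens and may quietly reduce the first function to "return 0" at -O2, with no diagnostic unless you opt into something like -fsanitize=undefined:

    #include <limits.h>
    #include <stdio.h>

    /* Intended as an overflow check, but signed overflow is undefined
       behaviour, so the compiler may assume x + 1 > x always holds and
       silently delete the comparison. */
    static int will_overflow(int x) {
        return x + 1 < x;              /* UB when x == INT_MAX */
    }

    /* A well-defined way to ask the same question. */
    static int will_overflow_ok(int x) {
        return x == INT_MAX;
    }

    int main(void) {
        printf("%d %d\n", will_overflow(INT_MAX), will_overflow_ok(INT_MAX));
        return 0;
    }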


> the compiler just happily "assumes" that the programmer is flawless

In my experience, this is the right thing to do for most applications. Every time a program (e.g. a compiler) tries to guess what its user (e.g. a developer) meant, instead of just sticking to the input, there will be problems. In particular, you may help some users in some cases, but you'll prevent other users from getting what they have every right to expect.

> ... seems far less productive ...

And maybe it is, but it's not a compiler's business. Use automatic tools, write your own tools, decide on coding conventions, impose code reviews, but please let the compiler do what it is supposed to do, the way it has learned to do it over decades of evolution.


> Every time a program (ex: compiler) tries to guess what its user (ex: developer) meant,

That's one approach to fixing errors, but there's another: tell the programmer that there's an error, and let them decide the best way to fix it. As you say, unsupervised "DWIM" is problematic, and so is a computer program that trusts its inputs absolutely.

> And maybe it is, but it's not a compiler's business.

What exactly isn't a compiler's business? Helping the programmer write a program that solves their problem? I would say that that is 100% of the compiler's job. It might be that part of the problem is "absolute performance", but that is almost always a secondary concern to correctness... computing the wrong answer (or pwning the user's data) is suboptimal, no matter how fast it happens. C defaults to exploiting everything for performance, rather than defaulting to correctness while also providing opt-in ("dangerous") tools for performance that one can use when necessary.


djb, who is famous for being an exceptionally talented and careful C programmer, summarized these issues better than I ever could: https://groups.google.com/forum/m/#!msg/boring-crypto/48qa1k...

Yes, every time a C program relies on undefined behavior, it's a bug. And as djb observed, virtually all non-trivial C programs contain these bugs. This is the core of my argument.

Yes, we could rely on C programmers never making mistakes. But it would be better to design languages with no undefined behavior.


His idea of flagging UB at compile time is pretty obvious - I have to wonder about the economics of why it hasn't been an emphasis at all, all these years.

I do get a facial tic when people say "all significant C programs use (or contain) UB." It's never been necessary and it's always been shameful.


As someone that doesn't HAVE to use C, this weirdness is one of the things that makes it interesting to me. I'm sure I'm the exception here, and if I were writing lots of C code for big projects I'd feel otherwise.


> As soon as there is undefined behavior, it's a bug in the code.

I think the emphasis on the original comment should be on the silent part. Nobody is arguing that these aren't bugs in the code; the problem is that you get no help from the compiler or runtime to find them and the damage is more severe than it would be if it simply crashed.


> the damage is more severe than it would be if it simply crashed.

I think I see your point, but you are wrong when you write "simply". Crashing in case of UB is not simple! Actually, ensuring any particular behavior in case of UB is not simple! It requires a huge number of runtime checks, so you end up with an executable riddled with them. If you want your program to be coddled in a safe baby box (which is the right thing to want in several cases), then go for an interpreted language, but please let "gcc -O3 -march=..." be what it is: a masterwork coated with magic.


In the vast majority of cases, the concept of Undefined Behavior shouldn't even exist. All behavior should be defined. We have massively powerful machines, and the performance advantage mostly doesn't outweigh the potential damage. And other languages, like Rust, eliminate a lot of UB without a lot of overhead.

There are exceptions to every rule, of course, but we need to re-evaluate our priorities as times change.


> It's long past time to invent and use better systems languages.

The sad part is that they have existed since 1961, long before C was born, but then some startups decided to base their workstation OSes on UNIX and succeeded in selling them.


Agreed. Syntax is not the problem; we bash C mostly for security reasons. And we can stop as soon as popular C compilers start taking measures to make C code as secure by default as code in many other programming languages.


Unfortunately those two are linked together. C compilers produce insecure code because the semantics of the C language limit the options the compiler has for generating secure code. You could make a C compiler that generated more secure code, but doing so would make it incompatible with the C standard and would most likely break a large swath of programs that compile and run on the other, non-secure compilers. Worse still, it would still generate insecure code, even if less of it, because things like address aliasing and being able to cast ints to pointers are fundamentally insecure.

For an idea of what it would take to make C actually secure (and even then only in so much as you reasonably can), take a look at all the things Rust has had to do. Nearly everything that exists in Rust is there because they started with a C feature and asked themselves "what do we need to change to make this safe/secure?". Most other languages, as a minimal first step towards improving security, remove direct access to pointers; they're simply too powerful to handle securely in the manner C does. Rust has (mostly) secure pointers, but only after putting an insane amount of work into bolting a massively complicated ownership and lifetime tracking system on top of them, and even then it's still not perfectly secure, since they had to include unsafe blocks to provide a trapdoor for certain edge cases (also for C interop).

So yes, you could make a secure C compiler, it just wouldn't compile C code anymore; it would compile some "safe" subset of C, which would probably end up looking like either Rust or C# (or maybe Objective-C).


> being able to cast ints to pointers are fundamentally insecure

Your compiler could track those ints used as pointers all the way through the whole program and prevent out-of-bounds values, either at compile time or, for compatibility, at runtime. There is nothing fundamental about C that makes it incompatible with memory safety.
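As a rough sketch of the bookkeeping such a compiler would carry around (the names here are invented for illustration; systems like SoftBound attach similar metadata to pointers), every pointer-sized value could become a "fat pointer" with an inserted runtime check:

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical metadata a bounds-checking C compiler could substitute
       for a raw pointer or a pointer-sized int. */
    typedef struct {
        char  *base;   /* start of the underlying object */
        size_t size;   /* size of that object in bytes */
        size_t off;    /* current offset within it */
    } checked_ptr;

    static char checked_load(checked_ptr p) {
        if (p.off >= p.size) {         /* the inserted runtime check */
            fprintf(stderr, "out-of-bounds access trapped\n");
            abort();
        }
        return p.base[p.off];
    }

    int main(void) {
        char data[4] = "abc";
        checked_ptr p = { data, sizeof data, 0 };
        printf("%c\n", checked_load(p));   /* in bounds: fine */
        p.off = 10;
        checked_load(p);                   /* traps instead of reading junk */
        return 0;
    }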


Well, yeah, but then you'd basically - eventually - end up with Rust... not C!


Which was kind of my point. In order for the compiler to have the info it needs to properly track everything and prevent unsafe things from happening you need to extend the language to provide enough context for the compiler to work with. Once you do that you've basically arrived at Rust (or something very much like it).


The compiler already has the info it needs; you don't need to change C to make it memory safe. I even posted a link to a working memory-safe C compiler that preserves complete compatibility. The only downside, at least initially, is that you have to sacrifice some of the performance. And with enough resources poured into the compiler, you can get most of that performance back eventually.


Or Cyclone, Vault, Clay, SAFECode, or SoftBound+CETS, which are either C variants or protect legacy C. Rust is a totally separate topic and style. I'd push Modula-2 with C-like syntax on C programmers before Rust.


I've dabbled in C, but I'm no expert. Can you explain how compilers can affect the security of C?



"... I'm not smart enough to write C code that deals with hostile data off the network."

I just don't believe you for one minute. It's quite easy to guard against unwanted memory access over serial ports and sockets. You have to know the size of the buffer, but that's just not hard.

If you go throwing pointers-as-if-they-were-tokens/handles around and blindly following them, you'll get in trouble.

So don't do that.

If you constrain operation only to valid data (which isn't all that difficult), then (and I ask out of ignorance), how can they hurt you?

What do I mean by "constrain operation..." yadda yadda? I mean if somebody "misspells" something, you dump the input. Your recognizer accepts only valid tokens. Usually, the tokens are in a table, with an FSM knitting them together.
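Something like this, reduced to its bare bones (the commands here are made up, and the FSM is collapsed to a couple of states just to show the shape of it):

    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    /* Only tokens in the table are accepted; a "misspelled" token
       rejects the whole input. */
    static const char *verbs[] = { "GET", "SET", "RESET", NULL };

    static int in_table(const char *tok, const char *const table[]) {
        for (size_t i = 0; table[i] != NULL; i++)
            if (strcmp(tok, table[i]) == 0)
                return 1;
        return 0;
    }

    enum state { EXPECT_VERB, EXPECT_ARG, DONE, REJECT };

    /* Walk the tokens through the FSM; return 1 only for well-formed input. */
    static int accept_command(char *line) {
        enum state st = EXPECT_VERB;
        for (char *tok = strtok(line, " \t\r\n"); tok != NULL;
             tok = strtok(NULL, " \t\r\n")) {
            switch (st) {
            case EXPECT_VERB:
                st = in_table(tok, verbs) ? EXPECT_ARG : REJECT;
                break;
            case EXPECT_ARG:
                st = DONE;             /* exactly one argument token */
                break;
            default:
                st = REJECT;           /* trailing junk: dump the input */
            }
            if (st == REJECT)
                return 0;
        }
        return st == DONE;
    }

    int main(void) {
        char ok[]  = "SET mode";
        char bad[] = "SEX mode";       /* "misspelled": whole input dumped */
        printf("%d %d\n", accept_command(ok), accept_command(bad));
        return 0;
    }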

So the table can be wrong, or the FSM can be wrong. That's it. Recently, all commands were in a simple subset of XML (by the customer's choice). That was particularly easy.

I grew up working in environmentally hostile settings where you'd lose things like PHY chips and serial port drivers, so this was how it was done.

If you mean exploits in the O/S itself, then yeah - I agree. But then my choice of language is irrelevant.

I won't disagree about the C compiler thing, but that shouldn't be at issue here.


> I just don't believe you for one minute.

The author demonstrated this by pointing to a list of CVEs caused by their code, so.


Not technically my code, and that's the part I don't know how to fix. I used every security technique I knew of back then, including extensive unit tests, manual auditing, malloc debuggers (including Electric Fence), and pretty much every open source tool that was well-known in 2000. I'm sure a modern fuzzer could find some errors, particularly integer overflows, because this was 32-bit code and I didn't understand some modern exploitation tricks.

But as hard as I tried, I still failed, because I relied on 3rd-party code (by an extremely talented programmer), and he made mistakes. Perfection is not a scalable strategy.


I understand that completely. "Doing it wrong" is simply a choice that has been made.


To err is human. Beyond the scope of simple, small, toy programs - "doing it wrong" is not a choice, it is a statistic.

You do have choices to make in terms of techniques and tools you can apply (assuming you're aware of them) to modify the statistics, but you will not be perfect in your application of them, and the tools were not written by perfect people either.

There's another choice here - how high you set the security bar for "deals with hostile data off the network". Ekidd chose to set this bar high enough to rule out C. How high do you set it?


The problem is that we haven't been able to come up with a good alternative. Go is limited by its curse of minimalism [1], while Rust is verbose and rigid [2]. One of the reasons C became popular was that it was designed to solve real problems, not as an exercise in minimalism or academic beauty. Its authors frequently sacrificed beauty and minimalism when it made sense and real-world use demanded it. Unlike Rust or Go or Nimrod, C is not just a set of strongly opinionated statements on language design. One side effect is also that new languages often come with sub-par libraries, and their designers expect the community to figure everything else out.

[1] https://www.quora.com/What-reasons-are-there-to-not-use-Go-p...

[2] https://plus.google.com/+nialldouglas/posts/AXFJRSM8u2t


Free Pascal, Component Pascal, and especially Modula-3. I'd get rid of the capitalized identifiers and make the syntax more C-like. Modula-3 in particular was a nearly ideal set of tradeoffs for a safer alternative between C and C++, one that also compiles fast, like Go, on weak hardware.

We've had plenty of replacements waiting. C developers are just hyper-committed to that language, even though its inventors moved on to a Wirth-like language.


Yet people continue to introduce security vulnerabilities in high-level languages all the time. Maybe it's not a problem of the language, but of the people?


Different kinds of security vulnerabilities have different levels of severity. The ones in HLLs tend not to be nearly as bad as the ones that come out of C codebases.


Yes, to your first point. To the second, I'd argue the impact is not that much different in most cases. Barring horribly misconfigured systems, networked C/C++ applications execute at the local user level, so even if you get an exploit going, you're SOL unless you have a local privilege-escalation vulnerability in the underlying OS. Most web apps are connected to databases, and exploiting a vulnerability there usually allows you to either dump user data or corrupt memory and execute shellcode (leaving you in the same situation as with the C code). Usual disclaimers apply - these are oversimplifications for casual conversation purposes, yada yada.


That's an easily correctable problem: learn the living daylights out of programming in assembler, and then C will be like a walk in the park: no more overflows, no more pointer screwups, no more forgetting to free the memory, because you'll know exactly what will happen in the chip and the memory when that code compiles.

You'll even start writing code which relies on integer overflow (rollover from #$ff to #$00 again) because, depending on the situation, it will give you a performance boost, or make the algorithm implementation simpler, or both, and you'll know it.
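In C terms the rollover trick looks something like this (a tiny sketch; note that only unsigned arithmetic carries this guarantee in C, signed overflow is undefined):

    #include <stdint.h>

    /* Deliberate, well-defined rollover from 0xFF back to 0x00: an 8-bit
       index into a 256-entry ring buffer never needs an explicit bounds
       check or modulo. */
    static uint8_t ring_buf[256];
    static uint8_t head;

    void ring_push(uint8_t byte) {
        ring_buf[head++] = byte;       /* 255 + 1 wraps to 0, by definition */
    }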

It's long past time to invent and use better systems languages.

I can't comment on Go, but the textbook examples of Rust and even treatises on the subject from recognized experts in the computer science field (Adam Leventhal) show it to be an utterly overcomplicated language for which there should be a criminal trial and a prison sentence of several decades. Immutables, borrowing, no, I have to stop; my head hurts already, and I'm getting a migraine thinking about solving even the simplest of classic input-output problems with such an overly complicated language. (And my head wants to explode when I compare Rust with AWK for solving input/output problems, the biggest reason why computers were invented.)

So... C, C, and more C, because it's infinitely flexible and simple. I like simple, because simple is smart. Complicated and complex isn't.

Let's all please be humane to one another and stop inventing more languages. We need fewer languages, not a bazillion variations on the one central task: making the computer do something. And for that, we already have all the languages we could ever want or need, and for the theoretical case where we don't, there is always the almighty assembler, in which no limitations exist as to what one can program.


That's an easily correctable problem: learn the living daylights out of programming in assembler, and then C will be like a walk in the park: no more overflows, no more pointer screwups, no more forgetting to free the memory, because you'll know exactly what will happen in the chip and the memory when that code compiles.

Knowing how processors work doesn't prevent people from making mistakes.

I knew assembly long before I knew C, because I grew up on computers that were too small to run a C compiler. I've contributed to a half-dozen compilers. I've written disassemblers for weird embedded DSP chips. And by the standards of C programmers, I'm pretty paranoid.

None of this makes me superhuman. I make mistakes. I've written network-facing C code, and I've left people exposed to security holes. Want to see the carnage?

https://people.canonical.com/~ubuntu-security/cve/pkg/xmlrpc...

That's 6 CVEs against an obscure C networking library that I wrote. Technically, none of them were found in my code (not yet). All were in the third-party XML parser that I chose to use. But that doesn't absolve me of fault, and I bet there's a few more bugs lurking in my code that could be found with modern fuzzers.

I don't care whether people use Go or Swift or D or Rust or some brand new language. But wherever C goes, bugs and security holes almost inevitably follow. It's going to take decades to reduce the proportion of C in our systems, and we ought to start now.


That's 6 CVEs against an obscure C networking library that I wrote. Technically, none of them were found in my code (not yet). All were in the third-party XML parser that I chose to use. But that doesn't absolve me of fault, and I bet there's a few more bugs lurking in my code that could be found with modern fuzzers.

While it is commendable that you take responsibility, you misdiagnosed your error: it isn't the C programming language, but what you chose to do with it: send and receive extensible markup language for remote procedure calls. Seems like a great idea, until you actually have to deal with it. I just happen to be working on an application where I'm forced to generate XML-SOAP and send it over the network to a server, and then process the XML-SOAP response I get back with XPath, using xsltproc. And all that because the designers of the server application use Java and didn't even know they were using XML-SOAP and XPath, since by the time they work with them, Java libraries treat them as objects. (And neither did I, until I reverse-engineered the responses I'm getting.) I have caused the server to crash innumerable times.

You're arguing for better systems languages, and I'm telling you that is a terrible idea, because we have nobody so smart as to design something so complex to be simple, and the closest we have ever come to it is C. We already have so many languages that it's become a nauseating torture to work with computers.


So you're saying C is a great language until you have to parse XML.

... exactly.


That is exactly what I am saying, and more:

- don't use XML for anything;

- use the correct tool for the job, but first define the problem better. What was XML-RPC attempting to solve?


What was XML-RPC attempting to solve?

That's an excellent question! To answer it, we need to rewind to the year 1998. Nobody had ever heard of JSON (2002) or REST (2000) or SOAP. If you wanted to do remote procedure calls, you needed to use Sun RPC or Corba or another binary RPC stack. Some of these required tens of thousands of lines of C code to implement, and I don't even want to think about the security issues.

Basically, XML-RPC was trying to be REST+JSON before that was a thing. It had already been implemented for most scripting languages, but one of my consulting clients needed a C implementation to interoperate, so I wrote xmlrpc-c in a couple of busy weeks around the end of 2000, I think.

Back in those days, XML had a very clear 30 page spec and XML-RPC had a 2 page spec. If you want to argue that I was wrong to use C to implement 32 pages of pretty good specs, or that I was wrong to use the best off-the-shelf XML parser of the day, well, I agree. But I think we need to demand more from our system languages.


> To answer it, we need to rewind to the year 1998. Nobody had ever heard of JSON (2002) or REST (2000) or SOAP.

But people had heard of S-expressions, which are far more appropriate for data transfer than XML.

And the canonical S-expression spec, from 1997, is rather less than 30 pages: http://people.csail.mit.edu/rivest/Sexp.txt


I think one could actually argue that all RPC protocols have ended in unmanageable disaster, and that the only things that endure are IETF-style protocol-first human-readable systems. However, that would take a lot of argument which I'm not committed to.


I looked at XML-RPC when it came out, and knew instantly that it would be a disaster. Even when you took out the XML-ness, the underlying concepts were complicated, badly explained and open to a ton of wiggle room.

Then you add the inefficiencies of XML encoding, and the utterly crappy nature of most XML translation layers . . . yeah, really easy to see the train wreck coming.


Sure, XML-RPC is not the best protocol. But compared to others it's pretty simple, and the mess and the number of edge cases are smaller than in SSL or TCP/IP.

If C isn't up to implementing an only slightly complicated network protocol, it's not much of a systems programming language, is it?

It turns out, C is fine for writing such systems and getting them working fairly well, but very poor at making them really reliable and secure, and that's no longer a good tradeoff.


You're making a great point against C, yet somehow you are not realizing it.


You're right, but I think he's also making a wider argument that C isn't good at fixing bad application or data design choices for you.

He considers that a positive feature, not a bad thing. You're the inverse.

I can see both sides of that tbh. It's a justifiable - but myopically idealistic - point.

But man he's being a pretentious ass about it.


I can't see how this can be a good thing.

If C is that bad at managing complexity (and complexity doesn't only arise from bad design choices, some problems are inherently complex) then it should be used as rarely as possible, period.

It's not about idealism, it's about ill-placed, almost comical elitism ("C is a low-level-hacker language and only real programmers use it" while a 3rd grader could understand pointer arithmetic) to the point of rather spending 5 days writing bug-ridden code in C instead of 5 minutes in whatever else, just because it's the right way.

It's ridiculous.


> If C is that bad at managing complexity

This, in my opinion, is why even if C were somehow free of buffer/integer overflow, UAF, and other vulnerabilities, it would still be a poor choice for many real-world (read: complex) applications.


You can always glue stuff to the compiler and reinvent c++.


You're right, but I think he's also making a wider argument that C isn't good at fixing bad application or data design choices for you.

Bingo, give the man a prize! As we say on UNIX: C gives one enough rope to hang oneself. It is expected one knows what one is doing, but in return, one gets complete control and infinite flexibility. No programming language can compensate for bad architectural decisions, or an absence of such decisions, or absence of insight. Such a programming language does not yet exist, and it is questionable whether it ever will. So yeah, you got exactly what I was trying to say.

Now, back to C: nobody, including me, has all the solutions, but from experience, I have some tips when writing C code:

run the damn thing you're building through a debugger; feed it all kinds of garbage while doing so; try to break it in every way, shape, or form you can think of. As the author of the program, one should have unique and powerful insights into where the weak points are. And after all of that is done and fixed, have someone completely unrelated to it try to break it. This last piece of advice is good for any language.

I have a friend. I'd beat the daylights out of my programs, break them every which way, fix all the bugs I could find, then I'd let him have at it. He'd usually break my program in 15 - 25 seconds. Then the cycle would repeat, until he could no longer break anything. At that point, we had something remotely resembling stable software. And a lot of those programs I had him break weren't even written in C!

One of the best techniques I found while developing in both assembler and C: run the program you're working on through the debugger as you're writing it. As soon as I write a new function, I fire up a debugger (or shell tracing, or...) One wouldn't believe how many bugs I've found and fixed before the program saw the light of day. And it saves so much time, even though that's completely counter-intuitive.


The circle in his argument is something like "C is great as long as you don't use it for things it's not great for."


That is correct.


"don't use XML for anything"

This just made me laugh. I'm not making a value judgement about your statement, I neither agree nor disagree.


I'm gonna go out on a limb here and say you've shipped numerous RCEs and don't even know it.


What's the ideal number of languages, in your opinion, and in what year do you imagine we passed that number?

There were "too many languages" before C was even conceived of. If that were a reason to not use new languages, you'd never even have heard of C.


Five. We passed that way back in the '60s of the past century.

Any more than five becomes counterproductive and destructive, because it's physically impossible to keep up with all of the new languages coming out, and no language solves all the problems simply and efficiently. Or elegantly, for that matter.

In my experience, it takes about ten years of heavy use to master a programming language and use it not only to its fullest extent, but correctly and efficiently as well. I would be interested to meet a person who has that kind of time, biologically.

If that were a reason to not use new languages, you'd never even have heard of C.

Back in my day, the number of computer platforms was pragmatically limited: either you wrote software for the Commodore Amiga, or you wrote software for the Atari ST. If you were a software company, you wrote for both. If you were a programmer with lots of experience, you split the core from the hardware dependent subroutines, but I don't think anyone thought about this very UNIX-specific approach to programming back then (most of us had never heard of UNIX). We all wrote software in assembler, and were just lucky enough that both major platforms had the exact same processor family, the Motorola 68000. The C programming language was a solution to a problem nobody had or asked for. I remember making fun of my mathematics teacher and berating him for writing programs in C, because compared to assembler, it was lame: bloated and slow. It was impossible to code something of any significant speed and small enough size in C on either Atari ST or the Amiga. We viewed C as something lamers who couldn't figure out assembler dabbled with. One just couldn't compete in C with code written in assembler, which effectively meant that if you didn't code in assembler, and your code wasn't small or fast enough, or both, you weren't competitive, and the lamer label was soon to follow.

Times have changed, and C is now the portable assembler, but they have changed only somewhat. When push comes to shove, understanding what happens at the hardware level is still crucial when performance problems strike... and they strike a lot. Understanding how the hardware functions at the lowest level matters most when writing in really high-level languages: the higher the language's abstractions, the more crucial an understanding of the low-level hardware becomes.

Nowadays, when I have multiple incompatible processors to deal with, completely different hardware architectures, and plenty of raw processing power, C is a very viable, even desirable choice: if I write my code cleanly enough, it will just compile and run everywhere without modification, and I'm done. But back then...

...C was the answer to a question nobody asked or cared for in the personal / home computer space in those days. If one couldn't figure out assembler, one dabbled in GFA-BASIC or C, or Pascal. It was a very elitist, eat or be eaten environment. If one's code was too large or too slow, one became the public laughing stock of the scene. It was a very effective, and highly motivating selection measure. I miss those days.


> I can't comment on Go, but the textbook examples of Rust and even treatises on the subject from recognized experts in the computer science field (Adam Leventhal) show it to be an utterly overcomplicated language for which there should be a criminal trial and a prison sentence of several decades. Immutables, borrowing, no, I have to stop; my head hurts already, and I'm getting a migraine thinking about solving even the simplest of classic input-output problems with such an overly complicated language. (And my head wants to explode when I compare Rust with AWK for solving input/output problems, the biggest reason why computers were invented.)

The level of contempt for other people and utter lack of perspective shown by this comment makes my head want to explode.


At first I thought that was sarcasm... but it's unfortunately not.

> no more overflows, no more pointer screwups, no more forgetting to free the memory, because you'll know exactly what will happen in the chip and the memory when that code compiles

To whoever wrote this: are you ready to share the code of a sufficiently complex program in C (or assembler, why not) you wrote and pay $100 for every "pointer screwup", buffer overflow, memory leak and generally every bug people find, which could have been prevented by a strong type system or borrow checking and other such complicated features?


$1 would be enough, you could still get rich.


> I can't comment on Go, but the textbook examples of Rust and even treatises on the subject from recognized experts in the computer science field (Adam Leventhal) show it to be an utterly overcomplicated language for which there should be a criminal trial and a prison sentence of several decades.

You sure are angry with me for working on a programming language.


Yeah I don't get this sort of attitude at all. It seems like a very simple sequence of steps:

1. Creating something non-trivial in language X.

2. Identifying things that language X made difficult.

3. Brainstorming ways to implement solutions to those issues within a set of acceptable trade-offs.

4. Concluding that some of the brainstormed ideas might actually work.

5. Working really hard to implement and iterate on them.

I can't figure out which step the parent commenter scorns. I guess probably step 2, where they see no pain points to begin with. That's fine, but trying to convince people they haven't felt pain that they have felt is a silly battle. I guess they're actually implying that people who have felt that pain are just not smart enough.


No, I'm implying that people are either too egotistical or that they overestimate their insights, or both, when they create a new language. These days it seems like there is a new language every 15 minutes. It's getting to be ridiculous.

And the worst part of it is, no such language covers all problems effectively; they all have flaws in one way or another... so the next guy comes along thinking he can do it better, and just repeats the mistake of his or her predecessor(s).

Instead of realizing that this would just exacerbate the problem, they go ahead and make it worse.

I guess they're actually implying that people who have felt that pain are just not smart enough.

It is extremely difficult to solve a complex problem in a simple manner. One literally has to be a genius with a lifetime of experience and insights from that experience in order to be able to pull that off, and even then, it doesn't always work and requires multiple iterations, and lots and lots of experimentation. Case in point: Go, still a work in progress by exactly such people. That should be instructive, but it seems that it isn't.


> And the worst part of it is, no such language covers all problems effectively; they all have flaws in one way or another...

Which is exactly why you don't try to create a language that covers all problems effectively, you define the problems you'd like to speak to. If you have defined those problems in such a way that they are shared by many people, and if you solve those problems in a way that many people find effective, then you may have created a compelling solution. How does that exacerbate "the problem"?

Honestly, what even is "the problem"? Why does it bother you that there are new languages created? The "literal geniuses" from the Bell Labs era experimented a ton, and that was a great thing then, and remains a great thing now. It's always valuable to point people to prior art, but it makes no sense to bemoan research and experimentation.


Honestly, what even is "the problem"? Why does it bother you that there are new languages created?

The problem is that new languages have such fatal flaws that they make the problems they are trying to solve even worse.

The second, and much more serious, problem is that it takes at least ten years to truly master a programming language. Humans only live about 70 years on average, which means that it is impossible to keep up with every new language that comes out. I mean, I live for this stuff, but I would literally have to be chained to the computer my entire life, without time for anything else. That is nonsense.

Anybody who thinks they have mastered a computer language in less than ten years is either a genius, or unbelievably egotistical and arrogant; and knowing a whole bunch of computer languages in a perfunctory manner, just enough to make one dangerous, leads to a really shitty job of sorting out that mess afterwards... that's the problem. My problem, and likely other people's problem.


But you don't have to learn all of them... If you think a new language makes the problems it's trying to solve even worse, you definitely shouldn't bother with it. But many others may disagree with you, and decide to invest that 10 years in really learning it. There is no problem with this.

What I don't understand about your position is its absolutism. You keep talking about people being egotistical and arrogant, but it seems like you're the only one on this thread being those things, by claiming your subjective opinion as objective fact, and then making ridiculous hyperbolic statements about throwing people in jail based on it. It's not a contradiction for you to personally dislike a language's philosophy and for that language to have been a worthwhile creation and valuable to many people.

Your feelings about different languages are just, like, your opinion man.


But many others may disagree with you, and decide to invest that 10 years in really learning it. There is no problem with this.

Just wait until you invest ten years of your life to master a language, only for everyone else to move on to the next new trend, with you stuck holding the bag.

AngularJS, Rust, and node.js, is it? I can't wait to ask you how you're feeling about all that time you invested into all of those, when all the kids move on to something else and they're no longer popular, for no reason other than the newcomers' ignorance, or because something shinier came along, but didn't actually add any value, just re-implemented the old thing in a worse way.

We can then have a discussion about how you're feeling, having wasted a decade or more of your life on what will by then be perceived as irrelevant. That's of course under the (optimistic) assumption that HN is still going to be popular in ten years...


I sure am. Not for working on it, but for thinking you can do it better, and for exacerbating the problem even more.

The fathers of C created AWK as C on steroids, and even though they were a success in this endeavor, they were wise enough to create it as a domain-specific, and not a general-purpose, language. If you study their works afterwards, up to Go, they always developed domain-specific minilanguages, and always split the difficult problems into many small programs, hence Plan 9. This is very instructive in the context of their previous achievements. They even recommend solving problems by creating a domain-specific minilanguage, but only when this is appropriate to the problem at hand.

Now fast forward to Go. These guys really know what they're doing, and still Go is a research project; unlike Rust, they're not trying to actively sell it to the populace at large, because they know just how difficult it is to create a good systems programming language. So yeah, you bet your patootie I'm angry at anyone who thinks they can do it better. It's not even that they screw it up, that's part of learning and gaining insight, but that they actually actively attempt to sell it as a good solution. Meanwhile, it's even more complicated than the previous solution! And then let's suppose someone invests about a decade to master that new language, such as it is: by that time a pile of other people come along thinking they know better and invent their own languages; sometimes they invent a new language just because they think it should have a different syntax, as was the case with Ruby, in the author's own words! Unbelievably arrogant! And now that becomes the new hotness. Congratulations, you just wasted a decade of your life on something that's not in fashion any more. That would be no problem if we lived for 500 to 1,000 years, or if we were immortal, but we only have this one life, and the length of our life is a blip in the universe. So it turned out to be a huge waste of time. That's the part that really angers me.


Respectfully, "those guys know how to do one thing well" is (IMHO) much more accurate phrasing than "those guys know what they're doing". Note that these are compatible statements.


> because you'll know exactly what will happen in the chip and the memory when that code compiles.

... for the specific chip you've targeted. And possibly even the specific stepping.

And that's assuming that you're diligent enough to manually review all that assembler. Very, very few people are. This is why typesafe code was developed. This is why all those checks exist in Rust. Limitations are good. Limitations are your safety equipment. Assembler is the guy who scrambles up on the roof without tether or hardhat to "get the job done".

> language for which there should be a criminal trial and a prison sentence of several decades

Commentators who overuse hyperbole should be executed.


You say that knowing assembler makes it easy to understand how incorrect C code behaves. This was sort of true a few decades ago, when I learned C after learning assembler. It stopped being true around 15 years ago when the popular compilers started interpreting the C standard like your ex-wife's divorce lawyer (as Kragen put it).

I used to write a lot of C, and I no longer use it for anything serious either, because I'm too stupid and the stakes are higher. Maybe better tool support could change that, but I'd prefer a better language.


>because you'll know exactly what will happen in the chip and the memory when that code compiles.

No you won't. Assembly is an abstraction now, at least on x86.

The chip does all sorts of things in microcode behind the scenes.


C doesn't offer anything better than Algol-based systems programming languages, other than easier ways to corrupt data and the ability to write write-only code.


Knowing assembly won't teach you the difference between legal overflows and ones triggering undefined behavior, or why you might need -fno-strict-aliasing to cast your pointers, and many more things. Does anybody know any correct nontrivial C programs beyond the ones by DJB? Or anything close? And it's not like, once bug-free, those are evolving much...
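The strict-aliasing point in particular doesn't exist at the assembly level at all. A minimal sketch of the trap: the first function below is undefined behaviour, and is exactly the kind of code optimizers have been known to miscompile once it's inlined into something larger, while the memcpy version is well defined and usually compiles to the same single move:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Undefined: a float* and a uint32_t* are assumed never to alias. */
    static uint32_t bits_bad(float f) {
        return *(uint32_t *)&f;
    }

    /* Well defined: copy the object representation instead. */
    static uint32_t bits_ok(float f) {
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        return u;
    }

    int main(void) {
        printf("%08" PRIx32 " %08" PRIx32 "\n", bits_bad(1.0f), bits_ok(1.0f));
        return 0;
    }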


My complaint against C is that some of the smartest people on the planet use it to write infrastructure that regularly fails catastrophically. If the experts who love it aren't capable of writing correct code, I have no chance whatsoever. Humans just aren't good at the fiddly bits (remembering to free(); not to free() twice; not to use after free(); accidentally adding a second clause to a non-bracketed if expression; how many times do we see these in vulnerability reports?).

C is obviously very powerful. There's no denying that. But it does approximately zero of the things to help programmers that a modern language does. It's not even strongly typed in the usual sense (is there the equivalent of casting void pointers anywhere else?).

Finally, C is slow. Yeah, you read that right. Its main problem is that it has no non-wizardry high level semantics for parallelism. For example, modern languages have map() to say "do this operation on all of these items". In C, you have to write that as "for each item in this collection, do this thing, then the next, then the next, ...". If the compiler is exceptionally smart, it might infer what's going on and help out. But with map(), you're explicitly telling the language "I want all of these things to be done" and it doesn't have to infer anything.
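For the record, the explicit version being described is just this (a trivial sketch):

    #include <stddef.h>

    /* The loop spells out one iteration order; the compiler has to prove
       the iterations are independent before it can vectorize or
       parallelize, whereas map() states that intent directly. */
    void scale_all(double *items, size_t n, double factor) {
        for (size_t i = 0; i < n; i++)
            items[i] *= factor;
    }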

I respect C. It's given us a lot of great stuff. But there's no way I'd willingly start a new project in C today given the alternatives. I am dumb, and given the choice between languages which help me and languages which are basically a small step up from assembler, I'm going with one that gives me a reasonable shot at writing safe, fast, and understandable code.


My concern is that this seems like its own fallacy now. You are assuming the same experts could have written these systems in another language.

Could they? Maybe. But there is no solid evidence that they could.

Is the argument appealing? Yes. But it lacks evidence of systems that turned out better in other languages.

My 2 cents on why: C is basically the only game in town that can easily use new and specialized assembly, of which there is a lot. Consider: what is the fastest way to calculate a square root? And how old is that instruction? (I'm assuming it is actually used by higher-level languages by now.)


As someone who is already fairly hostile to C and actually getting even more hostile over time (the recent shenanigans where the compilers are cracking down on "undefined" behavior with "optimizations", in a language where it's virtually impossible to avoid "undefined" behavior at scale, are insane), part of my hostility is precisely that there have historically been no other engineering-valid alternatives. (There have been alternatives, just not ones you would have been wise to bet your career on.) It is an example of one of the worst local optima we can find in computer programming, where something comes out that is pretty good for its time, and then entrenches itself so thoroughly that nothing can ever extract it again, even though it isn't actually a great foundation to build on for decades at a time.

Other examples include SQL, Javascript (though WebAssembly may free us from that before we're stuck with it in 2040), and Unix.

When I say this sort of thing, people tend to flip out and start defending those techs against the claim that I don't think they're good at all. That's not the problem. They are good. The problem is that they're not good enough to build on for the future, but they are good enough to entrench themselves.

The consequence is that it takes a lot of riling up the community with knowledge of the shortcomings to fertilize the ground for alternatives to arise and be supported enough to make progress.

So, please stop defending C. It is what it is. It had a good run. Nothing will ever make it stop being the most important language of the last four decades. But if you don't want it to be the most important language of the next four decades... and you really shouldn't... please stop defending it. Give the alternatives space to breathe.

$OBLIGATORY_RUST_CALLOUT.


I'm not defending C. If anything, I'm attacking all of the so-called "better" replacements. Specifically, your call-out that they should have existed by now. I treat that argument as a fallacy nowadays. I'll call it the "wishful thinking fallacy."

I do this because if there is anything that calls into question the "science" part of computer science, it is the dogma of folks that refuse the empirical side of the field. It isn't like folks haven't tried to make better languages and toolchains throughout the years. You even acknowledge this. They just haven't delivered on their promises.

And, some of that is because you are ignoring all of the advances that have happened in C. If you are not using modern tools like valgrind/coverity/whatever, you are not comparing C fairly to contemporary languages.


FWIW, I worked at Coverity. I didn't write the analyzers, but I sat next to the team that did. It is an amazing piece of software and that team would regularly present new features to us that I would have sworn were provably impossible, like "you've just solved the halting problem" kind of things. I'm sure any of those engineers could glance at an IOCCC entry and say "oh, there's a missing semicolon in that part that looks like a cat's head". Seriously, it's black magic written by wizards and is the most impressive codebase I've ever seen.

But.

Holy crap, the idea that you have to use something like Coverity just to make C code not suck is a bitter pill. I mean, I'm glad it exists but it's terrifying that it has to and that it's still being developed. After decades of development, the Coverity team is still finding enough examples of new ways to write bad code that they can still improve it.

I don't write C because I'm on a first-name basis with the only people on the planet I trust to get it right, and I am not among them.


"If you are not using modern tools like valgrind/coverity/whatever, you are not comparing C fairly to contemporary languages."

I personally consider C + strong static analysis tools to be an entirely different language. I fully acknowledge this is my personal opinion.

Consequently, I consider it equivocation in this sort of context to talk about C when convenient, then talk about C + strong static analysis tools when convenient. You're only programming in one or the other. My utter contempt is for C the language by itself. This is relevant because, as near as I can tell, C has at least a 10:1 advantage in the field over C + strong analysis tools, and I'm probably off by at least an order of magnitude. It is not realistic to pretend that C programmers are using these tools at scale. (Now, you can point at a ton of high-profile C programs that do use these tools, because high-profile C programs are exactly where they get used. But as near as I can tell they are firmly the exceptions, not the rule. And you can point at a ton of high-profile projects that don't use them, too.)

If everybody started using all these tools, including the ones you actually have to buy because there's no open source equivalent, it is true that most of my issues with C would be resolved. It would also be true that programmers would be a lot less gung-ho about C, though, because as anyone who has deeply integrated these strong tools into their workflow can attest, C + strong analysis tools is a much more complicated language. Now, in my opinion, that complication is being revealed by the analysis tools, not created; all C code is actually that complicated, and programmers are merely kept in the dark by C's extensive structural deficiencies, which make it only safe to write with these rather sophisticated tools. But the end result is still that you're working in a much more complicated, demanding language with a much more baroque compilation process. Claims about C's productivity, compiler speed, probably execution speed, ease of programming, and probably quite a few other things all go flying out the window if this is the solution, and C looks eminently more replaceable.


We have not escaped the "wishful thinking" yet. Do I want there to be a replacement? Yes. I do my coding in other languages.

Have I grown weary of the attacks on C? Also yes.

Are not enough people using advanced toolchains? Yeah. I agree with that. But, this is effectively the same complaint that not enough people are using higher level languages. With the exception that fitting some of the extra tools on there is easier than switching languages.


Lint was created for C in 1979.

In Dennis Ritchie's own words:

"Although the first edition of K&R described most of the rules that brought C's type structure to its present form, many programs written in the older, more relaxed style persisted, and so did compilers that tolerated it. To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions."

Taken from https://www.bell-labs.com/usr/dmr/www/chist.html

We are in 2016, and static analysis is still ignored by the majority of C programmers who "just get work done".
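For anyone who hasn't used it, the sort of "legal but suspicious construction" lint was built to flag is this kind of thing (a minimal example; modern compilers warn about it too, e.g. gcc -Wall suggests extra parentheses):

    #include <stdio.h>

    int main(void) {
        int ready = 0;
        if (ready = 1)          /* legal C, almost certainly meant '==' */
            puts("always taken");
        return 0;
    }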


Lint has a history of being proprietary.

The Free Unixes of the true UNIX DNA lineage have a lint that is a rewrite by Jochen Pohl (done for NetBSD sometime in the early '90s?). Somehow BSD didn't inherit lint. If you use lint on BSD, you're not literally using the program that dates back to 1979.

Other Lints like FlexeLint are proprietary.

C spread like wildfire in the 80s, but an implementation of lint did not follow. Most non-Unix systems that could be programmed in C did not have a lint accompanying their C compiler.

Definitely, it seems there was a lack of promotion of lint from the birthplace of C.

If I were to give a definition of lint, it would be: "that mythical C checking program everyone knows but hasn't used".

It must be suffering from a curse which affects all technology named using a four-letter word starting with L. (Or at least LI.)


> C spread like wildfire in the 80's,

Maybe in the US. I only got my first contact with C in 1992, via Turbo C 2.0.

Before that I had already used multiple languages, and was using Turbo Pascal 6.0 by the time I learned C.

Then again, I hardly knew anyone with access to UNIX, my first contact being Xenix in 1994.

This type of experience was quite common in Portugal; who had money to buy expensive UNIX workstations...

Also, as for the fact of it being proprietary: well, I usually paid for my tools, before FOSS started to be a thing.

And at the same place where they had Turbo C 2.0, they had also just gotten Turbo C++ 1.0.

So there was a tool whose type system allowed me to bend C to be more like Turbo Pascal, provided I cared to use said type-system improvements.

Which led to me never using C++ as just an improved C compiler, but rather taking advantage of C++'s type system to ensure my arrays and strings were properly bounds-checked and IO was done safely.

Oh, and taking C++'s side in the whole C vs. C++ USENET debates since 1994. Or type safety in systems programming vs. C.

So these HN and Reddit discussions regarding C's unsuitability for writing safe software are hardly anything new to me.


Not just C programmers. I write mostly VB.Net, C#, Java, and I can never compile any of the programs that I work with (created by teams of smart people) with all warnings treated as errors, never mind any more sophisticated analysis.

And the justification is always the same, just as you said: "just get work done".


Apologies if this is rude, but I'm not sure what this is getting at. I know that static analysis is old. I also know that I have been guilty of ignoring it sometimes. Especially as I was learning.

I think it is still in the wishful thinking world to think "better tools" should have won by now.


The thing is that the working C programmer class doesn't value quality and secure code; performance FTW, everything else gets done later, if there is time, that is.

Even the language authors were honest and:

1 - created a tool in 1979 that most C developers ignore even in 2016, 37 years later;

2 - realized that C was past its due date and took part in the design of Limbo and Go;

Yet lots of C coders continue to improve this NIST list, every single day:

https://nvd.nist.gov/visualizations/cwe-over-time

For me, wishful thinking is the mentality that there are C developers out there who manage to write memory-corruption-free code without using such tools.

Never met any since my first contact with C in 1992.


Okay, so I have written ... multiple projects that ran as long as they had power with no evidence of memory corruption over a span of 30 years.

I've used Lint. Early on. It was great training. After a while, you don't need it so much.

I think you are vastly overestimating the skill level needed to write normatively correct C code, by quite a margin.


> with no evidence of memory corruption over a span of 30 years.

How many people did you have on those teams, and what was the development process like?

I too have written such software, when working in teams of no more than two people, where we took a Pascal-like approach to programming in C.

Basically:

- Use translation units as modules

- structs are always exposed as Abstract Data Types, with accessors either via macros (when speed matters) or functions

- assert everywhere any assumption that can lead to memory corruption

- in debug code, make use of assert together with pointer-validation helpers from the debugger, e.g. IsBadReadPtr()

- arrays are never accessed with pointer syntax

- use of compiler extensions for bounds checking during debug builds

- all warnings enabled and considered as errors

And quite a few others.
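A rough sketch of that style, with illustrative names only:

    #include <assert.h>
    #include <stdlib.h>

    /* The struct layout lives only in this translation unit; callers see
       an opaque type and go through accessors. */
    struct buffer {
        unsigned char *data;
        size_t         len;
    };

    struct buffer *buffer_new(size_t len) {
        struct buffer *b = malloc(sizeof *b);
        if (b == NULL)
            return NULL;
        b->data = calloc(len, 1);
        b->len  = (b->data != NULL) ? len : 0;
        return b;
    }

    unsigned char buffer_get(const struct buffer *b, size_t i) {
        assert(b != NULL);
        assert(i < b->len);   /* assert every assumption that can corrupt memory */
        return b->data[i];
    }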

I can count up to 5 the number of friends and co-workers who have managed to write good-quality C code since I learned C in 1992.

C++ unfortunately also suffers from the flaws of C, due to its copy-paste compatibility with C, but at least I can make use of the type system, via structs, classes and operator overloading, to enforce in code what in C cannot be more than plain conventions.

> I think you are Vastly overestimating the skill level needed to write normatively-correct C code by quite a margin.

Anyway, I don't need to prove anything when we have so many nice CVE databases full of examples.

Or when companies like Apple, which are supposed to have such ideal C programmers, do an OS release with 36 bugs, 31 of them (86%) related to C memory corruption.

https://support.apple.com/en-us/HT206903


I'm not trying to say you're wrong. Especially because "...so (many) nice CVE databases full of examples".

As I recall, you had six(?) defects identified on a rather significant subsystem. While that's hardly ideal, it doesn't sound like those six defects should cost that much to fix.

And I'm not trying to say I can achieve some abstract perfection - I will have the odd memory overwrite early in unit test. But SFAIK, I've caught the overwhelming majority of them. I will also spend several hours building scaffolding to flush 'em out... this is especially true of networked code. I tend to use Tcl to torture them.

I will say - all the old guys I knew who were good C coders just got out of coding pretty much altogether. When I see discussions from younger coders, based on the problems they have, I feel that there's been some loss of information.

Also - for a major release of something as big as El Capitan, 36 seems a not-unreasonable number of severe defects. Obviously, we'd prefer it was a smaller number, and we don't know how many will be found eventually.


  >> 36 seems a not-unreasonable number of severe defects.
The point is that it should have been 5 defects (36 total - 31 memory defects).


Also: "Me too I have written such software, when working in teams of no more than two people where we taken a Pascal like approach to programming in C."

There you go. That's how I have done it as well. The degree of "Pascal-like"-ness is always .. something to be humble about. I really, really wish it had been Ada all along but that seems not to be likely.


Static analysis isn't the only game in town anyway. An approach that seems to be gaining popularity is fuzzing. I'm using AFL (American Fuzzy Lop) in an ongoing way to find problems in code. Stress tests under Valgrind, plus fuzzing, are quite powerful.
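
For anyone who hasn't tried it, an afl-fuzz target can be as small as a program that reads stdin and hands it to the code under test. A rough sketch; parse_record() here is a hypothetical stand-in for whatever you actually want to stress:

  #include <stdio.h>
  #include <string.h>

  /* Hypothetical stand-in for the real code being fuzzed. */
  static int parse_record(const unsigned char *buf, size_t len) {
      return (len >= 4 && memcmp(buf, "RECD", 4) == 0) ? 0 : -1;
  }

  int main(void) {
      static unsigned char buf[1 << 16];
      size_t len = fread(buf, 1, sizeof buf, stdin);  /* afl-fuzz feeds input on stdin */
      (void)parse_record(buf, len);  /* crashes and hangs are what AFL is looking for */
      return 0;
  }

Build it with afl-gcc (or afl-clang-fast), point afl-fuzz at a directory of seed inputs, and re-run any crashing cases under Valgrind to pin down the actual corruption.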


Assuming one is using a C toolchain that allows for it, which usually isn't the case for most embedded projects and their commercial tools.


Also the skill level needed to write code that will evade lint type tools, while being incorrect. :)


Yeah, there's that :)


It is not helpful to characterize a discussion of C's problems as C-bashing. H2CO3 titled his article "Let's Stop Bashing C", but the article he was responding to was not a rant against C. It was actually a suggestion that some of the more dubious features of C should not be carried forward into future languages. You may disagree with specific issues (I am with H2CO3 on braces), but I think you agree with the principle.


Indeed. Look at Valgrind, for instance: it only runs on Linux (and not on all architectures). So you can only use it to debug C code that you can port to Linux (which requires it to be portable, which is different from "correct"), and you can only debug the issues in that code which you can reproduce in that Linux port.


> It isn't like folks haven't tried to make better languages and toolchains throughout the years. You even acknowledge this. They just haven't delivered on their promises.

The question also has to consider: why did they fail? And what is the criterion for failure? That may be entirely the point: the languages may be better by some metrics, worse in others, but they cannot easily beat the inertia factor, as a result of C being so entrenched for 4+ decades. Everyone is very aware of the inertia factor; it seems to permeate all levels of technical decisions in all kinds of fields, for good and bad reasons. If the failure criterion is "not used as widely as C", then, like, everything has failed, so it's a pretty bad criterion, I think.

Ada has been around for a while now, and has seen a fair amount of high-assurance industrial use. It's really unclear whether you can actually characterize Ada as a total failure, for example: it lives on, although it's certainly less popular. But it was designed with the field in mind from day 1. I think a lot of this has to do with things like institutional knowledge and inertia, and the vast amounts of time and money sunk into tooling for systems like C.

That's not bad, the money and stuff obviously helped; but it makes the narrative that all these other things are strictly worse/never attempted a little bit muddier. Does the "empirical" lack of better replacements imply that better replacements cannot exist, or never existed? I think Ada alone specifically negates the core of that argument, they've certainly been tried and seen some level of success. Or perhaps we didn't actually try enough due to resources, maybe? Or maybe it suggests that other contributing factors, externalities, gave rise to such a "monopoly" of computing intellect?

Also, it's easy to forget, but today, it seems general programming languages have enough demanding requirements in general (from things like expectations of package managers, to libraries, to editors), that actually supporting and promoting a general purpose language, to the point of wide usage, is ridiculously time and money consuming. Programming languages are a commodity, for the most part. Existing languages almost all either struggle to exist, have their primary developers financed by some corporations, or have existed for long enough to not simply drift into the void and become effectively immortal.

Yet, there's almost no money in it as a field. It's insanely hard to make a business out of things like compiler tech or programming languages these days, outside of specialized fields and clients. Unless you have deep pockets and a willingness to throw money into the void for a while, bootstrapping a "competitive" programming language is going to be insanely hard, on any kind of reasonable timeframe. Rust is a good example: Mozilla financed a vast amount of the development for a long time, and sustained it even in the beginning stages when it was much riskier as an investment. That's likely one of the major reasons it's even remotely as popular as it is today, because they ate the cost, to catch up with modern expectations.

All of these compounding factors are not going away. I agree there are few viable alternatives; but I don't agree it's so easy to explain it all away "empirically" with such a simple analysis as yours, that it was all either never tried or just "failed" (without defining what that even means).

> And, some of that is because you are ignoring all of the advances that have happened in C. If you are not using modern tools like valgrind/coverity/whatever, you are not comparing C fairly to contemporary languages.

The fact that you suggest a 15+ year old, insanely complicated, proprietary static analysis tool to ensure your code should basically be allowed to exist on the modern internet (because otherwise it will just be a nightmare hellhole security problem for someone down the road) -- a tool that still keeps missing true high-profile vulnerabilities and still requires constant evolution -- all so that C can be "fairly" judged against modern languages... it's a bit alarming when you think about it.

Well, I'm being slightly tongue-in-cheek. I mean, every time I write C, I heavily indulge in these tools as my insurance to ensure I'm doing things correctly. I even tend to use CompCert to find undefined behavior! Coverity is really good, no doubt.

But when I do this, it definitely gives me pause at the investment I have to make. I don't sit around thinking "Yes, this is basically a great solution to the problem". It would almost be, like, I dunno... calling the story of the Titanic a success? Lifeboats hardly seem like the ultimate risk insurance when you're the one crashing into ice, especially when it seems we never have enough of them.


I'm not sure where this conversation is going. :) I think I resonate/agree with your entire upper part. At least, I find myself nodding my head a lot.

As for the criticism of the static analysis tool's age: that criticism doesn't make sense. I could make the same criticism of a 26ish+ year old and insanely complicated language toolchain (Haskell). And that is if I just pick the popular one. I could go with your example, Ada, for a 36+ year old option.

If the criticism is simply about its proprietary nature, that cuts against much of your earlier argument. The lack of money in the field is a large part of the problem.


> The problem is that they're not good enough to build on for the future, but they are good enough to entrench themselves.

Richard Gabriel made the same observation in his 1989 "The Rise of Worse is Better": https://www.dreamsongs.com/RiseOfWorseIsBetter.html

The worse-is-better philosophy means that implementation simplicity has highest priority, which means Unix and C are easy to port on such machines. Therefore, one expects that if the 50% functionality Unix and C support is satisfactory, they will start to appear everywhere. And they have, haven’t they?

Unix and C are the ultimate computer viruses.

[...]

It is important to remember that the initial virus has to be basically good. If so, the viral spread is assured as long as it is portable. Once the virus has spread, there will be pressure to improve it, possibly by increasing its functionality closer to 90%, but users have already been conditioned to accept worse than the right thing. [...]

The good news is that in 1995 we will have a good operating system and programming language; the bad news is that they will be Unix and C++.


I came here to share the same link. This is the reason why C is still relevant today; it continues to be good enough! I think the same goes for Unix.


I mostly agree with you, with the exception of SQL.

I actually think SQL is a fairly good language for what it does and the alternatives have been fairly crappy (at the minimum it should not be lumped in with Javascript). If there is a Rust of SQL let me know.


SQL's problems can be seen by comparing the true relational database theory with what SQL can do.

In particular, relational database theory does not require tables to have homogeneous table rows that consist entirely of scalar values. That was an efficiency optimization introduced back in the 70s, for perfectly sensible reasons at the time. But writing that into the language at such a fundamental level bends the whole language around it in a really unpleasant way compared to what it could be. For a similar example, consider the difference between a language that has no first-class function references, vs. one that has them added in. The first language may do everything you need, but you'll be bending a lot of code around that problem.

Of course, a variety of extensions have been bodged on to the sides of SQL over the years to deal with the resulting problems, and generally in a good engine like Postgres or something you can do almost anything you want. I'd also say that if you can understand what I mean when I say "Postgres is actually a much better database than understanding SQL would lead you to believe", you're on the track of what I mean.

Again, I'm not saying SQL is bad, I'm saying its success is holding back the things that could be better.


> In particular, relational database theory does not require tables to have homogeneous table rows that consist entirely of scalar values.

Are you sure about that? https://en.wikipedia.org/wiki/First_normal_form

Homogeneous values are not required, but having atomic/scalar values as the fundamental unit of a relation is a pretty core concept in database normalization theory.

Or are you saying database normalization is a performance optimization and we should judge RDBMS' by their support for relational algebra/calculus and not normalization theory?


I agree SQL is not perfect but I don't know of many examples of alternative languages (that are not completely proprietary).

It honestly would be helpful to see some alternatives that you think are actually good (as the ones I have seen in the past are fairly awful). The homogeneous rows point is valid but I'm not entirely sure even SQL requires that (as you can have non scalar data types as columns in many databases... although I suppose that is to your point about postgres).

And yeah I mainly only use Postgres so I'm biased :)


"I agree SQL is not perfect but I don't know of many examples of alternative languages (that are not completely proprietary)."

That's the point.


As another example could you write a device driver in a language like Python?

I was frankly a little disappointed in both the referenced article and the one it references. C basically is assembler. If you have a problem that needs hardware-level access you'll need things like pointers. The really hard parts of C (like handling null pointers or lack of concurrency primitives) seem to be the flip side of this access. Whether you use things like braces just seems to be a matter of taste.


> As another example could you write a device driver in a language like Python?

Sure, with Cython or a Python C extension. :)


My thought exactly =)


Of course Python isn't up to the task of device drivers, but C isn't the only language that can do that. You could write a driver in Rust, which is a significantly safer language than C.


Why are C & Rust available to write drivers but Python cannot? Is this because Python is too abstracted, where C & Rust have access to the physical hardware without the need for an API?


Python has a much more complicated runtime system which assumes a lot of things that wouldn't be true for a driver. It could probably be adapted to work, but it's a square-peg/round-hole situation.

By comparison, C's runtime is very simple and assumes very little, so it's good for writing code in a kernel or integrated system where many OS services won't be available to you. It's also much faster, of course, and it gives more direct access to memory.
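
To make the "direct access to memory" point concrete, this is the kind of thing a driver does all day. A hedged sketch only: the base address and the register layout below are completely made up.

  #include <stdint.h>

  /* Hypothetical memory-mapped UART; address and layout invented for illustration. */
  #define UART_BASE 0x10000000UL

  typedef volatile struct {
      uint32_t data;    /* write a byte here to transmit it */
      uint32_t status;  /* bit 0 is set while the transmitter is busy */
  } uart_regs_t;

  static uart_regs_t *const uart = (uart_regs_t *)UART_BASE;

  static void uart_putc(char c) {
      while (uart->status & 1u)   /* spin until the hardware is ready */
          ;
      uart->data = (uint32_t)c;
  }

No allocation, no runtime, no garbage collector that might move things around: just a volatile pointer cast from a fixed physical address. That is the part Python's execution model has no natural way to express.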


Thank you! So essentially because C gives you direct access to memory -- I remember you have to allocate array sizes, which are literally contiguous blocks in your hardware -- you can control things directly. Thanks


C and Rust compile to native code. Python doesn't. It's either interpreted, or run in an abstract machine (VM). Python was never really intended to interact with the actual hardware directly.

I mean, you could, there's no real technical blocker, but it'd be horribly slow. Plus many failure modes of touching hardware are very likely to break the VM process and make debugging very challenging


I think that the real blocker is raw pointers. You need to be able to read+write data to specific memory addresses for most hardware, and I don't think Python's capable of that.


Python is capable, via the ctypes module.

The problems include:

- performance: implementations being interpreted, and relying on dynamic typing mean that Python is almost always slower than compiled languages.

- the run-time: Python assumes that things like allocations are always OK, and generally requires more operating system support than is available in low level environments.


Thanks for the reminder. I had it at the back of my mind that CPython is just about built as an abstraction on top of C, and so should be capable of anything C is, depending on where you wanted to draw the line between the languages.


> As another example could you write a device driver in a language like Python?

You could write one in Lisp, which is in many ways a language like Python, but which does support bit twiddling & pointers if desired.


> Consider, what is the fastest way to calculate a square root? And how old is that instruction? (I'm assuming it is actually used by higher level languages by now.)

Isn't the right way to handle this for your code generator, e.g., LLVM, to know about this instruction, and your language implementation, e.g., clang or rustc, to just say "hey, give me a square root"?

I do buy your argument for very processor-specific things like writing a bootloader or a hypervisor.


That is certainly the high level language way, because the amount of magic involved in the code generator is quite high.

C, however, gives you a handle on the code generator to say "use this assembly". So, for new or one-off instructions, it is a quick way to get an edge.
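
A minimal sketch of what that can look like with GCC/Clang extended asm, assuming x86-64 with SSE2; the constraints tell the compiler the instruction reads one XMM register and writes another:

  #include <stdio.h>

  static inline double asm_sqrt(double x) {
      double r;
      /* "=x" / "x" constrain the operands to SSE registers */
      __asm__ ("sqrtsd %1, %0" : "=x"(r) : "x"(x));
      return r;
  }

  int main(void) {
      printf("%f\n", asm_sqrt(2.0));   /* prints 1.414214 */
      return 0;
  }

That said, as the reply below argues, a modern compiler will typically use the same instruction on its own for sqrt() from <math.h> (modulo errno handling), and the built-in form leaves it free to schedule, hoist, or fold the operation in ways the opaque asm block does not.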


My question is, does it actually get you an edge?

When you use inline assembly, the compiler (more or less) has to emit that inline assembly, right where you say it is. It can't speculatively execute it, it can't refactor the call into a different function if you're calling it twice, it can't optimize it out in certain circumstances, etc. It can't move the instruction around or choose a different form depending on register pressure. If you specify the right flags about what registers it clobbers, what memory barriers it requires, etc., the compiler can do some of it, but it's not going to be nearly as good as the compiler actually understanding what's happening, I think.

Telling the code generator (and C is a high-level language in this respect, if you want any performance out of C) what you're doing is likely to get you more of an edge, no?

I can certainly imagine cases where letting the compiler reorder a slow square root around a memory access will be faster in practice than a fast square root that can't be reordered.


Support for inline assembly language in C is a compiler extension.

It is not exclusive to C compilers.


Indeed it isn't. It does seem to be most common there, though. (See sibling post for comments of early lisps supporting this. Well aware of it, but many higher level languages do not commonly support it.)


Burroughs ALGOL, Modula-2, and industrial Pascals could all do that. Among others. They were safer by default and fast. C looks weaker at this point.


> My 2 cents for why, is that c is basically the only game in town that can easily use new and special assembly.

In Steel Bank Common Lisp, one can just write a new VOP to emit a new instruction.


Yeah, I was aware of that. Naughty Dog did a similar thing with their toolchain.

Lisp seems to have been damaged by the functional programming dogma more than anything else, the more I learn about it. Which is a shame, because the functional parts are good. But so were all of the toolchains around interop with other systems.


I don't think that's true. Let's pick the Node.js ecosystem for an example (chosen because I use it very little, so I'm not hyping up my own pet language). Lots of non-expert programmers are writing Node packages, both for their own in-house use and to share with others. I'm going to assert that the average Node user has fewer years programming than the average C user. And yet, they seem to have fewer critical vulnerabilities per amount of code written. They still make mistakes all the time, sure! But they're a higher class of mistakes, more "this sets the wrong HTTP status code" and less "this exposes kernel memory to remote users". JavaScript handles the low-level abstractions so that its users can concentrate on breaking the higher-level stuff, hopefully in ways that are easier to test and notice.

Also, almost every language has a way to use C code directly (For example, see Go: https://golang.org/cmd/cgo/). C may be the best way to write low-level driver code. If it is, use it for that! I'm not convinced that there's a single best language for Ethernet card drivers and HTTP stacks, though.


> fewer critical vulnerabilities per amount of code written

Because nobody uses node.js in serious infrastructure. Show me a serious ISP whose DHCP server is written in node. Know any whose DNS server is written in node?

Your statement is equivalent to: we should build bridges out of Lego and not out of concrete because fewer people have been killed by collapsing Lego bridges.


What systems are we talking about, then? Few people are writing stuff in C/C++ nowadays. And the fields that do this on a regular basis are pretty much unchallenged by any of the higher level languages.

And, agreed, on truly expert systems, things do go polyglot. This has been true since pretty much the beginning of computers. I'm not sure what sort of evidence this provides, though.

So, to bring it directly to the point: where is the evidence that the systems people typically write in C could be written well in another language? About the best I have seen is some system utilities written in Rust. Which is awesome, no doubt. And the beginnings of the evidence I'm asking for.

(To be fair, we often speak of mysterious wizards that wrote the symbolic lisp systems as though they succeeded at this. I want that to count as evidence, but it feels more like hearsay.)


(Now arguing the other side...) There is a case for looking at safety issues in C. Reducing the possibilities for mayhem from null pointers would be helpful.

What I don't fully understand is critiques of C that lump safety issues in with use of braces or semi-colons.


A lot of the safety issues in C have been decently solved, though. You just have to use a larger tool chain than just the compiler. (Valgrind, coverity, etc.)


Fair point, I think what's missing from some of the discussion here is that very few languages are productive without an effective toolchain.


Yeah. I have tried times in the past to make sure folks don't just think in terms of language, but full toolchains. I confess I am not good at that thinking, either. :(


IMHO, people constantly introduce vulnerabilities when using high level code. Most websites are defaced this way. They're not exploiting flaws in Apache or IIS anymore. It's the Python or PHP or Node.js application, or WordPress blog, or similar, that doesn't sanitize user input or has a buggy web API.


A C program talking to the web will have memory vulnerabilities, encoding vulnerabilities, and business logic vulnerabilities.

A modern program talking to the web will have encoding vulnerabilities and business logic vulnerabilities.

It's still progress.

There are cutting edge stacks which will effectively eliminate the encoding vulnerabilities, too, leaving more-or-less just the bugs you put in.

(Obviously I'm simplifying the types of vulnerabilities that exist to make a point.)


Well, you could make another argument too. Given X amount of time in a day, a C programmer can't really pump out _that much_ business logic code because they're also dealing with the other lower level stuff.

A high level language OTOH allows you to be more productive in terms of the business logic stuff, and therefore, I'm using a rough metric here, more lines of code, which also means more bugs in the business logic.

Like you said, these are all simplifications, but I think the primary problem is that people undertake projects that are way above their competency level.

Edit: If we could get some multiplayer-game like matching system where you are matched with the project depending on your skill level that would be awesome.


If you're working on a project that is not above your competency level, then you're wasting your time.


Yes, because the ultimate purpose of delivering software is to challenge the company's developers.

/s


What stacks eliminate encoding vulnerabilities? And are there any stacks that even protect you from writing bad business logic, if that's even possible?


Using something like Shakespeare for Haskell makes it very difficult to write HTML that is incorrectly encoded without doing it on purpose: https://hackage.haskell.org/package/shakespeare

More conventionally, even something like Go's html/template [1], though I believe it uses a technique that can ultimately be fooled, is much harder to write accidental encoding vulnerabilities into. If you do the natural thing in the templates, the templates do the right thing.

Business logic errors I don't expect to be prevented by, well, anything. Oh, you can do some work to prevent certain sorts of errors with type systems, but a lot of business logic errors often start in the requirements, or the translation of the requirements into the code. Nothing will ever stop us from taking fuzzy requirements and translating them into wrong code, even if we can at least ensure the wrong code doesn't crash, corrupt memory, expose encoding errors, or violate simple invariants.


You can use secure parser generators like HAMMER or Nail for encodings. If it's about integration, you can use something like ZeroMQ.

As far as eliminating logic errors goes, you can make a lot of progress on that using formal specifications combined with modular code that matches them well. That's how med-high assurance was done. Some companies also specify domain rules in formal logic, then implement them in Prolog or Mercury. You will have a hard time straying from intended logic when executing the logic itself. ;) Toolkits like Allegro CL and sklogic's DSL scheme let you mix and match things like LISP for power, ML for safety, and Prolog for logic execution.

Really niche stuff in industry, though.


That's all true, but consider how many vulnerabilities we'd be seen if the programmers who wrote those applications also had to deal with pointers and memory management. You'd get the current problems plus all of the stuff that regularly snags systems programmers.


I agree and we can take it further. If we had to hand encode the instructions it would cause even more bugs. The more decisions you have to make as a programmer, the more chances there are of making errors. High level languages make some decisions for you, but people still end up making mistakes. Maybe the goal should be to increase the bar at which a programmer is considered competent.


I think you look at it a bit incorrectly. Those high level languages were designed not to avoid mistakes, but to push software complexity to a whole new level. You simply won't be able to get to that complexity with a lower level language, because people have a limited ability to think about things and tend to fall back on solutions they can fit in their mind. Which is why you shouldn't expect high level languages to eliminate vulnerabilities; people still use their minds to the limits and make mistakes. Other approaches must be taken.


I think we agree. Its hard to encapsulate everything, so I just chose a smaller context for my argument. I was primarily thinking about resource allocation and such.

A HLL does indeed allow you to compose solutions to problems in terms of higher level abstractions, which I gather is your point.


But it seems the OP's implicit point was that C is not the right language for higher-level operations. For low-level access it's not very obvious what else you would use.

Personally I rarely use C because I currently write network services with REST APIs. C is close to worthless for that purpose for many reasons, not just errors. It's analogous to writing an accounting app in assembler. Most people stopped doing that in the 1960s.


> For low-level access it's not very obvious what else you would use.

Extended Algol, created in 1961. Almost 10 years before C.

There are tons of other examples.

C got widespread thanks to UNIX, just like a few decades later, JavaScript got widespread thanks to Web 2.0 and the browser.

> It's analogous to writing an accounting app in assembler. Most people stopped doing that in the 1960s.

Lotus 1-2-3 was initially written in Assembler, I will let you track down when it was created.

My last 100% Assembly program was for MS-DOS in 1996.


> Lotus 1-2-3 was initially written in Assembler, Ι will let you track down when it was created.

Interesting! I did not know that. I would still stand by the point that application programming by the end of the 60s was leaving the Assembler era and moving on to languages like COBOL and FORTRAN. (And weird cul-de-sacs like RPG which was my second programming language after BASIC.)


This was true on mainframes and workstations.

On 8-bit and 16-bit computers targeted at consumers it was a different story.

Forth, Basic, Pascal, C, Amos, Modula-2, Clipper were the "managed languages".

Anything that required performance was written in Assembly, or written at a high level in such a "managed language" (Turbo Pascal in my case) with lots of inline Assembly.


  Finally, C is slow. Yeah, you read that right. Its main problem is that it has no non-wizardry high level semantics for parallelism. 
Do you know about OpenMP? It works in most mainstream C compilers (Clang, GCC, ICC, MSVC, etc.). OK, it's not part of the standard, but IMHO that's probably a good thing. C doesn't dictate a paradigm, which allows it to support a wide range of competing standards. Pick the one best for your project, or embed something like OpenCL or Lua right into your C project. Of course the opposite can be said as well: write your code in Python/whatever and then write a C plugin for the critical portions. Because, outside of Fortran or C++, pretty much nothing comes close to the capability of C to solve a problem quickly.

Which, if no one has noticed, is even more important these days than it was two decades ago, when companies used crappy languages and shipped products that wasted 99% of the cycles of the user's computer. Today, doubling the performance of your product can literally halve the computing costs at your friendly cloud provider/datacenter.


Exactly. I studied HPC over a decade ago. All C and Fortran dialects with amazing performance. Today, they advertise updated versions of same languages.


>>remembering to free(); not to free() twice;

You can set the pointer to NULL, then freeing again won't hurt

>>not to use after free()

Set the pointer to NULL so it will crash.
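
In code, the idiom being described is roughly this; the crash on reuse is likely on typical hosted platforms, but (as noted further down the thread) it is not guaranteed by the language.

  #include <stdlib.h>
  #include <string.h>

  int main(void) {
      char *p = malloc(64);
      if (p == NULL) return 1;
      strcpy(p, "hello");

      free(p);
      p = NULL;      /* clear the owning pointer immediately */

      free(p);       /* free(NULL) is defined as a no-op, so this is harmless */
      /* p[0] = 'x';    would now (usually) fault instead of silently corrupting memory */
      return 0;
  }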

>> Its main problem is that it has no non-wizardry high level semantics for parallelism.

Fair, the language doesn't have it. For everyday programming though you can use something like OpenMP where making things execute in parallel often requires adding one line above your for loop.
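
For example, the "one line above your for loop" really is just a pragma. A rough sketch, assuming a compiler with OpenMP support (e.g. gcc -fopenmp); without the flag the pragma is simply ignored and the loop runs serially:

  #include <stdio.h>

  #define N 1000000

  int main(void) {
      static double a[N], b[N], c[N];

      #pragma omp parallel for    /* the one line: split iterations across threads */
      for (int i = 0; i < N; i++)
          c[i] = 2.0 * a[i] + b[i];

      printf("%f\n", c[0]);
      return 0;
  }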

>>But there's no way I'd willingly start a new project in C today given the alternatives

There is no alternative for performance critical code (well, at least among popular languages of today). For other things - sure.

>>I'm going with one that gives me a reasonable shot at writing safe, fast, and understandable code.

If safety is your priority then likely C is not for you. As to performance it depends what you consider fast. If making stuff 50% faster no matter how fast it already is saves a lot of money you have no choice but C/C++/Fortran or just hand code in assembly.


  >>> remembering to free(); not to free() twice;
  > You can set the pointer to NULL, then freeing again won't hurt
  >>> not to use after free()
  > Set the pointer to NULL so it will crash.
I like C and think it is bashed too much, but double frees and invalid pointers are most often the result of multiple copies of the pointer... some other structure hanging onto the original reference.


> some other structure hanging onto the original reference.

Let's unleash the superpower of C. You should've been using pointers to a pointer so that it is always a single reference to a memory block.
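
A sketch of one way to do that, with made-up names: the destroy function takes the address of the owning pointer, so freeing and clearing happen in one place and the caller is never left holding a dangling copy.

  #include <stdlib.h>

  typedef struct node { int value; struct node *next; } node_t;

  /* Frees through the owning pointer and clears it. */
  static void node_destroy(node_t **np) {
      if (np != NULL) {
          free(*np);
          *np = NULL;
      }
  }

  /* usage:
         node_t *n = calloc(1, sizeof *n);
         ...
         node_destroy(&n);   n is NULL from here on
  */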

You can't really be programming in C if you're not thinking in pointers.

One more thing; The article should have mentioned "pointer increment" which has nothing to do with "+= 1".


> One more thing; The article should have mentioned "pointer increment" which has nothing to do with "+= 1".

What do you mean? ++ptr is perfectly equivalent to ptr += 1, for any pointer type.


I read it as integer arithmetic. It is semantically different from pointer arithmetic.


So it's turt^H^H^H^Hpointers all the way down?


Will the following help with the depth, or any depth?

  int *a, **b, **c, **d;

  d = c = b = &a;


> Set the pointer to NULL so it will crash.

Dereferencing NULL is not guaranteed to crash. It's undefined behavior, generally. So that can lead to tons of other problems....
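
A quick illustrative sketch (not from any particular codebase) of why "it will crash" isn't something the language promises:

  /* If p is NULL, the first line is undefined behavior, so a conforming
     optimizer may assume p is non-null and delete the later check entirely. */
  int read_flag(int *p) {
      int v = *p;
      if (p == NULL)
          return -1;    /* can be silently removed by the compiler */
      return v;
  }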

> There is no alternative for performance critical code

On many platforms, Rust exists today. C is still the king of that, though, especially for obscure embedded stuff.


Right. Still, in practice, if you are setting pointers to NULL after freeing them (to say "there is nothing there" if some other part of the program wants to check) you will get a crash (as the compiler won't be able to prove the pointer is NULL if setting it to NULL depends on what happens at runtime).

I agree it's not a perfect solution, just something to help you catch errors in your logic (not checking for NULL when something could be NULL is a bug in the program logic).


If the zero address is valid, then one aspect of "undefined behavior" is that the result is "this will actually continue to 'work' until something random goes wrong". Someone just the other day on Reddit was talking about how this is a common issue in their workplace, as they work on a system that maps valid memory to 0.


Just to add, that if you commonly make those mistakes in C/C++, it's time to learn to use lint. While humans commonly make these mistakes and fail to notice them, lint will scream quite loudly at you until they are fixed.

If you are writing code that is too complex to be linted, then you are also at a level of expertise where those mistakes are absent or very uncommon.


> Set the pointer to NULL so it will crash.

This doesn't work on all architectures, which is why it's undefined behavior.


Yet it will on the vast majority.


I really have to disagree - I've seen and written a lot of perfectly safe C code. There are a lot of good reasons not to use C, but that one's a rapidly expanding fallacy.

There is something missing from these discussions.


"There is something missing from these discussions."

The reasons for the gap between the condition of most C code and what you describe. Plus how to close that gap with lower effort than just getting them to use Ada, Component Pascal, Go, Rust, etc.

I am keeping some notes on these discussions. Keeping you and TickleSteve in mind for the eventual roundtable discussion on Doing C Right With Ease (TM). pjmpl's list of safety tricks in the reply to you looks interesting, too. Once a basic strategy forms, tooling can be designed to analyze conformance to it, with helpful error messages and justifications, built into an IDE running in the background as the developer programs.


Well, it's not that I haven't thought about how to write a paper on the subject.

But frankly, the timbre of these discussions leads me to believe I'd simply be defending that for hours on end and I just don't wanna fool with it. The sheer rush to judgement is enough to make me think I'm sort of out of my depth in the ability to actually debate this properly. Part of my desire to say anything is trying to get the color and shape of that.

Plus, I'd rather not break anonymity, and I (amusingly) don't own any code of consequence that could demonstrate the principles - it's all owned by my employers.

Something like that - if I can't visualize the end-game, I probably have no business starting in on it. Plus, I feel somewhat strongly that the languages designed to replace C should probably be given a chance. Throw in that the compiler people, in chasing absolutely performant executables, have left old code to rot... I'm also feeling a bit out of my depth.

I can more do it than talk about it.


Are there any real world examples (with actual measurements) of languages where map uses hardware parallelisation to be faster than a C loop? Serious question! My impression is, no, outside of something like Hadoop.


I don't have any off the top of my head, but I used a forking map in Python where the inner call was an image processing operation, and it was approximately #cores faster. I think it'd be enormously dependent on workload: if the inner operation is adding two ints, the gains might be lost in the parallelization overhead.

Map is just one example of a high-level construct that permits faster code, though. A lot of us here have done async IO in C because that was the option we had available to us for doing it, but I'd vastly rather do such stuff in Python (3), Go, or Node where the language itself gives you constructs for describing what you're trying to accomplish.

My impression is, no, outside of something like hadoop.

That's a pretty big exclusion, though! We use Hadoop-like stuff on, ahem, sizable quantities of data. That's not exactly an edge case.


https://donsbot.wordpress.com/2007/11/29/use-those-extra-cor...

This was a simple fib(), not a map(), but I think the point still stands pretty well.


FPGA implementations usually are made to do this, although there is usually a human writing dedicated VHDL. Also google coarse-grained reconfigurable architectures (which can sometimes be mapped automatically from code).


Check the now dead Connection Machine and *Lisp.


Here's an example (something of a proof-of-concept/toy example though)

Auto-vectorisation and parallelisation in Winter: http://www.forwardscattering.org/post/22


To be fair, the value of the features you cite is very dependent on the size of the team and the type of work being done. If you're throwing tons of engineers, tests, and money at something--think aircraft or whatever--you're not relying on the language to be smarter than the programmer.

And as for operating systems being written in C and failing, writing a general purpose, hardware agnostic operating system isn't exactly easy. This is not to say that C should be the systems language of choice, or even that C is easier or more stable to write operating systems; just that stuff will be buggy regardless.

And I think your main complaint is stuff like OpenSSL and its vulnerabilities... I don't think I would blame the language there. I would blame age / inherited bugs / lack of testing (amazingly that's the case, but we don't live in a world that's too concerned with the common good).

I agree with you for many, many use cases for a programmer these days... but only if you just contextualize all these assertions.


"And as for operating systems being written in C and failing, writing a general purpose, hardware agnostic operating system isn't exactly easy."

Wirth and Jurg did it in a year or two on custom hardware using Modula-2. They and their students repeatedly did it with Oberon variants. They were all safer and more reliable than the earliest UNIX distro. Partly due to strong typing, interface checks, and GCs in later versions.

Similarly OS's written with Modula-3 (Spin), Haskell (House), Java (JX), and Rust (Redox) moved fast since the languages reduced number of problems they had. Spin could even dynamically link M3 code into running kernel with type system preventing many crashes or vulnerabilities. Made for great acceleration of networking apps, etc.


I know that the response I'm about to give makes a statement that is not falsifiable and very much has the form of "I can cart this out and think I'm right no matter WHAT you respond," but I still feel compelled to say...

...the point I'm making is absolutely not that building an OS is equally hard in all languages, and equally likely to be buggy. My point is that creating a general purpose OS that only gets its rigorous testing from getting put into production before the majority of its target hardware has even yet been developed... that's going to be buggy in all languages.

Designing and running an OS on a known system with a few applications in mind can't really refute what I'm saying. I mean, I could of course be wrong. But I can't imagine a convincing counter-example that doesn't have for its evidence "largescale adoption on disparate systems."

(there could be an in-principle reason that I'm wrong that I just don't understand, but this is the risk we take in having opinions)


"...some of the smartest people on the planet use it to write infrastructure that regularly fails catastrophically."

Not one thing said in that sentence fragment can be shown categorically to be true. Yo, software has defects. Some get past everybody. To wit:

1) It's 2016. There's a lot of old code out there. Old code tends to be C.

2) UB and implementation-defined behavior are just a fact in the language. This being said, they're not THAT hard to avoid.

I think the thing is whether or not people are habituated to thinking in constraints. People who aren't will be more dangerous with a C compiler than those who are.

FWIW, I've heard the phrase "knuckledragger C programmer" more than once. As a C programmer, it might even be apt. :) As in "monads? We don't need no steenking monads!" :) (this being said, I use something akin to a monad pattern quite frequently, and in C)


If you tilt your head and look at it funny,

  p == NULL
  p
  *p
is a monad. :)


not to use after free(); accidentally adding a second clause to a non-bracketed if expression;

Two methods I religiously use in my code: Set to NULL after freeing. free(NULL) is a no-op, so freeing twice is never an issue anymore. And since you probably won't use the variable after that anyway, setting it to NULL eventually gets optimized out, probably. And no if without brackets in my codebase.


> Spaces are for the human eye, while braces are for the compiler

and for the editor auto indenter too.

I don't like doing the job of the compiler but this is an exception.

I have already wasted more time in a few months of Python hand-aligning and debugging code after I moved it around than it saved me by not having to type } or "end" at the end of the block. Note that Python has a { of sorts in most cases: the mandatory :

And how about the time lost because of "IndentationError: unexpected indent" when copy-pasting code from the editor to the command line interpreter?

To the point of bashing C, the post is an answer to a very few points of https://eev.ee/blog/2016/12/01/lets-stop-copying-c/ It's mostly about syntax and it's an interesting read. I particularly like "No hyphens in identifiers"


Perhaps C's biggest mistake (at least in terms of syntax) is allowing fewer braces than it could have. Specifically, for statements like if and for, which allow blocks, the braces aren't required, so you can end up with code like this:

  if(foo)
     bar; baz;
Which, when you throw in a preprocessor, has probably caused more bugs than anything other than out-of-bounds pointer errors.
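
For instance, the classic interaction between a braceless if and a two-statement macro; the macro and names here are made up for illustration:

  #include <stdio.h>

  static int counter;
  static void log_value(int x) { printf("value = %d\n", x); }

  /* Two statements, only the first of which ends up guarded by the if. */
  #define LOG_AND_COUNT(x)  log_value(x); counter++

  int main(void) {
      int debug_enabled = 0;

      if (debug_enabled)
          LOG_AND_COUNT(42);        /* counter++ runs unconditionally */

      printf("counter = %d\n", counter);   /* prints 1, not 0 */
      return 0;
  }

The usual fix is to wrap the macro body in do { ... } while (0) so it behaves as a single statement; mandatory braces around the if body would make the trap impossible in the first place.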

Of course, the fact that everything, including a block, is a single statement in C is one of the things that gave me my first big "whoa" moment when learning to program, and it makes me glad that C was the first "real" language I learned (after Basic). K&R C is far more elegant than people give it credit for, probably because modern, real-world C is full of all sorts of weird stuff (and often egregious preprocessor abuse - yes, I'm looking at you, PHP), or people learn C++ first and assume it's all the same.


I totally agree, the "bare" if/for/while block is a real curse and should never have been allowed. I decided a while ago to always put in braces in C-style languages, even for single statements. If nothing else, it just looks better. The code looks more "balanced", I think; an "if" without curly braces looks like it's toppling over or something.


I suspect this is the reason the Golang designers made braces mandatory even for one-statement if clauses.


Is the above code example, though legal, considered bad practice?


  if (foo)
     bar; baz;
Is the above code example, though legal, considered bad practice?

Yes, misleading indentation like this is bad practice for anyone who believes that such a thing as "bad practice" exists. For clarity, for those who do not know C, the problem is that "baz" is executed unconditionally, even though the formatting misleadingly implies that it depends on "foo".

Unfortunately (in my opinion) there is not consensus on whether the following is bad practice:

  if (foo)
     bar;
My belief is that this formatting should be avoided, to avoid the case where someone not familiar with the rules of C (or not thinking about them) edits it to add a manually indented "baz" on the next line:

  if (foo)
     bar;
     baz;
Personally, I'm fine with a single line without braces:

  if (foo) bar;
But I believe that as soon as the body is moved to another line, braces should be required:

  if (foo) {
     bar;
  }
This belief is common, but not universally shared. Some go farther and say that braces should always be required (there is a good argument for this). Others say that a two line if statement without braces is just fine (I think they are wrong).


Thanks for your informative answer.


Even if it is done correctly with a comma after the 'bar' instead of a semicolon, I think it is bad practice to separate statements with commas. But if you think in semicolons all the time and skip some brackets, then you will mostly get buggy code similar to the above.


Does anyone do that in practice? In 10+ years I've never seen a single bug from braceless ifs.


Oh god, this is one of my least favourite parts of python. It is unnecessarily hard to restructure if-then code. It is risky to cut-and-paste from webpages. And as you say, you can't always paste from the editor to the interpreter.

Python has an "invisible close brace" character, the newline, which closes a varying number of braces. When you think of it like that, it makes much less sense.


"Python has an "invisible close brace" character, the newline, which closes a varying number of braces"

The idea that the end of a block is invisible isn't really right, though. The newline doesn't end the block. The block ends at the next line containing non-whitespace text at a lower indentation level. In other words, it only ends when it becomes visible that it has ended.


I'm seeing several people complaining about this, but I've always found changing indentation very easy. Just select then [Shift+]Tab to [un]indent. I do this regardless of the presence of braces.


It's not that it's hard to change, it's just that you have to keep track of it, and if it's damaged you have to work out what it's supposed to be.

In an Algol brace-blocked language (or Pascal, Verilog etc which use begin/end), you can just put stuff in the middle of a block, check that what you've just inserted is balanced, let the editor reformat it, and you're fine. Whereas if you've lost the indentation from

    if a:
        print b
        a = a + 1
then you have to work out all over again whether the second line is part of the block or not.


Ah true. Thanks for the explanation.

So maybe what C's missing is a standardized, popular, universal, automatic fmt-on-save. That would eliminate its problem with improperly indented code.


Yeah, most projects have a coding standard and you can run indent(1) to enforce it. I don't think it's high on the list of issues with C though.


Interesting.

I don't think I ever had a bug due to wrong indentation. I admit that I seldom copy-and-paste code from web pages, but even then I don't remember ever having troubles with it. By no means do I intend to invalidate your experience, just saying that I/some/a lot of people have no problems with this. When discussing Python there are some issues that regularly come up (package boilerplate, complex data model, asyncio), but the indentation isn't mentioned in my experience. I'm doing Python for 4+ years and mostly used (and still do use) C and C++ before that, and some Basics and php before that.

What I feel is very important when editing source (not just Python, any source) is to have an editor that isn't entirely bad about handling indentation. Very simple editors just delete/replace a selection when hitting tab. Slightly smarter editors prepend a tab to each line, or indent one level. Emacs knows enough about indenting things that it cycles between possible indentation levels. An editor supporting shift-tab (reverse indenting) is rather helpful.

Personally I find it easier to restructure Python code than eg. C code, because I don't have to adjust braces manually. In Python I can just move a block around and, if necessary, move it horizontally to the correct level, whereas in bracelang I often have to adjust braces as well, which are outside the moved section and need extra cursor navigation.

PyCharm, by way of its IntelliJ heritage, is pretty good here; when moving code it automatically adjusts the indentation if it is unambiguous (e.g. move a few statements from an `if` to an upper level and it adjusts the indentation automatically to match the surrounding block, since the statements can't possibly be indented due to the lack of an if/for/while/def). CLion can do similar things. The IntelliJ IDEs do these things for any supported language, sometimes across languages (paste some Java code into a Kotlin file - IntelliJ will automatically translate it into Kotlin, adjust the code style and sometimes even rename variables by type to match).


For what it's worth, we had two unrelated production bugs because of wrongly indented return statements in Python code. Luckily we only use Python for non-critical glue code, so it was more of an annoyance than anything, but still...

For what it's worth, for C-like languages I do not have to indent my code; my editor of choice does it for me according to brace placement.


> In Python I can just move a block around and, if necessary, move it horizontally to the correct level, whereas in bracelang I often have to adjust braces as well, which are outside the moved section and need extra cursor navigation.

Could you clarify; I think adding one/two more line to the block being moved makes this a similar operation. For that matter, the editor can unambiguously figure out the block and do auto-indentation when braces are present.


I never really had issues with copy/cut/pasting code and indentation until I started using pycharm, which _insists_ on randomly re-indenting code on paste, even if you disable it. In Pycharm, 'paste simple' is your friend.


In the command line interpreter you should use %paste, which will paste with the corrected indentations.

I'm seriously baffled by people complaining about indentation errors. The indentation is so tightly tied to the meaning of the code, isn't it almost always obvious?

Do you use an editor where you have to indent each line individually rather than moving all selected lines with tab/shift-tab?


I'm using emacs. When I paste I lose the selection. I'll check if there is some special command in the Python mode for that.

I'll also google that %paste. Thanks.

Edit: I googled that and installed IPython 5.1.0 (sudo pip install ipython) I can paste lines at any indentation level now and they work, or type spaces at the beginning of the line. %paste is not required.

I would upvote you again if I could. Thanks again!


> and for the editor auto indenter too.

The fact that you need an editor that parses C and adds in characters for you should send up a red flag.

DRY. I should be working with human-generated code, not redundant machine-generated code. The only program I want writing code for me is my compiler.


I like this short article for a few reasons:

1. It makes the case that just because you don't like something, it doesn't mean that there's not a good reason behind it.

2. Just because you like something, it doesn't mean that it is good (or bad).

3. The things that the naysayers constantly bring up just aren't really that bad to begin with.

C is amazingly powerful, and I love it. I love JavaScript, too (especially since ES6!), but I don't confuse where I should use one rather than the other. I tolerate Python. ;)

Languages are just tools, and you should use the one most appropriate for the job at hand.


Tools should be ergonomic and should have guards and other safety mechanisms to make it easy to do what you intend to do, and difficult to do things that you do not intend to do. C tends to do the opposite of that. It makes certain things that you want to do more difficult than necessary, while making it very easy to do things that no reasonable person would intend to do. Evee's original post highlights that.

In fact, if you look at this post, some of the so-called defenses seem more like indictments to me. For example, in the increment/decrement section, there is this:

But look, there’s an even more direct argument: the ++ and -- operators are not even “one” operator. There is a prefix and a postfix variation of them that do slightly different things. The usual semantics is that the prefix ones evaulate to the already-modified expression, while postfix ones yield the original value. This can be very convenient, as in different contexts, code might require the value of one state or another, so having both versions can lead to more concise code and less off-by-one errors. Therefore, one can’t simply say that “++ is equivalent to += 1“, as it is simply not true: it’s not equivalent to at least one of them. In particular, in C, it’s not equivalent with the postfix increment operator.

This illustrates exactly the point that Eevee was trying to make. It's hard enough to justify having increment and decrement as their own operators (as opposed to a performance optimization implemented by the compiler). Having two variants, which work exactly the same in the vast majority of instances but do subtly different things in e.g. if-statements and for-loops, is complete and utter madness.
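
For reference, the difference the quoted paragraph is describing is just this (a tiny illustrative fragment):

  int i = 5, j = 5;
  int a = i++;   /* postfix: a == 5, i == 6 -- yields the original value */
  int b = ++j;   /* prefix:  b == 6, j == 6 -- yields the updated value  */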


I really disagree that this is madness. It may be madness to someone not used to coding in C. But if you code a lot of C you _will_ know when to use which one of these, and you will know that they're immensely useful.

For example, iterating and adding stuff to arrays:

array[index++] = object;

instead of

array[index] = object; index += 1;

and the other way around:

array[++index] = object;

instead of:

index += 1; array[index] = object;

There is no way you will miss incrementing your index here, or do it in the wrong place. This is so common, that as a C-programmer you WILL recognize this pattern.

It's even better when iterating pointers instead of integers. May I ask what the following means to you?

float *float_ptr; float_ptr += 1;

I think writing ++float_ptr here is just much more clear. Incrementing pointers by integers just feels wrong when what you are really doing is incrementing the address by sizeof(*float_ptr).


Until the logic is complex enough and you screw it up. Then you write over '\0' in some string and have a root exploit. Congratulations!

float *float_ptr; float_ptr += 1;

is no more confusing than

float float_ptr[...]; float_ptr[1];


There is nothing subtle about the differences of prefix and postfix expressions. That is speaking generally, not only in C. Complaining about that might as well be complaining about parsing techniques, and that's where they would quickly be proven uninformed.


Even more important, let's stop seeing bash.

If somebody had set out to create a language prone to error and exploitation, bash would result. (It's fine for interactive use BTW, just not as a language in which to write anything complex or long-lived.) While it might not have C's pointer/memory problems, its lack of real types or scope rules and its other idiosyncrasies (e.g. making it nearly impossible to distinguish "" from an undefined variable or the whole $*/$@ mess) make it even more of a nightmare. I've actually gotten pretty handy with it because I had to, but it's a far worse crime against developer sanity than C ever was.


Totally agree! By the way, the Haskell community is working to make scripting easier in Haskell, so if you want to go from one of the most buggy languages straight to one of the least, now you can: https://docs.haskellstack.org/en/stable/GUIDE/#script-interp...

(Downsides: Haskell is weird (but good), you have to have Stack installed, and the first time you run the script it will take a while to download dependencies)

More resources: https://github.com/Gabriel439/post-rfc/blob/master/sotu.md#s...


It was actually only recently that I realized how ridiculous bash was. I was looking for a formal semantics of it in reply to a comment here. I found nothing except a partial attempt by some academics plus a list of everything a real one would need. I thought "Holy crap, that's overcomplicated!"

Interestingly enough, a Scheme or even BASIC-like language with macros (e.g. a 4GL) could've handled about all of that just as readably while building on a simpler, consistent core. Bash wasn't designed like that, though. So I ditched it altogether except as a compilation target.


Hating C because it has integer division is like hating haskell because it is functional. It is inherently low level. C's real problem is that is has undefined behavior in a lot of cases. C with some kind of compile-time memory protection and no undefined behavior (so basically Rust) would be a pretty sweet systems programming language.


I do wonder whether this might have missed the point.

"Let's Stop Copying C" isn't a criticism of C, so much as a criticism of other languages that have applied a kind of conceptual copy-paste of C-like characteristics.

Demonstrating that there are good reasons for those characteristics to exist in C talks entirely past the question of whether the sharp edges of some of those characteristics are still warranted in other languages with other design goals.


The author completely missed the point. It is pretty clear in Eevee's post (hell, it is even in the title) that she isn't saying that C is bad per se; she thinks that modern higher-level languages copying C design decisions is silly, because it really is. Heck, Java even has a reserved goto keyword just for the sake of it.

For sure, the author takes on the most controversial of Eevee's points, which AFAIK are mostly (her) personal preferences. However, he does not engage with the interesting parts of her argument, like weak typing, C-style loops* or textual inclusion.

This post was a much less interesting read than Eevee's, which makes me wonder if the author chose a click-baiting title just to try to get some clicks on his blog following Eevee's post...

*: really, this is the most ridiculous thing in the whole C family of languages; it kinda makes sense in C, but if you're writing a modern language that only supports this kind of loop (pre-ES6 JavaScript comes to mind), you're doing it wrong.


Interesting list of things to bash C and Algol-derived languages on. I really like Eevee's original approach of doing a comparative survey rather than talking about things in isolation. ( https://eev.ee/blog/2016/12/01/lets-stop-copying-c/ )

Point by point:

- textual inclusion and macros: yeah, this technology was obviously chosen for ease of implementation on small systems and makes little sense. Now obsolete and a handicap. Especially in C++.

- optional block delimiters: responsible for gotofail and similar errors.

- modulo: symptom of C not having standardised maths semantics. Standardise your semantics, language designers! Don't just say "whatever the CPU gives us, I'm sure it'll be fine".

- octal: Nobody uses goddam octal and it's a trap for people used to leading zeroes in decimal.

If we're changing this then I'd like to appeal for some features from Verilog e.g. 8'b1101_1101 : leading width specifier, and the internal underscore which is ignored but provides visual clarity.

- power operator: meh.

- switch: actual operation is kind of bananas, see Duff's device. Should be block-based.

- integer division: Careful. Huge argument here about maths.

(tbc)


> - octal: Nobody uses goddam octal and it's a trap for people used to leading zeroes in decimal.

0b..., 0x..., 0o... the latter is still a bit subtle (0Oo: the zero and the letters O/o look alike), but it's far more distinct than a bare leading zero and far less likely to happen by accident.
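The classic leading-zero trap, for anyone who hasn't been bitten by it yet (plain standard C, nothing exotic):

    #include <stdio.h>

    int main(void)
    {
        int dec = 42;
        int oct = 042;                  /* leading zero means octal: 042 == 34 */

        printf("%d %d\n", dec, oct);    /* prints: 42 34 */
        return 0;
    }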

> - switch: actual operation is kind of bananas, see Duff's device. Should be block-based.

the goto-table semantics of switch are kinda useful in many instances, OTOH a compiler should be smart enough today to achieve the same degree of rather simple optimization.
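As one concrete (and contrived) sketch of where that label-based behaviour pays off: several case labels can share one handler with no extra machinery.

    #include <stdio.h>

    static const char *classify(char c)
    {
        switch (c) {
        case ' ':
        case '\t':
        case '\n':                      /* several labels fall into one handler */
            return "whitespace";
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            return "digit";
        default:
            return "other";
        }
    }

    int main(void)
    {
        printf("%s %s %s\n", classify(' '), classify('7'), classify('x'));
        /* prints: whitespace digit other */
        return 0;
    }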


> e.g. 8'b1101_1101 : leading width specifier, and the internal underscore which is ignored but provides visual clarity

... some of which were extensions provided by Metaware's High C/C++ compiler back in the 1990s: 0x2x1101_1011 0x10x_bad_feed_face 0_777


Writing performant interpreters is really hard. People wrote those performant interpreters in C because it was necessary. Rather than blog about what's wrong with C or C++, why not re-write your Python or Ruby interpreter in Rust/Swift/Whatever - that should really show us dinosaur C/C++ programmers that we were wrong all along.

PS: it's not as easy as you might imagine.


The original blog didn't blog about what was wrong with C or C++. It blogged about other languages blindly picking up design decisions from C even though they don't really have a reason to.

And Rust has been consistently delivering software with performance on par with or faster than C counterparts. http://blog.burntsushi.net/ripgrep/ is a recent example. Servo is a more long term one.


Weak arguments.

Integer division is misleading. It uses the mathematical division operator for something that doesn't behave in that way. Having a separate 'integer division' operator (e.g. //) is much clearer.

Nobody could say that the pre/post increment aren't confusing. They feature heavily in 'trick question' C++ tests (along with undefined behaviour, type promotion and so on).


> It uses the mathematical division operator for something that doesn't behave in that way.

Floating point does not behave in a mathematical way, either.[1][2] Unless your language uses fractional representation using big integers for real numbers.

> that the pre/post increment aren't confusing

The operator and the retrieval of the variable are two separate operations that occur in the order that you read them. No less trivial than "I" before "E" except after "C".

[1]: https://en.wikipedia.org/wiki/Single-precision_floating-poin... [2]: https://en.wikipedia.org/wiki/Arithmetic_underflow


> Floating point does not behave in a mathematical way

It approximately does. The only common way that newbies get tripped up is something like 0.1 + 0.2 != 0.3. Maybe that is a case for not allowing the == operator to operate on floats (use a function/keyword instead), but I think that may be a step too far.
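That trip-up is easy to reproduce, and the usual workaround is comparing against a tolerance instead of using == (the tolerance below is arbitrary; pick one appropriate to your problem):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double sum = 0.1 + 0.2;

        printf("%d\n", sum == 0.3);              /* prints 0: not exactly equal */
        printf("%.17g\n", sum);                  /* prints 0.30000000000000004 */
        printf("%d\n", fabs(sum - 0.3) < 1e-9);  /* prints 1: close enough */

        return 0;
    }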


I don't think that is a step too far. When I was writing embedded code, == and != were prohibited for floating-point values. I support such a rule in general.


Years ago, I worked on software that dealt with computational geometry a lot, and I quickly found out the hard way that the equality operator is pretty much useless for floating point values.

On the other hand, that is a property of floating point numbers, regardless of what language one uses.


> On the other hand, that is a property of floating point numbers, regardless of what language one uses.

That's true, but, OTOH, having floating point (rather than an exact representation) be the default representation for decimal literals, and the only non-integer type supported by convenient operators, is a language-specific "feature" of C that many other languages don't share. (Though, to be fair, there are also many newer and popular languages that share both of those features, and more that share just the first.)


Yeah, and integer division behaves approximately like real division, too.


Personally, I like my addition associative, but you do you.

EDIT: Also I like my numbers reflexive.
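In case the quip is too terse: both properties really do fail for IEEE doubles, and it's easy to check (the constants below are just convenient demonstrations):

    #include <stdio.h>

    int main(void)
    {
        /* associativity: the grouping changes the result */
        double a = (1e16 + 1.0) + 1.0;   /* 1e16 + 1 rounds back to 1e16 */
        double b = 1e16 + (1.0 + 1.0);   /* 1e16 + 2 is exactly representable */
        printf("%d\n", a == b);          /* prints 0 */

        /* reflexivity: NaN is not equal to itself */
        double zero = 0.0;
        double not_a_number = zero / zero;
        printf("%d\n", not_a_number == not_a_number);   /* prints 0 */

        return 0;
    }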


> Nobody could say that the pre/post increment aren't confusing.

I will - its one of the simplest concepts and far from any of C's real problems.


Gotta love that the answer to the comment "pre/post increment might be confusing" is "Git gud".


Integer division is misleading.

For dynamic languages I kind of agree, but with static languages I think integer division is the right thing. int+int, int*int, int%int and int-int all return an int, so having int/int returning a not_int (what type should it return? should 4/2 return a different type than 2/4?) would be equally confusing.


Sounds like a lack of imagination!

I would suggest that it does not compile, and a different operator is required that makes it clear that it isn't 'normal' division. `//` is a good choice, except that is a comment in many languages. Maybe `_/` to indicate that it is a sort of 'floor division'.


Making int/int an illegal operation would solve some problems, but I doubt it would make things less confusing.


I think it would simplify things. All you need to remember is:

    / = normal division

    _/ = division then round down
Then the compiler enforces all the hard-to-remember rules, i.e. that you can't do normal division on integers. Floats would support both.


I like the Pascal way of using named operators like `div` (and `mod`, `xor`, `shr`, `shl`, `sar` etc.). Much easier to remember, read and type. Why does everything have to be cryptic symbols?

This also frees up those symbols for use in more frequently used constructs. How often do you actually use those bit fiddling operators nowadays?


I want / for division for the same reason that I want + for addition. (Pascal doesn't make you say 'plus'...)


Pascal has / for real division, but div for (truncated) integer division.


So it had + for "integer addition, but can overflow", and what for "real" addition?

My point is that separating "real" division from integer division is... a bit artificial? Not totally mistaken - there are differences - and yet calling one "real" is going a bit too far. All of these models of arithmetic suffer from the limitations of their representation. In all cases, you'd better be aware of what those limitations are...


C really is still a beautiful language.


Until you need to track down a pointer misuse on a UNIX variant that doesn't have something like AddressSanitizer available, and the customer is already pissed off that it has taken more than one week to track it down and they keep losing sales every time the system crashes.

A personal anecdote that took place sometime between 1999 and 2003.


In the same way that a chainsaw really is still a beautiful tool.


What an inspired sentiment. I love C (well I used to last time I used it), and I love chainsaws too.


No, C is more like a chef's knife. It will take your finger off if used wrong, it never should be used as a screwdriver, but in trained hands it's a precise tool for the right job.

It's C++ that is more like a chainsaw: a noisy, smelly, clattering but effective tool, dangerous without modern protective gear, and an equally poor choice for tightening screws.


Here's a fun blog post about this particular analogy: https://www.schneems.com/2016/08/16/sharp-tools.html


Wonderful article, thanks! Looks like it hasn't been discussed yet, so I submitted it here: https://news.ycombinator.com/item?id=13092740.

One of the things that I like about the chef's knife is that if you hold it correctly (http://www.seriouseats.com/2010/05/knife-skills-how-to-hold-...) the deep blade gives you both control and increased safety. People often think the bigger the knife the more dangerous, but often the opposite is true.

Chainsaws on the other hand scare me despite their utility, possibly because my grandfather (a very experienced chainsaw user) had a horrendous scar across his neck and face from being stitched back together after an unexpected kickback.


Yeah, but to really appreciate it I think one needs to have programmed in assembly. Unlike the author of the original article, of course.

C is going to look horrible if you've only ever used python. That's because you don't understand at all about real computers and how to program them.


Assembly was my second language and C my 3rd or 4th, but I use Python because I got tired of proving every single day that I Understand About Real Computers. At some point I decided to stop feeding my ego and start getting stuff done. I think of myself as dumb when it comes to C, but objectively I'm OK with writing multithreaded memory management stuff so I'm probably not exactly at the drawing with crayons stage. I can get those things right given a big enough time and money budget. It's just that I'd rather spend my days doing just about anything else.


I can understand both perspectives. I'm not really a professional programmer. I started as a kid programming basic on my Apple II, but only dabbled until a few years ago, when I really got into Python/ Javascript/ Ruby. Learning C after the fact has opened up a lot of understanding for me, it allows me to be more confident when I'm using easier languages like Python.


I've known a number of EEs who have spent much of their professional time writing assembly and C to run on embedded hardware. Perhaps oddly, despite being ideally placed for it, none of them has ever displayed the sort of "Real Programmers Don't Use Pascal" elitism you do here. I wonder why that is.


I did not say anything about "real programmers". This came from your own mind.


Not using the exact phrase isn't the same as not invoking the concept. Disingenuousness benefits no one, especially when it's obvious.


I can't help it if it invokes the concept inside you. I've programmed extensively in assembler and C, and I've programmed extensively in Lisp. I see the benefits of programming a real computer and the benefits of programming an abstract one. My point was merely that if you have not been exposed to the real computer you will not see why C is the way it is.


To be fair, one probably wouldn't go much beyond malloc lest they at least roughly knew how memory works, heap/stack, etc.


The word is "unless". Be careful with words like "lest" if you don't fully understand the meaning (it is not synonymous with "unless").

But no, trust me there are C programmers who use malloc and friends without knowing about the heap and stack. For those programmers it's just how you "get memory" in C. It's just part of the language for them (as opposed to part of the operating system).


I like the language itself, I just dread the tooling around it. If there were a better build system, easier integration of libraries, better strings and an overhauled standard library, I would start using it again.


It struck me during the "Modern C" discussion a day or two ago that C would be very well served by a good package and build system with dependency management (akin to npm, cargo, cpanm, gem, etc.). While there are some pathological cases (npm, with its insanely long dependency graphs... I once installed a small package that pulled in 53,000 files in dependencies), a massive shared library that everybody is using, contributing to, and testing is just incredibly valuable. There's a gazillion lines of C code out there, but finding/using/distributing it can be tricky. If it's not part of the standard OS distribution, in particular.

Make, while I quite liked it in its day, has some real limitations. cmake, while better on some fronts, doesn't actually solve the right set of problems (or, it solves the set of problems as they were perceived ~15 years ago).

I wish there were a strict subset of C (compliant with standards, but that prohibits the trickiest bits during build), with a good package/build system, a huge/modern library of high level functionality. I always enjoyed poking at C, but I just can't spend enough time coding in C to ever be really good at it. I can go years between working on C code, so I forget all the gotchas by the time I look at it again. It's not forgiving of casual users the way Ruby, Perl, Python, Go, and even (some) Java can be.

Then again, I guess Go is kinda that language for some classes of problem (not the very lowest level systems stuff, but I rarely do any of that kind of coding, anyway; I haven't touched a kernel module in a decade).


> a good package and build system with dependency management (akin to npm, cargo, cpanm, gem, etc.)

In most cases I don't even need a full build system. I do most of my hacking in Python and I often create small "throwaway libraries" to give my project a nicer structure. In Python you can do this by just creating a sub-directory and an __init__.py file. This takes 45 seconds and then you can easily "import lib_xyz" in your project. And if you need this library in another project you can just copy/paste the directory. (I know, this is not an ideal solution, but when I am hacking on some stuff it is a good way to do things because it takes no effort and brainpower.)


You can whip up a makefile in seconds to do simple stuff, too. That's not the problem I'm talking about; it's not that make is hard, it's that it is incomplete (and kinda hard, for advanced stuff).


Yes! A simple language that is easy to reason about, with tooling that doesn't resemble duct tape and black magic, a better stdlib and actual strings would be nice to have.

Some people say Golang is that language, but the tooling is actually not that great, and then there's the garbage collector...

Edit: forgot to add "sound type system", to me C feels like a hybrid between a properly strongly typed language and something scripty.


It's interesting that you say that, because my experience is just the opposite. I dread C. I dread all the simple ways that you can screw things up even without going into undefined behavior (which is its own can of worms). The only thing that makes C tolerable for me is that it's had literally a generation of people building tools and guardrails around all the stupid ways C can trip you up. Without modern debuggers, linters and valgrind, C would be nigh-useless as a programming language today. It'd simply be impossible to write a program of any reasonable complexity.


Sorry, I should have been clearer about which tooling I was talking about.

It's not so much tools like debuggers or linters, but more the build system and the compilation process. There is just so much cruft, and so many dangling bits and pieces from the last few decades, that the whole process is the complete opposite of streamlined and user friendly.

I do most of my programming in Python and for most projects it is enough to create a program.py file, import some stuff and then do 'python program.py'. If your Python program is not overly complex and only uses one of the common libraries then there is almost no visible build system which gets in your way.

If i haven't used Python in half a year it takes about 30 seconds to start programming again, but god forbid I haven't looked at make or cmake for half a year...


You dread the tooling _now_, an age when we're lucky enough to have a checker and sanitizer for every class of error we can make with C?


C is beautiful portable assembly (nothing more).


Not even that, at least with Assembly you know what you can count on, with C and UB not really.


Bashing C is like bashing English. Well, maybe Tolkien Elvish might have a better grammar system, but really come on now.


Natural languages and programming languages aren't really comparable. The former evolve and are beholden to their history, because you can't tell people how to speak. The latter are user interfaces, and to suggest that the only meaningful criterion in user interface design is that it should match what previous designers have done is just self-evidently silly.


> The former evolve and are beholden to their history, because you can't tell people how to speak

Think about that for a moment in the light of Python 2 - Python 3 or K&R C - C99 or original C++ - C++14


"The former" refers to natural languages, not programming languages. What you cite is one of the primary reasons I called the two classes incomparable. If programming languages worked like natural languages do, Python 3 wouldn't have been controversial, or exist.


> you can't tell people how to speak

you can't tell machines how many bits to use.


Actually if you're the engineer designing the hardware, you most certainly can tell the machine how many bits it's going to use.


You seem to be making the assumption that English is as popular and widely used as C. I would like to remind you that it is only the case in western parts of the world, which is by far, not the majority.


I think it's a reasonable assumption. The people who can speak English, whether as a first, second or foreign language, is about 1.5 billion.[1] Comparable to Mandarin speakers.[2] So, about 20% of all humans.

What proportion of developers can write C at an analogous level to first, second or foreign language? 20% seems like quite a good ball park estimate, from my (anecdotal) experience.

[1] https://en.wikipedia.org/wiki/English_language

[2] https://en.wikipedia.org/wiki/Mandarin_Chinese


Note that the numbers for English are from 2006, the numbers for Mandarin (960 Million) are from 2010 and they lump together all the different (not mutually intelligible) dialects of Mandarin.


Noted, but it doesn't detract from my point about C. The number of Mandarin speakers was only to refute the OP's assertion that an anglocentric view of the world biases the opinion of how popular English is as a world language; this is not the case, it is genuinely popular.


I think it's actually more popular (relatively speaking) than your numbers suggest because of the points I mentioned.


Indeed :)


Everyone in China learns English now, and a lot of the Far East has English (or Panglish, Singlish etc) as a major language. And it's pretty popular in India.

Not saying it's perfect and universal, but it is pretty widespread even outside "The West".


Not quite. A lot/most of Chinese have only learned Chenglish. That's not a criticism; it's an observation.

IME, Indians and Germans who use English are generally extremely proficient in it.


From the original article: " Which is why we still, today, have extremely popular languages maintaining compatibility with a language from 1969 — so old that it probably couldn’t get a programming job."

Yikes! How about "old enough to have a crapton of experience I'd really like to learn from/integrate with my company"? I know there is ageism in this industry (towards both people and tools), but sometimes the cavalierness with which people throw out the idea that "old == bad" and "good == new" comes as a shock.

And besides that, shouldn't we be proud of half a century of stability and compatibility? Isn't "well-designed systems that stand the test of time" a holy grail to reach for, rather than something to poke fun at?


Are people actually boggled by the "++" operator? If they can barely wrap their minds around that, why are they even in programming? Seriously!


There are two typical use cases for `++` and they should be considered separately.

Firstly there is simply the `i++`, which is redundant with `i+=1`, but it's also very easy to understand syntactic sugar. Keep it or leave it, no big deal.

Then there is the more advanced and very subtle use of the return value:

  int i = 3;
  while (i--) {
      printf("%d, ", i);    // What does this print?
  }
This one is very controversial. On one side, it simplifies and shortens code by a lot, but it also makes the code say too much at the same time. Can you tell at a glance what this code would do? Personally, I prefer longer, more expressive code, but I can also cut corners with the short version from time to time, when quickly testing something out.

EDIT: Fixed code formatting


I figured it would print 2, 1, 0, and was right, yay for me I guess. However my confidence was low.

I agree that you can write confusing code with ++, but that doesn't mean it should be banned! (Like they did with Swift). The ++ operators still are very useful. Also there are arguments that they're more correct than +=1 operations. I see ++ as meaning, "go to the next thing."

For example, the following operation sort of doesn't make sense:

  char c = f();
   c += 1;
Why are we adding an integer to a char? If you really wanted to be type safe, it should be

  c += '\001';
So you're adding a char to a char.

Same goes with pointers. They're memory addresses... but what happens if you add 1 to a memory address? Should you get the address plus one, or should you step over to the next object?
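C answers that question with "step to the next object"; if you want byte-level movement you have to say so explicitly with a cast. A quick illustration (addresses will vary):

    #include <stdio.h>

    int main(void)
    {
        int values[2] = {123, 456};
        int *p = values;

        printf("p        = %p\n", (void *)p);
        printf("p + 1    = %p\n", (void *)(p + 1));          /* next int: sizeof(int) bytes further */
        printf("byte + 1 = %p\n", (void *)((char *)p + 1));  /* next byte: 1 byte further */

        return 0;
    }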


The issue I have with it is not evident in your code, but add another printf after the while and it becomes slightly more problematic: the final printf prints -1, because the decrement in the condition still runs on the last, failing test.

  int i = 3;
  while (i--) {
      printf("%d, ", i);    // What does this print?
  }
  printf("%d, ", i);


Nitpick: i += 1 is equivalent to ++i, not i++. i += 1 returns a value, too.


    2, 1, 0
Your code is just syntax sugar for assembler code. If you think in assembler instructions, you will have no problem with it.


Nope!

    2, 1, 0, %
where % is... I'm not sure what. What is that exactly?


It's probably your shell.

If you're using zsh:

> When a partial line is preserved, by default you will see an inverse+bold character at the end of the partial line: a ‘%’ for a normal user or a ‘#’ for root. If set, the shell parameter PROMPT_EOL_MARK can be used to customize how the end of partial lines are shown.

See http://zsh.sourceforge.net/Doc/Release/Options.html#Promptin...


I got `2, 1, 0,`

I read the instruction as: "Set i to 3. Subtract 1 then print until false."

    i = 3
    3 - 1 = 2, print
    2 - 1 = 1, print
    1 - 1 = 0, print
    0 = false, stop
I see a lot of people complain about using `--` like that but I fail to see what is unintuitive about it if you actually read the code and see what i is initialized as. =\


Ugh, never mind. I was misinterpreting that the shell was saying. See? Told you I'm dumb.


I'm glad it's the first time I see that code, instead of a usual for loop.

It doesn't compile in C++. Conditions have to be booleans. i-- is an integer :D

In C, 'true' is anything other than 0; there is no proper boolean type, so i-- being an integer is perfectly okay for a condition, and the condition is equivalent to "while (i != 0)".

There might be some variations, errors and warnings depending on the compilers, the strictness level and the revision of the language chosen.


This is completely false, it compiles in C++ fine. Are you thinking of Java?


Well, it shouldn't. It's possible that you have to give a flag to the compiler to error on that kind of implicit casting. It should at least give a warning for sure.

I am definitely talking about C++


Compiles fine with -Wall -Werror. No warnings.


    int x = 1;
    bool y = x;

    # cl.exe /W3 main.c
This code gives a warning on VS2012. But it doesn't give one when the cast is in a loop condition. That is weird.

http://stackoverflow.com/a/31552168/5994461

This stackoverflow message talks about the specs for C11, and the first comment adds information on the C++03 spec. It seems that implicit cast from integer to boolean is allowed... under all circumstances... depending on what specification the compiler is following :D

For future reference, I'll just summarize this as "C and C++ are minefields". We'll just add that to the list of WTF behaviors.

By the way, if you think "C has had bool for 19 years" [the C99 spec specifically], you clearly haven't worked in C for long enough with a large variety of tools. The world is bigger than just GCC.


> But it doesn't give one when the cast is in a loop condition. That is weird.

I believe that the justification for that is that you'll often want to do e.g.

    while (node) {
      node->val += 3;
      node = node->next;
    }
Implicit conversion of a type into a bool is pretty useful here, or for e.g.

    while (std::cin >> x >> y) { ... }


I imagine the warning is following the rules for explicit constructors/operators in C++: an if condition is considered an "explicit" call to an `operator bool`. http://en.cppreference.com/w/cpp/language/explicit


The problem with ++ is not understanding what it does. It's remembering exactly how to use it appropriately in the given context, and never making a mistake. And the effect is multiplied by C's semantics, specifically, given that off-by-one pointer shenanigans can lead to Very Bad Things. So, a tricky-to-use thing that blows up big when it does blow up.

To make an analogy, I can perfectly understand null and what it does. My objection to null as a language feature is not based on not understanding it and what it does.


Have you heard of students learning to program in school? Seriously!


That's what Scratch is for!


Scratch++


Some of the issues defended here could be avoided if operator overloading weren't so prevalent.

"++ is also used for iterators" Python has "next". I understand there's "*p++", but the semantics of ++'ing an integer or ++'ing an iterator are very different! Why should they share the same mechanism? (There's an argument that ++ on integers is iterating over N. Pedantically true)

"Integer math is useful!" Floating point semantics are different, so why not use different operators for such? Caml has + for its, +. for floats. Impossible to mix different semantics together implicitly.

--

I find saying "whitespace sensitivity makes auto-indentation impossible" to be a bit silly. The equivalent of auto-indentation in Python to C is auto-brace insertion! Just as impossible.

There are definitely parsing difficulties with whitespace sensitive languages (technically, lexing difficulties). If you're willing to go for parenthesized function calls like in Python, it's not too hard. But go for Haskell-like function application and you're in for a treat! Would not want to have to parse Scala.

I can see the argument to braces being better, though. It's such a subject of taste that if you don't want to be opinionated, going for explicit blocks is really your only choice.


> What’s Wrong with Increment/Decrement?

This sort of misses the point there. What eevee was saying was that "Usually you use ++ to mean +=1". It is equivalent to +=1 in those cases.

Eevee then goes on to say "The only difference is that people can do stupid unreadable tricks with ++.". She is talking about postfix/prefix there. She never said that ++ and += are exactly the same; she's saying that they're mostly the same, except for some cases, which often lead to more unreadable code. Using i++ or ++i as an expression within a larger expression often leads to unreadable code.

The author here then goes on to talk of off by one errors, but ++-as-expression is what causes a lot of off by one errors.

https://news.ycombinator.com/item?id=13089663 explains this a bit too.

---

The other two points also miss the point a bit. The post is not "why C is bad". The post is "Let's stop copying C". There are good reasons behind C having many of these features. These reasons do not necessarily port to other languages; yet they copy the feature.


In college (over 10 years ago) I used to think C was very viable. Sure, it took longer to develop, but the reward was fast execution and generally KISS-like code (an anecdotal observation I have of C is that the tedium of writing complex things forces simpler solutions... I mean, who wants to malloc over and over).

But near the end of college I had to write code using pthreads. It was awful. It was exceedingly painful. Banging on the keyboard cussing continuously painful.

Maybe it was just pthreads (I'm sure there are nicer libraries) or my stupidity but that exercise killed my mild liking of C.

Languages I like are heavily expression based but don't require braces (as Algol-like languages do) nor parentheses (as Lisps do). As much as I dislike braces and parentheses (for blocks), I despise statement-heavy languages more (sadly, Python). I wish the "Stop Copying C" article had mentioned that (or maybe I'm alone in that opinion).


> But near the end of college I had to write code using pthreads. It was awful.

For what it's worth, I don't think it's C's pthread library that is painful, but POSIX threading in general. Languages and libraries try to "simplify" threading by re-inventing pthreads. Eg: python's 'threading' module, erlang in general, java 'runnable' interface, golang's 'coroutine' thread manager...etc.

The difference between all those language-specific threading implementations and POSIX threads is that POSIX threads work across every language, while those implementations are only relevant to the language itself. Working with a "one size fits all" (pthread) tool is inherently more complex than a simplified tool specific to one language.

Summary: once you learn POSIX threading [well] in C, using other pthread abstractions in other languages becomes much easier.
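For anyone who hasn't touched raw pthreads: the core create/join API itself is small; it's the sharing and synchronisation around it that hurts. A minimal, self-contained sketch (error handling omitted; compile with -pthread on most systems):

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg)
    {
        int id = *(int *)arg;
        printf("hello from worker %d\n", id);
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[4];
        int ids[4];

        for (int i = 0; i < 4; i++) {
            ids[i] = i;
            pthread_create(&threads[i], NULL, worker, &ids[i]);
        }
        for (int i = 0; i < 4; i++)
            pthread_join(threads[i], NULL);

        return 0;
    }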


If I recall correctly, it wasn't that I didn't understand the concepts, but rather the difficulty of debugging. At the time I was ignorant of and inexperienced with proper tooling. Today when dealing with C I would probably not even bother with threads and might just use multiple processes.

I agree on the reimplementing POSIX threading and despite my rough experience pthreads was worthwhile learning about.


Don't blame C for the pain of using Pthreads. Pthreads is a low-level, fiddly library for shared-memory concurrency. Sometimes it's necessary, but often there's an easier model for using multiple cores (data parallelism via OpenMP) or dealing with asynchronous events (some sort of message-passing).


I 100% agree with this article. I'm a student that has learned Python, Java and C - using each depending on what I need to do.

Having taught as well, I'd say Python is not a language for beginners. They are punished for getting their spacing wrong, often confuse types, OO seems to have been an afterthought, and versions 2 & 3 are completely different languages.

C can also be confusing for beginners, but not for the reasons mentioned. Teaching people about pointers spins heads for the first time.

Java for me seems to be the middle ground. Good understanding of OO, portable code, solid types, well-formed errors, beautiful garbage collection, and everything well thought out (libraries, types, access, concurrency, etc.).

C is obviously an advanced language, but like PHP there is a good reason why it's not ready to be buried yet.


One of the things (among many) that does make python a good language for beginners is its short developer feedback loop. The pleasure of programming is important, so even if they're writing terrible code at first, there's a strong motivation to keep going. Along the way their skills develop.


I agree, but this is achievable in other languages in the same way it is for Python - simply output something interesting via a package or library. First practical we have students drawing shapes in Java and crudely animating them.


Regarding integer division, the Wikipedia article about modulo operation has nice graphics.

https://en.wikipedia.org/wiki/Modulo_operation
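One detail the graphics make obvious: languages disagree about % for negative operands. Since C99, C truncates the quotient toward zero, so the remainder takes the sign of the dividend, whereas Python floors. A small check:

    #include <stdio.h>

    int main(void)
    {
        /* C99 guarantees a == (a/b)*b + a%b, with a/b truncated toward zero */
        printf("%d %d\n",  7 % 3,  7 / 3);   /* prints:  1 2 */
        printf("%d %d\n", -7 % 3, -7 / 3);   /* prints: -1 -2 (Python gives 2 and -3) */

        return 0;
    }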


Is there any room for a form of C with immutable semantics which forced you to jump through special hoops to get mutable data access (solely for performance reasons)... Or does Rust basically have this design goal already?

The limitation of programming (from a bug-minimization perspective) is the programmer's ability to understand all the possible states his code and data can get into, which means that the key to eliminating bugs and security holes is any pattern which reduces cognitive load.


I'm not sure I'd characterize it as "jumping through hoops", but Rust does lean strongly towards preferring immutability. Variables are immutable unless you give them the "mut" keyword when you declare them, and in cases where you need multiple pointers to the same data, it's impossible (outside of `unsafe` blocks, of course) to make one of them mutable unless you use a safe wrapper type like a `Cell` or `RefCell` (for single-threaded cases) or a `Mutex` or `RwLock` (for when you need to share access across multiple threads). But yeah, there's a strong bias towards immutability in Rust, although from what I understand this is more for safety than performance reasons.


I'm struggling to understand how requiring explicit mutability has anything to do with performance or "hoops."

I mean, getting mutable access is trivial:

  let mut x = ...
  fn frobnicate(x: &mut...)
The performance "problems" in Rust requiring "hoops" are in very narrow applications (e.g. certain applications of high performance numerical computing). I'm not a fan of the "unsafe-offset" idiom required so that every time access into a vector occurs doesn't waste cycles on a pointless boundary check (for example).

But I must say for a given chunk of Rust code the equivalent C++ code is generally more bloated, because unless you're writing code begging for abuse you're going to have "const" and move operations everywhere, which is sort of inverted from Rust's model.


Bounds checks are often elided, so many accesses of vectors won't waste cycles on pointless checks. Or at least, if they are truly pointless and they're not elided, that's an LLVM bug.


Well, you can sort of achieve that by writing C for a C++ compiler and sprinkling it with 'const'. Use of the standard library can then be inconvenient in places.


I don't write huge systems with dozens of other people. So I can get away with basically "almost everything is a state machine" in C. The parts that are not a state machine are encoded patterns that are matched against in tables, for simple command parsing and the like (for when an FSM is overkill).

The only cognitive load is generating test vectors, and I use scripting languages for that.


> The only thing I can suggest you to do is to actually go program in C for some years, write some good software, and you will see what I mean.

(from the OP's comments to another reader)

Allow me to translate: Ok, let's stop bashing C. Let's start bashing people who have legit complaints about C.

I don't think the above comment was warranted. Someone took the time to read your arbitrary article and attempt to offer a meaningful point and your response was to demean the fellow.


And of course engineers and architects are blameless in this - we would not allow for gradual replacement, no, sir, we need a revolution, tear down this wall and build a brand new castle in the sky. Nobody wants to weld new things to a rusty compiler, until the compiler is all rusty but still fully able to bind all the legacy inertia to its will.


I've always been a huge fan of the increment and decrement operators. I don't use them all the time, but they're very nice to have on occasion, and with operator overloading they can be made incredibly useful. I thought it was a very silly decision for Swift to remove them.


Here are some reasons that Python removed them: http://stackoverflow.com/questions/1485841/behaviour-of-incr...

Edit: a nice write up by Chris Lattner: https://github.com/apple/swift-evolution/blob/master/proposa...


It's very unfortunate that attempting to use the preincrement operator in Python results not in a syntax error, but in code that silently does something very unexpected: ++x parses as two unary plus operators, so it compiles and does nothing. (Postincrement at least fails to parse.)


I've seen these pixels before. C is a permanent fixture. It is not going anywhere in your lifetime. You can use it, or not.

http://www.joshianlindsay.com/index.php?id=170


Given that the original article that this replies to starts with "It [C] works well for what it is, and what it is is a relatively simple layer of indirection atop assembly.", I'm not sure "bashing" is exactly accurate.

Given that this reply also starts with "To begin with, I agree with most of the things Eevee wrote", I'm not sure "Let's stop X" is the right title to use, even if it matches the theme of the original article.

The reply itself is... a decent alternative viewpoint, to a few of the points made, but I don't think it quite matches the title. And now some counter-counter points:

> What’s Wrong with Integer Division?

This is the wrong question. The right question is: what's wrong with integer division if and only if both arguments are integers, while being ambiguously conflated with floating point division in syntax? I'm sufficiently used to it not to find it a big deal, but that's a poor argument for its merits in new languages.

Nobody thinks integer division shouldn't be an option.

> However, the behavior of integer division is very useful. I would argue that in most cases, one expects it to behave just as it does.

I've seen even professionals trip up over the semantics of this just regularly enough to be annoying - you read float a = x/y; and assume terrible things about x and y, things that were perhaps even once true. Even if you really did intend to round, you probably assumed 'round down' semantics and will have a bug to contend with when x<0, since C truncates toward zero and therefore rounds up instead.

Personally, I haven't found it that difficult to adjust to languages that have a separate integer division operator, or languages requiring an explicit rounding operation, despite a long history of only using languages with C's semantics before them.
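For the record, a concrete version of that trap (standard C99-or-later semantics, nothing compiler-specific):

    #include <stdio.h>

    int main(void)
    {
        int x = 7, y = 2;

        float a = x / y;          /* integer division happens first: a == 3.0 */
        float b = (float)x / y;   /* what was probably intended: b == 3.5 */

        printf("%.1f %.1f\n", a, b);   /* prints: 3.0 3.5 */
        printf("%d\n", -7 / 2);        /* prints: -3 (truncated toward zero, not -4) */

        return 0;
    }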

> Unfortunately, Eevee seems to fall for the extremely common “++ is equivalent with += 1” fallacy. You can see it’s a fallacious statement when, in the end, even the author herself admits that there are things that can’t be implemented in terms of “+= 1“; for instance, incrementing non-random-access iterators.

That Eevee herself points out that "++" is not always "+= 1" makes it quite clear that she hasn't "fallen" for any such fallacy. The point is that even including other unmentioned (ab)uses of overloading, there is always a way to rewrite things in an equivalent manner in terms of "+=", "-=", or custom functions - combined with either the abuse of the comma operator, or multiple statements. For the mentioned case of iteration - plenty of other languages have a separate, named function for iteration, even if they allow "++" and "--" to be overloaded.

The only real point of debate here is how often the "++" and "--" operators produce concise code, versus how often they produce write-only code.


It seems to me that a lot of issues could be better handled if we weren't limited to programming in a subset of the ASCII char set. Unfortunately, the alternative is to completely retool the modern keyboard.



Only if C stops bashing us first- it started it.


I will stop bashing C when C code looks as beautiful as Python code.


Of all the problems C has, looking ugly isn't one of them, and most of the ugliness would be done away with if the language adopted a proper module system and C++ "constexpr". The Linux kernel in particular is quite nice to look at: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux....


Let's stop bashing bashing.


I'd prefer to keep bashing bash, if possible.


Hey, don't mock the wailing parade around the C monument. This must be its 300th cycle and it's still going strong.

Helping the inertia along by self-sabotaging with over-idealistic, non-backwards-compatible languages, the software architects and computer scientists lead the procession.

Followed they are by self-flagellating programmers, trying to figure out who beat them to it again and again.

Followed they are by the silent businessman parade, who search for the perfect compromise between throw-away code and reusability on hardware where the solder is not even stable yet.

All hail the holy C, all hail the mighty procession of pain.



