
Wait; what? If C is not a Low-Level Language, then what is a Low-Level Language?

"The features that led to these vulnerabilities, along with several others, were added to let C programmers continue to believe they were programming in a low-level language when this hasn't been the case for decades."

Now C is again the root of all evil...

But I'm afraid that's not right: all those CPU optimizations (branch prediction, speculative execution, caches, etc.) are not tied to any specific language.

They have been designed to make existing programs run faster; if our whole software stack were written in Java, Lisp, or PHP, I think that on the hardware front, most of the same decisions would have been made.



> Wait; what? If C is not a Low-Level Language, then what is a Low-Level Language?

Assembly, actual machine code. (Contrary to the article, C was never a low-level language: when it was younger, it was literally a textbook high-level language, because it allows abstracting from the specific machine. While it's less likely to be what a textbook points to as an example today, that hasn't changed.)


Following the metrics of the article, assembly language isn't low level either. Assembly only gives you access to 16 integer registers and the 16 (?) SSE/AVX SIMD registers on x86_64; it doesn't give you access to the 64 or so physical integer registers, or the who-knows-how-many physical SIMD registers. Assembly instructions do not map to uops, no matter how much we pretend they do; we couldn't even program uops if we wanted to. These instructions are not executed in the order we specify them, and some of them are not executed at all: modern CPUs have their own dead-code detectors and will drop instructions if they feel like it.

Assembly language programmers have less control over the microcode than raw JVM bytecode programmers have over the x86_64 instructions that eventually get executed.


Right, but that's the hardware interface. The CPU consumes a compressed instruction stream. The compression is achieved by the compiler via a lossy mapping of the program's unbounded virtual registers onto a finite architectural register set. This stream is then re-inflated by the CPU: register renaming undoes the false dependencies in the interference graph that the mapping introduced, and caching cleans up the spills.

If this seems absurdly complex, it might be because of the absurd complexity. But the alternative has been tried, and tried, and tried (RISC, VLIW), and it has always been a failure. Well, fuck.


I mean... sure?

But what's the point? What low-level languages are there, then? The linked article argues that C isn't low level because modern CPUs behave so differently from what their hardware interface suggests. If we accept this argument, then assembly language isn't low level either, because it suffers from all the same limitations. And if assembly language isn't low level, then why is "low level" even a phrase?

My point is that if you're going to argue that C isn't low level, then it's hard to argue that assembly language is. Conversely, if you're going to argue that assembly language is low level, it's hard to argue that C isn't. So it's flippant to argue (in this thread) that assembly language is low level without also rebutting the article, or coming up with a persuasive argument as to why C shouldn't be lumped in with assembly language.

Personally, I think the article is wrong: C is low level. It is useful to distinguish between C and Python by saying C is low level and Python is high level. It is a useful mental model, therefore I'm keeping it. But if people in this thread are going to argue both that the article is correct and that assembly language is low level, they'll need to justify that fairly strongly.

I'm also not arguing that RISC or (ew) VLIW are the answer.


> The linked article is arguing that C isn't low level because modern CPUs behave so differently than what their hardware interface suggests they do.

Which is right in its conclusion, but wrong in its reasoning.

C isn't low level because, by design, it allows writing code that works on very different hardware interfaces by abstracting away from what the particular machine does, independently of whether or not the CPU behaves the way its interface suggests. This is why, decades ago, C was a textbook example of an HLL, and nothing relevant to that description has changed in the intervening period.

> It is useful to distinguish between C and Python by saying C is low level and Python is high level.

Python is in the general class of languages for which the term "very high-level programming language" was created, and, yes, it's useful to distinguish between Python and C (hence the term coined for that purpose), but it's also useful to distinguish between assembly and C (hence the terms coined for that purpose).


> allows writing code that works on very different hardware interfaces by abstracting away from what the particular machine does

Looks at an 8/16-bit in-order processor with synchronous, byte-at-a-time memory access and perhaps 1 Kbit of on-chip registers in total.

Looks at a 64-bit, out-of-order, speculative, multicore behemoth with a 64- (or 72-)bit data bus accessed by an embarrassingly complicated asynchronous protocol, cached in multiple MB of on-die RAM, with dozens of general-purpose registers and hundreds if not thousands of special-purpose or model-specific registers.

Looks at QEMU and other x86 interpreters.

So what you're saying is that x86 assembly is a very bad high-level language?


"Itanium was in-order VLIW, hope people will build compiler to get perf. We came from opposite direction - we use dynamic scheduling. We are not VLIW, every node defines sub-graphs and dependent instructions. We designed the compiler first. We build hardware around the compiler, Intel approach the opposite." https://www.anandtech.com/show/13255/hot-chips-2018-tachyum-...


> Assembly language programmers have less control over the microcode than raw JVM bytecode programmers have over the x86_64 instructions that eventually get executed.

Can you expand on this a little? There is no compilation that happens for the assembly code as far as I am aware. Wouldn't that execute all of the code serially? I am not an expert in this domain, just curious.


> There is no compilation that happens for the assembly code as far as I am aware. ... Wouldn't that execute all of the code serially?

Nope, and that's largely what the article is getting at. Modern x86 processors optimize the execution of x86 machine code so heavily that they quite literally 'compile' it down to what are called micro-operations (uops), and those micro-operations are what the CPU actually executes. And it goes beyond that: because the x86 machine code doesn't really map to the processor's actual implementation, the CPU does extra things like register renaming, where it dynamically maps the 16 or so registers exposed in the machine code onto, say, 64 or 128 internal registers (so an instruction like `inc %eax` may actually write the incremented '%eax' to a completely new internal register rather than modifying the existing value, with that new internal register becoming the new `%eax`).
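
A minimal sketch of the false-dependency idea (hypothetical and simplified; it assumes the compiler keeps `tmp` in a single architectural register such as `%eax`):

    #include <stdio.h>

    int main(void) {
        int a = 1, b = 2, c = 3, d = 4;
        int tmp;

        tmp = a + b;      /* suppose tmp lives in %eax */
        int x = tmp * 2;  /* reads that %eax */

        tmp = c + d;      /* reuses the name %eax with an unrelated value */
        int y = tmp * 2;  /* reads the "new" %eax */

        printf("%d %d\n", x, y);
        return 0;
    }

The second half conflicts with the first only over the register *name*, not over any value. Renaming gives each assignment its own physical register, so the CPU can run both halves in parallel.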

And it uses all of this to aggressively execute the machine code out of order, by seeing which instructions depend on which other instructions and determining which can be executed out of order without affecting the end result. The point of doing this is that lots of things can stall the processor, the big two being branching and fetching memory (either from cache or from main memory). The CPU is much faster than memory, and even than cache, so any trip to either of those causes a big performance hit; but if you can continue executing instructions during that time (because they don't depend on that memory), you get a lot more performance.
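
To see the effect in C, compare one dependency chain against two (a hedged sketch; the function name is made up, and note that reassociating float additions can change rounding):

    #include <stddef.h>

    /* An out-of-order core can overlap the sum0 and sum1 additions,
       because neither chain depends on the other. A single-accumulator
       loop would serialize on one long chain of additions. */
    double sum_two_chains(const double *a, size_t n) {
        double sum0 = 0.0, sum1 = 0.0;
        size_t i;
        for (i = 0; i + 1 < n; i += 2) {
            sum0 += a[i];      /* chain 1: depends only on previous sum0 */
            sum1 += a[i + 1];  /* chain 2: depends only on previous sum1 */
        }
        if (i < n)
            sum0 += a[i];      /* odd-length tail */
        return sum0 + sum1;
    }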

Branching effectively stops out-of-order execution at that point, because the CPU doesn't know which instruction will be executed after the branch. The CPU can do 'branch prediction', however: it guesses the outcome of the branch and keeps executing from that point while waiting for the branch to be resolved. If the guess was right, there is no delay. If the guess was wrong, the speculative work is thrown out and execution restarts from the correct location.
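
The classic way to see this (a sketch; run it over a large array of random bytes, then sort the array and run it again):

    #include <stddef.h>

    /* On random data the branch below is taken unpredictably, so the
       predictor misses often and each miss flushes speculative work.
       On sorted data the outcomes form one long predictable run of
       "not taken" followed by "taken", and the identical loop
       typically runs several times faster. */
    long count_big(const int *a, size_t n) {
        long total = 0;
        for (size_t i = 0; i < n; i++) {
            if (a[i] >= 128)
                total += a[i];
        }
        return total;
    }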

Note that, generally speaking, none of these are bad things by themselves. I would even argue they're great things, and adding such features to a processor is somewhat inevitable if you want to retain decades of compatibility like we have. But it has arguably resulted in hardware bugs like Spectre and Meltdown, though I would argue it's a lot more nuanced than the article implies. And none of this really has to do with C; we're only talking about x86 assembly (which exists in the way it does almost purely for backwards compatibility).

Intel and AMD do not expose the micro-operations in any form, which prevents a lot of what the article is talking about. But at the same time, you could easily argue that's a good thing: if they did, they would either need to support whatever form they expose for the next decade (and eventually end up with a different set of weird optimizations to boost performance while maintaining compatibility), or you'd have to compile different versions of your code for every new CPU (which would be a disaster).

Edit: I left out one more relevant detail (which I'm only including because the article talks about it a fair amount): the CPU requests memory in chunks called 'cache lines', usually 32, 64, or 128 bytes in size. This means that whether the CPU already has a particular piece of memory when your code is executing is a more complicated question: if multiple parts of your code reference memory within the same cache line, they will be a lot faster, since they require only one memory fetch. And code that has no branches will all be in the same cache line (or consecutive cache lines), which makes out-of-order execution simpler, since all of the code is already fetched. On top of that, there's a complex process for ensuring consistency of cached memory across multiple cores/CPUs. Older CPUs didn't bother with any of this because memory was fast enough to simply be read/written on demand without slowing the CPU down, so the x86 instruction set (generally) acts as though you're reading/writing directly to main memory, without any cache, and it's up to the CPU to maintain that illusion.
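
A minimal sketch of the cache-line effect (assuming 64-byte lines; the function names are made up):

    #define N 1024

    /* Both walks touch the same N*N ints. The row-major walk uses every
       int in each cache line it fetches; the column-major walk uses one
       int per fetched line (for large N), so it generates far more
       memory traffic and is typically several times slower. */
    long sum_row_major(int m[N][N]) {
        long s = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += m[i][j];  /* consecutive addresses: same line */
        return s;
    }

    long sum_col_major(int m[N][N]) {
        long s = 0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += m[i][j];  /* stride of N ints: a new line almost every access */
        return s;
    }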


Mapping to uops is a trivial translation that hardly counts as compilation. Everything else is dynamic scheduling and speculation, which is also not compilation, as it is (mostly) data-dependent.


> Mapping to uops is a trivial translation that hardly counts as compilation.

That's fair, but now we're just arguing the semantics of what is and isn't compilation :) I understand your criticism, though; it's just a 'translator'.


> Following the metrics of the article

The metrics of the article may describe a useful distinction, but it's not really the one the language levels terminology was designed to capture, though it is not too distantly related.


> Assembly, actual machine code

Actually, the author is effectively arguing that x86 assembly is not low-level. Which is somewhat true, but none of the levels under x86 assembly are exposed to the programmer, for the most part.


Microcode. Back when C became a thing, microcode only existed on 'big iron'. That changed 20+ years ago.

The microarchitecture is the 'real' architecture you're running on; the ISA that assembly language and C code are written against is a facade. It has value, in that we don't need to rewrite everything every few years when the microarchitecture changes, but the downside is that what we consider low-level programming languages talking 'directly' to the hardware are now going through another layer of abstraction.


A matter of perspective.

"In a low-level language like C..." - applications programmer

"In a high-level language like C..." - chip designer


Only marginally lower, but a common example is Ada.

You can actually describe hardware registers sanely and portably in Ada. You cannot do that in C.

(It obviously still works, because C is ubiquitous, and so processor and compiler vendors try their hardest to "make it work", but that's no accomplishment of C.)
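
For contrast, the usual C idiom looks like this (a hedged sketch with a hypothetical UART at a made-up address; Ada would instead pin the layout down with a record representation clause):

    #include <stdint.h>

    /* A volatile struct overlaid on the register block. It works
       because vendors make it work, but little of it is guaranteed by
       the standard: struct padding, the width of the generated
       accesses, and especially bit-field layout are all
       implementation-defined. */
    typedef struct {
        volatile uint32_t data;    /* offset 0x0 */
        volatile uint32_t status;  /* offset 0x4: bit 0 = ready (assumed) */
    } uart_regs;

    #define UART0 ((uart_regs *)0x4000C000u)  /* hypothetical base address */

    static int uart_ready(void) {
        return (int)(UART0->status & 0x1u);
    }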



