Having more options available in the Linux kernel is always beneficial. However,...

tialaramex · on July 15, 2024

> But you can't do everything that C does without using unsafe blocks

For this particular work the huge benefit of Rust is its enthusiasm for encapsulating such safety problems in types. Which is indeed what this article is about.

C and particularly the way C is used in the kernel makes it everybody's responsibility to have total knowledge of the tacit rules. That cannot scale. A room full of kernel developers didn't entirely agree on the rules for a data structure they all use!

Rust is very good at making you aware of rules you need to know, and making it not your problem when it can be somebody else's problem to ensure rules are followed. Sometimes the result will be less optimal, but even in the Linux kernel sub-optimal is often the right default and we can provide an (unsafe) escape hatch for people who can afford to learn six more weird rules to maybe get better performance.
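To make that concrete, here is a minimal sketch of a rule ("the index must be in range") encapsulated in a type, with an unsafe escape hatch next to the safe default. `Table`, `Slot` and `get_raw` are hypothetical names invented for illustration; nothing here is taken from the kernel's Rust code.

```rust
/// Hypothetical fixed-size table; the name and API are illustrative only.
pub struct Table<const N: usize> {
    slots: [u32; N],
}

/// An index that was checked to be `< N` when it was constructed.
#[derive(Clone, Copy)]
pub struct Slot<const N: usize>(usize);

impl<const N: usize> Slot<N> {
    /// The only constructor: the rule "index < N" is checked once, here.
    pub fn new(i: usize) -> Option<Self> {
        if i < N { Some(Slot(i)) } else { None }
    }
}

impl<const N: usize> Table<N> {
    pub fn new() -> Self {
        Table { slots: [0; N] }
    }

    /// Safe default: an out-of-range index cannot even be expressed.
    pub fn get(&self, slot: Slot<N>) -> u32 {
        // SAFETY: `Slot::new` is the only constructor and guarantees `slot.0 < N`.
        unsafe { *self.slots.get_unchecked(slot.0) }
    }

    pub fn set(&mut self, slot: Slot<N>, value: u32) {
        // SAFETY: same invariant as in `get`.
        unsafe { *self.slots.get_unchecked_mut(slot.0) = value; }
    }

    /// Escape hatch for callers willing to uphold the rule themselves.
    ///
    /// # Safety
    /// `i` must be less than `N`.
    pub unsafe fn get_raw(&self, i: usize) -> u32 {
        unsafe { *self.slots.get_unchecked(i) }
    }
}
```

Callers of `get` and `set` never need to know the bounds rule exists; only the author of `Slot::new` and anyone who opts into `get_raw` does.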

mjburgess · on July 15, 2024

> That cannot scale.

lol... you're talking about the Linux kernel, written in C.

The vast majority of software over many decades "bottoms out" in C whether in VMs, operating systems, device drivers, etc.

The scale of the success of C is unparalleled.

pjc50 · on July 15, 2024

The scale of C adoption is certainly unparalleled over the past 40 or so years, but so are the safety issues in the cyberwarfare era.

https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/pre...

If, somehow, we'd got to an era where (a) operating systems were widely deployed in a different language, and (b) the Morris Worm of 1988 had happened due to buffer overflow issues, then C in its current form would never have been adopted.

mjburgess · on July 15, 2024

C is just convenient assembly. In an era where performance mattered, and much software was written for hardware, and controlling hardware, it's hard to see an alternative.

C's choices were for performance on hardware-limited systems. I don't really see what other ones made sense historically.

pjc50 · on July 15, 2024

C is, in some important cases, less convenient than assembly in ways which have to be worked around by either fooling the compiler or adding intrinsics. A recent example: https://justine.lol/endian.html

Is the huge macro more convenient than the "bswap" instruction? No, but it's portable.

> I don't really see what other ones made sense historically.

Pascal chose differently in a couple of places. In particular, carrying the length with strings.

C refused to define semantics for arithmetic. This gave you programs which were "portable" so long as you didn't mind different behavior on different platforms. Good for adoption, bad for sanity. It was only relatively recently (C23) that they required signed integers to be two's complement.

16-bit Windows even used C with the Pascal calling convention. http://www.c-jump.com/CIS77/ASM/Procedures/P77_0070_pascal_s...

another2another · on July 15, 2024

>In an era where performance mattered, and much software was written for hardware, and controlling hardware, it's hard to see an alternative

Actually, what made sense _was_ assembly when performance mattered above all. C was actually seen as a higher level language.

However, C's advantage was the fact that it was cross-platform, so you could compile or quite easily port the same code to many different platforms with a C compiler (Solaris, Windows, BSD, Linux and latterly Mac OS X). That was its strength (Pascal shared this too, but it didn't survive).

You can see this in the legacy of software that's still in use today: lots of GNU utilities, shells, X Windows, the zlib library, GCC, OpenSSL and, discussed fairly recently, POV-Ray, which has been going since the '80s.

kelnos · on July 15, 2024

> C is just convenient assembly.

I'm not sure if you're being facetious here, but that's absurd. It is certainly one of our lowest-level options before reaching for assembly, but it's still a high-level language that abstracts machine details from the programmer.

> In an era where performance mattered, and much software was written for hardware, and controlling hardware, it's hard to see an alternative.

During that era, people who really needed to care about performance used assembly. The optimizations done by C compilers at that time were not nothing, but they were fairly primitive compared to what they do now.

freeone3000 · on July 15, 2024

But it doesn’t have to. We can choose any other language that compiles to native, including memory-safe ones.

dxroshan · on July 15, 2024

I agree with you.

drdo · on July 15, 2024

But unsafe blocks are available! And you should use them when you have to, but only when you have to.

Using an unsafe block with a very limited blast radius doesn't negate all the guarantees you get in all the rest of your code.

sanxiyn · on July 15, 2024

Note that unsafe blocks don't have a limited blast radius. The blast that can be caused by a single incorrect unsafe block is unlimited, at least in theory. (In practice there could be a correlation between the amount of incorrectness and its effect, but the same could also be said about C undefined behavior.)

Unsafe blocks limit the amount of code you need to get correct, but you need to get all of it correct. They are not a blast limiter.

neysofu · on July 15, 2024

I believe this is technically true, but somewhat myopic when it comes to how maintainers approach unsafe blocks in Rust.

UB has unlimited blast radius by definition, and you'll need to write correct code in all your unsafe blocks to ensure your application is 100% memory-safe. There's no debate around that. From this perspective, there's no difference between a C application and a Rust one which contains a single, incorrect unsafe block.

The appreciable difference between the two, however, is how much more debuggable and auditable an unsafe block is. There's usually not that many of them, and they're easily greppable. Those (hopefully) very few lines of code in your entire application benefit from a level of attention and scrutiny that teams can hardly afford for entire C codebases.

EDIT: hardy -> hardly (typo)

weinzierl · on July 15, 2024

Yes, they don't contain the blast, but they limit the places where a bomb can be, and that is their worth.

foldr · on July 15, 2024

Generally speaking yes, but there could be a logic error somewhere in safe code that causes an unsafe block to do something it shouldn’t. For example, a safe function that is expected to return an integer less than n is called within an unsafe block to obtain an index, but the return value isn’t actually less than n. In that case the ‘bomb’ may be in the unsafe block, but the bug is in the safe code.
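A minimal sketch of that scenario, using hypothetical names (`pick_index`, `read_trusting`): the arithmetic mistake sits in entirely safe code, while the out-of-bounds read it enables would happen inside the unsafe block.

```rust
/// Safe code with a logic bug. The intended contract is "returns a value < n",
/// but the modulus is off by one, so it can return `n` itself.
pub fn pick_index(seed: usize, n: usize) -> usize {
    seed % (n + 1) // BUG: should be `seed % n`
}

/// The unsafe block trusts the helper's contract without re-checking it.
pub fn read_trusting(data: &[u8], seed: usize) -> u8 {
    let i = pick_index(seed, data.len());
    // SAFETY (claimed): `pick_index` returns a value < data.len().
    // The claim is false whenever seed % (len + 1) == len, in which case
    // this is an out-of-bounds read, i.e. undefined behavior.
    unsafe { *data.get_unchecked(i) }
}
```

With a three-element slice, `pick_index(3, 3)` returns 3, which is one past the last valid index. Nothing in the compiler flags `pick_index`, because it contains no unsafe code at all.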

nicce · on July 15, 2024

> yes, but there could be a logic error somewhere in safe code that causes an unsafe block to do something it shouldn’t.

Sounds like bad design. You can typically limit the use of unsafe to such a small area that you can verify the ranges of the parameters which would cause memory problems. Check for invalid values and panic. Still "memory safe", even if it panics.
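As a rough sketch of that approach (the function name `read_checked` is made up for illustration), the check sits directly next to the unsafe operation, so the safety argument no longer depends on any caller:

```rust
/// Validates the index itself instead of trusting whoever computed it.
pub fn read_checked(data: &[u8], i: usize) -> u8 {
    // Panicking here is still memory safe: no out-of-bounds access ever happens.
    assert!(i < data.len(), "index {i} out of range for length {}", data.len());
    // SAFETY: the assertion above proves `i < data.len()`.
    unsafe { *data.get_unchecked(i) }
}
```

In this tiny case plain `data[i]` would do the same job with no unsafe at all; the pattern matters when the unchecked operation has no safe equivalent.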

foldr · on July 15, 2024

Sure, it may be bad design. The point is that nothing in the Rust language itself guarantees that memory safety bugs will be localized to unsafe blocks. If your code has that property it’s because you wrote it in a disciplined way, not because Rust forced you to write it that way (though it may have given some moral support).

Let me emphasize that I am not criticizing Rust here. I am just pointing out an incontrovertible fact about how unsafe blocks in Rust work: memory safety bugs are not guaranteed to be localized to unsafe blocks.

Klonoar · on July 15, 2024

I cannot imagine writing a method to return a value less than n, and not verifying that constraint somewhere in the safe method.

foldr · on July 15, 2024

It’s just a simple example to illustrate the point. Realistic bugs would probably involve more complex logic.

The prevalence of buffer overrun bugs in C code shows that it very definitely is possible for programmers to screw up when calculating indices. Rust removes a lot of the footguns that make that both easy to do and dangerous in C. But in unsafe Rust code, you’re still fundamentally vulnerable to any arithmetic bug in any function that you call as part of the computation of an index.

drdo · on July 15, 2024

That is of course correct.

The main value is that you only have to make sure that a small amount of code surrounding the unsafe block is safe, and hopefully you provide a safe API for the rest of the code to use.

CraigJPerry · on July 15, 2024

I’d word that differently: it reduces the search space for a bug when something goes wrong, but it doesn’t limit the blast radius. You can still spectacularly blow up safe Rust code with an unsafe block (that no-aliasing rule is seriously tough to adhere to!)

This is definitely a strong benefit though.

bigstrat2003 · on July 15, 2024

> But you can't do everything that C does without using unsafe blocks. Rust can offer a fresh perspective to these problems, but it's not a complete solution.

It's true that you need to have unsafe code to do low level things. But it's a misconception that if you have to use unsafe then Rust isn't a good fit. The point of the safe/unsafe dichotomy in Rust is to clearly mark which bits of the code are unsafe, so that you can focus all your attention on auditing those small pieces and have confidence that everything else will work if you get those bits right.

pjc50 · on July 15, 2024

> But you can't do everything that C does without using unsafe blocks

How much of this is actually 100% unambiguously necessary? Is there a good reason why anything in the filesystem code at all needs to be unsafe?

I suspect it's a very small subset needed in a few places.

nicce · on July 15, 2024

Usually, avoiding copying or moving data is the primary reason. In filesystems, this is especially pronounced.

bilekas · on July 15, 2024

> Concurrency problems?

I have to admit, while I do enjoy Rust in the sense that it makes sense and can really "click" sometimes, for anything asynchronous I find it really rough around the edges. It's not intuitive what's happening under the hood.

the_duke · on July 15, 2024

Async != concurrency.

One of the major wins of Rust is encoding thread safety in the type system with the `Send` and `Sync` traits.
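A small sketch of what that looks like in practice, using only the standard library (the function name `parallel_count` is made up for illustration):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each increment a shared counter `per_worker` times.
pub fn parallel_count(workers: usize, per_worker: u64) -> u64 {
    // `Arc<Mutex<u64>>` is `Send + Sync`, so it may be shared across threads.
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let total = Arc::clone(&total);
            // `thread::spawn` requires the closure to be `Send`. Replacing `Arc`
            // with `Rc` here is rejected at compile time because `Rc` is not `Send`.
            thread::spawn(move || {
                for _ in 0..per_worker {
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}
```

The data race that an unsynchronized shared counter would cause in C is a type error here rather than a runtime bug.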

bilekas · on July 15, 2024

> Async != concurrency.

Right, but tasks are sharing the same thread, which is fine, but when we need to expand on that with them actually working async, i.e. non-blocking, fire and quasi-forget, it's tricky. That's all I'm saying.

the_duke · on July 15, 2024

The Rust async experience indeed has lots of pitfalls, very much agree there.

dboreham · on July 15, 2024

s/The Rust/All/

duped · on July 15, 2024

async == concurrency, concurrency != parallelism.

sophacles · on July 15, 2024

async == concurrency in the same way square == rectangle - that is, it's not a symmetric '==', since there are plenty of rectangles that are not squares.

wongarsu · on July 15, 2024

Rust async isn't all that pleasant to use. On the other hand, for normal threaded concurrency Rust is one of the best languages around. The type system prevents a lot of concurrency bugs. "Fearless concurrency" is a tagline the language really has earned.

asyx · on July 15, 2024

I really hate async Rust. It's really great that Rust forces you at the compiler level to use mutexes, but async is a disease that spreads through your whole project and introduces a lot of complexity that I don't feel in C#, Python or JS/TS.

John23832 · on July 15, 2024

Eh, syntactically async rust is the exact same as C#. It's all task based concurrency.

Now, lifetimes attached to function signatures are definitely a problem.

colejohnson66 · on July 15, 2024

Not really. C#'s Task/Task<T> are based on background execution. Once something is awaited, control is returned to the caller. OTOH, Rust's Future<T> is, by default, based on polling/stepping, a bit like IEnumerable<T> in C#; if you never poll/await the Future<T>, it never executes. Executor libraries like Tokio allow running futures in the background, but that's not built-in.
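A minimal sketch of the "never polled, never runs" behavior using only the standard library. The hand-written `NoopWake` and the manual `poll` call are stand-ins for what an executor would do; the names are invented for illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough to poll a future that never has to wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns whether the future's body had run (before polling, after polling).
pub fn lazy_future_demo() -> (bool, bool) {
    let ran = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&ran);

    // Creating the future executes none of its body.
    let mut fut: Pin<Box<dyn Future<Output = ()>>> = Box::pin(async move {
        flag.store(true, Ordering::SeqCst);
    });
    let before = ran.load(Ordering::SeqCst);

    // Polling by hand is what finally drives the body forward.
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let poll = fut.as_mut().poll(&mut cx);
    assert!(matches!(poll, Poll::Ready(())));

    (before, ran.load(Ordering::SeqCst))
}
```

The body only runs once something polls the future, which is the job an executor such as Tokio takes over in real programs.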

brigadier132 · on July 15, 2024

How do you imagine async works otherwise? Also, in case you misunderstand how polling works in practice in Rust, it's not polling in the traditional web development sense where it polls every 5 ms to check if a future is completed (although you can do this if you want to for some reason). There are typically "wakers" that are "awoken" by the OS when data is ready, and when they are "awoken" they poll. And since they are only awoken by the OS when the information is ready, it really never has to poll more than once unless there are multiple bundled futures.

John23832 · on July 15, 2024

I don't want to "well actually" the "well actually", but I think you missed the word syntactically.

> C#'s Task/Task<T> are based on background execution. Once something is awaited, control is returned to the caller.

Async/await in any language happens in the background.

What happens during a Task.Yield() (C#)? Control is yielded to another awaiting task in the work queue. Same as Rust.

> OTOH, Rust's Future<T> is, by default, based on polling/stepping,

The await syntax abstracts over Future/Stream polling. The real difference is that Rust introduced the Future type/concept of polling at all (which is a result of not having a standard async runtime). There is a concept of "is this task available to proceed on" in C# too, it's just not exposed to the user and handled by the CLR.

merb · on July 15, 2024

> Task.Yield()

In C# you probably never call yield.

neonsunset · on July 15, 2024

Yield in C# is frequently used for the same reasons as in Rust, although implementation details between fine-grained C# Tasks and even finer grained Rust Futures aggregated into large Tasks differ quite a bit.

The synchronous part of an async method in C# will run "inline". This means that should there be computationally expensive or blocking code, the caller will not be able to proceed even if it doesn't await the method immediately. For example:

    var ptask = Primes.Calculate(n); // returns Task<ulong[]>
    // Do other things...right?
    // Why are we stuck calculating the primes then?
    Console.WriteLine("Started.");

In order for the .Calculate to be able to continue execution "elsewhere" in a free worker thread, it would have to yield.

If a caller does not control .Calculate, the most common (and, sadly, frequently abused) solution is to simply do

    // Calculate takes an argument, so wrap the call in a lambda
    var task = Task.Run(() => Primes.Calculate(n));
    // Do something else
    var text = string.Join(',', await task);

If the delegate itself returns a Task, the return type will be flattened to a single Task<T>; nonetheless, the returned task will be a proxy that completes once the original task completes. This successfully deals with badly behaved code.

However, a better solution is to instead insert `Task.Yield()` to allow the caller to proceed and not be blocked, before continuing a long-running operation:

    var ptask = Primes.Calculate(n); // returns Task<ulong[]>
    // Successfully prints the message
    Console.WriteLine("Started.");

    // Inside static class Primes:
    public static async Task<ulong[]> Calculate(int n)
    {
        await Task.Yield();
        // Continue execution in a free worker thread
        // If the caller immediately awaits us, most likely
        // the caller's thread will end up doing so, as the
        // continuation will be scheduled in the local queue,
        // so it is unlikely for the work item to be stolen this
        // quickly by another worker thread.
        return Sieve(n); // hypothetical synchronous helper doing the actual work
    }

John23832 · on July 15, 2024

It was just an example. In practice, you're right.