Because this book was written more than a year ago, it spends some time on Windows Slim Reader/Writer Locks (SRWLocks), which at the time of writing were how Rust's Mutex and RwLock were implemented on Windows (by the book's author, in fact, IIRC).
Since then, two important things happened.
1. On Windows 8 and later, Rust moved to WaitOnAddress, an API similar to the futex available on several other systems.
2. We found out SRWLocks have a significant (arguably fatal, though depending on your use case it may seem irrelevant) difference between how they actually work and what Microsoft's documentation said about them. This bug is fixed... in Microsoft's own version control, not in released Windows versions.
Specifically SRWLocks may silently give you a Write Lock, even if you asked only for a Read Lock, in the case where the lock was just released at the moment you asked. If you were expecting other threads to also get a read lock, which would ordinarily be possible - too bad, you've secretly been given the exclusive write lock so read locks are unavailable until you release it.
The actual reason seems to be this: SRWLocks are small (a single pointer, with some low bits stolen to store metadata), and the authors forgot that they actually do know (because it's a different function call) whether you asked for a read or a write lock. Since they didn't have anywhere to store this single bit (read or write), they just assumed they don't know in the edge case where the lock happens to be available immediately, and since they "don't know", they always give you a write lock anyway. Oops.
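To make the expectation concrete, here's a tiny sketch (my own, not the actual repro from the report) using today's std::sync::RwLock, which no longer sits on SRWLock: two threads both take read locks and wait for each other while holding them. With a correct reader-writer lock this always finishes; a reader that had secretly been handed the exclusive lock would leave the other thread's read() blocked, and the program would deadlock.

```rust
use std::sync::{Arc, Barrier, RwLock};
use std::thread;

fn main() {
    let lock = Arc::new(RwLock::new(0u32));
    let barrier = Arc::new(Barrier::new(2));

    let handles: Vec<_> = (0..2)
        .map(|i| {
            let lock = Arc::clone(&lock);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                // Both threads take a *read* lock...
                let guard = lock.read().unwrap();
                // ...and then wait for each other while still holding it.
                // A correct reader-writer lock always gets past this point;
                // a reader secretly handed the exclusive lock would leave the
                // other thread's read() blocked, and we'd deadlock here.
                let _ = barrier.wait();
                println!("reader {i} sees {}", *guard);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
```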
> Specifically SRWLocks may silently give you a Write Lock, even if you asked only for a Read Lock, in the case where the lock was just released at the moment you asked.
Oh hey, I reported that bug and behavior! One of my proudest reports. Not too often you find a legit bug, or at least documentation oversight, in such a core API.
Definitely sounds like a legit bug to me! I'd only call this a "documentation oversight" in the sense that they could have the docs-equivalent of the fast-talking disclaimers at the end of commercials saying something like "notactuallyareadwritelockmayormaynotallowconcurrentreadsdonotusewhenpregnantorbreastfeedinguseatyourownrisk"
Yeah, I've also only very rarely seen real bugs in core system features (deadlock inside glibc's allocator a decade or more ago in my case IIRC) and I must say you got a much healthier reaction.
It's definitely an actual bug not a doc bug. I've explained this a few times to people and it does seem like the natural inclination is to assume SRWLock must be supposed to do that, but I was glad to see internally Microsoft did fix this, because it's clearly the Wrong Thing™.
I see two common defences for this bug. One is "Actually, it's supposed to be unfair, you don't understand why that's a good idea". Which assumes I'm expecting a fair lock and I'm unhappy not to get it. Giving a reader the reader lock when there's a writer waiting would be unfair - and probably a bad idea but I'm open to it if somebody presents benchmarks - but giving them the writer lock is just a bug.
Another is "Actually, there is a writer waiting, and this way that writer gets the lock faster". Since you wrote the example code, you know that's false: there is no writer waiting, there are only readers, and (in the buggy scenario) they're blocked forever for no reason.
I definitely spent a lot of time carefully crafting the message and iterating on a minimal repro.
Writing on the internet requires a LOT of defensive effort. It’s very annoying but is what it is. I write blog posts and my secret goal is “high views, low comments”. Because most comments are “well ackchyually“.
For the Reddit thread it helped that I tagged u/STL and he responded quickly in agreement. That was very intentional on my part! I’m still a little sad I didn’t get a Raymond Chen comment though!
Does Raymond comment on Reddit? That would feel like a personal achievement indeed.
One of the smallest pieces of work I'm proud of is a tool that automates the labor of Raymond's "The poor man's way of identifying memory leaks". The part where you need to be familiar with how your data types look in memory isn't automated - that's on you - but my tool (leakdice, because it replaces the hexadecimal dice I previously used for this in real life) picks a random page of heap in a chosen (Linux) process and shows you what's in it; the rest is up to you, as Raymond explains.
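Not leakdice itself, just a rough sketch of the idea on Linux (the name heapdice and the clock-as-dice trick are made up for illustration): find the [heap] mapping in /proc/<pid>/maps, pick a page at a pseudo-random offset, and hex-dump it out of /proc/<pid>/mem. You need permission to read the target's memory (same user, and Yama's ptrace_scope may still object).

```rust
use std::env;
use std::fs::{self, File};
use std::io::{Read, Seek, SeekFrom};
use std::time::{SystemTime, UNIX_EPOCH};

const PAGE: u64 = 4096;

fn main() -> std::io::Result<()> {
    let pid: u32 = env::args().nth(1).expect("usage: heapdice <pid>").parse().unwrap();

    // Find the [heap] mapping, e.g. "55a1...-55a1... rw-p 00000000 00:00 0   [heap]".
    let maps = fs::read_to_string(format!("/proc/{pid}/maps"))?;
    let line = maps.lines().find(|l| l.ends_with("[heap]")).expect("no [heap] mapping");
    let (start, end) = line.split_whitespace().next().unwrap().split_once('-').unwrap();
    let start = u64::from_str_radix(start, 16).unwrap();
    let end = u64::from_str_radix(end, 16).unwrap();

    // "Roll the dice": the clock stands in for real randomness in this sketch.
    let pages = (end - start) / PAGE;
    let roll = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().subsec_nanos() as u64 % pages;
    let addr = start + roll * PAGE;

    // Read that page out of the target's address space and hex-dump it;
    // interpreting what you see is still up to you, as Raymond explains.
    let mut mem = File::open(format!("/proc/{pid}/mem"))?;
    mem.seek(SeekFrom::Start(addr))?;
    let mut page = vec![0u8; PAGE as usize];
    mem.read_exact(&mut page)?;
    for (i, chunk) in page.chunks(16).enumerate() {
        let hex: Vec<String> = chunk.iter().map(|b| format!("{b:02x}")).collect();
        println!("{:016x}  {}", addr + (i as u64) * 16, hex.join(" "));
    }
    Ok(())
}
```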
Even if you are not into Rust, I'd recommend this book if you want to get into low-level and/or embedded programming. It's an exceptionally well written introduction to the most important topics there, and ~80% of the book is not specific to Rust (or can be transferred just as well to other languages).
Thanks for this comment. I've had this book sitting on my To Be Read stack for about 10 months but kept bumping it down because I don't use Rust; the title and ToC were compelling enough for me to buy it and hope for non-Rust-specific content. Even if your 80% estimate is 20% too large a guess, I'm definitely going to pull it up to the top of the stack and read it instead of passing it over.
They can be transferred, but other languages also have other concepts and other mechanisms of synchronization. This Rust book seems solid, but I would not assume other languages have the same mechanisms as Rust, the same way I wouldn't recommend using programming patterns from other languages in Rust.
I wrote this in a review, I believe, but this is one of the most comprehensive introductions to a good 80% of what could be considered a high-performance-computing education. It's extremely well written - in the weeds, but not lost in them. If you've done heavy atomics and/or locks in C or C++ or with Fortran libraries, this will help show you how Rust prevents so many footguns at compile time.
Any thoughts on the best way to express global locks in Rust?
A classic example is a set of bank accounts, atomically transacting with each other. Fine-grained per-account locking is possible, but risks deadlock due to lock ordering inversion. A simple solution is to replace per-account locks with a singleton global lock, covering all accounts. Any transaction must first acquire this lock, and now deadlock is impossible.
But this is an awkward fit for Rust, whose locks want to own the data they protect. What's the best way to express a global lock, enforcing that certain data may only be accessed while the global lock is held?
I don't think this is an awkward fit. There's no reason it has to be Mutex<CustomerBankAccount>; it can be Mutex<MyTransferToken>, or indeed Mutex<()> if you're really sure you don't actually want the lock to own any data.
One piece of advice I'd suggest: write APIs which take that MyTransferToken to signify that you must take the lock before calling them (something like the sketch below). It can be a unit type (a zero-size type, a struct with no members) at the start if you like, but I suspect you'll find that, across several functions which take that MyTransferToken, the data you were going to put in a separate parameter really always accompanies that token - so it might as well go inside the MyTransferToken, and before you know it, the unit type that was just there to ensure correct locking is in fact an object with important data protected by the lock.
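A minimal sketch of what I mean (TransferToken, the accounts map, and the function names are all made up for illustration): the token lives inside the Mutex, and any function that touches the shared state demands a &mut TransferToken, so you simply can't call it without having taken the lock.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

// Hypothetical token type: starts life as little more than a unit type,
// but shared state tends to migrate into it over time.
struct TransferToken {
    accounts: BTreeMap<u32, i64>, // account id -> balance, purely illustrative
}

// The one global lock: the only way to get a TransferToken is through this Mutex.
static LEDGER: Mutex<TransferToken> = Mutex::new(TransferToken {
    accounts: BTreeMap::new(),
});

// Taking &mut TransferToken means "the caller is holding the global lock".
fn transfer(token: &mut TransferToken, from: u32, to: u32, amount: i64) {
    *token.accounts.entry(from).or_insert(0) -= amount;
    *token.accounts.entry(to).or_insert(0) += amount;
}

fn main() {
    let mut token = LEDGER.lock().unwrap();
    transfer(&mut token, 1, 2, 50);
    println!("{:?}", token.accounts);
    // The lock is released when `token` is dropped here.
}
```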
Maybe define a new type in such a way that only one value of that type can be instantiated, and then put that unique value in a mutex. Any function that needs to hold the global lock accepts a value of the new type. Calling the function involves providing the value, which is proof that you are holding the mutex.
My first instinct would be to use a static instance of `LazyLock`[1] wrapping the data (or `OnceLock`[2] if needed). `LazyLock` only just got stabilized, and `OnceLock` only a bit less recently, but both have had equivalents available via the `once_cell`[3] package for a while.
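Roughly like this sketch (the account map is just a placeholder): LazyLock lets the static hold something whose constructor isn't const, initialized on first access.

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, Mutex};

// HashMap::new() isn't const, so a plain `static Mutex<...>` won't compile here;
// LazyLock runs the closure once, on first access.
static ACCOUNTS: LazyLock<Mutex<HashMap<u32, i64>>> =
    LazyLock::new(|| Mutex::new(HashMap::new()));

fn deposit(id: u32, amount: i64) {
    let mut accounts = ACCOUNTS.lock().unwrap();
    *accounts.entry(id).or_insert(0) += amount;
}

fn main() {
    deposit(7, 100);
    println!("{:?}", *ACCOUNTS.lock().unwrap());
}
```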
If I’m understanding correctly, can’t you lock on some other account criterion, like IDs? A Mutex does have to own the data, but you can still decide what it owns.
When you lock, the returned MutexGuard is owned, so you can pass it around or return it and it’s only dropped when it ultimately goes out of scope.
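For example (BALANCES and locked_balances are just illustrative names here), a function can take the lock and return the owned guard to its caller; the lock is only released when the guard is eventually dropped.

```rust
use std::collections::BTreeMap;
use std::sync::{Mutex, MutexGuard};

static BALANCES: Mutex<BTreeMap<u32, i64>> = Mutex::new(BTreeMap::new());

// Lock here, but let the caller decide how long to hold the lock:
// the returned guard owns the locked state until it is dropped.
fn locked_balances() -> MutexGuard<'static, BTreeMap<u32, i64>> {
    BALANCES.lock().unwrap()
}

fn main() {
    let mut balances = locked_balances();
    balances.insert(1, 100);
    println!("{balances:?}");
    // The lock is released only here, when `balances` goes out of scope.
}
```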
If you're into Rust and need a solid, no-fluff intro to atomics and locks, Mara Bos has you covered. It’s straight to the point, helping you nail down concurrency without the usual headache. Worth checking out if you're serious about leveling up your Rust game.
It's an excellent book, and I agree with many of the comments -- while it's written for Rust, the vast majority of it is applicable to many languages. It's amazing to see the book published on their website for free (though I'm still happy to have bought it).
One thing I found lacking in the book was the examples. It has tons, but all of them are extremely focused on the topic they're illustrating, and most feel very contrived. Would anyone here have a suggestion for a small/medium-sized project (weekend-sized) which would actually use the patterns discussed in the book?
I have used Rust a little, but this book was most useful to me when I was working on a concurrent data structure for an old C program. It’s a very good book for anyone writing low-level multi-threaded code in C or C++ as well as Rust, because they have basically the same primitives.
The only places I know where it isn’t applicable are the Linux kernel and Java, because their memory models and concurrency primitives predate and significantly differ from the Rust/C++/C models.
I guess there must be at least one book about the Java Memory Model, which is very different but fascinating? I don't know of any specific books to recommend.
For many languages there is nothing resembling this; they tend not to get into the details Mara covers - if you get a mutex and maybe atomic arithmetic, then they're done.
If you wondered about C or C++: this book covers the same content as it would for those languages, just with Rust's syntax. The discrepancy between Rust's memory model and the memory model adopted in C++11 (and subsequently C) is mostly about a feature that's not available in your C or C++ compiler and (which is why Rust doesn't have it) probably won't ever be.
The biggest syntax difference is that C++'s x.store(r1) compiles, and in Rust it doesn't. But chances are that after reading Mara's book you will think it's weird not to specify the Ordering needed, and you'll never use this, uh, convenience.
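That is, roughly (a toy example, not from the book): in Rust the Ordering is a required argument, so the C++-style "defaults to seq_cst" shorthand simply doesn't exist.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

fn main() {
    let x = AtomicU32::new(0);
    // x.store(1);                      // no such method: you must say which Ordering you want
    x.store(1, Ordering::SeqCst);       // the equivalent of C++'s defaulted `x.store(1)`
    let r1 = x.load(Ordering::Relaxed); // weaker orderings are spelled out just as explicitly
    println!("{r1}");
}
```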
Java atomics are actually sequentially consistent. C# relaxes this to acquire/release. Though the general concept of happens-before is still immensely useful for learning atomics as sequential consistency is a superset of acquire/release.
All of the memory models in question are based on data-race-free, which says (in essence) that as long as all cross-thread interactions follow happens-before, then you can act as if everybody is sequentially-consistent.
The original Java 5 memory model only offered sequentially-consistent atomics to establish cross-thread happens-before in a primitive way. The C++11 memory model added three more kinds of atomics: acquire/release, consume/release (which was essentially a mistake [1]), and relaxed atomics (which, to oversimplify, establish atomicity without happens-before). Pretty much every memory model since C++11--which includes the Rust memory model--has based its definition on that memory model, with most systems defaulting an otherwise unadorned atomic operation to sequentially-consistent. Even Java has retrofitted ways to get weaker atomic semantics [2].
As a practical matter, most atomics could probably safely default to acquire/release over fully sequentially-consistent. The main difference between the two is that sequentially-consistent is safer if you've got multiple atomic variables in play (e.g., you're going with some fancy lockless algorithm), whereas acquire/release tends to be safe if there's only one atomic variable of concern (e.g., you're implementing locks of some kind); see the sketch after the footnotes.
[1] A consume operation is an acquire, but only for loads data-dependent on the load operation. This is supposed to represent a situation that requires no fences on any system not named Alpha, but it turns out for reasons™ that compilers cannot reliably preserve source-level data dependencies, so no compiler really implemented consume/release.
[2] Even Java 5 may have had it in sun.misc.Unsafe; I was never familiar with that API, so I don't know for certain.
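Here's a rough Rust sketch of the single-variable case where acquire/release is enough (my own, not from the book, though it's close in spirit to its examples): one flag publishes some data, and the release store / acquire load pair establishes the happens-before.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    thread::spawn(|| {
        DATA.store(42, Ordering::Relaxed);
        // Release: everything before this store becomes visible to any thread
        // that acquire-loads READY and observes `true`.
        READY.store(true, Ordering::Release);
    });

    // Acquire: once we see READY == true, the DATA store happens-before this
    // point, so the assert below cannot fail.
    while !READY.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    assert_eq!(DATA.load(Ordering::Relaxed), 42);
}
```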
> as long as all cross-thread interactions follow happens-before, then you can act as if everybody is sequentially-consistent.
I don't think that's the actual guarantee. You can enforce happens-before with just acquire/release, but AFAIK that's not enough to recover SC in the general case[1] (see the store-buffering sketch below).
As far as I understand, the Data-Race-Free Sequentially Consistent memory model (DRF-SC) used by C++11 (and I think Java) says that as long as all operations on atomics are SC and the program is data-race-free, then the whole program can be proven to be sequentially consistent.
[1] but it might in some special cases, for example when all operations are mutex lock and unlock.
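For the curious, the standard store-buffering litmus test is the usual counterexample (my sketch, not from the book): with only release stores and acquire loads, both threads reading 0 is a permitted outcome, which no sequentially consistent interleaving allows; making all four operations SeqCst forbids it.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

static X: AtomicI32 = AtomicI32::new(0);
static Y: AtomicI32 = AtomicI32::new(0);

fn main() {
    let t1 = thread::spawn(|| {
        X.store(1, Ordering::Release);
        Y.load(Ordering::Acquire) // r1
    });
    let t2 = thread::spawn(|| {
        Y.store(1, Ordering::Release);
        X.load(Ordering::Acquire) // r2
    });
    let (r1, r2) = (t1.join().unwrap(), t2.join().unwrap());
    // With acquire/release alone, r1 == 0 && r2 == 0 is allowed (each load may
    // effectively be ordered before the other thread's store). Under SC some
    // store is first in the total order, so at least one load must see 1;
    // using Ordering::SeqCst for all four operations rules out the 0/0 outcome.
    println!("r1 = {r1}, r2 = {r2}");
}
```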
The book is good, but it has a couple of important drawbacks:
* while it tells you how to do lock-free programming, it doesn't teach you why, or whether you should do it at all.
* it has a relatively narrow focus on linearizability, but the truth is that memory is neither linearizable nor sequentially consistent. These days it is generally agreed that Lamport's "happens before" relation and acquire-release are a better way to reason about multithreaded code.
Really good book and fairly comprehensive like a course. Probably has the best explanation of memory ordering. The 'build your own' teaching method is useful for understanding how the different data structures work.
If you are working with async Rust, this book is a must-read. It clearly explains most of the primitives used in Rust, like Arc, Mutex, etc. The examples in the GitHub repo are quite helpful and fairly intuitive if you follow along.
If you want to write an HTTP server, people are guided towards Axum/Tokio, and thus async Rust.
If you want to use async Rust, read this book?
This book covers assembly-level atomics, and creating your own channels, in beginner chapters.
Is that necessary for writing an HTTP server in Rust?
From the topics in the ToC, this book is useful if you want to write concurrency primitives. I wouldn't recommend it if you just want to _use_ Arc/Mutex/crossbeam-channel.
Why do programming books always have some random unrelated illustration on the front?
Usually when you have a textbook, it will have some nice illustration that is tangentially related to the content of the book (like a Fibonacci spiral for a math book or some chemical reaction for a chemistry book, for example). But I suppose there isn't really such an equivalent unless it's a computer graphics book.
I guess it's also like how every project has to have its own "cutesy" mascot.
> Some of the people at O’Reilly were taken aback: they thought the animals were weird, ugly, and a bit scary. But Tim [O'Reilly] got it immediately—he liked the quirkiness of the animals, thought it would help to make the books stand out from other publishers’ offerings—and it just felt right.
They even have a browser which helps you identify the animal: