
> completely impossible with printf

Not printf exactly, but I've found bugs with a combination of mprotect, userfaultfd and backtrace_symbols when I couldn't use HW breakpoints.

Basically, mark a set of pages as non-writable so that any write triggers a page fault, then register yourself as the page-fault handler for those pages, see who is doing the write, apply the write, and move on. You can do this with LD_PRELOAD without even recompiling the debuggee.
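In case anyone wants to try it, here's a minimal sketch of the same idea using plain mprotect + SIGSEGV instead of userfaultfd (the names and sizes are mine; mprotect and backtrace_symbols_fd aren't strictly async-signal-safe, but this is a debugging hack). After logging the backtrace it unprotects the page and returns, so the faulting write retries and succeeds; re-arming the trap would need single-stepping, which I've left out:

    #include <execinfo.h>
    #include <signal.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void *watched;    /* page we want write-traps on */
    static size_t page_sz;

    static void on_segv(int sig, siginfo_t *si, void *ctx) {
        (void)sig; (void)ctx;
        char *a = (char *)si->si_addr;
        if (a >= (char *)watched && a < (char *)watched + page_sz) {
            void *frames[32];
            int n = backtrace(frames, 32);
            backtrace_symbols_fd(frames, n, STDERR_FILENO);  /* who wrote? */
            /* apply the write and move on: unprotect, the CPU retries */
            mprotect(watched, page_sz, PROT_READ | PROT_WRITE);
            return;
        }
        _exit(1);  /* a real crash, not our trap */
    }

    int main(void) {
        page_sz = (size_t)sysconf(_SC_PAGESIZE);
        watched = mmap(NULL, page_sz, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        struct sigaction sa = {0};
        sa.sa_sigaction = on_segv;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);

        mprotect(watched, page_sz, PROT_READ);  /* arm the trap */
        ((volatile char *)watched)[0] = 42;     /* faults, logs, then succeeds */
        return 0;
    }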


Back in the early '90s, I was a big fan of the C/C++ debugging library Electric Fence (written by Bruce Perens) - it was a malloc() implementation that used mmap() to set the pages of the returned buffer such that any write in that region (and even read accesses!) caused a segfault, halting the program so you could examine the stack. It was a godsend.


Oh yeah, I've done that trick before, it's quite handy. Sometimes it catches bugs that would be tricky to find with watchpoints. I wrote an allocator that allocates an extra page either before or after the user's allocation, and aligns the user allocation so that it butts up as closely as possible (while respecting the user-requested alignment) against that extra page (marked PROT_NONE). It can catch underflows or overflows this way, but not both simultaneously. Very handy from time to time.
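For anyone curious, a rough sketch of the overflow-catching orientation (guard page after the allocation; alignment handling and the matching munmap-based free are omitted, and all the names are mine):

    #include <stddef.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Hand out n bytes butted up against a PROT_NONE guard page, so
       touching one byte past the end faults immediately. */
    void *guard_malloc(size_t n) {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        size_t span = (n + page - 1) / page * page;  /* user data, page-rounded */
        char *base = mmap(NULL, span + page, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED) return NULL;
        mprotect(base + span, page, PROT_NONE);      /* the tripwire */
        /* push the allocation against the guard page; the mirrored
           layout (guard page first, data at base) catches underflows */
        return base + span - n;
    }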


wcc can do that for you: https://github.com/endrazine/wcc



Woah this is really awesome! Thanks for sharing, this made my day.


For whatever it's worth to you, I've been with NVIDIA for over 10 years and I wouldn't switch to F,A,A,N nor G for even 2x my total comp. Feel free to shoot any questions to myusername@gmail.com if it'll help you decide.

Or, just talk to your hiring manager(s) about the dilemma. For both companies, it'd be better to talk through any concerns before joining. If you accept one offer and then regret it a few months later, everyone loses: you, NVIDIA and [FAANG].


Because of stock appreciation, they'd need to pay up to 10x what the other company is already paying.

Consider a senior SWE at e.g. NVIDIA making $300k, let's say split halfsies between base and RSUs. The $150k RSU grant from a few years ago that still hasn't fully vested is worth $1M now. So even for a 50% increase in nominal total comp, it doesn't make financial sense to switch.

I don't know that anyone can afford to pay their engineers 10x more than the competition, and then also offer a cheaper product with thinner margins.


What percentage of the cost is the engineering?


Some more discussion and clarifications about this at https://github.com/NVIDIA/open-gpu-kernel-modules/discussion...


The memcpy API says that is Undefined Behavior; that program was never valid. Not much different from bitbanging specific virtual addresses and expecting them to never change.

For overlapping memory, use memmove()
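Contrived, but concrete:

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        char buf[] = "abcdef";
        /* shifting left by one within the same buffer: src and dst
           overlap, so memcpy would be UB here; memmove is defined */
        memmove(buf, buf + 1, strlen(buf + 1) + 1);
        printf("%s\n", buf);  /* "bcdef" */
        return 0;
    }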


Yes, the program was invalid, but it was also accidentally bug free. The two are not mutually exclusive.


It was never valid against the generic shared interface to libc, but a statically linked version would have been valid.


The C language distinguishes between Undefined Behavior and Implementation-Defined Behavior. In this case it's the former (n1570, section 7.24.2.1).

Any code that invokes UB is not a valid C program, regardless of implementation.

More practically, ignoring the bug that did happen: libc also has multiple implementations of these functions and picks one based on the HW it is running on. So even a statically linked glibc could behave differently on different HW. Always read the docs; this is spelled out in the standard.
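For the curious, the selection mechanism is GNU ifunc: a resolver runs once at load time and picks an implementation for the CPU it finds. Very roughly (x86-only sketch with my names; the real glibc resolvers check far more features):

    #include <stddef.h>

    static void *memcpy_portable(void *d, const void *s, size_t n) {
        char *dp = d;
        const char *sp = s;
        while (n--) *dp++ = *sp++;
        return d;
    }

    /* stand-in for the wide-vector version */
    static void *memcpy_avx2(void *d, const void *s, size_t n) {
        return memcpy_portable(d, s, n);
    }

    /* resolver: runs once at load time, before main() */
    static void *(*resolve_memcpy(void))(void *, const void *, size_t) {
        __builtin_cpu_init();
        return __builtin_cpu_supports("avx2") ? memcpy_avx2
                                              : memcpy_portable;
    }

    void *my_memcpy(void *d, const void *s, size_t n)
        __attribute__((ifunc("resolve_memcpy")));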


The source code may have had UB, but the compiled program could nevertheless have been bug-free.

> Any code that invokes UB is not a valid C program, regardless of implemention.

I disagree. UB is defined (3.4.3) merely as behavior the standard imposes no requirements upon. The definition does not preclude an implementation from having reasonable behavior for situations the standard considers undefined.

This nuance is very important for the topic at hand, because many programs are written for specific implementations, not specs, and doing so is completely reasonable.


There is now a big fat disclaimer at the top of the file saying basically the same thing:

https://github.com/nothings/stb/blob/master/stb_truetype.h#L...


Ah, that's good, despite being six years late. However, I do not think there is any good reason for code written like this to exist in the first place. Bounds checking is easy and fast (even if less so in C), and things you consider "trusted" can become untrusted easily, quickly turning otherwise minor bugs like file-type confusion into critical vulnerabilities.
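To be concrete about "easy": the whole fix is a cursor that carries its end pointer and refuses to read past it. Hypothetical names, big-endian reads as in TrueType:

    #include <stddef.h>
    #include <stdint.h>

    typedef struct { const uint8_t *p, *end; int err; } reader;

    static uint16_t read_u16(reader *r) {
        if (r->end - r->p < 2) { r->err = 1; return 0; }    /* the check */
        uint16_t v = (uint16_t)((r->p[0] << 8) | r->p[1]);  /* big-endian */
        r->p += 2;
        return v;
    }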


I mostly agree with you. An important qualifier is that these libraries are meant for games, especially games that package their own font files, and especially games on locked-down platforms like consoles and mobile phones. On console games it makes sense to skip bounds-checking your own data because you want to minimize load times. But yes, bounds checking should have been included and on by default, with an option to disable it.


I think Rust has indirectly shown that the performance impact of bounds checking is mostly negligible in practice. Even in very high-performance code, I've never seen anyone turn the checks off for performance. This makes a lot of sense considering that on modern CPUs everything except cache misses is basically free, so a single almost-never-taken compare-and-branch usually just does not matter.

The only exception I can think of is accessing lots of indexes in a loop, where getting good performance requires Rust programmers to insert awkward asserts or casts to fixed-length arrays to get the checks to optimize out [1]. But afaik that's mostly because the bounds checks impede other loop optimisations like vectorization, not because the checks are slow themselves.

[1] (e.g. when you access indexes 1 to n randomly, assert n < length beforehand)
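Since this thread is mostly about C, here is the same hoisting trick with the per-index checks written out explicitly (in Rust they are implicit, and the compiler can elide them after the assert): one up-front check replaces n per-element checks and unblocks vectorization.

    #include <assert.h>
    #include <stddef.h>

    float sum_first_n(const float *a, size_t len, size_t n) {
        assert(n <= len);  /* one check up front... */
        float s = 0.0f;
        for (size_t i = 0; i < n; i++)
            s += a[i];     /* ...instead of a check per access */
        return s;
    }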


We have seen some of this; Dropbox has a macro that lets them turn off checks with a feature flag, for example, because it had a noticeable impact.


Ah, that's interesting, I assume this is the link:

https://dropbox.tech/infrastructure/lossless-compression-wit...

Shows a significant improvement in compression code and some examples of checks that can't be elided. I guess this makes sense because it's pretty much the worst case scenario for bounds check impact.

I can still personally say that every time I've blamed rust's bounds checks, I was wrong.

EDIT: neither the unsafe flag nor the macro appear to be present in https://github.com/dropbox/rust-brotli/ anymore today, removed some time in 2017. So it seems they found some other way to deal with it?


Ah interesting! Glad they removed it. I can't quite find out when, and I don't have the time to really dig in right now. Very cool, thanks.


Be that as it may, existing software that used to work will no longer work, and the developers might not be around/able to fix it.

There are also cases of custom suballocators or arrays of objects - looking at an address makes it possible to figure out which array it belongs to. This code would break.

Granted, it would still be possible to do all this if you just mask off the tag bits, but it requires a software change.
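The mask itself is a one-liner, for what it's worth. A sketch assuming 64-bit AArch64 pointers, where the 4-bit MTE tag lives in the top byte:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* strip the top byte (the TBI region that holds the MTE tag) */
    static inline uintptr_t untag(const void *p) {
        return (uintptr_t)p & 0x00FFFFFFFFFFFFFFull;
    }

    /* range check that works across differently-tagged pointers */
    bool in_array(const void *p, const void *base, size_t size) {
        return untag(p) >= untag(base) && untag(p) < untag(base) + size;
    }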


The "suballocate out of some arrays" code should not break, because the whole array would be allocated at once and so would have the same tag for the whole range. Code that does a simple "is this pointer value inside the "array_base + size" range" continues to work, because array_base has whatever tag malloc() handed out for that array, and so do the pointer values that the suballocator handed out. I think for MTE to break your code you would have to be doing some pretty weird stuff with pointer arithmetic (beyond just the usual "technically maybe this is undefined behaviour but it works" level stuff).

It's always the case that some software that does things that are not valid-by-the-language-standard might break if run on a newer version of the OS or a newer system library version (remember the big flap about glibc memcpy() changing its behaviour when called for overlapping regions?). You don't want to break lots of software gratuitously, but sometimes the tradeoff is worth making.


The estimated number of atoms on earth is on the order of 1e50, which is 60 bits. If you take the 4 bits out of the 64bit address for tagging, you can still address individual atoms.

EDIT: My math is bad, see comments below.

I can unsarcastically say that no one will ever need full 64bit addresses on this planet.


2^66 bits - estimated storage space at Google data warehouse as of 2013 [0]

2^71 bits - total hard drive capacity shipped in 2016 [0]

I doubt your calculations.

[0] https://en.wikipedia.org/wiki/Orders_of_magnitude_(data)


Yes, you're totally right, my math is way off.


10^50 is definitely not 60 bits: log_2(10^50) = 50 * log_2(10), and log_2(10) ≈ 3.32, so you need about 166 bits.


I don't think it is fair to pin this on Steam. Plenty of games on Steam run without Steam and require no installation - just copy them over to a different PC. They often come with binaries for all supported OSes at once, so your Windows install will work on Linux too.

It is the game publishers that choose to use Steam DRM. Often the same publishers publish on GOG without the DRM (since GOG, admirably, does not offer a DRM option to them), but it is not Steam forcing the DRM on you. Your issue is with the game publisher.


No, it's in the agreement when you sign up for Steam. You don't own, you license.


Well, the only owner is the copyright holder.

