Yes, you can surely improve things from C. C is not a benchmark for anything oth...

pjmlp · on May 23, 2024

As proven multiple times throughout computing history, individual tools are optional, and as such are used less often than they actually should be.

Language specification is unavoidable when using said language.

fooker · on May 23, 2024

Have you wondered why Rust or Python do not have a specification?

For a bunch of languages outside the C-centric world, specifications don't exist.

pjmlp · on May 24, 2024

They certainly have, even if it isn't an ISO one.

https://docs.python.org/3/reference/index.html

https://docs.python.org/3/library/index.html

https://doc.rust-lang.org/reference/index.html

https://doc.rust-lang.org/std/index.html

https://ferrous-systems.com/blog/ferrocene-language-specific...

fooker · on May 25, 2024

Documentation and specification are not the same things.

The intuitive distinction is that the second one is for compiler/library developers, and the former is for users.

A specification cannot leave any room for ambiguity or anything up to interpretation. If it does (and this happens), it is treated as a bug to be fixed.

lstodd · on May 23, 2024

mwahahaha. as if there is some divine "language specification" which all compilers adhere to on pain of eternal damnation.

no such thing ever existed.

pjmlp · on May 23, 2024

Given that one can write Fortran in any language, maybe you're right.

rcxdude · on May 23, 2024

it's not just in debug modes. It should be the standard in release mode as well (IMO the distinction shouldn't exist for most projects anyway). ASan and UBSan are explicitly not designed for that.

samatman · on May 23, 2024

Worth noting that Zig has ReleaseSafe, which safety-checks undefined behavior while applying any optimizations it can given that restriction.

The more interesting part is that the mode can be individually modified on a per-block basis with the @setRuntimeSafety builtin, so it's practical to identify the performance-critical parts of the program and turn off safety checks only for them. Or the opposite: identify tricky code which is doing something complex, and turn on runtime safety there, regardless of the build status.

That's why this sort of thing should be part of the specification. @setRuntimeSafety would be meaningless without the concept of safety-checked undefined behavior.

I would say that making optionals and fat pointers (slices) a part of the type system is possibly more important, but it all combines to give a fighting chance of getting user-controlled resource management correct.

Given the topic of the Fine Article, it's worth briefly noting that `defer` and `errdefer` are keywords in Zig. Both the test allocator, and the GeneralPurposeAllocator in safe mode, will panic if you leak memory by forgetting to use these, or rather, forget to free allocations generally. My impression is that the only major category of memory bugs these tools won't catch in development is double-free, and that's being worked on.

fooker · on May 23, 2024

Well, give it a try.

If you can make it work in a way that has acceptable performance characteristics, every systems language will adopt your technique overnight.

rcxdude · on May 23, 2024

I use rust, which already does this.

fooker · on May 23, 2024

Signed overflow is officially a 'bug' in rust, it traps in debug mode but silently follows LLVM/platform behavior in release mode.

Huh, doesn't that sound familiar?

steveklabnik · on May 23, 2024

> silently follows LLVM/platform behavior

This is not the case. It's two's complement overflow.

Also, since we're being pedantic here: it's not actually about "debug mode" or "release mode", it is tied to a flag, and compilers must have that flag on in debug mode. This leaves room to turn the flag on in release mode as well in the future, if it's decided that the overhead is worth it. We'll see if it ever is.

> Huh, doesn't that sound familiar?

Nope, it is completely different from undefined behavior, which gives the compiler license to do anything it wants. These are well defined semantics, the polar opposite of UB.
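To make the "well defined semantics" point concrete, here is a minimal sketch using the explicit overflow methods on Rust's integer types. The function names are hypothetical wrappers invented for illustration; `wrapping_add`, `checked_add`, and `overflowing_add` are standard library methods, and their results do not depend on whether overflow checks are enabled for plain `+`.

```rust
/// Adds with explicit two's complement wrapping, regardless of build flags.
pub fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Adds and reports overflow as `None` instead of wrapping or panicking.
pub fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Adds with wrapping and also returns a flag saying whether overflow occurred.
pub fn add_overflowing(a: i32, b: i32) -> (i32, bool) {
    a.overflowing_add(b)
}
```

For example, `add_wrapping(i32::MAX, 1)` yields `i32::MIN`, `add_checked(i32::MAX, 1)` yields `None`, and `add_overflowing(i32::MAX, 1)` yields `(i32::MIN, true)`. A plain `a + b` either panics or wraps in the same way, depending on the overflow-checks setting (`-C overflow-checks` in rustc); in neither case is it undefined behavior.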

fooker · on May 25, 2024

>This is not the case. It's two's complement overflow.

Okay, here is an example showing that rust follows LLVM behavior when the optimizer is turned on. LLVM addition produces poison when signed wrap happens. I'm a little bit puzzled about the vehement responses in the comments, wow. I have worked on several compilers (including a few patches to Rust), and this is all common knowledge.

https://godbolt.org/z/r6WTxGjrb

steveklabnik · on May 26, 2024

The Rust output:

  define noundef i32 @square(i32 noundef %x, i32 noundef %y) unnamed_addr #0 !dbg !7 {
    %_0 = add i32 %y, %x, !dbg !12
    ret i32 %_0, !dbg !13
  }

Let's compare like to like, here's one with equivalent C++ code: https://godbolt.org/z/Y4MnGeof4

The C++ output:

  define dso_local noundef i32 @square(int, int)(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 !dbg !99 {
    tail call void @llvm.dbg.value(metadata i32 %0, metadata !104, metadata !DIExpression()), !dbg !106
    tail call void @llvm.dbg.value(metadata i32 %1, metadata !105, metadata !DIExpression()), !dbg !106
    %3 = add nsw i32 %1, %0, !dbg !107
    ret i32 %3, !dbg !108
  }

> LLVM addition produces poison when signed wrap happens.

https://llvm.org/docs/LangRef.html#add-instruction

> nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”, respectively. If the nuw and/or nsw keywords are present, the result value of the add is a poison value if unsigned and/or signed overflow, respectively, occurs.

Note that Rust produces `add`. The C++ produces `add nsw`. No poison in Rust, poison in C++.

Here is an example of these differences producing different results, due to the differences in behavior: https://godbolt.org/z/Gaonnc985

Rust:

  define noundef zeroext i1 @test() unnamed_addr #0 !dbg !14 {
    ret i1 true, !dbg !15
  }

C++:

  define dso_local noundef zeroext i1 @test()() local_unnamed_addr #0 !dbg !123 {
    tail call void @llvm.dbg.value(metadata i32 undef, metadata !128, metadata !DIExpression()), !dbg !129
    ret i1 false, !dbg !130
  }

This is because in Rust, the wrapping behavior means that this will always be true, but in C++, because it is UB, the compiler assumes it will always be false.
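The source behind the Godbolt link is not reproduced here, so the following is only a hypothetical check in the same spirit, not the linked code. It uses `wrapping_add` explicitly so that the result does not depend on build flags; the function name is made up for illustration.

```rust
/// Returns true when adding 1 to `x` wraps around, so that the
/// "successor" compares smaller than `x` itself. Under wrapping
/// semantics this is a well-defined question with a definite answer.
pub fn successor_is_smaller(x: i32) -> bool {
    x.wrapping_add(1) < x
}
```

Here `successor_is_smaller(i32::MAX)` is true, because `i32::MAX` wraps to `i32::MIN`, while `successor_is_smaller(0)` is false. In C++, the analogous `x + 1 < x` on a signed `int` overflows for the maximum value, which is undefined behavior, so an optimizer is allowed to assume the comparison is always false.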

> I'm a little bit puzzled about the vehement responses in the comments wow.

You are claiming that Rust has semantics that it was very, very deliberately designed to not have.

samatman · on May 24, 2024

Rust includes a great deal of undefined behavior, unlocked with the trustme keyword. Ahem, sorry, unsafe. If only...

So if we're going to be pedantic, it's safe Rust which has defined semantics for basically everything. A considerable accomplishment, to be sure.

steveklabnik · on May 24, 2024

While this is true, we’re talking about integer overflow. That’s part of safe Rust. So it’s not really germane to this conversation.