more pornel's comments | Hacker News

Hacker News new | past | comments | ask | show | jobs | submit | more pornel's comments

login

pornel 7 months ago | parent | context | [–] | on: Unsafe Rust is harder than C

Unchecked shared mutability that causes data races and Undefined Behaviour is the pervasive default behavior in C, with no option to turn it off.

Safe Rust doesn't have this "feature".

This makes multi-threaded code in C very difficult to write correctly beyond simplest cases. It's even harder to ensure it's reliable when 3rd party code is involved. The C compiler has no idea what thread safety even is, so it can't help you (but it can backstab you with unexpected optimizations when you use regular types where atomics are required). It's up to you to understand thread-safety documentation of all code involved, if such documentation exists at all. It's even more of a pain to debug data races, because they can be impossible to reproduce when a debugger or printf slows down or accidentally synchronises the code as a side effect.

OTOH thread-safety is part of Rust's type system. Use of non-thread-safe data in a multi-threaded context is reliably and precisely caught at compile time, even across boundaries of 3rd party libraries and callbacks.

pornel 7 months ago | parent | context | | [–] | on: Why those particular integer multiplies?

BTW, this asymmetry makes unary negation in C an unexpected source of Undefined Behavior.

pornel 7 months ago | parent | context | | [–] | on: What Color is Your Function? (2015)

This article gets referenced a lot, but it does a poor job of defining what color actually is.

The biggest issue the article describes — inability for sync code to wait for an async result — is limited mostly to JavaScript, and doesn't exist in most other languages that have async and a blocking wait (they can call red functions from blue functions).

If this architectural hurdle is meant to be the color, then lots of languages with async functions don't have it. This reduces the coloring problem down to a minor issue of whether async needs to be used with a special syntax or not, and that's not such a big deal, and some may prefer async to be explicit anyway.

enragedcacti 7 months ago | | [–]

Python also has it. You can call async code with `asyncio.run` but it is extremely limited by the fact that nested event loops are prohibited. Any function that uses it becomes extremely brittle because it can't be called from async functions unlike all other synchronous code.

This is for the most part an intentional design decision to "make sure that when the user is already in async code they don't call the sync form out of laziness." [1]

[1] https://github.com/python/cpython/issues/66435#issuecomment-...

pornel 7 months ago | parent | context | | [–] | on: 3 Years After Allocating $5B for EV Chargers, 17 S...

> likeliest group of people to own an EV […] These people do not live in “disadvantaged communities.”

Isn't that the whole point? Early EV adopters are already creating a demand in richer areas, and that can be fulfilled by commercial suppliers. It's the poor areas that have a chicken-egg problem. EVs have to get cheaper, but that requires a mass-market demand for non-luxury vehicles. But people won't consider buying an EV if the infrastructure is not there, and no business will invest in chargers where there are no EV owners.

vannevar 7 months ago | | [–]

Yes. While it's true that most EV owners now are relatively affluent, there are a lot of sub-$10K used EVs on the market and if chargers are available in disadvantaged communities, those used EVs become a much more viable option.

AtlasBarfed 7 months ago | | | [–]

What we need to concentrate on is electrifying the parking lots of apartment buildings and parking garages. Doesn't need to be super-level-10 charging, it really can just be L2 or even L1 chargers.

That will bring a range of EVs to "disadvantaged": scooters, ebikes, blade scooters, skateboards, etc. It will also work for day-to-day driving in PHEVs and city car EVs, which we also desperately need to nudge car companies to start making.

The sodium ion battery EV drivetrain should enable a very very cheap city car. Sodium ion should be 40% the cost of NMC or less, and keep in mind that Tesla likely is at price parity with NMC-EV and ICE drivetrains right now.

kwere 7 months ago | | | [–]

people that live in the sticks have reasonably the space at home to charge EV's and most likely wouldnt use a charging station unless their workplace parking got some or they are on the road for a long trip.

The demand for charging station will be higher in expensive high density areas where people cant charge at home

pornel 7 months ago | parent | context | | [–] | on: Meta Bans Accounts Tracking Private Jets for Zucke...

They're not fighting anything. Your straw from a bar in a first-world country reliably ends up in a landfill, among heaps of other plastics. Far away from any turtles.

The Pacific garbage patch is mostly from fishing nets, and the other sources of plastic pollution are mainly from riverside cities too poor to have good waste management.

The whole plastic straw scare is so obviously attacking a minor non-issue at significant personal inconvenience, that I wouldn't be surprised if it was planted by climate deniers to make people angry at or cynical about actual environmentalism.

pornel 7 months ago | parent | context | | [–] | on: Rustls Outperforms OpenSSL and BoringSSL

This has been a deliberate design choice, because these primitives typically have to be constant-time, and are full of tricks to avoid CPUs' side channels. It's a very delicate code that is dangerous to rewrite.

However, TLS still involves a lot of code code that isn't pure low-level cryptography, like numerous protocol and certificate parsers, CA store interface and chain validation, networking, protocol state handling, etc.

nickpsecurity 7 months ago | | [–]

“ Rustls is a memory safe TLS implementation with a focus on performance.”

If the other commenter was right, then what they’re saying is that people seeing a Rust TLS stack outperform non-Rust stacks might assume critical operations were written in memory-safe Rust. Then, that the post was implying memory-safe Rust is fast even with low-level operations. That maybe they could use Rust to replace C/C++ in other low-level, performance-critical routines. Then, they find out the heavy-lifting was memory-unsafe code called by Rust.

It does feel misleading if a reader thought Rust was replacing ASM/C/C++ in the low-level parts. I mean, even the AI people are getting high performance wrapping unsafe accelerator code in Python. So, what’s that prove?

In these situations, I might advertise that the protocol engine is in memory-safe code while the primitives are still unsafe. Something like that.

pornel 7 months ago | | | [–]

The lowest-level routines need to be written mostly in assembly to have constant-time execution (which is a difficult task even in assembly due to the complexity of modern CPUs). None of Rust, C, nor C++ can guarantee constant-time execution, and all three have aggressive optimisers that can remove "useless" code that is meant to defend against side-channel leaks.

However, there's more to TLS than just the lowest-level primitives. There's parsing, packet construction, certificate handling, protocol logic, buffer management, connection management, etc. These things can be written in safe Rust.

nickpsecurity 7 months ago | | | [–]

That’s memory-safe Rust mixed with unsafe assembly. The Rust should block many errors that would exist in an unsafe stack. There’s definitely benefits even if the whole program is no longer memory safe.

It’s also the same strategy I would have used except maybe attempting extra verification of the assembly. It’s one of the best choices with today’s tools. There’s work on constant-time compilation and certification but I don’t know its maturity.

tptacek 7 months ago | | | | [–]

It's a post about a memory safe TLS stack outperforming the dominant memory unsafe TLS stack, with metrics. The only observation that detractors are making is that, as with virtually every TLS stack, the lowest-level cryptography primitives are done in assembly. Ok, and?

nickpsecurity 7 months ago | | | [–]

Because, by definition, it’s not a memory-safe, TLS stack at that point. Security is only as strong as its weakest link. If critical components aren’t memory safe, we don’t usually call it memory safe overall or claim it’s in a memory safe language without clear qualifiers.

The detractors are talking about how they’re marketing or describing it. They want the memory safe and Rust labels to only be used for memory safe and purely-Rust programs. That’s fair.

Outside the marketing, the stack is good work. I’m grateful for all the TLS teams do to keep us safer.

ozgrakkurt 7 months ago | | | [–]

I am switching to zig after writing rust professionally for 5+ years, but this take doesn’t make any sense having small amount of unsafe primitives is not the same as having all of your code unsafe. Especially higher level logic code can have a lot of mistakes, and the low level primitives very likely will be written by more experienced and careful people. This is the whole point of rust, even if it is questionable if it reaches it. Title only says rustls beats the other libraries which is objectively true so don’t see what is misleading here.

carlmr 7 months ago | | | [–]

>this take doesn’t make any sense having small amount of unsafe primitives is not the same as having all of your code unsafe

I've been arguing this for years. It makes the area you need to review more tightly much smaller. Making it way easier to find bugs in the first place. I'm sometimes wondering if unsafe was the right choice of keyword. Because to people that don't understand the language, it conveys the sense that Rust doesn't help with memory safety at all.

I've written a bunch of Rust, and rarely needed to use unsafe. I'd say less than 0.1% of the lines written.

Aside from that unsafe Rust still has a lot more safety precautions than standard C++. It doesn't even deactivate the borrow checker. [1]

[1] https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html

nickpsecurity 7 months ago | | | [–]

In the past, safe vs unsafe meant whether it preserved invariants in all executions of your code. Was your code type- and memory-safe by default in all situations? Was there a guarantee? If so, it was safe. If breaking that guarantee or outside the type system, it was “unsafe.”

Note: I don’t know enough about Rust to tell you how they should label it.

Another thing you might find interesting is external verification of unsafe modules. What you do is build static analyzers, verifiers, etc that can prove the absence of entire categories of bugs, esp memory safety. It’s usually for small code. You run that on the code that doesn’t use memory safety.

Another technique is making a verified, reference implementation that’s used to confirm the high-performance implementation. Their interfaces and structure are designed to match. Then, automated methods for equivalence checking verify the unsafe code matches the safe code in all observed cases. The equivalence might be formal and/or test generators.

You can also wrap the unsafe code in safe interfaces that force it to be used correctly. I imagine the Rust TLS does this to some degree. Projects like miTLS go further to enforce specific, security properties during interactions between verified and unsafe code.

The last thing to consider are abstraction gap attacks. If mixing languages or models, then the behavior of one can make the other unsafe just because they work differently. Especially in how the compiler structures or links them. This led to real vulnerabilities in Ada code that used C just due to interactions, not the C code. Although previously checked by eye, there’s a new field called secure compilation or abstract compilation trying to eliminate the integration vulnerabilities.

Lastly, if not too bad for performance, some sandboxed the unsafe code with the interfaces checking communication both ways. Techniques used include processes (seL4), segments (GEMSOS), capabilities (CHERI), and physical (FPGA coprocessors). It’s usually performance-prohibitive to separate crypto primitives like this. Whereas, coprocessors can have verified crypto and be faster, though. (See Cryptol-generated VHDL.)

nickpsecurity 7 months ago | | | | [–]

There’s no disagreement between us on the value of using mostly memory safe code. I’ve advocated it here for years.

I also promoted techniques to verify the “unsafe” portions by using different, verification methods with some kind of secure linking to avoid abstraction gap attacks.

The detractors were complaining about changing the definition of memory-safe code. It was code in a language that was immune to classes of memory safety errors. If the code compiles, the errors probably can’t occur. A guarantee.

The new definition they’re using for this project includes core blocks written in a memory unsafe language that might also nullify the safety guarantees in the other code. When compiled, you don’t know if it will have memory errors or not. That contradicts what’s expected out of memory-safe code.

So, people were objecting to it being described as memory safe Rust if it included code blocks of memory-unsafe, not-Rust code. There’s projects that write the core, performance-critical blocks in safe languages. There’s also those doing making crypto safer, like Galois’ Cryptol or SPARK Skein. So, using the right terminology helps users know what they’re getting and reviewers do apples to apples comparisons.

For this one, they might say it’s “mostly safe Rust with performance blocks written in unsafe assembler for speed.” (Or whatever else is in there.) The high-security community has often talked like that. Instead of hurting perception, it makes suppliers more trustworthy with our users more educated on well-balanced security.

stoperaticless 7 months ago | | | | [–]

> Title only says rustls beats the other libraries which is objectively true so don’t see what is misleading here.

You are correct.

Although, communication has two parts: sending and receiving.

Application named “rustFoo”, automatically is an advertising for rust, and title “RustFoo is faster than Foo” for many implies “rust is faster than <probablyC>”.

rerdavies 7 months ago | | | | [–]

And.... It raises the very pointed question as to WHY they are getting better performance when all the performance-critical code is written in C/assembler in the Intel library. It seems inconceivable that 75% of the CPU profile isn't being spent in the Intel crypto library. In which case, big fat so what?

The question is: are they cheating?

Could it possibly be that that they have (somewhat suicidally) chosen to force the AVX512 execution path, when more reasonable implementations have decided that it's not really worth risking halving the performance of EVERY OTHER TASK ON THE ENTIRE COMPUTER in order to use AVX512 for a performance gain that isn't going to matter except in the very tiniest slice of use cases -- big iron running on the edge with dozens (hundreds?) of gazillo-bit/s network adapters, doing nothing but streaming TLS connections. Plus the fact that you'd have to lock your TLS encryption code to a particular CPU core on previous-generation CPUS, which is also a Really Bad Thing To Do for a TLS transfer.

I rather suspect it's entirely that.

Even on latest generation intel CPUs it's not clear whether using AVX512 for TLS is a sensible choice. AVX52 still drops the processor frequency by 10% on latest-gen CPUs. So every core on the entire CPU would have to be spending 80% (60%?) of their time running TLS crypto code in order to realize actual benefit from using AVX-512 crypto code.

That's what and.

pornel 7 months ago | parent | context | | [–] | on: The empire of C++ strikes back with Safe C++ bluep...

It's possible, but that's an exaggeration of a degenerate case. RC doesn't mean all data must be tangled into an unknowably large randomly connected web of objects.

The behavior is deterministic enough to be profiled and identified if it actually becomes an issue.

Identifying causes of pressure on a mark-and-sweep style GC is much more difficult, and depends on specialized GC instrumentation, not just a regular profiler.

In practice, you have predictable deallocation patterns for the vast majority of objects. The things that are "young generation" in a GC are the things that get deallocated right away in RC.

Time required to deallocate is straightforwardly proportional to the dataset being freed. This can be a predictable bounded size if you're able to control what is referencing what. If you can't do that, you can't use more constant-time alternatives like memory pools either, because those surprise references would be UAFs.

I've been there when Apple moved Cocoa from GC to ARC, and the UI stutters have disappeared. It's much more palatable to users to have RC cost happening in line with the work application is doing, than have it deferred to cause jank at unexpected times seemingly for no reason.

neonsunset 7 months ago | | [–]

Generational GC designs do not "deallocate" objects. In fact, most GCs don't. It's an understandable but an unfortunate misconception that sometimes causes developers to write more GC-unfriendly code than necessary.

When a collection in a generational GC occurs, the corresponding generation's heap is scanned for live objects, which are then usually relocated to an older generation. In a most common scenario under generational theory - only few objects survive, and most die in a young/nursery/ephemeral generation (.NET, JVM and other ecosystems have different names for the same thing).

This means that upon finishing the object relocation, the memory region/segment/heap that was previously used can be immediately made available to subsequent allocations. Sometimes it is also zeroed as part of the process, but the cost of such on modern hardware is miniscule.

As a result, the more accurate intuition that applies to most generational GC implementations is that pause time / CPU cost scales with live object count and inter-generational traffic. There is no individual cost for "object deallocation". This process is vastly more efficient than reference counting. The concern for overly high allocation traffic remains, which is why allocation and collection throughput are defining characteristics of GC implementations, alongside average pause duration, frequency, costs imposed by specific design elsewhere, etc.

Allocating and deallocating a complex graph of reference-counted objects costs >10x times more than doing so with a modern GC implementation. I don't know which implementation was used back in the day in Cocoa, but I bet it was nowhere as advanced as what you'd see today in JVMs or .NET.

pornel 7 months ago | | | [–]

You've misread my comment. All my uses of "deallocate" refer to RC.

pjmlp 7 months ago | | | | [–]

The issue is that Objective-C GC was a failure, but not for the reasons Apple marketed its pivot into RC.

Thankfully the documentation is still available, so those that were there can dig it out from the archives,

"Inapplicable Patterns"

https://developer.apple.com/library/archive/documentation/Co...

"Managing Opaque Pointers"

https://developer.apple.com/library/archive/documentation/Co...

"Compile GC-Only"

https://developer.apple.com/library/archive/documentation/Co...

"Memory Management Semantics"

https://developer.apple.com/library/archive/documentation/Co...

A few entry points into the documentation, where several ifs and buts are described on how to make use of Objective-C GC, without possibility having things go wrong.

All of this contributed for Objective-C not being a sound implementation with lots of radar issues and forum discussions, as you might expect trying to make existing projects, or any random Objective-C or C library now work under Objective-C GC semantics and required changes, wasn't that easy.

Naturally having the compiler automate retain/release calls, similar to how VC++ does with _com_ptr_t (which ended up being superseded by other ways) for COM, was a much better solution, without requiring a "rewrite the world" approach.

Automate a pattern developers were already expected to do manually, and leave everything else as it is, without ifs and buts regarding code best practices, programming patterns, RC / GC interoperability issues with C semantics and so on.

The existing retain/release calls wouldn't be manually written any longer, everything else stays the same.

Naturally Apple being Apple, they had to sell this at WWDC as some kind of great achievement of how RC is much better than GC, which in a sense is correct but only from point of view of the underlying C semantics and the mess Objective-C GC turned out to be, not tracing GC algorithms in general.

pjmlp 7 months ago | | | [–]

Were you also there when GC failure was actually underlying C semantics from Objective-C, producing random crashes, especially with mixed code bases, instead of the marketing material "why RC?"?

There is a reason why RC is considered the baby algorithm from automatic memory management algorithms.

Anything that people point out as optimizations, and profiling tools, also exist for the better algorithms.

Most languages with automatic memory management support, also offer primitives to deterministic call sites, if one so desires.

Finally it isn't as if Apple is a genius that managed to revolutionized memory management algorithms, doing in Cocoa what Microsoft was already doing with COM, was the natural way out given Objective-C' GC unsound implementation.

Swift's requirement to stay compatible with Objective-C memory management, naturally required the same approach, the alternative being something like CCW/RCW COM interop from .NET, which understandably they didn't want to go down to, given previous history.

pornel 7 months ago | | | [–]

My point was that RC in practice is pretty deterministic in the vast majority of cases, and I think equating its unpredictable rare cases with natural unpredictability of a GC is scaremongering. The differences between their non-deterministic behaviors are significant enough that they can't be simply equated with a sweeping generalization.

This is true regardless of which approach has higher-throughput implementations or lower overhead overall, and completely unrelated to Apple's marketing, or Microsoft being first at something.

pornel 7 months ago | parent | context | | [–] | on: My free book "Rust Projects – Write a Redis Clone"...

No, it's not a problem at all. There are already several high-performance databases written in Rust.

Lifetimes are mainly a learning barrier for new users, and affect which internal API designs are more convenient, but they're not a constraint on the types of applications you can write.

Rust strongly guides users towards using immutability and safeguards uncontrolled shared mutability, but you can use shared mutable memory if you want. In single-threaded programs that's trivial. In multi-threaded programs shared mutability is inherently difficult for reasons beyond Rust, and Rust's safeguards actually make the problem much more tractable.

pornel 7 months ago | parent | context | | [–] | on: The empire of C++ strikes back with Safe C++ bluep...

Fil-C sounds very similar to Google's MiraclePtr.

However, Safe C++ (Circle) and Rust do much more than that. They are not limited to pointers on the heap, and the borrowing rules work for all references including the stack. They also work for references that need to be logically short even when the data is not freed, e.g. internal references to data protected by a mutex don't outlive unlocking of the mutex. And all of that is at zero runtime cost, and by guaranteeing the code correctly doesn't create dangling references in the first place, not merely by softening the blow of run-time failures of buggy code.

pizlonator 7 months ago | | [–]

Fil-C is nothing like MiraclePtr. Fil-C gives you comprehensive memory safety.

Yes, it handles references to the stack. Misuse traps or leads to other safe outcomes.

Fil-C makes it so races have memory safe outcomes (like Java).

Circle and Rust are strictly less safe than Fil-C, since both have unsafe escape hatches. Fil-C doesn't even have an unsafe escape hatch.

pornel 7 months ago | | | [–]

Oh, I remember this project now. I see it still advertises 3x-10x overhead. To me this takes it out of being a contender in the systems programming space.

This can't be dismissed as a mere quality-of-implementation detail. C and C++ are used because they don't have such overheads, so it takes away the primary reason to use these languages. When non-negligible overhead is not a problem, there are plenty of nicer languages to choose from for writing new code.

This leaves Fil-C in a role of a sandbox for legacy code, when there's some unsafe C code that won't be rewritten, but still needs to be contained at any cost. But here you need to compete with WASM and RLBox which have lower-overhead implementations.

pizlonator 7 months ago | | | [–]

Fil-C was 200x slower when I started and the latest overheads are lower than 2x in a lot of cases. It’s getting faster every month, though I don’t always update the docs to say the latest numbers (because I’m too busy actually making it faster).

I think the reason why folks end up using C is often because they have a gigantic amount of C code and for those folks, Fil-C could easily be good enough as is.

But dismissing it as a contender because the perf isn’t there today even as it’s getting steadily faster (on a super aggressive trajectory) is a bit unfair, I think.

pornel 7 months ago | | | [–]

The success of this project is going to be very non-linear with speed, so it really hangs on where your speed improvements will plateau.

If you get below 2x, you can compete with WASM and ASAN. If you get it down to 1.1x-1.2x, you can compete with RLBox. If you get down to 1.05 you can call it software-emulated CHERI and kill the whole hardware line before it even comes out.

If you get it down to 1.01x, Rust will copy you, and then beat you by skipping checks on borrowed references ;)

neonsunset 7 months ago | | | [–]

1.05-1.7x is where C# places vs C. Except you also have an actual type system, rich set of tools to diagnose performance and memory issues and ability to mix and match memory management styles. It is rudimentary when compared to borrow checking and deterministic drop in Rust, but in the year of 2024 almost every language with low-level capabilities is an upgrade over C if it can be used in a particular environment.

pizlonator 7 months ago | | | | [–]

> If you get below 2x, you can compete with WASM and ASAN.

I'm at 1.5x for a lot of workloads already. I will be solidly below 2x for sure, once I implement all of the optimizations that are in my near-term plan.

Wasm and asan don't really give you memory safety. Wasm is a sandbox, but the code running within the sandbox can have memory safety bugs and those bugs can be exploited. Hackers are good at data-only attacks. Say you run a database in wasm - then a data-only attack will give the bad guy access to parts of the database they shouldn't have access to. Fil-C stops all that because Fil-C makes C memory safe rather than just sandboxing it.

Asan also isn't really memory safe; it just catches enough memory safety bugs to be a useful tool for finding them. But it can be sidestepped. Fil-C can't be sidestepped.

So, even if Fil-C was slower than wasm or asan, it would still be useful.

> If you get it down to 1.1x-1.2x, you can compete with RLBox.

RLBox is like wasm. It's a sandbox, not true memory safety. So, it's not a direct competitor.

That said, I think I'll probably land at about 1.2x overhead eventually.

> If you get down to 1.05 you can call it software-emulated CHERI and kill the whole hardware line before it even comes out.

I can already kill CHERI because Fil-C running on my X86-64 box is faster than anything running on any CHERI HW.

No, seriously.

The biggest problem with CHERI is that you need high volume production to achieve good perf in silicon, so a SW solution like Fil-C that is theoretically slower than a HW solution is going to be faster than that HW solution in practice, provided it runs on high volume silicon (Fil-C does).

I think I'm already there, honestly. If you wanted to run CHERI today, you'd be doing it in QEMU or some dorky and slow IoT thing. Either way, slower than what Fil-C gives you right now.

> If you get it down to 1.01x, Rust will copy you, and then beat you by skipping checks on borrowed references ;)

Rust is a different animal. Rust is highly restrictive due to its ownership model, in a way that Fil-C isn't. Plus, there's billions of lines of C code that ain't going to be rewritten in Rust, possibly ever. So, I don't have to be as fast as Rust.

I don't think there is going to be much copying going on between Rust and Fil-C, because the way that the two languages achieve memory safety is so different. Rust is using static techniques to prevent bad programs from compiling, while Fil-C allows pretty much any C code to compile and uses static techniques to emit only the minimal set of checks.

pornel 7 months ago | parent | context | | [–] | on: Smart pointers for the kernel

That's cool, but it'd be nice to also have a distinction between a get and a copy (like ObjC) or borrow/move (like Rust) to avoid redundant increments and decrements.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact