pornel's comments

Stroustrup is right that C++ is under pressure, but I'm baffled that his response to this is so ineffective.

The focus on incremental backwards-compatible changes is only looking inwards, and isn't facing the external pressure. Limiting solutions to ones that don't require rewriting C++ code is not going to stop those who have already started rewriting their C++ code. Banning proposals that create safe subsets or lifetime annotations (d3466 R1 4.4/4.5) isn't satisfying to anyone who is already jumping to new languages with these things.

The direction of the C++ WG only cements the perception that C++ is for unfixable legacy codebases. They have decided that having the cake is absolutely critical, and promise to find a way to eat it.

Stroustrup seems to see the safety issue only as a problem with codebases that aren't using Modern C++ (which is a good thing to fix from C++ perspective), but doesn't seem to realize that the bar for safety has been set way above Modern C++. Those moving away from C++ have heard of smart pointers and std::vector. These aren't the solution they're looking for, these are the problems they want to get rid of.


There's address sanitizer, and languages with garbage collectors and runtime bounds checks. There are WASM VMs, and there's even RLBox, which translates WASM back into C that checks its own pointers at run time.

The difficulty is shifting most of these checks to compile time. Proving things at compile time is the holy grail, because instead of paying run-time cost only to make the program crash sooner, you can catch the violations before they even make it into the program, not pay any run-time cost, and provably not have such crashes either.
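The trade-off can be seen in miniature in Rust (used here only as an example of a language exposing both kinds of check):

```rust
fn main() {
    let data = [10, 20, 30];

    // Run-time check: `get` validates the index on every call and
    // returns None on failure, so the cost is paid at run time but
    // the program doesn't crash on a bad index.
    assert_eq!(data.get(5), None);

    // Compile-time check: the length of a fixed-size array is part
    // of its type, so destructuring it has no run-time cost, and an
    // out-of-bounds access here simply cannot be written.
    let [a, _b, _c] = data;
    assert_eq!(a, 10);
}
```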

But that needs reliable static analysis, and C++ doesn't have enough guarantees and information in its type system to make that possible with a high degree of accuracy in non-trivial cases. This is not a matter of writing a smarter tool.

Statically tracking how pointers are used quickly ends up infeasible: every if/else doubles the state space, loops can mix the state in ways that make symbolic reasoning provably impossible (undecidability), pointer aliasing creates lots of nearly-useless edge cases, and everything done through "escaping" pointers adds the state of the whole program to every individual state analysed, quickly reaching the limits of what can be proven. For example, if use of a pointer depends on obj->isEnabled, now you have to trace back all paths that lead to getting this obj instance, and all the code paths that could modify the flag, and cross-reference them to know if this particular obj could have this flag set at this point in time... which can be infeasible. Everything ends up depending on everything, and if you give up and mark it as "unknown", it spreads like NaNs, making the rest of the analysis also unknown, and you can't prove safety of anything that is complex enough to need such proof.

Rust and Circle/Safe C++ solve this problem by banning all cases that are hard for static analysis (no temporary pointers in globals, no mutable aliasing, no pointer arithmetic without checkable length, strict immutability, and single ownership and lifetime of memory are baked into the static type system, rather than being a dynamic property that needs to be inferred through analysis of the program's behavior). This isn't some magic that can be sprinkled onto a codebase. The limitations are significant, and require particular architectures and coding patterns that are compatible with them. Nobody wants to rewrite all the existing C++ code, and that applies to not wanting to rewrite for Profiles too. I don't see how C++ can have that cake and eat it too.
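As an illustration of those restrictions, here's what the allowed shape looks like in Rust; the cases the analysis can't handle are simply rejected at compile time, as the comments note:

```rust
fn main() {
    let mut scores = vec![1, 2, 3];

    // Single ownership: `scores` has exactly one owner, so the compiler
    // knows statically when it's freed -- no whole-program analysis needed.
    let first = scores[0];

    // No mutable aliasing: while `r` mutably borrows the vector, no other
    // reference to it may exist. Adding `let alias = &scores;` here and
    // using it would be rejected at compile time, which is exactly the
    // "hard for static analysis" case being banned.
    let r = &mut scores;
    r.push(first);

    assert_eq!(scores, vec![1, 2, 3, 1]);
}
```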


TIL, the HTTP RFC explicitly allows range end to exceed the length of the content:

https://www.rfc-editor.org/rfc/rfc9110#name-byte-ranges
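A sketch of those semantics, with a hypothetical `resolve_byte_range` helper (not from any particular library): per the RFC, a last-byte-pos past the end of the representation is not an error, the range is just clamped to what exists.

```rust
// If first-byte-pos is within the representation, the range is
// satisfiable; a too-large last-byte-pos means "to the end".
fn resolve_byte_range(first: u64, last: u64, content_len: u64) -> Option<(u64, u64)> {
    if content_len == 0 || first >= content_len || first > last {
        return None; // unsatisfiable range -> 416 territory
    }
    Some((first, last.min(content_len - 1)))
}

fn main() {
    // "bytes=0-1000000" against a 500-byte body is served as 0..=499.
    assert_eq!(resolve_byte_range(0, 1_000_000, 500), Some((0, 499)));
    // A range starting past the end is unsatisfiable.
    assert_eq!(resolve_byte_range(600, 700, 500), None);
}
```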


Is there a mainstream language where this still holds true?

From what I've seen most languages don't want to have a Turing complete type system, but end up with one anyway. It doesn't take much, so it's easy to end up with it accidentally and/or by adding conveniences that don't seem programmable, e.g. associated types and type equality.
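Rust's trait system is one of those accidental examples. A small taste of it, as a sketch: Peano numbers encoded as types, with addition performed by trait resolution (via an associated type) while the compiler type-checks. Push tricks like this far enough and you get Turing completeness.

```rust
use std::marker::PhantomData;

struct Zero;
struct Succ<N>(PhantomData<N>);

// Convert a type-level number back into a run-time value.
trait Nat { const VALUE: usize; }
impl Nat for Zero { const VALUE: usize = 0; }
impl<N: Nat> Nat for Succ<N> { const VALUE: usize = N::VALUE + 1; }

// Type-level addition: the associated type `Sum` is "computed"
// by the trait solver during type checking.
trait Add<B> { type Sum; }
impl<B> Add<B> for Zero { type Sum = B; }
impl<A: Add<B>, B> Add<B> for Succ<A> { type Sum = Succ<A::Sum>; }

fn main() {
    type Two = Succ<Succ<Zero>>;
    type Three = Succ<Two>;
    // 2 + 3 was evaluated entirely inside the type system.
    assert_eq!(<<Two as Add<Three>>::Sum as Nat>::VALUE, 5);
}
```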


Pretty sure the C type system is not Turing complete, but that doesn't necessarily make it superior


> exceedingly rare

To have a mere one in a billion chance of getting a SHA-256 collision, you'd need to spend 160 million times more energy than the total annual energy production on our planet (and that's assuming our best bitcoin mining efficiency; actual file hashing needs way more energy).

The probability of a collision is so astronomically small that if your computer ever observed a SHA-256 collision, it would certainly be due to a CPU or RAM failure (bit flips are within the range of probabilities that actually happen).
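The back-of-envelope arithmetic behind this: by the birthday bound, the chance of any collision among n random 256-bit hashes is roughly n²/2²⁵⁷, so solving for a one-in-a-billion chance gives the number of hashes you'd have to compute.

```rust
fn main() {
    let target_probability: f64 = 1e-9;
    // n ≈ sqrt(p * 2^257); work in log2 so nothing overflows.
    let log2_hashes_needed = (257.0 + target_probability.log2()) / 2.0;
    // ~2^113.6 hashes -- far beyond planetary energy budgets.
    assert!(log2_hashes_needed > 113.0 && log2_hashes_needed < 114.0);
    println!("~2^{:.1} hashes for a 1e-9 collision chance", log2_hashes_needed);
}
```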


You know, I've been hearing people warn of handling potential collisions for years and knew the odds were negligible, but never really delved into it in any practical sense.

Context is everything.


Reading just the first byte is probably wasting a read of the whole block.

Hashing the whole file after that is wasteful. You need to read (and hash) only as much as needed to demonstrate uniqueness of the file in the set.

The tree concept can be extended to every byte in the file:

https://github.com/kornelski/dupe-krill?tab=readme-ov-file#n...
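The lazy idea in miniature: treat each file as a stream and consume only as many blocks as needed to tell two candidates apart. The real dupe-krill keys a tree on successive block hashes; in this sketch, plain block contents stand in for hashes, and byte slices stand in for files.

```rust
const BLOCK: usize = 4; // stand-in for a filesystem read block

/// Returns true if the two byte streams are identical, comparing
/// block-by-block and stopping at the first difference, so identical
/// prefixes are never read past the point of divergence.
fn same_content(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // cheap metadata check before any "reads"
    }
    // `all` short-circuits: the first differing block ends the scan.
    a.chunks(BLOCK).zip(b.chunks(BLOCK)).all(|(x, y)| x == y)
}

fn main() {
    assert!(same_content(b"hello world!", b"hello world!"));
    // These differ in the very first block, so only one block is compared.
    assert!(!same_content(b"xello world!", b"hello world!"));
}
```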


Yeah, there is definitely some merit to more efficient hashing. Trees with a lot of duplicates require a lot of hashing, but hashing the entire file would be required regardless of whether partial hashes are done or not.

I have one data set where `dedup` was 40% faster than `dupe-krill` and another where `dupe-krill` was 45% faster than `dedup`.

`dupe-krill` uses blake3, which, last I checked, was not hardware accelerated on M series processors. What's interesting is that because of hardware acceleration, `dedup` is mostly CPU-idle, waiting on the hash calculation, while `dupe-krill` is maxing out 3 cores.

Thanks for the link!


Giving good feedback about Rust<>C bindings requires knowing Rust well. It needs deep technical understanding of Rust's safety requirements, as well as a sense of Rust's idioms and design patterns.

C maintainers who don't care about Rust may have opinions about the Rust API, but that's not the same thing :)

There are definitely things that can be done in C to make Rust's side easier, and it'd be much easier to communicate if the C API maintainer knew Rust, but it's not necessary. Rust exists in a world of C APIs, none of which were designed for Rust.

The Rust folks can translate their requirements to C terms. The C API needs to have documented memory management and thread safety requirements, but that can be in any language.


Rust already has several server frameworks that are relatively low-level network plumbing, and leave figuring out everything else to the user. If that's what you like, you can pick and choose from all the existing tools.

Rust's ecosystem is still missing its Rails or Django.

This is an attempt to make something for those "lazy" devs who don't want to write their own cookie parsing middleware, and figure out how to get a database connection pool working with a request router.


> Rust already has several server frameworks

The incredible proliferation of Rust web frameworks should be an almost blinding beacon advertising how well-suited Rust is for web backend development.

The biggest takeaway that anyone new to Rust or new to Rust-on-backend should have: Rust absolutely rocks for backend development. It's getting a tremendous amount of attention, people are trying a lot of things, and it's crystallizing as a major backend powerhouse.

You can be just as productive in Rust as in Go or, frankly, Python, and the result is super typesafe, super ergonomic, and blindingly fast. Google recently published a paper that said as much.

Rust already has several Python Flask equivalents (Actix/Axum), and it's waiting on its Rails/Django framework.

For anyone scared of Rust or the borrow checker: due to the nature of HTTP services and request flow logic, you almost never bump into it when writing backend Rust. But if you ever need to write anything with multiple hand-rolled threads or worker pools, you can. Rust opens up a lot of interesting possibilities, such as rich in-memory databases. But you certainly don't have to use these powers either if you don't need them.
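A minimal hand-rolled worker pool of the kind mentioned above, sketched with only the standard library (names are illustrative): a channel fans jobs out to threads, and the ownership and Send/Sync rules make a data race on the queue a compile error rather than a bug.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<u64>();
    // Workers share one receiver behind a mutex.
    let rx = Arc::new(Mutex::new(rx));

    let workers: Vec<_> = (0..3)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut sum = 0u64;
                loop {
                    // The lock guard is a temporary dropped at the end of
                    // this statement, so it's released before processing.
                    let job = rx.lock().unwrap().recv();
                    match job {
                        Ok(n) => sum += n * n, // "process" the job
                        Err(_) => break,       // channel closed: shut down
                    }
                }
                sum
            })
        })
        .collect();

    for job in 1..=10 {
        tx.send(job).unwrap();
    }
    drop(tx); // closing the sender drains and stops the pool

    let total: u64 = workers.into_iter().map(|w| w.join().unwrap()).sum();
    assert_eq!(total, 385); // 1² + 2² + … + 10²
}
```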


> For anyone scared of Rust or the borrow checker: due to the nature of HTTP services and request flow logic, you almost never bump into it when writing backend Rust.

I'd say for anyone worrying about it: just use `clone()` everywhere you can. If you're coming from an interpreted language, the performance and efficiency will still be so much better that it doesn't matter.
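The "just clone it" escape hatch in practice, as a tiny sketch: instead of fighting the borrow checker over how long `name` stays borrowed, hand each consumer its own copy. Wasteful in principle, rarely measurable in a typical HTTP handler.

```rust
// A function that takes ownership of its argument, as many APIs do.
fn greet(name: String) -> String {
    format!("hello, {name}")
}

fn main() {
    let name = String::from("world");
    // Cloning sidesteps any ownership puzzle: `name` stays usable
    // after each call because each call got its own copy.
    let a = greet(name.clone());
    let b = greet(name.clone());
    assert_eq!(a, b);
    assert_eq!(a, "hello, world");
}
```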


That's an excellent way to get your footing. And you can come back in a month and fix it all easily.


clone(); )?; all that stuff is just meh.


I mean, who thinks using .clone() everywhere is such a good idea?


It's a suggestion for beginners writing their first Rust program. You wouldn't do this once you feel comfortable with the language.


They might end up with a bad habit, however.


There’s https://loco.rs/ if you like that sort of Rails experience. Personally, I’ve grown more fond of having little cruft in my apps; being “lazy” about what goes into the code doesn’t sit right with me, and many of these frameworks don’t really care about that. To me, most of the value in these opinionated frameworks is in the scaffolding anyway, not the opinions.


It's not hard to just call C. Rust supports C ABI and there's tooling for converting between C headers and Rust interfaces.

The challenging part is making a higher-level "safe" Rust API around the C API. Safe in the sense that it fully uses Rust's type system, lifetimes, destructors, etc. to uphold the safety guarantees that Rust gives and make it hard to misuse the API.
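The wrapping pattern looks roughly like this sketch. A real binding would declare the C functions in an `extern "C"` block; here hypothetical `db_open`/`db_close` stubs are written in Rust so the sketch runs on its own.

```rust
// Stand-ins for a C API (hypothetical names); imagine these behind FFI.
unsafe fn db_open() -> *mut u32 {
    Box::into_raw(Box::new(42))
}
unsafe fn db_close(handle: *mut u32) {
    unsafe { drop(Box::from_raw(handle)) }
}

/// Safe wrapper: the raw handle is private, so it can only ever be
/// created valid, and `Drop` guarantees it is released exactly once.
struct Db(*mut u32);

impl Db {
    fn open() -> Db {
        Db(unsafe { db_open() })
    }
    fn value(&self) -> u32 {
        // Sound because the handle is valid for the wrapper's lifetime.
        unsafe { *self.0 }
    }
}

impl Drop for Db {
    fn drop(&mut self) {
        unsafe { db_close(self.0) }
    }
}

fn main() {
    let db = Db::open();
    assert_eq!(db.value(), 42);
} // `db` goes out of scope here; the handle is closed automatically
```

Callers never see the raw pointer, so use-after-free and double-close are unrepresentable in the safe API.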

But the objections about Rust in the kernel weren't really about the difficulty of writing the Rust code, but more broadly about having Rust there at all.


At this point it's too early to even worry about correctness; it doesn't work yet.

But the years of work put into the existing project to make it robust don't mean the exact same years have to be spent on the reimplementation:

- there's been work spent on discovering the right architecture and evolving the db format. A new impl can copy the end result.

- hard lessons have been learned about dealing with bad disks, filesystems, fsync, flaky locks, etc. A new impl can learn from the solutions without having to rediscover them the hard way.

- C projects spend some time on compatibility with C compilers, OSes, and tweaking build scripts, which are relatively easy in Rust.

Testing will need a clever solution. Maybe they'll buy access to the official test suite? Maybe they'll use the original SQLite to fuzz and compare results?
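Differential testing of that kind, in miniature: feed the same inputs to a reference implementation and the rewrite, and flag any divergence. Against SQLite this would mean running generated SQL through both engines; the two `eval` functions below are stand-ins just to show the shape of the harness.

```rust
// Stand-ins for "run this input through each implementation".
fn reference_eval(x: i64) -> i64 { x.wrapping_mul(x) }
fn rewritten_eval(x: i64) -> i64 { x.wrapping_mul(x) }

fn main() {
    // A fuzzer would generate inputs; a fixed sweep keeps this deterministic.
    for x in -1000..=1000 {
        assert_eq!(reference_eval(x), rewritten_eval(x), "divergence at {x}");
    }
    println!("no divergence found");
}
```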


The Limbo team seems to be leaning heavily into deterministic simulation testing (DST) and one of the cofounders on a recent podcast was very enthusiastic about the benefits of the approach.

https://github.com/tursodatabase/limbo/tree/main/simulator

https://changelog.com/podcast/626

