It depends. There are two variants of generational references we've been experimenting with (rough sketch after the list):
* Table-based generational references, where the generations are stored in a separate table and we retire overflowed slots. This is guaranteed memory-safe.
* Random generational references, which are stochastic and allow storing objects on the stack and inline in arrays and objects.
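Here's a rough C-style sketch of the core check, under the random (inline) variant; the names are made up for illustration and this isn't Vale's actual implementation:

    #include <stdint.h>
    #include <stdlib.h>

    typedef struct {
        // Random variant: a 64-bit generation stored inline next to the
        // object, set to a random value on allocation and changed on free.
        // (The table-based variant instead keeps an incrementing generation
        // in a separate table and retires a slot when its counter overflows.)
        uint64_t generation;
        int      payload;
    } Obj;

    typedef struct {
        Obj      *ptr;
        uint64_t  remembered_gen;  // the generation when this ref was created
    } GenRef;

    static int *gen_deref(GenRef r) {
        // A stale reference's remembered generation no longer matches the
        // object's current one, so the dereference fails loudly instead of
        // silently touching freed or reused memory.
        if (r.ptr->generation != r.remembered_gen)
            abort();
        return &r.ptr->payload;
    }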
The memory safety of the random variant depends on your definitions and baseline:
* It's not as safe as completely memory-safe languages like TypeScript/JavaScript.
* It's safer than C or C++.
* It could be more or less safe than Rust, where most programs use unsafe directly or in (non-stdlib) dependencies [0]. For mutable aliasing, instead of trying to use unsafe correctly in Rust, a Vale program would use generational references, which are checked (leaving aside the RC capabilities of both languages).
It's a very strong stochastic measure. Whereas simple memory tagging uses only 4 bits, so about 6% of invalid accesses slip through undetected, these use 64 bits, for a rate vanishingly close to zero. Combined with how they fail very loudly, invalid-dereference bugs don't lurk undetected for years the way they can in unsafe Rust and C programs.
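To put rough numbers on that (back-of-the-envelope, assuming tags and generations look uniformly random to a stale reference): an invalid access slips past a k-bit check with probability about 2^-k, so

    4-bit tag:          1/2^4  = 1/16  ≈ 6%
    64-bit generation:  1/2^64        ≈ 5.4e-20

and because a mismatch aborts immediately, even one failed check surfaces the bug.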
Additionally, one can code in a way that doesn't use any generational references at all. Ada has SPARK, Rust has Ferrocene, and Vale has linear style + regions [1].
Still, if one doesn't want to think about any of this, then safer languages like TS/JS are great options.
You're correct that if a generation leaks then it could be a problem for this C-like language's approach. Vale largely solves this by making it so user code can't read a generation (though, as in any language, unsandboxed untrusted code can get around that, of course).
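As a loose illustration of that last point (hypothetical C, not Vale's actual API): the generation can live behind an opaque handle, so safe user code only ever sees checked operations, never the raw 64-bit value:

    #include <stdint.h>
    #include <stdlib.h>

    // Public view: the handle is opaque, so ordinary user code can't read
    // or forge the generation it carries.
    typedef struct GenRefImpl GenRef;
    int gen_read(const GenRef *r);      // checked dereference, may abort

    // Private to the runtime / compiler-generated code:
    struct GenRefImpl {
        uint64_t *target_gen;           // the target object's current generation
        uint64_t  remembered_gen;       // never handed to user code
        int      *payload;
    };

    int gen_read(const GenRef *r) {
        if (*r->target_gen != r->remembered_gen)
            abort();                    // stale reference: fail loudly
        return *r->payload;
    }

C itself can't enforce that boundary against casts, which is the unsandboxed-untrusted-code caveat above; in Vale it's the language that refuses to expose the generation.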
> where most programs use unsafe directly or in (non-stdlib) dependencies [0].
Interesting study I hadn't seen before. I think stdlib-vs-external-crate is a somewhat arbitrary distinction. Rust deliberately keeps a relatively small standard library, and there is a core set of special crates written by roughly the same people and to similar quality as the standard library. A particularly obvious example is hashbrown: the standard library HashMap and HashSet are thin wrappers around the external crate, so using the external crate directly is just as safe.
I'd be interested in a more recent paper that includes miri test coverage.
The authors also include a good caveat that it's 60% of popular libraries, not 60% of codebases. If you browse crates.io and compare download counts to dependent counts, you'll see downloads are far, far larger, but the gap varies wildly with the type of crate. Crates used mostly by applications rather than by other libraries tend not to show up as dependencies of crates published to crates.io, and I'd expect those to be less likely to use unsafe. But it's still clearly a lot. (And I think my point above that there isn't a clean stdlib-vs-external-crates distinction cuts the other way as well: there have been soundness bugs in the standard library.)
as someone with an interest in the subject but not a domain expert, I have an imprecise understanding of what you both are referring to when you say "a stochastic measure."
do you have any resources that you could point me to for reading up on stochastic memory management?
[0] https://2020.icse-conferences.org/details/icse-2020-papers/6...
[1] https://verdagon.dev/blog/first-regions-prototype