brohee's favorites | Hacker News

You can do linear lists and trees in Rust. The problem is backlinks. If you refcount everything, you can have backlinks, but otherwise there's a safety problem. Backlinks require an invariant which covers two variables, and you can't express that in Rust.

Back pointers are a special sort of pointer from an ownership perspective. They don't carry ownership, but are locked in an invariant relationship with the pointer that does. If you could express that in Rust, it would be compile-time checkable. Rust needs a way to say "field Q of struct T2 is a backpointer to field P of struct T1".

During pointer manipulation, that rule will momentarily be broken. But it's still checkable. It just needs some analysis. First, the checker would have to identify a block of code in which a backpointer was being manipulated, and locate the corresponding manipulation of the forward pointer. This is the code block of interest. It's an atomic transaction, in the database sense. Maybe the user would have to identify the code block with something like "atomic", with syntax like "unsafe".

Then, the checker would have to establish that if the invariant held at entry to the code block, it will hold at exit from the code block. This is a simple application of program verification technology. If the atomic section is small, this is not difficult.

Typical uses would be updating doubly-linked lists, rebalancing trees, and pulling part of a DOM-like tree out and moving it somewhere else.

This is one class of unsafe code - transient. It's the easiest to check, because it's a local problem. At the end of the code block, a safe state has been reestablished.

The second class of unsafe code involves partially valid data structures. This problem lies underneath "Vec", where the underlying block of storage has preallocated, uninitialized space for later growth. This is harder, because unsafe state persists outside unsafe code sections. To deal with this, it's necessary to somehow attach invariants to the data structure. If you have an array A which is valid from elements 0 to n, you need something like "valid_array(A,0,n)". Then, when you initialize an element n+1, you change the validity limits. The checker needs a few simple theorems, such as "valid_array(A,0,n) and valid_element(A[n+1)) implies valid_array(A,0,n+1)" to check this. This is old and completely automatable technology; we did it in the Pascal-F verifier 35 years ago, and Dafny does it today.

Handling these two classes of unsafe code takes care of a sizable fraction of the unsafe code really needed in Rust. With the support described above, most of the unsafe code in pure Rust could be eliminated. Outside of those two classes, most unsafe code in Rust involves external interfaces to other languages.

Unsafe code which doesn't fit into any of those classes needs to be looked at very hard.