
Interesting! But there’s a gap between aspirations and what was accomplished here.

Early on in the blog post, the author mentions that "c2rust can produce a mechanical translation of C code to Rust, though the result is intentionally 'C in Rust syntax'". The flow of the post seems to suggest that LLMs can do better. But later on, they say that their final LLM approach produces Rust code which "is very 'C-like'" because "we use the same unsafe C interface for each symbol we port". That sounds like they achieved roughly the same result as c2rust, but with a slower and less reliable process.

It’s true that, as the author says, "because our end result has end-to-end fuzz tests and tests for every symbol, it's now much easier to 'rustify' the code with confidence". But it would have been possible to use c2rust for the actual port, and separately use an LLM to write fuzz tests.

I'm not criticizing the approach. There's clearly a lot of promise in LLM-based code porting. I took a look at the earlier, non-fuzz-based Claude port mentioned in the post, and it reads like idiomatic Rust code. It would be a perfect proof of concept, if only it weren't (according to the author) subtly buggy. Perhaps there's a way to use fuzzing to remove the bugs while keeping the benefits compared to mechanical translation. Unfortunately, the author's specific approach to fuzzing seems to have removed both the bugs and the benefits. Still, it's a good base for future work to build on.
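To make the "use fuzzing to remove the bugs" idea concrete, here is a minimal sketch of differential testing between an original routine and an idiomatic port. Everything in it is an assumption for illustration: `reference_sum` stands in for the original C function (which a real setup would call through FFI), `ported_sum` is a hypothetical idiomatic rewrite, and the random inputs come from a tiny hand-rolled xorshift generator rather than a coverage-guided fuzzer. It is not the blog author's harness.

```rust
// Stand-in for the original C function. In a real setup this would be the
// C symbol called through FFI; here it is written in a C-like style.
fn reference_sum(data: &[u8]) -> u32 {
    let mut total: u32 = 0;
    let mut i = 0;
    while i < data.len() {
        total = total.wrapping_add(data[i] as u32);
        i += 1;
    }
    total
}

// Hypothetical idiomatic port under test.
fn ported_sum(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}

// Tiny xorshift64 generator so the sketch needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

// Returns the first random input on which the two implementations
// disagree, or None if they agree on every generated input.
fn find_mismatch(seed: u64, iterations: usize) -> Option<Vec<u8>> {
    // xorshift must not start from zero, so clamp the seed to at least 1.
    let mut rng = XorShift(seed.max(1));
    for _ in 0..iterations {
        let len = (rng.next_u64() % 64) as usize;
        let input: Vec<u8> = (0..len).map(|_| (rng.next_u64() & 0xff) as u8).collect();
        if reference_sum(&input) != ported_sum(&input) {
            return Some(input);
        }
    }
    None
}
```

A call such as `find_mismatch(42, 1000)` returns a counterexample input if the port ever diverges from the reference, which is the signal you would feed back into fixing the port. A real fuzzer would add coverage guidance and input minimization on top of this.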



It's in between. It's more C-like than the Claude port, but more Rust-y than c2rust. How much depends on how fine-grained you want to make your port and how you want to prompt your LLM. Inside functions and for internal symbols, the LLM is free to use more idiomatic constructs and structures. But since the goal was to test the effectiveness of the fuzz testing, using the LLM to do the symbol translation is more of an implementation detail.

You could certainly try using c2rust to do the initial translation, and it's a reasonable idea, but I didn't find that the LLMs really struggled with this part of the task, and there's certainly more flexibility this way. c2rust also seemed to choke on some simple functions, so I didn't pursue it further.

And of course for external symbols, you're constrained by the C API, so how much leeway you have depends on the project.
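As a rough illustration of that split, here is a sketch in which the exported symbol keeps a C-shaped, unsafe signature while the internals are ordinary Rust. The names `port_checksum` and `checksum` and the checksum itself are hypothetical, not taken from the ported project. A real port would also mark the exported function with the no-mangle attribute so the linker sees the original C symbol name; it is left out here to keep the sketch edition-independent.

```rust
use std::slice;

// Hypothetical internal helper: not part of the C API, so it is free to
// take a slice and use iterator-style Rust.
fn checksum(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(u32::from(b)))
}

/// Hypothetical exported symbol: the signature has to mirror the C header
/// (raw pointer plus length), so this layer stays C-like.
///
/// # Safety
/// `data` must point to `len` readable bytes, or be null with `len == 0`.
pub unsafe extern "C" fn port_checksum(data: *const u8, len: usize) -> u32 {
    if data.is_null() || len == 0 {
        return checksum(&[]);
    }
    // Convert the raw pointer into a slice once, at the boundary, and hand
    // everything else to safe code.
    checksum(unsafe { slice::from_raw_parts(data, len) })
}
```

The unsafe surface is confined to the one conversion at the boundary, so later "rustification" can change `checksum` freely as long as `port_checksum` keeps its signature.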

You can also imagine having the LLM produce more idiomatic code from the beginning, but that can be hard to square with the incremental symbol-by-symbol translation.



