Hacker News | ludocode's comments

I just use ConnectBot to ssh to my house. It runs tmux and vim well, especially with a little pocket-size folding bluetooth keyboard to go with it.


These filesystems are not really alternatives because mdraid supports features those filesystems do not. For example, parity raid is still broken in btrfs (so it effectively does not support it), and last I checked zfs can't grow a parity raid array while mdraid can.

I run btrfs on top of mdraid in RAID6 so I can incrementally grow it while still having copy-on-write, checksums, snapshots, etc.

I hope that one day btrfs fixes its parity raid or bcachefs will become stable enough to fully replace mdraid. In the meantime I'll continue using mdraid with a copy-on-write filesystem on top.
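For concreteness, a setup like the one described might look like this. This is a hedged sketch, not the commenter's actual commands; device names and the mount point are hypothetical, and growing a RAID6 reshape takes a long time on real hardware:

```shell
# Create a RAID6 array from four disks (hypothetical device names)
mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Put a copy-on-write filesystem on top for checksums and snapshots
mkfs.btrfs /dev/md0
mount /dev/md0 /mnt

# Later: grow the array incrementally by adding one disk and reshaping
mdadm --add /dev/md0 /dev/sde
mdadm --grow /dev/md0 --raid-devices=5

# Then expand the filesystem into the new space
btrfs filesystem resize max /mnt
```

The trade-off the replies point out applies here: btrfs can detect corruption via its checksums, but because the redundancy lives a layer below in mdraid, it cannot automatically fetch a known-good copy from another drive.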


> zfs can't grow a parity raid array while mdraid can.

Indeed out of date: that was merged a long time ago and shipped in a stable version earlier this year.


Like everything else in engineering, it's a matter of trade-offs. The setup you chose really hampers the usefulness of a checksumming filesystem, since it cannot simply fetch the correct data from another drive. As a peer pointed out, ZFS does support adding additional drives to expand a RAIDZ (with some trade-offs). What you cannot do is change the raid topology on the fly.


soon :)


Here's a link to the case on the Framework marketplace:

https://frame.work/ca/en/products/cooler-master-mainboard-ca...

I put my original mainboard in one of these when I upgraded. It's fantastic. I had it VESA-mounted to the back of a monitor for a while which made a great desktop PC. Now I use it as an HTPC.


Indeed, a decent closed hash table is maybe 30 lines. An open hash table with linear probing is even less, especially if you don't need to remove entries. It's almost identical to a linear search through an array; you just change where you start iterating.

In my first stage Onramp linker [1], converting linear search to an open hash table adds a grand total of 24 bytecode instructions, including the FNV-1a hash function. There's no reason to ever linear search a symbol table.

[1]: https://github.com/ludocode/onramp/blob/develop/core/ld/0-gl...


A linear search may be faster because it is cache- and branch-prediction-friendly. Benchmarks on real-world data are needed to make a final call.


Indeed! An eventual goal of Onramp is to bootstrap in freestanding so we can boot directly into the VM without an OS. This eliminates all binaries except for the firmware of the machine. The stage0/live-bootstrap team has already accomplished this so we know it's possible. Eliminating firmware is platform-dependent and mostly outside the scope of Onramp but it's certainly something I'd like to do as a related bootstrap project.

A modern UEFI is probably a million lines of code so there's a huge firmware trust surface there. One way to eliminate this would be to bootstrap on much simpler hardware. A rosco_m68k [1] is an example, one that requires no third-party firmware at all aside from the non-programmable microcode of the processor. (A Motorola 68010 is thousands of times slower than a modern processor so the bootstrap would take days, but that's fine, I can wait!)

Of course there's still the issue of trusting that the data isn't modified getting into the machine. For example you have to trust the tools you're using to flash EEPROM chips, or if you're using an SD card reader you have to trust its firmware. You also have to trust that your chips are legit, that the Motorola 68010 isn't a modern fake that emulates it while compromising it somehow. If you had the resources you'd probably want to x-ray the whole board at a minimum to make sure the chips are real. As for trusting ROM, I have some crazy ideas on how to get data into the machine in a trustable way, but I'm not quite ready to embarrass myself by saying them out loud yet :)

[1]: https://rosco-m68k.com/


Author here. I think my opinion would be about the same as the authors of the stage0 project [1]. They invested quite a bit of time trying to get Forth to work but ultimately abandoned it. Forth has been suggested often for bootstrapping a C compiler, and I hope someone does it someday, but so far no one has succeeded.

Programming for a stack machine is really hard, whereas programming for a register machine is comparatively easy. I designed the Onramp VM specifically to be easy to program in bytecode, while also being easy to implement in machine code. Onramp bootstraps through the same linker and assembly languages that are used in a traditional C compilation process so there are no detours into any other languages like Forth (or Scheme, which live-bootstrap does with mescc.)
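A tiny illustration of the difference, using hypothetical syntax (this is not actual Onramp bytecode or stage0 code):

```
; register machine: a = b + c, operands named directly in the instruction
add ra, rb, rc

\ stack machine (Forth): operands live on an implicit stack
b @ c @ + a !     \ fetch b, fetch c, add, store into a
```

On a register machine the compiler just assigns operands to registers; on a stack machine the compiler (or programmer) must mentally track what is on the stack at every point, which is much of what makes hand-writing the early stages hard.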

tl;dr I'm not really convinced that Forth would simplify things, but I'd love to be proven wrong!

[1]: https://github.com/oriansj/stage0?tab=readme-ov-file#forth


You might get a kick out of DuskOS(baremetal forth system)'s C compiler.

https://git.sr.ht/~vdupras/duskos/tree/master/item/fs/doc/co...


To add a bit to this: although Dusk OS doesn't have the same goal as stage0, namely mitigating the "trusting trust" attack, I think it effectively achieves it. Dusk OS kernels are less than 3000 bytes; the rest boots from source. One can easily audit those 3000 bytes by hand to ensure that there's nothing inserted.

That being said, the goal of stage0 is to ultimately compile gcc and there's no way to do that with Dusk OS.

That being said (again), this README in stage0 could be updated because I indeed think that Dusk is a good counterpoint to this critique of Forth.


Oh, amazing! I've heard of DuskOS before but I didn't realize its C compiler was written in Forth.

Looks like it makes quite a few changes to C so it can't really run unmodified C code. I wonder how much work it would take to convert a full C compiler into something DuskCC can compile.

One of my goals with Onramp is to compile as much unmodified POSIX-style C code as possible without having to implement a full POSIX system. For example Onramp will never support a real fork() because the VM doesn't have virtual memory, but I do want to implement vfork() and exec().


It can't compile unmodified C code targeting POSIX. That's by design; allowing it would bring far too much complexity into the project.

But it does implement a fair chunk of C itself. The idea is to minimize the magnitude of the porting effort and make it mechanical.

For example, the driver for the DWC USB controller (the controller on the Raspberry Pi) comes from Plan 9. There was a fair amount of porting to do, but it was mostly removing the unnecessary hooks. The code itself, where the real logic happens, stays pretty much the same and can be compiled just fine by Dusk's C compiler.


That might be rather more difficult than you might expect. Advent of Code uses quite a lot of 64-bit numbers. A bit of googling tells me C64 BASIC only supports 16-bit integers and 32-bit floats. I imagine the other BASICs have similar limitations.

I did 2023 Advent of Code with my own compiler and this was the biggest challenge I ran into. I only had 32-bit integers at the time so I had to manually implement 64-bit math and number formatting within the language to be able to do the puzzles. You would probably have to do the same in BASIC.


Commodore BASIC, derived from Microsoft's 6502 BASIC, actually has 40-bit floats, with a 32-bit mantissa and 8-bit exponent, not that it would help much, if any, with 64-bit maths.

There are Microsoft BASICs that have 64-bit floats, such as built into ROM on the TRS-80 Model I, III and 4 w/Level 2 BASIC, TRS-80 Model 100/102, TI-99/4(a), Apple III, and MSX systems, or on cartridge such as Microsoft BASIC for the Atari 8-bit computers.


The generated C code could contain a backdoor. Generated C is not really auditable so there would be no way to tell that the code is compromised.


It can be difficult to explain why bootstrapping is important. I put a "Why?" section in the README of my own bootstrapping compiler [0] for this reason.

Security is a big reason and it's one the bootstrappable team tend to focus on. In order to avoid the trusting trust problem and other attacks (like the recent xz backdoor), we need to be able to bootstrap everything from pure source code. They go as far as deleting all pre-generated files to ensure that they only rely on things that are hand-written and auditable. So bootstrapping Python for example is pretty complicated because the source contains code generated by Python scripts.

I'm much more interested in the cultural preservation aspect of it. We want to preserve contemporary media for future archaeologists, for example in the Arctic World Archive [1]. Unfortunately it's pointless if they have no way to decode it. So what do we do? We can preserve the specs, but we can't really expect them to implement x265 and everything else they would need from scratch. We can preserve binaries, but then they'd need to either get thousand-year-old hardware running or virtualize a thousand-year-old CPU. We can give them, say, a definition of a simple Lisp, and then give them code that runs on that, but then who's going to implement x265 in a basic Lisp? None of this is really practical.

That's why in my project I made a simple virtual machine, then bootstrapped C on top of it. It's trivially portable, not just to present-day architectures but to future and alien architectures as well. Any future archaeologist or alien civilization could implement the VM in a day, then run the C bootstrap on it, then compile ffmpeg or whatever and decode our media. There are no black boxes here: it's all debuggable, auditable, open, handwritten source code.

[0]: https://github.com/ludocode/onramp?tab=readme-ov-file#why-bo...

[1]: https://en.wikipedia.org/wiki/Arctic_World_Archive


Yep, I think this would have been good context in the OP


Say you start with nothing but "pure source code".

With what tool do you process that source code?


The minimum tool that bootstrapping projects tend to start with is a hex monitor. That is, a simple-as-possible tool that converts hexadecimal bytes of input into raw bytes in memory, and then jumps to it.

You need some way of getting this hex tool in memory of course. On traditional computers this could be done on front panel switches, but of course modern computers don't have those anymore. You could also imagine it hand-woven into core rope memory for example, which could then be connected directly to the CPU at its boot address. There are many options here; getting the hex tool running is very platform-specific.

Once you have a hex tool, you can then use that to input the next stage, which is written in commented hexadecimal source code. The next tool then adds a few features, and so does the tool after that, and so on, eventually working your way up to assembly and C.


From the point of view of trust and security, bootstrapping has to be something that's easily repeatable by everyone, in a reasonable amount of time and steps, with the same results.

Not to mention using only the current versions of all the deliverables or at most one version back.


While it is technically possible to bootstrap Rust from Guile and the 0.7 Rust compiler, you would need to recompile the Rust compiler about a hundred times. Each step takes hours, and you can't skip any steps because, like he said, 1.80 requires 1.79, 1.79 requires 1.78, and so on all the way back to 0.7. Even if fully automated, this bootstrap would take months.

Moreover, I believe the earlier versions of rustc only output LLVM, so you need to bootstrap a C++ compiler to compile LLVM anyway. If you have a C++ compiler, you might as well compile mrustc. Currently, mrustc only supports rustc 1.54, so you'd still have to compile through some 35 versions of it.

None of this is practical. The goal of Dozer (this project) is to be able to bootstrap a small C compiler, compile Dozer, and use it to directly compile the latest rustc. This gives you Rust right away without having to bootstrap C++ or anything else in between.


This is accurate. I'm an OS/kernel developer and a colleague was given the task of porting rust to our OS. If I remember correctly, it did indeed take months. I don't think mrustc was an option at the time for reasons I don't recall, so he did indeed have to go all the way back to the very early versions and work his way through nearly all the intermediate versions. I had to do a similar thing porting java, although that wasn't quite as annoying as porting rust. I really do wish more language developers would provide a more practical way of bootstrapping their compilers like the article is describing/attempting. I've seen some that do a really good job. Others seem to assume only *nix and Windows exist, which has been pretty frustrating.


I'm curious as to why you need to bootstrap at all? Why not start with adding the OS/kernel as a target for cross-compilation and then cross-compile the compiler?


Nim uses a smaller bootstrap compiler that uses pre-generated C code to then build the compiler proper. It's pretty nifty for porting.


The article mentions that the Bootstrappable Builds folks don't allow pre-generated code in their processes, they always have to build or bootstrap it from the real source.


that's interesting! what kind of os did you write? it sounds like you didn't think supporting the linux system call interface was a good idea, or perhaps even feasible?


It's got a fairly Linux-like ABI, though we don't aim or care to be 1:1 compatible, and it has/requires our own custom interfaces. Porting most software written for Linux is usually pretty easy, but we can't just run binaries compiled for Linux on our stuff. So for languages that require a compiler written in the language itself, where they don't supply cross-compilers or bootstrapping compilers built with the lowest common denominator (usually C or C++), things can get a little trickier.


interesting! what applications are you writing it for?


Building rustc doesn't take hours on a modern machine. Building it 100 times would take on the order of a day, not months.

> Moreover, I believe the earlier versions of rustc only output LLVM, so you need to bootstrap a C++ compiler to compile LLVM anyway.

This is a more legit point.


The current version of rustc may compile itself quickly, but remember, this is after nearly ten years of compiler optimizations. Older versions were much slower.

I seem to recall complaints that old rustc would take many hours to compile itself. Even if it takes on average, say, two hours to compile itself, that's well over a week to bootstrap all the way from 0.7 to present. You're right that months is probably an exaggeration, but I suspect it might take a fair bit longer than a week. The truth is probably somewhere in the middle, though I suppose there's no way to know without trying it.


> Moreover, I believe the earlier versions of rustc only output LLVM, so you need to bootstrap a C++ compiler to compile LLVM anyway. If you have a C++ compiler, you might as well compile mrustc. Currently, mrustc only supports rustc 1.54, so you'd still have to compile through some 35 versions of it.

Not sure I follow - isn't rustc still only a compiler frontend to LLVM, like clang is for C/C++? So if you have any version of rustc, haven't you at that point kind of "arrived" and started bootstrapping it on itself, meaning mission complete?

Ultimately, from what I glean, the answer really is just that this would be made nicer with Dozer, but I still wish the author had stated this explicitly in the post. It's not like the drudgery of the OCaml route escapes me.


> Not sure I follow - isn't rustc still only a compiler frontend to LLVM, like clang is for C/C++?

The rustc source tree currently includes LLVM, GCC, and Cranelift backends: https://github.com/rust-lang/rust/blob/c6db1ca3c93ad69692a4c...

(Cranelift itself is written in Rust.)


