
It's been tried several times, but it's hard because it's such a massive quantity of data. The IPFS backup never really got off the ground.

They have their own backups, which I think are good enough for now, unless someone plans on donating a few hundred million.



Oh no! I didn't know their IPFS initiative didn't pan out. What happened to it? I am surprised how hard it is to google. I remember interviewing for a role on that team at the Archive to help move it to Filecoin. I was so happy to hear that the effort was underway to decentralize their datastore. We need this more than ever.


There are people still working on trying to make it happen, but it's just a colossal amount of data and filesystems are notoriously hard, so it's very slow going.

From my own personal experience doing distributed archiving with no relation to Archive.org, Filecoin/IPFS's UX isn't quite there yet. They still don't let you serve data to the network from a normal filesystem; you have to let their system ingest all of your stuff, so you end up double-storing data, or you have to give in to everything being stored as inscrutable binary blobs.

That's why I still haven't integrated ArchiveBox with IPFS/Filecoin/Storj. Let my data live in a normal filesystem, dammit!


> They still don't let you serve data to the network from a normal filesystem; you have to let their system ingest all of your stuff, so you end up double-storing data, or you have to give in to everything being stored as inscrutable binary blobs.

I don't understand this part. What data would you have to give them? Why can't it just live next to your stuff on your OS's filesystem?


For IPFS, I'm fairly sure you can now serve from your normal filesystem rather than load it into their block storage -- or at least the block storage holds pointers to real data blocks that are part of your existing files (this is the --nocopy option[1]; it's marked as experimental, so there may be some sharp edges).

For Filecoin, if you want fast access, you do need to keep a second hot plaintext copy, as well as the sealed Filecoin copy. But that works for the backup case for IA, because the hot copy would be served from the archive's existing infrastructure (and/or a distributed IPFS hot cache) -- you'd just use Filecoin for the proven safe backup.

The project to back up IA to Filecoin is still ongoing. The IA dashboard that shows the current state is (perhaps predictably) down at the moment, but it crossed the 1PiB line last year[2], and they've been optimising the onboarding flow recently.

[1] https://docs.ipfs.tech/reference/kubo/cli/#ipfs-add

[2] https://blog.archive.org/2023/10/20/celebrating-1-petabyte-o...

(Disclosure: I work at the Filecoin Foundation/Filecoin Foundation for the Decentralized Web, which partners with the Archive on this project, as well as supporting other Internet Archive backup projects.)


Keeping a separate hot copy at 220PiB already costs ~$7M/yr, and many multiples of that if you factor in labor and redundancy. The --nocopy option looks great though. I didn't see it last time I was looking around for an MFS/FUSE solution, so I'll try it.
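For a sense of scale, here is a rough back-of-the-envelope sketch of what the ~$7M/yr figure implies per terabyte. The only inputs are the 220PiB and $7M/yr numbers quoted above; the function name is made up for illustration, and the result is just the storage rate those two numbers imply, not a quote from any provider.

```python
PIB = 1024 ** 5  # bytes in one pebibyte
TB = 10 ** 12    # bytes in one decimal terabyte


def implied_cost_per_tb_month(capacity_pib: float, annual_cost_usd: float) -> float:
    """Back out the per-TB monthly rate implied by a yearly total (hypothetical helper)."""
    capacity_tb = capacity_pib * PIB / TB
    return annual_cost_usd / capacity_tb / 12


capacity_tb = 220 * PIB / TB
rate = implied_cost_per_tb_month(220, 7_000_000)
print(f"220 PiB is about {capacity_tb:,.0f} TB")
print(f"implied rate: ${rate:.2f} per TB per month")
```

That works out to a bit under $2.40 per TB per month for raw storage, before any labor or redundancy is counted.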

I appreciate your effort and I hope the project continues.


They're saying that the client software (the servers that speak the IPFS protocols) has to load the files to be served into its own local storage database; it can't just keep a "metadata file" and read the existing files off disk. Presumably somebody could write a client that spoke the IPFS protocol and did this, or fork the main Go or JS one, but until someone does that they're stuck with the software that's already been written.
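To make the "metadata file" idea concrete, here is a minimal sketch of an index that stores only pointers (path, offset, length) into existing files and reads blocks straight off disk on request. This is an illustration of the general approach, not how any real IPFS client is implemented; the chunk size, function names, and file name are all assumptions.

```python
import hashlib
import os
import tempfile

CHUNK_SIZE = 256 * 1024  # illustrative chunk size (assumption)


def index_file(path, index, chunk_size=CHUNK_SIZE):
    """Record hash -> (path, offset, length) for each chunk; no data is copied."""
    keys = []
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            key = hashlib.sha256(chunk).hexdigest()
            index[key] = (path, offset, len(chunk))
            keys.append(key)
            offset += len(chunk)
    return keys


def read_block(index, key):
    """Serve a block by seeking into the original file, then re-verify its hash."""
    path, offset, length = index[key]
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    if hashlib.sha256(data).hexdigest() != key:
        raise ValueError(f"{path} changed on disk since it was indexed")
    return data


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "snapshot.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"a" * 10 + b"b" * 10)

    index = {}
    keys = index_file(path, index, chunk_size=10)
    first = read_block(index, keys[0])

    # Overwrite the file in place: the stale pointer is caught on the next read.
    with open(path, "wb") as f:
        f.write(b"z" * 20)
    try:
        read_block(index, keys[0])
        stale_detected = False
    except ValueError:
        stale_detected = True
```

The trade-off is visible in the last step: nothing is double-stored, but the index is only valid as long as the original files stay untouched.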


IPFS is all content-hash-addressed, so my guess is the IPFS service spirits the files away to a (hopefully) immutable store for the sake of sanity.
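A toy illustration of why content-hash addressing pushes toward an immutable store: the address is derived from the bytes, so changing the bytes gives you a different address rather than an updated file. This is a simplified sketch using a plain SHA-256 hex digest; real IPFS identifiers and the chunking of large files are not modeled here, and the class name is made up.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is the SHA-256 of the value."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]


store = ContentStore()
v1 = store.put(b"archived page, version 1")
v2 = store.put(b"archived page, version 2")
again = store.put(b"archived page, version 1")

print(v1 == again)  # identical bytes always map to the same address
print(v1 == v2)     # edited bytes get a new address; the old entry is untouched
```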


Perhaps you can persuade Elon that it owns the libs?


I don't want Elon anywhere near Archive.org, please don't give him any ideas. There are plenty of other people in the world with money.


Yes please, we need this lunatic out of our lives, not the other way around.


"Based on historical records from the first half of the last century, Mr Musk (inventor of the car and the rocket) and President Xi were the most respected and popular individuals on earth."


History is written by the winners...


Maybe in the immediate aftermath, but not long after. King Leopold "won" but we now all think he was terrible.



