
It's huge for being able to replay big WARC files in a browser without having to download the whole thing. (E.g. try loading a 700 MB WARC from IPFS to visit one page within it; it's too slow to work as-is.)

It's used extensively by the Browsertrix/Webrecorder.io projects (whose team pioneered the WACZ format) and a few other projects.



Oh, I may have missed that part. So the WACZ (indexes?) can contain offsets into the WARC file itself for each individual page?


WACZ is a replacement for WARC that has the index with offsets built in.
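
To make the "index with offsets" idea concrete, here is a minimal Python sketch. It is not the real WACZ or CDXJ layout: the `build_archive` and `read_record` helpers, the URLs, and the dict-based index are all made up for illustration. It assumes each record is its own gzip member, so a single record can be decompressed from just its byte range, which is what lets a replayer fetch one page without the rest of the file.

```python
import gzip
import io


def build_archive(records):
    """Concatenate per-record gzip members and note where each one starts.

    Hypothetical helper: returns the archive bytes plus an index that maps
    each URL to an (offset, length) pair.
    """
    buf = io.BytesIO()
    index = {}
    for url, payload in records.items():
        member = gzip.compress(payload)
        index[url] = (buf.tell(), len(member))
        buf.write(member)
    return buf.getvalue(), index


def read_record(archive, index, url):
    """Fetch one record using only its byte range."""
    offset, length = index[url]
    # The slice stands in for an HTTP Range request against a remote file.
    chunk = archive[offset:offset + length]
    return gzip.decompress(chunk)


# Illustrative data only.
records = {
    "https://example.com/a": b"page A",
    "https://example.com/b": b"page B",
}
archive, index = build_archive(records)
page = read_record(archive, index, "https://example.com/b")
```

Only the bytes for the second record are touched when reading `page`; in a browser the slice would be a ranged fetch, so the cost scales with the page rather than with the archive.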


But it uses WARC files inside as the archive format. It seems weird to call it a replacement when the original is still present.


I just meant that, from a user's perspective, it's a format that supersedes WARC. But internally, yes, one is an encapsulation format for the other.
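
A rough Python sketch of that encapsulation, for anyone following along: a ZIP container holding the untouched WARC bytes next to an index that points into them. The member names (`archive/data.warc`, `indexes/index.jsonl`), the index fields, and the placeholder record are assumptions for illustration, not the actual WACZ spec layout. It also assumes members are stored rather than deflated, so offsets into the inner file stay meaningful.

```python
import io
import json
import zipfile

# Placeholder bytes standing in for a real WARC file.
warc_bytes = b"WARC/1.1\r\nplaceholder record body"

# Hypothetical index entry: which inner file, and which byte range in it.
index_entry = {
    "url": "https://example.com/",
    "filename": "data.warc",
    "offset": 0,
    "length": len(warc_bytes),
}

# Write the container: the WARC goes in unchanged, the index sits beside it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("archive/data.warc", warc_bytes)
    zf.writestr("indexes/index.jsonl", json.dumps(index_entry) + "\n")

# Read it back: look up the index, then slice the record out of the WARC.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    names = zf.namelist()
    entry = json.loads(zf.read("indexes/index.jsonl"))
    inner = zf.read("archive/" + entry["filename"])
    record = inner[entry["offset"]:entry["offset"] + entry["length"]]
```

The WARC comes back byte for byte, which is why both descriptions above hold: the container is what users handle, and the original format is still inside it.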



