
There are filesystems that support inline or post-process deduplication. btrfs[1] and zfs[2] come to mind as free ones, but there are also commercial ones like WAFL etc.

It's always a tradeoff. Deduplication is a CPU-heavy process, and if it's done inline it is also memory-heavy, since the lookup table of block hashes generally has to stay in RAM to be consulted on every write. So you're basically trading CPU and memory for storage space. Whether it's worth it depends heavily on the use case and on the particular FS / deduplication implementation.

[1]: https://btrfs.wiki.kernel.org/index.php/Deduplication

[2]: https://docs.oracle.com/cd/E36784_01/html/E39134/fsdedup-1.h...
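
To make the tradeoff above concrete, here is a minimal sketch of fixed-size block deduplication in Python. It is only an illustration, not how btrfs, ZFS, or WAFL actually implement it: the function names `dedup_blocks` and `restore`, the 4096-byte block size, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def dedup_blocks(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep each unique block once."""
    store = {}   # digest -> block bytes; this in-memory table is the memory cost
    recipe = []  # ordered list of digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Hashing every block on the write path is the CPU cost.
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    data = b"A" * 4096 * 3 + b"B" * 4096
    store, recipe = dedup_blocks(data)
    stored = sum(len(block) for block in store.values())
    print(f"logical bytes: {len(data)}, stored bytes: {stored}")
```

Every block gets hashed (CPU), and the hash table has to be consulted for every write (memory). The space saved is whatever the repeated blocks would otherwise have taken up.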



One problem is if you need to support Windows clients. Microsoft charges $1600 for deduplication support or something like that: https://learn.microsoft.com/en-us/windows-server/storage/dat...


Deduplication is included with every version and edition of Windows Server since 2012. You need to license Windows Server properly of course, but there is no add-on cost for deduplication.


There exists an open-source btrfs filesystem driver for Windows...


Yeah, which is great for storage but doesn't help over the wire.


ZFS at least supports sending a deduplicated stream.


Right, and btrfs can send a compressed stream as well, but we aren't sending raw filesystem data via VCS.



