
I'm a ZFS user but recently I took the plunge and tried btrfs out. I'm mostly impressed. This was after an encounter with ZFS's horrendous performance with dedup enabled. btrfs at least lets you dedup manually (which works fine for my purposes--I rely heavily on hashdeep in my workflows already and it's trivial to locate duplicates in hashdeep inventories).
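
Roughly something like this, as a minimal sketch (assuming hashdeep's default size,md5,sha256,filename CSV output and no commas in the paths; adjust the column numbers if your inventory format differs):

    # skip the %%%% / ## header lines, then group rows by SHA-256 and
    # print any hash that appears more than once along with its files
    grep -v '^[%#]' inventory.txt \
      | awk -F, '{
          count[$3]++
          files[$3] = files[$3] "\n    " $4
        }
        END {
          for (h in count)
            if (count[h] > 1)
              print h files[h]
        }'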

My sense is that it's the more exotic parts of btrfs (i.e. the native RAID modes beyond RAID-0/1, especially the RAID-5/6 code that's meant to be the RAID-Z alternative) that have the most issues.

But ext4 doesn't natively support any of that either, so maybe any place you would consider using ext4, btrfs is potentially ok as an alternative. I mean if you approach it as ext4 plus a few very useful features that ext4 is missing, I think you hit the most useful and widely used corners. You can still use md etc. I'm not talking about datacenter-scale things, I'm talking workstations and laptops.
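
A minimal sketch of that kind of setup, assuming two spare disks /dev/sdb and /dev/sdc (hypothetical device names):

    # build a conventional md mirror and put btrfs on top of it,
    # instead of relying on btrfs's native multi-device RAID
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
    mkfs.btrfs /dev/md0
    mount -o compress=zstd /dev/md0 /mnt/data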

I currently run a zfs-on-linux RAID-1 and have been testing btrfs for one of my backup copies. There are parts of btrfs that seem to work fine and that I really like a lot compared to the ZFS versions. I'm a bit paranoid about data loss and corruption (I do research with large sets of medical images; it's sort of like video except the data is typically very compressible, but few tools work with compressed images--filesystem-level compression is great for this stuff) so I keep hashdeep inventories of everything to document provenance. One thing about medical image processing workflows is that you easily end up with multiple copies of images in different directories. I hoped to use ZFS deduplication for this but it is a nightmare. With btrfs you at least have the option of safely deduping files manually using cp --reflink, so you can dedup periodically (or even based on knowledge of how the data is laid out in the filesystem) or add it into workflow scripts, which works well.
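
A rough sketch of the manual dedup step itself (hypothetical paths; assumes the two files are already known to be byte-identical and live on the same btrfs filesystem):

    A=/data/study_a/scan001.img    # the copy to keep (hypothetical)
    B=/data/study_b/scan001.img    # a byte-identical duplicate
    # replace B with a reflink clone of A; the extents are then shared
    # on disk, but the two paths remain independent files
    cp --reflink=always "$A" "$B.tmp" && mv "$B.tmp" "$B"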

Unfortunately, the one thing that I haven't figured out is how to make a bit-by-bit clone of an existing btrfs filesystem. You can dd, but that leads to issues because device UUIDs are embedded into the metadata. Working on btrfs, I've managed to consolidate 5.1TB of data that includes very compressible source images, duplicates and a lot of text files into 1.2TB, but I can't figure out how to correctly duplicate the 1.2TB version of the data without it transiently exploding back to 5.1TB in the process.
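
For reference, the dd route I mean is just this (hypothetical devices), and the UUID collision shows up immediately afterwards:

    # naive bit-for-bit clone of the whole device
    dd if=/dev/sdb of=/dev/sdc bs=1M status=progress
    # both copies now report the same btrfs filesystem UUID, which is
    # the problem, since that UUID is embedded in the metadata
    blkid /dev/sdb /dev/sdc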

I do like the way ZFS approaches the concept and organization of "datasets" better than btrfs's approach, though. btrfs's approach seems more ad hoc and less opinionated. I think if you lack discipline and experience with large datasets, btrfs can enable you to do unwise things that will seriously bite you in the butt down the road, whereas ZFS enforces some discipline. I think it's probably because ZFS was designed by people who have seen hell.
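
To make the contrast concrete, a minimal sketch with hypothetical pool and mount names:

    # ZFS: a dataset is created inside an enforced hierarchy and carries
    # its own properties from the start
    zfs create -o compression=zstd -o recordsize=1M tank/medimages

    # btrfs: a subvolume is just created wherever you like in the tree;
    # compression is a mount option or a property set on the path afterwards
    btrfs subvolume create /mnt/pool/medimages
    btrfs property set /mnt/pool/medimages compression zstd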



> Unfortunately, the one thing that I haven't figured out is how to make a bit-by-bit clone of an existing btrfs filesystem.

Have you tried btrfs send/receive?


No, I have not actually tested it yet, but according to my research the current state of btrfs send/receive is this:

1. btrfs send streams the files (i.e. it decompresses files that are compressed on disk)

2. The send stream (optionally) is compressed

3. btrfs receive (optionally) recompresses data into the destination

If you have a large dataset and took the time to use one of the slower compression algorithms/settings, this means you have to wait for that compression to happen all over again at the destination. (You could have different compression settings on the two btrfs filesystems; btrfs send/receive is one of the ways to migrate data between those settings.)
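
I haven't run this myself, but as I understand it the pipeline for steps 1-3 looks roughly like this (hypothetical paths and host name; zstd for the stream compression):

    # 1. send requires a read-only snapshot of the subvolume
    btrfs subvolume snapshot -r /data /data/.send_snap

    # 2. stream it out, compressing only the stream in transit
    btrfs send /data/.send_snap | zstd | ssh backuphost 'zstd -d | btrfs receive /backup'

    # 3. the receiving filesystem recompresses on write only if it is
    #    mounted with a compress option, e.g. -o compress=zstd:9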



