
What I found most interesting about the review is that Apple chose not to implement file-data checksumming on the grounds that the underlying hardware is very 'safe' and already employs ECC anyway.



Which is silly, and fails to isolate problems when and where they happen. Pretty much every significant layer should have its own checksum, in many cases ideally an ECC of some form. Hardware has bugs and failures, and so does software. What is particularly evil is when the two collide in a way that causes silent, undetected corruption of important data for long periods of time.


That's not the only reason, though. There are other factors going into that decision that make it rational:

APFS isn't designed as a server file system. It's meant for laptops, desktops, and, most importantly (to Apple), mobile devices. Note that most of those devices are battery powered, which means "redundant" error checking by the FS is a meaningful waste of power.

That's not to say they won't add error-checking capability in the future, but it makes total sense to prioritize other things when this file system is mostly going to be used on battery-powered clients and basically never on servers.


So the idea is that the cloud handles reliability, while APFS is optimized for the average end-user experience?


Similar design decision as in IPv6, which, unlike IPv4, doesn't have a header checksum at the IP level, for the same reason.


If I remember correctly it's actually the opposite: the reasoning was that checksumming should be done higher up the stack.


Actually, the reason for it is that lower layers already do checksumming, and at that layer you generally don't get scrambled packets; you only lose packets, which happens when there's congestion.
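
For reference, the checksum IPv6 dropped from the IP header is the RFC 1071 16-bit one's-complement sum that IPv4 headers, TCP, and UDP use. A minimal sketch of how it's computed (the sample packet bytes are made up):

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* RFC 1071 Internet checksum: the 16-bit one's-complement sum used
       by IPv4 headers, TCP, and UDP. IPv6 dropped it from the IP header,
       relying on the link-layer CRC and the transport checksum instead. */
    static uint16_t inet_checksum(const uint8_t *data, size_t len)
    {
        uint32_t sum = 0;

        while (len > 1) {                     /* sum 16-bit words */
            sum += (uint32_t)data[0] << 8 | data[1];
            data += 2;
            len -= 2;
        }
        if (len == 1)                         /* odd trailing byte */
            sum += (uint32_t)data[0] << 8;

        while (sum >> 16)                     /* fold carries back in */
            sum = (sum & 0xffff) + (sum >> 16);

        return (uint16_t)~sum;                /* one's complement */
    }

    int main(void)
    {
        uint8_t packet[] = { 0x45, 0x00, 0x00, 0x1c, 0x00, 0x01 };
        printf("checksum: 0x%04x\n", inet_checksum(packet, sizeof packet));
        return 0;
    }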


But anyone who's ever looked knows that you DO get scrambled packets, thanks to buggy hardware. And disks screw up, too.


Buggy hardware would be caught by your MAC-layer checksum, such as Ethernet's CRC.


Anyone with enough hardware can observe this sort of problem:

http://www.evanjones.ca/tcp-and-ethernet-checksums-fail.html

When the CRC and TCP Checksum Disagree: http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigco...

Alternatively, just look at "netstat -s" for any machine on the Internet talking to a bunch of others. Here's the score for the main web host of the Internet Archive Wayback Machine:

    3088864840 segments received
    2401058 bad segments received.


I'd prefer them in software so that they work with external storage like USB hard disks.


Another way to look at it is that checksumming is something that should be done at a generic lower-level block layer, not the filesystem layer.

Do it in CoreStorage, not filesystem X.


One of the key innovations in ZFS is storing checksums in block pointers, which is something that cannot be done efficiently outside the file system. Storing checksums elsewhere is far more complex and expensive.
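
To make the idea concrete, here's a toy sketch (the struct and checksum are illustrative, not actual ZFS on-disk structures): the checksum of a child block lives in the parent's pointer to it, so a corrupt block can never vouch for itself.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Toy block pointer: the checksum of a block is stored in the
       POINTER to that block, so verification happens before the data
       is trusted. Real ZFS uses fletcher4/sha256 etc. */
    typedef struct {
        uint64_t offset;     /* where the child block lives */
        uint64_t length;     /* size of the child block */
        uint64_t checksum;   /* checksum of the child's contents */
    } block_ptr_t;

    static uint64_t toy_checksum(const uint8_t *p, size_t n)
    {
        uint64_t sum = 0;
        while (n--)
            sum = sum * 31 + *p++;
        return sum;
    }

    /* Verify a block against the checksum in its parent pointer. */
    static int read_verified(const block_ptr_t *bp, const uint8_t *disk)
    {
        if (toy_checksum(disk + bp->offset, bp->length) != bp->checksum) {
            fprintf(stderr, "corrupt block at offset %llu\n",
                    (unsigned long long)bp->offset);
            return -1;
        }
        return 0;
    }

    int main(void)
    {
        uint8_t disk[32] = "hello, block";
        block_ptr_t bp = { 0, 12, 0 };
        bp.checksum = toy_checksum(disk, bp.length);

        printf("clean read: %d\n", read_verified(&bp, disk));
        disk[3] ^= 0xff;                       /* simulate bit rot */
        printf("after bit rot: %d\n", read_verified(&bp, disk));
        return 0;
    }

Doing this outside the file system would mean a separate lookup to find each block's checksum, which is the complexity and expense referred to above.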


Serious question: what value does filesystem checksumming offer for the average user who has no redundancy?

I mean, it'll tell you that your only copy of a file got corrupted, but it'll still be corrupted...


It tells you that your file is corrupted. You can then restore from backups, re-download, or take some other corrective action, such as delete the file, reboot the machine, re-install the operating system, or play Quake 2 to test your RAM and graphics.

Never underestimate the value of a reason to play Quake 2.


I remember when I played DooM to test and benchmark a computer...


The average user might have no redundancy, but they still ought to have a backup. Checksum failure tells them they need to restore.

At the very least, a checksum failure might tell them (or the tech they're consulting) that they have a data problem, rather than, say, an application compatibility problem.


"Why is my machine crashing?" "Well, somelib.so is reporting checksum failures" is a much better experience then "weird, this machine used to be great but now it crashes all the time"


Almost all executable files are already code-signed, which is a better version of checksumming, so it'd only help user data files.


somelib.so what? And what's a "checksum"? Error messages need to be comprehensible to the average user.


"Error: Buy a new Mac."


Assuming your intent is not to troll: "The file xyz.txt is corrupt. Click here to restore from a Time Machine backup."


Today you can verify backups on OS X with "tmutil verifychecksums", at least on 10.11. The UI to this could be improved, but user data checksums don't necessarily need to be a filesystem feature. On a single-disk device, the FS doesn't have enough information to do anything useful about corrupt files anyway.


> On a single-disk device, the FS doesn't have enough information to do anything useful about corrupt files anyway.

Some filesystems can be configured to keep two or more copies of certain filesystem/directory/etc. contents. Two copies is enough information to do something useful.
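
A sketch of what that looks like (along the lines of ZFS ditto blocks or btrfs DUP; the names and checksum here are made up): verify the first copy, and on a mismatch serve and repair from the second.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    static uint64_t toy_checksum(const uint8_t *p, size_t n)
    {
        uint64_t sum = 0;
        while (n--)
            sum = sum * 31 + *p++;
        return sum;
    }

    /* Two-copy read: if copy A fails verification, serve from copy B
       and rewrite A from it ("self-healing"). If both fail, report an
       error rather than guessing. */
    static int read_redundant(uint8_t *a, uint8_t *b, size_t len,
                              uint64_t expected, uint8_t *out)
    {
        if (toy_checksum(a, len) == expected) {
            memcpy(out, a, len);
            return 0;
        }
        if (toy_checksum(b, len) == expected) {
            memcpy(a, b, len);                 /* heal the bad copy */
            memcpy(out, b, len);
            fprintf(stderr, "copy A corrupt; healed from copy B\n");
            return 0;
        }
        return -1;                             /* both copies bad */
    }

    int main(void)
    {
        uint8_t a[16] = "important data", b[16] = "important data", out[16];
        uint64_t sum = toy_checksum(a, sizeof a);

        a[0] ^= 0xff;                          /* corrupt the first copy */
        if (read_redundant(a, b, sizeof a, sum, out) == 0)
            printf("read ok: %s\n", out);
        return 0;
    }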


Well, Apple is moving in the direction of syncing everything with iCloud: iCloud Drive has been around for a while, and Sierra adds the ability to sync the Desktop and Documents folders, on top of long-existing things like photo sync. If the file was previously uploaded to iCloud, there is redundancy, and you definitely don't want to overwrite the cloud copy with the corrupted version.

How big an issue this is in practice I don't know.



