The link to garbage collection is a bit odd. The garbage collector does not, of course, exist on iOS, and this bug has nothing to do with garbage collection. It's just a memory leak due to bad manual memory management. It just so happens that this leak is in code that's written to also work when garbage collection is on, but the leak isn't due to, or even related to, garbage collection.
This may be nitpicking, but it made it hard for me to pay attention to the real meat of the thing.
I would disagree pretty strongly with that. The function CFMakeCollectable is explicitly a function for dealing with the garbage collector, and its unpredictability in non-garbage-collected code is the direct cause of the bug.
Doesn't it make sense to say that without the existence of Apple GC, the bug never would have existed? Doesn't that at least somewhat justify the title?
edit:
Furthermore the original intent of including garbage collection in the title was as an ironic twist based on the fact that ios has never had garbage collection. Maybe that didn't convey as well as I would have liked.
There's no unpredictability. CFMakeCollectable does nothing when not using garbage collection. That's explicitly documented.
Without GC, this code would have been written as simply:
CFRetain(data);
And the same bug would manifest. The bug is just a missing release call.
Edit: I'd assume they were going for the standard pattern for code that needs to be both GC and non-GC for bridging CF objects out into the Cocoa world:
[NSMakeCollectable(cfobj) autorelease];
In this case, "obj" was retained inline. They just forgot the autorelease. They could have forgotten it just as easily without garbage collection, and the NSMakeCollectable (which inlines to CFMakeCollectable) call is unrelated, aside from possibly occupying the wrong spot of the original programmer's brain at the wrong moment.
I won't argue too strongly on this, as you're far more knowledgable about it than I am.
However, imo the intention and semantics behind a call like CFMakeCollectable implies a transfer of ownership to an external system. A newbie Apple coder could be forgiven for thinking it would still transfer ownership in RC environments, just to the autorelease pool instead of a collector. In all likelihood this is what happened. An intern got at the code and didn't know the details about GC.
Obviously the point stands that this interpretation is well-documented to be false, but its naming is definitely misleading.
Double edit:
I see from your edit that some of my basic assumptions about CFMakeCollectable were wrong, having never actually worked with it. My bad.
I agree with you that the person who wrote this code likely wrote it assuming a GC environment. It was written originally for OS X perhaps, and not updated (with an autorelease, per mikeash's comment) when ported to iOS.
Thanks for the article btw - I'm new to OS X/iOS dev and had no idea that ObjC ever had GC support!
As framework code, it was certainly written to be dual-mode. It's possible to write code that behaves correctly with both GC and reference counting. It's a minor pain in the ass because you have to satisfy both sides at once, but it's not all that hard. Because the frameworks can be used in both environments, all (Mac) framework code has to be written this way. There would have been no changes needed to the code when moving it to iOS, at least not related to garbage collection.
They just forgot an autorelease. This would cause the same problems on the Mac as on iOS for non-GC apps (which is, to a decent approximation, all of them). Given that the API in question pre-dates Objective-C garbage collection by about half a decade, and the code in question probably does as well, I really don't think garbage collection can be related to this in any way beyond one GC-related call being near the bug.
In case you're curious about the history (since you said you're new to the platform), garbage collection showed up as an optional feature in Objective-C in OS X 10.5, was effectively discouraged in favor of ARC in 10.7, and officially deprecated in 10.8. Not a terribly long run, sadly. It was a nice idea that didn't work out as well in practice.
A single piece of code can potentially work in both environments, but it needs to be written with that in mind, so binaries are annotated with their GC support. Any given library or plugin can be non-GC, GC-only, or GC-optional. Non-GC libraries can only load into non-GC apps, and GC-only libraries can only load into GC apps. GC-optional libraries can load into either, but are annoying to write as I mentioned. Since the system has to support both, all system libraries had to be made GC-optional.
In addition to the usual teething bugs with the collector itself, all the libraries sprouted bugs in GC mode due to the conversion, which made garbage collection in ObjC a bit too interesting to really be nice to use.
That ends today's long, pointless, rambling comment.
I should note that, while we saw something like 0.4% increased crash rate, we actually don't have a number to compare it against for memory crashes.
This is because if memory usage gets too high, the OS will send a kill signal to the process, which can be neither detected nor caught.
This means that in our original decision to use this fix, all we had was anecdotal evidence of untraceable crashes. Luckily we had dedicated QA that was keeping pretty solid track of them all, and they piled up.
We touched a file on disk and deleted it on entering background. If it exists at startup, last run was a crash. Not a great solution, but gave us some idea of # of crashes not caught by our crash reporting.
We do this for various other reasons (clearing possibly corrupted state, etc), but we've never uploaded the statistics to the servers. We may begin doing this in the future.
We've been using Crittercism in the past and recently switched to Crashlytics, unfortunately memory crashes are one of the few exceptions (hah) to the rule, as they are SIGKILLs from the OS.
Another commenter recommended touching a file at launch and at sleep to track untraceable crashes, which we do for various other reasons, but don't upload the stats. We may begin doing this.
Our app uses this method to cache images and we were seeing huge memory leaks due to this bug. Like the example showed, the bug manifested itself in a retain count being incremented on property access. Our solution was something like this:
Written from memory, so excuse any mistakes. First (and only) time I've ever seen a legitimate use for `retainCount`. @bbum would be proud (or perhaps horrified).
So in the first line, it has 1 and then checking to see if it's equal bumps it to 2 allocs, and finally the third call, to begin releasing, brings it to 3 (then removal of all 3)?
That's super weird - I have yet to encounter something along these lines in my Cocoa work so far. Solid idea to beat the system there. I just hope it's got a nice comment above it haha.
Yeah this is the general idea we went with. We did our best to synchronize around it, but unfortunately we can never have total control, which necessarily introduces race conditions.
I would imagine this is what leads to the 0.4% crash rate I mentioned at the top of the article :-(
Aren't you living dangerously by still using garbage collection on iOS at this point? Apple keep making dire statements about how everyone needs to stop using it, so I'd be worried that iOS8 will just remove it entirely, meaning apps that use GC just wouldn't work at all on that OS...
ObjC garbage collection never existed on iOS. The issue was related to framework code that was written by Apple for an ObjC-GC environment (probably Mac OS X) and ported to a non-GC environment (iOS).
I should clarify. iOS does not have garbage collection and has never had garbage collection. The root of this bug is in old legacy cruft from the garbage collected days of OSX. The code is dormant in a lot of old CoreFoundation classes, and still causes unexpected behavior from time to time.
What I love about these stories from the trenches isn't just seeing how the mystery gets solved (although that part is great), but seeing the evolution of the investigation -- which might be the more valuable of the two things when it comes to improving your own bug tracking skills.
Actually not a redefinition, as that function has had the same behavior since it was introduced. It's just code that's incorrect when run with reference counting.
This may be nitpicking, but it made it hard for me to pay attention to the real meat of the thing.