It strains credulity to assume no malicious (i.e. data harvesting) intent. They are requiring millions of devices to phone home every time they open any application, rather than just pushing an update to a locally stored certificate blacklist when necessary.
By that argument, if the intent were data harvesting, they wouldn't need to do this in real time. The device could just periodically upload a list of apps and launch dates/times, without touching the internet connection when an app is launched. That would certainly be a much easier and cheaper way to harvest data, and it would also be more reliable, since it could capture things that happen when offline. But that's not what they are doing.
If the intent was data harvesting, why the hell did they not actually send data that was useful to harvest? If they wanted to harvest data, they would have sent your Apple ID or device identifier along with the app hash.
They didn't.
Either they are just completely, blitheringly incompetent, or they actually, truly didn't intend to harvest any data.
Who's to say they won't add that in future as a new "security feature", especially now that they are "fixing" this issue by encrypting the calls home? Apple is a master of shifting consumer norms, behaviours and expectations over the long term using this boil-the-frog-slowly approach.
Then you can complain when they do that. In theory, they can add any tracking anywhere in the OS they want. But until they actually do so, complaining about it is somewhat inane.
The same thing applies to the current Apple implementation: malware could disable the application check. If malware has that kind of access (enough to disable the blacklist or the application check), the owner has already lost.
Both approaches are meant to stop malware from launching, so I don't see much of a difference. Blacklist/whitelist? A whitelist could be implemented locally as well.
My gut tells me it’s easier to sabotage something passively polling for updates vs something built into the OS that checks at each application launch, but if not, the only difference is the vulnerable window between polling intervals.
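To make the "vulnerable window" concrete, here is a minimal sketch of the two approaches. Everything in it is hypothetical: `LocalBlocklist`, `online_check_allows`, the `dev-123` certificate ID and the one-hour polling interval are made up for illustration, and none of it reflects how Apple's actual check works.

```python
from dataclasses import dataclass, field


@dataclass
class LocalBlocklist:
    """Hypothetical locally cached revocation list, refreshed by polling."""
    poll_interval: float                      # seconds between refreshes (assumed)
    revoked: set = field(default_factory=set)
    last_sync: float = 0.0                    # time of the last successful refresh

    def maybe_sync(self, now, server_revoked):
        # Only contact the "server" once the polling interval has elapsed.
        if now - self.last_sync >= self.poll_interval:
            self.revoked = set(server_revoked)
            self.last_sync = now

    def allows(self, cert_id, now, server_revoked):
        self.maybe_sync(now, server_revoked)
        return cert_id not in self.revoked


def online_check_allows(cert_id, server_revoked):
    """Hypothetical per-launch check: always asks the server's current list."""
    return cert_id not in server_revoked


# The certificate is revoked on the server shortly after the last local sync.
server_revoked = {"dev-123"}
cache = LocalBlocklist(poll_interval=3600.0)

print(cache.allows("dev-123", now=20.0, server_revoked=server_revoked))    # True: still inside the window
print(online_check_allows("dev-123", server_revoked))                      # False: blocked immediately
print(cache.allows("dev-123", now=3600.0, server_revoked=server_revoked))  # False: caught at the next poll
```

In this toy model the only behavioural difference is the stretch between the revocation and the next poll, so shrinking the polling interval shrinks the window. The price of the per-launch version is a network request on every launch.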
So you may be right; it may not make much difference. I don’t accept malicious data-mining intent, however.
Real-time information about the spread of malware is really valuable from a prevention/attack perspective. Virus scanners also employ this method. That's why their vendors still give away free versions of their software: to have as much coverage as possible and catch the latest zero-day. So you could even debate whether the data harvesting itself is malicious or not.