> The algorithm could generally tell what a person’s political orientation was with a high degree of accuracy, even when that person’s identity was “decorrelated with age, gender, and ethnicity,” researchers write.
But the whole thing is looking at photos - how do you decorrelate a photo from age, gender, ethnicity, or weight?
> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that".
They expressly look at the predictive power of BMI in Study 4:
> How does the predictive power of the lower face size and BMI compare with the predictive power of the facial recognition algorithm estimated in Study 1? Would the VGGFace2-based model trained in Study 1 perform better if it was supplemented with explicit measures of lower face size and BMI? To answer these questions, we trained a series of regression models predicting political orientation (while controlling for age and gender) and used leave-one-out cross-validation to estimate prediction performance.
> The predictive power of the lower face size equaled r(434) = .11; p = .02; 95% CI [.01, .20]. BMI’s predictive power was insignificant r(272) = .06; p = .36; 95% CI [−.06, .18]. Combining the VGGFace2-based predictions (estimated in Study 1) with BMI, lower face size, and with both these variables did not improve prediction performance. The highest performance was afforded by combining VGGFace2 predictions with lower face size. Yet, this model’s performance, r(434) = .21; p < .001; 95% CI [.12, .30], was no higher than the performance of the VGGFace2 predictions alone, r(434) = .22; see Study 1.
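For anyone unfamiliar with the procedure in that quote, here is a minimal sketch of leave-one-out cross-validation scored with a Pearson r. It is not the paper's code: the data is synthetic, the model is a one-predictor least-squares line rather than their regressions with age and gender controls, and the names `fit_line`, `loo_predictions`, and `pearson_r` are made up for illustration. The sample size of 436 is only chosen to match r(434), assuming df = n − 2.

```python
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def loo_predictions(xs, ys):
    """Predict each point from a model fitted on all the other points."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(intercept + slope * xs[i])
    return preds


# Synthetic data only: a hypothetical predictor with a moderate true effect.
rng = random.Random(0)
n = 436
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]

# "Prediction performance" here is the correlation between held-out
# predictions and the actual outcomes.
loo_r = pearson_r(loo_predictions(x, y), y)
print(round(loo_r, 2))
```

The point of the held-out step is that no prediction is scored against a data point the model was fitted on, so the reported r is not inflated by overfitting.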
Oh I see, that's in the actual paper, not the linked article. Having now re-read that section a few times, I think the important quote is:
"Additionally, body mass index (BMI) was computed for 274 participants who self-reported their weight and height."
So self-reported data. And only provided by 274 participants out of 591, so less than half. It seems very probable the more overweight you are, the less likely you are to self-report your weight. I still think the most likely explanation is that they are picking up on obesity, which would surely be strongly correlated with "lower face size."
However, simply asserting that something explains the results while not addressing the ways that the study actually looks at that factor is not a valuable contribution to the conversation.
You are correct that self-reporting adds error to the BMI values, and that optional self-reporting may have narrowed the range of BMI values that can be analyzed. Both of these would reduce the measured predictive power. It does seem like an area where a study better designed to control for BMI would increase our understanding. However, the relative effect size vs. lower face shape indicates to me that it is pretty unlikely that BMI fully explains the model's predictive ability.
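To make the attenuation point concrete, here is a toy simulation. Every number in it is an assumption picked for illustration (a true effect of 0.3, reporting noise as large as the true spread, and only people below an arbitrary cutoff choosing to report); none of it comes from the paper's data.

```python
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)
n = 20000

# Hypothetical standardized "true BMI" and an outcome it weakly predicts.
true_bmi = [rng.gauss(0, 1) for _ in range(n)]
outcome = [0.3 * b + rng.gauss(0, 1) for b in true_bmi]

# Self-report modeled as the true value plus random error (assumed noise level).
reported = [b + rng.gauss(0, 1) for b in true_bmi]

# Selective reporting modeled as a hard cutoff: heavier people do not report.
kept = [(r, o) for b, r, o in zip(true_bmi, reported, outcome) if b < 0.5]

r_true = pearson_r(true_bmi, outcome)
r_noisy = pearson_r(reported, outcome)
r_restricted = pearson_r([r for r, _ in kept], [o for _, o in kept])

print(f"true BMI:            r = {r_true:.2f}")
print(f"noisy self-report:   r = {r_noisy:.2f}")
print(f"noisy + restricted:  r = {r_restricted:.2f}")
```

In this setup the correlation shrinks at each step, which is the direction of the effect described above. It says nothing about how large the shrinkage was in the actual study, since that depends on the real noise and the real pattern of who reported.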