I used to work as an insurance analyst and regularly dealt with customer-reported mileage distributions. It boggles the mind that this data was used for an academic paper when any junior analyst in an insurance context could tell you it's nonsensical at a glance. To me it just goes to show how far away these researchers are from the domain of the data that they're using in these studies. Kind of lowers the credibility of social sciences as a whole, unfortunately.
There are more interesting studies to be done with this data IMO, which the researchers could have done if they had cared enough to talk to someone in the field.
As an example, we rarely saw completely normal "bell curves" with reported mileage. We often saw a roughly Gaussian shape between 10k and 30k, with a "J curve" under 10k, where some % of people would dishonestly report their mileage as absurdly low.
Where permitted by regulators, we would actually rate on this. If you had a single car, reported yourself as fully employed outside of the home, and also reported 5k mileage per year, you would receive a SURCHARGE compared to someone who reported 15k, because there was signal about your likelihood to make a claim in the fact that you were lying. The signal disappeared if you looked at people with more plausible arrangements, like having 2 cars, one of which had low mileage, or having a single low-mileage vehicle while being self-employed (possibly WFH), etc.
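To make that rule concrete, here's a rough Python sketch. It is purely illustrative: the `Policy` fields, the 10k cutoff, and the 1.10 factor are all made-up placeholders, not anything a real carrier used. It only encodes the single-car, employed-outside-the-home, low-mileage case; a real plan would also need the two-car and self-employed exceptions handled with whatever data is actually available.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only, not real rating factors.
LOW_MILEAGE_CUTOFF = 10_000    # assumed "implausibly low" threshold, miles/year
IMPLAUSIBLE_SURCHARGE = 1.10   # assumed multiplier relative to the base rate


@dataclass
class Policy:
    num_cars: int
    employed_outside_home: bool
    reported_annual_miles: int


def is_implausibly_low(p: Policy) -> bool:
    """Flag low reported mileage that the household's situation doesn't explain."""
    return (
        p.num_cars == 1
        and p.employed_outside_home
        and p.reported_annual_miles < LOW_MILEAGE_CUTOFF
    )


def mileage_factor(p: Policy) -> float:
    """Return the surcharge multiplier when flagged, otherwise 1.0 (no change)."""
    return IMPLAUSIBLE_SURCHARGE if is_implausibly_low(p) else 1.0
```

With these placeholder numbers, a single-car commuter reporting 5k gets the 1.10 factor, while the same person reporting 15k, or a two-car household reporting 5k on one vehicle, stays at 1.0.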
I have to believe a clever researcher could find some interesting results with such data.