They provided the probability distribution. The fact that you can’t handle math and need some sort of absolute certainty for a future event is not 538’s problem.
That's a bit strong for what Gelman said. I'm a big fan of Gelman (and learned from his books!), but he specifically mentioned that both Gelman et al.'s model and 538's model did indeed capture the outcomes in their probability distributions, but that to improve performance going forward it is much better to predict closer to the median than to the tails. (And funnily enough, Gelman gave 538 some grief earlier on for making a model with very wide tails.) This is a nuanced but very fair criticism, and taking a Twitter-style summary of it is, I think, overly reductionist.
Ah yes. Mr 'let me tell you why Nate is wrong' Gelman, who is now Mr 'let me tell you why the fact that I missed bigger than Nate is not my fault and in fact is entirely the fault of these other people' Gelman. Forgive me if I find his excuses laughable, but I guess if it makes him feel better about himself we can humour him. He even manages to choke his first rant by missing once again on EV and vote percentages.
it's not one single election -- it's consistent failure over multiple state elections, by large margins, all in the same direction -- which falls beyond any reasonable probability
I don't think much of evgen's unreasonable personal attack. But 538 isn't necessarily claiming that the per-state error will be normally distributed around their predictions.
I don't know the specifics of their model, but probably they are claiming "with these polls, the probability of this outcome is...". The polls being consistently biased doesn't tell us much about 538's model. They said Biden would almost surely win, and despite a massive surprise in favour of Trump, Biden won.
And even if Biden had lost, 10% upsets in presidential elections are expected to happen about once every 10 elections like this one.
> If per-state error isn't normally distributed, that's evidence of bias, or bad polling.
No!
Assuming the per-state error would be normally distributed in some neutral world makes huge assumptions about the nature of the electorate, polling, and the correlations of errors between states; you can't do that! You would specifically /not/ expect per-state errors to be independently distributed, because the nature of the error has similar impacts on similar populations, and similar populations of people live in different states in differing numbers.
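To put rough numbers on the correlation point, here's a minimal sketch (the error sizes are made-up assumptions, not values from 538's or Gelman's actual models) of how a shared national error makes "every state missed in the same direction" far less surprising than independent-error intuition would suggest:

```python
# Illustrative only: per-state polling error = shared national component + state noise.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_sims = 12, 10_000
national_error = rng.normal(0, 2.0, size=(n_sims, 1))      # shared swing, in points (assumed)
state_error = rng.normal(0, 1.0, size=(n_sims, n_states))  # idiosyncratic noise (assumed)
total_error = national_error + state_error

# How often do all 12 states miss in the same direction?
same_side = np.mean(np.all(total_error > 0, axis=1) | np.all(total_error < 0, axis=1))
print(f"all-12-same-direction rate with correlated errors: {same_side:.1%}")
# With fully independent, symmetric errors the same event would occur
# about 2 * 0.5**12, i.e. roughly 0.05% of the time.
```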
You should review the literature about the nature of the (fairly small) polling misses that impacted the swing states and thus disproportionately the outcome in the 2016 election. You will probably find it interesting.
There are unavoidable, expected, sampling errors which are, by definition, random. That's why valid, trusted polls calculate a confidence interval instead of a single discrete result.
Other types of "errors" -- election results that repeatedly fall outside the confidence interval, or are consistently on only one side of the mean -- only arise when the poll is flawed for some reason. Maybe you relied on landlines only, maybe you spoke with too many men, or too many young people, asked bad questions, miscalculated "likely voter," whatever. Accurate, valid, trusted polls don't have these flaws, the ONLY errors are small, random, expected sampling errors.
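To make the sampling-error CI concrete, a minimal sketch (the share and sample size are illustrative assumptions, not from any particular poll):

```python
# Margin of error for a simple random sample, covering sampling error only.
import math

p, n = 0.52, 1000                # observed share and sample size (assumed)
se = math.sqrt(p * (1 - p) / n)  # standard error of a proportion
moe = 1.96 * se                  # half-width of a 95% confidence interval
print(f"95% CI: {p - moe:.3f} to {p + moe:.3f} (about ±{moe * 100:.1f} points)")
# A flawed frame (landlines only, skewed demographics, bad questions) shifts the
# whole interval and is not reflected in this calculation at all.
```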
> Accurate, valid, trusted polls don't have these flaws
Yes, they do. Because (among many other reasons) humans have a choice whether or not to respond, you can't do an ideal random sample subject only to sampling error for a poll. All polls have non-sampling error on top of sampling error; it is impossible not to.
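A quick toy simulation of that point (all rates below are made-up assumptions): if one candidate's supporters respond less often, the resulting bias does not shrink as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(1)
true_support = 0.50                      # true share supporting candidate A (assumed)
response_rate = {"A": 0.8, "B": 1.0}     # A's supporters respond less often (assumed)

for n in (1_000, 100_000):
    supports_a = rng.random(n) < true_support
    responds = rng.random(n) < np.where(supports_a, response_rate["A"], response_rate["B"])
    est = supports_a[responds].mean()
    print(f"n={n:>7}: estimated support for A = {est:.3f} (truth is {true_support})")
# Sampling error shrinks with n; the non-response bias (several points here) does not.
```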
when polls don't match up with reality, as they didn't in 2016, the pollsters have a responsibility to re-calibrate the way they conduct the poll. Ask different questions, find new ways of obtaining respondents from all demographics, adjust raw data, etc. A professional pollster doesn't just get to say, hey, some people didn't want to talk to me ¯\_(ツ)_/¯
> when polls don't match up with reality, as they didn't in 2016, the pollsters have a responsibility to re-calibrate the way they conduct the poll.
Pollsters do that continuously, and there were definite recalibrations in the wake of 2016.
OTOH, the conditions which produce non-sampling errors aren't static, and it's impossible to reliably measure even the aggregate non-sampling error in any particular event (because sampling error exists, and while its statistical distribution can be computed, the actual error attributable to it in any particular event can't be, so you never know how much of the actual error is due to non-sampling error, much less to any particular source of non-sampling error).
> That's why valid, trusted polls calculate a confidence interval instead of a single discrete result.
That is what each of these statistical models did, yes. And the actual outcomes fell into these confidence intervals.
> Other types of "errors" -- election results that repeatedly fall outside the confidence interval, or are consistently on only one side of the mean -- only arise when the poll is flawed for some reason.
Or the model was inaccurate. Perhaps the priors were too specific. Perhaps the data was missing, misrecorded, not tabulated properly, who knows. Again, the results fell within the CI of most models; the problem was simply that the result fell too far from the mean (too deep in the tail) for most statisticians' comfort.
>That is what each of these statistical models did, yes. And the actual outcomes fell into these confidence intervals.
The CI is due to sampling error, not model error. If the error of the estimate is due to sampling error, the estimate should be randomly distributed about the true value. When the estimate is consistently biased in one direction, that's modelling error, which the CI does not capture.
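A minimal sketch of that distinction (the numbers are assumptions for illustration): a CI built from sampling error alone only achieves its advertised coverage if the estimator is unbiased.

```python
import numpy as np

rng = np.random.default_rng(2)
truth, n, n_polls = 0.50, 1000, 5000
systematic_lean = 0.03                     # a 3-point bias, assumed for illustration

for bias in (0.0, systematic_lean):
    p_hat = rng.binomial(n, truth + bias, n_polls) / n
    moe = 1.96 * np.sqrt(p_hat * (1 - p_hat) / n)
    coverage = np.mean((p_hat - moe <= truth) & (truth <= p_hat + moe))
    print(f"bias={bias:.2f}: 95% CI covers the truth {coverage:.1%} of the time")
# With no bias, coverage is ~95%; with a 3-point lean it drops far below that,
# because the interval only accounts for random sampling error.
```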
> If the error of the estimate is due to sampling error
What does "estimate" mean here? Gelman's model is a Bayesian one, and 538 uses a Markov Chain model. In these instances, what would the "estimate" be? In a frequentist model, yes, you come up with an ML (or MAP or such) estimate, and if the ML estimate is incorrect, then there probably is an issue with the model, but neither of these models use a single estimate. Bayesian methods are all about modelling a posterior, and so the CI is "just" finding which parts of the posterior centered around the median contain the area of your CI.
I'm not saying that there isn't model error or sampling error or both. I'm just saying we don't know what caused it yet.
> Landed within the confidence interval? Are you kidding? CI is generally 2-4 points in these election polls.
The models and their data are public. The 538 model predicted an 80% CI for Biden's electoral votes of 267-419, centered around 348.49 EVs. That means Biden had an 80% chance of landing in that confidence interval. Things seem to be shaking out to Biden winning with 297 EVs. Notice that this falls squarely within the CI of the model, but much further from the median of the CI than expected.
So yes, the results fell within the CI.
Drilling into Florida specifically (simply because I've been playing around with Florida's data), the 538 model predicted an 80% CI of Biden winning 47.55%-54.19% of the vote. Biden lost Florida, and received 47.8% of the vote. Again, note that this is on the left side of this CI but still within it. The 538 model was correct; the actual result just resided in its left tail.
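As a back-of-the-envelope check on those Florida numbers (assuming, and this is my assumption, that the state forecast is roughly normal):

```python
from scipy.stats import norm

lo, hi = 0.4755, 0.5419                  # 538's stated 80% interval for Biden's FL share
actual = 0.478                           # Biden's actual FL share, as quoted above

mid = (lo + hi) / 2
sigma = (hi - lo) / (2 * norm.ppf(0.9))  # an 80% interval spans about ±1.28 sigma
quantile = norm.cdf(actual, mid, sigma)
print(f"the actual result sits at roughly the {quantile:.0%} quantile of the forecast")
# Just inside the 80% interval (whose left edge is the 10% quantile), i.e. well into
# the left tail: "right answer, wrong part of the distribution".
```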
Dude, you're gaslighting by using the national results as evidence instead of the individual states, which is what this has always been about since my original comment. Nearly every consequential state fell at, or beyond, the tail end of 538's confidence interval (BTW, who uses 80%? and not 90-95%?), on the same side. A bit closer to the mean in AZ and GA but same side, over-estimating Biden's margin of victory. Deny it all you want, gaslight, cover your eyes, whatever -- but clear, convincing, overwhelming evidence of a systematic flaw or bias in the underlying polls is right there in front of you.
Many political handicappers had predicted that the Democrats would pick up three to 15 seats, growing their 232-to-197 majority.
Most nonpartisan handicappers had long since predicted that Democrats were very likely to win the majority on November 3. "Democrats remain the clear favorites to take back the Senate with just days to go until Election Day," wrote the Cook Political Report's Senate editor Jessica Taylor on October 29.
> Nearly every consequential state fell at, or beyond, the tail end of 538's confidence interval
While I haven't checked each and every individual state, I'm pretty sure they all fell within the CI. Tail end yes, but within the CI.
> (BTW, who uses 80%? and not 90-95%?)
... The left edge of the 80% CI shows a Biden loss. The point was 538's model was not any more confident than that about a Biden win. So yeah, not the highest confidence.
> Deny it all you want, gaslight, cover your eyes, whatever -- but clear, convincing, overwhelming evidence of a systematic flaw or bias in the underlying polls is right there in front of you.
Posting a bunch of media articles doesn't prove anything. I'm not saying there isn't systemic bias here, but your argument is simply that you wanted the polls to be more accurate and you wanted the media to write better articles about uncertainty. There's no rigorous definition of "systemic bias" here that I can even try to prove through data, all you've done is post links. You seem to be more angry at the media coverage than the actual model, but that's not the same as the model being incorrect.
Anyway, I think there's nothing more for us to gain here by talking. Personally, I never trust the media on anything even somewhat mathematical. They can't even get pop science right; how can they get something as important as an election statistical model correct?
Not necessarily. Errors, like outcomes, are not independently distributed in US elections. Politics are intertwined and expecting errors and votes to be independent on a state (or even county) basis is overly simplistic. This is also what makes modelling US elections so difficult.
Sampling errors are random, and expected. Other types of misses are not simple "errors" but polling flaws, like sampling a non-representative group, ignoring non-responders or assuming they break the same as the responders, asking poorly-worded questions, etc.
Occasional flaws in polling are understandable and tolerated. But when those misses repeatedly line up the same way, and are rather sizeable, that's evidence of either systematic flaws or outright bias.
I'm not sure what "sampling error" means here. To echo the sibling poster, per-state sentiment is not independently distributed. For example, we know Trump is more popular among white men than other demographics. This means that if we were to create a random variable that reflected the sentiment of white men throughout the US, we would (probably, though I'd have to dig deeper into the data) presume to see a higher median vote share for Trump in this demographic. However, we cannot say that Trump's popularity in Massachusetts is independent of his popularity in New York, because his popularity in the white male demographic is a shared factor underlying both random variables.
I was discussing in good faith, so I'm not sure why you chose to be snarky. Let's clarify: I'm not sure what "sampling error" in this case would be, such that it is distinct from electoral trends at large. The random variables in question _are_ demographic groups. How is it meaningful to discuss sampling error if your assumption is that state and county data is independently distributed? The poll data that Gelman et al used is public; I urge you to take a look and work with it.
The inputs it uses to spit out probabilities is known to be bad. Any scientist or researcher who claimed to get valid results from known bad inputs would be ridiculed.
To offer a concrete example here, survey samples are often biased by who actually sees and fills out the survey. A common technique used to overcome a non-representative sample is called post-stratification. There are, of course, limits to post-stratification, especially in instances of low sample sizes, but techniques to overcome issues with the data are well known.
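A toy sketch of the idea (the groups, population shares, and support numbers are all invented): reweight a skewed sample back to known population shares.

```python
# Known population shares vs. shares among survey respondents (assumed)
population = {"young": 0.30, "middle": 0.45, "old": 0.25}
sample     = {"young": 0.15, "middle": 0.40, "old": 0.45}  # older people respond more
support    = {"young": 0.60, "middle": 0.50, "old": 0.40}  # candidate support by group

raw_estimate = sum(sample[g] * support[g] for g in population)
post_stratified = sum(population[g] * support[g] for g in population)
print(f"raw estimate: {raw_estimate:.3f}, post-stratified: {post_stratified:.3f}")
# The raw estimate is dragged toward the over-represented group; reweighting by the
# known population shares corrects it, provided each cell has enough respondents.
```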
Science does not require an unbiased sample from a normal distribution to work. Bias is a technical term that the field of statistics is very comfortable working with. Scientists can also often get good results out of biased inputs.
538 has corrections for bias already. They seem to have worked in this instance - I repeat myself but: massive surprise, Biden still president.
You are pointing at evidence that 538 correctly called 11/12 races using statistics, and their confident call on a Biden president withstood a 4-7% swing (!!).
The existence of bias doesn't invalidate their predictions. Everyone knows that polls can be badly off target in a biased way - that isn't a new phenomenon.
When they talk about X% chance of Y being president they should be optimising to the outcome, not the margins.
It's not like we do elections every month to test out their probability distribution against empirical data. The distribution collapses into a binary outcome at the end.
I have a die. I claim each side is equally likely. Well... we don't get to roll the die more than once. A sample size of 1 does not prove that 538's predictions were right (or wrong).
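To put it another way, here's a rough sketch (the probabilities are made-up assumptions): one outcome is very weak evidence about a forecaster's calibration, while many forecasts together are strong evidence.

```python
from scipy.stats import binom

# One forecaster says "90% Biden", a rival model would have said "60%", and Biden wins.
# The single outcome barely distinguishes them:
print(f"single-outcome likelihood ratio: {0.90 / 0.60:.2f}")  # ~1.5, weak evidence

# Over many calls it's a different story: if 90% forecasts only come true 35 times
# out of 50, that is extremely unlikely for a well-calibrated forecaster.
print(f"P(<=35 hits | p=0.9, n=50) = {binom.cdf(35, 50, 0.9):.6f}")
```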
Thanks for assuming I can't do math; there's no way to argue with someone like that, but I am actually pretty bad at it. :-)
Everyone is bad at probability and statistical distributions, not just you. The problem with modeling elections is that there are so few of them and the data is very noisy and until quite recently rather suspect. Let's not pretend that this was a normal election, either in the candidates running or in the manner in which the campaign and election were conducted.
As to the question of why bother, it is because bad polling is better than no polling at all. Campaigns are now multi-billion-dollar enterprises managing tens of thousands of temporary employees for the creation of a product that will be sold only once, 18+ months after they start the process. Any data is better than nothing.
The fact that the public has become obsessed with polls is probably due to the ongoing nationalization of politics.