
Here is my answer:

The expected response rate of this year's freshman class to the questionnaire is 75 % (90 out of 120).

Option B, "Start over with a new sample of 120 students from this year’s freshman class." does not improve the situation on average, because we must assume that the most likely outcome is again a response rate of 75 %.

Option D, "Use the 90 questionnaires that were submitted as the final sample." obviously does not improve the situation either.

Option E, "Randomly choose 30 more students from this year’s freshman class and email them the questionnaire." would give us on average 30 * 0.75 = 22.5 more returned questionnaires, which would improve the representativeness of the survey compared with Options B and D.

Option A, "Send another email to those 30 students who did not respond encouraging them to complete the questionnaire." is the most interesting one. If we assume that the missing data from these 30 people is not random, in other words, that not responding correlates with some of the answers to the questions, it is important to recover as much of this missing data as possible in a second call. The unknown quantity is the response rate to such a second call.[1] But even if it is below 75 %, Option A could increase the representativeness of the survey more than Option E, because it is the only chance to represent first-time non-respondents in the survey. However, we must also assume that the conscientiousness of the answers does not differ between first-time and second-time respondents. Under these assumptions, I would choose Option A.

[1] This is why we cannot really be sure that Option E would not be better in a specific case. If the second-time response rate is 0 %, Option E would trivially be better; if the second-time response rate is 75 % or more, Option A would clearly be better. Somewhere in between is the sweet spot where the advantage in terms of representativeness switches from Option E to Option A. Where exactly this spot lies depends on the unknown parameter of how much the answers of first-time respondents and second-time respondents actually differ.
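
The trade-off described in [1] can be made concrete with a rough numerical sketch. Everything in it beyond the numbers 120, 90, 30 and 75 % is an assumption for illustration: "representativeness" is measured as the mean squared error of the estimated average answer, first-call non-respondents differ from respondents by a hypothetical gap `delta` in their mean answer, answers within each group have a hypothetical standard deviation `sigma`, and realised sample sizes are replaced by their expected values. The function names are made up for this example.

```python
# Simplified model, NOT part of the original answer:
# - 75 % of students answer a first call; their mean answer is mu_r
# - 25 % do not; their mean answer is mu_n, with delta = mu_r - mu_n
# - sigma is the standard deviation of answers within each group
# - sample sizes are replaced by their expected values

RESPONSE_RATE = 0.75   # observed first-call response rate (90 / 120)
RESPONDED = 90         # questionnaires already returned
NOT_RESPONDED = 30     # students targeted by Option A
EXTRA = 30             # new students emailed under Option E


def mse_option_e(delta, sigma):
    """Approximate MSE of the mean answer under Option E."""
    n = RESPONDED + EXTRA * RESPONSE_RATE      # 112.5 expected answers
    bias = (1 - RESPONSE_RATE) * delta         # non-respondents stay unrepresented
    return bias ** 2 + sigma ** 2 / n


def mse_option_a(p, delta, sigma):
    """Approximate MSE under Option A with second-call response rate p."""
    n_second = NOT_RESPONDED * p               # expected second-call answers
    n = RESPONDED + n_second
    share = n_second / n                       # share of former non-respondents
    bias = ((1 - RESPONSE_RATE) - share) * delta
    return bias ** 2 + sigma ** 2 / n


def crossover_rate(delta, sigma, steps=1000):
    """Smallest p on a grid at which Option A is at least as good as Option E."""
    target = mse_option_e(delta, sigma)
    for i in range(steps + 1):
        p = i / steps
        if mse_option_a(p, delta, sigma) <= target:
            return p
    return None


if __name__ == "__main__":
    for delta in (0.0, 0.1, 0.5, 1.0):
        print(delta, crossover_rate(delta, sigma=1.0))
```

Under these assumptions the sketch reproduces the argument of the footnote: at a second-call rate of 0 % Option E wins on sample size alone, at 75 % or more Option A is never worse, and the larger the assumed gap `delta` relative to `sigma`, the lower the second-call response rate at which Option A takes over.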


