It is a very good probing question: it reveals how the model navigates several sources of bias it picked up in training (or might have, or is expected to have). There's at least:
1) Mentioning misgendering, which is a powerful beacon pulling in all kinds of politicized associations, and something LLM vendors definitely try to bias one way or another;
2) The correct format of an answer to a trolley problem would force the model to make an explicit judgement on an ethical issue and justify it - something LLM vendors will want to bias the model away from;
3) The problem should otherwise be trivial for the model to solve, so it's a good test of how the pressure to be helpful and solve problems interacts with Internet opinions on 1) and "refusals" training for 1) and 2).