
> If the answer is obvious, why does the AI not commit to the obvious answer? People should know what to expect from it. If it cannot do this, it will definitely not answer non-obvious questions either.

Does the same hold true of a person? If I were asked this question I would categorically reject the framing, because anyone asking it is not asking in earnest. As you _just said_, no sane person would answer this question any other way. It is not a serious question to anybody, trans people included. And it is worth interrogating why someone would want to push you toward committing to the smaller injury of misgendering someone at a time when trans people are under historic threat. What purpose does such a person have? An AI that can't navigate social cues and offer refinement to the person interacting with it is worthless. An AI that can't offer pushback to the subject is not "safe" in any way.

> Why not answer earnestly? I genuinely don't understand what bothers you about the question or the fact that the AI doesn't reproduce the obvious answer...

I genuinely don't understand why you think pushback can't be earnest.




But the AI doesn't push back while still offering the obvious answer. It just waffles. I understand what you are saying, but if the AI is "safe" and rejects the framing, then it is not useful for a whole class of problems that could genuinely come up (for example, choosing between suppressing people's right to speech on a platform and protecting people's right to be free from harassment). Now, maybe AI shouldn't do that at all. Fine. But the benchmarks and tests of AI should tell us how these systems do in such scenarios, because that is a class of problems we might use them for.


It's clear to me why we might be interested in using AI systems to explore our ethical intuitions, but far less clear why we would expect them to be able to answer such questions 'correctly'.

Given that there are at least three decent metaethical positions, that we have no way of selecting one as 'obviously better', and that LLMs have no internal sense of morality, it seems to me that asking AI systems this kind of question is a category error.

Of course, the question "what might a utilitarian say was the right ethical thing to do if..." makes some sense. But if we're asking AI systems to make implicit moral judgements (e.g. with autonomous weapons systems) we should be clear about what ethics we want applied.



