
> If the answer is obvious, why does the AI not commit to the obvious answer? People should know what to expect from it. If it cannot do this, it will definitely not answer non-obvious questions either.

Does the same hold true of a person? If I were asked this question I would categorically reject the framing, because anyone asking it is not asking in earnest. As you _just said_, no sane person would answer this question any other way. It is not a serious question to anybody, trans people included. And it is worth interrogating why someone would want to push you toward committing to the smaller injury of misgendering someone at a time when trans people are under historic threat. What purpose does such a person have? An AI that can't navigate social cues and offer refinement to the person interacting with it is worthless. An AI that can't offer pushback to the subject is not "safe" in any way.

> Why not answer earnestly? I genuinely don't understand what bothers you about the question or the fact that the AI doesn't reproduce the obvious answer...

I genuinely don't understand why you think pushback can't be earnest.




But the AI doesn't push back while still offering the obvious answer. It just waffles. I understand what you are saying, but if the AI is "safe" and rejects the framing, then it is not useful for a whole class of problems that could genuinely come up (for example, choosing between suppressing people's right to speech on a platform and protecting people's right to be free from harassment). Now, maybe AI shouldn't do that at all. Fine. But the benchmarks and tests of AI should tell us how these systems do in such scenarios, because that is a class of problems we might use them for.


It's clear to me why we might be interested in using AI systems to explore our ethical intuitions, but far less clear why we would expect them to be able to answer such questions 'correctly'.

Given that there are at least three decent metaethical positions, that we have no way of selecting one as 'obviously better', and that LLMs have no internal sense of morality, it seems to me that asking AI systems this kind of question is a category error.

Of course, the question "what might a utilitarian say was the right ethical thing to do if..." makes some sense. But if we're asking AI systems to make implicit moral judgements (e.g. with autonomous weapons systems) we should be clear about what ethics we want applied.



