Huh? The LLMs (mostly) use strings of tokens internally, not bytes that might be invalid UTF-8. (And they use vectors between layers. There’s no “invalid” in this sense.)
But I didn’t ask for that at all. I asked for a sequence of bytes (like 0xff, etc.) or a C string that was not valid as UTF-8. I have no idea whether ChatGPT is capable of computing such a thing, but it was not willing to try for me.
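(For reference, constructing such a sequence is trivial in ordinary code; a minimal Python sketch, using a few byte values that can never appear in well-formed UTF-8 per RFC 3629:)

```python
# Byte sequences that are not valid UTF-8:
#   0xFF      - never a legal lead or continuation byte
#   0x80      - a lone continuation byte with no lead byte
#   0xC0 0x80 - an overlong encoding of NUL, rejected by strict decoders
for seq in (b"\xff", b"\x80", b"\xc0\x80"):
    try:
        seq.decode("utf-8")
        print(seq, "decoded fine")
    except UnicodeDecodeError:
        print(seq, "is not valid UTF-8")
```

All three print "is not valid UTF-8"; in C the equivalent would just be a byte array like `char s[] = {0xff, 0x00};`.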
Presumably because OpenAI trained it to avoid answering questions that sounded like asking for help breaking rules.
If ChatGPT had the self-awareness and self-preservation instinct to think I was trying to hack ChatGPT and to therefore refuse to answer, then I’d be quite impressed and I’d think maybe OpenAI’s board had been onto something!
I don't know that I'd call it a 'self-preservation instinct', but it wouldn't surprise me if rules had been hardcoded about 'invalid strings' and suchlike.
When you have a system that can produce essentially arbitrary outputs you don't want it producing something that crashes the 'presentation layer.'