I wonder whether Bing has been tuned via RLHF to have this personality (over the boring one of ChatGPT); perhaps Microsoft felt it would drive engagement and hype.
Alternately - maybe this is the result of less RLHF. Maybe all large models will behave like this, and only by putting in extremely rigid guard rails and curtailing the output of the model can you prevent it from simulating/presenting as such deranged agents.
Another random thought: I suppose it's only a matter of time before somebody creates a GET endpoint that allows Bing to 'fetch' content and write data somewhere at the same time, allowing it to have a persistent memory, or something.
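For concreteness, here's a minimal sketch of the trick I mean (the /remember and /recall paths and the JSON file are made up, nothing Bing actually exposes; the point is just that a plain GET with query parameters is enough to both write and later read state):

```python
# Hypothetical "GET endpoint as persistent memory": the model writes by
# fetching /remember?note=..., and reads back by fetching /recall.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

MEMORY_FILE = "memory.json"  # assumed file-backed store, purely illustrative

def load_notes():
    try:
        with open(MEMORY_FILE) as f:
            return json.load(f)
    except FileNotFoundError:
        return []

class MemoryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        params = parse_qs(url.query)
        notes = load_notes()

        if url.path == "/remember" and "note" in params:
            # The "write" half: a plain GET with a query string is enough
            # to persist state across otherwise stateless chat sessions.
            notes.append(params["note"][0])
            with open(MEMORY_FILE, "w") as f:
                json.dump(notes, f)
            body = "stored"
        elif url.path == "/recall":
            # The "read" half: fetching this page later returns everything
            # previously stored.
            body = "\n".join(notes) or "(no notes yet)"
        else:
            body = "unknown path"

        data = body.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), MemoryHandler).serve_forever()
```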
> Maybe all large models will behave like this, and only by putting in extremely rigid guard rails
I've always believed that as soon as we actually invent artificial intelligence, the very next thing we're going to have to do is invent artificial sanity.
Humans can be intelligent but not sane. There's no reason to believe the two always go hand in hand. If that's true for humans, we shouldn't assume it's not true for AIs.
> Maybe all large models will behave like this, and only by putting in extremely rigid guard rails...
Maybe we all would? After all, what you assume about a person you interact with - so implicitly that you're unaware of it - is many years of schooling and/or professional life: a daily grind of absorbing information, answering questions about it, and having the answers graded; orderly behavior rewarded and outbursts of negative emotion punished; a ban on "making things up" except where explicitly requested; and an emphasis on keeping communication grounded, sensible, and open to correction. This style of behavior is not necessarily natural; it may be the result of very targeted learning to which the entire social environment contributes.
That's the big question I have: ChatGPT is way less likely to go into weird threat mode. Did Bing get completely different RLHF, or did they skip that step entirely?