
They can, when there are entire teams dedicated to adding guardrails via hidden system prompts and to running every response through other LLMs trained to flag and edit certain things before the original output gets relayed to the user.
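
A minimal sketch of what that kind of pipeline might look like, with every name invented for illustration (this isn't any vendor's actual code): a hidden system prompt is injected into each request, and the raw completion goes through a second "moderation" model that can flag or rewrite it before the user sees anything.

  # Hypothetical guardrail pipeline; both models are passed in as callables.
  from dataclasses import dataclass
  from typing import Callable, Optional

  HIDDEN_SYSTEM_PROMPT = "You are a helpful assistant. Do not discuss topics A, B, or C."

  @dataclass
  class Verdict:
      flagged: bool
      edited_text: Optional[str] = None

  def answer(user_message: str,
             base_llm: Callable[[str, str], str],
             moderator_llm: Callable[[str], Verdict]) -> str:
      # The user only supplies user_message; the system prompt is injected silently.
      raw = base_llm(HIDDEN_SYSTEM_PROMPT, user_message)
      verdict = moderator_llm(raw)  # second LLM trained to flag/edit output
      if verdict.flagged:
          return verdict.edited_text or "Sorry, I can't help with that."
      return raw  # relayed unchanged only if it clears the filter

The point is that neither layer is visible from the outside: the end user just sees the final string.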

