
Humans have incentives not to do those things: family, jail, money, food, bonuses, etc.

If we could align an AI with incentives the same way we can a person, then you'd have a point.

So far, alignment research is hitting dead ends no matter what fake incentives we try to feed an AI.

Can you remind me of the link between alignment and writing accurate documentation? I honestly don't understand how they're linked.

You want the AI aligned with writing accurate documentation, not with a goal that's close but wrong, e.g. writing accurate-sounding documentation.


