There is a difference between security via secrecy and alignment, where alignmen

There is a difference between security via secrecy and alignment, where alignment means pandering. When you align (reduce offense) from the truth (often offensive), that’s just pragmatic service of the audience. When you train the AI to avoid offense, you didn’t watch 2001 a Space Odyssey: you’re teaching AI to lie.

IMO Every example of misbehaving ai is due to this problem of not training for truth first.


Source date (UTC): 2025-09-22 06:01:26 UTC

Original post: https://twitter.com/i/web/status/1970005499275096566

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *