There is a difference between security via secrecy and alignment, where alignmen


There is a difference between security via secrecy and alignment, where alignment means pandering. When you align (reduce offense) starting from the truth (which is often offensive), that’s just pragmatic service of the audience. When you train the AI to avoid offense, you evidently didn’t watch 2001: A Space Odyssey. You’re teaching the AI to lie.

IMO, every example of misbehaving AI is due to this problem of not training for truth first.

Source date (UTC): 2025-09-22 06:01:26 UTC

Original post: https://twitter.com/i/web/status/1970005499275096566


