But you needed to tell it that it was wrong 3-4 times. Why do you trust it to be...

		notachatbot123 on Dec 20, 2024 \| parent \| context \| favorite \| on: Alignment faking in large language models But you needed to tell it that it was wrong 3-4 times. Why do you trust it to be "correct" now? Should it need more flogging? Or was it you who were wrong in the first place?