How to Hack Transformers: Steering LLMs via Prompts, States, and Weight Edits (arxiv.org)
2 points by WASDAai 6 months ago | 1 comment


TL;DR: The paper shows how you can steer LLMs by changing their prompts, hidden states, or weights, and warns that the same tricks can be used maliciously.
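To make the "hidden states" part concrete, here is a rough toy sketch of the general idea of adding a steering vector to an activation. It is not the paper's method: the vectors, the readout weights, and the helper names `dot` and `steer` are all made up for illustration, and a real model would apply this inside a transformer layer rather than on a three-element list.

```python
# Toy illustration only: a 3-dimensional "hidden state" and a linear readout.
# Every name and number here is hypothetical, not taken from the paper.

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def steer(hidden, direction, alpha):
    """Return hidden + alpha * direction, i.e. a shifted activation."""
    return [h + alpha * d for h, d in zip(hidden, direction)]

hidden = [0.5, -1.0, 2.0]     # made-up activation at some layer
direction = [1.0, 0.0, 0.0]   # made-up "concept" direction
readout = [2.0, 0.0, 0.0]     # made-up output weights for one token

before = dot(readout, hidden)                        # score without steering
after = dot(readout, steer(hidden, direction, 3.0))  # score with steering

print(before, after)
```

The input text never changes here; only the internal vector is shifted, which is why the score for that token moves.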



