How to Hack Transformers: Steering LLMs via Prompts, States, and Weight Edits | Hacker News

		How to Hack Transformers: Steering LLMs via Prompts, States, and Weight Edits (arxiv.org)
		2 points by WASDAai 6 months ago | 1 comment

WASDAai 6 months ago

TL;DR: The paper shows how you can steer LLMs by messing with prompts, hidden states, or weights, and warns that the same tricks can be used maliciously.
