littlestymaar 51 days ago | on: Embarrassingly simple self-distillation improves c...

If what you mean by “on-demand training” is fine-tuning, it's going to be much more efficient on a small model than on a big one.

red75prime 51 days ago

LoRA can work with big models, but I mean sample-efficient RL.
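
For context on the LoRA point, here is a rough sketch of the parameter arithmetic behind it. LoRA freezes a weight matrix of shape d_out x d_in and trains two low-rank factors instead, so the trainable count is rank * (d_in + d_out) rather than d_in * d_out. The layer widths and the rank of 8 below are illustrative assumptions, not numbers from the thread, and the function names are hypothetical helpers.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter on the same matrix.

    The adapter adds two factors: A (rank x d_in) and B (d_out x rank).
    The original weight matrix stays frozen.
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical square layer widths, chosen only to show the trend.
    for d in (1024, 4096, 8192):
        full = full_finetune_params(d, d)
        lora = lora_params(d, d, rank=8)
        print(f"d={d}: full={full:,} lora={lora:,} ratio={lora / full:.4%}")
```

At a fixed rank the trainable fraction shrinks as the layer gets wider, which is one reason adapters stay cheap to train on large models. It says nothing about sample efficiency, which is the separate question raised in the reply.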
