They add new data to the existing base model via continuous pre-training. You sa... | Hacker News


rockinghigh 4 days ago | on: GPT-5.2

They add new data to the existing base model via continuous pre-training. You save on pre-training (the next-token prediction task), but you still have to re-run the mid- and post-training stages: context length extension, supervised fine-tuning, reinforcement learning, safety alignment ...

astrange 4 days ago

Continuous pretraining has issues because the model starts forgetting the older stuff (often called catastrophic forgetting). There is some research into other approaches.
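To make the forgetting point concrete, here is a toy sketch, not anything resembling how a real LLM is trained. It assumes a one-parameter model `y = w * x` and two made-up datasets (`old_data`, `new_data`) that stand in for the original corpus and the newly added data. All names and numbers are hypothetical. Training on the old data and then continuing on the new data alone drags the single weight away from what the old data needed:

```python
def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(w, data, lr=0.05, steps=200):
    """Plain full-batch gradient descent on the MSE, starting from weight w."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


xs = [0.5, 1.0, 1.5, 2.0]
old_data = [(x, 2.0 * x) for x in xs]    # stand-in for the original corpus
new_data = [(x, -1.0 * x) for x in xs]   # stand-in for the newly added data

w_base = train(0.0, old_data)            # "pre-training" on the old data
w_continued = train(w_base, new_data)    # continued training on new data only

print("old-data loss before:", mse(w_base, old_data))
print("old-data loss after: ", mse(w_continued, old_data))
print("new-data loss after: ", mse(w_continued, new_data))
```

With one weight the two datasets conflict completely, so the effect is exaggerated. A large model has far more capacity, but the same pull toward the most recent data is what the forgetting concern is about.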
