“Pretraining” is not a correlate of the learning paradigms, it is a correlate of the “fine-tuning” process.
Also LLM pretraining is unsupervised. Dwarkesh is wrong.
“Pretraining” is not a correlate of the learning paradigms, it is a correlate of the “fine-tuning” process.
Also LLM pretraining is unsupervised. Dwarkesh is wrong.