They are using a vision language model to generate the robot motions token by to...

		imtringued on Nov 10, 2024 \| parent \| context \| favorite \| on: Physical Intelligence's first generalist policy AI... They are using a vision language model to generate the robot motions token by token. They are being bottlenecked by the inference of the VLM.