
		ghm2199 23 days ago \| on: Nvidia greenboost: transparently extend GPU VRAM u...

		And incidentally, prefill is also how caching, say, a system prompt saves you some $ on API usage with LLM providers. The KV cache for the system prompt is computed once and reused, so on later requests they only compute the KV cache for the new tokens that come after the system prompt.
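
To make the saving concrete, here is a toy sketch of prefix caching. Everything in it is hypothetical (the `PrefixCache` class, the fake "KV entries", the token counts); it is not any provider's actual implementation, and a real server would cache per-layer attention tensors rather than tuples. It only shows the bookkeeping: the shared prefix is prefilled once, and later requests pay only for their new tokens.

```python
from typing import Dict, List, Tuple

# Stand-in for a real per-token key/value tensor: (position, token_id).
KVEntry = Tuple[int, int]


class PrefixCache:
    """Toy prefix cache: stores the 'KV cache' of a shared prompt prefix."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[int, ...], List[KVEntry]] = {}
        self.tokens_computed = 0  # how many tokens we actually ran prefill on

    def _compute_kv(self, tokens: List[int], start: int) -> List[KVEntry]:
        # Pretend this is the expensive forward pass over `tokens`.
        self.tokens_computed += len(tokens)
        return [(start + i, tok) for i, tok in enumerate(tokens)]

    def prefill(self, prefix: List[int], suffix: List[int]) -> List[KVEntry]:
        key = tuple(prefix)
        if key not in self._store:
            # Cache miss: pay for the prefix once and remember the result.
            self._store[key] = self._compute_kv(prefix, 0)
        # Hit or miss, only the new tokens after the prefix are computed here.
        kv_suffix = self._compute_kv(suffix, len(prefix))
        return self._store[key] + kv_suffix


cache = PrefixCache()
system_prompt = list(range(1000))  # assumed 1000-token system prompt

kv_first = cache.prefill(system_prompt, [7, 8, 9])  # miss: 1000 + 3 tokens
after_first = cache.tokens_computed

kv_second = cache.prefill(system_prompt, [4, 5])    # hit: only 2 tokens
after_second = cache.tokens_computed

print(after_first, after_second - after_first)
```

The second request still gets a full-length KV cache back, it just did not have to recompute the first 1000 positions, which is the part a provider can bill at a discount.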