It's a cool release, but if someone on the Google team reads this:
Flash 2.5 is great in terms of latency and total response time when reasoning is off. In quick tests, this model seems to be about 2x slower, so for certain use cases like quick one-token classification, Flash 2.5 is still the better model.
Please don't stop optimizing for that!
>You cannot disable thinking for Gemini 3 Pro. Gemini 3 Flash also does not support full thinking-off, but the minimal setting means the model likely will not think (though it still potentially can). If you don't specify a thinking level, Gemini will use the Gemini 3 models' default dynamic thinking level, "high".
I was talking about Gemini 3 Flash, and you absolutely can disable reasoning: just send a thinking budget of 0. It's strange that they don't mention this, but it works.
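Concretely, this is the kind of request body I mean, a minimal sketch using the Gemini REST API field names (the prompt text is just an illustration):

```json
{
  "contents": [
    { "parts": [ { "text": "Classify the sentiment of this review as POS or NEG: 'Great product.'" } ] }
  ],
  "generationConfig": {
    "thinkingConfig": { "thinkingBudget": 0 }
  }
}
```

POST that to the model's generateContent endpoint; with the budget at 0, the response comes back without any thinking tokens.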