It could be using the same early trick of Grok (at least in the earlier versions) that they boot 10 agents who work on the problem in parallel and then get a consensus on the answer. This would explain the price and the latency.
Essentially a newbie trick that works really well but not efficient, but still looking like it's amazing breakthrough.
(if someone knows the actual implementation I'm curious)
Essentially a newbie trick that works really well but not efficient, but still looking like it's amazing breakthrough.
(if someone knows the actual implementation I'm curious)