I have this toy agent I'm writing, and I always laugh that I, a human, write code that generates human-readable markdown, feed it to an LLM and ask it to produce JSON, so I can parse it (with code I wrote, or it wrote) and output it in a consistent human-readable form.
I'm thinking about letting it output freeform text and then using another model to force that into a structured format.
I've found this approach does give slightly better results. Let the model "think" in natural language, then translate its conclusions to JSON. (Vibe checked, not benchmarked.)
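Roughly what I do, as a minimal sketch (assuming an OpenAI-style chat client; the model name, prompts, and JSON keys are just placeholders, not anything canonical):

```python
# Two-pass sketch: pass 1 reasons in freeform prose, pass 2 only converts
# that prose into JSON. Assumes the openai Python client; swap in whatever
# model/provider you actually use.
import json
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, **kwargs) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": prompt}],
        **kwargs,
    )
    return resp.choices[0].message.content

task_markdown = "## Task\nSummarize the open issues and suggest next steps."

# Pass 1: let the model think out loud in natural language.
analysis = ask(
    "Work through this task and explain your conclusions in plain prose:\n"
    + task_markdown
)

# Pass 2: a second call just translates those conclusions into JSON.
structured = ask(
    "Convert the conclusions below into JSON with keys "
    "'summary', 'actions', 'confidence':\n" + analysis,
    response_format={"type": "json_object"},  # JSON mode, where supported
)
result = json.loads(structured)
print(result["summary"])
```

The second pass is doing pure translation rather than reasoning, so it's cheap and rarely the step that goes wrong.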
I believe it could be true because I suspect the training data contained a lot more YAML than JSON. I mean... you know how much YAML gets churned out every second?
(I'm pretty sure this is actually what drove Microsoft Sydney insane.)
Reasoning models can do better at this, because they can write out good freeform output and then do another pass to transform it.