minimal_action 14 days ago \| on: AGENTS.md outperforms skills in our agent evals

It's very interesting, but presenting success rates without any measure of error, or at least inline details about the number of iterations, is unprofessional. That matters most for small differences, or when you report the "same" performance.
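To make the point concrete, here is a rough sketch of one common way to attach an error measure to a success rate: a Wilson score interval. The run counts below (8/10 vs 7/10) are made up for illustration and are not taken from the article, and `wilson_interval` is just a helper name I picked.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial success rate.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")

    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    return center - half_width, center + half_width


# Hypothetical numbers, NOT from the article: two setups, 10 runs each.
low_a, high_a = wilson_interval(8, 10)
low_b, high_b = wilson_interval(7, 10)

# Same observed rate as setup A, but with 100x the runs.
low_c, high_c = wilson_interval(800, 1000)

print(f"A: 8/10     -> [{low_a:.3f}, {high_a:.3f}]")
print(f"B: 7/10     -> [{low_b:.3f}, {high_b:.3f}]")
print(f"C: 800/1000 -> [{low_c:.3f}, {high_c:.3f}]")
```

With only 10 runs per setup the two intervals overlap heavily, so "80% vs 70%" on its own says very little. The same 80% over 1000 runs gives a much tighter interval, which is why the number of iterations needs to be reported alongside the rate.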