Hacker News
minimal_action 14 days ago | on: AGENTS.md outperforms skills in our agent evals
It's very interesting, but presenting success rates without any measure of error, or at least inline details about the number of iterations, is unprofessional. That matters most for small differences, or where you report the "same" performance.
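To make the point concrete, here is a minimal sketch of one common way to attach an error measure to a success rate: a Wilson score interval for a binomial proportion. The function name `wilson_interval` and the run counts (16/20 vs. 14/20) are hypothetical illustrations, not numbers from the article, and other interval methods would serve equally well.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly a 95% confidence level.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half


# Hypothetical eval results (NOT from the article): two setups, 20 runs each.
lo_a, hi_a = wilson_interval(16, 20)  # observed success rate 80%
lo_b, hi_b = wilson_interval(14, 20)  # observed success rate 70%

print(f"A: 80% success, 95% interval [{lo_a:.2f}, {hi_a:.2f}]")
print(f"B: 70% success, 95% interval [{lo_b:.2f}, {hi_b:.2f}]")
```

With only 20 runs per setup, the two intervals overlap heavily, so a 10-point gap on its own says little. That is why the number of iterations needs to be reported alongside the rates.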