Hacker News
Codex Daily Benchmarks for Degradation Tracking (Marginlab.ai) (marginlab.ai)
1 point by wendgeabos 13 days ago
Claude Code daily benchmarks for degradation tracking (marginlab.ai)
760 points by qwesr123 13 days ago | 355 comments
No one is evaluating AI coding agents in the way they are used (marginlab.ai)
1 point by qwesr123 29 days ago
Claude Code Daily Degradation Tracker (marginlab.ai)
3 points by qwesr123 33 days ago | 3 comments
Anatomy of a Coding Agent: A step-by-step illustration (marginlab.ai)
3 points by qwesr123 51 days ago
How are coding assistants evaluated? SWE-Bench Pro Explorer (marginlab.ai)
2 points by qwesr123 53 days ago
SWE-Bench: The $500B Benchmark (marginlab.ai)
5 points by qwesr123 55 days ago