People keep saying stuff like this. That the improvements are so obvious and bre...

aspenmartin · 2026-01-02T20:54:59 1767387299

You might want to be more specific because benchmarks abound and they paint a pretty consistent picture. LMArena "vibes" paint another picture. I don't know what you are doing to "check" the frontier LLMs but whatever you're doing doesn't seem to match more careful measurement...

You don't actually have to take people's word for it: read epoch.ai developments, look into the benchmark literature, look at ARC-AGI...

qualifck · 2026-01-05T16:20:10 1767630010

That's half the problem though. I can see benchmarks. I can see number go up on some chart or that the AI scores higher on some niche math or programming test, but those results don't seem to actually connect much to meaningful improvements in daily usage of the software when those updates hit the public.

That's where the skepticism comes in, because one side of the discussion is hyping up exponential growth and the other is seeing something that looks more logarithmic instead.

I realize anecdotes aren't as useful as numbers for this kind of analysis, but there's such a wide gap between what people are observing in practice and what the tests and metrics are showing that it's hard not to wonder about those numbers.

senordevnyc · 2026-01-02T01:35:04 1767317704

I’m genuinely curious what your “checking the frontier LLMs” looks like, especially if you haven’t used AI since last year.

jennyholzer3 · 2026-01-02T14:20:32 1767363632

"maybe a tiny bit better" is what you say when you've been tricked by snake oil salesman

This shit has gotten worse since 2023.

aspenmartin · 2026-01-02T20:56:47 1767387407

> This shit has gotten worse since 2023.

I would really appreciate it if people could be specific when they say stuff like this because it's so crazy out of line with all measurement efforts. There are an insane number of serious problems with current LLM / agentic paradigms, but the idea that things have gotten worse since 2023? I mean come on.

senordevnyc · 2026-01-03T01:34:18 1767404058

You’re responding to a troll who just has a nasty, bitter axe to grind against AI. It’s honestly pretty sad and pathetic.