Keep in mind that this is the stupidest the LLM will ever be and we can expect m...

llamaLord · on Jan 13, 2025

My experience observing commercial LLMs since the release of GPT-4 is actually the opposite of this.

Sure, they've gotten much cheaper on a per-token basis, but that cost reduction has come with a non-trivial accuracy/reliability cost.

The problem is, tokens that are 10x cheaper are still useless if what they say is straight up wrong.

maeil · on Jan 13, 2025

> Sure, they've gotten much cheaper on a per-token basis, but that cost reduction has come with a non-trivial accuracy/reliability cost.

This only holds for OpenAI.

maeil · on Jan 13, 2025

> Keep in mind that this is the stupidest the LLM will ever be and we can expect major improvements every few months.

We have seen no noticeable improvements (at usable prices) in the 7 months since the original Sonnet 3.5 came out.

Maybe specialized hardware for LLM inference will improve so rapidly that o1 (full) will be quick and cheap enough a year from now, but it seems extremely unlikely. For the end user, the top models hadn't gotten cheaper for more than a year until the release of Deepseek v3 a few weeks ago. Even that is currently very slow at non-Deepseek providers, and who knows just how subsidized the pricing and speed at Deepseek itself are, given political interests.

Eliezer · on Jan 13, 2025

No major AI advancements for 7 months? Guess everyone's jobs are safe for another year, and after that we're all dead?

maeil · on Jan 16, 2025

> No major AI advancements for 7 months?

For my caveat "at usable prices", no, there haven't been any. o1 (full) and now o3 have been advancements, but are hardly available for real-world use given limitations and pricing.

sdesol · on Jan 13, 2025

> we can expect major improvements every few months.

I'm not sure this is grounded in reality. We've already seen articles about how OpenAI is behind schedule with GPT-5. I do believe things will improve over time, mainly due to advancements in hardware. With better hardware, we can better brute-force correct answers.

> junior devs will always be junior devs

Junior developers turn into senior developers over time.

smcnally · on Jan 13, 2025

> I'm not sure this is grounded in reality. We've already seen articles related to how OpenAI is behind schedule with GPT-5.

Progress by Google, Meta, Microsoft, Qwen and Deepseek is unhampered by OpenAI’s schedule. Their latest (including Gemini 2.0, Llama 3.3, Phi 4) and the coding fine-tunes that follow are all pretty good.

sdesol · on Jan 13, 2025

> unhampered by OpenAI’s schedule

Sure, but if the advancements are just catching up to OpenAI, then major improvements by other vendors are nice and all, but I don't believe that was what the commenter was implying. Right now the leaders, in my opinion, are OpenAI and Anthropic, and unless they are making major improvements every few months, the industry as a whole is not making major improvements.

smcnally · on Jan 13, 2025

OpenAI and Anthropic are definitely among the leaders. Playing catch-up to these leaders' mind-share and technology is some of the motivation for others. Calling the progress being made in the space by Google (Gemini), MSFT (Phi), Meta (Llama), Alibaba (Qwen) "nice and all" is a position you might be pleasantly surprised to reconsider if this technology interests you. And don't sleep on Apple and AMZ.

In the space covered by Tabby, Copilot, aider, Continue and others, capabilities continue to improve considerably month-over-month.

In the segments of the industry I care most about, I agree 100% with what the commenter said w/r/t expecting major improvements every few months. Pay even passing attention to huggingface and github and see work being done by indies as well as corporate behemoths happening at breakneck pace. Some work is pushing the SOTA. Some is making the SOTA more widely available. Lots of it is different approaches to solving similar challenges. Most of it benefits consumers and creators looking to use and learn from all of this.

harvodex · on Jan 13, 2025

I wish this were true. As a shitty programmer who is old, I would benefit from this as much as anyone here, but I think it is delusional.

From my experience I wouldn't even say LLMs are stupid. The LLM is a carrier and the intelligence is in the training data. Unfortunately, the training data is not going to get smarter.

If any of this had anything to do with reality, then we should already have an awesome programming-specific model trained only on CS and math textbooks. Of course, that doesn't work, because the LLM is not abstracting concepts in the way we normally associate with being stupid or intelligent.

It is hardly shocking that next-token prediction on math and CS textbooks is of limited use. You hardly have to think about it to see how flawed the whole idea is.

n144q · on Jan 13, 2025

GitHub Copilot came out in 2021.