As if anyone riding this wave and making billions is not sitting on top of thousands of papers and millions of lines of open source code. And as if releasing llama is one of the main reasons we got here in AI…
I’m almost shocked this spooked the market as much as it did, as if the market were so blind to past technological innovation as to not see this coming.
Innovation ALWAYS follows this path. Something is invented in a research capacity. Someone implements it for the ultra rich. The price comes down and it becomes commoditized. It was inevitable that “good enough” models would become ultra cheap to run as they were refined and made efficient. Anybody looking at LLMs could see they were a brute-forced result, wasting untold power, because they “worked” despite how much overkill they were to get to the end result. Them becoming lean was the obvious next step, now that they had gotten good to the point of diminishing returns.
Sure, but what nobody expected was how QUICKLY the efficiency progress would come - aviation took about 30 years to go from "the rich" to "everybody", personal computers about 20 years (from the 1980s to the 2000s). I think the market expected at least 10 years of "rich premium" - not 2 years and getting taken to the cleaners by the economic archenemy, China.
The Google transformer paper was 2017. ChatGPT was the “we can give a version of this away for free” moment. Llama was “we can afford to give the whole product away for free to even the playing field.” Every tech giant comes out with a comparable product simultaneously. And now a hedge fund, not even a megacap company, can churn out a clone by hiring a small or medium-size engineering team.
Really this should be an indictment of corporate bloat: companies with hundreds of thousands of employees distracted by performance reviews, shareholders, marketing, and rebuilding the same product they launched two years ago under a new name.
> Really this should be an indictment of corporate bloat: companies with hundreds of thousands of employees distracted by performance reviews, shareholders, marketing, and rebuilding the same product they launched two years ago under a new name.
Yeah.
There are some shorter words or acronyms for it, though, roughly equivalent to your ~30-word paragraph above:
IBM
DEC
Novell
Oracle
MS
Sun
HP
...
MBA
, all in their worst days or incarnations or ...
The notion I now believe more fully is that the money people - managers, executives, investors, and shareholders - like to hear about things in units they understand (i.e., money). They don't understand the science or the maths, and insofar as they might acknowledge it exists, it's an ambient concern: those things happen anyway (as far as they can tell), so they don't know how to value them (or don't value them).
Because we saw, what, a week ago, the leading indicator that the money people were now feeling happy they were in charge: that weird not-quite-government US$500 billion AI investment announcement. And we saw the same breathless reporting when Elon Musk founded xAI and had "built the largest AI computer cluster!"... as though that statement actually meant anything.
There was a whole heavily implied analogy going on of "more money (via GPUs) === more powerful AIs!" - ignoring any reality of how those systems worked, their scaling rules, or the fact that inference tended to run on exactly 1 GPU.
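The single-GPU point is really just back-of-envelope arithmetic: the VRAM needed to hold a model's weights is parameter count times bytes per parameter. A rough sketch (the 7B parameter count and byte widths below are illustrative assumptions, not figures from this thread):

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Rough VRAM needed just to hold a model's weights, in GB.

    Ignores activation memory and KV cache, so this is a lower bound.
    """
    return n_params * bytes_per_param / 1e9

# A hypothetical 7B-parameter model in fp16 (2 bytes per parameter):
print(weight_memory_gb(7e9, 2))    # 14.0 -> fits on a single 16-24 GB GPU

# The same model quantized to 4-bit (0.5 bytes per parameter):
print(weight_memory_gb(7e9, 0.5))  # 3.5 -> fits on an ordinary consumer card
```

Which is why a warehouse of GPUs buys you training throughput and serving capacity, not a categorically "more powerful AI" at inference time.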
Even the internet activist types bought into this, because people complaining about image generators just could not be convinced that the Stable Diffusion models ran locally on extremely limited hardware (the number of arguments where people would imply some hardware gate while I'm sitting there with the web GUI open in another window on my 4-year-old PC).
I would generally agree, but the market isn't rational about the future prospects of a company. It's rational about "can I make money off this stock" and nothing else matters in the slightest.
Riding hype, and dumping at the first sign of issues, follows that perfectly well.
Sure, but it's good to recognize that Meta never stopped publishing, even after OpenAI and DeepMind, most notably, stopped sharing the good sauce. From CLIP to DINOv2 and the Llama series, it's a serious track record to be remembered.
But there is a big difference: Llama is still way behind ChatGPT, and one of the key reasons to open-source it could have been to use the open-source community to catch up with ChatGPT. DeepSeek, on the contrary, is already on par with ChatGPT.
The R1 distills are still very, very good. I've used Llama 405b, and I would say dsr1-32b is about the same quality, or maybe a bit worse (subjectively within error), and the 70b distill is better.
Right, so it sounds like it's working then given how much people are starting to care about them in this sphere.
We can laugh at that (as I like to do with everything from Facebook's React to Zuck's MMA training), or we can see how others (like DeepSeek and, to a lesser extent, Mistral, and to an even lesser extent, Claude) are doing the same thing to help themselves (and each other) catch up. What they're doing now, by opening these models, will be felt for years to come. It's draining OpenAI's moat.
There's no need to read it uncharitably. I'm the last person you could call an FB fan - I think overall they're a strong net negative to society - but their open-source DL work is quite nice.
This. Even their lesser-known work is pretty solid[1] (used it the other day and was frankly kind of amazed at how well it performed under the circumstances). Facebook/Meta sucks like most social media does, but, not unlike Elon Musk, they are on record as having made some contributions to society as a whole.
<< And as if releasing llama is one of the main reasons we got here in AI…
Wait... are you saying it wasn't? Just releasing it in that form was a big deal (and heavily discussed on HN when it happened). Not to mention, a lot of the work that followed built on Llama, partly because it let researchers and curious people dig deeper into the internals.