> Parent comment never said operating inference at a loss Context. Whether infer...

roadside_picnic · 2025-11-07T01:10:37 1762477837

So you're assuming there's a world where these companies exist solely by providing inference?

The first obvious limitation of this would be that all models would be frozen in time. These companies are operating at an insane loss and a major part of that loss is required to continue existing. It's not realistic to imagine that there is an "inference" only future for these large AI companies.

And again, there are many inference only startups right now, and I know plenty of them are burning cash providing inference. I've done a lot of work fairly close to the inference layer and getting model serving happening with the requirements for regular business use is fairly tricky business and not as cheap as you seem to think.

vel0city · 2025-11-07T01:45:56 1762479956

The models may be somewhat frozen in time but with the right tools available to it they don't need all information innately coded into it. If they're able to query for reliable information to drag in they can talk about things that are well outside their original training data.

roadside_picnic · 2025-11-07T02:46:07 1762483567

For a few months of news this works, but over the span of years even the statistical nature of language drifts a bit. Have you shipped natural language models to production? Even simple classifiers need to be updated periodically because of drift. There is no world where you lead the industry serving LLMs and don't train them as well.

furyofantares · 2025-11-07T01:28:28 1762478908

> So you're assuming there's a world where these companies exist solely by providing inference?

Yes, obviously? There is no world where the models and hardware just vanish.

roadside_picnic · 2025-11-07T02:43:17 1762483397

> and hardware just vanish.

Okay, this tells me you really don't understand model serving or any of the details of infrastructure. The hardware is incredibly ephemeral. Your home GPU might last a few years (and I'm starting to doubt that you've even trained a model at home), but these GPUs have incredibly short lifespans under load for production use.

Even if you're not working on the back end of these models, you should be well aware that one of the biggest concerns about all this investment is how limited the lifetime of GPUs is. It's not just about being "outdated" by superior technology, GPUs are relatively fragile hardware and don't last too long under constant load.

As far as models go, I have a hard time imagining a world in 2030 where the model replies "sorry, my cutoff date was 2026" and people have no problem with this.

Also, you still didn't address my point that startups doing inference only model serving are burning cash. Production inference is not the same as running inference locally where you can wait a few minutes for the result. I'm starting to wonder if you've ever even deployed a model of any size to production.

furyofantares · 2025-11-07T05:37:32 1762493852

I didn't address the comment about how some startups are operating at a loss because it seems like an irrelevant nitpick at my wording that "none of them" is operating inference at a loss. I don't think the comment I was replying to was referring to relying on whatever startups you're talking about. I think they were referring to Google, Anthropic, and OpenAI - and so was I.

That seems like a theme with these replies, nitpicking a minor thing or ignoring the context or both, or I guess more generously I could blame myself of not being more precise with my wording. But sure, you have to buy new GPUs after making a bunch of money burning the ones you have down.

I think your point about knowledge cutoff is interesting, and I don't know what the ongoing cost to keeping a model up to date with world knowledge is. Most of the agents I think about personally don't actually want world knowledge and have to be prompted or fine tuned such that they won't use it. So I think that requirement kind of slipped my mind.

HDThoreaun · 2025-11-07T02:24:16 1762482256

If the game is inference the winners are the cloud mega scalers, not the ai labs.

furyofantares · 2025-11-07T03:21:50 1762485710

This thread isn't about who wins, it's about the implication that it's too risky to build anything that depends on inference because AI companies are operating at a loss.