
G/A can't be used for large scale training, nobody is going to give their data to them. Major trust issues there.

Apple is Apple. Not public. Let's also not mix consumer needs with enterprise.

You're correct: from about eight years ago up until recently, AMD only cared about gamers. They are waking up fast though.

ROCm 5.6 is a visible first step in that regard. MI300 will blow A100s/H100s out of the water.

But again, hardware/software isn't the problem here. The problem is much deeper than that... even if you have those things resolved, you can't put them anywhere.



Nobody is going to give OpenAI or X or Meta access to their models but frankly Google/Amazon are at a scale where they’ve already bypassed this trust issue. People already give their code, their operations, etc to large cloud providers, it’s been that way for 10+ years now.

Your shit isn’t so good that google is going to peek under the covers and steal your shit, because that would actually implode their business when they got caught doing it. The net present value of all of google’s future decades of operation is a lot higher than your hot dog detector app, or even critical F500 business operations.


What you're saying is logical, but the perceived reality is different. There are large-scale AI customers out there who absolutely refuse to use the large public cloud providers for training on the grounds of protecting their data. They want 100% control over it and they want their own segregated data centers.


> Let's also not mix consumer needs with enterprise.

NVidia mixed them and now everything is written in CUDA. Lol.


And we now have hipcc to go back to AMD. Sweet!


Have fun with that. I burned my hand badly enough on OpenCL that I now know to wait for proof, not promises.


People are doing benchmarks on older ROCm releases and it is looking pretty good.

https://www.mosaicml.com/blog/amd-mi250

Waiting on the updates.

I'll add that I have learned over time not to discount motivation. If AMD is motivated, they can do it. This has been proven already with their dominance over the server CPU market.


hipcc is a joke, it doesn't handle everything CUDA is capable of, especially not the polyglot capabilities.


What do you mean by polyglot? As in multiple hardware, or mixed-source? HIP is mostly API-compatible with CUDA, so you can just mix host code and device code with it.

That said, ROCm does indeed work with machine code instead of IR. You can compile fat binaries with more than one type of machine code and they'll work on any of the chips you compiled for, at the cost of the binaries becoming, well, basically obese if you want a decent range of hardware supported.
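To make the fat-binary point concrete, here's a rough sketch of how that looks with hipcc's offload flags (the source file name is made up, and the exact arch list is just an illustration):

```shell
# One HIP source, three GPU ISAs baked into a single "fat" binary:
# gfx90a = MI200-series, gfx942 = MI300-series, gfx1030 = RDNA2 consumer.
# Every extra --offload-arch embeds another full copy of the device
# machine code, which is why broad hardware coverage makes the binary balloon.
hipcc kernel.cpp -o kernel \
    --offload-arch=gfx90a \
    --offload-arch=gfx942 \
    --offload-arch=gfx1030
```

At load time the runtime picks the code object matching the installed GPU; there's no IR to JIT-compile the way CUDA can fall back to PTX.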


> G/A can't be used for large scale training, nobody is going to give their data to them. Major trust issues there.

And yet, that's what a lot of the big AI startups are doing. Granted, it's not what everyday businesses are doing (yet). But TPUs offer a pretty impressive perf/cost ratio, so I'd be surprised if it's actually "nobody".


> that's what a lot of the big AI startups are doing

They don't have any other choice or they are just dumb...

https://www.popsci.com/technology/google-ai-lawsuit/


The fact that this lawsuit exists doesn’t prove anything.

Real evidence that Google or Amazon actually introspected the contents of their cloud platform customers' VMs, databases, GPUs, disks, blob storage buckets etc. would be far more convincing, but such evidence doesn't exist - because it doesn't happen.


It is enough to scare people away and that is all that matters in the grand scheme of things. I know this for a fact.



