Coincidentally, the first issue (referencing Navi 21) was the one I started these experiments with, and this turned out to be pretty informative.
Our Navi 21 would almost always go AWOL after a test run had completed, requiring a full reboot. At some point, I noticed that this only happened when our test runner was driving the test; I never had an issue when testing interactively. I eventually realized that our test driver was simply killing the VM when the test was done, which is fine for a CPU-based test but left the GPU in a bad state. When working interactively, I was always shutting down the guest cleanly, which apparently avoided the problem. Patching our test runner to shut down VMs cleanly fixed this.
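For anyone hitting the same thing, the shape of the fix is roughly the following. This is a minimal sketch, not our actual runner: `stop_vm` and `request_shutdown` are hypothetical names, and I'm assuming the hypervisor runs as a child process of the test runner. `request_shutdown` stands in for whatever asks the guest to power off in your setup (for example, something that issues `virsh shutdown` or a QMP `system_powerdown`).

```python
import subprocess


def stop_vm(proc, request_shutdown, timeout=60.0):
    """Ask the guest to power off and wait; kill only as a last resort.

    proc             -- subprocess.Popen handle of the hypervisor process (assumed)
    request_shutdown -- hypothetical callback that requests a guest power-off
    Returns True if the process exited without being killed.
    """
    if proc.poll() is not None:
        return True  # already exited, nothing to do

    request_shutdown()  # polite request first, so the guest driver can release the GPU
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # Fallback: this is the old "just kill it" behaviour and may leave
        # the GPU in the state that needed a reboot.
        proc.kill()
        proc.wait()
        return False
```

The return value lets the runner flag runs that fell back to a hard kill, since those are the ones where the card may need attention afterwards.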
And I've had no luck with iGPUs, as referenced by the second issue.
From what I understand, I don't think that consumer AMD GPUs can/will ever be fully supported, because the GPU reset mechanisms of older cards are so complex. That's why things like vendor-reset [3] exist, which apparently duplicate a lot of the in-kernel driver code but ultimately only twiddle some bits.
She also had no choice, as SBF was blaming her. The point being that they still didn't really need her help. It was obvious that he committed fraud, and there was plenty of proof of it.
I mean, the guy was constantly high on nootropics and they had no idea what actual investments FTX made. I'd imagine most of the time was just spent untangling that web; his case was more or less a slam dunk.
SemiAnalysis made this a base requirement for being appropriately ranked on their ClusterMAX report, told me it is akin to FAA certifications, and then got hacked themselves for not enforcing simple security controls.
My favorite one is the "We've identified a security hole in your website"... and I always respond quickly that my website is statically generated, with nothing dynamic, and immutable on Cloudflare Pages. For some odd reason, I never hear back from them.
https://github.com/amd/MxGPU-Virtualization/issues/6
https://github.com/amd/MxGPU-Virtualization/issues/16