Huawei enters the GPU market with 96 GB VRAM GPU under 2000 USD, meanwhile NVIDIA sells from 10,000+ (RTX 6000 PRO)

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 2 months ago

nutbutter@discuss.tchncs.de · 2 months ago

You can train or fine-tune a model on any GPU. Sure, it will be slower, but higher VRAM is better.

geneva_convenience@lemmy.ml · 2 months ago

No. The CUDA training stuff is Nvidia only.

herseycokguzelolacak@lemmy.ml · 2 months ago

PyTorch runs on HIP now.
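
For anyone who wants to check this on their own machine, here is a rough sketch. It relies on two things from the PyTorch docs: ROCm builds reuse the `torch.cuda` API, and they set `torch.version.hip` (CUDA builds leave it as `None`). The helper name `describe_backend` is made up for illustration, and the import guard is only there so the snippet still runs where PyTorch isn't installed.

```python
try:
    import torch
except ImportError:  # keep the sketch runnable without PyTorch installed
    torch = None


def describe_backend():
    """Return a short label for the backend PyTorch would use (hypothetical helper)."""
    if torch is None:
        return "pytorch-not-installed"
    if not torch.cuda.is_available():
        return "cpu"
    # ROCm builds expose AMD GPUs through torch.cuda and set torch.version.hip.
    if getattr(torch.version, "hip", None):
        return "rocm"
    return "cuda"


if __name__ == "__main__":
    print(describe_backend())
```

On a ROCm build the device string is still `"cuda"`, so most training scripts written for NVIDIA need no device-selection changes. Whether every op they use is supported is a separate question.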

geneva_convenience@lemmy.ml · edit-2 2 months ago

AMD has been lying about that every year since 2019.

Last time I checked it didn’t. And it probably still doesn’t.

People wouldn’t be buying NVIDIA if AMD worked too. The VRAM prices NVIDIA asks are outrageous.

herseycokguzelolacak@lemmy.ml · 2 months ago

I run llama.cpp and PyTorch on MI300s. It works really well.

geneva_convenience@lemmy.ml · edit-2 2 months ago

Can you train on it too? I tried PyTorch on AMD once and it was awful. They promised mountains but delivered nothing. Newer activation functions were all broken.

llama.cpp is inference only, for which AMD works great too after converting to ONNX. But training was awful on AMD in the past.

herseycokguzelolacak@lemmy.ml · 2 months ago

We have trained transformers and diffusion models on AMD MI300s, yes.

geneva_convenience@lemmy.ml · 2 months ago

Interesting. So why does NVIDIA still hold such a massive monopoly on the datacenter?

herseycokguzelolacak@lemmy.ml · 2 months ago

It takes a long time for large companies to change their purchases. Many of these datacenter contracts are locked in for years. You can’t just change them overnight.

Aria@lemmygrad.ml · 2 months ago

CUDA is not equivalent to AI training. Nvidia offers useful developer tools for using their hardware, but you don’t have to use them. You can train on any GPU or even CPU. The projects you’ve looked at (?) just chose to use CUDA because it was the best fit for the hardware they had on hand, and they were able to tolerate the vendor lock-in.

geneva_convenience@lemmy.ml · 2 months ago

CPU yes. GPU no, in my experience.

Aria@lemmygrad.ml · 2 months ago

I’m not saying you can deploy these in place of Nvidia cards where the tooling is built with Nvidia in mind. I’m saying that if you’re writing code you can do machine learning projects without CUDA, including training.

geneva_convenience@lemmy.ml · 2 months ago

For sure you can work around it. But it’s not optimal and requires additional work most people don’t feel like putting in.