Nvidia CEO Jensen Huang has said that the goal of the company's AI engineers is to "build once, run everywhere" with CUDA. The company plans to maintain its software indefinitely; Huang cited long-lived programming languages like C as the model for that kind of longevity. Today, five million developers across roughly 40,000 companies use CUDA, which offers over 300 code libraries, 600 AI models, and support for 3,700 GPU-accelerated applications, catering to diverse computing needs.
AMD's competitor to NVIDIA's CUDA is ROCm, now at version 6.2, which Sasank Chilamkurthy, the founder of Johnaic, said has an advantage thanks to its support for PyTorch. ROCm was developed to rival CUDA's popularity, and AMD has been cultivating an ecosystem around it with products like its new MI325X accelerators for training and inferencing LLMs.
Vamsi Bopanna, SVP of AI at AMD, explained that ROCm is designed to connect with ecosystem components and frameworks like PyTorch and model hubs like Hugging Face, and AMD also supports open-source frameworks like ONNX Runtime. Bradley McCredie, corporate vice president at AMD, explained that ROCm eliminates dependence on hardware-specific languages like CUDA through frameworks like Triton, making AMD GPUs with ROCm a more cost-effective option than NVIDIA GPUs with CUDA.
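The portability described above shows up at the framework level: ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` API, so a script written against the `cuda` device runs unchanged on either vendor's hardware and can fall back to CPU when no GPU is present. A minimal sketch (the layer sizes here are illustrative, not from the article):

```python
import torch

# On ROCm builds of PyTorch, "cuda" selects the AMD GPU backend (HIP);
# on standard builds it selects NVIDIA CUDA. The calling code is identical.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A tiny illustrative model -- the same lines run on either vendor's GPU.
model = torch.nn.Linear(4, 2).to(device)
x = torch.randn(8, 4, device=device)
y = model(x)

print(y.shape)  # torch.Size([8, 2])
```

This is the sense in which framework-level code sidesteps hardware-specific languages: the vendor difference is absorbed by the installed PyTorch build rather than by the application code.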
However, CUDA is a more mature platform than ROCm. NVIDIA's ecosystem around it is extensive, with a large developer community, thorough documentation, and a broad set of tools for debugging and profiling. While AMD is still developing ROCm to catch up, hardware companies have historically struggled to build and maintain strong software ecosystems. OpenCL is a commonly cited example: although it was supported by several hardware companies, it failed to develop into a strong competitor due to inconsistent support and a lack of ecosystem investment.
At present, CUDA is dominant in GPU programming. NVIDIA controls roughly 95% of the AI chip market, but ROCm offers a compelling alternative for companies that want an open software ecosystem for deploying high-performance AI models optimized for AMD hardware.