Groq is using a free inference tier to compete with Nvidia's CUDA software and attract developers. The AI chip startup offers some of the fastest inference available, and roughly 652,000 developers now hold Groq API keys. Groq uses no kernels and requires no CUDA libraries; instead, it provides users with built-in models that just work. While this gives developers an easy-to-use system, it also means the barrier to entry for Groq users is the same as that of any other cloud provider, and potentially lower than that of other chipmakers.
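To make that concrete, here is a minimal sketch of what calling one of Groq's hosted models looks like with its Python SDK, which follows the OpenAI-style chat-completions pattern common across cloud inference providers. The model ID shown is illustrative, and available models vary.

```python
import os

from groq import Groq  # pip install groq

# The models are pre-built on Groq's side; the developer just picks
# one by name and sends a request -- no kernels, no CUDA libraries.
client = Groq(api_key=os.environ["GROQ_API_KEY"])

completion = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # illustrative model ID
    messages=[
        {"role": "user", "content": "In one sentence, why does inference speed matter?"},
    ],
)
print(completion.choices[0].message.content)
```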
Groq has focused on inference, the side of AI computing that requires less direct programming of the chips themselves. One investor has called companies like Groq insane for attempting to dent Nvidia's estimated 90% market share, but Groq's novel plan for programming its chips gives the company a unique angle on the most crucial element of Nvidia's moat. Groq expects speed to hook developers on its platform, but because those developers never touch the chips directly, the work of troubleshooting and improving the base software falls to Groq itself.
This also means that Groq is unlikely to accumulate a stable of developers continuously improving its base software, as CUDA has. Groq aims to capture market share by providing faster inference and by striking global joint ventures. A venture in Saudi Arabia is already on track, and deals in Canada and Latin America are in the works. Though realistic about the near term, Groq has lofty ambitions: it has set a goal of providing half the world's inference.
The company's strategy comes with some risk, as its offering is more like a restaurant menu than a grocery store: a curated set of ready-to-run models rather than raw ingredients developers can assemble however they like. Mark Heaps, Groq's chief tech evangelist, said training would require more customization at the chip level. Inference, on the other hand, requires Groq to choose the right models and ensure they run as fast as possible.
Though Groq started out as a company with a novel chip design, software now dominates: of its roughly 300 employees, 60% are software engineers. CEO Jonathan Ross says the goal of casting a net over the globe will be achieved via joint ventures, with Groq aiming to ship 108,000 of its language processing units by the first quarter of next year.