Exclusive: Google working to erode Nvidia’s software advantage with Meta-backed PyTorch

Dec 17 (Reuters) – Alphabet’s Google is working on a new initiative to improve how its artificial intelligence chips run PyTorch, the world’s most widely used AI software framework, in a move aimed at weakening Nvidia’s long-standing dominance of the AI computing market, according to people familiar with the matter.

The effort is part of Google’s aggressive plan to make its tensor processing units a viable alternative to market-leading Nvidia GPUs. TPU sales have become a key driver of Google’s cloud revenue growth as it tries to prove to investors that its AI investments are paying off.

But hardware alone is not enough to drive adoption. The new initiative, known internally as “TorchTPU,” aims to remove a key barrier that has slowed the adoption of TPU chips by making them fully compatible and developer-friendly for customers who have already built their technology infrastructure using PyTorch software, the sources said. Google is also considering open-sourcing parts of the software to speed customer uptake, some of the people said.

Compared with previous attempts to support PyTorch on TPUs, Google has given TorchTPU more organizational attention, resources and strategic weight as demand grows from companies that want to adopt the chips but view the software stack as a bottleneck, the sources said.

PyTorch, an open-source project strongly supported by Meta Platforms, is one of the most widely used tools for developers building AI models. In Silicon Valley, very few developers write every line of code that chips from Nvidia, Advanced Micro Devices or Google will actually execute.

Instead, these developers rely on tools like PyTorch, which is a collection of pre-written code libraries and frameworks that automate many common tasks in AI software development. Originally released in 2016, PyTorch’s history has been closely tied to Nvidia’s development of CUDA, software that some Wall Street analysts see as the company’s strongest shield against competitors.

Nvidia engineers have spent years making sure software developed with PyTorch runs as fast and efficiently as possible on its chips. Google, by contrast, has long had its in-house armies of software developers use a different code framework called Jax, and its TPU chips use a tool called XLA to make the code run efficiently. Much of Google’s AI and performance-optimization software stack has been built around Jax, leaving a gap between how Google uses its own chips and how outside customers want to use them.
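To see why backend support matters, consider that PyTorch model code is written once and dispatched to whichever hardware backend is available at runtime. The sketch below is purely illustrative (it is not Google’s TorchTPU code, which has not been published); it assumes the standard `torch` package and the separately distributed `torch_xla` bridge that currently connects PyTorch to TPUs via XLA.

```python
# Illustrative sketch only: how one PyTorch program can target a TPU (via the
# torch_xla/XLA bridge), an Nvidia GPU (via CUDA), or the CPU. This is not
# Google's TorchTPU implementation, which has not been released.
import torch


def pick_device() -> torch.device:
    """Prefer a TPU if the torch_xla bridge is installed, then CUDA, then CPU."""
    try:
        import torch_xla.core.xla_model as xm  # TPU support ships separately
        return xm.xla_device()
    except ImportError:
        return torch.device("cuda" if torch.cuda.is_available() else "cpu")


device = pick_device()
model = torch.nn.Linear(4, 2).to(device)       # identical model code on any backend
out = model(torch.randn(8, 4, device=device))  # the backend compiles/executes the ops
print(out.shape)
```

The point of efforts like TorchTPU is to make the TPU branch of this dispatch as fast and frictionless as the CUDA path that Nvidia has spent years optimizing, so existing PyTorch codebases need little or no change.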

A Google Cloud spokesperson would not comment on the details of the project, but confirmed to Reuters that the move would give customers a choice.
