By Eduard Baptista
BEIJING (Reuters) – Chinese AI developer DeepSeek said it spent $294,000 on training its R1 model, far less than the figures reported by U.S. rivals, a disclosure likely to reignite debate over Beijing's place in the race to develop artificial intelligence.
The rare update from the Hangzhou-based company – its first estimate of R1's training costs – appeared in a peer-reviewed article published on Wednesday in the academic journal Nature.
DeepSeek's January release of what it said were lower-cost AI systems prompted global investors to dump technology stocks, on fears that the new models could threaten the dominance of AI leaders, including Nvidia.
Since then, the company and its founder, Liang Wenfeng, have largely disappeared from public view, apart from releasing a few product updates.
The Nature article, which listed Liang as one of the co-authors, said DeepSeek's R1 model cost $294,000 to train and used 512 Nvidia H800 chips. An earlier version of the article, published in January, did not contain this information.
Training the large language models that power AI chatbots involves running clusters of powerful chips for weeks to process vast amounts of text and code, incurring substantial costs.
Sam Altman, CEO of AI giant OpenAI, said in 2023 that training foundational models had cost "much more" than $100 million, although his company has not given detailed figures for any of its releases.
Some of DeepSeek's statements about its development costs and the technology it used have been questioned by U.S. companies and officials.
Nvidia designed the H800 chip for the Chinese market after the U.S. made it illegal in October 2022 for the company to export its more powerful H100 and A100 AI chips to China.
U.S. officials told Reuters in June that DeepSeek had access to large quantities of H100 chips procured after U.S. export controls took effect. At the time, Nvidia told Reuters that DeepSeek had used legally purchased H800 chips, not H100s.
In a supplementary information document accompanying the Nature article, the company acknowledged for the first time that it owns A100 chips, saying they were used in preparatory stages of development.
"With regard to our research on DeepSeek-R1, we utilized the A100 GPUs to prepare for the experiments with a smaller model," the researchers wrote. After this initial phase, R1 was trained for a total of 80 hours on the cluster of 512 H800 chips, they added.