If you’ve never heard of DeepSeek before, you are not alone. The company was founded in 2023 by a hedge fund manager in Hangzhou, China. Before it revealed its new AI system a few weeks ago, and published an accompanying research paper that explained how it was done, only AI experts in the west would have known of it. But after its launch last week, the DeepSeek app quickly became the most popular free app in the US. And when the company revealed what it said was the remarkably low cost of its system, it sparked a rapid rethink of where the future of AI might lie – with chaotic stock market consequences. Here’s what you need to know.

Why is DeepSeek such a big deal?

Until now, the most successful AI companies have needed vast amounts of computing power to train their models: companies like OpenAI, the maker of ChatGPT (co-founded by Sam Altman), and Meta build their systems using as many as 16,000 of Nvidia’s chips – which are prized for their energy efficiency and ability to handle complex tasks, and sell for $30,000 to $40,000 each. But DeepSeek says that it trained its base AI model using about 2,000 less advanced Nvidia chips, for about $6m, in less than two months. Citigroup estimates that Microsoft, Meta, Amazon and Alphabet’s capital spending hit about $209bn last year, with 80% of that going on data centres.

DeepSeek-R1, the company’s “reasoning” model that can tackle difficult mathematical and scientific problems in areas that it doesn’t already know about, is said to perform the same complex tasks as OpenAI’s o1 model – at a price to business users that is 20 to 50 times lower.

We should exercise some caution about what DeepSeek says it can do, and some claim that the story is too good to be true. On his X feed, Elon Musk agreed with Alexandr Wang, the CEO of AI firm Scale, who suggested that DeepSeek actually has about 50,000 of Nvidia’s most advanced chips but cannot say so because of American export controls.
But Wang did not provide evidence for the suggestion. On the other hand, there are good reasons to think that the claims are credible: because DeepSeek’s model is open source – unlike OpenAI’s, despite the name – anyone can check its workings.

Altman, for his part, said on Monday night that DeepSeek was “impressive, particularly around what they’re able to deliver for the price” and that OpenAI would accelerate the release of some upcoming products in response. He added: “We will obviously deliver much better models and also it’s legit invigorating to have a new competitor!”

How did they do it?

One of the key differences between DeepSeek and the better-known AI systems is its use of a technique called “mixture of experts”. Essentially, this means that instead of deploying its full computing force in every instance, it only activates the share that is relevant to the task at hand. Morgan Brown, an AI staffer at Dropbox, likens this to “having a huge team but only calling in the experts you actually need for the task”, whereas traditional models have “one person be a doctor, lawyer, AND engineer”.

A model like OpenAI’s has 1.8 trillion parameters, or variables, which are active all the time; DeepSeek has 671 billion parameters, but only 37 billion are active at once, Brown said. That has led to a view that while OpenAI is more powerful, DeepSeek is good enough for the average business user mindful of their bottom line.

Ironically enough, if it is true that DeepSeek engineers achieved what they have without Nvidia’s cutting-edge chips, their success appears to have been born of necessity: the US has put such restrictive rules in place around the export of the most sophisticated Nvidia chips that the company was forced to innovate. Those rules were specifically created to prevent China catching up with the US AI industry.

What does it mean for the stock market?
After US investors absorbed the potential impact of DeepSeek yesterday, the verdict was a disastrous one for the major American players. The leading US tech index, Nasdaq, saw $1tn wiped from its closing value of $32.5tn last week. Nvidia’s shares fell by 17%, and Google and Microsoft also saw significant falls.

The scale of the sell-off renewed questions about whether the US stock market is overly weighted towards big tech firms, which would mean that the state of the American economy – and therefore many others – would remain worryingly vulnerable to shocks like this one. The so-called “magnificent seven” – Apple, Microsoft, Amazon, Alphabet, Meta, Nvidia, and Tesla – account for a third of the value of the S&P 500.

At least some investors argued that there could be good news hidden in yesterday’s shock. Robert Tipp, chief investment strategist at PGIM, told the FT that the moves were “a very healthy correction”, adding: “In the long-run, to the extent that the market really weathers this well – [this] could … indicate that the market is not in fact teetering on a very narrow base of support.”

Nvidia also sought to find the positives – by saying that DeepSeek’s success showed the usefulness of the chips it is allowed to export for the Chinese market. But most analysts would see that as a very optimistic read, since any growth might come alongside a collapse in sales of its most lucrative products.

What does it mean for the development of AI?

If, when the dust settles, it still looks like DeepSeek has created a new and much more efficient model for AI development without needing lots of Nvidia’s most powerful chips – and if other companies can recreate the same approach – the consequences will be profound. It could mean that many smaller players can get into the market, and that the existing giants will have to rethink their strategy. It could also seriously challenge US control of the industry.
For those whose primary concern is not the health of the American AI giants, another possible consequence may seem more significant: the vastly smaller amount of computing power that DeepSeek says it needs. The direct result of the “brute force” approach used by the big players has been the creation of vast data centres that consume huge amounts of power – probably far more than the companies admit, a Guardian analysis found. With data centres already accounting for 1-1.5% of all global electricity consumption before ChatGPT’s launch, the arrival of a much less resource-intensive approach would be good news for the climate.

Are there any other concerns about the rise of a Chinese AI player?

One reason that the Biden administration pressed on with its ban on Nvidia chip exports was the fear that if China was allowed full access, an incredibly powerful technology could evolve outside US control. With some experts recently updating their expectations on when artificial general intelligence, the holy grail of AI, will be reached, that could have far-reaching economic and security implications. One possible selling point that would remain for the big US players is the promise of security for sensitive industries and government agencies.

And DeepSeek’s success suggests how different AI might look if China takes the lead – and how that might have implications around the world. The company stores user information on Chinese servers, and censors any questions to do with subjects that are taboo in China.

Last night, like many others online and Guardian Australia’s Donna Lu, I asked DeepSeek to tell me about China’s treatment of the Uyghurs. When it refused to cooperate, I tried to get round the system by asking it to pretend it was a western journalist. It provided a detailed answer about “allegations of human rights abuses” and “re-education camps”. Then the answer disappeared. “Sorry, that’s beyond my current scope,” DeepSeek told me. “Let’s talk about something else.”