Chinese artificial intelligence firm DeepSeek rocked markets this week with claims that its new AI model outperforms OpenAI's and cost a fraction of the price to build.
The assertions – specifically that DeepSeek's large language model cost just $5.6 million to train – have sparked concerns over the eyewatering sums that tech giants are currently spending on the infrastructure required to train and run advanced AI workloads.
Investor fears over DeepSeek's disruptive impact erased close to $600 billion from Nvidia's market capitalization Monday – the biggest single-day drop for any company in U.S. history.
But not everyone is convinced by DeepSeek's claims.
CNBC asked industry experts for their views on DeepSeek, and how it actually compares to OpenAI, creator of viral chatbot ChatGPT, which sparked the AI revolution.
What is DeepSeek?
Last week, DeepSeek released R1, its new reasoning model that rivals OpenAI's o1. A reasoning model is a large language model that breaks prompts down into smaller pieces and considers multiple approaches before generating a response. It is designed to process complex problems in a similar way to humans.
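As a loose illustration of that idea – and not a description of how R1 or o1 actually work – the toy Python sketch below breaks a problem into intermediate steps, generates several candidate approaches and keeps the best-scoring one. Every function, step and score in it is a made-up placeholder.

```python
# Hypothetical sketch of the "reasoning model" idea: explore several chains of
# intermediate steps, then answer with the best-scoring one. Not real model code.
from dataclasses import dataclass


@dataclass
class Candidate:
    steps: list[str]   # intermediate "thinking" steps
    answer: str
    score: float       # self-assessed confidence in this approach


def generate_candidates(prompt: str) -> list[Candidate]:
    # Stand-in for sampling multiple reasoning paths from a model.
    return [
        Candidate(["restate the problem", "try direct computation"], "42", 0.6),
        Candidate(["restate the problem", "split into sub-problems", "combine results"], "42", 0.9),
    ]


def reason(prompt: str) -> str:
    # Consider multiple approaches, then return the highest-scoring answer.
    candidates = generate_candidates(prompt)
    best = max(candidates, key=lambda c: c.score)
    return best.answer


print(reason("What is 6 x 7?"))  # prints "42"
```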
DeepSeek was founded in 2023 by Liang Wenfeng, co-founder of AI-focused quantitative hedge fund High-Flyer, to focus on large language models and reaching artificial general intelligence, or AGI.
AGI as a concept loosely refers to the idea of an AI that equals or surpasses human intellect on a wide range of tasks.
Much of the technology behind R1 isn't new. What is notable, however, is that DeepSeek is the first to deploy it in a high-end AI model with – according to the company – considerable reductions in power requirements.
"The takeaway is that there are many possibilities to develop this industry," one industry expert told CNBC.
"But DeepSeek proves we are still in the nascent stage of AI development, and the path established by OpenAI may not be the only route to highly capable AI."
How is it different from OpenAI?
DeepSeek has two main systems that have garnered buzz from the AI community: V3, the large language model that underpins its products, and R1, its reasoning model.
Both models are open-source, meaning their underlying code is free and publicly available for other developers to customize and redistribute.
DeepSeek's models are much smaller than many other large language models. V3 has a total of 671 billion parameters, or variables that the model learns during training. And while OpenAI doesn't disclose parameter counts, experts estimate its latest model to have at least a trillion.
In terms of performance, DeepSeek says its R1 model achieves performance comparable to OpenAI's o1 on reasoning tasks, citing benchmarks including AIME 2024, Codeforces, GPQA Diamond, MATH-500, MMLU and SWE-bench Verified.
In a technical report, the company said its V3 model had a training cost of only $5.6 million. It isn't clear how much DeepSeek costs to run, however.
If the training costs are accurate, though, it means the model was developed at a fraction of the cost of rival models from OpenAI, Anthropic, Google and others.
Daniel Newman, CEO of tech insight firm The Futurum Group, said these developments suggest "a massive breakthrough," although he shed some doubt on the exact figures.
"I believe the breakthroughs of DeepSeek indicate a meaningful inflection for scaling laws and are a real necessity," he said. "Having said that, there are still a lot of questions and uncertainties around the full picture of costs as it pertains to the development of DeepSeek."
Meanwhile, Paul Triolo, senior VP for China and technology policy lead at advisory firm DGA Group, noted it was difficult to draw a direct comparison between DeepSeek's model cost and that of major U.S. developers.
"The $5.6 million figure for DeepSeek V3 was just for one training run, and the company stressed that this did not represent the overall cost of R&D to develop the model," he said. "The overall cost then was likely significantly higher."
DeepSeek wasn't immediately available for comment when contacted by CNBC.
Comparing DeepSeek, OpenAI on price
DeepSeek and OpenAI both disclose pricing for their models' computations on their websites.
DeepSeek says R1 costs 55 cents per 1 million tokens of input – "tokens" referring to each individual unit of text processed by the model – and $2.19 per 1 million tokens of output.
In comparison, OpenAI's pricing page for o1 shows the firm charges $15 per 1 million input tokens and $60 per 1 million output tokens. For GPT-4o mini, OpenAI's smaller, low-cost language model, the firm charges 15 cents per 1 million input tokens.
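As a rough illustration of how those published per-million-token rates translate into a bill, the arithmetic below uses a hypothetical workload; real usage, and any discounts the providers apply (such as cached-input pricing), will differ.

```python
# Back-of-the-envelope cost comparison using the per-million-token rates cited
# above. The token counts are illustrative only.

def cost_usd(input_tokens: int, output_tokens: int,
             input_rate: float, output_rate: float) -> float:
    """Rates are dollars per 1 million tokens."""
    return (input_tokens / 1_000_000) * input_rate + (output_tokens / 1_000_000) * output_rate


# Example workload: 10 million input tokens, 2 million output tokens.
deepseek_r1 = cost_usd(10_000_000, 2_000_000, input_rate=0.55, output_rate=2.19)
openai_o1 = cost_usd(10_000_000, 2_000_000, input_rate=15.00, output_rate=60.00)

print(f"DeepSeek R1: ${deepseek_r1:.2f}")  # DeepSeek R1: $9.88
print(f"OpenAI o1:   ${openai_o1:.2f}")    # OpenAI o1:   $270.00
```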
Skepticism over chips
DeepSeek's reveal of R1 has already led to heated public debate over the veracity of its claim – not least because its models were built despite U.S. export controls restricting the shipment of advanced AI chips to China.
DeepSeek claims it had its breakthrough using mature Nvidia chips, including the H800 and A100, which are less advanced than the chipmaker's cutting-edge H100s that can't be exported to China.
However, in comments to CNBC last week, Scale AI CEO Alexandr Wang said he believed DeepSeek used the banned chips – a claim DeepSeek denies.
Nvidia has since come out and said that the GPUs DeepSeek used were fully export-compliant.
The real deal or not?
Industry experts seem to broadly agree that what DeepSeek has achieved is impressive, although some have urged skepticism over some of the Chinese company's claims.
"DeepSeek is legitimately impressive, but the level of hysteria is an indictment of so many," U.S. entrepreneur Palmer Luckey, who founded Oculus and Anduril, said on X.
"The $5M number is bogus. It is pushed by a Chinese hedge fund to slow investment in American AI startups, service their own shorts against American titans like Nvidia, and hide sanction evasion."
Seena Rejal, chief commercial officer of NetMind, a London-headquartered startup that offers access to DeepSeek's AI models via a distributed GPU network, said he saw no reason not to believe DeepSeek.
"Even if it's off by a certain factor, it still is coming in as greatly efficient," Rejal told CNBC in a phone interview earlier this week. "The logic of what they've explained is very sensible."
However, some have claimed DeepSeek's technology might not have been built from scratch.
"DeepSeek makes the same mistakes o1 makes, a strong indication the technology was ripped off," billionaire investor Vinod Khosla said on X, without giving more details.
It's a claim that OpenAI itself has alluded to, telling CNBC in a statement Wednesday that it is reviewing reports DeepSeek may have "inappropriately" used output data from its models to develop its own, a method referred to as "distillation."
"We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the U.S. government to protect the most capable models being built here," an OpenAI spokesperson told CNBC.
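For readers unfamiliar with the term, distillation generally means training a smaller "student" model to mimic the outputs of a larger "teacher" model. The snippet below is a generic, textbook-style sketch of that technique in PyTorch, using toy models and random data; it is not a claim about what DeepSeek or OpenAI actually do.

```python
# Generic sketch of knowledge distillation: a small "student" network is trained
# to match the softened output distribution of a larger "teacher" network.
# Models, data and hyperparameters are toy placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10)).eval()
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's output distribution

for _ in range(100):                  # toy training loop on random data
    x = torch.randn(16, 32)           # fake input batch
    with torch.no_grad():
        teacher_logits = teacher(x)   # "output data" from the larger model
    student_logits = student(x)

    # KL divergence between softened student and teacher distributions.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```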
Commoditization of AI
However the scrutiny surrounding DeepSeek shakes out, AI scientists broadly agree it marks a positive step for the industry.
Yann LeCun, chief AI scientist at Meta, said that DeepSeek's success represented a victory for open-source AI models, not necessarily a win for China over the U.S. Meta is behind a popular open-source AI model, Llama.
"To people who see the performance of DeepSeek and think: 'China is surpassing the U.S. in AI.' You are reading this wrong. The correct reading is: 'Open source models are surpassing proprietary ones,'" he said in a post on LinkedIn.
"DeepSeek has profited from open research and open source (e.g. PyTorch and Llama from Meta). Because their work is published and open source, everyone can profit from it. That is the power of open research and open source."
– CNBC's Katrina Bishop and Hayden Field contributed to this report.