DeepSeek released an updated version of its DeepSeek-V3 model on March 24. The new version, DeepSeek-V3-0324, has 685 billion parameters, a slight increase from the original V3 model's 671 billion. The company has not yet released a system card for the updated model. DeepSeek has also changed the model's open-source license to an MIT license, aligning it with the DeepSeek-R1 model.
The original DeepSeek-V3 gained worldwide attention for its cost-effectiveness. In multiple benchmark tests, it outperformed other open-source models such as Qwen2.5-72B and Llama-3.1-405B, while delivering performance comparable to top proprietary models like GPT-4o and Claude-3.5-Sonnet. DeepSeek investor High-Flyer Quant has emphasized in a published paper that the model was trained at exceptionally low costs. By optimizing algorithms, frameworks, and hardware, the total training cost of DeepSeek-V3 was just $5.576 million – assuming an H800 GPU rental price of $2 per GPU per hour. [Cailian, in Chinese]