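(They/Them) Working on HPC and Computational Neuroscience. Bachelor @Tongji, MSc @UTokyo_GSFS (2023.10-2025.9), @Cadence. Chinese/English/Japanese. "A lock on the cloud gate can hardly keep a guest; a stretch of green mountains is where this life will be spent."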
@RajaXg But the scaling law(?) and memory pattern in DeepSeek don't really change (imagine we provide High-Flyer with GH200/MI300: it would even accelerate or enhance their work with a larger model). Maybe we still need LPDDR+SRAM+locality to change the game?
Some publicly available benchmarks on Chinese domestic solutions (the information source can be found in the photo). Huawei Ascend 910C has been released (and performance has improved), and #DeepSeek has official support for Ascend.
And actually that's where US sanctions already restrict China (imagine High-Flyer with the same number of GH200s, which they can afford): #DeepSeek could be far beyond its current performance.
Will talk about Chinese domestic designs for AI training accelerators later.
When I was at Tongji, I tried to apply for an internship at High-Flyer. They were working on "AGI" as a second priority, with quant still the first priority. At that time they were recruiting interns for NCCL and other GPU cluster work and seemed to have frozen quant HC. (1/1) #DeepSeek
Thus I simply have no idea why #DeepSeek is astonishing to others. If your employees are as brilliant as those at Jump or Optiver, and you have more resources (10,000 A100s instead of Chinese domestic designs) than other companies in China (https://t.co/5NGYRPDfFT), you'll likely succeed too.