AGI ### Think a lot about the future. Interested in: GPT-4, Aerospace, Health Tech 🧬, Tech Startups | Chinese tech 🇨🇳🇺🇸🇪🇺
🔹DeepSeek V4 has been released: trillion-scale parameters, million-token context, and an MIT license.
🔹Read more ... https://t.co/YaHgVDIikZ
Yesterday we decided to build a new 2B core from scratch. New architecture. It will be ~10× cheaper/faster, a quick win.
AGI & KIT
#SPRIND @NextFrontierAI @Bundestag
No one has beaten DeepSeek/Qwen.
Kimi K2 copied the V3 architecture ~1:1, but DeepSeek were the first to combine
aux-loss-free MoE + MLA + FP8 at 671B simultaneously.
No one has beaten this combo.
PEFT + distillation + routing.
We will.
Yesterday we decided to build a new 2B core from scratch. New architecture. It will be ~10× cheaper/faster, a quick win.
AGI
#SPRIND #NextFrontierAI #UZH #ETH #KIT
No one has beaten DeepSeek/Qwen.
Kimi K2 copied the V3 architecture ~1:1, but DeepSeek were the first to combine
aux-loss-free MoE + MLA + FP8 at 671B simultaneously.
No one has beaten this combo.
PEFT + distillation + routing.
We will.