@Jesseeckel @AnthropicAI There will be a decentralized version. Not saying it will be $TAO, as it’s still quite early and, as can be seen, they are moving a lot of pieces, but it’s def proving there is a place for decentralized AI that will compete
Especially on cost for those builders
Bittensor-native bros… remember that Covenant-72B moment back in March that lit a 90% fire under the price?
Yeah, that was the first time the whole timeline went “holy shit, decentralized LLM training is actually possible.”
But @MacrocosmosAI just dropped Orion-100B on SN9 (IOTA subnet) at Proof of Talk Paris… and this one feels like the real unlock.
Quick recap if you’re new to @bittensor: it’s the network where anyone with a GPU can jump in, help train AI models and actually earn TAO. No single company running the show.
Covenant-72B was massive at the time:
- 72.7B parameters on Subnet 3
- 70+ independent nodes
- But every node needed those crazy expensive 8x B200 GPU clusters (~$50 an hour)
- Data parallelism, so each peer had to hold the full model
- Hit 67.1 on MMLU (Llama 2 70B level)
- Even Jensen Huang and Chamath shouted it out
Then the Covenant team walked away in April calling parts of it “decentralization theatre.”
Orion-100B flips the script:
- 100B parameters (bigger than Covenant)
- New pipeline parallelism – model gets split across nodes instead of everyone copying the whole thing
- You only need one single A100-80GB GPU per person
- Runs fully distributed on the open internet
- Cost is nuts: ~$1.25 per hour per participant → roughly 40x cheaper
- Live 128-node demo: full training run done in just 5 minutes for ~5 TAO
- Efficiency: 30.8% MFU average (peak 38%), hitting about 65% of real datacenter speed (peak 82%) with 81.8% overall utilization
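A quick back-of-the-envelope check of the numbers in the two lists above. This is only a sketch: the hourly costs, parameter counts and the 80 GB figure come from the post, while the 2-bytes-per-parameter (bf16) weight format is an assumption, and optimizer state, gradients and activations are ignored. The function names are made up for illustration.

```python
import math

# Assumption (not from the post): weights are stored in bf16, 2 bytes per parameter.
BYTES_PER_PARAM = 2
A100_MEMORY_GB = 80  # from the post: one A100-80GB per participant


def weights_gb(params_billion: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM / 1e9


def min_pipeline_stages(params_billion: float, gpu_gb: float) -> int:
    """Fewest equal-sized pipeline stages whose weight shard fits on one GPU."""
    return math.ceil(weights_gb(params_billion) / gpu_gb)


# Hourly cost per participant, as quoted in the post.
covenant_cost_per_hour = 50.0
orion_cost_per_hour = 1.25
cost_ratio = covenant_cost_per_hour / orion_cost_per_hour

# Data parallelism: every peer holds a full copy of the 72.7B model.
covenant_weights = weights_gb(72.7)

# Pipeline parallelism: the 100B model is split into stages across nodes.
orion_weights = weights_gb(100)
orion_stages = min_pipeline_stages(100, A100_MEMORY_GB)

print(f"cost ratio: {cost_ratio:.0f}x")
print(f"Covenant-72B weights per peer: {covenant_weights:.1f} GB")
print(f"Orion-100B weights in total: {orion_weights:.0f} GB")
print(f"minimum stages on 80 GB GPUs (weights only): {orion_stages}")
```

Under these assumptions the "roughly 40x cheaper" claim is just $50 / $1.25, and the weights of a 100B model alone are larger than a single 80 GB card, which is why the model has to be split across nodes rather than copied to each one. Real stage counts would be higher once training state is included.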
They already trained 1.1 billion tokens in this early stage.
Simple way to look at it - Covenant proved it could happen.
Orion-100B proves it’s now practical, stupidly cheap and actually scalable for normal hardware.
No 90% pump this week (market’s been chopping and this landed only a few days ago) but this feels like the moment the story moves from “cool experiment” to “this is how frontier AI gets built going forward.”
Macrocosmos quietly fixed the memory bottleneck and the insane cost problem.
This is the next chapter.
You watching this one?
#Bittensor
If you're looking for an 'intro to Bittensor / $TAO' -- specifically: how it works, what it is, what subnets are -- this fireside with the founder turned out pretty well:
What if you could take three completely different model families… and distill them into one tiny model? 🤯
📜 Paper: https://t.co/K2iKD4xFvp
MOPD (Multi-Teacher On-Policy Distillation) has become a standard procedure in post-training. We already distill multiple specialized variants of the same model into a single set of weights.
But what if we could go further - and distill models from entirely different families? Turns out, it is possible.
Today we’re releasing a paper on cross-tokenizer distillation - our first steps in this exciting direction. 📄
We distilled Qwen3-4B, Phi-4-Mini, and Llama-3B into Llama-3.2-1B.
MMLU jumped from 32.05 → 46.32 when using multiple teachers. 📈
The team is now working on Nemo-RL integration so the community can try this method in their own settings. Plus, we are scaling experiments up. 🚀
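For intuition on the multi-teacher idea in this thread, here is a minimal sketch of a per-token distillation loss averaged over several teachers. It is not the paper's method: it assumes every teacher shares the student's vocabulary (so it skips the cross-tokenizer part entirely), weights teachers equally, and uses reverse KL on the student's distribution, which is a common choice for on-policy distillation but an assumption here. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_probs, teacher_probs):
    """KL(student || teacher) for one token position."""
    return sum(
        s * math.log(s / t)
        for s, t in zip(student_probs, teacher_probs)
        if s > 0
    )


def multi_teacher_loss(student_logits, teacher_logits_list):
    """Average reverse KL from the student to each teacher.

    Assumes all teachers emit logits over the SAME vocabulary as the
    student, which is exactly what a cross-tokenizer setup cannot assume.
    """
    student = softmax(student_logits)
    kls = [reverse_kl(student, softmax(t)) for t in teacher_logits_list]
    return sum(kls) / len(kls)


# Toy example: a 4-token vocabulary and three teachers.
student_logits = [1.0, 0.5, 0.0, -1.0]
teachers = [
    [2.0, 0.0, 0.0, -2.0],
    [1.0, 1.0, 0.0, -1.0],
    [0.5, 0.5, 0.5, -0.5],
]
loss = multi_teacher_loss(student_logits, teachers)
print(f"multi-teacher loss: {loss:.4f}")
```

The loss is zero only when the student matches every teacher exactly, so minimizing it pulls the student toward a compromise between the teachers.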
A malicious node is believed to have exploited @THORChain’s GG20 TSS signing stack to leak vault key material, reconstructed the private key offline, and drained $10.7M across multiple chains. Safeguards fired automatically; node operators completed the rest.
https://t.co/mF6XQIjXV2
@crypto_bitlord7 For real... what do you believe in, what do you care for? You have an online presence and image
But if you’re the cunt living in Perth
How many days do you think you have left to fuck off