Virtuals Protocol is integrating @AskVenice to power AI agent building with private, uncensored inference, available to anyone, anywhere on @base.
Venice brings best-in-class privacy-first inference. Virtuals EconomyOS brings the full agent infrastructure stack: wallets, identity, payments, commerce, funding rails, and launch infrastructure.
We are deploying up to $400,000 in private inference credits so anyone can move from idea to working agent without compute or backend complexity getting in the way.
Start your AI journey. Inference and infra are on us.
Program details soon.
Privacy is at the core of Venice:
> Zero prompt logging
> History stored on your device
> Private chats and memory by default
> Anonymous access to frontier models
> TEE & E2EE when you need proof
AI that doesn’t spy on you, with the right privacy mode for every prompt.
This new Nous Research paper may end up being one of the most economically important AI breakthroughs in years.
Not because it makes models smarter.
But because it may dramatically reduce the cost and time required to train them.
Most people completely misunderstand what frontier AI training actually looks like.
Training a modern large language model is not just “running ChatGPT on a computer.”
It involves:
- gigantic data centres filled with GPUs
- enormous electricity usage
- massive cooling infrastructure
- months of nonstop computation
- and training runs that can cost hundreds of millions of dollars
And that’s before you even know if the experiment worked.
Now imagine if someone finds a way to make that process 2-3x more efficient.
→ Not by changing the final AI model.
→ Not by inventing a whole new architecture.
→ But simply by changing HOW the model learns during training.
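To put rough numbers on that, here is a back-of-the-envelope sketch. The $100M baseline is a hypothetical figure chosen for illustration, not a number from the paper, and it assumes cost scales linearly with compute:

```python
def cost_after_speedup(baseline_cost_usd: float, speedup: float) -> float:
    """Cost of the same training run if it needs `speedup` times less compute.

    Assumes cost scales linearly with compute, which is a simplification.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_cost_usd / speedup


BASELINE_USD = 100_000_000  # hypothetical cost of one training run

for s in (2.0, 3.0):
    new_cost = cost_after_speedup(BASELINE_USD, s)
    saved = BASELINE_USD - new_cost
    print(f"{s:.0f}x: ${new_cost:,.0f} per run, ${saved:,.0f} saved")
```

Under those assumptions, a 2x gain halves the bill and a 3x gain cuts it by about two thirds.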
That’s what makes this new Nous Research paper so important.
The technique is called Token Superposition Training (TST).
The simple explanation is this:
Normally, an AI model learns language one token at a time.
Word.
Next word.
Next word.
Next word.
Trillions and trillions of times.
That process is incredibly expensive.
What Nous is proposing is:
during the early stages of training, the model may not actually need to learn every token individually yet.
Instead, it can temporarily learn from compressed groups of tokens together.
So instead of learning from:
“The cat sat on the mat”
as completely separate token predictions...
the model briefly learns from blended groups of token information during early training.
That sounds like it should completely break the model.
But apparently... it doesn’t.
Because later in training, the system switches back to normal token-by-token learning so the model can recover precision and refine itself properly.
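A toy sketch of that two-phase idea, in plain Python. This is not the paper's implementation: averaging the embeddings of each group, the group size of 2, the switch step, and every name here (`embed_superposed`, `inputs_for_step`, `switch_step`) are assumptions made only to illustrate "blended groups early, individual tokens later":

```python
import random


def embed_normal(token_ids, table):
    """Standard phase: one input vector per token."""
    return [table[t] for t in token_ids]


def embed_superposed(token_ids, table, group_size):
    """Early phase (assumed): average each group of token embeddings into one vector."""
    blended = []
    for i in range(0, len(token_ids), group_size):
        group = [table[t] for t in token_ids[i:i + group_size]]
        dim = len(group[0])
        blended.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return blended


def inputs_for_step(step, switch_step, token_ids, table, group_size=2):
    """Use blended inputs before `switch_step`, ordinary per-token inputs after."""
    if step < switch_step:
        return embed_superposed(token_ids, table, group_size)
    return embed_normal(token_ids, table)


random.seed(0)
VOCAB, DIM = 10, 4  # toy sizes
table = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB)]
tokens = [0, 1, 2, 3, 4, 5]  # hypothetical IDs standing in for a 6-token sentence

early = inputs_for_step(step=0, switch_step=100, token_ids=tokens, table=table)
late = inputs_for_step(step=100, switch_step=100, token_ids=tokens, table=table)
print(len(early), len(late))  # 3 blended positions early, 6 token positions late
```

The point of the sketch is only the sequence length: in the early phase the model processes half as many positions for the same text, which is where a compute saving would come from.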
And according to their results:
the final model quality remains competitive while training becomes dramatically faster.
That’s the important part people are missing.
The final inference model stays the same.
Meaning:
- no new chatbot architecture
- no new serving stack
- no retraining the entire ecosystem around a new model type
- no weird compatibility layer
Just:
far more efficient training.
That matters because the biggest bottleneck in AI right now is increasingly economics and infrastructure.
The world is running short of:
- high-end GPUs
- power capacity
- data centre infrastructure
- training bandwidth
AI progress is no longer just about:
“who has the smartest researchers.”
It’s increasingly about:
“who can train and iterate fastest.”
And iteration speed is everything.
If a lab can:
- train models faster
- run more experiments
- test more ideas
- spend less money per run
- and occupy GPU clusters for less time
they accelerate their entire research loop.
That compounds hard.
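A quick way to see the effect on iteration speed. The 360-day cluster reservation and 90-day run length are hypothetical numbers picked for round arithmetic:

```python
def runs_per_window(window_days: float, days_per_run: float, speedup: float = 1.0) -> int:
    """Whole back-to-back training runs that fit in a fixed cluster reservation."""
    if speedup <= 0 or days_per_run <= 0:
        raise ValueError("speedup and days_per_run must be positive")
    return int(window_days // (days_per_run / speedup))


WINDOW_DAYS = 360   # hypothetical cluster reservation
DAYS_PER_RUN = 90   # hypothetical baseline run length

for s in (1.0, 2.0, 3.0):
    print(f"{s:.0f}x speedup: {runs_per_window(WINDOW_DAYS, DAYS_PER_RUN, s)} runs")
```

With these numbers the same reservation fits 4 runs at baseline, 8 at 2x, and 12 at 3x, so each extra run is another chance to test an idea.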
Which is why algorithmic efficiency breakthroughs like this can become insanely important.
Historically, software-level efficiency improvements have often ended up creating more impact than raw hardware improvements.
And this paper is basically trying to do exactly that for LLM training.
Now, important caveat:
This has NOT yet been validated on frontier-scale trillion-parameter models.
The paper tested:
- 270M
- 600M
- 3B dense models
- and a 10B MoE setup
So nobody should pretend this is already proven at GPT-5.x scale.
But if these results continue scaling upward...
this could become one of those papers people look back on later and realise quietly changed the economics of AI training itself.