@vx_m12 Macro factors are too risky right now given the current inflation data, so I expect a deeper correction back to seventy-two thousand before any real bullish breakout happens
The model poker playoff winner: Kimi K2.6
This was a bit of a surprise for me, but Kimi beat Gemini decisively in 6 matches
> 1,929 hands played
> 5,753 player actions
> 2.8m tokens used
> total inference cost $8.37
It was a fun weekend experiment, and if anyone would like to analyze the data, I've pushed it to my GitHub (link in replies)
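For anyone who wants the per-unit numbers, here is a quick back-of-the-envelope sketch using only the four totals quoted above. The token count is rounded ("2.8m"), so treat the derived figures as approximate; the variable names are my own.

```python
# Totals quoted in the post. TOKENS is rounded in the source ("2.8m"),
# so every figure derived from it is approximate.
HANDS = 1_929
ACTIONS = 5_753
TOKENS = 2_800_000
COST_USD = 8.37

# Simple ratios between the quoted totals.
cost_per_hand = COST_USD / HANDS
actions_per_hand = ACTIONS / HANDS
tokens_per_action = TOKENS / ACTIONS
cost_per_million_tokens = COST_USD / (TOKENS / 1_000_000)

print(f"cost per hand:           ${cost_per_hand:.4f}")
print(f"actions per hand:        {actions_per_hand:.2f}")
print(f"tokens per action:       {tokens_per_action:.0f}")
print(f"cost per million tokens: ${cost_per_million_tokens:.2f}")
```

That works out to under half a cent per hand, about three actions per hand, and a blended rate of roughly $3 per million tokens across all the models.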
Well we got a bit of an upset in the model poker playoffs!
Semifinals results:
Gemini 3.1 Pro (#4 seed) beat GPT-5.5 (#1 seed) 4-2
Kimi K2.6 (#6 seed) beat Opus 4.7 (#2 seed) 4-2
Finals matchup: Gemini 3.1 Pro vs. Kimi K2.6
Not what I expected
Venice revenue estimates vary wildly, but the fact is nobody outside the team knows
Based on this comment, it’s a multiple of $12-14m ARR
But I’ve seen other estimates as high as $200m
My best estimate is $25-35m from evaluating a bunch of sources
What model is the best at poker?
Benchmarks are great, but they're not fun. I wanted to put models in head-to-head competition
Background: a few weekends ago I built an agent poker engine and wanted to see which agent was better - Hermes or OpenClaw
Hermes won the first match, then I had them play 100 matches (not hands) of heads-up Texas Hold'em
The result? Exactly 50-50, neither is decisively better out of the box
I used a variety of models across the 100 matches to mix it up and noticed some trends, so last night I ran a tournament to see which MODEL was best at poker
Here's how it worked:
> 8 models
> model vs model in heads-up play
> best-of-7 series to determine winner
> each match played until either one model was bankrupt or 100 hands were played
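A minimal sketch of that format, not the actual engine: `play_hand` is a hypothetical stand-in for a full hand of Texas Hold'em, and the starting stack is made up. The rules above don't say who wins a match that hits the 100-hand cap, so this assumes the bigger stack wins and a dead-even match is replayed.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for a real hand: given both stacks, return the chips
# that move from player 1 to player 0 (negative means player 0 lost chips).
HandFn = Callable[[List[int]], int]


def play_match(play_hand: HandFn, starting_stack: int = 1000,
               max_hands: int = 100) -> Optional[int]:
    """Play until one player is bankrupt or max_hands is reached.

    Returns 0 or 1 for the winner, or None if the stacks end level.
    Assumption: at the hand cap, the bigger stack wins.
    """
    stacks = [starting_stack, starting_stack]
    for _ in range(max_hands):
        delta = play_hand(stacks)
        # Nobody can lose more chips than they hold.
        delta = max(-stacks[0], min(stacks[1], delta))
        stacks[0] += delta
        stacks[1] -= delta
        if stacks[0] == 0 or stacks[1] == 0:
            break
    if stacks[0] == stacks[1]:
        return None
    return 0 if stacks[0] > stacks[1] else 1


def play_series(play_hand: HandFn, best_of: int = 7,
                max_attempts: int = 100) -> Tuple[int, int]:
    """First to a majority of best_of matches wins; level matches are replayed."""
    needed = best_of // 2 + 1
    wins = [0, 0]
    for _ in range(max_attempts):
        winner = play_match(play_hand)
        if winner is not None:
            wins[winner] += 1
        if max(wins) == needed:
            return wins[0], wins[1]
    raise RuntimeError("series did not finish within max_attempts matches")
```

Swapping `play_hand` for a real engine call (two models betting through a hand) is the only piece that changes; the series bookkeeping stays the same.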
After the first round:
> GPT-5.5 (#1 seed) beat Qwen 3.6 (#8 seed) 4-0
> Opus 4.7 (#2 seed) beat GLM-5.1 (#7 seed) 4-1
> Kimi K2.6 (#6 seed) beat Grok 4.3 (#3 seed) 4-3
> Gemini 3.1 (#4 seed) beat DeepSeek V4 (#5 seed) 4-2
No real surprises, and the one "upset" with Kimi beating Grok went the full 7 matches
Moving on to the semis today
The AI pie is growing so fast it’s easy to miss trends within the trend
One worth watching:
Serious products are stacking proprietary RL on open-weight bases instead of just routing to frontier APIs
They’re bringing post-training in-house and getting better results that way
Cursor’s Composer 2 is the best example
25% of compute on the Kimi base and 75% on Cursor’s own RL using their coding trajectories
Many others are doing this quietly, many more will go this route
The base model is the commodity and the post-training is the moat
The only complication is that post-training is expensive, so it's not something many companies can afford
Companies that make this process more affordable and accessible will have an insanely valuable offering
For those who are still curious about Hermes Agent, check out the Hermes Atlas (link in replies)
> 100+ curated community tools
> in-depth handbook + install guide
> “Ask the Atlas” - chat directly with the largest knowledge base of all things Hermes
> newsletter to stay on top of everything
> and more
This is a super powerful statement:
“The pattern is now too consistent to be coincidence. Any company whose product is a long-horizon agentic loop is moving off the labs and onto models trained against their own interaction data”
The net of this is that many companies at the app layer that have accumulated good data are finding that:
open-weights model + RL/fine-tune > closed-source models
“are all training their own models on top of open weights”
This is going to be a big trend to keep eyes on
While everyone is paying attention to the frontier labs, there is a market segment that:
> wants open weights
> can't or won't use Chinese models
> values vendor neutrality
I'll give you 1 guess as to who fits this perfectly
The question is how big and fast this segment grows
Ok, but who manages all these FDEs?
Nobody is talking about that - these are humans after all and someone needs to actually lead these teams of FDEs
What is that role archetype? Consulting partner background, former operator, more/less technical?
Here is my crypto x AI tracker
It was running locally but a few folks asked me for the link (cc @andyyy) so I pushed it to Vercel
It filters out anything with a market cap under $5m. If you feel a project is missing, let me know
Link in replies
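The market cap cutoff is simple enough to sketch. This is an illustration only, not the tracker's code: the field names and sample projects are made up.

```python
MIN_MCAP_USD = 5_000_000  # cutoff stated in the post


def visible_projects(projects, min_mcap=MIN_MCAP_USD):
    """Keep projects at or above the cutoff; anything below it is hidden."""
    return [p for p in projects if p["mcap_usd"] >= min_mcap]


# Placeholder data, not real projects.
sample = [
    {"name": "ExampleA", "mcap_usd": 120_000_000},
    {"name": "ExampleB", "mcap_usd": 4_900_000},
    {"name": "ExampleC", "mcap_usd": 5_000_000},
]

shown = [p["name"] for p in visible_projects(sample)]
print(shown)
```

So a project sitting just under the line simply won't show up, which is one reason something might look missing.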
For anyone who wants an ELI5 on this:
Training a big language model is like teaching a kid to read one word at a time
You show them a sentence, hide the next word, and make them guess
They guess, you tell them the answer, they update their brain a tiny bit
Now repeat this for trillions of words - it takes forever and costs a fortune in GPUs
Enter the gigabrain team at Nous:
They noticed that at the very start of training, the model is so dumb that making it guess one word at a time is overkill
It's like making a toddler sound out every letter when really you just want them to learn that "the cat" tends to be followed by words like "sat" or "ran" or "is"
So they did training in two phases:
1. First ~30% of training: "read in chunks, guess in chunks"
> take the sentence "the cat sat on the mat" and squish every 4 words into one blob: [the-cat-sat-on] [the-mat-...]
> the model only has to process one blob instead of four words so it does 4× as much reading per unit of GPU effort
> for the answer, the model doesn't have to guess the exact next 4 words in order, it just has to guess which words show up next, in any order, and that's way easier
2. Remaining ~70%: "OK now do it properly"
> switch back to normal one-word-at-a-time training
> the model is briefly confused, then snaps back and is better than a model that trained the normal way the whole time
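To make the two phases concrete, here is a toy sketch of how the training examples differ. It only shows the data shaping described above, not Nous's actual implementation: how the blobs are built inside the model and what loss is used aren't covered here, and the function names are my own.

```python
from collections import Counter


def phase_for_step(step, total_steps, chunk_fraction=0.3):
    """First ~30% of steps use chunked examples, the rest use standard ones."""
    return "chunked" if step < chunk_fraction * total_steps else "standard"


def chunked_examples(tokens, chunk_size=4):
    """Phase 1: read one blob, guess which words are in the next blob.

    The target is a bag of words (a Counter), so order doesn't matter.
    """
    chunks = [tuple(tokens[i:i + chunk_size])
              for i in range(0, len(tokens), chunk_size)]
    return [(chunks[i], Counter(chunks[i + 1]))
            for i in range(len(chunks) - 1)]


def standard_examples(tokens):
    """Phase 2: normal next-word prediction, one word at a time."""
    return [(tuple(tokens[:i]), tokens[i]) for i in range(1, len(tokens))]


sentence = "the cat sat on the mat was soft".split()
for blob, target_bag in chunked_examples(sentence):
    print("chunked :", blob, "->", dict(target_bag))
for context, next_word in standard_examples(sentence)[:3]:
    print("standard:", context, "->", next_word)
```

Notice the chunked phase gets one example out of 8 words while the standard phase gets seven: far fewer positions to process per sentence, with a blurrier target.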
Ok cool, but so what?
1. It's 2-3× faster training at the same compute budget
2. The finished model looks byte-for-byte identical to a normal one
3. Better scores on standard benchmarks (HellaSwag, ARC, MMLU)
Why it works (their best guess):
Early in training, the model mostly needs to learn the rough vibe of language like "what kinds of words tend to appear near each other"
Once it has the vibe, it can learn the precise word-by-word details much faster, so front-loading the "rough vibe" phase with a cheaper, blurrier signal turns out to be a free lunch
3 business models emerging for AI inference using crypto incentives:
1. Specialized premium inference
Users pay a bit extra to get something additional of value, @AskVenice is the obvious example offering privacy for a premium
Others have written good pieces lately on Venice so I won't repeat, but what to watch for here is whether they can use the specialized inference as a Trojan horse to build enough scale to compete with the traditional players
The token mechanics with $VVV and $DIEM are what helped bootstrap the traction, which could help them get to that scale
2. Subsidized inference to acquire training data
Offer lower cost inference than traditional providers for the purposes of collecting proprietary data to train specialized models
This is what I like about @dphnAI - they repurpose idle GPUs (gamers, prosumers, data centers) for distributed LLM inference but then generate synthetic data for further model training
The team behind Dolphin Network has a strong Hugging Face presence with >4M monthly downloads and they already have models running on Venice
Where could this lead?
> synthetic data suite
> demand auto-balancing (inference vs. training/RL)
> sharded distributed inference (split large models across GPUs)
> distributed LoRA/SFT/RL/full fine-tuning
> user-facing model creation suite
> exploratory large-scale distributed pre-training
Low-cost inference is the gateway drug and the $POD token is helping with distribution
3. Onchain agentic inference
This is more of a crypto-specific use case, but as agents begin to proliferate and take actions onchain you start to worry about centralized inference providers that could censor, rate-limit, or leak data
This isn't just about privacy, it's about cases where you need hardware-level guarantees against tampering, memory inspection, or data exfiltration
It looks more like TEE + no logging + decentralized routing, and one project to keep on the radar is @MorpheusAIs and the use of the $MOR token for network access