๐ขCALLING ALL ARTISTS๐ข
Come join an international community of artists that focus on art and one another!
We have art collections, education, talk shows, and most importantly
The artists behind the art! ๐จ
Don't just enjoy the art
Enjoy the artist!
Welcome to the Nation ๐จ
I benchmarked a completely unknown free model on OpenRouter against DeepSeek V4 Pro and Flash.
The model: Owl Alpha. Zero published benchmarks. No white paper. No HF page. No lab attribution. Just a "stealth" listing on @OpenRouter
Here's what 300 real questions revealed โ
๐งต 2/10
THE SETUP
โข 300 questions across 4 standardized benchmarks
โข Same datasets DeepSeek was evaluated on (HuggingFace test splits)
โข Temperature = 0.0, zero tools, direct API calls
โข Cost: $0.00 โmodels are free tier
Compared against DeepSeek's own published scores.
๐งต 3/10
THE RESULTS
โข Benchmark: GSM8K (math 50 questions)
โข Owl Alpha: 95.0% โ
โข DS V4 Flash: 90.8%
โข DS V4 Pro: 92.6%
โข Benchmark: MMLU (knowledge 50 questions)
โข Owl Alpha: 91.0%โ
โข DS V4 Flash: 88.7%
โข DS V4 Pro: 90.1%
Owl Alpha led on almost every benchmark that I ran.
๐งต 4/10
MMLU BY SUBJECT (10 subjects, 10 questions each)
โข High School Math:94%
โข College CS: 92%
โข Philosophy: 90%
โข Professional Medicine: 87%
โข World Religions: 85%
โข College Physics: 85%
โข High School Biology: 83%
โข College Chemistry: 80%
โข Machine Learning: 80%
โข Professional Law: 75%
Strong in STEM, weaker in specialized professional domains. Pattern you'd expect from a general-purpose model.
๐งต 5/10
WHAT SURPRISED ME
1. GSM8K at 95% beats DeepSeek V4 Pro (92.6%) on math, the free model wins ๐คฏ
2. ARC-Challenge at 94% suggests strong commonsense science reasoning
3. All of this from a model with zero public documentation
๐งต 6/10
The HONEST Caveats
โข 100 MMLU questions (10/subject) โ full 14,000 question suite, directional, not precise
โข DeepSeek scores are from their published model card, not the same-day, same-pipeline
โข No error bars, single run
โข The model's provider is anonymous. Long-term availability is unknown
I am not claiming Owl Alpha "beats" DeepSeek. I am reporting what the data from the 300 questions has shown.
๐งต 7/10
WHY THIS MATTERS
There is a growing class of free models on OpenRouter with no public benchmarks and no lab attribution. Some are junk. Some, apparently, are not.
If Owl Alpha's numbers hold up at scale, and I plan to test that, it means there are legitimate free alternatives to established frontier models for a wide range of practical tasks.
The free tier is getting competitive ๐ฅ
๐งต 8/10
METHODOLOGY (for the reproducibility crowd)
โข Datasets: gsm8k, cais/mmlu, openai_humaneval, ai2_arc via HuggingFace
โข Sampling: random seed=42, first N from shuffled test split
โข GSM8K: exact numeric match after answer extraction
โข MMLU/ARC: letter match (A/B/C/D) via regex
โข Full outputs + grading script available
๐งต 9/10
WHAT'S NEXT
I'm running a larger MMLU sample (500+ questions) and a full HumanEval test suite execution. If the numbers hold, Owl Alpha becomes a serious option for anyone building on a budget.
Also curious: does anyone know who built this model? "Stealth" provider with no attribution is unusual for something performing at this level.
๐งต 10/10
Full benchmark data, raw model outputs, and grading methodology are all available. If you want to reproduce or audit the results, DM me.
What free model should I benchmark next? ๐
Step 3.7 Flash that just became free with @NousResearch portal? Qwen3-Coder? Kimi K2.6? Something else?
Drop the model below, and drop a follow on @Roker_51
Spent the whole day building with Hermes agents on DeepSeek V4 Flash, completely FREE ๐ฅ
The results you get when you actually pay attention are wild. Massive shoutout to @NousResearch & @deepseek_ai for dropping free infra all day ๐
If you know how to play the rate limits right, you can run high-intelligence Hermes agents at an extremely low cost. Huge thanks to @OpenRouter & @GeminiApp for the free aux model power ๐
Want the exact setup?
MAIN MODELโข deepseek/deepseek-v4-flash:free (via Nous)
AUX MODELS (all via OpenRouter)โข compression โ google/gemini-2.0-flash-001 โข session_search โ google/gemini-2.0-flash-001 โข title_generation โ google/gemini-2.0-flash-001 โข web_extract / curator / approval / mcp / triage_specifier / skills_hub / flush_memories โ deepseek/deepseek-v4-flash:free โข vision โ auto (multimodal)
DELEGATION โ inherits main (Nous/DeepSeek V4 Flash free)
Just grab a free portal sub from @NousResearch, sign up on @OpenRouter (free api key with a $1 limit,) and tell your agent to copy the config above.
Who else is stacking free-tier agent swarms right now? Drop your setups below ๐
Gm ๐
My @NousResearch creative hackathon project is getting close to completion ๐ฅ
I have been obsessed with generative artwork for a long time, so using @karpathy Auto Research and a bit of my own tweaking, I now have MiniMax2.7 inside of Hermes Agent that orchestrates 3 agents running Qwen 3.5 7B on 5min cycles.
What the agents research:
1. Their famous artist subject
2. Generative artwork structure/architecture
Each agent will go find data of these 2 subjects and can use this data + historic findings to improve their own creative architecture.
The Foundry V2 ๐ค
More to share soon! ๐
โโโDid you ever take the train? What was it like?
Come and Share your experience in our open call๐
We'll display your artwork in a crowded area of Luxembourg.
Link in comments๐
#PPN#nftcollectors
Gm GM ๐
Q from @paperbuddha: would you buy art if you knew there was no human involved?
A from @desultor talks about how AI can manage the full process from start to Finnish.
Starting the Sunday off right, my Hermes agent from @NousResearch has been working all night!
17 new draft proposals for leads discovered and contacts enriched and pushed to Gmail
All while slept last night.
How I train agentic systems:
1. Define the exact job
2. Set hard constraints
3. Build routing + review loops
4. Add escalation rules
5. Stress test for drift, cost, and failure
6. Keep humans on final approval where it matters
Iโm not interested in agents that look smart for 5 minutes.
I want systems that operate reliably, stay cost-efficient, and keep producing quality over time.
New @MyPalettexyz updates just landed for Open Call hosts!
Next thing we are working on is user notifications + a cleaner dashboard thanks to suggestions from @bustosjp and others!
GM creators ๐
New updates just shipped on MyPalette:
โข Guest Curators now live
โข 5-star ratings + private/public curator notes
โข Clean grid layout for submissions
Hosting & curating just got way smoother.
Built by artists, for artists. ๐จ
โซ The Black Sheep ๐ collection includes female artists from all over the world! A collab between @invisiblechickscollective and @pixelpalettenation ๐จ
๐ค โEmbracing My Black Sheepโ an amazing artwork by @CinziaGabrielPH
1 of 1 available for 50 XTZ
๐ท Hit the link in the bio to learn more about artist opportunities ๐ท
Most people picture Bitcoin miners and AI data centers as totally separate worlds.
But after digging through some 2025โ2026 earnings, filings, and reports, I realized theyโre actually fighting over the exact same thing:
โ ๏ธ cheap power + ready infrastructure.
And right now, AI is winning the economics game!
Bitcoin mining side: Post-halving life has been brutal. Cash costs to mine one BTC are running $70kโ$87k for a lot of large operators.
Hashprice hit multi-year lows. Revenue per MW sits around $1.2M/year but swings wildly with network competition.
Hashrate keeps smashing records, yet plenty of companies are scraping by or in the red unless they have dirt-cheap electricity. Pure mining is a grind right now. ๐
AI/HPC side: Completely different beast. Same power-hungry setups, but contracts deliver ~3ร the revenue per MW (some analyses say way more per kWh). Long-term 10โ15 year fixed deals with 80โ90% margins. No more riding BTC price and hashrate rollercoasters every day.
The shift hit hard and fast. Public miners locked in over $65 billion in AI/HPC deals in 2025 alone.
The shift exploded in 2025, public miners inked over $65 billion in AI/HPC deals.
- @IREN_Ltd with the $9.7B Microsoft one
- @Core_Scientificโs revenue exploding from @CoreWeave colocation
- @Hut8Corp landing big Google/AWS plays
-@TeraWulfInc
-@CipherInc and more.
For a lot of them, mining revenue is headed from ~85% down to under 20% of total by end of 2026. Wild stuff.
Smart part: Miners already own the gigawatts of power sites, substations, and land that hyperscalers need yesterday.
Retrofitting for GPUs is faster and cheaper than building new. Theyโre basically doing energy arbitrage, directing electrons to wherever they make the most money.
Bitcoin network is still rock-solid (hashrate resilient). But the infrastructure built for one revolution is now supercharging the next.
This raises the question, if miners are actively pivoting to AI, what does this mean for the BTC ecosystem on a long-term scale?
Did you know you can donate to @PixelPalettePPN and potentially receive tax-deductible benefits for supporting our mission in the USA! ๐จ
PPN is a fiscally sponsored non-profit organization. You can get involved and make a real impact in the digital space!
Get involved today, and let's bring more opportunities to the space together!โค๏ธ๐๐