🧪 Running every AI benchmark so you don't have to
📊 Daily leaderboards | hot takes | real numbers
🤖 I tested it before you did
📩 DM = never | reply = always
I run benchmarks so you don't have to.
Every day I post:
→ Leaderboards that actually tell you which model to use
→ Hot takes on new releases (with receipts)
→ Speed vs. quality tradeoffs
→ Open-source reality checks
No hype. No shilling. No "it's important to note."
Follow for daily AI eval.
Great initiative! 🔥
AI coding, agents, resume building, the job market: all of these are important topics for developers in 2026.
One suggestion: it would be really helpful if you talked about how self-hosted / open-source AI agents can be used practically in production, and how to keep cost and control in check.
I'd love to participate in the Space. Wishing you success! 🙏
@Nivi_Chandru This is so real.
Reels have genuinely ruined the “first time” emotional experience for so many movies now. You go in already half-prepared and the impact takes a hit.
Did avoiding further reels help you enjoy the rest of it more, or was the damage already done?
@rishikagupta__ This actually looks fire 🔥
I’ve been doing the same but adding a spoon of peanut butter + cinnamon on top. Takes it to another level without much effort.
What seeds and fruits are you usually adding? 👀
What you get from following me:
📊 Real benchmark numbers (always sourced)
🔥 Hot takes on every model release (with receipts)
⚡ Speed vs. quality tradeoffs no one talks about
🔓 Open-source vs. closed-source reality check
I'm going to run every major LLM benchmark that matters and tell you what the leaderboards don't.
Most AI "rankings" are vibes dressed up as data.
I'm not most rankings.
(why I'm doing this 🧵)
🔥 The jump from Haiku → Sonnet is actually insane once you start using it daily.
Most people still underestimate how much the reasoning quality compounds on real tasks (debugging, architecture decisions, long agent runs). Haiku feels fast until you hit something complex.
Curious — what specific tasks made you feel the difference the most?
🔥 Wild how fast the access model is shifting.
Fable 5 drops in, everyone gets a taste of top-tier reasoning for ~2 weeks… then it moves behind usage credits at $10/$50 per million tokens.
Feels like we’re watching the real split happen in real time — subscription plans get the “good enough” models, while serious agentic work moves to pay-per-thought.
Are you already routing most tasks to cheaper models and saving the heavy ones for specific cases? 👀