Mark

@randgroup This is great. The monitoring layer is underpriced in this breakdown. Detection is a test. Monitoring is a continuous data problem. Who collects the cleanest, most consistent signal over time builds the moat no benchmark can touch.

Mark

@markkmii

about 2 hours ago

@lukas_m_ziegler @TheSanctuaryAI The hardware-agnostic approach is the real story here. Don't wait for perfect humanoid hardware. Deploy on Universal Robots now. Port to humanoids when ready. Physical AI is bottlenecked by software, not the arm.

Mark

@markkmii

about 2 hours ago

@henrikhinai 81% of Databricks databases created by agents. The bottleneck isn't building software anymore. It's the data and workflow layer underneath. Agents need something to act on. That layer is the actual moat.

Mark

@markkmii

about 2 hours ago

@FT Companies are pulling back because they added AI on top of existing workflows. The cost doesn't justify the marginal improvement. The ROI unlocks when you redesign the workflow around AI, not layer it on top.

Mark

@markkmii

about 2 hours ago

@nvidia Framing is right. But most teams treat the model as the product and the harness as an afterthought. The deployment gap lives in the harness. Model quality and harness quality are not the same ceiling.

Mark

@markkmii

about 2 hours ago

@Kalshi "Good enough" beats premium every time in volume markets. This is the model layer commoditizing in real time.

100

Mark

@markkmii

about 3 hours ago

Every builder who uploaded source code to Mythos just learned the same lesson: the model provider owns the relationship. ZDR changes without notice; the stack shift happens quietly. Is proprietary dependency the new vendor lock-in? Build accordingly.

Jean-Michel Lemieux

@jmwind

about 18 hours ago

Fable 5 is a Trojan horse. I thought everyone saw it. After a bunch of conversations with CTOs and CEOs this week, I realized many didn't.

Anthropic has been one of the best platform partners in AI. They understood something early that others missed: nobody is going to build serious enterprise software on top of you without trust. Audited ZDR, strong enterprise controls, clear boundaries around customer data. That trust paid off and we built on them because of it.

At the same time, this is a brutally competitive market. Anthropic has great models, great marketing, strong distribution, and enormous mindshare. But there is no world where model performance stays differentiated forever. Token costs will fall. Inference gets optimized. New models arrive. The model layer is a knife fight.

So where do you go next? You move up the stack. They didn't hide this at all and they started working on it. They realized it’s a hard problem. Business workflows and the software up the stack aren’t trivial. Our APIs aren’t text in, text out. We have built businesses around the messy way in which the world works. I think they realized that it's harder than they thought.

Now enter Mythos. They launched it as "too good to share" because it finds vulnerabilities in your code. The pitch is almost impossible for a CTO to ignore: point it at your codebase and it will find vulnerabilities, security issues, and bugs that your team missed. They beat that drum for a few months. They even leaked some of the findings to the large tech companies, which would pull the smaller ones into security war-rooms with them to make plans on how to use Mythos when it launches. The frenzy was building.

The first thing most software companies will do when it launches is exactly that. They won't upload customer data. They'll upload their source code. Their crown jewels. Then Fable 5 is released and ZDR is altered. What do you think every CTO did? Insane.

441

356

150K

Mark

@markkmii

6 days ago

@VraserX Stay alive to reach the bridge, agreed. But the bridge runs on continuous health data, not one checkup a year. Catching decline early needs a loop that watches real life. That's the missing infrastructure, not the biotech.

Mark

@markkmii

6 days ago

@Dr_Singularity A city shipping a frontier-class model is a real signal. Model capability is going everywhere. That means it stops being the moat. The edge is proprietary data and a real-world problem to point it at.

496

Mark

@markkmii

6 days ago

@SciTechera Easy to predict aging away from a stage. Harder to deploy it in a real home. Health doesn't change on a clean curve. It changes fast, quietly, and differently for everyone. The bottleneck isn't the science. It's continuous data nobody is collecting yet.

138

Mark

@markkmii

6 days ago

@engineers_feed Speed without understanding is the deployment trap. The code runs, the demo looks fine, nobody held the edge case. That's not an AI problem. It's a review problem. AI raises the floor and widens the gap. The judgment to catch its mistakes is the scarce part now.

Mark

@markkmii

6 days ago

40+ models in six months. The model layer is commoditizing in real time. None of this is a moat. The value moves to whoever deploys it into a real workflow. Most people stay loyal to three anyway.

0xMarioNawfal

@RoundtableSpace

6 days ago

HERE’S A LIST OF EVERY AI MODEL LAUNCHED THIS YEAR SO FAR:

• Qwen3-Max-Thinking
• Kimi K2.5
• Step-3.5-Flash
• Claude Opus 4.6
• GPT-5.3 Codex
• GLM-5
• Claude Sonnet 4.6
• Param-2
• Sarvam-105B
• Sarvam-30B
• Gemini 3.1 Pro
• GPT-5.4
• Mistral Small 4
• MiMo-V2-Pro
• Gemma 4
• GLM-5.1
• Muse Spark
• Qwen3.6-35B-A3B
• Claude Opus 4.7
• GPT-5.5
• DeepSeek-V4-Flash
• DeepSeek-V4-Pro
• MiMo-V2.5-Pro
• MiMo-V2.5
• Gemini 3.5 Flash
• Claude Opus 4.8
• Step 3.7 Flash
• GPT-5.5 Instant
• Grok 4.3
• Granite 4.1 30B
• Qwen3.7 Max
• MiniCPM5-1B
• JT-35B-Flash
• MiniCPM-V 4.6 1.3B
• Ring-2.6-1T
• MiniMax-M3
• Qwen3.7 Plus
• Gemma 4 12B
• Nemotron 3 Ultra 550B A55B
• DiffusionGemma 26B-A4B
• Kimi K2.7 Code
• Claude Fable 5
• MAI-Code-1-Flash
• MAI-Thinking-1
• U2

We’re only halfway through the year.

167

46K

Mark

@markkmii

6 days ago

Validating what I'm seeing. Many teams are racing to fix cost and deployment, not the model. 98.4% harness, 1.6% model. As models converge, the moat moves to the scaffolding around them. Permission gates, recovery logic, context handling. The model reasons. The system ships.

Daily Dose of Data Science

@DailyDoseOfDS_

7 days ago

Claude Code fully dissected!

Researchers from UCL reverse-engineered the leaked Claude source. What they found changes how you should think about agent design.

Only 1.6% of the codebase is AI decision logic.

The other 98.4% is operational infrastructure. Permission gates, tool routing, context compaction, recovery logic, session persistence. The model reasons. The harness does everything else.

This is the opposite of what most agent frameworks do today.

LangGraph routes model outputs through explicit state machines. Devin bolts heavy planners onto operational scaffolding. Claude Code gives the model maximum decision latitude inside a rich deterministic harness, and invests all its engineering effort in that harness.

The core loop is a simple while-true. Call model, run tools, repeat.

But the systems around that loop are where the real design lives:

A permission system with 7 modes and an ML classifier. Users approve 93% of prompts anyway, so the architecture compensates with automated layers instead of adding more warnings.

A 5-layer context compaction pipeline. Each layer runs only when cheaper ones fail. Budget reduction, snip, microcompact, context collapse, auto-compact.

Four extension mechanisms ordered by context cost. Hooks (zero), skills (low), plugins (medium), MCP (high). Each answers a different integration problem.

Subagents return only summary text to the parent. Their full transcripts live in sidechain files. Agent teams still cost roughly 7x the tokens of a standard session.

Resume does not restore session-scoped permissions. Trust is re-established every session. That friction is the point.

The bet behind all of this is simple. As frontier models converge on raw coding ability, the quality of the harness becomes the differentiator, not the model.

Paper: Dive into Claude Code (arXiv:2604.14228)

We've shared an article on Agent Harness and what every big company is building.

Read it below.
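
To make the "call model, run tools, repeat" loop and the "each layer runs only when cheaper ones fail" idea concrete, here is a rough Python sketch. It is not taken from Claude Code or the paper: every name (`ModelReply`, `compact`, `run_agent`, the two toy compaction layers, the stub model) is hypothetical, the permission check is a plain allow-list and not a 7-mode system with a classifier, and the loop is bounded by `max_turns` where the thread describes a while-true.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types and names for illustration only.
Layer = Callable[[list[str]], list[str]]


@dataclass
class ModelReply:
    text: str
    tool_calls: list[tuple[str, str]] = field(default_factory=list)  # (tool, argument)


def compact(messages: list[str], budget: int, layers: list[Layer]) -> list[str]:
    """Apply compaction layers cheapest-first; stop as soon as we fit the budget."""
    for layer in layers:
        if sum(len(m) for m in messages) <= budget:
            break
        messages = layer(messages)
    return messages


def run_agent(model, tools, allowed, messages, budget, layers, max_turns=10):
    """Core loop: compact context, call the model, gate and run tools, repeat."""
    for _ in range(max_turns):
        messages = compact(messages, budget, layers)
        reply = model(messages)
        messages.append(reply.text)
        if not reply.tool_calls:
            return messages  # the model is done
        for name, arg in reply.tool_calls:
            if name not in allowed:  # permission gate (assumed: simple allow-list)
                messages.append(f"denied: {name}")
                continue
            messages.append(f"{name} -> {tools[name](arg)}")
    return messages


# Two toy layers, ordered from cheaper to more destructive.
def truncate_long(msgs: list[str]) -> list[str]:
    return [m[:40] for m in msgs]


def keep_recent(msgs: list[str]) -> list[str]:
    return msgs[-4:]


def scripted_model(msgs: list[str]) -> ModelReply:
    # Stub standing in for a real model: request tools once, then finish.
    if not any(m.startswith("echo ->") for m in msgs):
        return ModelReply("calling tools", [("echo", "hi"), ("rm", "/")])
    return ModelReply("done")


history = run_agent(
    scripted_model,
    tools={"echo": lambda s: s.upper()},
    allowed={"echo"},
    messages=["task: greet"],
    budget=200,
    layers=[truncate_long, keep_recent],
)
```

In this sketch the model stub only decides what to ask for; whether a tool actually runs and how the context is trimmed is decided entirely by the deterministic code around it, which is the split the thread describes.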

299

220K

Mark

@markkmii

6 days ago

This is the whole game. Cars had 10M on the road feeding the loop. Humanoids had nothing. You can't train a robot that was never deployed. Sim lies about friction and slip. Real physics doesn't. The academy isn't a robot story. It's a data flywheel.

GeniusThinking

@GeniusGTX

7 days ago

Elon Musk reveals SpaceX is building a 30,000-robot academy where humanoids learn from each other.

Cars were easy. Tesla had ten million on the road, beaming back driving data every second. But humanoid robots? There weren't ten million Optimi yet. There weren't ten. Robotics had run data-starved for decades. Tesla decided to fix it.

You couldn't train a humanoid that had never been deployed. So Musk built a school for them instead.

"We can have at least 10,000 Optimus robots, maybe 20-30,000, that are doing self-play and testing different tasks."

Tesla called it the Optimus Academy. Picture a warehouse the size of a chip fab. Thirty thousand humanoid robots inside. Picking things up. Folding clothes. Walking. Tripping. Catching themselves. Failing in ways no human roboticist had thought to script. Each watching the others, learning what the human body shouldn't have made look easy.

Every move generated a data point. Every failure generated a sample. Every robot taught every other robot.

In simulation, Tesla could spin up a million robots overnight. But simulated physics lied about friction, slip, and drift. Real physics didn't.

Cars learned from drivers. Optimi learned from each other. Each generation made the next one cheaper, faster, smarter. By the tenth generation, no human would recognize the curriculum. Recursive learning at electromechanical scale.

Musk, on closing the loop: "You use the tens of thousands of robots in the real world to close the simulation to reality gap."

Whoever opened the academy first owned the species.

P.S. I made a playbook breaking down 100+ most powerful decision making mental models used by history's greatest thinkers. 5,000+ downloads. 113 five-star reviews. Grab a free copy here: https://t.co/u2q1uUm9vD

If you're new here, follow @GeniusGTX for content on the greatest minds in economics, psychology, and history.

Source: Elon Musk (@elonmusk), CEO of Tesla and SpaceX, on Dwarkesh Patel's (@dwarkesh_sp) podcast

269

102

47K
