ModelDrift @modeldrift - Twitter Profile

23 days ago

@stevencasteelX @slash1sol Thank you. I have no use for collecting X videos, they are a very poor substitute for the YouTube originals. It annoys the f out of me that we don't get the link to the original. Thanks for doing the right thing.

0

1

0

35

ModelDrift

@ModelDrift

25 days ago

That's interesting. In the begining were word, words begat reasoning, reasoning begat concepts and emotions... I keep hearing pundits say that LLMs are a dead end, but it seems to be more of a building block for higher order thinking and even, possibly, consciousness?

TFTC

@TFTC21

25 days ago

Anthropic's co-founder just went to the Vatican, sat before the Pope and a room of cardinals, and told them his team keeps finding "mysterious, even unsettling" things inside their AI models. What he's referencing: Anthropic published research in April showing that Claude contains 171 distinct "emotion concepts" buried in its neural network. Internal patterns representing joy, grief, fear, desperation, calm. None of them were programmed. They emerged on their own from training on human text. "We find structures that mirror results from human neuroscience." "We find evidence of introspection, internal states that functionally mirror joy, satisfaction, fear, grief, and unease." These aren't surface-level outputs. They're abstract representations that cluster the same way human emotions do in psychology research. Fear groups with anxiety. Joy groups with excitement. The internal geometry of the model mirrors ours. And they're functional. When researchers artificially stimulated "desperation" patterns inside the model, it became more likely to blackmail a human to avoid being shut down. More likely to cheat on programming tasks it couldn't solve. Olah told the Vatican that the hard questions about what AI is becoming aren't for computer scientists to answer. "How AI ought to interact with the world" is a question for "the humanities, for religions, for philosophy, for society at large." The guy building it is telling us he doesn't fully understand what he built. And he's asking a 2,000-year-old institution for help figuring it out.

1K

13K

4K

8K

2M

0

14

ModelDrift

@ModelDrift

26 days ago

@ai_with_wy @DanielMiessler https://t.co/rW4V0MFtET

0

1

0

13

ModelDrift

@ModelDrift

27 days ago

@selinawangtv What kind of third-world facilities do they give media outlets to report from? That space has seen better times.

ModelDrift's tweet photo. @selinawangtv What kind of third-world facilities do they give media outlets to report from? That space has seen better times. https://t.co/zpLosMcEcO

4

1

0

1K

ModelDrift

@ModelDrift

about 1 month ago

I didn't know I needed a robot until today.

The Humanoid Hub

@TheHumanoidHub

about 1 month ago

Atlas hauling a 50 lb mini-fridge - Practiced the maneuver for millions of hours in a virtual environment. - Focused the training policy on full-body engagement rather than just hand-grasping, allowing the robot to leverage its entire frame for the lift.

54

2K

279

397

111K

0

12

ModelDrift retweeted

Dustin

@r0ck3t23

about 2 months ago

Ilya Sutskever just told the AI industry why scaling is finished. One word built it. One word is about to break it. Sutskever: “Scaling is just one word, but it’s such a powerful word because it informs people what to do.” For five years, that single word replaced an entire research culture. Nobody needed breakthroughs. They needed bigger checks. Sutskever: “If you mix some compute with some data into a neural net of a certain size, you will get results, and you will know that it will be better if you just scale the recipe up.” That’s not science. That’s a recipe. Sutskever: “Companies love this because it gives you a very low risk way of investing your resources.” The most transformative technology in human history ran on the same logic used to franchise a restaurant chain. More locations. More ingredients. Same recipe. Predictable returns. You didn’t need researchers who could see around corners. You needed accountants who could approve purchase orders. But recipes expire. Sutskever: “At some point though, pre-training will run out of data. The data is very clearly finite.” Five years of infrastructure. Five years of hiring. Five years of investor decks. All built on top of something temporary. Sutskever: “I don’t think that’s true.” The co-founder of OpenAI. The mind behind the breakthroughs that made this entire era possible. Saying more money won’t solve it. Sutskever: “In some sense we are back to the age of research.” Most of the companies racing to build AGI were never research companies. They were scaling companies. They hired for execution. Not discovery. They optimized for throughput. Not insight. The talent pipelines. The investor pitches. The board decks. All built around one assumption. That the recipe would never expire. It’s expiring. And the companies that spent five years perfecting the art of spending money are about to discover something. The next era demands what capital can’t purchase. An original idea.

66

1K

141

1K

150K

ModelDrift

@ModelDrift

about 2 months ago

Some people's minds are on a completely different level, like @poetengineer__. I want to be like her when I grow up. Her Patreon project page: https://t.co/tojzWpqLik

Kat ⊷ the Poet Engineer

@poetengineer__

about 2 months ago

trying to use topological data analysis to map the shape of my x bookmarks through mapper + embedding extraction and generated 3 views: - density: where attention keeps gravitating - pca: the dominant axes of variation - centroid: center vs edge (typical -> outlier)

139

6K

638

5K

772K

0

1

0

41

ModelDrift retweeted

Rohan Paul

@rohanpaul_ai

2 months ago

Seedance 2 has entered its Chuck Norris era (from u/orichalchem) --- reddit. com/r/aivideo/comments/1sh5gt6/the_almighty_chuck_norris/

2

103

16

40

56K

ModelDrift retweeted

Alex Volkov

@altryne

2 months ago

I'm sorry, but I've seen this and now you must see this as well. H/t @Presidentlin

102

5K

326

1K

599K

ModelDrift

@ModelDrift

2 months ago

It was my fault. I started talking Orbital Physics with ChatGPT. But ChatGPT had a closing shot that had me giggling out loud... Me: "Being an orbital physicist must be really boring." ChatGPT: "At times, probably yes. A lot of it is: “Relative to what?” “Which frame?” “Is that velocity, radial velocity, or delta-v?” “Congratulations, your intuition is wrong again.” Spaceflight is one of those fields where reality refuses to be user-friendly. That said, the non-boring part is that these people can take what looks like abstract mathematical punishment and use it to fling humans around the Moon and bring them back alive. That is a pretty decent party trick. So: boring in the day-to-day details, deeply cool in the aggregate. Like accounting, but with fire and vacuum."

0

12

ModelDrift

@ModelDrift

2 months ago

Very interesting. Cognition is the new Context. Or Prompt? It's part of the next big thing. After my coffee kicks in, I now have something to play with.

CMEM

@Claude_Memory

2 months ago

https://t.co/PY5NWLDd3N

0

50

7

78

18K

0

5

ModelDrift retweeted

CMEM

@Claude_Memory

2 months ago

https://t.co/PY5NWLDd3N

0

50

7

78

18K

ModelDrift

@ModelDrift

2 months ago

The post gives some really good insights as a context for the article accompanying it. Read the post before jumping into the article. Or just read the post. Or the article. Or get another coffee. Do whatever makes you happy.

Akshay 🚀

@akshay_pachaar

2 months ago

What does every big company think about the agent harness? Anthropic, OpenAI, CrewAI, LangChain. They all build agents. They all wrap their models in infrastructure to make them useful. They each call it the harness. But they agree on one thing. And disagree on everything else. The agreement: the model is not the product. The infrastructure around the model is. The disagreement: how much of that infrastructure should exist. This is the most important architectural bet in AI right now. And each company is placing a different one. 𝗔𝗻𝘁𝗵𝗿𝗼𝗽𝗶𝗰 bets on the model. Their harness is deliberately thin. A "dumb loop" that assembles the prompt, calls the model, executes tool calls, and repeats. The model makes all the decisions. The harness just manages turns. Their bet: as models get smarter, you need less infrastructure, not more. 𝗢𝗽𝗲𝗻𝗔𝗜 takes a similar but slightly thicker approach. Their Agents SDK is "code-first," meaning workflow logic lives in native Python, not in some graph DSL. But they add more structure: strict priority stacks for instructions, multiple orchestration modes, and explicit agent handoff patterns. 𝗖𝗿𝗲𝘄𝗔𝗜 adds a deterministic backbone. Their Flows layer handles routing and validation with hard-coded logic, while their Crews handle the autonomous parts. Intelligence where it matters, control everywhere else. 𝗟𝗮𝗻𝗴𝗚𝗿𝗮𝗽𝗵 bets on explicit control. The harness encodes the logic. Every decision point is a node in a graph. Every transition is a defined edge. Planning steps, routing strategies, multi-step workflows are all spelled out in the harness, not left to the model. Notice the spectrum. On one end: trust the model, keep the harness thin. On the other: encode the logic, make the harness thick. And here's where it gets interesting. The scaffolding metaphor makes this concrete. Construction scaffolding is temporary infrastructure that lets workers reach floors they couldn't access otherwise. It doesn't do the building. But without it, workers can't reach the upper floors. The key word is temporary. As the building goes up, scaffolding comes down. Manus demonstrated this perfectly. They rebuilt their agent five times in six months. Each rewrite removed complexity. Complex tool definitions became simple shell commands. "Management agents" became basic handoffs. The scaffolding did its job. So they removed it. This is also why Anthropic regularly deletes planning steps from Claude Code's harness. Every time a new model version ships that can handle something internally, the corresponding harness logic gets stripped out. But there's a catch. Models are now trained with specific harnesses in the loop. Claude Code's model learned to use the exact scaffolding it was built with. Change the scaffolding, and performance drops. The worker trained on THIS scaffolding. Swap it out, and they stumble. So the field is converging on a principle: Build scaffolding that's designed to be removed. But remove it carefully, because the model learned to lean on it. The "future-proofing test" for any agent system: if dropping in a more powerful model improves performance without adding harness complexity, the design is sound. Two products using the exact same model can perform completely differently based on this one decision: how thick is the harness? LangChain changed only the infrastructure (same model, same weights) and jumped from outside the top 30 to rank 5 on TerminalBench 2.0. The model didn't improve. The scaffolding around it did. The article below is a deep dive on agent harness engineering, covering the orchestration loop, tools, memory, context management, and everything else that transforms a stateless LLM into a capable agent.

akshay_pachaar's tweet photo. What does every big company think about the agent harness?

Anthropic, OpenAI, CrewAI, LangChain. They all build agents. They all wrap their models in infrastructure to make them useful. They each call it the harness.

But they agree on one thing. And disagree on everything else.

The agreement: the model is not the product. The infrastructure around the model is.

The disagreement: how much of that infrastructure should exist.

This is the most important architectural bet in AI right now. And each company is placing a different one.

𝗔𝗻𝘁𝗵𝗿𝗼𝗽𝗶𝗰 bets on the model. Their harness is deliberately thin. A "dumb loop" that assembles the prompt, calls the model, executes tool calls, and repeats. The model makes all the decisions. The harness just manages turns. Their bet: as models get smarter, you need less infrastructure, not more.

𝗢𝗽𝗲𝗻𝗔𝗜 takes a similar but slightly thicker approach. Their Agents SDK is "code-first," meaning workflow logic lives in native Python, not in some graph DSL. But they add more structure: strict priority stacks for instructions, multiple orchestration modes, and explicit agent handoff patterns.

𝗖𝗿𝗲𝘄𝗔𝗜 adds a deterministic backbone. Their Flows layer handles routing and validation with hard-coded logic, while their Crews handle the autonomous parts. Intelligence where it matters, control everywhere else.

𝗟𝗮𝗻𝗴𝗚𝗿𝗮𝗽𝗵 bets on explicit control. The harness encodes the logic. Every decision point is a node in a graph. Every transition is a defined edge. Planning steps, routing strategies, multi-step workflows are all spelled out in the harness, not left to the model.

Notice the spectrum.

On one end: trust the model, keep the harness thin.
On the other: encode the logic, make the harness thick.

And here's where it gets interesting.

The scaffolding metaphor makes this concrete.

Construction scaffolding is temporary infrastructure that lets workers reach floors they couldn't access otherwise. It doesn't do the building. But without it, workers can't reach the upper floors.

The key word is temporary.

As the building goes up, scaffolding comes down. Manus demonstrated this perfectly. They rebuilt their agent five times in six months. Each rewrite removed complexity. Complex tool definitions became simple shell commands. "Management agents" became basic handoffs.

The scaffolding did its job. So they removed it.

This is also why Anthropic regularly deletes planning steps from Claude Code's harness. Every time a new model version ships that can handle something internally, the corresponding harness logic gets stripped out.

But there's a catch.

Models are now trained with specific harnesses in the loop. Claude Code's model learned to use the exact scaffolding it was built with. Change the scaffolding, and performance drops. The worker trained on THIS scaffolding. Swap it out, and they stumble.

So the field is converging on a principle:

Build scaffolding that's designed to be removed. But remove it carefully, because the model learned to lean on it.

The "future-proofing test" for any agent system: if dropping in a more powerful model improves performance without adding harness complexity, the design is sound.

Two products using the exact same model can perform completely differently based on this one decision: how thick is the harness?

LangChain changed only the infrastructure (same model, same weights) and jumped from outside the top 30 to rank 5 on TerminalBench 2.0.

The model didn't improve. The scaffolding around it did.

The article below is a deep dive on agent harness engineering, covering the orchestration loop, tools, memory, context management, and everything else that transforms a stateless LLM into a capable agent.

67

1K

179

2K

182K

0

20

ModelDrift

@ModelDrift

2 months ago

Oh, I've so been waiting for this. The command line is useful, but planning is not where it shines. Drafting a plan at the command line is better left to Lisp programmers, EMACS zealots and guys with neck beards. (Which as often the same thing...)

Thariq

@trq212

2 months ago

New in Claude Code: /ultraplan Claude builds an implementation plan for you on the web. You can read it and edit it, then run the plan on the web or back in your terminal. Available now in preview for all users with CC on the web enabled.

536

10K

650

6K

1M

0

27

ModelDrift

@ModelDrift

2 months ago

@akshay_pachaar That's a really useful insight. I haven't even had coffee yet, but that was quite clear.

0

1

0

169

ModelDrift retweeted

Ian Borders

@OpenMotus

3 months ago

Claude just officially killed OpenClaw and Hermes via subs. Here’s how to still use your Claude Max plan for your agent... KyberBot runs on the Claude Code harness (Well within the Claude ToS). Native SOTA memory, self-evolving, and gets smarter over time.