Anup

Verified account

@Anup

AI Engineer @ Oxford Dynamics AI Engineering Newsletter -

London

Joined December 2006

3.8K Following

3.7K Followers

5.2K Posts

4 days ago

A useful agent should not just remember facts. It should learn from what happened before. I’ve wrapped up my 3-part series on agent memory, from why context is not enough, to modern memory architectures, to the frontier of reflection, consolidation, and memory-guided action.
Part 1: https://t.co/CxtFmAoEn5
Part 2: https://t.co/nauSzWmomg
Part 3: https://t.co/dG0AkKOooP

Anup retweeted

28 days ago

@eladgil BS.
Attention was born in Montréal
PyTorch in NYC.
AlphaGo in London
AlphaFold in London
ESMFold in NYC
Llama 1 in Paris.
Llama 2 in Paris+NYC+SV
DeepSeek in Hangzhou
Plus:
DINO in Paris
JEPA in Montréal+Paris+NYC
SV is 3 mos ahead on topics SV is singularly obsessed with.

30 days ago

well…there goes my weekend.

https://t.co/IQapimOm54

about 1 month ago

amazing

about 1 month ago

Effective today, we are: 1) Doubling Claude Code’s 5-hour rate limits for Pro, Max, and Team plans; 2) Removing the peak hours limit reduction on Claude Code for Pro and Max plans; and 3) Substantially raising our API rate limits for Opus models.

about 1 month ago

Wrote two posts on inference engineering.
Part 1 (capacity): one formula, the MoE sizing trap, why embedding fleets want a different shape.
Part 2 (throughput): decode is bandwidth-bound, not compute-bound. Two H200s often outserve four H100s on chat workloads, even when raw FLOPs look similar.
Part 1: https://t.co/aAkWLlKV35
Part 2: https://t.co/XUq3dfKNJ5

about 1 month ago

Two events, four weeks, same argument: Claude Code's source leaked on March 31, and Anthropic's quality postmortem landed yesterday. A 25-word system prompt line cost 3% on coding evals. Zero model weight changes. The harness is the product. https://t.co/3S7iDf1Jtu

2 months ago

Better models alone will not solve agent reliability. If an agent cannot remember, verify, recover, or be inspected properly, the problem is often the harness, not the model. Wrote up my thinking on harness engineering: https://t.co/ET3qEt1XNh

2 months ago

I finally got through the 3 hr+ @maxsbennett interview on @MLStreetTalk (link at the end). It took me over two weeks to finish it. But it sharpened something I was already thinking while reading Packy McCormick’s Not Boring essay on world models, co-written with @PimDeWitte.

I think we are still too loose with the phrase “world model”. Current LLMs obviously have models. You do not get that level of performance without some internal structure that captures a surprising amount about language and the world.

But Bennett’s distinction is more demanding than that. A world model, in the stronger sense, is about interventions and causality: I think this will happen if I do X, I do X, and then I update from the gap between what I expected and what actually happened. That is not the same thing as learning from a fixed corpus.

What also stayed with me is his point about language. Language was not just a better way to communicate observations. It let humans share simulations, refine them together, and build on them across generations. That feels like a deeper explanation for why human knowledge compounds the way it does.

Read through that lens, world models start to look less like a side branch of robotics and more like a serious attempt to move beyond systems that are very good at describing the world but cannot really test themselves against it.

I still think this area is easy to overstate, and the term gets used too casually. But I do think the direction matters. Maybe the next step after LLMs is not just better text generation, but systems that can form hypotheses, act, and revise.

MLST interview: https://t.co/dEbblABiFO
Not Boring article on world models: https://t.co/uui35k2DwK

2 months ago

This is fantastic

2 months ago

Introducing the new dev-browser cli. The fastest way for an agent to use a browser is to let it write code. Just `npm i -g dev-browser` and tell your agent to "use dev-browser"

3 months ago

HN is asking the question everyone's avoiding: When AI makes your devs 2x more productive, do you fire half of them or build twice as much? The answers in this thread are more honest than any earnings call. https://t.co/slQZYNqg6A

3 months ago

New paper: GPT-5.2 and Claude Opus 4.6 independently produce identical refusals for certain prompts. "Deterministic silence" is a correlated failure mode across competing labs. Alignment monoculture may be a bigger risk than we thought. https://t.co/5s1Y6Sdnb9

3 months ago

If you're trying to standardise how your team uses AI across the full software development lifecycle, this repo template is worth a look. https://t.co/2rV97b71cF

3 months ago

Claude Code as a self-managing AI team: one repo, multiple specialised agents (PM, architect, engineer, reviewer) coordinating autonomously. The agentic SDLC is here. https://t.co/esqlVrz8Vd

Anup retweeted

3 months ago

Today, we’re introducing Forge, a system for enterprises to build frontier-grade AI models grounded in their proprietary knowledge. 🌎 Forge bridges the gap between generic AI and enterprise-specific needs. Instead of relying on broad, public data, organizations can train models that understand their internal context embedded within systems, workflows, and policies, aligning AI with their unique operations. We have already partnered with world-leading organizations, like ASML, DSO National Laboratories Singapore, Ericsson, European Space Agency, Home Team Science and Technology Agency (HTX) Singapore and Reply to train models on the proprietary data that powers their most complex systems and future-defining technologies.

3 months ago

@mitchellh thank you, the phantom mouse bug was really annoying

3 months ago

@james406 <em dash> <em dash> some clever response <em dash> <em dash>

3 months ago

@trq212 @bcherny

3 months ago

Claude all the way down

https://t.co/u4W4GnZYmP
