hetong07 @hetong07 - Twitter Profile

hetong07 retweeted

1 day ago

cuda frameworks that JIT and autotune are so painful. i don't want to have magic happen during the start of training. i want to precompile my kernels, test them once, and be happy.

12

153

6

25

12K

hetong07 retweeted

Tianqi Chen

@tqchenml

12 days ago

What will the role of AI compilers be in the age of AI agents and frontier kernel programming? We believe agents should have access to a predictable DSL that offers maximum expressiveness, paired with a minimal compiler they can directly open up, build toolings, and improve for specialized optimizations. TIRx is our effort on this front. We've had a great experience using it in our latest mega-kernel compiler research and teaching Blackwell programming in our ML systems course at CMU. Check it out:

1

128

20

71

11K

hetong07 retweeted

Charles Rollet @CharlesRollet1

19 days ago

NEW: Inside Cursor's wild rise. Lots of great new details:

• CEO Michael Truell didn't pay himself for years

• Cursor once made up 40-50% of Anthropic's revenue

• Anthropic told Cursor that Claude Code was just a 'research effort' (lol)

• Cursor's unpaid "work trials" https://t.co/ClJF2F0T7U

50

3K

100

1K

835K

hetong07 @hetong07

about 2 months ago

@jian_w3ng Doesn't Bitcoin grow out of the ground in Saudi Arabia?

0

14

hetong07 retweeted

about 2 months ago

LLM-powered AI agents are gonna be great! You should totally trust them!

39

334

38

95

29K

hetong07 retweeted

Avi Chawla

@_avichawla

2 months ago

The most comprehensive RL overview I've ever seen.

Kevin Murphy from Google DeepMind, who has over 128k citations, wrote this.

What makes this different from other RL resources:

→ It bridges classical RL with the modern LLM era:

There's an entire chapter dedicated to "LLMs and RL" covering:

- RLHF, RLAIF, and reward modeling
- PPO, GRPO, DPO, RLOO, REINFORCE++
- Training reasoning models
- Multi-turn RL for agents
- Test-time compute scaling

→ The fundamentals are crystal clear

Every major family of algorithms, including value-based methods, policy gradients, and actor-critic, is explained with mathematical rigor.

→ Model-based RL and world models get proper coverage

Covers Dreamer, MuZero, MCTS, and beyond, which is exactly where the field is heading.

→ Multi-agent RL section

Game theory, Nash equilibrium, and MARL for LLM agents.

I have shared the arXiv paper in the replies!

11

1K

180

2K

88K

hetong07 retweeted

Akshay 🚀

@akshay_pachaar

3 months ago

Claude Code fully dissected!

Researchers from UCL reverse-engineered the leaked Claude source. What they found changes how you should think about agent design.

Only 1.6% of the codebase is AI decision logic.

The other 98.4% is operational infrastructure. Permission gates, tool routing, context compaction, recovery logic, session persistence. The model reasons. The harness does everything else.

This is the opposite of what most agent frameworks do today.

LangGraph routes model outputs through explicit state machines. Devin bolts heavy planners onto operational scaffolding. Claude Code gives the model maximum decision latitude inside a rich deterministic harness, and invests all its engineering effort in that harness.

The core loop is a simple while-true. Call model, run tools, repeat.

But the systems around that loop are where the real design lives:

A permission system with 7 modes and an ML classifier. Users approve 93% of prompts anyway, so the architecture compensates with automated layers instead of adding more warnings.

A 5-layer context compaction pipeline. Each layer runs only when cheaper ones fail. Budget reduction, snip, microcompact, context collapse, auto-compact.

Four extension mechanisms ordered by context cost. Hooks (zero), skills (low), plugins (medium), MCP (high). Each answers a different integration problem.

Subagents return only summary text to the parent. Their full transcripts live in sidechain files. Agent teams still cost roughly 7x the tokens of a standard session.

Resume does not restore session-scoped permissions. Trust is re-established every session. That friction is the point.

The bet behind all of this is simple. As frontier models converge on raw coding ability, the quality of the harness becomes the differentiator, not the model.

Paper: Dive into Claude Code (arXiv:2604.14228)

In the next tweet, I've shared an article I wrote on Agent Harness and what every big company is building. Do check.
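
The "call model, run tools, repeat" loop and the "each layer runs only when cheaper ones fail" compaction cascade described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Claude Code's actual implementation: `compact`, `run_agent`, `scripted_model`, the two layers, and the character-count budget are all hypothetical names and simplifications introduced here, and the loop is bounded by `max_steps` rather than a literal while-true.

```python
from typing import Callable, Dict, List, Tuple

Context = List[str]
Layer = Tuple[str, Callable[[Context], Context]]


def context_size(context: Context) -> int:
    # Assumption: total characters stand in for a real token count.
    return sum(len(message) for message in context)


def compact(context: Context, budget: int, layers: List[Layer]) -> Tuple[Context, List[str]]:
    """Apply layers cheapest-first and stop as soon as the context fits."""
    applied: List[str] = []
    for name, layer in layers:
        if context_size(context) <= budget:
            break  # cheaper layers were enough; skip the costlier ones
        context = layer(context)
        applied.append(name)
    return context, applied


# Hypothetical layers, ordered from cheap to aggressive.
def trim_long_messages(context: Context) -> Context:
    return [message[:40] for message in context]


def drop_oldest_half(context: Context) -> Context:
    return context[len(context) // 2:]


LAYERS: List[Layer] = [("trim", trim_long_messages), ("drop_oldest", drop_oldest_half)]


def run_agent(model: Callable[[Context], Dict[str, str]],
              tools: Dict[str, Callable[[str], str]],
              task: str, budget: int = 200, max_steps: int = 20) -> str:
    context: Context = [task]
    for _ in range(max_steps):  # bounded stand-in for the while-true loop
        context, _ = compact(context, budget, LAYERS)
        action = model(context)                        # call model
        if action["type"] == "done":
            return action["answer"]
        result = tools[action["tool"]](action["arg"])  # run tool
        context.append(f"{action['tool']}({action['arg']}) -> {result}")
    raise RuntimeError("step limit reached")


def scripted_model(context: Context) -> Dict[str, str]:
    # Hypothetical stand-in for an LLM: one tool call, then finish.
    if len(context) == 1:
        return {"type": "tool", "tool": "upper", "arg": "hello"}
    return {"type": "done", "answer": context[-1]}


TOOLS = {"upper": str.upper}
```

Calling `run_agent(scripted_model, TOOLS, "shout hello")` runs one tool call and then returns. In this sketch the loop itself holds no decision logic; everything the model does not decide (budget checks, tool dispatch, step limits) lives in the surrounding code, which is the split the tweet describes.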

71

2K

300

2K

180K

hetong07 retweeted

Jerry Liu

@jerryjliu0

3 months ago

If you're an AI/agent builder, it's so important that you don't overbuild and overcommit on a specific toolset and infrastructure. Frontier labs are shipping not just the models, but the harnesses and surrounding tooling such that your existing stack might be obsolete next week.

* e.g. if you had a super complex RAG stack, you may need to rip it out in favor of agents + sandboxes
* e.g. if you spent a lot of time building the sandbox and serving layer, you may not need to anymore if you can just bootstrap the product with Claude Managed Agents

The tradeoff is completely dependent on how good out-of-the-box these proprietary agent wrappers get. Back when the OpenAI Agent SDK came out, most people did not switch from frameworks because the frameworks were simply more powerful. Nowadays tools like the Claude Agent SDK + managed agent services are getting way better.

57

436

26

339

69K

hetong07 retweeted

Anthropic

@AnthropicAI

3 months ago

New on the Engineering Blog: Building Managed Agents—our hosted service for long-running agents—meant solving an old problem in computing: how to design a system for “programs as yet unthought of.” Read more: https://t.co/YYaEub2QGV

388

4K

448

2K

579K

hetong07 retweeted

Vivek Galatage

@vivekgalatage

3 months ago

Ultrafast Trading Systems in C++ by David Gross

"While low-latency programming is sometimes seen under the umbrella of 'code optimization', the truth is that most of the work needed to achieve such latency is done upfront, at the design phase."

https://t.co/FYv8Iml9aM

9

1K

94

1K

135K

hetong07 retweeted

am.will

@LLMJunky

3 months ago

The Codex app server was such a brilliant stroke of foresight that really doesn't get enough love.

Not only are you allowed to use your ChatGPT account with any harness, but you can build your own apps directly on top of theirs. They just make building on and with Codex such a great experience.

To demonstrate this utility, I want to highlight the kitty litter app, made by @SIGKITTEN. Instead of having to build the entire harness and all the infrastructure, he's plugged into the app server for a unified experience between mobile and dev machine.

When I create a session on my computer, it's automatically available on my phone. All of the chats you see in this video automatically populated when we connected to the app server.

All my skills. My agents. My sessions. My folders. My prompts. They're all ready to use, automatically, because they're exposed by the app server, along with many other endpoints.

It's a great UX/DX that really deserves some love. It's almost like they want you to build on top of their products ;)

Btw Litter is great 👍