Feng Peng

Verified account

@feng

Engineer. Ex-Ask/Twitter Data Infra, building @leettools

Bay area

Joined April 2009

1.1K Following

1.6K Followers

729 Posts

1 day ago

Coding agents are good at static stuff that can be verified via syntax check, but very bad at understanding complex runtime semantics (data or system components). I don't see this problem being solved by LLMs themselves, but a good harness should be able to help.

Moritz Wallawitsch

1 day ago

holy shit - their api is leaking customer data

4 days ago

Microsoft is releasing a coding model: MAI-Code-1-Flash is an inference-efficient agentic coding model. This model is tailor-made for and deeply integrated into GitHub Copilot, VS Code and the Microsoft stack, and, with 5 billion parameters, is comparable to Haiku but cheaper. (https://t.co/S8xF5l6wea) Why should we be bound to a single harness? I think we need a coordinator to run multi-harness with auto-configurable backend models; let's call it harness-of-harness engineering, or harnessness.

feng retweeted

6 days ago

This has quietly been a miracle month in medicine. In the last 5 weeks we’ve got news on:

- retatrutide, the triple agonist GLP-1 from Lilly, basically melting fat and body-wide inflammation at record levels
- RevMed’s new pancreatic cancer drug showing unprecedented abilities to extend life
- small trial of a one-and-done PCSK9 gene editing therapy for slashing LDL cholesterol
- Mayo’s AI-assisted radiology showing vastly improved cancer detection
- this new therapy for metastatic solid tumors

This stuff is at varying levels of evidence. Retatrutide is ~100% on its way, other stuff needs more clinical trial data. But put it together and we’re maybe on the verge of majorly reducing the mortality of heart disease and cancer, the two leading causes of death in America.

4 days ago

For most simple application-layer programs, I don't see any need for any other languages at all now. TypeScript will be almost the only one used very soon. Never saw this one coming. For the backend, C -> C++ -> Java -> Ruby -> Node(JS) -> Java -> Go -> Python -> now back to Node(TS), lol.

8 days ago

Showing off tokens used is basically the same as showing off LOC, which is just very junior.

9 days ago

I can now probably say this: Two months ago, inside Anthropic someone suggested building a token leaderboard. A heated internal debate followed and the decision was made to *never* ever do it… because several people inside Anthropic simply thought ahead about the consequences

8 days ago

@ziad_makes Nah, you can try it yourself. It is pretty easily reproducible.

9 days ago

LOL, when you ask Claude which model it is through the API, its answer is "Qwen" when the question is in Chinese and "Claude" when the question is in English. LLMs are definitely Bayesian; anyone saying they are intelligent is just wrong.

8 days ago

Opus 4.8 has been pretty impressive, solving quite a few tasks that 4.7 and GPT-5.5 ran in circles on. But apparently it can't find its own bugs. Claude Code has been quite buggy recently, but this one is especially annoying since it is not recoverable, meaning the tokens used before the bug are wasted. Anyone who has worked on agent loops should easily see where the bugs are coming from. Agent-assisted coding still has a long way to go for sure.

9 days ago

Damn, hard things are hard. Great that everyone is safe.

9 days ago

Very glad everyone is safe. An extremely bad night for everyone @blueorigin, and those counting on them. My heart goes out to all. Hopefully the cause of the explosion can be found swiftly, and the launchpad rebuilt on an accelerated timeline.

17 days ago

The boring stuff (security, response time, uptime, and fault tolerance) is actually important.

18 days ago

We are investigating unauthorized access to GitHub’s internal repositories. While we currently have no evidence of impact to customer information stored outside of GitHub’s internal repositories (such as our customers’ enterprises, organizations, and repositories), we are closely monitoring our infrastructure for follow-on activity.

22 days ago

@_brian_johnson Definitely! I hope the OSS models can get good enough soon.

22 days ago

Claude Code somehow reset my weekly quota this morning, but did not change the refresh date. So much money wasted; I tried my best, lol😭

23 days ago

Thankfully, Codex is at least as good as Claude Code these days. Can't say for sure though; their quality does fluctuate quite a bit.

23 days ago

I am not sure why anyone would want to limit themselves to a single coding agent atm. The coding tool war is still very early; I bet Claude Code will have some promotions very soon. Also, Copilot and Gemini are getting better, judging from my occasional use of them. They don't have any big promotions because they are not good enough right now.

24 days ago

codex is the best AI coding product and we want to make it easy to try. for the next 30 days, we are giving companies that want to try switching over two months of free codex usage.

about 1 month ago

GitHub has its problems. But we have used other options, including the self-hosted GitLab enterprise version and Bitbucket in the early days. GitHub is actually pretty impressive, especially considering that it is dealing with this kind of unexpected usage growth. The actual problem is that this kind of freemium business model needs real adjustment in the agent era of software and usage.

about 1 month ago

It's honestly impressive that GitHub kept the service up at all, given this kind of growth. I predicted this years ago: Free services will become untenable with the advent of human-level bots. Worth exploring micro-payments: Even cents per git push might be enough to reduce spam and make this sustainable. Maybe powered by Bitcoin to keep this open and accessible (as opposed to KYCing users).

about 1 month ago

It is really hard to pick which parts of the code to write manually these days (trending toward zero, though). I guess it is even harder to resist the temptation to ask the agent to write the corporate docs. I bet there are skills out there that list all the common AI patterns to keep them from appearing, like a cat-and-mouse game.

about 1 month ago

The growth of AI Slop in corporate comms (via @AlphaSenseInc and @barronsonline)

about 1 month ago

Actually, anyone with reasonable ops experience knows that it SHOULD be impossible for any human (let alone an agent) to destroy the production DB, its WAL (or CDC), and its periodic backups all at once. It is basically ops 101 to prevent bad actors (who used to be mostly human) from doing this kind of damage. It is not a failure of the agents (they are bound to fail sometimes); it is a failure of ops. Agents just need good rails.

about 1 month ago

https://t.co/ofucbVgkLV

about 1 month ago

Always wanted to do this but finally got some time to get it done using one command: Find a VM with 4GB RAM and 2vCPU on GCP and deploy the @examples/web_server/ to it for the domain name https://t.co/sbfKHf1JPR registered on GoDaddy with self-made SSL certificates.

about 2 months ago

Alameda Research's $200,000 pre-seed investment in Cursor sold for $200,000 in FTX bankruptcy, damn. https://t.co/hMUg6Jvivs
