Narrative violation and great insight from the latest Citadel Securities banger by Frank Flight: "We illustrated back in February that demand for software engineers, the most AI exposed occupation was accelerating higher, which we argued violates the displacement narrative. Indeed the acceleration in software job postings has continued, now up 18% from the inflection point in May last year."
NEWS: In an interview with me, @SecScottBessent fully closes the door to ever running for elective office but opens the door to becoming Fed chair down the road (post-Warsh):
"After he leaves the Treasury, Mr. Bessent...says he has zero interest in the presidency or any other elective office. 'I will take that off the table,' he says. 'I will be the Colin Powell of this administration.'
"When I ask about the Federal Reserve, however, he pauses. 'I wouldn’t say "no" to being Fed chair later,' he answers. 'There’s no election involved. You can shape the economy, and it’s an institution.'"
Read the full interview here:
https://t.co/UGlsmPpoMV
@NickTimiraos @schwartzbWSJ @jeannasmialek @arappeport @amacker @colbyLsmith @clairecjones @ctorresreporter @SalehaMohsin
How two hedge fund analysts can take opposite sides of a trade and both build competitive advantage:
Kirk McKeown — ex-Head of Proprietary Research at Point72, previously Glenview & Tudor.
Kirk McKeown explains:
"You and I both get a 10-K, a couple 10-Qs, the last two transcripts, and a comp sheet on Lululemon."
"You take a two-year view. New CEO, closing underperforming stores — $10 goes to $25. Two-and-a-half bagger. You buy on dips."
"I come in and say it always takes longer. Alo and Vuori just launched new products. Stock's going to $6 from here. I short it."
"You're long. I'm short. We're both right."
"You need persistence of capital. I need timing."
"We're looking at the same information and bringing in different context to create competitive advantage."
Something I taught 14 yo: Most progress is a mix of steps forward and steps back, just with more of the former. But you can get a run of steps back. So to judge progress accurately you need to use a big enough window, or it could look like you're failing.
Very happy to share a new paper with Guido Friebel, Yao Huang, Jin Li, and Andrew Zhang on how AI could change the structure of internal labor markets.
We show that cutting junior hiring when AI arrives may weaken the pipeline that creates future seniors and lead to “lost cohorts” of juniors and cycles of shortage and glut over time.
Craziest part is we all knew each other already in high school! Along with @randomjohnnyh (Perplexity cofounder), @demi_guo_ (Pika CEO), @stevenkplus1 and Andrew (Cognition), and many others. We all grew up in different states but met thru the olympiad scene.
Vividly remember this line from @alexandr_wang when we were around 19: "I hear people saying they want to find the next Paypal mafia. Why shouldn't it just be us?"
Glad to see @chameleon_jeff get the recognition he deserves :)
NEW: Inside the meetings + calls where President Trump and his team were given clear indicators that the economy (possible rise in prices) could take a hit if the war in Iran is prolonged.
Treasury Secretary Bessent and the president discussed various measures the Treasury could take if the war went on for eight to 12 weeks and how the U.S. could be vulnerable to a potential rise in gasoline prices.
Many of these little-known conversations came in the buildup to the cease-fire.
https://t.co/FlpFjDhUS2
Managed Agents is the first 'agent in the cloud' API that has the right mix of simplicity and complexity.
Implementation details like how you manage a sandbox are abstracted, but you have a lot of control over the actual execution of the model.
The authors estimate that the tariffs implemented through November of 2025 can explain the entirety of excess inflation in the core goods category and contributed to a 0.8 percent boost in core PCE prices through February 2026. https://t.co/bVGHP3qhhK #FEDSNote
Judging by my tl there is a growing gap in understanding of AI capability.
The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.
But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.
So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions.
TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
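To make the "verifiable reward" point concrete, here is a minimal sketch of what a unit-test-based reward could look like. Everything in it is an illustrative assumption (the `verifiable_reward` name, the toy squaring task, the test cases); it is not how any lab actually computes rewards, just the "tests passed yes or no" idea in its simplest form.

```python
from typing import Callable, List, Tuple

# Hypothetical test cases: (input, expected output) pairs for a toy task.
TestCase = Tuple[int, int]


def verifiable_reward(candidate: Callable[[int], int], tests: List[TestCase]) -> int:
    """Return 1 only if the candidate passes every test, otherwise 0."""
    for arg, expected in tests:
        try:
            if candidate(arg) != expected:
                return 0
        except Exception:
            # A crash counts as a failed test.
            return 0
    return 1


# Toy task (assumed for illustration): square an integer.
tests = [(0, 0), (2, 4), (-3, 9)]


def correct_solution(x: int) -> int:
    return x * x


def buggy_solution(x: int) -> int:
    return x * 2  # agrees on 0 and 2, fails on -3


print(verifiable_reward(correct_solution, tests))  # 1
print(verifiable_reward(buggy_solution, tests))    # 0
```

The signal is unambiguous and cheap to compute, which is what makes it easy to train against. There is no comparably crisp pass/fail check for "was this essay good".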
Life update: After ~2 years at Anthropic, I joined OpenAI! This wasn’t the easiest decision and I’m very grateful to everyone who is supporting me through this transition, especially John, Barret, Boris, Mira, and Sam.
I joined Anthropic as the first designer / front-end engineer when there were ~60 people and left as a researcher when there were >500. I learned so much and hope that every project I’ve worked on, be it a UI, a paper, or a trained Claude model, will still carry all the love and care I put in. Some lessons learned:
1. The pace of a team’s progress is largely a function of its decisiveness and open-mindedness to take risky paths.
2. Every time you train a new model there will be an inevitable brain damage that needs to be solved and often you can reverse engineer the issue by carefully looking at the data.
3. The simplest and dumbest approach will often just work.
4. You have to go through the entire journey of full understanding to arrive at the simplest answer.
5. When technology is so transformative, it’s your job to tell customers what they need to do with it to solve their problems.
6. Scaling the company’s culture requires fostering internal champions for your core values.
7. In research, the beauty often lies in taking experimental ideas and making them work on a larger scale. In product engineering, the beauty lies in refining a visionary design idea to its most essential form you can execute within given constraints.
8. Early mishires will have 10x effects as the company grows. It is more heartbreaking when the organization is blind to this (e.g. they don’t let go and rather allow them to influence a lot of important decisions)
9. Evaluations are going to be an inevitable part of the story for your product. You can help academia by adopting their evaluations in your model card, and that’s the position of power that is really important to recognize responsibly.
10. Being the first design-oriented person is challenging and often you will end up teaching people how to think rather than doing things. But you can learn a great deal about fundraising and shape the public perception by continually making beautiful demos and communicating research clearly.
11. Not being afraid to jump into unknown problems, taking more responsibilities, and doing 200% more than what is asked for is how I personally grew the most.
12. Unique writing cultures shape how ideas take place.
13. Playing a catch up game is efficient.
14. Doing good work and being kind to people you collaborate with is how I built the most meaningful friendships in my life.
It’s been a wonderful journey delivering Claude 3 models, and I’m very excited to continue working on AGI and its safety by learning from incredibly talented researchers and product people at OpenAI.