Behrooz Ghorbani

@_ghorbani

Leading Reinforcement Learning at @Reflection_ai. Scaling RL for frontier reasoning models. Formerly @OpenAI, @GoogleBrain and @stanford_ee.

San Francisco, CA

Joined December 2017

629 Following

1.5K Followers

160 Posts

Behrooz Ghorbani

@_ghorbani

12 days ago

Exciting times at @reflection_ai! Science moves faster when researchers can inspect, adapt, reproduce, and build. That is why open models matter. If you want to build models that move science forward, come join us.

Reflection @reflection_ai

13 days ago

Our open models are designed to support the Genesis Mission by giving the scientists in our national labs the flexibility and sovereignty to work on their own terms. Learn more ⤵️

40K

475

_ghorbani retweeted

Yash Patil

@ypatil125

about 1 month ago

Culture is a moat

115

15K

_ghorbani retweeted

Misha Laskin

@MishaLaskin

3 months ago

AGI is in its first stages of take-off. Every country is realizing that AI sovereignty is existential, which requires open models. We’ve signed a deal with Shinsegae Group to build South Korea’s sovereign cloud on a US open model built by Reflection. More to come.

133

24K

Behrooz Ghorbani

@_ghorbani

3 months ago

Proud to share that Reflection is partnering with Shinsegae Group to build a 250MW AI factory for Korea’s sovereign AI 🇰🇷 Excited to keep pushing the frontiers of RL, reasoning, and open models with this team! https://t.co/1e09vRFLnX

_ghorbani retweeted

Patrick Fernandes @psanfernandes

5 months ago

Excited to announce that, after finishing my PhD a couple of months ago, I will continue to do *open* science at @reflection_ai on @_ghorbani's new team! And we are still looking for exceptional individuals to join us 😉

Behrooz Ghorbani

@_ghorbani

5 months ago

@psanfernandes @reflection_ai Welcome to the team Patrick! Excited to work together 🔥🔥🔥

188

_ghorbani retweeted

Reflection @reflection_ai

6 months ago

Most approaches to “agentic AI” focus on post-training fixes. In this conversation, member of our technical staff, @achowdhery argues the bottleneck is pre-training itself. Drawing on her work on PaLM and early Gemini, she explains why next-token prediction breaks down for long-horizon planning -- and how objectives, attention, and training data must evolve to support true agentic behavior.

111

45K

_ghorbani retweeted

Casey Flint

@FlintCasey

6 months ago

2 hrs in and I have almost lost my voice

Behrooz Ghorbani

@_ghorbani

6 months ago

I am deeply grateful to my colleagues at OpenAI. It has been a privilege to be there from the early days of ChatGPT and to learn from so many brilliant people, especially the reasoning team, which has been my home these past few years and a constant source of insight, collaboration, and support. Thank you for everything we built together. I am excited for what comes next.

Behrooz Ghorbani

@_ghorbani

6 months ago

Hi friends, after three incredible years at OpenAI I am excited to share that I am starting a new chapter at @reflection_ai, where I will be leading the Science of Scaling team. Our mission is to deepen the scientific understanding of large scale learning and to turn compute into intelligence as efficiently and predictably as possible.

281

75K

Behrooz Ghorbani

@_ghorbani

6 months ago

In Science of Scaling we will focus on three pillars: understanding LLM training dynamics at scale, the role of real and synthetic data, and the science of RL. I am especially excited to pursue this mission together with @MishaLaskin and @real_ioannis at Reflection. I am building a small, high trust team that cares deeply about open research, careful measurement, and engineering excellence. If you are interested in the science of pretraining, data, and RL at scale and want to help push the frontier with a focused, tight knit group, my DMs are open. I will also be at NeurIPS this week (https://t.co/vRcIBgK3rn).

Behrooz Ghorbani

@_ghorbani

7 months ago

@appliedcompute Congrats @ypatil125 and team!

680

_ghorbani retweeted

Applied Compute @appliedcompute

7 months ago

Generalists are useful, but it’s not enough to be smart.

Advances come from specialists, whether human or machine.

To have an edge, agents need specific expertise, within specific companies, built on models trained on specific data.

We call this Specific Intelligence.

It's what we're building at Applied Compute.

We unlock the latent knowledge inside a company, use it to train custom models, and deploy an in-house agent workforce that reports to your team.

We work with sophisticated companies that have already captured early gains from general models, like @cognition, @DoorDash, and @mercor_ai. They’re pulling even further ahead with proprietary in-house agents that don’t need to wait for the next public model release.

Together, we are building and validating models and agents in days instead of months, achieving state-of-the-art performance on customer evals.

Our team has high density and low latency. Our founders all worked on different parts of this problem while they were researchers at OpenAI — @ypatil125 as a key member on the agentic software engineer effort (Codex), @rhythmrg as a core contributor to the first RL-trained reasoning model (o1), and @lindensli as a core contributor on ML systems and infrastructure for RL training.

Two-thirds of the team are former founders, and everyone brings a deep technical background, from top AI researchers to Math Olympiad winners.

We are backed by $80M in funding from Benchmark, Sequoia, Lux, Elad Gil, Victor Lazarte, Omri Casspi, and others. With their support, we are growing the team, scaling deployments, and bringing to market the first generation of agent workforces built on specific models.

In short:

1. We are building Specific Intelligence for specific work at specific companies.
2. That will power in-house agent workforces to support their human bosses.
3. That in turn will unlock AI’s full potential through humanity’s greatest engine of progress: thriving corporations in a free market.

107

631

354

_ghorbani retweeted

Tejal Patwardhan @tejalpatwardhan

8 months ago

Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.

185

739

_ghorbani retweeted

Anjney Midha

@AnjneyMidha

9 months ago

the distance between category leaders and stragglers in frontier AI starts with talent and culture

by the time the revenue and valuation signals show up, it’s too late

_ghorbani retweeted

Jiantao Jiao @JiantaoJ

9 months ago

🚀 We’re hiring at NVIDIA! Our team is pushing the frontier of LLM / DLM post-training and system optimization. We are looking for exceptional people with large-scale LLM + systems experience to join us (full time only).

🔹 Focus areas include:
• Post-training of large models
• Systems for LLM/DLM training & inference at scale
• Efficiency, scaling, and evaluation frameworks of LLMs

At NVIDIA, you’ll work with world-class researchers and engineers on cutting-edge foundation models at unprecedented scale.

👉 If you’re passionate about LLMs, systems, and building the next generation of AI, we’d love to hear from you.

📩 If you’re interested, please send me your CV!

@nvidia #LLM #AI #Systems #PostTraining #DeepLearning

468

353

103K

_ghorbani retweeted

Andrej Karpathy

@karpathy

9 months ago

In era of pretraining, what mattered was internet text. You'd primarily want a large, diverse, high quality collection of internet documents to learn from.

In era of supervised finetuning, it was conversations. Contract workers are hired to create answers for questions, a bit like what you'd see on Stack Overflow / Quora, or etc., but geared towards LLM use cases.

Neither of the two above are going away (imo), but in this era of reinforcement learning, it is now environments. Unlike the above, they give the LLM an opportunity to actually interact - take actions, see outcomes, etc. This means you can hope to do a lot better than statistical expert imitation. And they can be used both for model training and evaluation. But just like before, the core problem now is needing a large, diverse, high quality set of environments, as exercises for the LLM to practice against.

In some ways, I'm reminded of OpenAI's very first project (gym), which was exactly a framework hoping to build a large collection of environments in the same schema, but this was way before LLMs. So the environments were simple academic control tasks of the time, like cartpole, ATARI, etc. The @PrimeIntellect environments hub (and the `verifiers` repo on GitHub) builds the modernized version specifically targeting LLMs, and it's a great effort/idea. I pitched that someone build something like it earlier this year: https://t.co/ANHhasxzD8 Environments have the property that once the skeleton of the framework is in place, in principle the community / industry can parallelize across many different domains, which is exciting.

Final thought - personally and long-term, I am bullish on environments and agentic interactions but I am bearish on reinforcement learning specifically. I think that reward functions are super sus, and I think humans don't use RL to learn (maybe they do for some motor tasks etc, but not intellectual problem solving tasks). Humans use different learning paradigms that are significantly more powerful and sample efficient and that haven't been properly invented and scaled yet, though early sketches and ideas exist (as just one example, the idea of "system prompt learning", moving the update to tokens/contexts not weights and optionally distilling to weights as a separate process a bit like sleep does).

253

854

949K

_ghorbani retweeted

OpenAI

@OpenAI

10 months ago

LIVE5TREAM THURSDAY 10AM PT

23K

Behrooz Ghorbani

@_ghorbani

10 months ago

Huge congratulations to @AIatMeta and to @shengjia_zhao! Shengjia is one of the most brilliant and kind researchers I’ve had the privilege to work with.

AI at Meta

@AIatMeta

10 months ago

We're excited to have @shengjia_zhao at the helm as Chief Scientist of Meta Superintelligence Labs. Big things are coming! 🚀 See Mark's post: https://t.co/SL7h4sGfwx

978

113

386K

811

_ghorbani retweeted

Jerry Tworek

@MillionInt

11 months ago

To summarize this week:
- we released general purpose computer using agent
- got beaten by a single human in atcoder heuristics competition
- solved 5/6 new IMO problems with natural language proofs

All of those are based on the same single reinforcement learning system

111

248

173K
