Xiangchen Song

@XiangchenSong

PhD student @mldcmu @SCSatCMU | Undergrad @dmguiuc @UofIllinois | Intern @AmazonScience @SFResearch @MSFTResearch

Pittsburgh, PA

Joined December 2016

711 Following

190 Followers

23 Posts

XiangchenSong retweeted

17 days ago

Introducing CHI-Bench on @huggingface: the world’s first long-horizon healthcare benchmark for AI agents. 75 real healthcare workflows + 20 apps + 200+ MCP tools + 1,290 skills + process / outcome rewards https://t.co/PKmQ4RiIJY Any questions, lmk!

8

145

25

130

32K

XiangchenSong retweeted

Aether AI (Causal Intelligence) @AetherLab_AI

21 days ago

We are building Aether AI. #AetherAI Scaling has made AI powerful. But scaling pattern recognition alone will not deliver real-world intelligence. The next paradigm requires causal world models and causal agentic systems — systems that uncover mechanisms, reason about interventions, and improve through the consequences of their own actions. Our first proving ground is Physical AI. #Causality #AI

1

7

2

0

2K

XiangchenSong retweeted

23 days ago

In real healthcare operations, agents must do far more than answer medical questions. They need to read charts, interpret clinical and operational policies, verify coverage, route referrals, draft P2P scripts, and finalize care plans, where a single policy violation can mean a denied claim or missed patient outcome.

@actAVAai @iscreamnearby led and developed CHI-Bench (Clinical Healthcare In-situ Benchmark), the first long-horizon, policy-rich benchmark for AI agents operating across end-to-end U.S. healthcare workflows.

Key highlights:

▶️ High-fidelity simulators for Provider Prior Authorization, Payer Utilization Management, and Population Health Care Management, all exposed as MCP servers over patient, clinician, and insurer records.

🧪 Each trial runs 60–80 agent steps across 4–6 clinical stages, with access to 21 healthcare apps, 200+ MCP tools, and a 1,279-document operations handbook.

Leaderboard results across 30 frontier agents:
• Claude Code + Opus 4.6: 28% pass@1
• Codex + GPT-5.5: 21%
• Utilization review: 41%
• Care management: 32%
• Prior authorization: 29%
Reliability remains a major challenge: no agent exceeds 20% when the same case is repeated three times.

CaimingXiong's tweet photo.

8

55

19

21

3K

XiangchenSong retweeted

23 days ago

1/🧵 Can AI agents automate U.S. healthcare workflows end to end, given just clinician & insurer apps and an operations and medical policy library? Introducing CHI-Bench: 75 long-horizon realistic healthcare workflows × 30 frontier agents. The best agent solves only 28%. #AIinHealthcare 👇

iscreamnearby's tweet photo. https://t.co/YoEtfHlVbu

12

45

24

25

64K


XiangchenSong retweeted

8 months ago

Stop restarting your long-running agents. Enterprise Deep Research (EDR) lets you steer mid-run, like driving a car. It can save you hours or even days of work. Open-source, enterprise-ready, built by @SFResearch. Try it & drop your use case below 👇 GitHub: https://t.co/5elr3XBrCG

4

22

5

15

9K

XiangchenSong retweeted

Kun Zhang-in pursuit of Causality with ML @kunkzhang

8 months ago

MBZUAI Machine Learning Winter School 2026: Representation Learning & GenAI (https://t.co/voU5FqSZE3) on Feb. 9-13, 2026, in Abu Dhabi, UAE. Application Deadline: Oct. 20, 2025! Join us for an exciting 5-day program with world-class researchers! Funding available! #MBZUAI

0

44

17

18

7K

XiangchenSong retweeted

Aashiq Muhamed @AashiqMuhamed

12 months ago

🧵 Your SAE learns different features each time? Struggling to convince people to trust your interpretations? Maybe you're only one architecture choice away from a solution. We formulate this as a Feature Consistency problem and show that high consistency is achievable!

AashiqMuhamed's tweet photo. https://t.co/kk2KiM2u2h

1

26

6

14

2K

XiangchenSong retweeted

almost 3 years ago

We present 🧩Retroformer🧩, which iteratively improves LLM agents by learning a plug-in retrospective model that, through policy gradient optimization, automatically refines the prompts with env-specific rewards. arXiv: https://t.co/zITi65Z14q #LanguageAgents #LLM

CaimingXiong's tweet photo. https://t.co/efDrAmb3Br

1

110

34

30

15K

XiangchenSong retweeted

Kun Zhang-in pursuit of Causality with ML @kunkzhang

almost 3 years ago

Registration deadline of #UAI2023 (39th Conf. on Uncertainty in #Artificialintelligence) is July 24! It will take place @CarnegieMellon, Pittsburgh from 07/31-08/04. Check out the beautiful @PhippsNews for the banquet: https://t.co/otjYYqwWWZ

0

26

14

0

7K

XiangchenSong retweeted

Kun Zhang-in pursuit of Causality with ML @kunkzhang

almost 3 years ago

Four days left for early registration for #UAI2023: https://t.co/qye8lWgtCO #UAI2023. UAI 2023 will take place at Carnegie Mellon University, Pittsburgh, PA, USA, Jul 31-Aug 4, with banquet @PhippsNews Phipps Conservatory and Botanical Gardens!

0

30

14

2

8K

XiangchenSong retweeted

about 3 years ago

We are organizing a @UncertaintyInAI workshop on the #History and #Development of Search Methods for #CausalStructure. Welcome submissions of "Case Studies of Applied Causal Discovery", either successful or not. For details see https://t.co/c4konY5tJJ

0

14

11

1

8K

XiangchenSong retweeted

Kun Zhang-in pursuit of Causality with ML @kunkzhang

about 3 years ago

Registration for UAI 2023 is now open! https://t.co/W1qXhYuAF1 #UAI23 @UAI2023 will take place at Carnegie Mellon University, Pittsburgh, PA, USA Jul 31-Aug 4, with banquet @PhippsNews Phipps Conservatory and Botanical Gardens! Early bird deadline is June 22. See you there!

2

61

15

7

8K

XiangchenSong retweeted

Kun Zhang-in pursuit of Causality with ML @kunkzhang

over 3 years ago

UAI 2023 looks forward to seeing you at Carnegie Mellon University from July 31 to Aug. 4, 2023. Thanks to our local team and CMU for making things happen!

0

57

12

3

10K

XiangchenSong retweeted

CLeaR-Conference on Causal Learning and Reasoning @Conf_CLeaR

almost 4 years ago

The CLeaR society is delighted to announce that we are organizing the 2023 edition of CLeaR in Tübingen, Germany. The submission deadline will be around mid-October. Details will be released shortly. Please stay tuned!

0

113

32

5

0

XiangchenSong retweeted

almost 4 years ago

We've just released Betty, a PyTorch library for generalized meta-learning (GML) and multilevel optimization (MLO)! Betty gives a unified programming interface for applications including HPO, NAS, MAML, RL, and more. Code: https://t.co/LTU20uDFOR Paper: https://t.co/IwIZP1OInI

sangkeun_choe's tweet photo. https://t.co/5hTcvMoprv

2

102

21

25

0

XiangchenSong retweeted

uai2026 @UncertaintyInAI

about 4 years ago

We are happy to announce that the UAI 2022 program committee is carefully reviewing the 730 submissions to the conference! We are looking forward to seeing you in Eindhoven, The Netherlands on August 1-5, 2022!

UncertaintyInAI's tweet photo. https://t.co/WHiLh1dwW6

0

52

14

0

0

XiangchenSong retweeted

Kevin Patrick Murphy

over 4 years ago

I am delighted to announce that a draft of my latest book, “Probabilistic Machine Learning: Advanced Topics”, is now available online at https://t.co/dSlKkwYpLr. It covers #DeepGenerativeModels, #BayesianInference, #Causality, #ReinforcementLearning, #DistributionShift, etc.

sirbayes's tweet photo. https://t.co/BbLFTNZSro

36

5K

962

1K

0

Xiangchen Song @XiangchenSong

over 4 years ago

Excited to serve as a workflow chair for UAI 2022 with Petar Stojanov. Paper submission deadline is February 25, 2022 (23:59 UTC). #UAI2022 @UncertaintyInAI https://t.co/ldtPGXomkH

0

1

0

0

0

Xiangchen Song @XiangchenSong

over 4 years ago

We are excited to release the Python causal-learn package for causal discovery! See the package (https://t.co/6fVGoEAkqV) and documentation (https://t.co/HA5r0Tf8Rg). Any feedback is welcome.

0

4

1

0

0

XiangchenSong retweeted

over 4 years ago

We are excited to release the Python causal-learn package for causal discovery! See the package (https://t.co/D0YK6ZqMjs) and documentation (https://t.co/kA2bwYtU1l). Any feedback is welcome.

2

376

78

110

0
