NovaSky

7 days ago

RT @charlie_ruan: Excited to have supported @trajectorylabs with the SkyRL team over the past month, bringing training onto their own clust…

NovaSkyAI retweeted

Ronak Malde

@rronak_

7 days ago

Today, @MichaelElabd, @QuantumArjun, and I are excited to announce Trajectory. We are a research lab and product company building the platform for Continual Learning. Our platform unlocks the signal already sitting in product usage, so companies can continuously post-train large-scale agentic models that outperform the frontier. @trajectorylabs

We’ve raised $15M from @Conviction, @BessemerVP, @radicalvcfund, @jeffdean, @drfeifei and more.

We’re partnering with some of the best AI-native companies: @ClayRunHQ, @Harvey, @DecagonAI, @mercor_ai, @RogoAI to power their agentic systems, some of which we are already in production with.

We’ve brought together a world-class research team from DeepMind, OpenAI, Apple, Meta Superintelligence, Amazon AGI, Scale AI, and an elite product team from Stripe and Figma.

AI will never again start on day one. Every correction, every retry, every edit will make products smarter. This is Continual Learning.

NovaSkyAI retweeted

16 days ago

Super cool work built with SkyRL! Have also thought about how it's so wasteful not to train on the observation tokens. Can't wait to upstream it:)

NovaSkyAI retweeted

22 days ago

A 4B RLM trained with SkyRL that matches Sonnet 4.6, very cool work!

about 1 month ago

Check out VLM training in SkyRL! SFT / multi-turn RL, LoRA / full fine-tuning, and Tinker-compatible as well!

Nithin Chalapathi

@nithinch10

about 1 month ago

SkyRL now supports end-to-end vision-language post-training, from SFT to agentic RL, and adds vision model support to SkyRL’s Tinker interface! Existing multimodal cookbooks, e.g. VLM classification, work out of the box: https://t.co/LSv79llkAk

NovaSkyAI retweeted

Ziming Mao

@ziming_mao

about 2 months ago

🚀 Excited to share the training & inference results for UCCL-EP: a portable, high-performance expert-parallel communication library across heterogeneous GPU + NIC hardware.
💻 Code: https://t.co/wVWiso8ajS
📝 Blog: https://t.co/PSRH7CQpK6
📈 Highlights:
• Up to 45% faster Megatron-LM training vs RCCL on 128 AMD GPUs
• Up to 40% faster SGLang inference vs NCCL on 32 H200 GPUs
• Up to 25% lower vLLM TPOT vs NCCL
• Up to 2.3x better EP dispatch/combine on AWS EFA
🔁 Fully portable across heterogeneous GPU/NIC hardware and a drop-in replacement for DeepEP

Amazing team: Chon Lam Lao, @yangzhouy, Yihan Zhang, Chihan Cui, Zhongjie Chen, Zhiying Xu, @KaichaoYou, Zhen Huang, Zhenyu Gu, Costin Raiciu, Scott Shenker, @istoica05

Aditya Soni @Aditya_Soni_8

2 months ago

Great work from the @OpenHandsDev community and CMU! Open source SOTA on code localization via RL. Happy to see the beautiful reward curves trained with SkyRL!

3 months ago

Can we train code agents to search relevant locations in a codebase only using a terminal?
Introducing CodeScout: an effective RL recipe for code search 🚀

🏆 Outperforms 18x larger OSS LLMs
🔥 Comparable to proprietary LLMs
📈 SoTA on SWE-Bench Verified, Pro, & Lite
🧵 [1/N] https://t.co/zBZP6axzzy

NovaSkyAI retweeted

vLLM

@vllm_project

3 months ago

Excited to see SkyRL sharing their work on inference and vLLM in RL at the LLMs on Ray office hours this Thursday. If you’re exploring using vLLM in RL workflows, this will be a great session to join. See you there 👇

3 months ago

We’ve been consistently surprised lately by how capable frontier models are at handling complex kernel implementation and system optimization. Check out this work as a step toward automating AI infrastructure building!

Shiyi Cao

@shiyi_c98

3 months ago

Introducing our new work K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model, a new paradigm for automated GPU kernel generation, achieving SoTA results.

🔍 Big insight:
Traditional methods treat LLMs as stochastic code generators inside heuristic loops, but this misses a key point: LLMs are powerful planners with rich domain priors.

🧠 Core idea:
K-Search uses the LLM itself as a co-evolving world model: one that plans + updates beliefs + guides search decisions based on experience.

📌 This decouples high-level strategy (intent) from low-level code implementation, allowing the optimizer to pursue multi-step transformations even when intermediate implementations don’t immediately improve performance.

📈 Key results:
🔥 Our discovered kernels achieve ~2.10× average speedup vs state-of-the-art evolutionary search across 4 FlashInfer kernels on H100/B200.
🔥 Up to 14.3× gain on complex Mixture-of-Experts (MoE) kernels.
🔥 State-of-the-art performance on GPUMode TriMul (H100) task, beating both automated and human solutions.

🙏 Acknowledgements
This work is developed in @BerkeleySky, w/ the amazing @ziming_mao, @profjoeyg, and @istoica05. We thank @DachengLi177, @MayankMish98, @randwalk0, @pgasawa, @fangz_zzu, and @tian_xia_ for helpful discussion and feedback.

We also thank the generous compute support from @databricks, @awscloud, @anyscalecompute, @nvidia, @Google, @LambdaAPI, and @MayfieldFund.

👨‍💻 GitHub: https://t.co/YJJ9SYvTvD
📄 arXiv: https://t.co/JtZDnZBkKM

3 months ago

Excited to see SkyRL being used by systems research to study how agentic RL workloads can be optimized!! https://t.co/8Y0vDkK8Ye

Hao Kang

@GT_HaoKang

4 months ago

🔥 Modify 2 lines of code and get your agentic serving/rollout up to 3.9x faster, losslessly!
⚡️ Say hello to ThunderAgent, a fast, simple, and program-aware agentic inference system.
🥇 We propose a program abstraction to schedule all GPU and CPU resources, the first principled approach for distributed agentic inference and rollout.
🌐 Blog: https://t.co/PAcgTZzlhD
💻 Code: https://t.co/nr7XJj1L7B
📜 Paper: https://t.co/aCD6POzwkU
#AI #ThunderAgent #LLMAgent #Mlsys 1/n

4 months ago

Train your terminal-use agents with SkyRL+Harbor!

4 months ago

Releasing the official SkyRL + Harbor integration: a standardized way to train terminal-use agents with RL.

From the creators of Terminal-Bench, Harbor is a widely adopted framework for evaluating terminal-use agents on any task expressible as a Dockerfile + instruction + test script.

This integration extends it: the same tasks you evaluate on, you can now RL-train on.

Blog: https://t.co/yDyId02UfH
🧵

NovaSkyAI retweeted

4 months ago

A 30-minute talk on SkyRL’s most recent updates:)
