Rathul Anand @vendablechart - Twitter Profile

Pinned Tweet

10 months ago

We won 1st place at the @JaneStreetGroup x @GPU_MODE Hackathon in NYC this weekend! 🚀⚡

Our challenge was to build a real-time inference server for an ensemble of sequential stateful models (Mamba2, xLSTM, etc.) processing streaming market data. The goal was to maximize the PnL of our trading system, which demanded high model accuracy while decreasing latency, maximizing throughput, and maintaining live uptime in production. We worked across the full inference stack, from high-level batching algorithms to GPU utilization optimizations.

Our production engine ultimately processed ~400 requests per second with ~30 ms of latency, achieving the highest PnL of the competition! Our key techniques:
💡 Dynamic batching and state management to maximize throughput while preserving sequential inference accuracy
💡 Profiled and eliminated CPU <-> GPU communication overhead, removing synchronization points and bottlenecks
💡 Reduced kernel launch overhead with PyTorch optimizations like torch.compile
💡 Fast state expansion/reduction strategies to minimize batch latency
💡 Explored model quantization and custom Triton kernels to fuse operations and improve GPU utilization

Huge thanks to @marksaroufim and @GuggerSylvain for designing a deeply interesting, open-ended technical challenge, for a smooth contest experience, and for the real-world insights throughout!

6

71

4

46

9K
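The "dynamic batching and state management" idea from the pinned tweet can be illustrated with a minimal sketch. This is not the team's implementation: `Request`, `StatefulBatcher`, and `toy_step` are hypothetical names, and the recurrent step is a plain exponential moving average standing in for a real model such as Mamba2 or xLSTM. The sketch assumes that each stream's requests must be applied in arrival order, so a batch takes at most one request per stream, gathers the per-stream states, runs one batched step, and scatters the updated states back.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    stream_id: str  # hypothetical key identifying one data stream
    value: float    # hypothetical scalar input for this tick


def toy_step(states, inputs, alpha=0.5):
    """Stand-in for a batched recurrent step (assumption, not a real model).

    Each new state is an exponential moving average of the stream's inputs.
    """
    new_states = [(1 - alpha) * s + alpha * x for s, x in zip(states, inputs)]
    return new_states, list(new_states)  # (updated states, outputs)


class StatefulBatcher:
    def __init__(self, max_batch_size=4):
        self.max_batch_size = max_batch_size
        self.queue = deque()  # pending requests in arrival order
        self.states = {}      # stream_id -> latest recurrent state

    def submit(self, request):
        self.queue.append(request)

    def next_batch(self):
        """Take up to max_batch_size requests, at most one per stream."""
        batch, seen, deferred = [], set(), deque()
        while self.queue and len(batch) < self.max_batch_size:
            req = self.queue.popleft()
            if req.stream_id in seen:
                deferred.append(req)  # same stream twice would break ordering
            else:
                seen.add(req.stream_id)
                batch.append(req)
        # Return deferred requests to the front, keeping arrival order.
        self.queue.extendleft(reversed(deferred))
        return batch

    def run_batch(self):
        batch = self.next_batch()
        if not batch:
            return {}
        # Gather: build the batched state from per-stream storage.
        states = [self.states.get(r.stream_id, 0.0) for r in batch]
        inputs = [r.value for r in batch]
        new_states, outputs = toy_step(states, inputs)
        # Scatter: write each updated state back to its stream.
        for r, s in zip(batch, new_states):
            self.states[r.stream_id] = s
        return {r.stream_id: out for r, out in zip(batch, outputs)}


batcher = StatefulBatcher(max_batch_size=4)
for sid, v in [("A", 1.0), ("B", 2.0), ("A", 3.0), ("C", 4.0)]:
    batcher.submit(Request(sid, v))

first = batcher.run_batch()   # A, B, C run together; second A is deferred
second = batcher.run_batch()  # deferred A runs on top of A's updated state
print(first, second)
```

The second request for stream A is held back one batch so that it sees the state produced by the first, which is what keeps batched results identical to strictly sequential inference in this toy setting.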

Rathul Anand

@vendablechart

about 1 month ago

@CFC_Berungee @RealAdiRao i believe so

0

1

0

44

vendablechart retweeted

Tri Dao

@tri_dao

3 months ago

The frontier has increasingly shifted to hybrid models - from Qwen to Kimi-Linear and now with NVIDIA's Nemotron-3 Super - that rely on a strong linear sequence model. Today we release Mamba-3, the most powerful linear model to date. https://t.co/OpMmqEWMkP

11

842

113

329

78K

vendablechart retweeted

Tri Dao

@tri_dao

4 months ago

I’m unreasonably excited about the fact that we wrote everything in Cute-DSL, embedded in Python. Installing / “compiling” now takes seconds instead of minutes / hours (looking at you, C++ templates). Try pip install fa4!

4

428

19

57

29K

Rathul Anand

@vendablechart

7 months ago

@Waymo pleasanton didn’t make the cut 🥲

0

2

0

48

Rathul Anand

@vendablechart

8 months ago

@PingbangHu @MorganStanley Congrats!

0

1

0

104

Rathul Anand

@vendablechart

9 months ago

@vikhyatk gluon 👀 https://t.co/SoZIlVWnZg

0

1

0

1

136

Rathul Anand

@vendablechart

10 months ago

Check out our implementation at https://t.co/ttbyyLJH6c :) This wouldn’t be possible without my incredible teammates Kyle Yu (https://t.co/5nka3EhSDu) and Aswinkumar (https://t.co/fnhIIB8Vgx) 🙌 Especially grateful to the Jane Street team and the broader GPU MODE community for giving us a taste of the demands of ML infra for low-latency trading, to Tri Dao and the PyTorch team for sharing their insights on the future of GPU programming models, and to CoreWeave and Northflank for the H100s and support! 💸

0

10

0

5

825

Rathul Anand

@vendablechart

10 months ago

@GPU_MODE @JaneStreetGroup 👀👀👀

0

2

0

1K

vendablechart retweeted

GPU MODE

@GPU_MODE

10 months ago

Final moments at the @JaneStreetGroup hackathon today, everyone monitoring the leaderboard before the winners are declared https://t.co/nx20gY6uqF

6

410

12

70

33K

Rathul Anand

@vendablechart

10 months ago

@DavidSHolz 👋

0

1

0

35

Rathul Anand

@vendablechart

11 months ago

cool use of @semgrep to secure the triton inference server!

Trail of Bits

@trailofbits

11 months ago

Today, we’re disclosing two 9.8 CVSS memory corruption vulnerabilities in the @NVIDIA Triton Inference Server that let attackers crash production AI services through malicious HTTP requests (CVE-2025-23310 and CVE-2025-23311) 🧵

1

91

19

14K

0

1

0

429

vendablechart retweeted

Cynthia Wang

@cynthwangg

11 months ago

We just wanted to cowork past 5pm. Turns out the entire city did too :) @samanthaaouyang and I first crossed paths in Turkey earlier this year. In April, we reconnected at a women founders’ event and nerded out for hours about café culture, cities that sleep too early, and what it would take to build something different this summer. Two days later, we had a full Notion doc and a dream for a late-night popup called Elsewhere. A third space that would be a little less lonely than working from home, but a little more magical than your usual café. When @elsewhere_today launched on July 16th, we got way more inbound than expected: 100K+ views in 24 hours and 700+ new followers from almost nothing. Still, the real test: would SF actually show up? 400+ people bought in. During our first pop-up last Thursday, we had a line out of the door before 7pm. We served hojicha lattes, blue Thai tea, crepe cakes, fruit tarts, coconut macarons, and, of course, endless matcha. People brought their laptops and stayed there working until 11pm on passion projects. Someone even paid in USDC on Solana! (other chains tap in 👀) At first, some skeptics on X were quick to dismiss “just another late-night café attempt.” But honestly, if we were able to make SF even 1% more alive that night, I think that’s beautiful :) Huge thanks to the best team in the world: Sam, Akshaya (@akshayadinesh19), and Daniel (@Bluezmango123). Thank you to Fayeeza and the Entrepreneurs First team (@join_ef) for an incredible venue. And much love to Jade and Homeroom for an amazing collab! Let us know what other cities to stop by! Stay tuned for more from Elsewhere ❤️

46

216

17

22

27K

Rathul Anand

@vendablechart

11 months ago

@_a20m_ peak performance

0

38

Rathul Anand

@vendablechart

11 months ago

@cynthwangg @elsewhere_today @_worma_ thanks for organizing! was super fun to meet everyone :)

0

33

Rathul Anand

@vendablechart

11 months ago

yum! @elsewhere_today

1

5

1

0

3K

Rathul Anand

@vendablechart

11 months ago

@elsewhere_today (mango) matcha 🥭🍵!

1

2

0

145

vendablechart retweeted

Semgrep

@semgrep

12 months ago

🔥 Semgrep is officially live on Cursor!

You can now harness the power of @semgrep directly in your AI coding assistant, combining fast, accurate static analysis with LLMs to help developers ship code that’s secure from the start, fast.

From securing code at leading AI companies to joining the @cursor_ai tools ecosystem, Semgrep is becoming essential for dev-first security in the modern stack.

Shoutout to our team for making this integration happen, and to our customers, partners, and community for pushing us forward 🚀

https://t.co/f0HxFhr02D

#Cursor #AppSec #DeveloperTools #SecureCoding #LLM #StaticAnalysis

1

54

9

36

11K

Rathul Anand

@vendablechart

12 months ago

@cory security research intern @semgrep, would love to learn more!

1

0

519
