Swetava Ganguli @Swetava - Twitter Profile

Swetava retweeted

@alexxubyte

4 days ago

An Ex-Meta L8’s Agentic Engineering Setup

In this guest article, @kunchenguid shares the agentic engineering workflow he uses on a day-to-day basis.

Read the full article here: https://t.co/4mXyh1EiFF

Swetava retweeted

Chao Ma

@ickma2311

11 days ago

CMU Advanced NLP Lecture 9: Decoding Algorithms

This lecture explains a key aspect of generative LLMs:

The model learns a probability distribution, but useful generation still depends on how we decode from that distribution.

🔹 Greedy decoding picks the most likely token each step, but local best choices may not produce the best full sequence.

🔹 Beam search keeps multiple candidate paths, making decoding closer to sequence-level optimization.

🔹 Sampling turns probabilities into diverse outputs, but naive sampling can become incoherent because of long-tail tokens.

🔹 Top-k, top-p, and temperature control the tradeoff between quality, diversity, and randomness.

The key idea: LLM generation is not just “the model predicts words.” It is model probabilities + decoding strategy.

My note: https://t.co/ucwWjpPowE
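
The decoding strategies listed in this tweet can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the lecture or the linked note: the five-word vocabulary, the logits, and every function name are hypothetical, and beam search is left out to keep the example short.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(probs):
    """Greedy decoding: index of the single most likely token."""
    return max(range(len(probs)), key=lambda i: probs[i])

def top_k_filter(probs, k):
    """Top-k: keep the k most likely tokens, zero the rest, renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

def top_p_filter(probs, p):
    """Top-p (nucleus): keep the smallest high-probability set with mass >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]

def sample(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical toy vocabulary and logits with a long tail ("zebra", "qux").
vocab = ["the", "a", "cat", "zebra", "qux"]
logits = [2.0, 1.5, 1.0, -1.0, -3.0]
rng = random.Random(0)
probs = softmax(logits)

print("greedy:", vocab[greedy(probs)])
print("top-k (k=2):", vocab[sample(top_k_filter(probs, 2), rng)])
print("top-p (p=0.9):", vocab[sample(top_p_filter(probs, 0.9), rng)])
print("temperature 0.5:", [round(q, 3) for q in softmax(logits, 0.5)])
```

Both filters cut off the long tail before sampling, which is the failure mode the tweet attributes to naive sampling, while temperature reshapes the whole distribution instead of truncating it.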

Swetava retweeted

Arjun

@arjunkocher

19 days ago

RL Algorithm Interview Questions 2026 (as compiled by @sheriyuo) https://t.co/sNLyXanzaP

Swetava retweeted

Phosphen

@phosphenq

about 1 month ago

Jane Street pays $650,000 a year for quants. Stanford just released the exact RL-for-trading bible for free.

16 chapters. 0 to algo trader. Asset allocation, market making, American option exercise, full Python code & Colab notebooks.

Bookmark & give it a weekend.

Swetava retweeted

Rohit Kumar Tiwari

@_rohit_tiwari_

about 1 month ago

This 115-page book unlocks the secrets of LLM fine tuning.

https://t.co/Uhs8edPUV8

A comprehensive guide which covers:
> the fine-tuning process for LLMs
> combining both theory and practice.

Swetava retweeted

Xiuyu Li

@sheriyuo

about 1 month ago

The Hands-on Modern RL tutorial everyone has been waiting for is finally available in English🥳🥳🥳

PDF download link: https://t.co/LgeX8gXBqT

Swetava retweeted

Oier Mees

@oier_mees

about 1 month ago

The recording of Lucas Beyer's (@giffmana) lecture at @ETH is now live on YouTube for everyone who couldn't join us in person!

This past Monday, we had the pleasure of hosting Lucas (@Meta @AIatMeta Superintelligence Labs) for our "Robot Learning: From Fundamentals to Foundation Models" course. He joined us to talk about: "Vision in the Age of LLMs".

Drawing from a remarkable track record in computer vision and multimodal AI (ViT, SigLIP, PaliGemma) 🧠, Lucas delivered a masterclass on the frontier of multimodal foundation model training: from pre-training to post-training, where the field stands today, and what comes next 🚀

📽️ YouTube Recording: https://t.co/wNz1NwYkvb
📚 Course Website: https://t.co/DoQUYy3MjB

Swetava retweeted

Stefano Ermon

@StefanoErmon

about 2 months ago

Excited to see my student’s work on Flux Matching out. It turns out you can learn a much broader class of vector fields with the data distribution as stationary (not just the score). This lets you enforce useful properties like fast mixing, and it already works on high-dimensional image datasets!

Swetava retweeted

Stanford NLP Group

@stanfordnlp

about 2 months ago

Many roughly know how a transformer works

To REALLY understand modern neural LMs—MoEs, GPU tiling, kernels, RLHF, data—you need CS336

By @tatsu_hashimoto, @percyliang

The 2026 edition appears on yt with ~2 weeks delay
https://t.co/iEWTqEivvB

Materials
https://t.co/E1pzUSC6Tr

Swetava retweeted

tetsuo

@tetsuoai

about 2 months ago

Forty minutes of whiteboard. The full transformer architecture. Then open vim and write it in C.

Swetava retweeted

Vivek Galatage

@vivekgalatage

about 2 months ago

A Modern Primer on Processing-In-Memory https://t.co/p9IQdmoFb7

Swetava retweeted

Xiuyu Li

@sheriyuo

about 2 months ago

Never forget the name, QKRoPE
Now dim and interpolation changed again.

Swetava retweeted

Anthropic

@AnthropicAI

about 2 months ago

New Anthropic research: Natural Language Autoencoders. Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read. Here, we train Claude to translate its activations into human-readable text.

Swetava retweeted

Albert Gu

@_albertgu

about 2 months ago

Introducing a new sequence model Raven which pushes the boundary of fixed-state-size sequence models!

Raven bridges popular linear-time models with constant state capacity, like SSMs and sliding window attention (SWA). Like SWA, its state is a finite set of slots; unlike SWA, Raven learns to selectively choose which slots to update with each new token it caches. This is a much more principled update mechanism that leads to dramatically better retrieval abilities than prior linear models.

I personally don't think SWA is a very principled model - but it's convenient and works well empirically - and am most excited to see Raven be used as a strictly better drop-in replacement. More broadly the framework it develops hopefully introduces more ideas to combine the strengths of SSM-like and attention-like models.

This work was led by @rshia_afz and @avivbick

Swetava retweeted

Vivek Galatage

@vivekgalatage

about 2 months ago

Yesterday's lecture on GPU Architectures by @_onurmutlu_ https://t.co/XnHP64b19U

Swetava retweeted

Nathan Lambert

@natolambert

about 2 months ago

These are also the two books I recommend to people wanting a foundation in post-training. Both coming to print this summer!

Swetava retweeted

Chao Ma

@ickma2311

about 2 months ago

David Silver RL Lecture 7: Policy Gradient Methods

The main story is actor-critic.
Pure policy gradient learns from full returns:
unbiased, but high variance.

Actor-critic adds a critic:
🔹 actor updates the policy
🔹 critic estimates how good the action was
🔹 lower-variance signal
🔹 online step-by-step learning

Actor-critic feels like the bridge between policy optimization and value learning.

My note: https://t.co/et5hAJFzcb
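
The actor and critic roles described in this tweet can be sketched as a one-step, tabular actor-critic. This is an illustrative sketch under stated assumptions, not code from the lecture or the linked note: the four-state chain environment, the learning rates, and all names are hypothetical. The critic's TD error stands in for the full return, which is the source of the lower-variance, step-by-step signal the tweet mentions.

```python
import math
import random

# Hypothetical chain: states 0..3, start at 0, reward 1 on reaching state 3.
N_STATES, GOAL = 4, 3
GAMMA, ALPHA_ACTOR, ALPHA_CRITIC = 0.95, 0.1, 0.2

def step(state, action):
    """Action 1 moves right, action 0 moves left (clamped at state 0)."""
    nxt = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done

def policy(theta, s):
    """Softmax over the action preferences of state s."""
    m = max(theta[s])
    exps = [math.exp(t - m) for t in theta[s]]
    z = sum(exps)
    return [e / z for e in exps]

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # actor parameters
    V = [0.0] * N_STATES                           # critic estimates
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap the episode length
            probs = policy(theta, s)
            a = rng.choices([0, 1], weights=probs)[0]
            s2, r, done = step(s, a)
            # Critic: the TD error says how much better the step was than expected.
            target = r + (0.0 if done else GAMMA * V[s2])
            delta = target - V[s]
            V[s] += ALPHA_CRITIC * delta
            # Actor: move along grad log pi(a|s), scaled by the TD error.
            for b in (0, 1):
                grad = (1.0 if b == a else 0.0) - probs[b]
                theta[s][b] += ALPHA_ACTOR * delta * grad
            if done:
                break
            s = s2
    return theta, V

theta, V = train()
for s in range(GOAL):
    print(f"state {s}: P(right) = {policy(theta, s)[1]:.2f}, V = {V[s]:.2f}")
```

Both updates happen after every single transition, so nothing waits for the episode to end. A pure policy gradient method such as REINFORCE would replace `delta` with the full sampled return.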

Swetava retweeted

Alex Stauffer

@alexstauffer_

about 2 months ago

We post-trained a 3B model with RL to beat Opus on spreadsheet retrieval. Faster, cheaper, more accurate.

- If a piece of your agent loop is narrow, verifiable, and highly repeatable, a tiny trained model might beat the frontier.
- The application layer is still early and new verticals are opening fast. Cheap domain specialists orchestrated by a frontier model that only spends tokens on judgment is a bet worth watching.

Swetava retweeted

Xiuyu Li

@sheriyuo

about 2 months ago

This is exactly why Anthropic spends enormous effort on preventing reward hacking in every generation of RL training.

In highly constrained RLVR settings, reward hacking can be relatively limited because the verifier is narrow and deterministic. But once you move toward reward models, open-ended environments, or agentic interaction, reward hacking never truly disappears. The model will always search for shortcuts inside the reward surface.

Restrict KL divergence? Better verifiers? LLM-as-a-verifier?

Probably the clearest article explaining Anthropic’s work on reward hacking prevention (Chinese): https://t.co/uTFBPHoHju

Swetava retweeted

elie

@eliebakouch

about 2 months ago

this is fascinating, they train an encoder/decoder but use LLM matching the target model's shape for each part, so the latent space is just plain language and they can detect reward hacking, unwanted behavior and more

could even see it being used as an eval to quantify how smart a model is, i love this
