Veniamin Veselovsky @VminVsky - Twitter Profile

VminVsky retweeted

17 days ago

Language models are becoming our default interface to facts. Yet their ability to *verify* facts can differ from their ability to *generate* them. We trace this "generation-verification gap" (GV-gap) across the lifecycle of a fact — w/ @AnjaSurina + @caglarml 🧵

1

49

15

31

6K

VminVsky retweeted

a16z @a16z

about 1 month ago

From "System of Record" to "System of Intelligence" In the next decade, you want to own the system of intelligence that pulls from the system of record, becomes the user’s one-stop shop for gaining context and taking action, and turns the SoR into something that’s primarily consumed at the API layer. The reasoning layer that sits above the database is where a new generation of companies is being built, and it’s where the majority of the next decade’s enterprise value of GTM software will end up. Full piece from a16z's Gio Ahern, Steph Zhang, and Alex Immerman: https://t.co/2udG6l6SSx

73

1K

183

2K

358K

VminVsky retweeted

Tim Davidson @im_td

about 2 months ago

we called this three years ago. the future is now: —> https://t.co/kKaGnuOInH 🤖💭🤖

0

7

1

2

2K

VminVsky retweeted

noscroll @noscroll

about 2 months ago

X has the best information on the internet and the worst incentives & culture. meet noscroll — the AI that doomscrolls it for you and texts you just the things that matter. no feed. no brainrot. no ragebait. just signal. try it for free → https://t.co/XqdExWR13j 🙅🏼‍♂️

78

860

263

853

778K

VminVsky retweeted

ORO

@oroagents

about 2 months ago

Introducing Oro, the biggest agent competition in the world. Powered by Bittensor, Live Now.

94

871

164

256

1M

Veniamin Veselovsky

@VminVsky

4 months ago

@coinbase @siddjain99

1

0

33

VminVsky retweeted

Jake Lewin @jkelwn

8 months ago

Introducing Dijie The only non-AI social network Order now for December 1st delivery

138

1K

55

443

114K

VminVsky retweeted

Giuseppe (Peppe) Russo

@russogiusep

8 months ago

🏆🏆🏆 Thrilled to share that our paper “The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates” received an Honorable Mention Award at @ACM_CSCW 2025 🎉 By analyzing thousands of ICLR peer reviews, we show that papers receiving AI-assisted reviews systematically receive higher scores and are more likely to be accepted. Thanks to my amazing coauthors for making this possible! @manoelribeiro, @im_td , @VminVsky , @cervisiarius

1

16

4

3

3K

VminVsky retweeted

Bob West @cervisiarius

8 months ago

🚨New paper alert! 🚨 Tandem Training for Language Models https://t.co/Emzcgf1KHx Actions & thoughts of AI w/ superhuman skills will be hard for humans to follow, undermining human oversight of AI. We propose a new way to make AI produce human-understandable solutions. How?👉🧵

4

83

25

42

8K

Veniamin Veselovsky

@VminVsky

8 months ago

terrific effort from sayash and benedikt! if you’re working on evals this is a must read.

Sayash Kapoor @sayashk

8 months ago

📣New paper: Rigorous AI agent evaluation is much harder than it seems. For the last year, we have been working on infrastructure for fair agent evaluations on challenging benchmarks. Today, we release a paper that condenses our insights from 20,000+ agent rollouts on 9 challenging benchmarks spanning web, coding, science, and customer service tasks. Our key insight: Benchmark accuracy hides many important details. Take claims of agents' accuracy with a huge grain of salt. 🧵

20

421

94

371

86K

0

3

1

2

1K

Veniamin Veselovsky

@VminVsky

8 months ago

@itsCathyDi @KindredVentures @stevejang @Saga_Ventures @maxaltman @braveben @itsthomson @dedaluslabs @WindsorNguyen congrats cathy and windsor!!

0

1

0

160

Veniamin Veselovsky

@VminVsky

8 months ago

phenomenal work by @exnx and co!

Radical Numerics

@RadicalNumerics

8 months ago

Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B params (3B active) with a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to catalyze further research on DLM inference and post-training. We are researchers and engineers (DeepMind, Meta, Liquid, Stanford) building the engine for recursive self-improvement (RSI) — and using it to accelerate our own work. Our goal is to let AI design AI. We are hiring.

103

1K

251

712

848K

0

3

0

1

637

VminVsky retweeted

Francesco Salvi @fraslv

8 months ago

🌱✨ Life update: I just started my PhD at Princeton University! I will be supervised by @manoelribeiro and affiliated with @PrincetonCITP. It's only been a month, but the energy feels amazing —very grateful for such a welcoming community. Excited for what’s ahead! 🚀

25

600

19

68

47K

Veniamin Veselovsky

@VminVsky

9 months ago

@jkminder @wendlerch @cervisiarius congrats julian!!

0

2

0

134

Veniamin Veselovsky

@VminVsky

9 months ago

nlp goat

Benedikt Stroebl

@benediktstroebl

9 months ago

Two papers on agent safety got accepted to NeurIPS 2025! 🥳 1) Dynamic Risk Assessments for Offensive Cybersecurity Agents https://t.co/kLWCDUOxRG 2) Safety Devolution in AI Agents https://t.co/aqAxX0Fh9D

4

31

4

17

8K

0

2

0

854

Veniamin Veselovsky

@VminVsky

10 months ago

@Rg2official @ycombinator @sellwithatext @benediktstroebl For now we're only in SF / Bay Area. But will roll out across the USA soon. Text your zip code to the number and we'll notify you when we're in your area!

0

2

0

52

Veniamin Veselovsky

@VminVsky

10 months ago

@cdxker @ycombinator @sellwithatext @benediktstroebl should be fixed!

0

67

Veniamin Veselovsky

@VminVsky

10 months ago

Selling your stuff sucks. With @sellwithatext you can sell it with a text. Just send us a photo of what you want to sell and then we'll do the rest. List, negotiate, answer questions, handle delivery. Launching in the Bay Area for now (text your zip code if you're outside)!

4

21

1

0

2K

VminVsky retweeted

Y Combinator

@ycombinator

10 months ago

Rid (@sellwithatext) is making selling easier than buying. Just text them a photo of what you want to sell and they'll do the rest. Finding buyers, negotiating, answering questions, and picking up the item. Congrats on the launch, @vminvsky & @benediktstroebl! https://t.co/n6OdMdkc7A

19

214

10

78

28K

VminVsky retweeted

Clément Dumas

@Butanium_

12 months ago

This work got accepted to ACL 2025 main! 🎉 In this updated version, we extended our results to several models and showed they can actually generate good definitions of mean concept representations across languages.🧵

1

43

6

3

2K
