Alana Renda @alanamarzoev - Twitter Profile

about 1 month ago

Starting in 30 min! If you’re interested in deploying LLMs in decision making domains + reasoning under uncertainty come chat with me and @JillianRossA_

Alana Renda

@alanamarzoev

about 2 months ago

Heading to #ICLR2026 (@iclr_conf) 🇧🇷 to present OpenEstimate! As LLMs get deployed in decision-making domains, they're increasingly expected to do subjective probability estimation, drawing on everything they know to form beliefs about unknown quantities. Our paper studies this capability with a leakage-resistant benchmark. This sits at the intersection of a few things I care about: RL in hard-to-verify domains, forecasting, and making LLMs honest about what they don't know. Come find me Saturday 10:30–1 at poster #1716 in Pavilion 3! And if you'd like to grab coffee and chat about any of these, DMs are open!

alanamarzoev retweeted

Laura Ruis @LauraRuis

about 2 months ago

One piece of code can replace 100 chains of thought when training LLMs. Come chat with us tomorrow at the afternoon poster session #iclr2026 P3 poster #507 🕺

alanamarzoev retweeted

Jillian Ross @JillianRossA_

about 2 months ago

On my way to #ICLR2026 to present OpenEstimate with @alanamarzoev and give a spotlight talk at the FINAI Workshop. Over the past few years, @AndrewWLo and I have been studying whether LLMs can be trusted to give sound investment advice. In my talk, I'll show that LLMs demonstrate heuristic collapse: rather than weighing all relevant factors, they latch onto a few salient features and ignore the rest. Heuristic collapse has direct consequences for whether LLMs can meet the legal standard of a fiduciary — and for AI advisors more broadly. This is one of many reasons I think investing is one of the best domains for studying LLMs. Through this domain, I've been able to study LLM reasoning, human-LLM interaction, and emergent systemic effects. If you're working on any of these topics, I'd love to meet. Come find me before or after the talk on Monday at 1:35PM!

Alana Renda

@alanamarzoev

about 2 months ago

Link to full paper: https://t.co/qP46b8ES6r

alanamarzoev retweeted

Gabe Grand

@gabe_grand

7 months ago

Do AI agents ask good questions? We built “Collaborative Battleship” to find out—and discovered that weaker LMs + Bayesian inference can beat GPT-5 at 1% of the cost. Paper, code & demos: https://t.co/ZFPt46XYUj Here's what we learned about building rational information-seeking agents... 🧵🔽

alanamarzoev retweeted

Jacob Andreas @jacobandreas

8 months ago

👉 New preprint! We have lots of great benchmarks for tasks where it's possible, in principle, for models to get all the answers exactly correct. But what about tasks that *intrinsically* require reasoning about uncertain facts and quantities?

Alana Renda

@alanamarzoev

8 months ago

This was joint work with @JillianRossA_, @MikeCafarella, and @jacobandreas. We’ve open sourced our benchmark OpenEstimate to drive research and progress in this space. Stay tuned for more! 📝 Paper: https://t.co/dJkDBBNmJr ⚙️ Source code: https://t.co/KhBtz5wluA

Alana Renda

@alanamarzoev

8 months ago

🚨 New paper up on how LLMs reason under uncertainty! 🎲 Many real world uses of LLMs are characterized by the unknown—not only are the models prompted with partial information, but often even humans don't know the "right answer" to the questions asked. Yet most LLM evals focus on problems with clearly defined success criteria. There’s a gap in our understanding of how models perform in this setting. We investigate.... 🔎

Alana Renda

@alanamarzoev

8 months ago

All of that’s to say… There's a lot of room for improvement! And we’re starting to see some action– maybe new RL methods like RLCR from @MehulDamani2, @ishapuri101 could make things better 👀 https://t.co/CdrkcLoIgC

Mehul Damani

@MehulDamani2

11 months ago

🚨New Paper!🚨 We trained reasoning LLMs to reason about what they don't know. o1-style reasoning training improves accuracy but produces overconfident models that hallucinate more. Meet RLCR: a simple RL method that trains LLMs to reason and reflect on their uncertainty -- improving both accuracy ✅ and calibration 🎯. [1/N]

Alana Renda

@alanamarzoev

8 months ago

Dr. GRPO paper was presented at @COLM_conf today, and it's a great read: https://t.co/SnWBLvZKhg If I had a nickel for every time someone found a bug in a core ML algorithm, I would have at least two nickels

Alana Renda

@alanamarzoev

8 months ago

Bonjour from Montreal 🇨🇦 spending the next few days here @ COLM! DM me if you’re around and want to chat about research or non-research topics, including but not limited to: reasoning under uncertainty, forecasting, summarization/RAG, and startups

alanamarzoev retweeted

Alex Renda @alex_renda_

8 months ago

✈️ 🦙 Heading to COLM through Thursday! We’re hiring ML researchers at Jane Street for intern and full time roles, as well as supporting grad students through our fellowship program — DM me or stop by the JS booth if you want to chat about what we’re doing with ML @ JS!

Alana Renda

@alanamarzoev

over 1 year ago

me, deepresearch, and operator rn

Alana Renda

@alanamarzoev

over 1 year ago

after a week of deliberation finally took the leap and upgraded to the ChatGPT pro plan... feels like waking up on Christmas morning 🥲

alanamarzoev retweeted

Readyset

@readysetio

about 2 years ago

Streaming dataflow provides a unique solution to scaling OLTP applications. Want to learn how? Founder and CEO of Readyset, @alanamarzoev, will be giving a talk on this subject at @qconlondon on Tuesday, April 9th at 10:35AM BST! Learn more: https://t.co/L3JrIuWBY5

alanamarzoev retweeted

apuchitnis

@apuchitnis

over 2 years ago

caching can be really helpful to reduce backend load, but cache invalidation is famously one of the hard problems in CS. enter https://t.co/cpDJ9xjU20 - a cache that is **always in sync** with postgres, so you don't need to invalidate stale data 😮
