Nandan Thakur @beirmug - Twitter Profile

Pinned Tweet

My new Weaviate podcast from Argentina with a freshly brewed mate in hand!🧉🇦🇷 It's always fun to chat with @CShorten30: I talk about search agents, I'm amazed how this field has changed rapidly and share ORBIT findings & research ideas! 💚 https://t.co/a02zNMNSiY

Connor Shorten

@CShorten30

30 days ago

How do we train and evaluate Search Agents? 👾🔎

I am SUPER EXCITED to publish a new episode of the Weaviate Podcast with Nandan Thakur (@beirmug) on Search Agents! 🎙️💚

Firstly, congratulations to Nandan who has just completed his Ph.D. at the University of Waterloo advised by Professor Jimmy Lin (@lintool)! 🎉

During this time, Nandan published several impactful works such as BEIR 🍻, MIRACL 🌍🙌🌏, FreshStack 🥞, and many more.

This podcast dives into his new work on ORBIT and the current state of Search Agents! ⚛️

ORBIT contains 20K training examples, each one a complex, multi-hop question paired with a short verifiable answer. For example, "What was the runtime of the 2017 animated film set inside a smartphone, directed by..." (Answer: 86 minutes). 🎬

This dataset is used to train Search Agents on queries that require say 4 to 5 searches in order to answer.

The crazy part is that ORBIT was generated entirely without paid Web Search APIs! The entire pipeline runs on a 2018 Linux laptop driving DeepSeek's free chat interface! 💻♻️

Trained on ORBIT, Qwen3-4B beats InfoSeeker-4B by 4.3 EM and Search-R1-4B by 9.0 EM across 7 Wikipedia QA benchmarks.

A lot of interesting nuggets in this one! As always, I hope you find it useful and happy to discuss further! 👋

Nandan Thakur @beirmug

about 6 hours ago

@Robro612 @mixedbreadai Congratulations!

beirmug retweeted

Mixedbread @mixedbreadai

1 day ago

By now, everyone knows that single-vector embedding models are hugely limiting for modern workflows. But they contain more than you think: you can extract sparse Latent Terms from them. And it turns out that BM25 is all you need to turn this vocabulary into a strong retriever.

Nandan Thakur @beirmug

5 days ago

@bclavie Looking forward to the blogpost! Very interesting work :)

beirmug retweeted

Ben Clavié

@bclavie

5 days ago

Very excited to finally share this one after sitting on it for far too long! It's very topical now. Blog post coming very soon :)

beirmug retweeted

Negar Arabzadeh

@NegarEmpr

7 days ago

Grateful that my PhD thesis was recognized as one of the top dissertations in the 2026 Faculty of Mathematics Doctoral Prize at the @UWaterloo ! 🎉 And it is always especially nice to hear kind words from your PhD supervisor @claclarke . I guess that feeling never really goes away, even after you graduate. 😊 https://t.co/P6huzVj0Y9

Nandan Thakur @beirmug

7 days ago

@antoine_chaffin Is their model multilingual?

beirmug retweeted

TREC RAG @ 2026 @TREC_RAG

19 days ago

Does retrieval help RAG or did the LLM already memorize the answer? 🤔 Too often, the overlap between RAG corpora and what LLMs “know” is unclear. Better RAG evaluation needs tighter alignment between NLP and IR. 📚 That's why for RAG 2026 we are using @nvidia's ClimbMix corpus.

Nandan Thakur @beirmug

27 days ago

@ShuklaAngira Take a long vacation/break from research, disconnect so that you are fresh when you start your research. Highly recommend before a PhD.

beirmug retweeted

Weaviate Podcast

@weaviatepodcast

29 days ago

What is a Search Agent? 👾🔎 Here is a clip from our latest Weaviate Podcast discussing what separates Deep Research from Search Agents! 👇

beirmug retweeted

Omar Khattab

@lateinteraction

28 days ago

I’ve never been this excited about search. 6-7 years ago, IR got an influx of the paradigms we still use, all enabled by the big headroom MS MARCO and then BEIR created. Then progress slowed. Today, Diane releases perhaps the most ambitious IR benchmark to date: OBLIQ-Bench. Queries in it are meant to be increasingly opaque to current first-stage retrieval paradigms. Oblique queries put the bottleneck very early in the search process, as the relevance of a document to the query is quite latent. I can't wait for core IR research on fundamentally more powerful paradigms for first-stage search to be reignited again. Stay tuned for more stories about this, and read Diane's thread and her paper below!!

Nandan Thakur @beirmug

29 days ago · Ciudad Autónoma de Buenos Aires

Here is the link to the original ORBIT post discussed in the podcast: https://t.co/rXtCNnNMQD

Nandan Thakur @beirmug

about 2 months ago · Waterloo

Introducing ⚛️ORBIT, a 20K reasoning-intensive web training dataset for search agents frugally generated without relying on paid APIs. Small (<4B) search agents trained with ORBIT outperform others by up to 9.0 EM accuracy on single & multi-hop Wikipedia QA. 🧵 https://t.co/z6MM6blpOG

Nandan Thakur @beirmug

29 days ago · Buenos Aires City Region

My new Weaviate podcast from Argentina with a freshly brewed mate in hand!🧉🇦🇷 It's always fun to chat with @CShorten30: I talk about search agents, I'm amazed how this field has changed rapidly and share ORBIT findings & research ideas! 💚

Nandan Thakur @beirmug

29 days ago · Ciudad Autónoma de Buenos Aires

@AmelieTabatta @CShorten30 I try and I hope you enjoy it! 🙌🏻🙌🏻🙌🏻

beirmug retweeted

Connor Shorten

@CShorten30

30 days ago

YouTube: https://t.co/WD86DHQgBC Spotify: https://t.co/WHN6YftmeB

Nandan Thakur @beirmug

29 days ago

@din0s_ @CShorten30 Thank you! I'm late to this drinking mate, but hope you liked it! 🧉

beirmug retweeted

Weaviate Podcast

@weaviatepodcast

29 days ago

Weaviate Podcast #137 is live! Search Agents! 🎉 Here is an overview of chapters covered in the podcast 📜

Nandan Thakur @beirmug

29 days ago · Ciudad Autónoma de Buenos Aires

Haha, I can't emphasize enough that I think compute/funding should not limit us in academia. In ⚛️ ORBIT, I found that a single Linux laptop running non-stop for months is enough to generate a pretty good dataset; you don't need expensive APIs!

Connor Shorten

@CShorten30

29 days ago

Haha, Nandan working on Orbit ⚛️😃