Rohan Jha @Robro612 - Twitter Profile

Rohan Jha @Robro612

about 6 hours ago

@antoine_chaffin @derangineer With a repo that beautiful the code is the documentation

Robro612 retweeted

Eugene Yang @EYangTW

2 days ago

RAGTIME is coming back for the 2nd edition! Come join us to evaluate your search agents! We once again feature multilingual and fully human evaluation. This is the ultimate way to know whether your systems are doing better.

Robro612 retweeted

debashish 🦘

@derangineer

about 7 hours ago

did someone say late interaction? it’s getting serious now

Rohan Jha @Robro612

about 14 hours ago

@helloiamleonie @liquidai @maximelabonne @paulabartabajo_ Congratulations Leonie!

Rohan Jha @Robro612

about 14 hours ago

@antoine_chaffin hell yeah

Rohan Jha @Robro612

about 15 hours ago

Please consider this my announcement / shill warning. Super excited to be @mixedbreadai 🍞 this summer! Lots of cool stuff to be done in the first-stage / agentic retrieval areas!

gautham

@capemox

about 15 hours ago

@Robro612 @bclavie @mixedbreadai woah I just noticed you joined mxbai, congrats!

Rohan Jha @Robro612

about 15 hours ago

@bclavie @capemox Dang I hadn't read your mandatory MTS quasi-essay on research yet. Compelling argument to break the loop and let software/ideas wag hardware/infra rather than vice versa. The other evidence for that viewpoint is my ~2 days @mixedbreadai watching the science lead

Rohan Jha @Robro612

about 15 hours ago

@bclavie @capemox counterargument: this isn't paving a new software/compute path that we need new infra to serve, it's explicitly *returning* us to a well-known infra regime of BM25.

Rohan Jha @Robro612

1 day ago

The more Zipfian distribution of these terms can't be overstated. That means this tiny SAE adapter enables decades of BM25 index optimizations, rather than having to step into the new set of (also great) manifold-hypothesis-driven LSR engines designed for SPLADE.

Mixedbread @mixedbreadai

1 day ago

By now, everyone knows that single-vector embedding models are hugely limiting for modern workflows. But they contain more than you think: you can extract sparse Latent Terms from them. And it turns out that BM25 is all you need to turn this vocabulary into a strong retriever.
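
A minimal sketch of the retrieval side of this idea, assuming each document has already been turned into a bag of integer latent-term IDs. How those IDs are extracted is not shown here, and the IDs, `k1`, and `b` below are illustrative values, not taken from the paper. The scoring is plain Okapi BM25:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each doc (a list of term IDs) against query_terms with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of docs that contain each term at least once.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            # Length-normalised term-frequency saturation.
            norm = tf[t] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical latent-term IDs standing in for extracted sparse features.
docs = [[3, 17, 17, 42], [3, 5, 8], [42, 42, 99, 100, 3]]
scores = bm25_scores([17, 42], docs)
```

Because the documents are just bags of term IDs, any existing inverted-index BM25 engine could in principle serve them unchanged, which is the point the post above makes.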

Robro612 retweeted

Ben Clavié

@bclavie

5 days ago

Very excited to finally share this one after sitting on it for far too long! It's very topical now. Blog post coming very soon :)

Rohan Jha @Robro612

5 days ago

@antoine_chaffin @SilvioMartinico Definitely, s or m?

Rohan Jha @Robro612

5 days ago

@antoine_chaffin @SilvioMartinico Testing now!

Rohan Jha @Robro612

5 days ago

@antoine_chaffin @matospiso
> MaxSim only training a few tokens doesn't help this

Hmmm, XTR training is strictly sparser in this regard but seems to succumb less

Rohan Jha @Robro612

5 days ago

@topk_io @matospiso Shameless plug that XTR training mitigates anisotropism/degen score distributions! See modernbert_colbert_kd -> ModernBERT-XTR: same exact training except for XTR vs ColBERT training. Discussed here: https://t.co/OwLPS1kDsJ

Rohan Jha @Robro612

5 days ago

Anisotropy seems to be the quality-giver (at least confounder) but efficiency-killer. @topk_io identified it as a blocker to SMVE and I'm finding it to block TACHIOM too. Would love to know what the ISO FT consists of @matospiso 🧐

topk.io

@topk_io

6 days ago

Even strong multi-vector models may break down when optimized for low-latency and high-QPS inference in production. But this can be fixed. We're open-sourcing Iso-ModernColBERT, a late interaction model built for efficient inference and scalable retrieval. 🧵 (1/6)

Rohan Jha @Robro612

5 days ago

@lateinteraction @topk_io @matospiso Lol the anisotropy measurement is only on a tiny set of MSMARCO docs, and TACHIOM support is all thanks to the nice python bindings of @SilvioMartinico . Turns out it helps to have a vibrant community behind you!
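
The thread does not say which anisotropy measurement was used. One common proxy is the mean pairwise cosine similarity over a sample of embeddings: near or below 0 for a well-spread set, close to 1 when the vectors crowd into a narrow cone. A sketch under that assumption, with toy 2-D vectors in place of real token embeddings:

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Toy data: one well-spread set and one set squeezed into a narrow cone.
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
narrow_cone = [[1.0, 0.1], [1.0, 0.2], [1.0, -0.1], [1.0, 0.0]]

spread_score = mean_pairwise_cosine(spread)
cone_score = mean_pairwise_cosine(narrow_cone)
```

On a small document sample this is cheap to compute, but the estimate is only as representative as the sample it is drawn from.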

Robro612 retweeted

Sumit @_reachsumit

6 days ago

Latent Terms: Dense Retrievers Contain Trivially Extractable BM25-ready Zipfian Vocabularies @bclavie et al. extract indexable, BM25-ready sparse features from frozen dense retrievers using reconstruction-trained Sparse Autoencoders. 📝 https://t.co/WRIaCu2xIm

Rohan Jha @Robro612

6 days ago

ICYMI: @raphaelsrty just added index.freeze() to FastPlaid v1.4.7 which halves your size on disk if you know you won’t modify the index 🥶 Reversible with index.unfreeze() 🔥

Rohan Jha @Robro612

6 days ago

@antoine_chaffin The halving of the size of FastPlaid indexes for analytical read-only workloads is real! https://t.co/i69k6xHfCe

Rohan Jha @Robro612

6 days ago

PLAID's residual gather for full MaxSim is (theoretically) bandwidth-limited, not IOPS-limited. A single doc's residuals occupy >1 page. So you can't expect to save much time with a clever ordering of document embedding bundles according to token-centroid co-occurrence.
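
A back-of-the-envelope check of the ">1 page" claim, under assumed numbers that are not from the thread: 128-dimensional token embeddings, 2-bit residual codes, 4 KiB pages, and a 180-token document.

```python
import math

def residual_bytes(num_tokens, dim=128, bits_per_dim=2):
    """Bytes of packed residual codes for one document (centroid IDs excluded)."""
    return num_tokens * dim * bits_per_dim // 8

PAGE_BYTES = 4096                  # assumed OS page size
doc_bytes = residual_bytes(180)    # assumed document length in tokens
pages_touched = math.ceil(doc_bytes / PAGE_BYTES)
```

With these assumptions each token costs 32 bytes, so any document longer than 128 tokens already spills past one page. Fetching it is then a multi-page sequential read, which is why reordering documents to share pages buys little.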

Rohan Jha @Robro612

6 days ago

This is why 1) XTR/WARP win big by skipping this load entirely 2) CPU-only versions of PLAID/TACHIOM indices can perform so well, and 3) VecFlow-Chamfer wins by saturating and parallelizing memory reads over its new 900 GB/s GH interconnect, not even needing to compress

Rohan Jha @Robro612

6 days ago

Related to this is the general challenge in optimizing vector indices on GPU (in fact, single-vec is even fewer FLOPs/bit than MaxSim): we sit well below the roofline, so the arithmetic we do for the final MaxSim hardly contributes to the wall time.
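
To make the roofline point concrete, here is a rough arithmetic-intensity estimate (FLOPs per byte of document embeddings read). It assumes uncompressed fp16 vectors of dimension 128, a 32-token query that stays in cache, and counts only the dot-product multiply-adds, not the max reduction. The peak figures are illustrative round numbers, not a specific device.

```python
def arithmetic_intensity(num_query_vectors, dim=128, bytes_per_value=2):
    """FLOPs per byte of document embedding read, for dot-product scoring."""
    # One multiply and one add per dimension, per query vector.
    flops_per_doc_vector = 2 * dim * num_query_vectors
    bytes_per_doc_vector = dim * bytes_per_value
    return flops_per_doc_vector / bytes_per_doc_vector

maxsim = arithmetic_intensity(num_query_vectors=32)     # multi-vector query
single_vec = arithmetic_intensity(num_query_vectors=1)  # single-vector query

# Illustrative accelerator peaks (assumptions, not measurements).
peak_flops = 1e15       # FLOP/s
peak_bandwidth = 3e12   # bytes/s
ridge_point = peak_flops / peak_bandwidth  # intensity needed to be compute-bound
```

Under these assumptions MaxSim sits at 32 FLOPs/byte and single-vector scoring at 1 FLOP/byte, both far below a ridge point in the hundreds, so memory traffic rather than arithmetic sets the wall time.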
