Akshay Goindani @AkshayGoindani1 - Twitter Profile

3 months ago

@DirhousssiAmine @hgoel1000 If it works with 1e-5, is it the case that the total update with the smaller learning rate falls outside the representable range in bf16, whereas with 1e-5 the update is still representable? By update I mean the product of the learning rate and the gradient.
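
One way to probe this question is to emulate bf16 rounding and check whether a weight actually changes after an update. The sketch below is only an illustration: `to_bf16` and `update_survives` are hypothetical helpers, the emulation rounds float32 bit patterns to nearest-even, and the weights, learning rates, and gradient are made-up values, not the ones from the thread. Since bf16 keeps the float32 exponent width but only 8 bits of precision, an update can be representable on its own and still vanish when added to a much larger weight.

```python
import struct

def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of the float32 bit pattern.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def update_survives(weight: float, lr: float, grad: float) -> bool:
    """True if w - lr * grad differs from w once everything is rounded to bf16."""
    w = to_bf16(weight)
    step = to_bf16(lr * grad)
    return to_bf16(w - step) != w

# Hypothetical values, chosen only to illustrate the effect.
print(update_survives(1.0, 1e-5, 1.0))   # False: step is far below bf16 spacing near 1.0
print(update_survives(1e-3, 1e-5, 1.0))  # True: step exceeds half the spacing near 1e-3
print(update_survives(1e-3, 1e-7, 1.0))  # False: smaller step is rounded away
```

In this toy setup the step itself is always representable in bf16; what decides the outcome is its size relative to the weight.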

Akshay Goindani @AkshayGoindani1

8 months ago

Our latest rerankers consistently outperform LLMs across all setups, delivering larger improvements regardless of the first-stage retriever. Check out our blog for more insights https://t.co/ZxPYJW1BTi

Akshay Goindani @AkshayGoindani1

8 months ago

It was great to drive this effort — the results are very exciting. Most works highlight the benefits of using LLMs for reranking, but often rely on results from weak retrieval models. When we pair rerankers with a strong retriever like voyage-3-large, those advantages disappear.

Voyage AI by MongoDB

@VoyageAI

8 months ago

@zhmeishi @AkshayGoindani1 @HongLiu9903 Get all the details in our blog: https://t.co/IbwBDdaXlQ Shoutout to @zhmeishi, @AkshayGoindani1, and @HongLiu9903 for their incredible work on this research!

Akshay Goindani @AkshayGoindani1

11 months ago

Excited to see instruction-following capabilities in our latest rerankers.

Hong Liu

@HongLiu9903

11 months ago

Instruction-following takes reranker capabilities to the next level 🔥 Huge thanks to @zhmeishi and @AkshayGoindani1 for driving this leap forward!

AkshayGoindani1 retweeted

Hong Liu

@HongLiu9903

11 months ago

Greater things to come!

AkshayGoindani1 retweeted

Dev Ittycheria

@dittycheria

11 months ago

We just launched Voyage-context-3, a new embedding model that gives AI a full-document view while preserving chunk-level precision that offers better retrieval performance than leading alternatives. When building AI that reads and reasons over documents (such as reports, contracts, or medical records), it’s critical to break those documents into smaller pieces, or “chunks,” while still maintaining an understanding of the big picture. Most systems today lose important context, or require complicated workarounds to stitch it back together. https://t.co/OcxvTzfXah

AkshayGoindani1 retweeted

Voyage AI by MongoDB

@VoyageAI

11 months ago

📢 voyage-context-3: contextualized chunk embeddings
- Auto captures of chunk level detail & global doc context, w/o metadata augmentation
- Beats OpenAI-v3-large by 14.24% & Cohere-v4 by 7.89%
- Binary 512-dim matches OpenAI (float, 3072-dim) in accuracy, but 192x cheaper in VDB costs
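
The 192x figure is consistent with a simple raw-storage comparison. The sketch below assumes the 3072-dim float vector is stored as float32 and that vector DB cost scales with bytes per vector; both are assumptions, and `bits_per_vector` is a hypothetical helper.

```python
def bits_per_vector(dims: int, bits_per_dim: int) -> int:
    """Raw storage for one embedding, ignoring any index overhead."""
    return dims * bits_per_dim

# Assumption: the float baseline uses float32 (32 bits per dimension).
float_bits = bits_per_vector(3072, 32)
# Binary quantization keeps 1 bit per dimension.
binary_bits = bits_per_vector(512, 1)

ratio = float_bits // binary_bits
print(float_bits // 8, "bytes vs", binary_bits // 8, "bytes per vector")
print("storage ratio:", ratio)  # 192
```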

Akshay Goindani @AkshayGoindani1

about 1 year ago

Learning output format is easy and quickly saturates the reward, leading to zero advantage and no gradient signal (if there's no KL). Interesting that this still seems to induce reasoning. Any hypothesis for why that happens? @natolambert https://t.co/ALy2Fw9vTb
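
The zero-advantage point can be illustrated with a group-relative advantage, where each sample's reward is normalized against the other samples for the same prompt. That estimator is an assumption here (the tweet does not name one), `group_advantages` is a hypothetical helper, and the rewards are made up.

```python
from statistics import mean, pstdev

def group_advantages(rewards, eps=1e-6):
    """Advantage of each sample relative to its group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Early in training: only some samples get the format right.
mixed = group_advantages([1.0, 0.0, 1.0, 0.0])
# Once the format reward saturates, every sample scores the same.
saturated = group_advantages([1.0, 1.0, 1.0, 1.0])

print(mixed)      # non-zero advantages, so there is a learning signal
print(saturated)  # all zeros: no policy-gradient signal from this reward
```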

Rulin Shao

@RulinShao

about 1 year ago

Arxiv: https://t.co/S5m3e6VhVQ Clearly a lot more work is needed to understand what’s really happening with RL and prompting. We hope that our experiments with spurious rewards and spurious prompts, as well as the released code, data, checkpoints, etc. will help with this! 🔍

Akshay Goindani @AkshayGoindani1

about 1 year ago

@shradhasgl Agreed, I think this might be because RL training is high variance, so several runs are needed. This makes it hard to reproduce the results as well. It would be interesting to see if such observations hold true after averaging over several runs. https://t.co/2PzTuCMDp1

Stephanie Chan @scychan_brains

about 1 year ago

Agree that we need to remember the high variance of RL, as we push further into long horizon etc! We developed metrics to help folks track RL reliability -- codebase+paper here: https://t.co/K7PXJwgKL3

AkshayGoindani1 retweeted

Voyage AI by MongoDB

@VoyageAI

about 1 year ago

📢 Meet voyage-3.5 and voyage-3.5-lite!
• flexible dim. and quantizations
• voyage-3.5 & 3.5-lite (int8, 2048 dim.) are 8% & 6% more accurate than OpenAI-v3-large, and 2.2x & 6.5x cheaper, resp. Also 83% less vectorDB cost!
• 3.5-lite ~ Cohere-v4 in quality, but 83% cheaper. https://t.co/jOHHwjwSYv
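
The "83% less vectorDB cost" figure matches a raw-storage comparison if the baseline is a 3072-dim float32 vector. That baseline, and the idea that cost scales with bytes per vector, are assumptions; the variable names are illustrative.

```python
# Assumed baseline: 3072 dimensions at 4 bytes each (float32).
baseline_bytes = 3072 * 4
# int8 at 2048 dimensions: 1 byte per dimension.
int8_bytes = 2048 * 1

reduction = 1 - int8_bytes / baseline_bytes
print(f"{reduction:.1%} less storage per vector")
```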

AkshayGoindani1 retweeted

Ravid Shwartz Ziv

@ziv_ravid

over 1 year ago

I love @karpathy , but vibe coding is a waste of time. It is good for tech bros who want to look cool, but besides rare cases, it will not make you deliver your product faster

Akshay Goindani @AkshayGoindani1

over 1 year ago

Results from our work HEMM (https://t.co/il5gIR7Q2v) resonate with the findings. Frontier models like GPT-4o struggle on healthcare tasks, with vision-language medical tasks being more challenging. https://t.co/Ca28xJXlW1

Percy Liang

@percyliang

over 1 year ago

1/🧵 How do we know if AI is actually ready for healthcare? We built a benchmark, MedHELM, that tests LMs on real clinical tasks instead of just medical exams. #AIinHealthcare Blog, GitHub, and link to leaderboard in thread! https://t.co/8dEUrnjbKU

Akshay Goindani @AkshayGoindani1

over 1 year ago

@percyliang Great insights! In our paper HEMM: Holistic Evaluation of Multimodal Foundation Models (https://t.co/il5gIR7Q2v), we show that even frontier models like GPT-4o aren’t ready for medical tasks yet. Focusing on key image regions is challenging.

Akshay Goindani @AkshayGoindani1

over 1 year ago

@andrew_n_carr Some paper used “think carefully and I will give you a tip” 😂 Tip is all you need

Akshay Goindani @AkshayGoindani1

over 1 year ago

@AravSrinivas It’s actually very annoying after using it for a while, as it keeps generating long thoughts for simple things.

Akshay Goindani @AkshayGoindani1

over 1 year ago

@abeirami DPO also seems like a type of contrastive learning, trying to bring log-probabilities closer.
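
For reference, the per-pair DPO loss as it is commonly written contrasts a chosen and a rejected response through their log-probability ratios against a reference model. The sketch below is a minimal scalar version: `dpo_loss` is a hypothetical helper, the log-probabilities are made up, and `beta=0.1` is an arbitrary choice.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """-log(sigmoid(beta * margin)) for one preference pair."""
    # Margin: how much more the policy favors chosen over rejected than the reference does.
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    z = beta * margin
    # softplus(-z) equals -log(sigmoid(z)); this form avoids overflow for large |z|.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))

# Policy identical to the reference: margin is 0 and the loss is log(2).
print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
# Policy raises the chosen response and lowers the rejected one: loss drops.
print(dpo_loss(-4.0, -8.0, -5.0, -7.0))
```

The contrastive flavor shows up in the margin term: the loss only depends on the chosen response relative to the rejected one.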

Akshay Goindani @AkshayGoindani1

over 1 year ago

@SeunghyunSEO7 Yeah, storing just the compressed vector is fine as it can be reconstructed on the fly with the up projection matrix.
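
The general idea can be sketched as a low-rank cache: keep only a small latent vector and apply an up-projection when the full vector is needed. Everything below is illustrative, including the tiny shapes, the weights, and the names `W_down`, `W_up_k`, and `matvec`; the original thread does not specify them.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

# Hypothetical shapes: hidden size 4, latent size 2.
W_down = [[1.0, 0.0, 1.0, 0.0],
          [0.0, 1.0, 0.0, 1.0]]   # 2 x 4: compress the hidden state
W_up_k = [[1.0, 0.0],
          [0.0, 1.0],
          [0.5, 0.5],
          [1.0, -1.0]]            # 4 x 2: expand the latent into a key

hidden = [0.5, -1.0, 2.0, 3.0]
latent = matvec(W_down, hidden)   # only this 2-dim vector is cached
key = matvec(W_up_k, latent)      # rebuilt on the fly when needed

print(latent)
print(key)
```

The trade-off is extra compute at read time in exchange for a cache that stores 2 values per token instead of 4 in this toy setup.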

AkshayGoindani1 retweeted

Hong Liu

@HongLiu9903

over 1 year ago

Tried to reproduce the COIR results. TLDR: SFR-Embedding-Code-2B_R is 26.5% worse than voyage-code-2, as opposed to what is claimed in the paper. https://t.co/6gQ5QLTKfq

AkshayGoindani1 retweeted

Jonathan Ellis

@spyced

over 1 year ago

I ran a fresh evaluation of embedding models tuned for semantic retrieval, including the newest models from Voyage, Jina, Cohere, and NVIDIA. Link in thread.
