Arthur Câmara @ArthurCamara - Twitter Profile

over 1 year ago

@bclavie @minhash Hmmm then we can keep only the positives in the top-K from both methods? This is also a good excuse to make the code I used available and add it as the first issue. 😬 I will try to do it later today.

1

0

43

Arthur Câmara @ArthurCamara

over 1 year ago

Thanks @minhash for clearing one of the long-standing things on my to-do list! NanoBEIR scores should be higher than the full BEIR scores (it is only a subset of the full corpora, so it's "easier"), but they should correlate quite well with the full scores.

Bhavnick Minhas

@minhash

over 1 year ago

When cooking your own embedding model, it's necessary to have a quick evaluation set to validate your ideas.

That's what I needed while running my own set of experiments, when I found @ZetaVector's NanoBEIR set. It's perfect! A subset of BEIR to validate ideas on~

One thing was missing before I could use it, though: how well do scores on NanoBEIR correlate with those on BEIR? I didn't find this metric on their blog, so I decided to calculate it myself with a few models.

From what I see on a limited set of models that report BEIR scores publicly, with NanoBEIR scores I calculated myself, the correlation is ~99%, which is great!

The scores usually come out on the higher end, so a NanoBEIR score can't be compared directly against a BEIR score, but for checking what works and what doesn't, it's good enough.

Then again, STSBenchmark scores are said to be ~70% correlated too, which was my previous "quick" evaluation set.
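A minimal sketch of how such a correlation check could be run, assuming you already have one BEIR score and one NanoBEIR score per model. The tweet does not say which correlation measure was used, so both Pearson (linear) and Spearman (rank) are shown. The score lists are made-up placeholders for illustration only, not real results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Placeholder numbers (hypothetical, one entry per model).
beir_scores = [0.40, 0.45, 0.50, 0.55]
nanobeir_scores = [0.52, 0.58, 0.61, 0.68]

print("Pearson:", pearson(beir_scores, nanobeir_scores))
print("Spearman:", spearman(beir_scores, nanobeir_scores))
```

For picking between model variants, the rank correlation is arguably the more relevant number: it only asks whether NanoBEIR orders the models the same way BEIR does, which is the "what works and what doesn't" question.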

2

18

4

8

3K

2

7

0

758

Arthur Câmara @ArthurCamara

over 1 year ago

@bclavie @minhash That and the fact that it has so many positives that it would make the corpus too large compared to the other datasets. It didn’t help that the size was also breaking my pipeline. 😅 I spent about a week trying to make it work on a single A100

1

2

0

62

Arthur Câmara @ArthurCamara

over 1 year ago

We created it by randomly sampling 50 queries per dataset. The corpora are the set of all positives and the intersection of the top-100 documents retrieved by Pyserini's BM25 and Arctic-Embed-1.5-m.
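The recipe above can be sketched roughly as follows. This is one reading of the tweet, not the authors' actual code: the function name, argument layout, and seed are hypothetical, and the BM25 and dense retrieval runs are assumed to be precomputed as ranked lists of document IDs per query.

```python
import random


def build_nano_subset(queries, qrels, bm25_run, dense_run,
                      n_queries=50, top_k=100, seed=0):
    """Sample queries and collect the document IDs for a reduced corpus.

    queries:   dict mapping query ID -> query text
    qrels:     dict mapping query ID -> set of positive (relevant) doc IDs
    bm25_run:  dict mapping query ID -> ranked list of doc IDs (lexical)
    dense_run: dict mapping query ID -> ranked list of doc IDs (embedding)
    """
    rng = random.Random(seed)
    # Sort the IDs first so the sample is reproducible for a given seed.
    sampled = rng.sample(sorted(queries), min(n_queries, len(queries)))

    corpus_ids = set()
    for qid in sampled:
        # Every positive is kept, whether or not a retriever found it.
        corpus_ids |= set(qrels.get(qid, ()))
        # Keep only documents that both retrievers rank in their top-k.
        bm25_top = set(bm25_run.get(qid, [])[:top_k])
        dense_top = set(dense_run.get(qid, [])[:top_k])
        corpus_ids |= bm25_top & dense_top
    return sampled, corpus_ids


# Toy example with two queries and top_k=2.
toy_queries = {"q1": "first query", "q2": "second query"}
toy_qrels = {"q1": {"d1"}, "q2": {"d9"}}
toy_bm25 = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
toy_dense = {"q1": ["d3", "d7"], "q2": ["d5", "d9"]}

toy_sampled, toy_corpus = build_nano_subset(
    toy_queries, toy_qrels, toy_bm25, toy_dense, n_queries=2, top_k=2
)
print(sorted(toy_sampled), sorted(toy_corpus))
```

Documents that both retrievers rank highly but that are not positives act as hard negatives, which keeps the small corpus from being trivially easy.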

0

3

0

1

147

Arthur Câmara @ArthurCamara

over 1 year ago

@minhash @ZetaVector Hey, thanks for the shout-out! Indeed, the scores are supposed to be higher (it is an easier set after all, with far fewer documents than the full collection), but they should correlate quite well with the BEIR scores!

1

2

0

140

Arthur Câmara @ArthurCamara

over 1 year ago

Visualizing agents as a state machine (literally as an FSM) is actually a nice and more realistic abstraction.

Han

@HanchungLee

over 1 year ago

state machines. pipelines are typically acyclic and stateless. workflows as fsms are much more robust and capable.
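A toy illustration of the point, as a sketch only: the state names and the `act` / `is_good` callables are hypothetical and do not come from any agent framework. The transition from CHECK back to PLAN is the cycle that a linear, acyclic pipeline cannot express.

```python
from enum import Enum, auto


class State(Enum):
    PLAN = auto()
    ACT = auto()
    CHECK = auto()
    DONE = auto()


def run_agent(task, act, is_good, max_steps=10):
    """Drive a tiny plan/act/check loop as an explicit state machine.

    act(task, attempt) produces a result; is_good(result) decides whether
    to finish or to loop back and plan again. max_steps bounds the loop so
    a never-satisfied check cannot run forever.
    """
    state = State.PLAN
    attempts = 0
    result = None
    trace = []
    for _ in range(max_steps):
        trace.append(state)
        if state is State.PLAN:
            attempts += 1  # state carried across iterations
            state = State.ACT
        elif state is State.ACT:
            result = act(task, attempts)
            state = State.CHECK
        elif state is State.CHECK:
            # The cycle: a failed check returns to PLAN instead of ending.
            state = State.DONE if is_good(result) else State.PLAN
        else:  # State.DONE
            break
    return result, trace


# Toy run: the "action" returns the attempt number and the check
# accepts it from the second attempt onwards.
final_result, final_trace = run_agent(
    "demo task",
    act=lambda task, attempt: attempt,
    is_good=lambda value: value >= 2,
)
print(final_result, [s.name for s in final_trace])
```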

9

96

10

16

14K

0

4

0

288

ArthurCamara retweeted

Logan Kilpatrick

@OfficialLoganK

over 1 year ago

The world is going to look shockingly similar in 5 years, despite massive technological innovation enabled by AI.

277

2K

101

278

237K

ArthurCamara retweeted

Doug Turnbull

@softwaredoug

over 1 year ago

I need a meeting response that's "I'm attending but working at the same time. Say my name 3 times to summon me for a question"

2

3

1

0

457

Arthur Câmara @ArthurCamara

over 1 year ago

Amazing opportunity that I would take in a heartbeat if I were a student!

tomaarsen @tomaarsen

over 1 year ago

I'm looking for an intern to introduce Sparse Embedding models to Sentence Transformers! If you're passionate about open source, interested in helping practitioners use your tools, and enjoy embedders/retrievers/rerankers, then I'd love to hear from you! Links to apply in 🧵

https://t.co/IIXlzYTJ0n

5

116

18

45

16K

0

2

0

143

Arthur Câmara @ArthurCamara

over 1 year ago

@itsaflecha Here in the northwest it's all good

1

0

98

Arthur Câmara @ArthurCamara

over 1 year ago

@Robro612 I’ve been bullish on listwise learning for a while. Too bad there are not many training datasets with multiple relevance annotations per query. Synthetic data to the rescue?

0

2

0

54

Arthur Câmara @ArthurCamara

over 1 year ago

@charliermarsh No, thank YOU for uv (and ruff, of course)

0

2

0

48

Arthur Câmara @ArthurCamara

over 1 year ago

TBH, uv is one of the best things that happened in the Python ecosystem recently.

Charlie Marsh

@charliermarsh

over 1 year ago

The latest uv release includes support for conflicting dependencies across optional groups. A subtle but very powerful feature. For example: use the PyTorch CPU build with `uv sync --extra cpu` and the CUDA build with `uv sync --extra gpu`. All powered by a single lockfile.

https://t.co/dAwa8QmaDt

6

299

37

54

19K

1

37

0

1K

Arthur Câmara @ArthurCamara

over 1 year ago

Tell me you don’t have kids without telling me you don’t have kids.

INTERIOR PORN

@INTERIORPORN1

over 1 year ago

I just fell in love with this living space

1K

66K

5K

8K

19M

1

4

0

1

277

Arthur Câmara @ArthurCamara

over 1 year ago

@GergelyOrosz Of course, if you just open the chat, ask the full question to the LLM and copy-and-paste its answer without critical thinking, that’s a red flag, and we will ask you to thoroughly explain how that code works. Other than that, all good.

0

1

0

36

Arthur Câmara @ArthurCamara

over 1 year ago

@GergelyOrosz We explicitly tell people to use whatever tools they are used to, including cursor/copilot/etc. The reasoning is the same. Everyone here uses it. You will probably use it. We just want to see how you think through the problem and how you come to an answer.

1

0

288

Arthur Câmara @ArthurCamara

over 1 year ago

@beirmug @tomaarsen @Robro612 @JinaAI_ I was writing an answer to you on LinkedIn, my phone’s battery died and I forgot to get back to it, sorry about that. 😅

2

0

115

Arthur Câmara @ArthurCamara

over 1 year ago

@tomaarsen @Robro612 @JinaAI_ I can take a deeper look into correlation with other models later this week. It’s something I really want to do, I just haven’t had the time to finish it yet.

0

2

0

51

Arthur Câmara @ArthurCamara

over 1 year ago

@tomaarsen @Robro612 @JinaAI_ Yes! So, we sampled using Anserini’s BM25 and arctic-embed-1.5-m, plus the positives, of course. We haven’t tested the correlation extensively, but for 7B parameters, the correlation was really good. I can share the code we used to create the dataset and the numbers we have.

1

4

0

75

ArthurCamara retweeted

tomaarsen @tomaarsen

over 1 year ago

I just released Sentence Transformers v3.3.0 & it's huge! 4.5x speedup for CPU with OpenVINO int8 static quantization, training with prompts for a free perf. boost, PEFT integration, evaluation on NanoBEIR, and more! Full release notes: https://t.co/n7Vgbk5LCv Details in 🧵

https://t.co/UZJdOx6mVn

7

223

41

91

15K
