timo moeller

over 3 years ago

🧵 Generative models have taken the world of NLP by storm. But LLMs do not know about your personal data. This makes personal assistants, enterprise knowledge management and many other applications challenging. Retrieval augmented pipelines are the answer 👇 #nlp #llm

1

84

20

36

25K

timomo1234 retweeted

about 4 years ago

Big news 🎉 Announcing deepset Cloud early access! But also - a funding round led by @GVteam! More here https://t.co/P5pUwolA04 Thanks everyone who helped us to achieve this - our community, our customers, our partners, friends and families🖖 #machinelearning #opensource #nlproc

0

36

14

1

0

timomo1234 retweeted

over 4 years ago

Natural Language Processing with Transformers by @_lewtun @lvwerra & @Thom_Wolf - great new book about #nlproc with @huggingface's Transformers by @OReillyMedia (& there's a chapter on #questionanswering w/ #haystack :) https://t.co/M6gpfMF1dH #python #opensource #machinelearning

0

64

17

9

0

over 4 years ago

@srchvrs @stefan_it_ @BramVanroy @deepset_ai Bram and I had trouble with both training from scratch and fine-tuning on downstream tasks. But there are of course ALBERT models that are fine-tuned on downstream tasks and work well... For fine-tuning on downstream tasks you can of course iterate much faster on different hyperparameters.

1

3

1

0

over 4 years ago

@saitej786 @deepset_ai @philipvollet Hey Sai, I guess you mean a tutorial for the extraction of tables from PDF? We don't yet have a dedicated tutorial for this, but this test shows you how to use the TableExtractor Parsr (we also have a connector to Azure in the test above): https://t.co/S83wxx9Yd6

1

4

1

0

timomo1234 retweeted

over 4 years ago

1/3 Big news!! After more than 750 pull requests from 89 contributors and 19 months since our first release in May 2020 we are happy to announce the 1.0 release of #Haystack! 🎉🎁🖖 https://t.co/gUEVZISoko (release notes here: https://t.co/UJCsHD1Wgb)

11

175

43

16

0

timomo1234 retweeted

Shay Banon @kimchy

about 5 years ago

Love seeing the innovation of deepset on top of @elastic. It's already very exciting, with still more work left on our end to make Elasticsearch and Lucene even better for such use cases.

0

17

3

0

about 5 years ago

@Nils_Reimers @huggingface Congratulations Nils! Looking forward to seeing some well-performing sentence-transformers with native HF transformers support. And much more on the IR front!

0

2

0

timomo1234 retweeted

about 5 years ago

Super happy and proud to release #GermanQuAD and #GermanDPR! https://t.co/smdWVlSBOF https://t.co/cU8XKpRf7T

1

14

7

0

Open Source 101 Series @OS101Series

over 5 years ago

@OS101Series @rusic_milos @deepset_ai Proud to have Milos share some of our insights into open source. We are excited to combine what is good for everybody (access to latest technology) with a profitable business model for ourselves. Win win

0

1

0

timomo1234 retweeted

over 5 years ago

We're excited to have Milos Rusic (@rusic_milos), CEO of @deepset_ai, presenting "Building a Machine Learning Company Around an Open Source Project - Insights into Strategy, Culture and Process" at #OS101! https://t.co/dvHuUp9QLz

1

7

6

0

timomo1234 retweeted

David Mimno @dmimno

over 8 years ago

Optimist: AI has achieved human-level performance! Realist: “AI” is a collection of brittle hacks that, under very specific circumstances, mimic the surface appearance of intelligence. Pessimist: AI has achieved human-level performance.

75

9K

3K

47

0

timomo1234 retweeted

over 5 years ago

Happy to announce our new SOTA German BERT and ELECTRA language models! Trained together with Stefan Schweter and already available on @huggingface's model hub: 👉 deepset/gbert-base 👉 deepset/gbert-large 👉 deepset/gelectra-base 👉 deepset/gelectra-large (1/2)

1

66

16

3

0

timomo1234 retweeted

almost 6 years ago

Thrilled to release Haystack 0.3.0!

- Dense Passage Retrieval
- Evaluation of the whole Retriever-Reader-Pipeline
- Indexing of PDF / Docx
- Better integration with transformers & @huggingface model hub
- More #QA Models ...

👉 https://t.co/z3COprIKwc #NLP #QuestionAnswering

1

10

5

1

0

timomo1234 retweeted

about 6 years ago

Excited to release #Haystack incl. core features for a practical #QA system: 📈 Scalable backend (Elasticsearch) 🚀 Fast Retrievers (BM25, Embeddings ...) 👓 Flexible Readers (@huggingface's Transformers / FARM) 🔄 API for Inference & Feedback 👉🏻 Code: https://t.co/bwBzWKUmvd

2

55

12

8

0

over 6 years ago

@seb_ruder - Arabic - SOQAL: https://t.co/IYJAuVdTqb - MLQA also open-sourced their automatically translated train and test sets: https://t.co/WakyLjHSY3 We are currently working on an overview of non-English QA datasets. Will let you know once it is finished.

0

5

2

1

0

almost 7 years ago

@dl_weekly Hey, we would love to see our framework FARM featured in your newsletter. The framework makes Transfer Learning in NLP easy - you can find it here: https://t.co/zvAY6S3UfD FARM lets non-NLP experts create PoCs with their own data and showcase them to their colleagues.

0

almost 7 years ago

@javifreemind @seb_ruder @huggingface @explosion_ai @deepset_ai Our model already converged to decent performance after 50k steps and batch size 1024 - we observed a loss curve flattening out. You could try training locally for a week and see if the same happens to your Spanish model's loss. Good luck : )

0