Marcell Fekete @v4rmer - Twitter Profile

v4rmer retweeted

about 1 year ago

📢 Only 6.5 weeks left in our evaluation campaign. It is not too late to jump in! Consider submitting to the first Shared Task in Creole Language Machine Translation 🌐🗣️🤖 Register here: https://t.co/zRBEcMIFBS

0

3

1

268

v4rmer retweeted

Nathan Godey @nthngdy

over 1 year ago

I'm now officially looking for a post-doc position, starting in Spring! I would be happy to pursue my work on training dynamics, interpretability, or even more specifically on LM representations and character-level models. Feel free to reach out!

0

8

5

0

859

v4rmer retweeted

Nathaniel R. Robinson @robinson_n8

over 1 year ago

Got to present our work in progress on leveraging adapters for machine translation of Creole languages @mrl2024_emnlp #EMNLP2024 🚀 Stay tuned for more on Creole language MT! https://t.co/d6Wze0VLW5 @v4mer @heather_nlp @prajdabre1 @johannesbjerva

1

62

5

3

3K

Marcell Fekete @v4rmer

over 1 year ago

Thanks to @Carlsbergfondet, @aauTECH, @CompSciAAU, @dkw_aau for supporting our work! 6/🧶

0

2

0

48

Marcell Fekete @v4rmer

over 1 year ago

New #NLProc paper on ArXiv! Language models acquire broad linguistic knowledge, but how do they generalise between grammatical constructions? We (with @johannesbjerva) present our in-depth investigation grounded in linguistic theory. https://t.co/m1O7NdS73m 1/📖

2

9

2

0

482

Marcell Fekete @v4rmer

over 1 year ago

Our work demonstrates the value of linguistic analysis in an NLP context, helping to find out how language models encode and separate their knowledge about language. With this, the generalisation capabilities of models can be improved, favouring low-resource settings. 5/🪡

1

0

53

v4rmer retweeted

babyLM @babyLMchallenge

almost 2 years ago

BabyLM is looking for organizers to join our team! If you are a mid-stage graduate student, interested in (sample-efficient) language modeling or cognitive science, you could be a great fit! Find out more information, and fill out our interest form here: https://t.co/xH3k73TzCF

1

23

8

7

3K

Marcell Fekete @v4rmer

almost 2 years ago

Completed my research stay at @EdinburghUni, supported by Otto Mønsteds Fond! Investigated linguistic variation using multi-agent communication. Gained insights, networked with top researchers, and made new friends! @PontiEdoardo @johannesbjerva @aautech @CompSciAAU @EdinburghNLP

3

22

1

2K

v4rmer retweeted

Iñigo Alonso @alonsonlp

about 2 years ago

Reimagining table representation! In our new #ACL2024NLP paper we introduce PixT3: a family of image-based Table-to-Text Generation models that scale better at generating text from large tables, outperforming traditional text-based baselines. https://t.co/wCylW9DdsE

1

21

9

3

5K

v4rmer retweeted

Edoardo Ponti @PontiEdoardo

about 2 years ago

Today I am joining @nvidia part-time as a visiting professor. I could not imagine a better place to explore new efficient architectures for LLMs and diffusion. I am looking forward to collaborating with so many talented researchers!

12

218

3

22

20K

v4rmer retweeted

Benjamin Minixhofer @bminixhofer

about 2 years ago

Introducing Zero-Shot Tokenizer Transfer (ZeTT) ⚡ ZeTT frees language models from their tokenizer, allowing you to use any model with any tokenizer, with little or no extra training. Super excited to (finally!) share the first project of my PhD🧵

29

723

143

480

90K

v4rmer retweeted

Irina Saparina @irisaparina

over 2 years ago

Next week I’ll be in Malta 🇲🇹 to present our work on Improving Generalization in Semantic Parsing by Increasing Natural Language Variation at #EACL2024! 1/3

1

36

6

4

3K

v4rmer retweeted

Piotr Nawrot @p_nawrot

over 2 years ago

The memory in Transformers grows linearly with the sequence length at inference time. In SSMs it is constant, but often at the expense of performance. We introduce Dynamic Memory Compression (DMC) where we retrofit LLMs to compress their KV cache while preserving performance and vastly surpassing GQA! The throughput of Llama 2 7B/13B/70B increases by up to 370% on an H100 GPU. Paper: https://t.co/uUnh4g92VX Code and models are coming soon! @AdrianLancucki @PontiEdoardo @nvidia @EdinburghNLP

10

456

73

292

80K

v4rmer retweeted

Edoardo Ponti @PontiEdoardo

over 2 years ago

Can open-source LLMs execute *chains of instructions* in a single query? Not so well, we found.

However, they can learn this ability by:
- augmenting examples from public SFT mixtures with chains of instructions automatically
- performing *sequential instruction tuning* on them.

This boosts performance in:
- reasoning tasks
- multilingual tasks (translate then answer)
- multimodal tasks (caption then answer)

📄 https://t.co/K1wBdQZz55
🧑‍💻 https://t.co/xohy9ZPEGM
@huhanxu1 @pinzhen_chen @EdinburghNLP

1

88

22

40

10K

v4rmer retweeted

Esther @EstherPloeger

over 2 years ago

New #NLProc paper on ArXiv! More and more papers in NLP claim to evaluate on ‘typologically diverse’ languages. But what does this even mean? In our new paper (with Wessel Poelman, @mdlhx and @johannesbjerva), we systematically investigate such claims. https://t.co/yW7yJtgCsx 1/🧵

2

42

9

10

9K

v4rmer retweeted

Johannes Bjerva @johannesbjerva

over 2 years ago

Our latest typology #NLProc paper was accepted to @eaclmeeting main, with Emi who visited us from @Mila_Quebec @McGillU, and @EstherPloeger. We derive continuous word order features from treebanks, better reflecting the variability of language: https://t.co/YrCM8l5VFp @CompSciAAU

2

22

8

5

9K

v4rmer retweeted

Johannes Bjerva @johannesbjerva

over 2 years ago

Interested in a ph.d. position in NLP, in the beautiful city of Copenhagen? We're hiring for a project on the topic of Explainability and Factuality in Language Modelling, at AAU Copenhagen. @aautech @CompSciAAU #NLProc #LLM Apply via this link: https://t.co/FmhCvm1GRP

1

39

12

3

6K
