Deema @deema_cs - Twitter Profile

Pinned Tweet

over 1 year ago

🚨 New paper alert🧵 Can we detect and mitigate LLMs’ hallucinations before they happen? 🤔 🚀 Introducing FactCheckmate ♔— a lightweight framework for preemptive hallucination detection and mitigation in LMs using LMs' internal representations ✨ Paper: https://t.co/IFWiXiQs8j

6

92

17

50

14K

Deema @deema_cs

about 1 month ago

@yacinelearning gotta admit the choice of the cover picture is brilliant

0

1

0

26

Deema @deema_cs

7 months ago

@yataobian Interesting read, it reminds me of this work https://t.co/kOxGLNsb7X

0

111

deema_cs retweeted

Siddarth

@siddarthv66

7 months ago

> Be AI PhD student
> Submit paper to conference
> LLM slop reviews
> Rejected
> Concurrent paper with same method accepted
> Resubmit to next conference
> Reviewer points to concurrent paper which was accepted by last conference
> Lack of novelty
> Rejected

32

2K

58

143

90K

Deema @deema_cs

about 1 year ago

@iOmarCS I feel like they're joking

1

0

121

deema_cs retweeted

ajay @ajayyy4y

over 1 year ago

are you ready to date a model?

78

2K

70

107

96K

Deema @deema_cs

over 1 year ago

@FaisalDilaijan Rating feature would help a lot 😂

0

1

0

169

deema_cs retweeted

Rohan Paul

@rohanpaul_ai

over 1 year ago

Can we detect and mitigate hallucinations before they happen?

Internal neural patterns reveal hallucination risks before LLMs generate false outputs.

Basically Neural networks leak early warning signals when they're about to hallucinate

Original Problem 🔍:

LLMs frequently generate false or misleading information (hallucinations). Current solutions only detect hallucinations after they occur, adding significant overhead and missing opportunities to understand why they happen.

-----

Solution in this Paper ⚡:

• FactCheckMATE: A system that detects and prevents hallucinations before they occur
• Uses lightweight binary classifier analyzing model's hidden states from middle transformer layers

The system uses a lightweight binary classifier that takes the LM's hidden states as input and predicts hallucination probability. It averages the hidden states from middle transformer layers and passes them through a ReLU-MLP followed by a sigmoid function.

• When hallucination detected, adjusts hidden states using intervention model
• Works across multiple model families (Llama, Mistral, Gemma)
• Requires minimal computational overhead (3.16 seconds per inference)

-----

Key Insights from this Paper 💡:

• LLMs' internal representations contain predictive signals for hallucinations
• Middle layers of transformers show strongest hallucination detection capability
• Preemptive intervention is more efficient than post-hoc correction
• Hidden states can be steered to produce more factual outputs

-----

Results 📊:

• Over 70% preemptive detection accuracy across QA datasets
• 34.4% improvement in factual output generation after intervention
• Tested on multiple datasets: NQ-open (Wikipedia), MMLU (STEM), MedMCQA (medical)
• Consistent performance across different model sizes (7B to 13B parameters)
• Average inference overhead: 3.16 seconds
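The classifier described in this tweet (average the hidden states from some middle layers, pass the result through a ReLU-MLP, then apply a sigmoid) can be sketched roughly as below. This is a toy illustration, not the paper's code: the function names, the layer count, the vector sizes, and the random weights are all assumptions made for the example, and a real detector would be trained on hidden states taken from an actual LM.

```python
import math
import random


def mean_pool(hidden_states):
    """Element-wise average of equal-length vectors (one vector per layer)."""
    n = len(hidden_states)
    return [sum(column) / n for column in zip(*hidden_states)]


def linear(weights, bias, v):
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, x) for x in v]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hallucination_probability(hidden_states, w1, b1, w2, b2):
    """Mean-pool the layer vectors, then ReLU-MLP, then sigmoid."""
    pooled = mean_pool(hidden_states)
    hidden = relu(linear(w1, b1, pooled))
    logit = linear(w2, b2, hidden)[0]
    return sigmoid(logit)


# Toy inputs: 3 hypothetical "middle layer" vectors of size 8 and
# untrained random weights (assumptions for illustration only).
random.seed(0)
DIM, HIDDEN, LAYERS = 8, 4, 3
layers = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(LAYERS)]
w1 = [[random.gauss(0, 0.5) for _ in range(DIM)] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [[random.gauss(0, 0.5) for _ in range(HIDDEN)]]
b2 = [0.0]

p = hallucination_probability(layers, w1, b1, w2, b2)
print(f"predicted hallucination probability: {p:.3f}")
```

With untrained weights the printed probability is meaningless; the sketch only shows the shape of the computation. The intervention step that adjusts hidden states after a positive detection is not specified in the tweet, so it is left out here.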

2

26

3

14

2K

Deema @deema_cs

over 1 year ago

@Yazanmuteb Mashallah, where is this place? 👀

1

7

0

104

Deema @deema_cs

over 1 year ago

@gadahoth 🎤🫳

0

1

0

40

deema_cs retweeted

Ahmad Beirami

@abeirami

over 1 year ago

The question that a reviewer should ask themselves is: Does this paper take a gradient step in a promising direction? Is the community better off with this paper published? If the answer is yes, then the recommendation should be to accept.

6

221

18

27K

Deema @deema_cs

over 1 year ago

@XuanmingZhang07 Unfortunately, I am not. Have fun!

0

1

0

31

Deema @deema_cs

over 1 year ago

@Owainaa It was a cleaning day for me as well. Champaign misses you, tho 🐿️

1

0

58

Deema @deema_cs

over 1 year ago

@faisalbmahfoodh The test set has 1.8k data points; the repo will be released.

0

24

deema_cs retweeted

Neeraja Kirtane @NeerajaKirtane

over 1 year ago

Excited to share our new work, FactCheckmate ♔: a lightweight framework for preemptive hallucination detection and mitigation in LMs using their internal representations. Huge thanks to collaborators @deema_cs @MKhalifaaaa and advisor @haopeng_nlp Paper: https://t.co/kLvlaW5xwf

1

30

3

2

4K

deema_cs retweeted

Muhammad Khalifa

@MKhalifaaaa

over 1 year ago

Inference-time techniques are just beginning to reveal their full potential. Great work by @deema_cs and @NeerajaKirtane!

0

10

3

2

1K

Deema @deema_cs

over 1 year ago

Many thanks to the wonderful collaborators @neerajakirtane and @mkhalifaaaa, and advisor @haopeng_nlp Paper Link: https://t.co/IFWiXiQs8j

0

2

1

347

Deema @deema_cs

over 1 year ago

With FactCheckmate ♔, we aim to open new avenues for understanding LMs by studying their internal workings and providing tools to create more reliable and truthful outputs.

1

0

315
