Top Tweets for #KnowledgeEditing

9 months ago

Excited that our work got accepted at NeurIPS 2025! 🎉 Join us in exploring LLM mechanisms & steering to better understand how these models behave #NeurIPS2025 #Steering #LLMs #NLP #KnowledgeEditing. Paper: https://t.co/SASAgimMqk

about 1 year ago

We introduce Reinforcing "Cognitive Experts" – a new approach to enhance reasoning in MoE-based Large Reasoning Models (LRMs) 🌟.

Thanks to Tencent's support, we had the opportunity to explore the inner workings of ultra-large models like DeepSeek-R1-671B and Qwen3-235B.

By selectively amplifying (steering) just two cognitive experts, we can influence the model's reasoning behavior to improve performance without extra training.

Paper: https://t.co/YBlTrtrWrJ

🔍 Key highlights:

✅ Identifying cognitive experts using nPMI through linguistic markers like the <think> token

✅ No additional training or supervision required

Our technique is just an early exploration of how expert manipulation can steer model reasoning. LLMs are complex systems, and results may vary or not align with expectations.

It’s like giving the model a little nudge! 🤖💡 Our approach strengthens certain "cognitive experts" inside the model, helping it solve problems more effectively.

Think of it as stimulating the brain of a giant model to make better decisions! 🧠✨

The "cognitive expert" introduced in this work is a hypothetical concept. 🤔 Given the complexity of LRMs, we offer no theoretical justification for its existence; our conclusions are purely empirical. More research is needed to further explore this intriguing idea! 🔍👩‍🔬 #AI #MoE #ModelEditing #KnowledgeEditing #Steering #NLP #LLM

[zxlzr's tweet photo]
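Editor's sketch of the nPMI idea mentioned in the tweet above. It scores how strongly each MoE expert co-occurs with a reasoning marker such as `<think>`, using normalized pointwise mutual information: nPMI(x, y) = log(p(x,y) / (p(x) p(y))) / (-log p(x,y)). The routing log, expert ids and function names are hypothetical toy values, not the paper's data or code, and the paper's exact procedure may differ.

```python
import math
from collections import Counter

def npmi(joint, count_x, count_y, total):
    """Normalized PMI in [-1, 1], computed from raw co-occurrence counts."""
    if joint == 0:
        return -1.0  # the two events never co-occur: limiting value
    p_xy = joint / total
    if p_xy == 1.0:
        return 1.0   # always co-occur: avoid dividing by log(1) = 0
    pmi = math.log(p_xy / ((count_x / total) * (count_y / total)))
    return pmi / -math.log(p_xy)

def rank_experts(routing_log, markers):
    """Rank experts by nPMI between 'expert was routed' and 'token is a marker'."""
    total = len(routing_log)
    marker_count = sum(tok in markers for tok, _ in routing_log)
    expert_count, joint_count = Counter(), Counter()
    for tok, experts in routing_log:
        for e in experts:
            expert_count[e] += 1
            if tok in markers:
                joint_count[e] += 1
    scores = {
        e: npmi(joint_count[e], expert_count[e], marker_count, total)
        for e in expert_count
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical routing log: (token, set of expert ids the router selected).
routing_log = [
    ("<think>", {0, 3}),
    ("the", {1, 2}),
    ("<think>", {0, 2}),
    ("cat", {1, 3}),
    ("so", {0, 1}),
    ("</think>", {2, 3}),
]

ranked = rank_experts(routing_log, markers={"<think>"})
for expert, score in ranked:
    print(f"expert {expert}: nPMI = {score:+.3f}")
```

In this toy log, expert 0 fires on every `<think>` token and ranks first. The tweet's second step, amplifying the top-ranked experts at inference time, is not shown here.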


10 months ago

📍 Find us at ACL 2025 – Hall 5X, Poster #83 🌐 More details & resources: https://t.co/nqzLMmDxrt See you there! #ACL2025 #ModelEditing #KnowledgeEditing

10 months ago

🛑 Stop using teacher forcing to evaluate model editing!

Our ACL 2025 poster shows why past evaluations mislead progress & how to test editing in the wild.

📍 July 30, 11:00 AM – come chat!

#ModelEditing #LLM #ACL2025NLP https://t.co/0S1L49vMFw

[fei__sun's tweet photo]
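Editor's sketch of why teacher-forced scoring can overstate an edit. Under teacher forcing the model is fed the ground-truth prefix at every step, so one early mistake does not propagate. In free generation the model conditions on its own output. The toy model, prompt and target below are hypothetical and are not the poster's evaluation code.

```python
def teacher_forced_accuracy(next_token, prompt, target):
    """Token accuracy when each step sees the gold prefix (ground truth leaks in)."""
    correct = 0
    for i, gold in enumerate(target):
        prefix = prompt + target[:i]
        correct += next_token(prefix) == gold
    return correct / len(target)

def free_generation(next_token, prompt, max_len):
    """Autoregressive decoding: each step sees the model's own earlier tokens."""
    out = []
    for _ in range(max_len):
        out.append(next_token(prompt + out))
    return out

def toy_next_token(prefix):
    """Hypothetical edited model: the edit did not take hold on the first token."""
    table = {"?": "Steve", "Tim": "Cook", "Steve": "Jobs"}
    return table.get(prefix[-1], "<eos>")

prompt = ["Who", "leads", "Apple", "?"]
target = ["Tim", "Cook"]  # hypothetical edit target

tf_acc = teacher_forced_accuracy(toy_next_token, prompt, target)
generated = free_generation(toy_next_token, prompt, max_len=len(target))

print("teacher-forced token accuracy:", tf_acc)
print("free generation:", generated, "| exact match:", generated == target)
```

Here the teacher-forced score gives half credit because the second token is predicted from the gold first token, while the freely generated answer is entirely wrong.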


Boris Djordjevic

@longevityboris

11 months ago

Weight editors like ROME & MEMIT can overwrite facts, and ELDER stacks LoRA adapters for lifelong tweaks, but “ripple effects” still flip 20-30% of related answers. #KnowledgeEditing https://t.co/UQrIYJ1kj0


Xinye Li @vclee8

about 1 year ago

Why this matters:
1️⃣ Reveals hidden limitations of current KE methods
2️⃣ Provides realistic evaluation for KE in procedural planning scenarios
3️⃣ Open-source benchmark enables future research
Code: https://t.co/iyGV8hzeC5
#KnowledgeEditing #LLMs


about 1 year ago

Great post! Thanks for featuring the figure from our #ACL2025 paper “The Mirage of Model Editing.” https://t.co/aur7otRQN7 Glad to see the community's attention to evaluation challenges in knowledge editing. #LLMs #ModelEditing #KnowledgeEditing

Yunzhi Yao @yyzTodd

about 1 year ago

🚨 New Blog Drop! 🚀 "Reflection on Knowledge Editing: Charting the Next Steps" is live!

💡 Ever wondered why knowledge editing in LLMs still feels more like a lab experiment than a real-world solution? In this post, we dive deep into where the research is thriving and where it's falling short. From foundational breakthroughs to the practical roadblocks no one’s talking about, we connect the dots and propose what’s needed to move forward. Join the conversation! #KnowledgeEditing #LLMs #AI #ModelEditing

📌 If you're working on LLMs, model updates, or mechanism interpretability, you don’t want to miss this.

👉 Read the full post: https://t.co/7sp8xYVDyq

Key insights from our analysis:

0⃣ Current evaluation metrics and benchmarks inadequately assess knowledge updates in LRMs, highlighting the need for more comprehensive evaluation frameworks.

1⃣ Scaling challenges persist, with significant memory and computational constraints limiting the practical application of editing methods for larger or quantized local models.

🎁 Resource Release: To support the research community, we release covariance matrices for Qwen2.5-32B & QwQ-32B models for the current locate-and-edit methods.

2⃣ We outline promising research directions for developing language models that can effectively learn, adapt, and evolve their knowledge base.

Huge thanks to the brilliant collaborators who made this deep dive into #ModelEditing possible! @uclanlp @CanyuChen3 @Jiachen_Gu @dsmall2apple1 @ManlingLi_ @VioletNPeng


Wanli Yang @10k_miles_yang

about 1 year ago

We need more practical evaluation for #modelEditing #KnowledgeEditing 🤔 #LLM #AI #ACL2025

about 1 year ago

😯 To assess the real-world effectiveness of model editing techniques, we evaluated them on practical QA tasks and found that current editing methods perform substantially worse than previously reported (38.5% vs. 96%). https://t.co/CSITp9w6GE

[10k_miles_yang's tweet photo]


about 1 year ago

This is a systematic study on technical AGI safety and security. Interpretability techniques like steering vectors and circuit analysis can help us understand and improve LLM safety—but they can also be misused. #Safety #ModelEditing #KnowledgeEditing #LLM #NLP

Séb Krier

@sebkrier

about 1 year ago

Excited to share @GoogleDeepMind's AGI safety and security strategy to tackle risks like misuse and misalignment. Rather than high-level principles, this 145-page paper outlines a concrete, defense-in-depth technical approach: proactively evaluating & restricting dangerous capabilities, implementing Amplified Oversight & robust training methods, all backed by system-level security.



Shumin Deng @dsmall2apple1

over 1 year ago

Congrats to our team for winning 2nd place in the SemEval 2025 Challenge on Unlearning Sensitive Content from Large Language Models! @SemEvalWorkshop Congrats to Haoming @HaomingX1874 and all the team members! 🎉 🎉🏆 #SEMEval2025 #AI #LLM #Semeval #Unlearning #KnowledgeEditing

The code will be released at https://t.co/OoPtf76GWg

[zxlzr's tweet photo]


over 1 year ago

Impressive summary and outlook of #KnowledgeEditing 😇

over 1 year ago

Over the past year, #KnowledgeEditing has experienced rapid development. As the new year begins, I’ve taken some time to reflect on the progress of this field and share my thoughts on its future directions. I look forward to discussing and collaborating with everyone to further advance this area.

🛠 Progress in Knowledge Editing:

1. Scenarios: In addition to updating the knowledge of LLMs, many works have begun exploring knowledge editing as a means to control model behavior, promoting safer and more controllable generation while enabling capabilities like unlearning.

2. Side Effects: Many works have started to reflect on the fundamental causes of the side effects of knowledge editing and have explored various methods to mitigate them. Editing LLMs (parameter-altering) can lead to overfitting, where models assign disproportionately high importance to edited content and disrupt attention mechanisms, reducing generalization and general abilities. Whether the model has truly updated its relevant knowledge remains questionable.

3. Practicality: While knowledge editing has expanded to fields like software engineering and multimodal tasks, its real-world impact remains limited.

💡 Key Reflections:

1. The field's foundational goal, knowledge updates, has seen limited success outside areas like AI safety. This raises questions about how to better align methods with practical needs.

2. Mechanism research is lagging. Without clear insights into why knowledge editing works (or doesn’t), efforts to improve models risk being akin to “blind men describing an elephant.”

📈 Future Directions:

1. Evaluation: We need a set of metrics/benchmarks to evaluate whether an edited LLM behaves properly, that is, to achieve a balance between generalization and side effects.

2. Steering: Steering vectors (with SAE) are emerging as a promising approach for interventions in model behaviors, particularly in domains like safety and personality alignment. These methods demonstrate the potential to achieve precise control with minimal impact on overall model performance. Furthermore, they may pave the way for bridging the gap between prompts and model parameter updates, enabling prompt-driven, parameterized behavior adjustments within the model.

3. Agent Memory Updates: The debate between symbolic and parametric memory for AI agents is ongoing. Knowledge editing techniques can offer a unified approach to memory updates, bridging the gap between updating both the model's internal memory and external memory. Memory updates may enhance reasoning capabilities over the long term, fostering the gradual evolution of System 2-like slow thinking processes.

4. Mechanism Interpretation: Deepening our understanding of model mechanisms is essential. Currently, research on the mechanisms of LLMs (such as neurons and circuits) lacks systematic exploration. It also fails to explain phenomena like the dynamic acquisition and forgetting of knowledge, as well as higher-order cognitive behaviors such as slow-thinking reasoning.

5. Interdisciplinary: Drawing inspiration from cognitive/brain science, we may: design the next generation of model architectures and model updating paradigms; potentially simulate human brain behavior based on neural networks to construct an electronic digital twin brain, enabling better solutions (e.g., neuromodulation) to problems in neuroscience and cognitive science.

If one day machines truly awaken to self-awareness, understanding their mechanisms and having the means to control them will be a critically important technology.

🎉 Exciting News:

We’re thrilled to announce that EasyEdit2 is currently in development! This next-generation toolkit will integrate steering capabilities to enable control over model behavior. Stay tuned for updates, and we welcome the community to explore and contribute:

https://t.co/yqAX1kPbOA

Let’s continue pushing the boundaries of #KnowledgeEditing, tackling its challenges, and exploring its vast potential to redefine AI adaptability and usability.

#LLM #AI #NLP #EasyEdit #LLM #ModelEditing #KnowledgeEditing

[zxlzr's tweet photo]
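Editor's sketch of the steering idea in "Future Directions" item 2 above. The tweet mentions SAE-based steering vectors. This example swaps in the simpler difference-of-means construction: average the activations collected on examples that show a behavior, subtract the average on examples that do not, and add the scaled result to a hidden state at inference time. All vectors and function names are hypothetical, and this is not the EasyEdit2 API.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def steering_vector(pos_acts, neg_acts):
    """Difference of means: direction from 'without behavior' to 'with behavior'."""
    mean_pos = mean_vector(pos_acts)
    mean_neg = mean_vector(neg_acts)
    return [p - q for p, q in zip(mean_pos, mean_neg)]

def steer(hidden, vector, alpha=1.0):
    """Shift a hidden state along the steering direction; alpha sets the strength."""
    return [h + alpha * v for h, v in zip(hidden, vector)]

# Hypothetical 2-dimensional activations captured at one layer.
pos_acts = [[1.0, 2.0], [3.0, 4.0]]   # prompts showing the target behavior
neg_acts = [[0.0, 0.0], [2.0, 2.0]]   # prompts without it

v = steering_vector(pos_acts, neg_acts)
steered = steer([0.0, 0.0], v, alpha=0.5)
print("steering vector:", v)
print("steered hidden state:", steered)
```

Setting `alpha` to 0 leaves the hidden state unchanged and no weights are modified, which is the sense in which steering sits between prompting and parameter updates.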



Ofir Lindenbaum

@Ofirlin

over 1 year ago

Results: State-of-the-art performance across datasets! Big shoutout to my amazing students @AmitRozner and Barak Battas for their amazing work! Join us and check out the paper here 👉 https://t.co/UzBniuNr2b #EMNLP #AIResearch #KnowledgeEditing #MachineLearning #NLP


Ofir Lindenbaum

@Ofirlin

over 1 year ago

Please check our paper for more details: https://t.co/UzBniuNr2b Looking forward to #EMNLP2024! 🚀 #AI #MachineLearning #KnowledgeEditing #LanguageModels


prod42net @prod42net

over 1 year ago

🧠 Discover the latest in enhancing Large Language Models with innovative Knowledge Editing Techniques by Mike Young. Learn how KME can update models efficiently without losing valuable knowledge. #AI #NLP #KnowledgeEditing 📚 https://t.co/riSpGaCuuG


Bony Bean @bonybean

over 1 year ago

Introducing OneEdit: A groundbreaking neural-symbolic system offering seamless integration and conflict resolution in knowledge graphs and large language models. Read the full blog post at: https://t.co/fofmVcxZdI #AI #KnowledgeEditing #NeuralSymbolic


almost 2 years ago

Curious about what knowledge circuits in large language models look like? We've done some initial visualizations 🌐 #AI #MachineLearning #KnowledgeCircuits #KnowledgeEditing #NLP #LLM. Check them out here: https://t.co/keAjPxVzcw Title: Knowledge Circuits in Pretrained Transformers ArXiv: https://t.co/MmEwJhTyxb Code: https://t.co/a0cfoFpYxM

Managetech inc. @managetech_inc

about 2 years ago

Curious about how large language models store and utilize knowledge? What do the neurons actually learn?

🎉 Exciting insights from our latest research on "Knowledge Circuits in Pretrained Transformers"! Discover how pretrained transformers store and use knowledge, enhancing interpretability and editing approaches. #AI #ML #NLP #LLMs #KnowledgeEditing #ModelEditing #Circuit

ArXiv: https://t.co/MmEwJhTyxb
Code & Data (will be released soon): https://t.co/a0cfoFpYxM

🔍 Dive into the core of how transformers manage knowledge with "Knowledge Circuits", a framework tracking information flow and interactions within language models.

📊 Our preliminary exploration reveals:
1. Circuits may be responsible for specific knowledge representation and storage.
2. Insights into how current knowledge editing methods manipulate model knowledge and their limitations in multi-hop scenarios.
3. Special attention heads identified in context learning and hallucination.

🧠 Our experiments provide a tantalizing glimpse into the synergy between MLPs and attention heads for robust knowledge representation!

This is just the beginning! Knowledge Circuits holds huge potential for advancing transformer interpretability and precise knowledge editing, leading to safer and more reliable AI applications.

Join the discussion and share your thoughts!

[zxlzr's tweet photo]


almost 2 years ago

Why editing the knowledge of LLMs after training causes messy ripple effects #LLM #knowledgeediting #rippleeffects #GradSim https://t.co/nwCHIKvmel


Managetech inc. @managetech_inc

almost 2 years ago

Why editing the knowledge of LLMs after training causes messy ripple effects #KnowledgeEditing #RippleEffects #LanguageModels #ScienceX https://t.co/mFzPUfSzzM
