Introducing Harness-1, a 20B search agent trained with a state-externalizing harness.
> frontier-level long-horizon search, rivaling Opus-4.6 and outperforming GPT-5.4
> Context-1-level cost and latency
> externalizes candidates, evidence, verification, and search history
> open-source
CoRank accepted at KDD 2026
Really grateful to all the collaborators from @dmguiuc: @PatrickXu565299, Bowen @BowenJin13, Prof. Seongku Kang, and Prof. Jiawei Han ❤️❤️
New Paper Alert!
Excited to share our latest work:
Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation
See you in San Diego for @aclmeeting #ACL2026NLP!
Thrilled to share that our paper has been accepted. I'm looking forward to sharing more about our research soon!
New preprint: How can we build a task-agnostic plug-and-play memory module for LLM agents that supports multiple memory types?
We present PlugMem, a plugin memory module that works across tasks by turning heterogeneous experience into knowledge.
Evaluated unchanged on long-term dialogue, multi-hop QA, and web agents, PlugMem improves performance while using far fewer memory tokens.
Paper: https://t.co/A8tNQjkCCb
Code: https://t.co/mt1aJKxQIz
Maintaining agent performance over long horizons remains challenging, largely because memory systems fail to associate latent context with intent.
Introducing our paper: Grounding Agent Memory in Contextual Intent. STITCH achieves 35.6% gains on our new CAME-Bench.
Excited to share #DeepRetrieval - our novel approach using reinforcement learning for query augmentation in information retrieval!
Our preliminary results (obtained on Feb 16) CRUSH the previous SOTA:
60.8% vs 24.7% recall on PubMed search engine
70.8% vs 32.1% recall on ClinicalTrial search engine
with a SMALLER model (3B vs 7B)
NO supervision data:
- [no 💰] vs [💰💰💰💰...] on creating augmented queries from ChatGPT/Claude!
Github: https://t.co/8RW7P7jyuO
Preliminary Technical Report: https://t.co/KIxjlNJLVv
Currently testing on general IR datasets and with dense retrieval methods
Full paper with more results will be released soon.
Just created this X account to share this breakthrough - follow for more NLP+IR research! #NLP #IR #MachineLearning #LLM #AAAI2025
Introducing Search-R1, the first reproduction of DeepSeek-R1 (Zero) for training reasoning and search-augmented LLM agents with reinforcement learning!
This is a step towards training an open-source OpenAI "Deep research" via RL.
Our 3B base LLMs (including not just Qwen 2.5 but also Llama 3.2) learn to reason and call search engines all on their own!
Everything will be fully open source. Stay tuned!
Code: https://t.co/oWVQke0t4H
Experimental logs: https://t.co/W1zD0EVDNc
#R1 #deepresearch #deepseek
Excited to share "InstructG2I: Synthesizing Images from Multimodal Attributed Graphs" has been accepted by @NeurIPSConf 2024!
https://t.co/bFAxkInycC
We propose a graph-conditioned stable diffusion model for image generation. GO and PLAY with it!
#graph #diffusion #neurips
Successfully defended my Ph.D. thesis!
My deepest gratitude goes to my thesis committee members: Prof. Jiawei Han @dmguiuc, Prof. Tarek Abdelzaher, Prof. Hanghang Tong, Prof. Wei Wang @WeiWang1973, and Dr. Iris Shen!
Happy to announce that TreeInstruct got accepted to EMNLP'24! Excited to discuss the work alongside @wonderingishika as part of a joint collaboration between @dmguiuc and @convai_uiuc. See you all in Miami!
#EMNLP2024
Our alumnus, Yu Meng, won the #KDD2024 Outstanding Dissertation Award!!!
Congratulations on this well-earned distinction, Yu!
@yumeng0818 @kdd_news
Join our tutorial at #KDD2024, Automated Mining of Structured Knowledge from Text with Large Language Models!
Presented by @YunyiZhang10, @Siru_Ouyang, Professor Jiawei Han.
Aug 25, 10 AM - 1 PM CEST
Room 129-130
We have finally turned our "awesome" GitHub repository (290+ stars already) into a survey of Scientific LLMs and their applications in Scientific Discovery! #LLM #AI4Science
Paper: https://t.co/svdZOM2sAG
GitHub: https://t.co/hwRZQNxmLq
Excited to share "Language Models as Semantic Indexers" is accepted to ICML 2024!
⭐️ We propose to learn document semantic IDs with large language models in a self-supervised fashion.
⭐️ The learned semantic IDs can benefit LLM generative recommendation and retrieval.
#LLM #IR