Michigan SLED Lab

Computer Science and Engineering at Michigan @UMichCSE

20 days ago · Ann Arbor

PhDone :)

181

14K

SLED_AI retweeted

3 months ago

🎉 Congrats to @ziqiao_ma on receiving the Towner Prize for Outstanding PhD Research! 🏆 His work in grounded language, embodied AI, and human-AI interaction is pushing the field forward. Full story: https://t.co/6m9KSnLMrv #AI #MachineLearning #UMich #CSE

UMichCSE's tweet photo. 🎉 Congrats to @ziqiao_ma on receiving the Towner Prize for Outstanding PhD Research! 🏆 His work in grounded language, embodied AI, and human-AI interaction is pushing the field forward.

Full story: https://t.co/6m9KSnLMrv
#AI #MachineLearning #UMich #CSE https://t.co/Dl6N7qyCFy

646

SLED_AI retweeted

Yinpei Dai @YinpeiD

3 months ago

Robot memory methods are growing fast, but systematic evaluation is largely lacking. 📉 Introducing RoboMME: a new benchmark for memory-augmented robotic manipulation! 🤖🧠 Featuring 16 tasks across temporal, spatial, object, and procedural memory 🔗 https://t.co/4ELtnhDwrt

236

153

71K

SLED_AI retweeted

6 months ago

NEPA: Next-Embedding Predictive Autoregression A simple objective for visual SSL and generative pretraining. Instead of reconstructing pixels or predicting discrete tokens, we train an autoregressive model to predict the next embedding given all previous embeddings. Key ideas: - One self-supervised signal: cosine-style next-embedding prediction - Autoregression runs directly on the embeddings from a native encoder (no offline encoder) - No pixel decoder (and loss), no contrastive pairs, no task-specific heads, no random masks Scales into modern ViT backbones and stays competitive after supervised fine-tuning: - ImageNet-1K (Base 83.8%; Large 85.3%) - ADE20K Fully open-sourced with reproducibility verified: - Homepage: https://t.co/9SoEshkeiO - Paper: https://t.co/gwILijFBDQ - Code: https://t.co/GaAWvK4tQT - Weights: https://t.co/pTNAMtVPwf This work is led by @6SihanXu and advised by @SLED_AI, @sainingxie, and Stella X. Yu. Contributors: me, @wenhaocha1, @ChenXuweiyi, and @JinWeiyang18434.

ziqiao_ma's tweet photo. NEPA: Next-Embedding Predictive Autoregression

A simple objective for visual SSL and generative pretraining. Instead of reconstructing pixels or predicting discrete tokens, we train an autoregressive model to predict the next embedding given all previous embeddings.

Key ideas:
- One self-supervised signal: cosine-style next-embedding prediction
- Autoregression runs directly on the embeddings from a native encoder (no offline encoder)
- No pixel decoder (and loss), no contrastive pairs, no task-specific heads, no random masks

Scales into modern ViT backbones and stays competitive after supervised fine-tuning:
- ImageNet-1K (Base 83.8%; Large 85.3%)
- ADE20K

Fully open-sourced with reproducibility verified:
- Homepage: https://t.co/9SoEshkeiO
- Paper: https://t.co/gwILijFBDQ
- Code: https://t.co/GaAWvK4tQT
- Weights: https://t.co/pTNAMtVPwf

This work is led by @6SihanXu and advised by @SLED_AI, @sainingxie, and Stella X. Yu. Contributors: me, @wenhaocha1, @ChenXuweiyi, and @JinWeiyang18434.

764

103

523

179K

Who to follow

technical staff @thinkymachines; less technical stuff @aclmentorship; phd @umich; views are my own

Shane Storks, PhD

@shanestorks

NLP researcher studying reasoning and grounding. Research Fellow in CogSci @umichLSA, incoming Assistant Professor of CS @EMUcas_. he/him 🏳️‍🌈

Laura Biester

@lbiester23

Assistant Professor @Middlebury College PhD, MS @UMich | BA @CarletonCollege Natural Language Processing + Computational Social Science

SLED_AI retweeted

Freda Shi @fredahshi

6 months ago

The first workshop on Computational Developmental Linguistics, collocated with ACL 2026, is welcoming submissions! Topics of interest include but are not limited to computational methods for developmental linguistics and language model learning dynamics. Program details below:

fredahshi's tweet photo. The first workshop on Computational Developmental Linguistics, collocated with ACL 2026, is welcoming submissions! Topics of interest include but are not limited to computational methods for developmental linguistics and language model learning dynamics.

Program details below: https://t.co/OD4drDKQUe

31K

SLED_AI retweeted

MichiganAI @michigan_AI

6 months ago

Here's how to babysit a language model from scratch! Research by @ziqiao_ma, Zekun Wang & Joyce Chai shows that interactive language learning with teacher demonstrations and student trials, can facilitate efficient word learning in language models: https://t.co/nmhheDXjIa

469

SLED_AI retweeted

Yidong Huang

@owenhuang117

6 months ago

🚨 Excited to share SketchVerify — a framework that scales trajectory planning for video generation. ➡️ Sketch-level motion previews let us search dozens of trajectory candidates instantly — without paying the cost of the time-consuming diffusion process. ➡️ A multimodal physics+instruction verifier filters out bad trajectories, so only the best trajectory plan makes it to generation. ➡️ The final videos show far more natural, physically-coherent motion, achieved through fast, scalable test-time planning. 🧵[1/6]

26K

SLED_AI retweeted

6 months ago

Will be at #NeurIPS2025 (San Diego) Dec 1-9, then in the Bay Area until the 14th. Hmu if you wanna grab coffee and talk about totally random stuff. Thread with a few things I’m excited about. P.S. 4 NeurIPS papers all started pre-May 2024 and took ~1 year of polishing...so proud of the team!

ziqiao_ma's tweet photo. Will be at #NeurIPS2025 (San Diego) Dec 1-9, then in the Bay Area until the 14th. Hmu if you wanna grab coffee and talk about totally random stuff.

Thread with a few things I’m excited about.
P.S. 4 NeurIPS papers all started pre-May 2024 and took ~1 year of polishing...so proud of the team!

107

18K

SLED_AI retweeted

7 months ago

Still wrapping up a few reality-check experiments and polishing the tutorial structure ... but we're excited! P.S. Sadly the ARC-AGI team can't join the tutorial panel this time due to conflict of schedule, but they’ll be with us at the @LAW2025_NeurIPS later in the NeurIPS program. Stay tuned :)

8 months ago

👀

8 months ago

I’ve always wanted to write an open-notebook research blog to (i) show the chain of thought behind how we formed hypotheses, designed experiments, and articulated findings, and (ii) lay out all the intermediate results that did not make it into the final paper, including negative ones that we believe others will find interesting. https://t.co/h7jb0Kr03w So @fredahshi and I wrote about grounding, a topic that sparks a lot of debate, both in VLM engineering and in linguistics and philosophy. Our latest work shows from a mechanistic interpretability perspective that symbol grounding can naturally emerge from mid-layer aggregate heads, without requiring fine-grained supervision or any special architectural inductive bias. Feel free to check it out. We tried to find a balance between open-notebook transparency and readability. It looks best on desktop since I gave up on engineering that CSS file.😅

ziqiao_ma's tweet photo. I’ve always wanted to write an open-notebook research blog to (i) show the chain of thought behind how we formed hypotheses, designed experiments, and articulated findings, and (ii) lay out all the intermediate results that did not make it into the final paper, including negative ones that we believe others will find interesting.

https://t.co/h7jb0Kr03w

So @fredahshi and I wrote about grounding, a topic that sparks a lot of debate, both in VLM engineering and in linguistics and philosophy. Our latest work shows from a mechanistic interpretability perspective that symbol grounding can naturally emerge from mid-layer aggregate heads, without requiring fine-grained supervision or any special architectural inductive bias.

Feel free to check it out. We tried to find a balance between open-notebook transparency and readability. It looks best on desktop since I gave up on engineering that CSS file.😅

229

179

27K

212

SLED_AI retweeted

Hua Shen✨

@huashen218

8 months ago

Thrilled to share that our paper “Towards Bidirectional Human-AI Alignment” has been accepted to #NeurIPS2025 (Position Track)! 🎉 👫<>🤖We argue for an explicit reflection on what we mean by “alignment”, and to take into account the bidirectional, dynamic interactions between humans and AI to achieve truly responsible and safe AI systems. 🧠+ if you’re generally interested in “alignment”, don’t miss our #NeurIPS2025 Tutorial on “Human-AI Alignment: Foundations, Methods, Practice, and Challenges” , with amazing @mitchellgordon & @adamfungi — more details coming soon! - 💎 NeurIPS 2025 Position Paper: https://t.co/VAATH8nDJS - 📚 NeurIPS 2025 Tutorial: https://t.co/dPsfUKjT1y 💗 Huge thanks to our incredible co-authors — this was our 3rd resubmission — your persistent support and encouragement made it happen! Big thanks to everyone in our ICLR & CHI 2025 BiAlign workshops — your enthusiasm keeps us believing we’re doing something right for our community.🙏 ☕️👯‍♀️I’m attending #COLM2025 at Montreal this week, happy to chat more if you’re around! Also, we (w/ multiple co-authors) will present our #BiAlign paper in-person @SanDiego -- catch us at #NeurIPS2025, we’d love to hear your thoughts and join discussions!

huashen218's tweet photo. Thrilled to share that our paper “Towards Bidirectional Human-AI Alignment” has been accepted to #NeurIPS2025 (Position Track)! 🎉

👫<>🤖We argue for an explicit reflection on what we mean by “alignment”, and to take into account the bidirectional, dynamic interactions between humans and AI to achieve truly responsible and safe AI systems.

🧠+ if you’re generally interested in “alignment”, don’t miss our #NeurIPS2025 Tutorial on “Human-AI Alignment: Foundations, Methods, Practice, and Challenges” , with amazing @mitchellgordon & @adamfungi — more details coming soon!

- 💎 NeurIPS 2025 Position Paper: https://t.co/VAATH8nDJS
- 📚 NeurIPS 2025 Tutorial: https://t.co/dPsfUKjT1y

💗 Huge thanks to our incredible co-authors — this was our 3rd resubmission — your persistent support and encouragement made it happen! Big thanks to everyone in our ICLR & CHI 2025 BiAlign workshops — your enthusiasm keeps us believing we’re doing something right for our community.🙏

☕️👯‍♀️I’m attending #COLM2025 at Montreal this week, happy to chat more if you’re around! Also, we (w/ multiple co-authors) will present our #BiAlign paper in-person @SanDiego -- catch us at #NeurIPS2025, we’d love to hear your thoughts and join discussions!

127

36K

8 months ago

Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation. By @ziqiao_ma @JaneDing_AI @eva_xuejunzhang @carrot0817_ @jiahed0322 @6SihanXu @ngutinhyc @RoihnPeng Joyce Chai https://t.co/RcGuCwKSAX

139

8 months ago

Excited to share 2 papers from SLED at #COLM2025: - Bootstrapping Visual Assistant Modeling with Situated Interaction Simulation. - Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation. Come talk to us at the poster sessions!

153

8 months ago

Bootstrapping Visual Assistant Modeling with Situated Interaction Simulation. By @594zyc @RoihnPeng Lingyun Wu @YinpeiD @ChenXuweiyi @qiaozikl Joyce Chai https://t.co/jYoovaEydD

105

SLED_AI retweeted

8 months ago

Over the past few months, I’ve heard the same complaint from nearly every collaborator working on computational cogsci + behavioral and mechanistic interpretability: “Open-source VLMs are a pain to run, let alone analyze.” We finally decided to do something about it (thanks @fredahshi @jzhou_jz for initiating and co-leading this). VLM-Lens (to appear in #EMNLP2025 demo) is a toolkit for systematically analyzing and interpreting open-source VLMs. [Code] https://t.co/r6nQLWBYzg [Paper] https://t.co/52mBTBufS7 VLM-Lens abstracts away model-specific complexity and gives you fine-grained access to any internal representation across 16 VLMs and their 30+ variants. It is built on these simple principles: 1. Unified access: consistent API across all models 2. YAML-configurable: minimal model-specific code needed 3. Extensible: add new models in a few lines 4. Interpretability-ready: plug into probes, PCA (more under dev) Let’s make VLM research a little easier to see through. :)

ziqiao_ma's tweet photo. Over the past few months, I’ve heard the same complaint from nearly every collaborator working on computational cogsci + behavioral and mechanistic interpretability:

“Open-source VLMs are a pain to run, let alone analyze.”

We finally decided to do something about it (thanks @fredahshi @jzhou_jz for initiating and co-leading this).

VLM-Lens (to appear in #EMNLP2025 demo) is a toolkit for systematically analyzing and interpreting open-source VLMs.

[Code] https://t.co/r6nQLWBYzg
[Paper] https://t.co/52mBTBufS7

VLM-Lens abstracts away model-specific complexity and gives you fine-grained access to any internal representation across 16 VLMs and their 30+ variants. It is built on these simple principles:
1. Unified access: consistent API across all models
2. YAML-configurable: minimal model-specific code needed
3. Extensible: add new models in a few lines
4. Interpretability-ready: plug into probes, PCA (more under dev)

Let’s make VLM research a little easier to see through. :)

181

125

23K

SLED_AI retweeted

Yinpei Dai @YinpeiD

10 months ago

Thanks @_akhaliq for sharing our work! Aim and Grasp! AimBot introduces a new design to leverage visual cues for robots - similar to scope reticles in shooting games. Let's equip your VLA models with low-cost visual augmentation for better manipulation! https://t.co/bSXPzml51B

SLED_AI retweeted

Jayjun Lee

@jayjunleee

10 months ago

Thanks @_akhaliq for posting our work! And I'm happy to share that AimBot 🎯 is accepted to CoRL 2025 @corl_conf! See you in Seoul! Project webpage: https://t.co/9YR9ialqNw Thanks to my amazing co-lead @YinpeiD, co-authors, and our advisors @NimaFazeli7, @SLED_AI

SLED_AI retweeted

10 months ago

Excited to announce the #NeurIPS2025 Workshop on Bridging Language, Agent, and World Models for Reasoning and Planning (LAW) https://t.co/RTA0xqsknf The LAW 2025 workshop brings together Language models, Agent models, and World models (L-A-W). It aims to spark bold conversations around whether LLMs possess internal world models, how we can build more grounded and generalizable agents, and what it takes to go beyond today’s capabilities. Huge thanks to our amazing co-organizers and speakers, and to @LambdaAPI for generously sponsoring us! Join us at LAW 2025 as we chart the future of AI systems that can reason, act, and adapt in complex, ever-changing environments.

SLED_AI retweeted