Excited to share: I'm on the academic job market this year! My research bridges psychology, robotics, and ML to personalize robot/AI assistance through human feedback by:
1️⃣ Modeling human cognition & behavior;
2️⃣ Developing ML & control algorithms for personalized assistance.
Excited to present our #NeurIPS2024 Oral talk!
Enhancing Preference-based Linear Bandits via Human Response Time
Coffee or tea? If you choose instantly, you likely have a strong preference. How can AI leverage this psychological insight to better learn human preferences?
Curious? Don't think too long! Let's connect and explore how psychology drives smarter AI.
Dec. 11, 3:30-3:50 PM PST
Oral Session 2A: Agents (East Ballroom A, B)
Conference session: https://t.co/0eiinUqcSL
Paper on arXiv: https://t.co/lHUWfpYWqy
Excited to share our new work: Enhancing Preference-based Linear Bandits via Human Response Time ⏱️🤖
@edgeyyzhang, Zhaolin Ren, Prof. Na Li, @ClaireYLiang, Prof. @julie_a_shah
https://t.co/16IjLIXnkv
We show that human response times carry information about preference strength and can speed up preference learning. This complements existing bandit algorithms, which learn only from binary choices. We demonstrate this by integrating a psychology model (the EZ-Diffusion Model) into a bandit algorithm.
#AI #MachineLearning #RLHF #HumanFeedback #psychology #Bandits #Robotics #EZDiffusionModel
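For the curious, here is a minimal sketch of the intuition, not the paper's algorithm. It assumes a textbook drift-diffusion setup: unit noise, symmetric barriers at ±`barrier`, an unbiased starting point, and no non-decision time. The function names and the `drift`/`barrier` symbols are my own illustrative choices. Under those assumptions, stronger preferences (larger drift) give faster expected decisions, and the ratio of mean choice to mean decision time recovers drift/barrier.

```python
import math


def ddm_choice_prob(drift: float, barrier: float) -> float:
    """Probability of hitting the upper barrier (choosing option A).

    Assumes unit noise, barriers at +/-barrier, and a start at 0.
    """
    return 1.0 / (1.0 + math.exp(-2.0 * barrier * drift))


def ddm_mean_decision_time(drift: float, barrier: float) -> float:
    """Expected time to hit either barrier under the same assumptions."""
    if drift == 0.0:
        return barrier ** 2  # limit of (barrier / drift) * tanh(barrier * drift)
    return (barrier / drift) * math.tanh(barrier * drift)


def preference_strength(mean_choice: float, mean_time: float) -> float:
    """Ratio E[choice] / E[time], with choices coded as +1 / -1.

    Under the assumptions above this equals drift / barrier.
    """
    return mean_choice / mean_time


if __name__ == "__main__":
    barrier = 1.0  # hypothetical value, for illustration only
    for drift in (0.2, 2.0):  # weak vs. strong preference
        p = ddm_choice_prob(drift, barrier)
        t = ddm_mean_decision_time(drift, barrier)
        mean_choice = 2.0 * p - 1.0  # E[choice] with +1 / -1 coding
        print(drift, p, t, preference_strength(mean_choice, t))
```

A binary choice alone only reveals `p`, which saturates near 1 for strong preferences; the decision time keeps shrinking as the preference gets stronger, which is the extra signal.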