Epsilon is hiring, come join us!
We're looking for Research Scientists, Research Engineers, and Full Stack SWEs to help us scale vision encoders and VLMs for radiology 🩻
Jobs: https://t.co/Vjf6tuNe3G
Website: https://t.co/tEA7o0NDQq
Or email me! arjun [at] epsilonlabs . ai
And I’m also excited to support my old teammates as we share our paper:
TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment
https://t.co/rcm53KnutX
^ please stop by our poster on Sun to learn about training vision encoders w/ patch-level text info
On the way to Denver for #CVPR2026! 🛫
Looking forward to meeting folks working in the multimodal / visual grounding / medical imaging space, and to reconnecting with old friends!
Let me know if you’ll be there and want to meet up
True multimodal AI needs to understand the world spatially 🎯
🚀 Excited to release #CVPR2026 TIPSv2 from @GoogleDeepMind, a foundational image-text encoder with spatial awareness, leading to strong overall results and massive gains on patch-text alignment. 🔥
1/N
Multimodal AI encoders often lack spatial understanding… but not anymore! Our #ICLR2025 TIPS model (Text-Image Pretraining with Spatial awareness) from @GoogleDeepMind can help 💡🚀
Check out our strong & versatile image-text encoder 💪
Paper & code: https://t.co/LCiqV4gaQ0
📢📢 We released checkpoints and PyTorch/JAX code for TIPS: https://t.co/0JUIRML8gr
Paper updated with distilled models, and more:
https://t.co/zebYMD0VFz
#ICLR2025
Excited to release a super capable family of image-text models from our TIPS #ICLR2025 paper! https://t.co/1scX7H1DIb
We have models from ViT-S to -g, with spatial awareness, suitable for many multimodal AI applications. Can’t wait to see what the community will build with them!
Just released an ONNX version of OmniGlue. No more need for TensorFlow installations, folks! 😄 Check out the comparison between the TensorFlow and ONNX versions in the image below. @arjunkarpur
Code: https://t.co/1skwudSlEz
Meet #CVPR2024 OmniGlue, the first learnable matcher designed with generalization as a core principle! Great performance on many domains, ideal for in-the-wild matching 🎯
Code available!
https://t.co/Kj0SHMQJRn
with @hanwenjiang1, @arjunkarpur, Bingyi Cao, @qixing_huang
Are you evaluating 3D reconstruction/dense correspondences on synthetic datasets because real datasets are "not accurate enough"? Check out NAVI, a dataset that offers near-perfect alignments of 3D shapes on real image collections: https://t.co/zb68qaRaw8 #NeurIPS2023 (1/2)