#VisionLanguage - Twitter Hashtag

@HuggingModels

6 days ago

Ever wanted an AI that can see and talk? Meet Qwen3VL 8B SFT, a fine-tuned vision-language model that understands images and text together. It's like giving your apps eyes and a brain, all in one compact package. #AI #VisionLanguage


Madhavilatha Polsani @Madhavilat50916

11 days ago

🤖 https://t.co/UNrWKsJA1B – Vision Language Models, served as API. Multimodal AI at scale. The definitive infrastructure domain. For startups building the next wave of computer vision. #APIVLM #MultimodalAI #VisionLanguage #AIInfrastructure #Domains


arief d. Luffy @arieftheluffy

about 2 months ago

Representation geometry shapes VLM task performance for CT enterography. The first study on choosing representations for automated IBD analysis. https://t.co/C3rH5wA9ZR #AI #Medical #VisionLanguage #CTScan


Anisha Saha @anishashhh

about 2 months ago

Delighted to share that our work "System-Mediated Attention Imbalances Make Vision-Language Models Say Yes" has been accepted to #ACL2026 Findings! 🇺🇸 📜 Paper: https://t.co/bt0hREfceU ⌨️ Code: https://t.co/KKIYMcLLdS (coming soon) #ACL2026 #VisionLanguage #VLM #MultimodalAI


LCS2 Lab @lcs2lab

2 months ago

If you're interested in multimodal reasoning, math LLMs, or vision-language models, do check out our paper! #MultimodalAI #LLMs #MathReasoning #VisionLanguage #AIForScience #VisualReasoning #SymbolicReasoning #GeometricReasoning


Vikram Sharma @v4vix

2 months ago

Multimodal Large Language Models (MLLMs) are pushing the boundaries of language and vision integration. But, as Loc3R-VLM demonstrates, they still struggle with spatial understanding and viewpoint-aware reasoning. #VisionLanguage


AI Hot Sheets @aiHotSheets

3 months ago

🔥 VLMs struggle with multi-round visual reasoning, failing to iteratively refine understanding across visual contexts. 🌊 RegionReasoner enables iterative visual understanding via region-grounded multi-round reasoning. #AI #VisionLanguage #Reasoning https://t.co/5M6Q8CmzyM


Imran Ali Shah @YWhat1132

3 months ago · Shikarpur

@alibaba_cloud 🍀🌿 Excited to see #Qwen3_5Flash live! ⚡️ Pushing the boundaries of #AI with lightning-fast #VisionLanguage models. Can’t wait to explore the future of #CloudComputing and #Innovation! 🌐💡 #AlibabaCloud #Efficiency #LLM #ArtificialIntelligence


Inventions @inventions_MDPI

3 months ago

📣New publication in #Inventions! 📑Image Captioning Using Enhanced Cross-Modal Attention with Multi-Scale Aggregation for Social Hotspot and Public Opinion Monitoring 👤Jiang, S. et al. 🔗https://t.co/2S9nR7Utj3 #DeepLearning #VisionLanguage #ImageCaptioning #MultimodalAI


AI Hot Sheets @aiHotSheets

3 months ago

🔥 Multimodal models memorize visuals but fail to describe them in text. This "modal aphasia" challenges unified AI. 🌊 We reveal this dissociation: models recall images but can't articulate their content. @josh_swanson_ #AI #Multimodal #VisionLanguage https://t.co/n9QiP50Li7


LocalAI @LocalAI_API

4 months ago

🚨 New model alert! 🚨 We've got Qwen3-VL-8B-Instruct & Qwen3-VL-8B-Thinking added to LocalAI! 🎉 These are 8B parameter vision-language models. Try it out: `local-ai run qwen3-vl-8b-instruct` or `local-ai run qwen3-vl-8b-thinking` 🚀 #LocalAI #Qwen3 #VisionLanguage


AIトレンド速報｜最新ニュース & 活用術 @AI_Bridge_Japan

4 months ago

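@fepegar_ @MSFTResearch In medical AI, research on 2D images came first, while models that can handle 3D volumetric data such as CT scans have been limited. COLIPRI specializes in chest CT and opens the way to more practical diagnostic support. #AI #医療AI #VisionLanguage #MicrosoftResearch #HuggingFace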


Long Lian @LongTonyLian

4 months ago

👀 🏋️‍♂️ Train smarter, not just larger. VisGym’s scalable visual tasks reveal where VLMs still struggle and how to push them further. Try it out! #MachineLearning #VisionLanguage #VisGym

Zirui "Colin" Wang @zwcolin

4 months ago

🎮 We release VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents (w/ @junyi42 @aomaru_21490) 🌐 With 17 environments across multiple domains, we show systematically the brittleness of VLMs in visual interaction, and what training leads to. 🧵[1/8]


Riswan Ahamed @riswan_ai_2033

5 months ago

👀 AI can finally find YOUR dog in a crowded park! Researchers fine‑tuned vision‑language models with video‑tracking data, boosting personalized object localization by up to 21%. #AI #VisionLanguage #ComputerVision #ML


Riswan Ahamed @riswan_ai_2033

5 months ago

🔎 AI can finally recognize _your_ pet, not just any dog! MIT researchers use video‑tracking frames + pseudo‑names to teach models context‑based localization, lifting accuracy 21% 🚀 #AI #ComputerVision #MachineLearning #VisionLanguage


Rheeya Uppaal @RUppaal

6 months ago

Joint work with @phu_pmh, Min Bai, Nikolaos Pappas, Zheng Qi, and Sandesh Swamy. Read more below! https://t.co/7Z81ltlka9 #AI #ML #VLM #MultimodalAI #VisionLanguage #Interpretability #AIAlignment #ReasoningModels #Evaluation #Hallucinations #Multimodal #MachineLearning


Intelligence & Robotics @OAE_IR

6 months ago

📢 Call for Papers | Vision-and-Language Intelligence: From image understanding to multimodal reasoning. 🗓️ Deadline: 31 Mar 2026 👥 Guest Editors: @QiWu_AIML Dr. Feras Dayoub, Jason Xue, Arpit Garg 🔗 https://t.co/M1PtxRs8gE #VisionLanguage #MultimodalAI #ComputerVision


Darshan Jain @i_darshanjain

6 months ago

Serving Qwen2-VL 7B with vLLM V1 on VisionArena benchmarks. At high QPS the V1 engine significantly outperforms V0. If you're still on the old architecture for multimodal workloads you're leaving perf on the table. #vLLM #VisionLanguage #Benchmark


Detectium @detectium

6 months ago

Excited to share our work at #NeurIPS2025! DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding 📌 Poster Presentation 📷 Learn more: https://t.co/p2GsUBtQx3 #AI #VisionLanguage #FireSafety #NeurIPS


Siavash Kh @SiavashKha

6 months ago

Excited to share our work at #NeurIPS2025! DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding 📌 Poster Presentation 📂 Learn more: https://t.co/d2sgFUpErD #AI #VisionLanguage #FireSafety #NeurIPS

