Perouz Taslakian @PerouzT - Twitter Profile

Pinned Tweet

8 months ago

🚀 Internship Opportunities in AI Agents! At @ServiceNowRSRCH, we have 4 internship positions on AI Agents — exploring robustness, privacy & collaboration. 🧠 Applicants must be registered students at a Canadian university. 👇 Thread with details & links to apply:

2

82

13

52

15K

Perouz Taslakian @PerouzT

3 months ago

🔬👁️ Looking for an AI intern to work on eye disease detection! Build models that combine high-res images with clinical data to automate diagnosis. 🇨🇦 Mitacs internship (8 months, full-time, ASAP) (Must be enrolled at a Canadian university) 📩 https://t.co/xP4f93br7E #McGill

0

2

0

138

PerouzT retweeted

Alexandre Drouin @alexandredrouin

3 months ago

The NeurIPS Datasets & Benchmarks Track is now the Evaluations & Datasets (ED) Track. It now treats evaluation as a scientific object of study in its own right. Datasets/benchmarks still fully in scope. 👉Details: https://t.co/MN75FwQ8Ow We look forward to your submissions!

0

29

6

8

5K

PerouzT retweeted

Siva Reddy @sivareddyg

5 months ago

Thoughtology paper -- the study of reasoning chains of thinking models -- is now published at TMLR. Since we wrote the paper, a lot has changed. Many more models have been released with open-weights.

1. These models are no longer thinking verbosely. GPT-OSS has crisper thoughts than Qwen3/R1.
2. GPT-OSS almost never self-verifies or tries alternate solutions.
3. Qwen3 has a larger bloom step (initial solution) than R1.

Among commonalities:
4. All of them still have a problem-specific sweet spot (i.e., overthinking doesn't help).
5. Incorrect problems still have a longer chain length.

On another note, thanks to @TmlrOrg for allowing us to submit a ridiculously long paper :). 135 pages in total. We thank reviewers and AE for their time.

This is the first paper where every member of the group contributed to it! Special thanks to @saraveramarjano and @arkil_patel.

We have a documentary around it taken by @CBCNews and @binhanv, hopefully you will get to see it one day.

Thanks to @SimonsInstitute for letting us work on this during their LLM2 semester program, @IVADO_Qc for the funding, and @Mila_Quebec members for the feedback.

Full paper: https://t.co/kaTNGCv6rk

6

259

40

153

30K

PerouzT retweeted

Alexandre Drouin @alexandredrouin

8 months ago

Excited to speak at the AAAI-26 Workshop on Agentic AI Benchmarks & Enterprise Tasks (Jan 26, Singapore) 🇸🇬 As agents are rapidly productized, realistic enterprise benchmarks for capabilities and reliability are essential! Submit: https://t.co/NYWO6Xv89b 🗓️ Oct 29 cc @gneubig

0

4

0

451

Perouz Taslakian @PerouzT

8 months ago

4️⃣ Agent Memory Poisoning Attacks: Detecting and defending against corrupted memories. Apply 👉 https://t.co/28UMVXJS9r

0

2

0

466

Perouz Taslakian @PerouzT

8 months ago

3️⃣ BlackBox Whisperer: Tuning smaller open models to collaborate effectively with large black-box LLMs. Apply 👉 https://t.co/rkj7vTE2VO

1

3

2

0

550

PerouzT retweeted

Christos Tsirigotis @tsirigoc

8 months ago

A big shoutout to amazing collaborators @vaibhav_adlakha @joaomonteirof @AaronCourville @PerouzT Come find us at poster 55, 11am-1pm, on Tuesday to learn more! https://t.co/4rePyb50jj #COLM2025 #informationretrieval #denseencoders

0

6

1

0

246

PerouzT retweeted

ServiceNow AI Research @ServiceNowRSRCH

8 months ago

SLAM Labs presents Apriel-1.5-15B-Thinker 🚀 An open-weights multimodal reasoning model that hits frontier-level performance with just a fraction of the compute.

14

333

77

114

60K

PerouzT retweeted

Massimo Caccia @MassCaccia

9 months ago

See you in San Diego 🚀 #NeurIPS2025

3

60

10

7

5K

Perouz Taslakian @PerouzT

9 months ago

🎉 Congratulations to all the authors for this great work -- especially to @Ahmed_Masry97 for his perseverance through the highs and lows of this project 😀 Excited to see AlignVLM accepted to #NeurIPS2025! @ServiceNowRSRCH

Ahmed Masry @Ahmed_Masry97

9 months ago

Excited to announce that AlignVLM got accepted to NeurIPS! 🎉🥳 We’ll be releasing the code and sharing an updated version of the paper with reviewer feedback soon. #NeurIPS2025

2

55

14

20

8K

1

9

0

783

PerouzT retweeted

Rabiul Awal @_rabiulawal

9 months ago

🚨 Exciting news! Our paper “WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation” is accepted for an oral presentation at EMNLP 2025! 🎉 WebMMU addresses a critical gap in AI evaluation: how well can models understand and build websites? 🧵1/n https://t.co/uh9USAIzCd

2

24

18

1

3K

PerouzT retweeted

Ahmed Masry @Ahmed_Masry97

10 months ago

UI-Vision vs GPT-5: Still holding the crown 👑 and far from saturation.

GPT-5 has strengths in coding and reasoning, but when it comes to computer-use tasks, it’s still awkward to rely on it alone. And our team's UI-Vision (ICML 2025) remains a key and still unbeaten multimodal eval framework for screen understanding and grounding.

What we continue to see: focused training is essential to beat our evals, and this is exactly where open-source models have been shining.

A big thanks to research teams at Microsoft, OpenCUA, and UI-Tars for actively using UI-Vision to push the limits of visual screen understanding. If you are working on VLMs or screen grounding applications for ICLR submissions, UI-Vision is the place to measure and improve your systems.

And we are only getting started: our next, UI-Vision-Grounding, is on the way 🚀. It brings a larger dataset that the community can make use of, harder grounding tasks, and new training recipes to help models level up in grounding abilities.

🔗 https://t.co/ObaK1zjD0p
📜 https://t.co/NoKK387FLd

Big kudos to all our partners and collaborators who made this possible! @ServiceNowRSRCH, @turingcom, @Mila_Quebec, @PShravannayak, @EdwardJian2, @aarashfeizi, @gspandana, @PerouzT, Qinghong Lin, @chrisjpal, @_rabiulawal, @dvazquezcv, @joanrod_ai, @RajeswarSai

2

19

9

3

2K

Perouz Taslakian @PerouzT

12 months ago

🚀 We just released the final test split of #RepLiQA, our dataset for evaluating QA on truly unseen content! 📚 Dataset: https://t.co/VTgESfBqv2 📝 NeurIPS ’24: https://t.co/9JKgGxdWSo Big thanks to my amazing co-authors at @ServiceNowRSRCH! 🙌 #RAG #LLMs #NLP #QA

0

10

4

0

425

PerouzT retweeted

NewInML @ NeurIPS 2025 @NewInML

12 months ago

New to ML research? Never published at ICML? Don't miss this! Check out the New in ML workshop at ICML 2025 — no rejections, detailed feedback, awards, and ICML tickets for selected authors. Deadline: June 10 (AoE) Submit: https://t.co/xNiccKTelq Info: https://t.co/1dBY6bnGji

0

27

14

12

2K

PerouzT retweeted

Joan Rodriguez @joanrod_ai

about 1 year ago

Thanks @_akhaliq for sharing our work! Excited to present our next generation of SVG models, now using Reinforcement Learning from Rendering Feedback (RLRF). 🧠 We think we cracked SVG generalization with this one. Go read the paper! https://t.co/EnSbizvWOQ More details on the demo, code, and models coming soon! Stay tuned 💫

3

123

39

65

16K

PerouzT retweeted

Patrice Bechard @patricebechard

about 1 year ago

🚀 New paper from our team at @ServiceNowRSRCH! 💫 StarFlow: Generating Structured Workflow Outputs From Sketch Images. We use VLMs to turn hand-drawn sketches and diagrams into executable workflows. ⚙️ 🔗 https://t.co/HRU22oXQsT 📝 https://t.co/2Rpp9Nwuiz #Sketch2Flow #AI #VLM

1

24

7

1

4K

Perouz Taslakian @PerouzT

about 1 year ago

Our team has released the UI-Vision benchmark (accepted at #ICML2025) for testing GUI agent visual grounding and action prediction! 🚀🚀🚀 🤗 Dataset: https://t.co/EWOTL0nVVF Special thanks to the students who led this effort, @PShravannayak and @EdwardJian2 @ServiceNowRSRCH

P Shravan Nayak @PShravannayak

about 1 year ago

🚀 Excited to share that UI-Vision has been accepted at ICML 2025! 🎉 We have also released the UI-Vision grounding datasets. Test your agents on it now! 🚀 🤗 Dataset: https://t.co/GhZSHI0uVO #ICML2025 #AI #DatasetRelease #Agents

0

37

13

2

5K

0

18

5

1

727
