Francesco Pinto @ Neurips 2024

@FraPintoML

Postdoc #UChicago, ex-#UniversityOfOxford, #Meta, #Google, #FiveAI, #ETHZurich. Trustworthy and Privacy-Preserving ML. Email: [email protected]

Oxford, UK

Joined January 2022

176 Following

50 Followers

43 Posts

FraPintoML retweeted

Zhaorun Chen

@ZRChen_AISafety

over 1 year ago

🚀 Introducing 𝐒𝐚𝐟𝐞𝐖𝐚𝐭𝐜𝐡! 🚀
While generative models 👾🎥 like Sora and Veo 2 have shown us some stunning videos recently, they also make it easier to produce harmful content (sexual🔞, violent🙅‍♂️, deepfakes🧟‍♂️).

🔥 𝐒𝐚𝐟𝐞𝐖𝐚𝐭𝐜𝐡 is here to help 😎: the first MLLM-based video guardrail model designed to follow customized safety policies and provide guardrails with precise explanations in a zero-shot manner.

In addition, we also introduce SafeWatch-Bench📊, a 2M+ high-quality video guardrail dataset covering over 30 unsafe video scenarios from various real-world platforms and SOTA generative models to comprehensively cover all potential risks.

🧐Why SafeWatch?
👉1. Strong policy-following: trained on diverse videos and policy taxonomies, yielding high generalizability to unseen scenarios and subtle policy definitions.
👉2. High Inference Speed: introducing two plug-and-play modules to process policies in parallel and prune irrelevant video tokens, reducing inference costs and eliminating positional bias.
👉3. In-depth explanations: trained on high-quality explanations from SafeWatch-Bench📊 labeled by a rigorous multi-agent consensus pipeline and verified by human experts.

We evaluate SafeWatch on a large variety of guardrail tasks:
1️⃣ On both real-world and generative video subsets of SafeWatch-Bench, SafeWatch outperforms SOTAs, including GPT-4o, by 29.2% and 27.2% on average, while requiring much less inference time.
2️⃣ On 5 existing video guardrail benchmarks, SafeWatch achieves 87.1% accuracy, consistently outperforming previous SOTAs.
3️⃣ On 4 new video categories and unseen policy taxonomies, as well as 4 different prompting tasks, SafeWatch maintains high accuracy and outperforms GPT-4o (renowned for its zero-shot generalizability).

🔥🔥 Our project has been released:
👉Paper link: https://t.co/GaVau8jwzj
👉Project page: https://t.co/lCyLzOOOpo
👉Code (coming soon): https://t.co/N23B4BuRa7

Francesco Pinto @ Neurips 2024 @FraPintoML

over 1 year ago

I’ll be at #NeurIPS2024 from now until Sunday. DM me here or on Whova to chat about (multimodal) large language model privacy, memorisation, training strategies using synthetic data, agents, judges, distribution shift robustness, hallucinations and uncertainty estimation.

Francesco Pinto @ Neurips 2024 @FraPintoML

over 1 year ago

Concerned your LLMs 🤖 may regurgitate copyrighted content ©️ and get you sued? 🩸💸 Fix it with model fusion 🫠 The result of a fantastic collaboration with @JavierAbadM @DonhauserKonst @FannyYangETH 🇨🇭🇬🇧

Javier Abad Martinez @JavierAbadM

over 1 year ago

(1/5) LLMs risk memorizing and regurgitating training data, raising copyright concerns. Our new work introduces CP-Fuse, a strategy to fuse LLMs trained on disjoint sets of protected material. The goal? Preventing unintended regurgitation 🧵 Paper: https://t.co/OjUAlg2b65

FraPintoML retweeted

Virtue AI

@VirtueAI_co

over 1 year ago

AI coding assistants (e.g. @cursor_ai, @codeiumdev , @github Copilot) are transforming software development—but how secure are they? Our new blog post reveals which tools stand up to security best practices, which introduce hidden vulnerabilities, and what you can do to safeguard your code. Learn more: https://t.co/LEfsnKqgUA #ai #coding #copilot #security #safety

FraPintoML retweeted

Chhavi Yadav

@chhaviyadav_

over 1 year ago

Can't wait for our workshop 'Interpretable AI: Past, Present and Future' @NeurIPSConf ! Check out our super interesting program with talks from @NeelNanda5 , @CynthiaRudin , #RichCaruana , @jxzhangjhu and @TongWang! We'll have a panel moderated by the amazing @kamalikac ! Help us spread the word, RTs appreciated!

Francesco Pinto @ Neurips 2024 @FraPintoML

over 1 year ago

🧵 [3/3] Special thanks to all coauthors: Adam Davies, Ashkan Khakzar, Anjun Hu, Arshia Hemmat, Jianhao Yuan, Tom Lamb, Jiyang Guan, Philip Torr. Work done at @OxfordTVG

Francesco Pinto @ Neurips 2024 @FraPintoML

over 1 year ago

🧵 [1/3] Heading to #Vancouver 🇨🇦 tomorrow to present our latest work in @OxfordTVG #UniversityOfOxford at #NeurIPS2024 🧠: - 💥 Improving on #StylizedImageNet, use #IllusionBench: can you see the cat 🐈‍⬛ Hidden in Plain Sight in the picture 🖼️? Paper: https://t.co/3Vbj4iGEQs

Francesco Pinto @ Neurips 2024 @FraPintoML

over 1 year ago

🧵 [2/3] - 💥 Is DP In-Context Learning really making any progress? At the @solarneurips workshop we present a very preliminary draft that questions its progress in several settings. Paper: https://t.co/QzAxTyqfUa

Francesco Pinto @ Neurips 2024 @FraPintoML

almost 2 years ago

(3/3) 🛡️ Instructing the model not to respond if the image has been manipulated reduces the chance that an attacker can extract PII, without degrading the model's accuracy. Thanks to all co-authors: Nathalie, @FlorianTramer, @OxfordTVG, @fedassa

Francesco Pinto @ Neurips 2024 @FraPintoML

almost 2 years ago

(1/3)🔥Multi-Modal LLMs (MLLMs) can respond to questions about document scans. How safe are they? Come to Hall C #2300 at 1.30pm to find out! 🧠Attackers may successfully query MLLMs to extract Personally Identifying Information! 🚨 https://t.co/UbUG973O3d

Francesco Pinto @ Neurips 2024 @FraPintoML

almost 2 years ago

(2/3)🔎 These models may regurgitate names, addresses, card numbers and IDs. 🧑‍🔬 We find that high input training resolution and stronger pre-training can significantly reduce the chances of regurgitation.

Francesco Pinto @ Neurips 2024 @FraPintoML

almost 2 years ago

3/3🤖 Simple prompting and editing outperform traditional augmentations, producing more robust models with fewer augmented samples. 🛑 Given the quality of generative models, filtering is no longer required to attain improved performance. @OxfordTVG @DYDYYDYYYD @adamdaviesnlp

Francesco Pinto @ Neurips 2024 @FraPintoML

almost 2 years ago

1/3,🧪🤖 What's the best way to improve model robustness to distribution shift using synthetic data? 💪 Come to Hall C 4-9 #912 #ICML2024 to find out! 💥Classifiers fail to recognise objects observed in previously unseen settings. 🧪 Can #StableDiffusion be used to fix this?

Francesco Pinto @ Neurips 2024 @FraPintoML

almost 2 years ago

2/3,⌨️ Prompting Text-to-Image generators proves to be an extremely effective (SOTA) and interpretable approach to synthesize interventional data for augmentation. 📈We extensively study the impact on robustness of conditioning mechanisms, prompting strategies and filtering.

Francesco Pinto @ Neurips 2024 @FraPintoML

almost 2 years ago

Thanks to all the co-authors for their fantastic work!

Francesco Pinto @ Neurips 2024 @FraPintoML

almost 2 years ago

[1/2] Excited to be presenting 3 papers on Responsible AI at #ICML2024! "Extracting Training Data from Document Based Visual Question Answering Models" https://t.co/UbUG973O3d "NJPP: Toward Interventional Data Augmentation Using Text-to-Image Generators" https://t.co/jPqaFPH1KT

Francesco Pinto @ Neurips 2024 @FraPintoML

almost 2 years ago

[2/2] "Strong Copyright Protection for Language Models via Adaptive Model Fusion" https://t.co/dyW6wPhCJn GenLaw and Foundation Models in the Wild workshops 👋 Let's grab a coffee and chat about uncertainty, privacy, memorization, robustness, synthetic data, multimodal agents

FraPintoML retweeted

Pau de Jorge @paudejorge

over 2 years ago

1/n Happy to share our recent work with @rvolpis @puneetdokania Philip Torr and Grégory Rogez 🚀🤖: Placing Objects in Context via Inpainting for Out-of-distribution Segmentation 🖌️🎨 ->🔍🐏🐏🐺🐏🐏 Paper: https://t.co/viqm2YIZEu Code: https://t.co/XpX2ysm6Da

FraPintoML retweeted

Guillermo Ortiz-Jiménez @gortizji

almost 2 years ago

In the era of long-context LLMs it is not enough to make models “forget” unsafe knowledge. Adversaries can use this long context to “un-unlearn” the malicious behavior 👿

Francesco Pinto @ Neurips 2024 @FraPintoML

about 2 years ago

🔥 Excited to be co-organizing this #ECCV2024 workshop with an outstanding line-up of speakers! 🗣️ 🔎 Submit if you have papers with new benchmarks and analyses inspecting emergent visual abilities ✔️ or limitations ❌ of Foundation Models! 🤖

Oxford Torr Vision Group @OxfordTVG

about 2 years ago

🔥 #ECCV2024 Showcase your research on the Analysis and Evaluation of emerging VISUAL abilities and limits of foundation models 🔎🤖👁️ at the EVAL-FoMo workshop 🧠🚀✨ 🔗 https://t.co/LYM3IFejUy @phillip_isola @sainingxie @chrirupp @OxfordTVG @berkeley_ai @MIT_CSAIL
