Drew A Hudson

@drewAhudson

Research Scientist @GoogleDeepMind. PhD grad @StanfordAILab @stanfordnlp, interested in compositionality, reasoning, and representation learning

Joined March 2018

137 Following

491 Followers

363 Posts

Drew A Hudson @drewAhudson

about 2 years ago

@DebjitPaul2 @WilliamWangNLP @chrmanning Thanks! 😊🙏

drewAhudson retweeted

fly51fly @fly51fly

over 2 years ago

[CV] SODA: Bottleneck Diffusion Models for Representation Learning
D A. Hudson, D Zoran, M Malinowski, A K. Lampinen, A Jaegle, J L. McClelland, L Matthey, F Hill, A Lerchner [Google DeepMind] (2023)
https://t.co/OHaC0KObCY
- The paper introduces SODA, a self-supervised diffusion model for both representation learning and image generation.
- SODA consists of an image encoder that distills an input image into a compact latent code, and a conditional denoising diffusion decoder that uses the latent code to guide the image generation process.
- A tight bottleneck between the encoder and decoder encourages the emergence of disentangled and semantically meaningful latent representations.
- SODA is trained with a novel view synthesis objective, where the encoder encodes a source image, and the decoder uses that code to generate a novel, related target image. This acts as a powerful pretext task for self-supervised representation learning.
- SODA incorporates several innovations, including layer modulation, modified classifier-free guidance, and an inverted noise schedule, to further improve the latent representations.
- Experiments demonstrate SODA's strong performance on downstream tasks like ImageNet classification, its ability to generate high-fidelity images and novel views, and the disentangled nature of its latent space.
- The compact bottlenecked design and novel view training objective set SODA apart from prior diffusion models and establish its capabilities for both representation learning and controllable image synthesis.
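
The data flow summarized above (encode a source view into a compact code, then have a conditional denoiser reconstruct a related target view from noise) can be illustrated with a toy sketch. This is not the paper's implementation: the function names `encode`, `add_noise`, `denoise`, and `training_step` are hypothetical, average pooling stands in for the learned encoder, the "decoder" is a hand-written rule rather than a network, and the noise-prediction loss with the standard DDPM noising formula is an assumption, since the summary does not spell out the objective.

```python
import math
import random


def encode(image, latent_dim=4):
    """Toy encoder (assumption): average-pool a flat image into latent_dim numbers.

    The short output vector plays the role of the bottleneck latent code.
    """
    chunk = len(image) // latent_dim
    return [sum(image[i * chunk:(i + 1) * chunk]) / chunk for i in range(latent_dim)]


def add_noise(target, alpha_bar, rng):
    """Standard DDPM forward step: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in target]
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    x_t = [a * x + b * e for x, e in zip(target, eps)]
    return x_t, eps


def denoise(x_t, z, alpha_bar):
    """Toy conditional denoiser (assumption): guess the clean image by upsampling
    the latent code z, then solve the noising formula for the implied noise."""
    chunk = len(x_t) // len(z)
    x0_hat = [z[i // chunk] for i in range(len(x_t))]
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - a * x0) / b for x, x0 in zip(x_t, x0_hat)]


def training_step(source, target, alpha_bar, rng):
    """One step of the sketch: encode the source view, noise the target view,
    and score the latent-conditioned noise prediction with mean squared error."""
    z = encode(source)
    x_t, eps = add_noise(target, alpha_bar, rng)
    eps_hat = denoise(x_t, z, alpha_bar)
    return sum((p - e) ** 2 for p, e in zip(eps_hat, eps)) / len(eps)


if __name__ == "__main__":
    rng = random.Random(0)
    source = [0.0] * 4 + [1.0] * 4 + [0.5] * 4 + [0.25] * 4
    print(training_step(source, source, alpha_bar=0.5, rng=rng))
```

In this toy setup the loss is close to zero only when the 4-number code carries enough information to reconstruct the target, which is the intuition behind the bottleneck: whatever the decoder needs to know about the image has to pass through the compact latent.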

drewAhudson retweeted

Yann LeCun

@ylecun

over 2 years ago

But seriously folks, this is a short and juicy tirade in which I say: (0) there will be superhuman AI in the future (1) they will be under our control (2) they will not dominate us nor kill us (3) they will mediate all of our interactions with the digital world (4) hence, they will need to be open platforms so that everyone can contribute to training and tuning them.

481

822

Drew A Hudson @drewAhudson

over 2 years ago

@soumikkanad @arankomatsuzaki Will be the goal for the next time! 😊 Am already running experiments of the model with a ViT encoder! 🙂

Drew A Hudson @drewAhudson

over 2 years ago

@soumikkanad @arankomatsuzaki Yep AI research these days goes fast indeed! 🏇

Drew A Hudson @drewAhudson

over 2 years ago

@soumikkanad @arankomatsuzaki (And lastly, as an unofficial side note, while of course the date of the publication is totally what counts, we actually got the ImageNet score at the end of Feb, and the paper publication got delayed a lot because of my PhD graduation/thesis writing.. 🙂)

Drew A Hudson @drewAhudson

over 2 years ago

@soumikkanad @arankomatsuzaki In addition to that, I wasn't aware of the diffusion-beats-gans paper while writing; I'll be most happy to add a discussion of it to the paper!

Drew A Hudson @drewAhudson

over 2 years ago

@soumikkanad @arankomatsuzaki Finally, considering the number of parameters is critical for a valid comparison. While in SODA we make sure to use a model of a size comparable to competing methods, the first paper you mention uses 5x more parameters (!) (couldn't find model size details for the second paper). (3/3)

182

Drew A Hudson @drewAhudson

over 2 years ago

@soumikkanad @arankomatsuzaki In addition, I believe a key result is that for light data augmentation, our model beats all models we compared to, including both the leading generative and discriminative approaches, such as MAE, DINO, BYOL etc! (2/3)

188

Drew A Hudson @drewAhudson

over 2 years ago

@soumikkanad @arankomatsuzaki Hi @soumikkanad, thank you for these references! Note that these works achieve 61.95-63.9% in linear probing, significantly lower than both our SODA and contrastive methods (>72%) (1/3)

275

drewAhudson retweeted

Aran Komatsuzaki

@arankomatsuzaki

over 2 years ago

SODA: Bottleneck Diffusion Models for Representation Learning The first diffusion model to succeed at ImageNet linear-probe classification proj: https://t.co/FEG1zn873S abs: https://t.co/mg2wqJwyhY

223

111

31K

drewAhudson retweeted

Yann LeCun

@ylecun

over 2 years ago

LLMs obviously have *some* understanding of what they read and generate. But this understanding is very limited and superficial. Otherwise, they wouldn't confabulate so much and wouldn't make mistakes that are contrary to common sense.

I have argued, since at least 2016, that AI systems need to have internal models of the world that would allow them to predict the consequences of their actions, and thereby allow them to reason and plan. Current Auto-Regressive LLMs do not have this ability, nor anything close to it, and hence are nowhere near reaching human-level intelligence. In fact, their complete lack of understanding of the physical world and lack of planning abilities puts them way below cat-level intelligence, never mind human-level.

AR-LLMs can accumulate large amounts of textual knowledge (if only approximately) and can retrieve it with appropriate context (if only approximately). More than a cat, certainly. But how is it that any 10-year-old can learn to clear up the dinner table and fill up the dishwasher in one shot, whereas we are nowhere near having robots capable of learning this in any amount of time? Obviously, we are still missing something really big to reach human-level AI.

I have written where I think AI research should go over the next decade or two to bridge that gap: https://t.co/yqWEubV9id

All my talks of the last couple of years have been on "objective driven AI architectures" which are an attempt to bridge that gap while making AI systems controllable, safe, and subservient to humanity. E.g. this one: https://t.co/2QTDpXWjzy

191

278

838K

drewAhudson retweeted

Google DeepMind @GoogleDeepMind

over 2 years ago

Today with @YouTube, we’re announcing Lyria: our most advanced music generation model to date. 🎶 We’re also releasing 2️⃣ AI experiments in close collaboration with participating artists and creators to bring their ideas to life responsibly. → https://t.co/i9ve66A5rv

952

230

227

392K

drewAhudson retweeted

hardmaru

@hardmaru

over 3 years ago

A dog trained to fetch a stick using Deep Reinforcement Learning. Probably the cutest DeepRL demo ever made.

878

111

drewAhudson retweeted

Lex Fridman

@lexfridman

over 3 years ago

Simple is better than complicated.

21K

156

Drew A Hudson @drewAhudson

over 3 years ago

@adam_matan @pembleton Baseball?! I haven't actually watched the video, but is there a chance it's about the case of MLB's live results that went down right after the launch?! Itay, kudos on the talk! :-) 🎉🥳⚾️

drewAhudson retweeted

Stanford AI Lab

@StanfordAILab

over 3 years ago

Last night, 50 years to the day after the pioneering Intergalactic SpaceWar Olympics first video game contest (https://t.co/urYsg77H7b), current and former members of @StanfordAILab gathered for the 2022 SAIL Gaming Tournament. Everyone had fun, with games old and new.

drewAhudson retweeted

Kyunghyun Cho

@kchonyc

almost 4 years ago

"Is this the real life? Is this just fantasy? Caught in #ICML2022, No escape from #NeurIPS2022 Open your laptop, Look up to Openreview and see," (1/3)

153

drewAhudson retweeted

Yi Ma

@YiMaTweets

almost 4 years ago

I always tell my students: if you only read papers published in the past five years, the probability that you will have any ground-breaking idea in your lifetime is nearly zero. The odds are probably less than winning a jackpot on a slot machine in Vegas...

426

382
