Starting a 3-month journey into Computer Vision & AI Research.
I’ll be reading papers, building small projects, sharing simple explanations, visualizations, notes, and everything I learn along the way.
Moving from basics to advanced research topics. Stay tuned and follow! ⚡
I documented my notes, learnings, and a breakdown of the paper here:
https://t.co/h9EJjHflWn
https://t.co/dJKFYy11zD
Next papers this week:
ResNet → U-Net → ViT ⚡
What paper completely changed your understanding of AI?
Just finished reading the AlexNet paper.
Kinda crazy how one paper changed the direction of Computer Vision and pushed deep learning into the mainstream.
Started documenting my learnings publicly as I study CV research step-by-step. ⚡
One thing I really liked about AlexNet:
CNNs weren’t manually told which features matter.
They slowly learned visual hierarchies on their own, from edges → textures → objects.
That idea still shapes modern vision models today.
Also planning to start a Discord community for people interested in AI research & building.
I’ll be sharing:
• important papers
• useful websites/tools
• research notes
• learning roadmaps
• project ideas
• implementation resources
@GordonWetzstein Really cool work. Feels like there’s a deeper idea here around temporal ordering of information during denoising.
Have you explored whether different semantic features consistently emerge at specific timesteps?
Spectral Progressive Diffusion hints at a deeper idea:
generation quality may depend more on when frequencies emerge than on raw parameter count.
Temporal ordering of information could also become a scaling axis itself.
Any thoughts on this?
High-fidelity generation is hitting a scaling crisis as DiT compute grows with image resolution and video length. But do we need high-resolution denoising at every step?
We introduce Spectral Progressive Diffusion, a plug-and-play framework for efficient image and video generation that directly exploits the spectral autoregression property of diffusion to grow resolution during denoising.
[1/7]
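A rough way to see the "spectral autoregression" idea mentioned above: natural images tend to carry most of their power at low frequencies, while Gaussian noise is flat across frequencies, so high frequencies are the first to be buried by noise and the last to re-emerge during denoising. The sketch below is only a toy illustration, not the method from the paper. The 1/f² power model, the noise levels, and the helper name `visible_cutoff` are all assumptions made for this example.

```python
import numpy as np

def visible_cutoff(freqs, signal_power, noise_power):
    """Highest frequency whose signal power still exceeds the noise power.

    Hypothetical helper for illustration; returns 0.0 if every frequency
    is below the noise floor.
    """
    visible = freqs[signal_power > noise_power]
    return float(visible.max()) if visible.size else 0.0

# Assumption: image power falls off as 1/f^2, a common model of natural images.
freqs = np.arange(1, 257, dtype=float)  # frequency index 1..256
signal_power = 1.0 / freqs**2

# White Gaussian noise has the same power (sigma^2) at every frequency.
# The sigmas go from "early" (very noisy) to "late" (almost clean) steps.
for sigma in (0.6, 0.15, 0.03):
    cutoff = visible_cutoff(freqs, signal_power, sigma**2)
    print(f"sigma={sigma}: frequencies up to {cutoff:.0f} sit above the noise")
```

Under these assumptions the cutoff rises as the noise level drops, which is the intuition for why early, noisy steps might get away with a lower resolution.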
Introducing https://t.co/Iy19LPzkYA 🔥
Submit any paper, not just from @arxiv. It can come from any external source: a blog post and more.
AI will parse and index it so others can read it.
Feel free to give it a try :)
A bit about me:
• Experience in structured modeling for microscopy reconstruction (IEEE conference publication)
• Currently working on latent-space watermarking for generative models
I’m interested in deep learning, generative models, and visual representation learning.
🧵2/2
I’m currently looking for Machine Learning / Research Intern roles starting May 2026.
I’m Aryan Pandit, an ECE student working on ML and computational imaging.
If your team is hiring (or you know someone who is), I’d love to chat.
DMs open | CV: https://t.co/p7TthsMuKq
🧵1/2