I’m getting increasingly annoyed by young people complaining that they cannot do AI-related research unless they join big industrial labs… well, here is my reply: academia is supposed to work on ideas that money cannot buy!
This may be a controversial take, but I think it needs to be said: the gap between computer vision research in academia and industry is widening with every conference.
A huge fraction of @CVPR papers—especially those that boil down to "we tweaked/fine-tuned/RL'ed large-scale model X to improve on task Y"—will become obsolete with the next model release. That's not where academia creates lasting value. PIs should adapt much faster to this changing reality.
Academia should focus on fundamentally new ideas, new problem formulations, explaining emergent phenomenology, or uncovering blind spots that industry can later solve with scale, compute, and data.
The journal extension of this paper has been accepted to Medical Image Analysis. We have added segmentation masks and conducted more experiments. Preprint: https://t.co/jQemPcS5kP Code: https://t.co/TSniqnVqRl
We got a NeurIPS paper accepted! 🎉 We've annotated 113k labels on blood cell images, detailing the fine-grained concepts pathologists recognize. Annotating at this level of detail and scale is unprecedented, offering unique value to AI in pathology. https://t.co/W39sP83ZeE
If you really want to feel old, remember that when you started your PhD you were writing Caffe neural net layers in C++ and constantly messing up the backprop equations such that you’d get no gradient flow, or that you used intermediate VGG-16 “feature maps” as embeddings for all tasks.
FineBio, our multi-view video dataset aimed at automatically recording biological experiments, has been accepted to IJCV! The dataset and code are publicly available.
Our paper analyzing visual-linguistic learning in infants has been accepted to CVPR25. I always wanted to publish a paper of this kind: no SOTA results, no new technique, but scientifically interesting. I failed as a student, but my student did it. Happy moment to be an advisor.😀
Many people are in the middle of the @CVPR deadline. So I'm sharing my guide to writing a CVPR paper (or any paper). My students have had this for years but I haven't shared it publicly before. I hope you find it useful and write a great paper. #CVPR2025 https://t.co/RAvnQFnuLQ
How Rotary Position Embedding Supercharges Modern LLMs
New video! 🥳 https://t.co/HODtPqvQBC
Topics:
1⃣ Why attention is permutation equivariant
2⃣ Earlier attempts at using sinusoidal positional encoding
3⃣ Rotary Position Embedding (RoPE) for encoding relative position
4⃣ Long-context LLMs with RoPE.
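For readers who want the relative-position idea from topic 3 in concrete form, here is a minimal pure-Python sketch. It is not taken from the video: the helper names `rope` and `dot`, the toy vectors, and the adjacent-pair layout are assumptions for illustration (real implementations differ in how they pair dimensions), and `base=10000.0` is the commonly used default.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each adjacent pair (vec[i], vec[i+1]) by the angle pos * base**(-i/d).

    Hypothetical helper for illustration; assumes an even dimension d.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("dimension must be even")
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # lower pairs rotate faster
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    """Plain dot product, standing in for one attention logit."""
    return sum(x * y for x, y in zip(a, b))

# Toy query and key vectors (made-up numbers).
q = [0.3, -1.2, 0.8, 0.5]
k = [1.0, 0.4, -0.7, 0.2]

# Same offset (3) at two different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 105), rope(k, 102))
```

Up to floating-point error, `score_a` and `score_b` agree, because the two rotations combine into a single rotation that depends only on the position difference. Each rotation also preserves vector length, so only the angle between query and key changes.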
We have a paper at CSCW! We studied the visual development of figures and tables from computer vision research papers. Scholarly writing has changed to better fit the online attention economy.
There was no way to have bold numbers because we had no ground truth data. I had one sequence of a Pepsi can and no way to get my own data into a computer. I later invested a lot of time in creating datasets and benchmarks so that we could start to understand what worked and why (e.g., for optical flow, Middlebury and Sintel).
Have you heard the term full-stack researcher? That's what I identify as these days 😂! From data 📊, to architecture design 🏗️, to training a model 🧠, to benchmarking 🧪, and finally making a web app to demo the work 💻! I gotta admit tho, I had forgotten how fun it is to make web apps 😄
If you’re a researcher in tech and you’re not reading anything more than five years old, you can be sure your own work will be forgotten five years from now.