Felipe Rodríguez @Piperod_ - Twitter Profile

Very proud of our new preprint introducing reverse predictivity — a two-way test of AI–brain alignment. We find a striking asymmetry: models & brains don’t map to each other equally, while brain-to-brain mappings are symmetric 🧠🤖

Piperod_ retweeted

Jürgen Schmidhuber

@SchmidhuberAI

10 months ago

Who invented convolutional neural networks (CNNs)?

1969: Fukushima had CNN-relevant ReLUs [2].

1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100 x more costly than in 1989, and a billion x more costly than today.

1987: Waibel applied Linnainmaa's 1970 backpropagation [3] to weight-sharing TDNNs with 1-dimensional convolutions [4].

1988: Wei Zhang et al. applied "modern" backprop-trained 2-dimensional CNNs to character recognition [5].

All of the above was published in Japan 1979-1988.

1989: LeCun et al. applied CNNs again to character recognition (zip codes) [6,10].

1990-93: Fukushima’s downsampling based on spatial averaging [1] was replaced by max-pooling for 1-D TDNNs (Yamaguchi et al.) [7] and 2-D CNNs (Weng et al.) [8].

2011: Much later, my team with Dan Ciresan made max-pooling CNNs really fast on NVIDIA GPUs. In 2011, DanNet achieved the first superhuman pattern recognition result [9]. For a while, it enjoyed a monopoly: from May 2011 to Sept 2012, DanNet won every image recognition challenge it entered, 4 of them in a row. Admittedly, however, this was mostly about engineering & scaling up the basic insights from the previous millennium, profiting from much faster hardware.

Some "AI experts" claim that "making CNNs work" (e.g., [5,6,9]) was as important as inventing them. But "making them work" largely depended on whether your lab was rich enough to buy the latest computers required to scale up the original work. It's the same as today. Basic research vs engineering/development - the R vs the D in R&D.

REFERENCES

[1] K. Fukushima (1979). Neural network model for a mechanism of pattern recognition unaffected by shift in position — Neocognitron. Trans. IECE, vol. J62-A, no. 10, pp. 658-665, 1979.

[2] K. Fukushima (1969). Visual feature extraction by a multilayered network of analog threshold elements. IEEE Transactions on Systems Science and Cybernetics. 5 (4): 322-333. This work introduced rectified linear units (ReLUs), now used in many CNNs.

[3] S. Linnainmaa (1970). Master's Thesis, Univ. Helsinki, 1970. The first publication on "modern" backpropagation, also known as the reverse mode of automatic differentiation. (See Schmidhuber's well-known backpropagation overview: "Who Invented Backpropagation?")

[4] A. Waibel. Phoneme Recognition Using Time-Delay Neural Networks. Meeting of IEICE, Tokyo, Japan, 1987. Backpropagation for a weight-sharing TDNN with 1-dimensional convolutions.

[5] W. Zhang, J. Tanida, K. Itoh, Y. Ichioka. Shift-invariant pattern recognition neural network and its optical architecture. Proc. Annual Conference of the Japan Society of Applied Physics, 1988. First backpropagation-trained 2-dimensional CNN, with applications to English character recognition.

[6] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel: Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, 1(4):541-551, 1989. See also Sec. 3 of [10].

[7] K. Yamaguchi, K. Sakamoto, A. Kenji, T. Akabane, Y. Fujimoto. A Neural Network for Speaker-Independent Isolated Word Recognition. First International Conference on Spoken Language Processing (ICSLP 90), Kobe, Japan, Nov 1990. A 1-dimensional convolutional TDNN using Max-Pooling instead of Fukushima's Spatial Averaging [1].

[8] Weng, J., Ahuja, N., and Huang, T. S. (1993). Learning recognition and segmentation of 3-D objects from 2-D images. Proc. 4th Intl. Conf. Computer Vision, Berlin, pp. 121-128. A 2-dimensional CNN whose downsampling layers use Max-Pooling (which has become very popular) instead of Fukushima's Spatial Averaging [1].

[9] In 2011, the fast and deep GPU-based CNN called DanNet (7+ layers) achieved the first superhuman performance in a computer vision contest. See overview: "2011: DanNet triggers deep CNN revolution."

[10] How 3 Turing awardees republished key methods and ideas whose creators they failed to credit. Technical Report IDSIA-23-23, Swiss AI Lab IDSIA, 14 Dec 2023. See also the YouTube video for the Bower Award Ceremony 2021: J. Schmidhuber lauds Kunihiko Fukushima.

Felipe Rodríguez @Piperod_

10 months ago

https://t.co/flFN8pCJz1

Piperod_ retweeted

Patrick Mineault

@patrickmineault

over 1 year ago

Excited to release what we’ve been working on at Amaranth Foundation, our latest whitepaper, NeuroAI for AI safety! A detailed, ambitious roadmap for how neuroscience research can help build safer AI systems while accelerating both virtual neuroscience and neurotech. 1/N

https://t.co/tPDn4hqMGQ

Piperod_ retweeted

Yiping Lu

@2prime_PKU

11 months ago

Anyone knows adam?

Piperod_ retweeted

Andy Keller @t_andy_keller

11 months ago

Why do video models handle motion so poorly? It might be lack of motion equivariance. Very excited to introduce: Flow Equivariant RNNs (FERNNs), the first sequence models to respect symmetries over time. Paper: https://t.co/dkk43PyQe3 Blog: https://t.co/I1gpam1OL8 1/🧵

Piperod_ retweeted

Nick Jiang @nickhjiang

12 months ago

Vision transformers have high-norm outliers that hurt performance and distort attention. While prior work removed them by retraining with “register” tokens, we find the mechanism behind outliers and make registers at ✨test-time✨—giving clean features and better performance! 🧵

https://t.co/5UnFnLK7A6

Piperod_ retweeted

Abdullah Hamdi

@Eng_Hemdi

about 1 year ago

Last week, our Triangle splatting paper was quietly released, and since then the tech community ignited fierce debates about it! It was trending on @hackernews! Today we released the code! A deep dive into the epic “comeback” of Triangles to the throne of 3D 🧵 1/n

https://t.co/aX1nlUmgmp

Piperod_ retweeted

Ilir Aliu

@IlirAliu_

about 1 year ago

A robot hand that grasps over 500 totally new objects without fail? Zero-shot, single-view & super reliable ⬇️ + Paper

Grasping random objects is hard for robots, especially when shapes, weights, and materials vary. RobustDexGrasp solves this with a smart new way of seeing and controlling the hand, leading to near-perfect grasping, even in noisy or cluttered scenes.

Thank you for sharing, @Hui_Zhang_eth 🙏 Follow him!!

What makes it special

✅ Grabs 500+ unseen objects with 94.6% success using only single-view input

✅ Learns local shapes, not full geometry, for better generalization

✅ Trained with just 35 objects in sim but works in the real world with hundreds more

✅ Adapts to noise, unexpected forces, and even plays chess with VLM planning

It shows that smart sensing and adaptive control can take dexterous grasping to the next level.

Project: https://t.co/JWyFmmCmJ5

Paper: https://t.co/M90aheG6J6

Piperod_ retweeted

elvis

@omarsar0

about 1 year ago

265 pages of everything you need to know about building AI agents. 5 things that stood out to me about this report:

Piperod_ retweeted

Remi Cadene

@RemiCadene

about 1 year ago

Meet SO-101, next-gen robot arm for all, by @huggingface 🤗 Enables smooth takeover to boost AI capabilities, faster assembly (20mn), same affordable price ($100 per arm) 🤯 Get yours today! Links in thread below 👇

Piperod_ retweeted

Srini Turaga @srinituraga

about 1 year ago

This preprint is now published at @Nature. With current and former DeepMinders @yuvaltassa, Josh Merel, Matt Botvinick, and my @HHMIJanelia colleagues @vaxenburg, Igor Siwanowicz, @KristinMBranson, @MichaelBReiser, Gwyneth Card and more

Piperod_ retweeted

MIT CSAIL

@MIT_CSAIL

about 1 year ago

Happy birthday Joseph Fourier, whose 1822 equation allows us to listen to mp3s today: https://t.co/0AQNkGAxnu

Piperod_ retweeted

Remi Cadene

@RemiCadene

about 1 year ago

A banger just got released 💥 Here is a snapshot of L2D, the biggest self-driving dataset by far!

- 90 TeraBytes of data
- 5000 hours of driving
- 6 surrounding HD cameras
- OPENLY AVAILABLE
- Train your car to drive like @Tesla at home

🧵 More details in thread

Piperod_ retweeted

Poonam Soni

@CodeByPoonam

about 1 year ago

AI can now generate high-quality music, and it sounds insanely good NotaGen just dropped, and it's pre-trained on 1.6M pieces of music. 7 WILD examples so far

https://t.co/50UitCYj7z

Piperod_ retweeted

Ian Goodfellow

@goodfellow_ian

about 1 year ago

#5YearsLongCovid #HalfADecadeOfNeglect #LongCovidAwarenessDay2025

Piperod_ retweeted

talia konkle @talia_konkle

about 1 year ago

I think this method is quite important for interpretability research, and for understanding learned representations. Hats off to Thomas Fel and team!

Piperod_ retweeted

Alex Patrascu

@maxescu

about 1 year ago

I'm a bit confused... Google's Veo 2 is the best video model in text-to-video. But on the other hand... The newly released image-to-video for Veo 2 (on @freepik and @FAL) feels underwhelming. Input images generated with @runwayml Frames. Here it is compared to @LumaLabsAI
