Attention in the wild: balancing flexibility and stability
Review by Maëlle Lerebourg (@MaelleLerebourg), Surya Gayet, Floris P. de Lange, & Marius V. Peelen (@MariusPeelen)
Free access before June 23: https://t.co/XOWF0fabW8
Layers of Mind: Affect as Field History, Prediction as Field Organization — Returning to the Deferred Question of Qualia in the Natural Criticality Hypothesis
https://t.co/lcPdIYCeOu
BREAKING: MIT just mass released their AI library for free. (Links included)
I went through these and honestly... this is better than most paid courses I've seen.
Here's the full list of books:
Foundations
1. Foundations of Machine Learning - Core algorithms explained. Theory meets practice.
2. Understanding Deep Learning - Neural networks demystified. Visual explanations included.
3. Machine Learning Systems - Production-ready architecture. System design principles.
Advanced Techniques
4. Algorithms for ML - Computational thinking simplified. Decision-making frameworks.
5. Deep Learning - The definitive textbook. Covers everything deeply.
Reinforcement Learning
6. RL Basics (Sutton & Barto) - The classic. Agent training fundamentals.
7. Distributional RL - Beyond expected rewards. Advanced theory.
8. Multi-Agent Systems - Agents working together. Coordination and competition.
9. Long Game AI - Strategic agent design. Future-focused thinking.
Ethics & Probability
10. Fairness in ML - Bias detection. Responsible AI practices.
11. Probabilistic ML (Part 1 & 2)
Links: https://t.co/AhDqm9x1QC
Most people pay thousands for bootcamps that teach half of this.
Bookmark it. Start anywhere. Just start.
Repost for others. Follow for more insights on AI agents.
MIT's books on AI
Foundations
1. Foundations of Machine Learning - https://t.co/HxbXfsDIl6
2. Understanding Deep Learning - https://t.co/AyeQav2yzN
3. Machine Learning Systems - https://t.co/0AxGtjBFwA
Advanced Techniques
4. Algorithms for ML - https://t.co/LOjFeK1hut
5. Deep Learning - https://t.co/Ztmu7X6gNM
Reinforcement Learning
6. RL Basics (Sutton & Barto) - https://t.co/HAWxL28df1
7. Distributional RL - https://t.co/VB1zBuSzag
8. Multi-Agent Systems - https://t.co/3tWqJaimYn
9. Long Game AI - https://t.co/vYDuy1XKT2
Ethics & Probability
10. Fairness in ML - https://t.co/B4lAj2ivpF
11. Probabilistic ML (Part 1) - https://t.co/folJrX24sf
12. Probabilistic ML (Part 2) - https://t.co/BMOjc8qSqZ
🧠📢 New preprint alert
Large-scale ephys is exploding but spike sorting remains the computational bottleneck. A 2-hr, 6-probe Neuropixels 2.0 Quad Base session can take over a week to sort on a single machine. Here's a better solution. 🧵
📣 Calling early-career cognitive scientists in the Asia-Pacific!
The CogSci Asia-Pacific Meetup Kickoff is coming:
📍 Dec 6–7, 2025 | Tokyo
🌱 Retreat-style event, community building, and cross-cultural exchange.
Details: https://t.co/vbxkzx2oPI
#CogSci2025 #CognitiveScience
What if our intuition about consciousness is wrong? @annakaharris
"We have no direct evidence that consciousness arises from complexity. We assume it because it seems that way to us, based on what we now understand to be illusions created by the brain."
A: Do you know why consciousness research seems so far off?
B: Because it’s hard?
A: If the problem of consciousness gets solved, they’ll be out of a job.
B: …
The chapter on consciousness is usually tucked away at the very end of neuroscience textbooks.
Who invented convolutional neural networks (CNNs)?
1969: Fukushima had CNN-relevant ReLUs [2].
1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100x more costly than in 1989, and a billion times more costly than today.
1987: Waibel applied Linnainmaa's 1970 backpropagation [3] to weight-sharing TDNNs with 1-dimensional convolutions [4].
1988: Wei Zhang et al. applied "modern" backprop-trained 2-dimensional CNNs to character recognition [5].
All of the above was published in Japan 1979-1988.
1989: LeCun et al. applied CNNs again to character recognition (zip codes) [6,10].
1990-93: Fukushima’s downsampling based on spatial averaging [1] was replaced by max-pooling for 1-D TDNNs (Yamaguchi et al.) [7] and 2-D CNNs (Weng et al.) [8].
2011: Much later, my team with Dan Ciresan made max-pooling CNNs really fast on NVIDIA GPUs. In 2011, DanNet achieved the first superhuman pattern recognition result [9]. For a while, it enjoyed a monopoly: from May 2011 to Sept 2012, DanNet won every image recognition challenge it entered, 4 of them in a row. Admittedly, however, this was mostly about engineering & scaling up the basic insights from the previous millennium, profiting from much faster hardware.
Some "AI experts" claim that "making CNNs work" (e.g., [5,6,9]) was as important as inventing them. But "making them work" largely depended on whether your lab was rich enough to buy the latest computers required to scale up the original work. It's the same as today. Basic research vs engineering/development - the R vs the D in R&D.
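For readers unfamiliar with the downsampling change mentioned in the 1990-93 entry, here is a minimal illustrative sketch in plain Python. It is not code from any of the cited papers. The function `pool2d`, the helper `mean`, and the toy `feature_map` are hypothetical names and data chosen only to contrast spatial averaging with max-pooling over non-overlapping 2x2 windows.

```python
def pool2d(image, size, reduce_fn):
    """Downsample a 2-D list of numbers using non-overlapping size x size windows.

    reduce_fn maps the list of values in one window to a single number.
    """
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # Collect every value inside the current window.
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    """Arithmetic mean, used here for spatial averaging."""
    return sum(values) / len(values)


# Toy 4x4 feature map (made-up numbers, for illustration only).
feature_map = [
    [1, 3, 0, 0],
    [5, 7, 0, 8],
    [2, 2, 4, 4],
    [2, 2, 4, 4],
]

avg_pooled = pool2d(feature_map, 2, mean)  # spatial averaging
max_pooled = pool2d(feature_map, 2, max)   # max-pooling

print(avg_pooled)
print(max_pooled)
```

In the top-right window a single strong activation (8) survives max-pooling unchanged, whereas averaging dilutes it to 2.0. That is the practical difference between the two downsampling schemes.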
REFERENCES
[1] K. Fukushima (1979). Neural network model for a mechanism of pattern recognition unaffected by shift in position — Neocognitron. Trans. IECE, vol. J62-A, no. 10, pp. 658-665, 1979.
[2] K. Fukushima (1969). Visual feature extraction by a multilayered network of analog threshold elements. IEEE Transactions on Systems Science and Cybernetics. 5 (4): 322-333. This work introduced rectified linear units (ReLUs), now used in many CNNs.
[3] S. Linnainmaa (1970). Master's Thesis, Univ. Helsinki, 1970. The first publication on "modern" backpropagation, also known as the reverse mode of automatic differentiation. (See Schmidhuber's well-known backpropagation overview: "Who Invented Backpropagation?")
[4] A. Waibel. Phoneme Recognition Using Time-Delay Neural Networks. Meeting of IEICE, Tokyo, Japan, 1987. Backpropagation for a weight-sharing TDNN with 1-dimensional convolutions.
[5] W. Zhang, J. Tanida, K. Itoh, Y. Ichioka. Shift-invariant pattern recognition neural network and its optical architecture. Proc. Annual Conference of the Japan Society of Applied Physics, 1988. First backpropagation-trained 2-dimensional CNN, with applications to English character recognition.
[6] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel: Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, 1(4):541-551, 1989. See also Sec. 3 of [10].
[7] K. Yamaguchi, K. Sakamoto, A. Kenji, T. Akabane, Y. Fujimoto. A Neural Network for Speaker-Independent Isolated Word Recognition. First International Conference on Spoken Language Processing (ICSLP 90), Kobe, Japan, Nov 1990. A 1-dimensional convolutional TDNN using Max-Pooling instead of Fukushima's Spatial Averaging [1].
[8] J. Weng, N. Ahuja, T. S. Huang (1993). Learning recognition and segmentation of 3-D objects from 2-D images. Proc. 4th Intl. Conf. Computer Vision, Berlin, pp. 121-128. A 2-dimensional CNN whose downsampling layers use Max-Pooling (which has become very popular) instead of Fukushima's Spatial Averaging [1].
[9] In 2011, the fast and deep GPU-based CNN called DanNet (7+ layers) achieved the first superhuman performance in a computer vision contest. See overview: "2011: DanNet triggers deep CNN revolution."
[10] How 3 Turing awardees republished key methods and ideas whose creators they failed to credit. Technical Report IDSIA-23-23, Swiss AI Lab IDSIA, 14 Dec 2023. See also the YouTube video for the Bower Award Ceremony 2021: J. Schmidhuber lauds Kunihiko Fukushima.
What will society think about AI consciousness? Lessons from the animal case
Science & Society by Lucius Caviola, Jeff Sebo, & Jonathan Birch, https://t.co/UM7R8v06nb