ParalESN made it to ICML 🚀
We revisit the reservoir computing paradigm to enable parallel recurrence and higher-dimensional reservoirs.
Preprint: https://t.co/UsvaASiYie
Co-authors: G. Lagomarsini, A. Ceni, @claudiogallicc1 #ICML #ICML2026
🎓 We're thrilled to welcome 121 new PhD students to the ELLIS network from our 2025/2026 call — selected from a record-breaking 2,100+ submitted applications, a 43% increase from last year.
👉 Read the full article for more details: https://t.co/II3rkmOIOm
@Guy_T_Sky @furongh Have the same question. The approach is also more (memory-)efficient so, unless I missed something, it cannot be due to compute, no?
This is your annual reminder that we don’t need to speculate about whether we will have a “theory of deep learning” and what form it might take, because we already have a basic understanding of generalization in deep learning: https://t.co/AgHdSQjCvU
Reviewers & ACs for #ICML2026 have been recognized for their service!
- Reviewers: 4439 Gold (free registration), 4437 Silver. 17749 total reviewers were assigned >= 1 paper
- ACs: 1647 receive free registration, out of 1691 who were assigned >= 1 paper
TY for your hard work!
A plot of ICLR papers by country is making the rounds, showing no EU + Japan papers and people are drawing all kinds of conclusions.
...but the plot excludes all (!) EU institutions due to a cutoff. China + USA still dominant of course, but the full picture looks a bit different.
🚀 Reservoir Computing is officially back in the Deep Learning game!
Thrilled to announce our paper "ParalESN" has been accepted at #ICML2026! 🎉
We resurrected RC for the DL era, finally unlocking parallel processing via associative scan.
Preprint: https://t.co/dYPBkCpGeq 🧵👇
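For readers wondering what "parallel processing via associative scan" can look like: a minimal sketch, assuming a scalar (diagonal) linear recurrence h_t = a_t * h_{t-1} + b_t. This is a generic illustration, not the ParalESN implementation; the function names `combine`, `sequential_scan`, and `parallel_style_scan` are hypothetical. Each step is an affine map, composing affine maps is associative, and so the prefix compositions can be built in about log2(T) rounds instead of T sequential steps.

```python
def combine(left, right):
    """Compose two affine maps h -> a*h + b, applying `left` first, then `right`."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def sequential_scan(a, b, h0=0.0):
    """Reference: run the recurrence one step at a time."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def parallel_style_scan(a, b, h0=0.0):
    """Hillis-Steele inclusive scan over (a_t, b_t) pairs.

    Runs in ceil(log2(T)) rounds. Within a round every position is updated
    independently from the previous round's values, so on parallel hardware
    each round could execute in one step. Here the rounds are simulated
    with plain Python lists.
    """
    elems = list(zip(a, b))
    step = 1
    while step < len(elems):
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(len(elems))
        ]
        step *= 2
    # elems[t] now holds the composed map for steps 0..t; apply it to h0.
    return [a_t * h0 + b_t for a_t, b_t in elems]
```

Both functions should return the same states up to floating-point rounding. In JAX, the same `combine` idea can be passed to `jax.lax.associative_scan`. A nonlinear update (for example, a tanh around the recurrence) does not compose this way, which is why the trick is tied to linear recurrences.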
@mhrezaeics AI-assisted research does not only generate slop; it also makes good labs more productive (i.e., a higher number of good submissions).
I would also wait a couple more cycles before judging the acceptance rate trajectory.
@shaohua0116 In some cases there may be some fundamental limitation in the paper that does not warrant a higher score. Fine with that, but I would prefer that they be explicit about it or tell me exactly what needs to be addressed for them to adjust the score.
[1/8] New paper with Hongjian Jiang, @YanhongLi2062, Anthony Lin, @Ashish_S_AI:
📜 Why Are Linear RNNs More Parallelizable?
We identify expressivity differences between linear/nonlinear RNNs and, conversely, barriers to parallelizing nonlinear RNNs 🧵👇
@gabriberton Agree.
I argue that a good percentage of those complaining are the ones affected by the desk rejections. Otherwise it's absurd to complain about mechanisms introduced to protect authors from such low-effort reviews...
Introducing M²RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling
We bring back non-linear recurrence to language modeling and show it's been held back by small state sizes, not by non-linearity itself.
📄 Paper: https://t.co/AS8e2tNrRa
💻 Code: https://t.co/LMvBcI22Du
🤗 Models: https://t.co/NCmjrpNriq
Fun fact: In an N-layer Transformer, you can drop the QK projections in the top N/2 layers and reuse the attention from layer N/2, and it basically works.
The top-half layers now act like an MLP mixer with its mixing weights (attention) generated by a Transformer (the bottom half).
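A toy sketch of that idea, assuming a single-head, causal, pre-norm-free stack written in NumPy. All names (`make_layers`, `forward`, the `w_*` keys) are hypothetical, and this only illustrates the wiring, not any claim about trained-model quality. The bottom N/2 layers compute their own attention matrix; the top N/2 layers have no Q/K weights at all and reuse the last matrix computed in the bottom half.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_weights(x, w_q, w_k):
    """Causal single-head attention matrix of shape (T, T)."""
    q, k = x @ w_q, x @ w_k
    scores = q @ k.T / np.sqrt(q.shape[-1])
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    return softmax(np.where(future, -np.inf, scores))


def make_layers(n_layers, d_model, seed=0):
    """Random toy weights; only the bottom half gets Q/K projections."""
    rng = np.random.default_rng(seed)
    layers = []
    for i in range(n_layers):
        layer = {
            "w_v": 0.1 * rng.standard_normal((d_model, d_model)),
            "w_1": 0.1 * rng.standard_normal((d_model, 4 * d_model)),
            "w_2": 0.1 * rng.standard_normal((4 * d_model, d_model)),
        }
        if i < n_layers // 2:
            layer["w_q"] = 0.1 * rng.standard_normal((d_model, d_model))
            layer["w_k"] = 0.1 * rng.standard_normal((d_model, d_model))
        layers.append(layer)
    return layers


def forward(x, layers):
    """Returns the output and how many attention matrices were computed."""
    shared_attn, computed = None, 0
    for i, layer in enumerate(layers):
        if i < len(layers) // 2:
            # Bottom half: ordinary attention from this layer's own Q/K.
            shared_attn = attention_weights(x, layer["w_q"], layer["w_k"])
            computed += 1
        # Top half: shared_attn still holds layer N/2's matrix and is reused,
        # so token mixing there is a fixed (input-dependent) matrix multiply.
        x = x + shared_attn @ (x @ layer["w_v"])
        x = x + np.maximum(x @ layer["w_1"], 0.0) @ layer["w_2"]
    return x, computed
```

With `layers = make_layers(8, 16)` and an input of shape `(T, 16)`, `forward` computes only four attention matrices for eight layers, which is the sense in which the top half behaves like a mixer whose mixing weights come from the bottom half.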