Manuel Traub

@traub_manuel

IMPRS-IS scholar and PhD student at the University of Tuebingen

Germany

Joined February 2022

87 Following

34 Followers

40 Posts

Manuel Traub @traub_manuel

5 months ago

@NielsRogge @YesThisIsLion @MLStreetTalk If you put polar coordinates into your MLP it will also learn to extrapolate the spiral. The M-Layer is just a particularly well-suited inductive bias for this problem; in other domains it doesn't look that impressive.

251

traub_manuel retweeted

Anand Gopalakrishnan @agopal42

6 months ago

Our new paper shows that RoPE—the positional encoding used in most modern LLMs like Qwen, Gemma, DeepSeek—has a fundamental flaw: it entangles "what" (content) and "where" (position) information. Our fix (PoPE) is simple but powerful. Paper: https://t.co/XlltfcSwHQ

176

156K

traub_manuel retweeted

alphaXiv

@askalphaxiv

6 months ago

Recursive reasoning beats multi-billion-parameter models

You can now easily train your own 7M param model from scratch and outperform DeepSeek-r1 on ARC-AGI 1

We provide a simple speedrun script that handles setup, training, and eval in one go. https://t.co/c8l32QfRzF

201

87K

Manuel Traub @traub_manuel

6 months ago

Reminds me a lot of Active Inference: https://t.co/93QvGTe2yX

traub_manuel retweeted

Zhuang Liu

@liuzhuang1234

6 months ago

Stronger Normalization-Free Transformers – new paper.

We introduce Derf (Dynamic erf), a simple point-wise layer that lets norm-free Transformers not only work, but actually outperform their normalized counterparts. https://t.co/NAPJvfsEGI

175

788

166K

traub_manuel retweeted

Yifan Zhang

@yifan_zhang_

6 months ago

Mixture of Parrots: Experts improve memorization more than reasoning https://t.co/V4H34FuvTM

242

200

71K

traub_manuel retweeted

Yuchen Jin

@Yuchenj_UW

6 months ago

Larry Page & Sergey Brin had the PageRank paper (the algorithm behind Google Search) rejected. A reviewer called it “disjointed.”

Geoffrey Hinton's Dropout was rejected for being “too simple.”

I often feel the academic peer review is like a random process, especially when a paper is very innovative and changes the paradigm; it often looks "wrong" to reviewers in the old paradigm.

124

399

326K

traub_manuel retweeted

JFPuget 🇫🇷🇺🇦🇨🇦🇬🇱

@JFPuget

6 months ago

Interesting read from the ARChitects, 2nd place team on @arcprize competition on @kaggle. Their combination of diffusion LLMs with iterative improvement is quite interesting. It has some ties with TRM and HRM models. There is some irony though. They tried something different from their winning solution from last year because they thought it was not successful. Irony is we won reusing their last year solution (with some improvements). Key for us was to use better pretraining data.

118

10K

traub_manuel retweeted

Shashwat Goel

@ShashwatGoel7

6 months ago · Tübingen

You can wait for the academic compute crisis to get solved 🥱 or...

Just come to Tübingen for the:
1) Compute
2) Talent Density
3) Aesthetics, in research, workspaces, and the city as a whole

(Job) Markets are not efficient ;)

@FrancoisChauba1's plot extended by @nikhilchandak29

14K

traub_manuel retweeted

Yuchen Jin

@Yuchenj_UW

6 months ago

More papers should include a “Things We Tried That Didn’t Work” section.

DeepSeek R1 does this too, and it’s incredibly valuable. https://t.co/gCNrDlT7Zc

365

817

152K

traub_manuel retweeted

Haitham Bou Ammar

@hbouammar

6 months ago

Ok, let us complain about OpenReviewer's latest saga! As you are aware, reviewer names have been leaked, including mine and yours, likely. I think OpenReview and the conferences are trying their best to come up with solutions. I salute them for that! Thank you for the hard work on this. I know it must be very tough!

I came to learn that the distribution of reviewers is funny and weird. I kid you not, MSc and BSc students review at those top-tier conferences. I thought it was a myth, but it is not 🤓 Let us get some people upset! But, at least, let us be honest. Now let me tell you what is wrong with that:

1️⃣ MSc and BSc students, bright as they may be, are not experts in their fields. I think it is very awkward for senior researchers and professors to work their butt off to please students who are newbies to the field, and most of the time don't understand the papers well. Don't get me wrong, I am sure they put HUGE effort, but effort is just effort! It is not understanding! It might now become clearer to all of us why we get reviews like do 1000 more experiments, some weird comments - well, they don't know better.

2️⃣ The above would have been OK if they had actually listened and learned during the rebuttal process. But think about it! If you are an MSc and BSc reviewer, would you ever dare to admit you were wrong? It is a big deal for you; you want to propel your career, so how can you step back from claiming something outrageous? How mature as a researcher are you to suggest improvements to papers that have been worked on with people with over 10-15 years of experience? Heck, imagine someone did what you want to build your career on when you are still a BSc or MSc student. Wouldn't you simply reject it since it beats you to it?

3️⃣ Students tend to over-index on surface-level details because coursework trains them to prioritise correctness, completeness, and hyper-specific checklists. As a result, their reviews often fixate on typos, notation, formatting, and endless abstractions rather than the substance of the contribution. The entire evaluation becomes a grading exercise, not a scientific one, and the core ideas of the paper get lost.

4️⃣ Students often rely on intuition or “common sense” because they haven’t yet internalised the techniques, math, empirical tricks, and failure modes of the field. But top-tier research requires theory-backed judgment, statistical intuition, and the ability to separate real signal from noise. Those instincts take years to develop, and it's unfair to expect students to have them while evaluating frontier work. So less mathy papers and benchmarks rise to the top, and mathy ones get rejected.

... so yeah, it doesn't make any freaking sense. #openreview

Manuel Traub @traub_manuel

6 months ago

Be me
Submit new SOTA to @iclr_conf
Reviews 8,4,4,4
Schrödinger's accept
Enter rebuttal arc, sleep schedule and will to live overfitted to this one decision
OpenReviewer rolls nat 1
Committee pushes global reset
Paper enters timeline where I hadn’t sacrificed my mental health

500

traub_manuel retweeted

Raj Dabre @prajdabre

6 months ago

Clearly a pivotal moment where we need to rethink peer review. Some thoughts:

1. I have been an advocate of fully non-anonymous peer review. On one hand, people will be careful about what they write but on the other hand, most people will just not sign up for reviewing.

2. Conferences have badly failed at scaling peer review. This has been an issue at least since 2019 where I remember AAAI had close to 9k submissions. It's not possible to have 24k papers submitted per major conference and have them well reviewed.

3. At this point the review signal is extremely noisy and what determines acceptance is whether you can "trick" 3 people into following your narrative. Submit a decent paper enough times and it will get accepted.

4. There are way too many niches and too many papers coming out daily. I don't think anyone is up to date anymore. Therefore anyone reviewing your papers is likely not in a position to make a good judgement.

5. The competition due to artificially low acceptance rates is so intense that it's not unnatural for a reviewer to see any paper they are reviewing as a threat. This combined with point 4 means that the reviews will either be shallow or adversarial, both useless.

6. Clearly AI reviews are here to stay so best if we invest in proper infra and training so that people use AI ethically and appropriately and thus help repair peer review.

Now is the time!

14K

Manuel Traub @traub_manuel

6 months ago

@hive_echo The spiral example seems kind of selected for this problem, since the inductive bias of the M-Layer is polynomial decomposition and regularizing for simple polynomials gives you a sort of rotation matrix. If you train an MLP in polar coordinates you get the same extrapolation.
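
The point about polar coordinates can be illustrated with a small sketch. It is not the M-Layer and not an MLP: a plain least-squares line stands in for the model, and the spiral is assumed to be Archimedean (r = a + b * theta) with hypothetical constants a = 0.5 and b = 0.2. It also assumes the angle is unwrapped so that it keeps growing across turns, which a raw atan2 feature would not provide.

```python
import math

def spiral_point(theta, a=0.5, b=0.2):
    """Cartesian point on the assumed Archimedean spiral r = a + b * theta."""
    r = a + b * theta
    return r * math.cos(theta), r * math.sin(theta)

def to_polar_unwrapped(points):
    """Convert ordered Cartesian points to (theta, r) pairs.

    atan2 wraps at +/- pi, so a full turn is added whenever the raw
    angle jumps backwards by more than pi between consecutive points.
    """
    pairs = []
    prev_raw = None
    offset = 0.0
    for x, y in points:
        raw = math.atan2(y, x)
        if prev_raw is not None and raw - prev_raw < -math.pi:
            offset += 2 * math.pi
        prev_raw = raw
        pairs.append((raw + offset, math.hypot(x, y)))
    return pairs

def fit_line(pairs):
    """Ordinary least squares for r = slope * theta + intercept."""
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_r = sum(r for _, r in pairs) / n
    cov = sum((t - mean_t) * (r - mean_r) for t, r in pairs)
    var = sum((t - mean_t) ** 2 for t, _ in pairs)
    slope = cov / var
    return slope, mean_r - slope * mean_t

# Train on the first two turns only (theta from 0 to just under 4 * pi).
train = [spiral_point(i * 0.05) for i in range(252)]
slope, intercept = fit_line(to_polar_unwrapped(train))

# Extrapolate to the end of a third turn, which the fit never saw.
theta_new = 6 * math.pi
predicted_r = slope * theta_new + intercept
true_r = 0.5 + 0.2 * theta_new
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"r at 6*pi: predicted={predicted_r:.3f} true={true_r:.3f}")
```

In (theta, r) space the spiral is a straight line, so even this linear stand-in extrapolates beyond the training turns. In Cartesian (x, y) space the same curve has no such simple form, which is why the choice of input coordinates acts as the inductive bias here.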

traub_manuel retweeted

Peter Richtarik

@peter_richtarik

7 months ago

I am an AC for ICLR 2026. One of the papers in my batch was just withdrawn. The authors wrote a brief response, explaining why the reviewers failed at their job. I agree with most of their comments. The authors gave up. They are fed up. Just like many of us. I understand. We pretend the emperor has clothes, but he is naked.

Here is the final part of their withdrawal notice. I took the liberty to make it public, to highlight that what we are doing with AI conference reviews these last few years is, basically, madness.

---

Comment: We thank the reviewers for their time. However, upon reading the reviews for our paper, it became immediately apparent that the four "reject" ratings are not based on good-faith academic disagreement, but on a critical failure to read the submitted paper. The reviews are rife with demonstrably false claims that are directly contradicted by the text. The core justifications for rejection rely on asserting that key components are "missing" when they are explicitly detailed in the manuscript.

Some specific examples are (and many are even fake claims).

Claim: Harder tasks like GSM8K are missing.
Fact: GSM8K results are in many tables, like Table 2 (Section 4.2) and Appendix G.

Claim: The method does not use per-layer ranks.
Fact: This is the entire point of our method. The reviewer clearly mistook our method for the baselines. (Section 2, Table 1).

Claim: The GP kernel is not specified.
Fact: It is specified in Appendix E (Table 6).

Claim: There is no ablation of the method's three stages.
Fact: Section 4.4 ("Ablation Study") and Appendix J are dedicated to this.

Reviewers have a fundamental responsibility to read and evaluate the work they are assigned. The nature of these errors is so fundamental, so systemic in overlooking explicit content, that it goes far beyond what "limited time" or "oversight" can explain.

This work has gone through several rounds of revision over the last year. In earlier submissions, the paper usually received borderline or weak-accept scores. Numerous signs strongly suggest that some reviewers are relying entirely on AI tools to automatically generate peer reviews, rather than fulfilling their fundamental responsibility of personally reading and evaluating manuscripts. We strongly protest this. This is a gross disrespect to the authors. It is a flagrant desecration of the reviewer's sacred duty. It fundamentally undermines the integrity of the entire peer-review process.

Given that the reviews are not based on the actual content of our paper, we have decided to withdraw the submission. We leave this comment so that future readers of the OpenReview page are aware that the items described as "missing" are already present in the submitted manuscript. These negative reviews for this submission are factually unsound and do not reflect the content of the paper. We cannot and will not accept an assessment that is not based on the work we actually submitted.

204

288

150K

Manuel Traub @traub_manuel

7 months ago

@BingBongBrent How about something like Picbreeder, where the image selection is an open-ended search?

traub_manuel retweeted

Martin Butz @mvbutz

over 1 year ago

Food for thought about how our brain does it, that is, THINK -- and how it thereby uses CONTEXT INFERENCE to not get overwhelmed but to focus on what seems relevant -- consider reading our new review paper on "Contextualizing Predictive Minds" #Context

620

Manuel Traub @traub_manuel

over 1 year ago

@Msadat97 @NeurIPSConf Congratulations 👏 Any chance you are going to release the models and source code?

traub_manuel retweeted

Yannic Kilcher 🇸🇨

@ykilcher

about 2 years ago

I've made a video explaining xLSTM. Watch here: https://t.co/PRpE3Iz3w2

362

133

33K

traub_manuel retweeted

Martin Butz @mvbutz

about 2 years ago

Want to know how to machine-learn contextualized, compositional world models from sensorimotor experiences and use them for planning and RL? Our #ICLR2024 spotlight paper provides an answer.
