Mihai Polceanu

@polceanum

AI Engineer and Researcher

London, UK

Joined February 2013

473 Following

57 Followers

327 Posts

polceanum retweeted

David McAllister @davidrmcall

11 months ago

Excited to share Flow Matching Policy Gradients: expressive RL policies trained from rewards using flow matching. It’s an easy, drop-in replacement for Gaussian PPO on control tasks.

204

964

150K

polceanum retweeted

Nicholas Fabiano, MD

@NTFabiano

11 months ago

Coffee changes connectivity in the brain. Increased functional connectivity of the higher visual & executive control networks were seen with coffee, but not caffeine alone.

573

344K

polceanum retweeted

Grigory Bartosh @GrigoryBartosh

11 months ago

📢Presenting SDE Matching🔥🔥🔥 🚀We extend diffusion models to construct a simulation-free framework for training Latent SDEs. It enables sampling from the exact posterior process marginals without any numerical simulations. 📜: https://t.co/mKUvZZ4Vzt 🧵1/8

804

136

737

81K

polceanum retweeted

Floor Eijkelboom

@FEijkelboom

11 months ago

Flow Matching (FM) is one of the hottest ideas in generative AI - and it’s everywhere at #ICML2025. But what is it? And why is it so elegant? 🤔 This thread is an animated, intuitive intro into (Variational) Flow Matching - no dense math required. Let's dive in! 🧵👇

109

272

265K

polceanum retweeted

Ricardo Buitrago @rbuit_

11 months ago

Despite theoretically handling long contexts, existing recurrent models still fall short: they may fail to generalize past the training length. We show a simple and general fix which enables length generalization in up to 256k sequences, with no need to change the architectures!

196

117

42K

polceanum retweeted

Lester Li

@sizhe_lester_li

12 months ago

Now in Nature! 🚀 Our method learns a controllable 3D model of any robot from vision, enabling single-camera closed-loop control at test time! This includes robots previously uncontrollable, soft, and bio-inspired, potentially lowering the barrier of entry to automation! Paper: https://t.co/43qHC9sxZI (1/n)

425

354

100K

polceanum retweeted

hardmaru

@hardmaru

12 months ago

Sakana AI developed a new coding agent, ALE-Agent, trained to solve NP-hard optimization problems. Our agent participated in a live coding competition, the challenging AtCoder Heuristic Contest, and ranked #21 out of 1,000 human participants! Learn more: https://t.co/KOYxix8oy0

351

148

53K

polceanum retweeted

C. Zhang @ChongZzZhang

12 months ago

FWIW, Isaac Sim just OSed

583

217

95K

polceanum retweeted

Graphcore Research

@GCResearchTeam

about 1 year ago

Your boss emails you a point in 128-billion-dimensional space. It's Llama 8B in bfloat16. They want it compressed. What should you do 🤔... quantise to NF4? 🧵

288

polceanum retweeted

hardmaru

@hardmaru

about 1 year ago

Text-to-LoRA: Instant Transformer Adaption https://t.co/FXTZUFd0Jo Generative models can produce text, images, video. They should also be able to generate models! Here, we trained a Hypernetwork to generate new task-specific LoRAs by simply describing the task as a text prompt.

754

129

418

79K

polceanum retweeted

Junior Rojas @junior_rojas_d

about 1 year ago

I've been experimenting with attention mechanisms to design locomotion controllers that adapt to different shapes, this is the same controller running on two different bodies https://t.co/56a5NleE2Y paper coming soon 👀

219

170K

polceanum retweeted

hardmaru

@hardmaru

about 1 year ago

New Paper! Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
A longstanding goal of AI research has been the creation of AI that can learn indefinitely. One path toward that goal is an AI that improves itself by rewriting its own code, including any code responsible for learning. That idea, known as a Gödel Machine, proposed by @SchmidhuberAI over two decades ago, is a hypothetical self-improving AI. It optimally solves problems by recursively rewriting its own code when it can mathematically prove a better strategy, making it a key concept in meta-learning or “learning to learn.”
While the theoretical Gödel Machine promised provably beneficial self-modifications, its realization relied on an impractical assumption: that the AI could mathematically prove that a proposed change in its own code would yield a net improvement before adopting it.
Sakana AI, in collaboration with Jeff Clune’s lab at UBC, proposes something more feasible: a system that harnesses the principles of open-ended algorithms like Darwinian evolution to search for improvements that empirically improve performance. We call the result the Darwin Gödel Machine.
DGMs leverage foundation models to propose code improvements, and use recent innovations in open-ended algorithms to search for a growing library of diverse, high-quality AI agents.
Applied to practical tasks, we implemented Darwin Gödel Machine as a self-improving coding agent that rewrites its own code to improve performance on programming tasks. It creates various self-improvements, such as a patch validation step, better file viewing, enhanced editing tools, generating and ranking multiple solutions to choose the best one, and adding a history of what has been tried before (and why it failed) when making new changes (see the attached video).
We believe that Darwin Gödel Machines represent a concrete step towards AI systems that can autonomously gather their own stepping stones to learn and innovate forever!

201

637

105K

Mihai Polceanu @polceanum

about 1 year ago

Have you used any form of AI in the past week?

polceanum retweeted

William Gilpin @wgilpin0

about 1 year ago

We present Panda: a foundation model for nonlinear dynamics pretrained on 20,000 chaotic ODE discovered via evolutionary search. Panda zero-shot forecasts unseen ODE best-in-class, and can forecast PDE despite having never seen them during training (1/8) https://t.co/AuWFmVal8o

325

174K

polceanum retweeted

Kenneth Stanley

@kenneth0stanley

about 1 year ago

Could a major opportunity to improve representation in deep learning be hiding in plain sight? Check out our new position paper: Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis.
The idea stems from a little-known observation about networks trained to output a single image: when they are discovered through an unconventional open-ended search process, their representations are incredibly elegant and exhibit astonishing modular decomposition. In contrast, when SGD (successfully) learns to output the same image its underlying representation is fractured, entangled - an absolute mess!
This stark difference in the underlying representation of the same "good" output behavior carries deep lessons for deep learning. It shows you cannot judge a book by its cover - an LLM with all the right responses could similarly be a mess under the hood. But also, surprisingly, it shows us that it doesn't have to be this way! Without the unique examples in this paper that were discovered through open-ended search, we might assume neural representation has to be a mess. These results show that is clearly untrue. We can now imagine something better because we can actually see it is possible.
We give several reasons why this matters: generalization, creativity, and learning are all potentially impacted. The paper shows examples to back up these concerns, but in brief, there is a key insight: Representation is not only important for what you're able to do now, but for where you can go from there. The ability to imagine something new (and where your next step in weight space can bring you) depends entirely upon how you represent the world. Generalization, creativity, and learning itself depend upon this critical relationship. Notice the difference in appearance between the nearby images to the skull in weight space shown in the top-left and top-right image strips of the attached graphic. The difference in semantics is stark.
The insight that representation could be better opens up a lot of new paths and opportunities for investigation. It raises new urgency to understand the representation underlying foundation models and LLMs while exposing all kinds of novel avenues for potentially improving them, from making learning processes more open-ended to manipulating architectures and algorithms.
Don't mistake this paper as providing comfort for AI pessimists. By exposing a novel set of stark and explicit differences between conventional learning and something different, it can act as an accelerator of progress as opposed to a tool of pessimism. At the least, the discussion it provokes should be quite illuminating.

983

157

891

164K

polceanum retweeted

hardmaru

@hardmaru

about 1 year ago

New Paper: Continuous Thought Machines 🧠
Neurons in brains use timing and synchronization in the way that they compute, but this is largely ignored in modern neural nets. We believe neural timing is key for the flexibility and adaptability of biological intelligence.
We propose a new neural architecture, “Continuous Thought Machines” (CTMs), which is built from the ground up to use neural dynamics as a core representation for intelligence. By using neural dynamics as a first-class representational citizen, CTMs naturally perform adaptive computation.
Many emergent, interesting behaviors arise as a result: CTMs solve mazes by observing a raw maze image and producing step-by-step instructions directly from its neural dynamics.
When tasked with image recognition, the CTM naturally takes multiple steps to examine different parts of the image before making its decision. This step-by-step approach not only makes its behavior more interpretable but also improves accuracy: the longer it “thinks,” the more accurate its answers become. We also found that this allows the CTM to decide to spend less time thinking on simpler images, thus saving energy. When identifying a gorilla, for example, the CTM’s attention moves from eyes to nose to mouth in a pattern remarkably similar to human visual attention.
I think this work underscores an important, yet often lost, synergy between neuroscience and AI. While modern AI is ostensibly brain-inspired, the two fields often operate in surprising isolation. By starting with such inspiration and iteratively following the emergent, interesting behaviors, we developed a model with unexpected capabilities, such as its surprisingly strong calibration in classification tasks, a feature that was not explicitly designed for.
When we initially asked, “why do this research?”, we hoped the journey of the CTM would provide compelling answers. By embracing light biological inspiration and pursuing the novel behaviors observed, we have arrived at a model with emergent capabilities that exceeded our initial designs. We are committed to continuing this exploration, borrowing further concepts to discover what new and exciting behaviors will emerge, pushing the boundaries of what AI can achieve.

550

257K

polceanum retweeted

Dimitris Papailiopoulos

@DimitrisPapail

about 1 year ago

Kinda cute that you can reduce KV cache by replacing it with a universal, transferable dictionary + old school sig. proc reconstruction algorithm.
We tested on non-reasoning models and was sota, but methinks it'll work even better on reasoning ones.
The ICML random coins landed favorably on this one, so you'll get to chat with @jon_ghoh about it.

549

448

69K

polceanum retweeted

ARC Prize

@arcprize

about 1 year ago

Today we are announcing ARC-AGI-2, an unsaturated frontier AGI benchmark that challenges AI reasoning systems (same relative ease for humans).
Grand Prize: 85%, ~$0.42/task efficiency
Current Performance:
* Base LLMs: 0%
* Reasoning Systems: <4%

324

629

462K

polceanum retweeted

Sebastian Risi

@risi1979

about 1 year ago

Excited to share our latest work: “Bio-Inspired Plastic Neural Networks for Zero-Shot Out-of-Distribution Generalization in Complex Animal-Inspired Robots” 🪲🦎
We show that Hebbian learning outperforms LSTM-based adaptation for real-world transfer. It even works without domain randomization!
It can handle:
✅ Uneven terrain
✅ Morphological damage
✅ Sim-to-real gaps

225

109

14K

polceanum retweeted

hardmaru

@hardmaru

over 1 year ago

This was a fun experiment we ran while developing The AI Scientist-v2. With the permission of ICLR, we submitted an AI-generated paper to an ICLR workshop that passed the peer-review process.
We documented the entire process and our learnings in a blog: https://t.co/z6u1D9YEo2
As AI researchers, we also wrote our own (human) reviews documenting our own assessment and critiques of the AI-generated papers, and conducted code reviews on the computational experiments conceived by The AI Scientist-v2, which you might find interesting!
The AI-generated papers and our analysis of them are also published on our GitHub: https://t.co/k2CcnfGUy1
As we embrace artificial novelty search and open-ended discovery with AI, I believe computational creativity can enable frontier LLMs to produce even more novel and imaginative ideas (and if these ideas are related to AI / ML, can be tested with actual computational experiments conceived by AI). Perhaps one day, AI systems can produce groundbreaking scientific discoveries (or maybe, an accepted NeurIPS or ICLR paper 😛)

396

162

90K
