Principal RS at IBM Research AI. Speech, Formal/Natural Language Processing. Currently LLM post-training, structured SDG/RL. Opinions my own and non-stationary
LxMLS 2026 is looking for monitors. As you know we favour alumni, so I don't need to explain to you how awesome this is. One week to apply!
https://t.co/xmIopDUVtf
Appreciate the engagement with our work and the discussion it's sparked! There are plenty of open problems to study around Abstract-CoT, and there are many areas for further analysis. We've written up a blog post that discusses some of our thoughts -- we plan to incorporate this (in some form) in a revised version of the paper as a Broader Impacts statement:
https://t.co/o2MKzAFgU8
🧵Building AI apps rarely means using just one model.
Granite 4.1 brings language, vision, speech, and guardrails together—so you can build real workflows, not just demos.
What devs should know.👇
"Thinking Without Words: Efficient Latent Reasoning with Abstract Chain-of-Thought"
Do reasoning models really need to think in words?
This paper replaces long verbal CoT with a short learned sequence of abstract tokens that acts like a latent scratchpad.
Warmed up from verbal CoT, then distilled and improved with RL.
The result: up to 11.6x fewer reasoning tokens while staying competitive with standard CoT.
What if your language model could reason efficiently in an entirely new language?
We introduce Abstract Chain-of-Thought, a new mechanism, trained with reinforcement learning, that allows language models to reason through a short sequence of reserved "abstract" tokens. It is as performant as verbalized CoT at a fraction of the cost, achieving major gains in inference-time efficiency.
I'll be at ICLR 2026 this week in Rio 🇧🇷, presenting two papers! If you're interested in self-improving agents, long-horizon RL and open-ended discovery, test-time training, latent reasoning, or just want to chat, please reach out!
📝 Learned Meta-Tokens for Language Modeling -- 4/24 @ 3:15 PM, Pavilion 3, Poster #616
📝 Learning Efficient Latent Reasoning with Abstract Chain-of-Thought -- 4/27 @ 3 PM, LIT Workshop, 101 A
Interested in merging LoRAs/models but feel like it's a black box? ⬛️
In our latest work, we finally define "Mergeability" and quantify it to understand the mechanics behind the magic.
The spoiler? It’s not random. It all comes down to the base model's prior knowledge. 🧠👇
When I created Claude Code as a side project back in September 2024, I had no idea it would grow to be what it is today. It is humbling to see how Claude Code has become a core dev tool for so many engineers, how enthusiastic the community is, and how people are using it for all sorts of things from coding, to devops, to research, to non-technical use cases. This technology is alien and magical, and it makes it so much easier for people to build and create. Increasingly, code is no longer the bottleneck.
A year ago, Claude struggled to generate bash commands without escaping issues. It worked for seconds or minutes at a time. We saw early signs that it may become broadly useful for coding one day.
Fast forward to today. In the last thirty days, I landed 259 PRs -- 497 commits, 40k lines added, 38k lines removed. Every single line was written by Claude Code + Opus 4.5. Claude consistently runs for minutes, hours, and days at a time (using Stop hooks). Software engineering is changing, and we are entering a new period in coding history. And we're still just getting started.
There is significant discussion in the academic literature about RL making models better at pass@1 and *worse* at pass@N (or related claims).
We run a lot of RL runs at Cursor and don't see this issue systematically. Not doubting it occurs, but something else might be going on.
I'll be at NeurIPS from Dec 2nd (afternoon) to Dec 7th! I'm interested in continual self-improvement and adaptation, new reasoning methods, personalization, and calibration. I'd love to grab coffee and meet new people!
Come by our poster for STaPLe on Wednesday at 4:30 pm, #1812 in Exhibit Hall C, D, E! Our work bridges self-improvement, constitutional alignment, and self-correction.
Consider, if you will, this peculiar Silicon Valley confession: “I worked for 36 hours with no sleep. Although I was dead, I also felt energized. I even fell asleep a few times while driving home in my Cybertruck, but fsd came in clutch. Happy thanksgiving.” People read this and say, “Wow, so crazy.” No. This is not crazy. This is ideology speaking in the first person.
Look at what he is really proud of. Not what he built, not the result. He is proud of his own exhaustion. He whips himself and calls it self-actualization. Before, the saint starved in the cave for God. Now, the engineer fasts from sleep for the billionaire.
And the best part, the truly obscene part, is the drive home. Here the machine steps in as what Lacan would call the big Other, the big adult in the room that keeps going while you collapse. You can be unconscious, half-dead, and still you are “productive,” because the car does the driving, the system does the watching. You close your eyes, the algorithm stays awake.
But notice the ideological twist: rather than confronting the absurdity of a society where one works until unconsciousness, the narrative is inverted. The same system that squeezes him until he is sleeping at the wheel also appears as his savior. This is pure capitalism. First it injures you, then it sells you the bandage, and you say thank you. You wreck yourself for one of the billionaire’s machines, then another one of his machines rescues your body on the drive home.
Which brings us to the “Happy Thanksgiving.” It is the final twist of the knife. Gratitude here is not for rest, or sanity, or enough sleep to drive safely. Gratitude is for the privilege of being exhausted in the right office, in the right hoodie, for the right man. You give thanks to the very structure that wears you down. It is like saying grace over your own burnout.
Thus the man who naps on the freeway is not a deviation. He is the ideal subject of our time. Half-alive, overworked to the point of being a public hazard, and then thanking the machine that keeps this madness just barely on the road. The system grinds him down, risks his life and the lives of everyone around him, and his reaction is not “this cannot go on,” but “I am so grateful.” In this one man in a Cybertruck you get the whole picture at once: exploitation, technology and holiday cheer condensed into a single, obedient “thank you.”
2016: lol python? That's not programming. Real programmers use C/C++
2026: lol AI coding agents? That's not programming. Real programmers use AI tab completion
> Be AI PhD student
> Submit paper to conference
> LLM slop reviews
> Rejected
> Concurrent paper with same method accepted
> Resubmit to next conference
> Reviewer points to concurrent paper which was accepted by last conference
> Lack of novelty
> Rejected