Many people are interested in how the recent Claude Science can help with virtual cell modeling. We built X-Cell at Xaira to predict how perturbations affect a cell. But prediction isn't discovery. You still have to find which perturbation matters for a disease, and whether you can drug it. We're closing that gap by combining X-Cell with Anthropic's Operon. 🧵
@kevinwhinnery @Common_Conor “have been updated” sounds like marketing wording for “expect more fallbacks”, which makes the statement that “API behavior is the same” very misleading.
Previously Anthropic reported <5% fallbacks. Can you share the number after the “updates”?
We show how to get LLMs to communicate well in latent space instead of human language. #ICML2026 spotlight (top 2%)
Autoregressive latent thoughts -> KV transfer -> input-output alignment
Speeds up inference by >4x by bypassing decoding and improves performance by up to 14%.
Great job @Jiaru_Zou w/ awesome collaborators!
I am excited to share our latest work from the Discovery team at @GoogleDeepMind: COrigami, an end-to-end pipeline for co-designing origami with Gemini!
Origami is a unique mix of math, art and design. Creating origami involves turning abstract concepts into real-world objects, using the math of flat foldability. To tackle this, COrigami calls Gemini to generate a semantic stick figure—an abstract JSON code—refined through a visual feedback loop. It then calls custom packing, solving, shaping, and simulation tools. Driven by another self-improving reinforcement learning (RL) loop, the system produces visually recognisable models represented as SVG crease patterns.
Despite scarce data availability, this approach demonstrates how combining RL with frontier models like Gemini can assist human creativity and produce physical art. The generated patterns serve as mathematically grounded starting points for origami artists to fold and shape into a final, physical design. The models below were produced by our system and then folded and shaped by @brandon_w0ng.
Read the full paper here: https://t.co/ttiDUJ6OXI
A super long overdue (3+ years?) post on scaling laws.
Compute is expensive. Scaling laws are a way to help us reason about the optimal compute allocation between data and model size before committing to a large run.
The post covers what scaling laws predict, how compute-optimal allocation works, why Kaplan et al. and Chinchilla disagree, and how data limits + fitting details make extrapolation tricky.
https://t.co/HP26eJvjHB
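For readers who want to see the allocation step concretely, here is a minimal sketch. It assumes a Chinchilla-style parametric loss L(N, D) = E + A/N^alpha + B/D^beta and the common approximation C ≈ 6·N·D. Neither the functional form nor the constants are taken from the linked post; the constants are illustrative placeholders and the function names are hypothetical.

```python
def loss(N, D, E, A, B, alpha, beta):
    """Assumed parametric loss: irreducible term plus model-size and data-size terms."""
    return E + A / N**alpha + B / D**beta


def compute_optimal(C, A, B, alpha, beta):
    """Minimise the assumed loss subject to the budget C = 6 * N * D.

    Setting dL/dN = 0 with D = (C / 6) / N gives
    N = G * (C / 6) ** (beta / (alpha + beta)), where
    G = (alpha * A / (beta * B)) ** (1 / (alpha + beta)).
    """
    G = (alpha * A / (beta * B)) ** (1.0 / (alpha + beta))
    budget = C / 6.0  # product N * D that the compute budget allows
    N = G * budget ** (beta / (alpha + beta))
    D = budget ** (alpha / (alpha + beta)) / G
    return N, D


# Illustrative placeholder constants (assumptions, not fitted here).
E, A, B, ALPHA, BETA = 1.69, 406.4, 410.7, 0.34, 0.28

if __name__ == "__main__":
    for C in (1e19, 1e21, 1e23):
        N, D = compute_optimal(C, A, B, ALPHA, BETA)
        print(f"C={C:.0e}  params={N:.3e}  tokens={D:.3e}  "
              f"loss={loss(N, D, E, A, B, ALPHA, BETA):.3f}")
```

The exponents alpha and beta decide how a larger budget is split between parameters and tokens, which is why small differences in how they are fitted can change the recommended allocation so much.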
John Jumper, on why AlphaFold2 worked:
> "It's not one or two home runs. It's 18 doubles. Midsize wins stacked together."
His least popular take on AlphaFold2 is the over-crediting of SE(3)-equivariance / geometric deep learning.
He ablated it and it only lifted performance about 2/30 points.
FAPE (the frame-aligned point error) was actually most responsible for its improvement.
🧵Introducing our Anthropic Fellow paper: "Probing the Misaligned Thinking Process of Language Models"
👿AI models can be misaligned: they deceive, sandbag, and scheme to preserve themselves. Catching this with an LLM judge on every transcript is expensive, and fragile if models learn to hide their reasoning. Can we find an alternative?
✨We decompose misalignment into 18 fine-grained cognitive indicators (strategic omission, fabrication…) across misaligned behaviors and train a linear probe for each, reading straight off the model's activations.
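To make "a linear probe on activations" concrete, here is a minimal sketch, not the paper's code. It trains one logistic-regression probe for a single indicator on synthetic vectors that stand in for real model activations; the data generator, function names, and hyperparameters are all hypothetical. In the setup described above there would be one such probe per indicator.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_activations(n=200, dim=8, shift=2.0, seed=0):
    """Hypothetical stand-in for real activations: Gaussian noise plus a
    class-dependent shift along one fixed unit direction."""
    rng = random.Random(seed)
    direction = [1.0 / math.sqrt(dim)] * dim
    acts, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        sign = 1.0 if y == 1 else -1.0
        acts.append([rng.gauss(0.0, 1.0) + sign * shift * d for d in direction])
        labels.append(y)
    return acts, labels


def probe_score(w, b, x):
    """Probability the probe assigns to 'indicator present' for activation x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_linear_probe(acts, labels, lr=0.5, epochs=200):
    """Fit logistic-regression weights by full-batch gradient descent."""
    dim, n = len(acts[0]), len(acts)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(acts, labels):
            err = probe_score(w, b, x) - y  # gradient of log loss w.r.t. the logit
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, acts, labels):
    hits = sum((probe_score(w, b, x) > 0.5) == (y == 1) for x, y in zip(acts, labels))
    return hits / len(acts)
```

Usage: `acts, labels = make_synthetic_activations()` followed by `w, b = train_linear_probe(acts, labels)`. At inference time a probe like this costs one dot product per activation, which is the contrast with running an LLM judge over every transcript.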
We are closer, if anything, to a mathematical theory that LLMs can come up with novel ideas.
This is widely supported by empirical evidence that LLMs can generate research ideas rated more novel than human-generated ideas (https://t.co/gH567GXeJ7). If this does not convince you, an OpenAI model recently solved an Erdős problem (https://t.co/jvzm6oFH8A), with the chain-of-thought summary linked here (https://t.co/Wy9BAAHI1T). The chain of thought clearly indicates a level of exploratory creativity, the generation of novel ideas by following existing rules (perhaps even transformational creativity too, a la https://t.co/w0D1FuptFT). Regardless, seems pretty novel to me :)
Can LLMs also realize combinatorial creativity—the generation of novel ideas by combining familiar ideas in unfamiliar ways?
Empirically, the answer is almost certainly yes (see https://t.co/CcFCGwPLNC and https://t.co/RJZ8rMWH4r). From a theoretical perspective, combinatorial creativity can be understood as an open-ended version of compositional generalization (see Table 1 of https://t.co/inLKntyz6X), an ability which scaling has been shown to support (https://t.co/mgMMxOI7X1).
In computational creativity, however, novelty is always defined with respect to an existing set of artifacts (https://t.co/BAGfhmaZG1, https://t.co/2MeJ9K8LYQ). So the correct question is: novel with respect to whom?
If we mean novel with respect to the academic enterprise writ large, there are decades of precedent that "knowledge can be public, yet undiscovered, if independently created fragments are logically related but never retrieved, brought together, and interpreted" (Don Swanson, Undiscovered Public Knowledge, https://t.co/nbFv193rE7, 1986). That is, if such fragments are waiting to be explored and combined.
LLMs have been trained on nearly all public knowledge and show some ability to make novel in-memory connections in synthetic combinatorial creativity tasks (https://t.co/RJZ8rMWH4r), though perhaps next-token prediction is not the optimal training objective!
LLMs are, in fact, some of the most impressive "combinatorial creativity" machines ever created, perhaps second only to IBM Chef Watson (https://t.co/7tyJl6wDMe), one of the first large-scale examples of novel idea generation by machines :)
**Novel Idea and Loss in LLM**
First, we must define a “novel idea.” If novelty simply means an idea that no human has previously proposed, then an LLM can, in principle, produce one. A neural network is not merely retrieving stored statements: its activations are ***propertyless***, and each prompt supplies new boundary conditions under which the model constructs a pathway through its learned space. A previously unrealized combination or pathway can therefore produce an output not present in the training data.
In the strict mathematical sense, the original argument contains some truth. Cross-entropy loss and reward provide boundary conditions that push training toward particular fixed-point regions, so an idea far outside the accepted target distribution may indeed receive high loss or low reward. But that is not the same as saying that every novel idea must have high loss. Loss measures deviation from the training target, not novelty itself.
Neural-network training does not prescribe every future output; it shapes a computational space within which new prompt-conditioned pathways can emerge. The real limitation is not that an LLM is mathematically incapable of novelty, but that training gives it no reliable criterion for distinguishing a genuinely valuable novel idea from a merely unusual or incorrect one.
**Neural Network Learnability and Discoverability**
https://t.co/xkm8TrQCpK
@anirudhg9119, @paradite_
#DeepManifoldInterpretation
An interesting new paper by my recent PhD graduate on how AI agents' greed for visible incentives can lead them to abandon their safety alignment.
You can read it here: https://t.co/y64uOBvSiC
What if attention wasn't about matching tokens, but operating in function space?
Glad to share our #ICML2026 paper:
📄 Functional Attention: From Pairwise Affinities to Functional Correspondences
w/ @Jiefang_Xiao @GaoMaolin @stevenygd Daniel Cremers
📄 https://t.co/rhn9NtwrBm
🧠🤖 The 2026 New England Mechanistic Interpretability (NEMI) Workshop will be Aug. 14 at Boston University!
Help spread the word and join the New England mech interp community! Registration and submission info in thread:👇
Ok, here is a million-dollar interpretability research question for you:
What if we used all those raw explanations to build an agent that tells you:
- which insights matter most for your goal
- what concrete next steps to take, such as debugging, auditing, or steering
Not just “what did the model think?” but “what should I do next?”
I just launched a longevity Rx platform. Prescriptions I personally use are there.
v1 is live now.
Includes access to:
+ Tadalafil (Cialis)
+ Metformin
+ Oral Minoxidil
+ Tretinoin
+ Estradiol
+ Acarbose
We’re working with licensed doctors and pharmacies to make these medications accessible. Lots more in v2 coming next week.
What's in a neuron? 💫 (an atypically long, almost personal post)
Neurons in LMs have always been a fascinating object to study. I've been studying them since 2020, viewing them as key-value memory cells, analyzing what they capture in vocabulary space, and how they compose together to form features.
https://t.co/jHiZ7wLmrH
https://t.co/tYgtR6vxNi
https://t.co/eZfHTnVrgp
https://t.co/kukRY6RsDU
But many neurons still remain opaque! They do many things.
Our recent work led by @AsafAvrahamy tackles this challenge by decomposing neuron weights in vocabulary space. We do this by taking the neuron weight vector and learning different ways to rotate it (just a bit) to reveal monosemantic vocabulary channels that it captures. The nice thing about our method ROTATE is that it's data-free and super efficient, relying only on vocabulary kurtosis as a search signal.
I've been thinking about this idea since 2024, proposed it to multiple students, but only Asaf was brave enough to take this ;)
Very happy with the final outcome. Check out the paper! 👇
https://t.co/oHLMAtY1F0
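For intuition on "vocabulary kurtosis" as a signal, here is a rough sketch. It is not the ROTATE implementation and does not do any rotation search; it only shows the two ingredients the post names, under the assumption that "vocabulary space" means projecting a neuron's weight vector through an unembedding matrix. The helper names and toy numbers are hypothetical.

```python
def vocab_projection(neuron, unembed):
    """Score per vocabulary token: dot product of its unembedding row with the neuron vector."""
    return [sum(u * w for u, w in zip(row, neuron)) for row in unembed]


def kurtosis(values):
    """Fourth standardised moment m4 / m2**2 (about 3 for Gaussian-like scores).

    Higher values mean a few tokens dominate the projection, which is the
    kind of peaked pattern one would expect from a monosemantic channel.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


if __name__ == "__main__":
    # Toy unembedding: 6 tokens, 2 hidden dimensions (made-up numbers).
    unembed = [[5.0, 0.0], [0.1, 1.0], [-0.1, -1.0], [0.2, 1.0], [-0.2, -1.0], [0.0, 1.0]]
    for name, neuron in (("peaked", [1.0, 0.0]), ("spread", [0.0, 1.0])):
        scores = vocab_projection(neuron, unembed)
        print(name, [round(s, 2) for s in scores], round(kurtosis(scores), 2))
```

A search procedure could then score candidate directions near the original weight vector with `kurtosis(vocab_projection(...))` and keep the most peaked ones; how ROTATE actually parameterises and searches those rotations is described in the paper, not here.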
I had a lot of fun working on this paper - we found an elegant story for why subliminal learning happens!
A key intuition in interpretability is that basically every interesting phenomenon in LLMs boils down to adding a steering vector. Subliminal learning is no exception!
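If "adding a steering vector" is unfamiliar, here is a toy sketch of the operation itself. It is not the paper's analysis of subliminal learning. It assumes one common recipe, a difference of mean activations between two sets of examples, and every name and number in it is hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def difference_of_means(with_trait, without_trait):
    """Assumed recipe for a steering vector: mean activation with the trait
    minus mean activation without it."""
    mu_pos = mean_vector(with_trait)
    mu_neg = mean_vector(without_trait)
    return [p - q for p, q in zip(mu_pos, mu_neg)]


def add_steering(hidden, steer, alpha=1.0):
    """Steering is a plain vector addition on the hidden state: h + alpha * v."""
    return [h + alpha * s for h, s in zip(hidden, steer)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    steer = difference_of_means([[2.0, 0.0], [4.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
    hidden = [1.0, 1.0]
    readout = [0.5, -1.0]  # toy linear readout, e.g. one logit direction
    steered = add_steering(hidden, steer, alpha=0.5)
    # For a linear readout w, the logit moves by exactly alpha * dot(w, v).
    print(dot(readout, steered) - dot(readout, hidden))
```

The point of the toy readout is that, for any linear map applied after the addition, the effect of steering is a fixed offset proportional to alpha, independent of the input.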