Do single-cell foundation models obey scaling laws?
A somewhat thought-provoking new Nature Methods study by the Crawford lab suggests that, for current single-cell foundation models, the answer may be “not really.” Across a broad range of architectures and downstream tasks, increasing pretraining data from hundreds of thousands to tens of millions of cells yielded surprisingly limited gains, with performance often saturating much earlier than expected.
This is interesting and provides exactly the kind of rigorous benchmarking our field needs. As Felix Fischer and I commented in the accompanying Research Briefing, such studies help move the discussion beyond model size and computational budgets toward actual scientific utility.
At the same time, I am not convinced the key conclusion is that scaling does not work in biology. Rather, it may be that current objectives are not extracting enough information from additional data.
Interestingly, in our recent scConcept work, we observe a markedly different scaling behavior, with continued gains as training data grows toward hundreds of millions of cells. The key difference may be the training objective itself: instead of reconstruction-based masked modeling, scConcept uses a contrastive objective that directly optimizes biologically meaningful cell representations.
https://t.co/CR8DSUmHi3
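For readers less familiar with the distinction, here is a rough sketch of what a contrastive objective looks like. It is a generic InfoNCE loss in NumPy, not scConcept's actual implementation; the function name `info_nce_loss`, the temperature value, and the random toy embeddings are all assumptions made for illustration.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss over a batch of paired embeddings.

    Row i of `anchors` and row i of `positives` are treated as two views of
    the same cell; every other row in the batch acts as a negative.
    (Hypothetical helper, not taken from any single-cell library.)
    """
    # L2-normalise so that dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the matching pair (the diagonal) as the target.
    return -np.mean(np.diag(log_probs))

# Toy data: 8 "cells" with 16-dimensional embeddings (random, illustrative only).
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
aligned_loss = info_nce_loss(emb, emb + 0.01 * rng.normal(size=emb.shape))
shuffled_loss = info_nce_loss(emb, rng.normal(size=emb.shape))
```

The loss is low when each embedding sits closest to its own paired view and high when the pairs are unrelated. In other words, the loss acts on the representation itself rather than on reconstructing masked inputs.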
This raises an interesting question for the field: Have we reached the limits of data scaling, or only the limits of current objectives?
-> My guess is that the next generation of biological foundation models will depend less on simply collecting more cells and more on finding the right representation learning principles for biology.
Nature Methods paper:
https://t.co/ArtUo1fcPE
Research Briefing:
https://t.co/5er4vGJiAJ
#SingleCell #FoundationModels #AIforBiology
Fetal Bovine Serum is one of those products people try not to think about. It rests in the methods sections of Nature publications and in the hefty Quartzy bill associated with running a lab.
It has been the bedrock of our field since 1958, when Theodore Puck formalized it as the essential, undefined part of a universal medium.
Behold one of the mightiest tools in mathematics: the camel principle.
I am dead serious. Deep down, this tiny rule is the cog in many methods. Ones that you use every day.
Here is what it is, how it works, and why it is essential:
Plant pathogens use secreted effectors to trick plant cells into providing sugary treats.
Learn more in a new #SciencePerspective: https://t.co/6hDUbXDmaX
I wrote a 4000-word article about all the math you need to know for machine learning.
Trust me, you want to bookmark this: https://t.co/sV52SBB16J
Who knew that a little worm would be so important for scientific progress?
The roundworm C. elegans is connected with many Nobel Prizes. In 1986, Robert Horvitz, who became a medicine laureate in 2002, used the roundworm to identify two of the genes needed for programmed cell death to occur.
The 2024 medicine laureates Victor Ambros and Gary Ruvkun also used C. elegans to discover microRNA, a new class of tiny RNA molecules that play a crucial role in gene regulation. Their groundbreaking discovery revealed a completely new principle of gene regulation. This turned out to be essential for multicellular organisms, including humans.
Read more about their discovery: https://t.co/VWuZab6ptc
Another great essay on "Theory, models and biology" which is an even-handed treatise on why theory & biology don't easily mix. Shou, 2015.
https://t.co/kjEHgcoF2q
Late-night gene assembly. For tedious or high-stakes reactions, I wait until nightfall when my brain works best. Time to blast some Sleep Token and crank these out.
✂️🧬🦠
Camillo Golgi received the 1906 medicine prize for discovering that nerve cells could be stained with silver nitrate. This enabled Golgi and other researchers to make detailed studies of our nervous system, such as the ones pictured.
Learn more: https://t.co/eXwwoypgow
Marie Skłodowska Curie defended her doctoral thesis on radioactive substances at Université de la Sorbonne in Paris on 25 June 1903 and became the first woman in France to receive a doctoral degree.