William Chow

5 months ago

Proud to be part of the amazing int'l team that worked on EDEN, a 28B-parameter biological foundation model trained on 9.7T nucleotides equitably sourced from nature. EDEN's tasks include generating gene-editing enzymes, AMPs, and synthetic microbiomes. https://t.co/4SugNlnpx9

654

5 months ago

Many people contributed to this massive preprint, and I don't have enough characters in a tweet to give a 📣 to all. But 👏 to @Basecamp_Res and the biodiversity partners, collabs at @nvidia, @MSFTResearch, @delafuentelab, @ferruz_noelia, @KevinKaichuang, @BanfieldJill. ➕👇 https://t.co/Gb2KhgiueP

5 months ago

654

Earth BioGenome Project 🌍

5 months ago

👀Visual Abstract here 👇

547

wchowb retweeted

@delafuentelab

5 months ago

In an incredible collaboration with @Basecamp_Res and @nvidia, today we announce EDEN: an evolution-scale DNA foundation model built on a simple idea: if we dramatically expand the biology we learn from, AI can stop overfitting to a handful of model organisms and start learning principles that truly generalize.

Our lab at @Penn helped validate EDEN, which is trained to learn the underlying grammar of biology from vast parts of life we’ve barely sampled.

EDEN’s largest model (28B params) was trained on 9.7T nucleotide tokens from BaseData, a dataset enriched for environmental and host-associated metagenomes, phage, and mobile genetic elements, including 10B+ novel genes from 1M+ newly discovered species.

One example: using EDEN, we designed new antibiotic peptides with 97% experimental success (32/33 active), including potent candidates against critical-priority pathogens.

More broadly, I’m excited by what this represents: scaling biodiversity changes what models can learn, and it moves us closer to foundation models that don’t just read biology; they help us design within it.

Huge thanks to this incredible team: @GeraldeneM49733, Gavin Ayres, @carlagrec, Keith Kam, Gus Minto-Cowcher, John St John, Tanggis Bohnuud, @bakalar_h, @wchowb, Robert Pecoraro, @mdt_torres, Aaron Kollasch, @leungmarcus, @qsirelkhatim, @francescofarin, Connor McGinnis, Srijani Sridhar, Daniel Anderson, @FrancescoOTERI3, Ali Takhabakshi, Jeremie Dona, @TylerShimko, Cedric Stenbeeke, Alexandros Papadopoulos, Malcolm Krolick @JohnsHopkins, @fspoen @OPIGlets @UniofOxford, Purba Gupta, Sandeep Kumar, Anne Bara, Jared Wilbur, @ferruz_noelia @CRGenomica, Timur Rvachov, Fangping Wang, @Hanqun_CAO, Hyun-Su Lee, Japan Mehta, Raphael Chaleil, Valerio Pereno, @potti_siddharth @Stanford, Chris Emerson, Roy Tal Dew, @KevinKaichuang @MSFTResearch @MSRNE, @exnx, @TadimetiNeha, @BanfieldJill @UCBerkeley, Alicia Frame @Azure, @bolton_emma_, @druau, Rory Kelleher, @anthonycosta, @kpowgerade, @glen_gowers, Oliver Vince, Jonathan Finn, & Philipp Lorenz!

Link to paper: https://t.co/ZVkZDPcktS

402

242

27K

wchowb retweeted

Kevin K. Yang 楊凱筌 @KevinKaichuang

5 months ago

EDEN: a family of genomic language models trained on up to 9.7 trillion nucleotides from @Basecamp_Res's BaseData can design large serine recombinases, bridge recombinases, and antimicrobial peptides.

https://t.co/vIm0THyc5T https://t.co/DntChVdUiq

155

wchowb retweeted

9 months ago

🧬 The more biology AI understands, the faster we unlock breakthroughs.

As AI leaders gather in the UK, @nvidia is putting a spotlight on @Basecamp_Res as one of the UK’s top ‘AI Makers’, alongside @wayve_ai, @synthesiaIO & @IsomorphicLabs.

With 10B+ sequences in the world’s largest and fastest-growing biological database, we're training foundation models to power the next wave of programmable medicines.

👉 Read NVIDIA’s 'AI Makers' blog: https://t.co/UBqqXppEmZ

264

wchowb retweeted

10 months ago

How do you collect data at a planetary scale to understand biological superintelligence, and turn that flood of info into a foundation for AI? We’ve been working on it at Basecamp Research, and now we’re teaming up with Microsoft & NVIDIA AI to share what we’ve learned at #SXSW2026.

Our proposed session, “Decoding Nature: How AI is Learning to Program Biology,” explores how foundation models are learning to read & write DNA, unlocking breakthroughs in health, sustainability & beyond.

If that sounds worth a spotlight at SXSW, we’d love your vote.

🧬 Head to SXSW PanelPicker: https://t.co/vFkpFpBSSJ
🧠 Search “Decoding Nature”
❤️ Hit the heart next to the session

Every vote helps shape not just the festival but the future of AI x biology.

#PanelPicker #Health #AI #SyntheticBiology

297

wchowb retweeted

11 months ago

Can AI decode the Tree of Life? 🌍🧬 It’s a question @Basecamp_Res is obsessed with—fueling our mission to sample the planet and use the data we collect to train next-gen foundation models. Huge thanks to @Microsoft, @nvidia, and @agency_noir for capturing our story in a new documentary 🎥 Watch the full film here: https://t.co/ObfLCH9ybY

285

11 months ago

The preprint: https://t.co/3SoYDzorNm

11 months ago

If you are at #ISMBECCB2025 🧬, check out @carlagrec's talk at the "Trends in AI: A UK Perspective" session on Monday, Jul 21. She will be representing the Genomics and @Basecamp_Res team, talking about how we built and expanded genomic/protein space by sampling untapped biomes on 🌎

113

11 months ago

Check out this trailer and the full video in the link!

NVIDIA AI

@NVIDIAAI

11 months ago

🍃 🧬 Basecamp Research is unlocking nature’s blueprint to accelerate breakthroughs in drug discovery, R&D, and beyond. With #AI accelerated by Microsoft Azure and NVIDIA, they’re creating one of the world’s largest biological databases—fueling scientific innovation across industries. 🎥Watch the series to see how #AI is solving global challenges: https://t.co/OS7vggrJcO

160

30K

117

wchowb retweeted

True Ventures

@trueventures

12 months ago

.@Basecamp_Res just unveiled BaseData: the world’s most diverse biological dataset, built from over 1 million novel species and 9.8 billion protein sequences. It’s a major leap forward for AI-powered biology, fueling breakthroughs in therapeutics, sustainability, and beyond.

More in @GENbio from @xiaofei_lin: https://t.co/s4sWhgw8QT

975

wchowb retweeted

12 months ago

AI sees differently. But when most of the data looks the same, there's not much for it to learn. That’s the case in biology—where a staggering 68% of all sequence data in the SRA comes from just five species. 🧬 (1/N)

925

wchowb retweeted

@Basecamp_Res

12 months ago

🧬 News alert: We’re bringing BaseData out of stealth: the world’s largest and fastest-growing biodiscovery dataset, built in collaboration with scientists across 26 countries.

🔍 BaseData adds 9.8 billion newly discovered protein sequences to the known tree of life, expanding it by over 10x beyond what public datasets currently offer.

🌍 Why does this matter? Because this biological richness will help us break through the data wall holding back Bio AI performance. We’re training biological foundation models on BaseData in partnership with @NVIDIAAI.

This is a huge step forward in understanding biology and unlocking next-gen biotech.

📰 Huge thanks to @AndrewE_Dunn and @endpts for covering the story. Read the article: https://t.co/HMaCWwYlwp
And our preprint: https://t.co/tiy27Meszm

#AIForBiology #GTCParis #DrugDiscovery

12K

about 1 year ago

@Ski308 @5uperHalfs Berlin was definitely the gold standard: no queues anywhere, even at the post-run Erdinger non-alcoholic beer tents 😆. The run was smooth, the organisation of the starting waves was smooth, the expo was smooth, and bag check was smooth. We even had warm drinks during the run because it was a cold day.

about 1 year ago

@5uperHalfs From what I read online, Lisbon has been chaotic before, during, and after the run for the last 2 years. A shame, as the other cities have been amazing! Perhaps a review of Lisbon's organisation needs to be considered; the organisers should have learned from past mistakes. I read that London Marathon Events is consulting for 2025.

151

wchowb retweeted

FT Weekend @ftweekend

about 1 year ago

The race to discover life-saving drugs by sequencing new genomes https://t.co/LjxZ5DYSZg

852

wchowb retweeted

@Basecamp_Res

about 1 year ago

The @FTWeekend dives into @Basecamp_Res's mission to sequence nature’s DNA in Earth’s wildest places, all to train next-gen AI foundation models.

Grateful to @hannahkuchler for joining an expedition & telling the story.

🧬🌍

Read with a gift link (no paywall): https://t.co/s2206funbL

893

about 1 year ago

@GNRailUK Hi, I'm stuck at Welwyn North. The train driver said that there is a fire and to look for alternative transportation, so I will take a taxi, which I believe will be compensated as usual.

269

wchowb retweeted