Introducing EpiBench, an agentic benchmark for practical epigenomics analysis.
106 evaluations span CUT&Tag/CUT&RUN, ATAC-seq, ChIP-seq, and DNA methylation workflows. The best agent–harness pair passes 45.0% of evaluations.
Evaluations reflect the assay outputs scientists use in practice. A task may depend on alignment files, peak calls, methylation tables, QC metrics, sample metadata, genomic annotations, or downstream summaries. Solving them requires a mix of coding, data analysis, and scientific judgment.
Ground truth is hard to define even for short-horizon scientific tasks. Alternative task interpretations can produce multiple plausible answers. Candidate tasks are hardened through manual quality control. We remove prompts that over-specify the method, questions that can be answered from general literature knowledge, and ground truths that fail peer reproduction.
Short-horizon tasks are the current frontier for scientific agents in epigenomics. Before models can own deeper biological reasoning, they need to become reliable at local assay-specific decisions.
Excited to share more of our technical notes in the spirit of open research. Blog 2: How Muon Lost its Geometry
As Muon spread from speedruns to LLM scale, it lost the µP scaling rule that keeps the best LR stable as width changes. The bug ships in PyTorch and Optax 😱. (1/n)
We’re proud to share that @TamarindBio has been selected to build, host, and operate the inference infrastructure layer for TuneLab 2.0, the next evolution of the platform. @EliLillyandCo TuneLab is a first-of-its-kind, collaborative AI/ML drug discovery platform, bringing models trained on over $1B worth of Lilly proprietary data to the biotech ecosystem.
Tamarind will power TuneLab’s scalable drug discovery workflows and model inference.
Interesting work! I'm curious where else you think these models will be halted in bio. It seems like there are lots of areas that LLMs can't venture into yet. I wonder if more tools will be the way we start to teach these models about more and more areas of biology (beyond just accessing data).
@michellearning @barisakis Cool, what's the best way to sign up? Are you running customer experiments already? We have lots of CRO-planned experiments I’d love to move over!
I have never been this bullish on biotech.
- peptide adoption going vertical: what started in grey markets is now heading toward federal policy in under 18 months
- the first human trial designed to reverse cellular aging is running right now. Results expected this year.
- AI is designing drugs that are outperforming anything a human research team has produced
- RAS just got drugged after 40 years of scientists calling it undruggable
by the time these reach the consumer at $100 a pop, the window to move early will already be closed.
that's exactly why Superhuman Fund II is backing this wave now.