Ed Harris @edharris - Twitter Profile

Introducing EpiBench, an agentic benchmark for practical epigenomics analysis.

106 evaluations span CUT&Tag/CUT&RUN, ATAC-seq, ChIP-seq, and DNA methylation workflows. The best agent–harness pair passes 45.0% of evaluations.

Evaluations reflect the assay outputs scientists use in practice. A task may depend on alignment files, peak calls, methylation tables, QC metrics, sample metadata, genomic annotations, or downstream summaries. Solving them requires a mix of coding, data analysis, and scientific judgment.

Ground truth is hard to define even for short-horizon scientific tasks. Alternative task interpretations can produce multiple plausible answers. Candidate tasks are hardened through manual quality control. We remove prompts that over-specify the method, answers that can be solved with general literature knowledge, and ground truths that fail to reproduce under peer reproduction.

Short-horizon tasks are the current frontier for scientific agents in epigenomics. Before models can own deeper biological reasoning, they need to become reliable at local assay-specific decisions.

[Image: kenbwork's tweet photo]

4

106

22

74

8K

Ed Harris

@edharris

11 days ago

@s00k1m Love it!

0

1

0

457

edharris retweeted

Soo Kim @s00k1m

11 days ago

don’t type pymol commands urself, now there’s co-pymol available for claude code and cursor ⬇️

20

292

59

240

17K

edharris retweeted

Arda Göreci

@ArdaGoreci

11 days ago

Excited to share more of our technical notes in the spirit of open-research. Blog 2: How Muon Lost its Geometry

As Muon spread from speedruns to LLM-scale, it lost the µP scaling rule that keeps the best LR as width changes. The bug ships in PyTorch and Optax 😱. (1/n) https://t.co/PPtRgi3cF2

8

143

20

121

10K

edharris retweeted

Deniz Kavi

@kavi_deniz

12 days ago

We’re proud to share that @TamarindBio has been selected to build, host, and operate the inference infrastructure layer for TuneLab2.0, the next evolution of the platform. @EliLillyandCo TuneLab is a first-of-its-kind, collaborative AI/ML drug discovery platform, bringing models trained on over $1B worth of Lilly proprietary data to the biotech ecosystem. Tamarind will power TuneLab’s scalable drug discovery workflows and model inference.

16

262

25

114

42K

Ed Harris

@edharris

12 days ago

@ArdaGoreci @owl_posting sadly not

1

3

0

99

Ed Harris

@edharris

12 days ago

@jontycorrin 💔

0

1

0

64

Ed Harris

@edharris

12 days ago

suuuuuurely you don't need to be this cautious.

0

3

0

76

Ed Harris

@edharris

13 days ago

Interesting work! I'm curious where else you think these models will be halted in bio? It seems like there are lots of areas that LLMs can't venture into yet. I wonder if more tools will be the way we start to teach these models about more and more areas of biology (beyond just accessing data).

0

75

Ed Harris

@edharris

13 days ago

@matt_is_nice Yeah, I think domain specific tooling in bio is going to become much more important.

0

1

0

117

Ed Harris

@edharris

14 days ago

@michellearning @barisakis Cool, what’s the best way to sign up? Are you running customer experiments already? We have lots of CRO-planned experiments I’d love to move over!

1

0

216

Ed Harris

@edharris

14 days ago

@michellearning @barisakis I see! When are the next tours? I’d love to see it in person!!

1

0

254

Ed Harris

@edharris

14 days ago

@michellearning @barisakis Haha. This was a genuine question, not a snarky remark. Is this lab number 2 you’re opening?

1

0

274

edharris retweeted

Mgoes (bio/acc 🤖💉)

@m_goes_distance

15 days ago

I have never been this bullish on biotech.

- peptide adoption going vertical: what started in grey markets is now heading toward federal policy in under 18 months
- the first human trial designed to reverse cellular aging is running right now. Results expected this year.
- AI is designing drugs that are outperforming anything a human research team has produced
- RAS just got drugged after 40 years of scientists calling it undruggable

by the time these reach the consumer at $100 a pop, the window to move early will already be closed. that's exactly why Superhuman Fund II is backing this wave now.

15

342

32

121

20K