Dhruv Gautam @dhrvji - Twitter Profile

Talks at the intersection of systems engineering and computational biology

0:20 Why study systems x biology in "age of agents"
5:50 Forch: Building a utilitarian cloud container orchestrator (Max Smolin, LatchBio)
41:25 cyto: Ultra high-throughput processing of 10x Flex single-cell sequencing (Noam Teyssier, Arc Institute)
1:04:30 SLAF: A single-cell omics storage format for the virtual cell era (Pavan Ramkumar, SLAF Project)
1:33:30 Lessons in Perturbation Modeling: STATE, STACK, and Beyond (Dhruv Gautam, Arc Institute + UC Berkeley)
2:03:15 Leveraging Serverless Distributed Computing to Scale Computational Biology (Ben Shababo, Modal)

Topics span container orchestration, single-cell infra, perturbation modeling for biology at scale.

2

52

7

44

4K

dhrvji retweeted

Larry Dial

@classiclarryd

4 months ago

New NanoGPT Speedrun WR at 92.1 (-0.3s) from @dhrvji, by moving the bigram hash from CPU to GPU. As shown here, recently added architectures are a great place to look for engineering improvements. https://t.co/aygHwUTsfI

1

63

10

15

5K

Dhruv Gautam @dhrvji

5 months ago

fun to see all the new work on “teacher forced” self distillation on large LLMs in the context of STACK's prescient distillation scheme 😁

Dhruv Gautam @dhrvji

5 months ago

LLMs needed post-training to become useful to end users, leading to the advent of prompt engineering. We're excited to announce STACK, a SOTA cell foundation model that leverages self-distillation based post-training to enable prompt engineering for cells. https://t.co/yhKu2i4Y6I

2

14

1

4

2K

0

9

1

623

Dhruv Gautam @dhrvji

5 months ago

@finbarrtimbers i see computer use as pretty necessary for scientific discovery, interfacing with random software without apis, good plotting capabilities, and lots of superhuman imaging analysis put together with all the code gen abilities can definitely be v impactful

0

2

0

41

dhrvji retweeted

Arc Institute

@arcinstitute

5 months ago

Arc bioinformatics scientists @noamteyssier and @a_dobin have just released cyto, an ultra-high throughput processor specifically optimized for @10xGenomics Flex single-cell data. We are excited to make this resource open source: https://t.co/z5sxK6owjd https://t.co/F0zhOsTKzC

6

218

39

112

87K

Dhruv Gautam @dhrvji

5 months ago

has anyone had any success in getting claude code/codex to setup a chain of SLURM dependency jobs; seems to really struggle with reasoning abt dependence even in plan mode

0

2

0

1

160
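For the SLURM dependency-chain question above, a minimal sketch of one way to submit jobs so that each waits on the previous one. It relies on `sbatch --parsable` (prints the job ID) and `--dependency=afterok:<jobid>`; the helper names `sbatch_args`, `run_sbatch`, and `submit_chain` and the script file names are hypothetical, chosen only for illustration.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def sbatch_args(script: str, after_ok: Optional[str] = None) -> List[str]:
    """Build the sbatch command line for one job in the chain."""
    args = ["sbatch", "--parsable"]  # --parsable prints just the job ID
    if after_ok is not None:
        # Start only after the given job has finished successfully.
        args.append(f"--dependency=afterok:{after_ok}")
    args.append(script)
    return args


def run_sbatch(args: Sequence[str]) -> str:
    """Submit with sbatch and return the new job ID (needs a SLURM cluster)."""
    out = subprocess.run(
        list(args), check=True, capture_output=True, text=True
    ).stdout
    # --parsable output is "jobid" or "jobid;cluster"; keep the job ID only.
    return out.strip().split(";")[0]


def submit_chain(
    scripts: Sequence[str],
    submit: Callable[[Sequence[str]], str] = run_sbatch,
) -> List[str]:
    """Submit scripts in order, each depending on the one before it."""
    job_ids: List[str] = []
    prev: Optional[str] = None
    for script in scripts:
        prev = submit(sbatch_args(script, after_ok=prev))
        job_ids.append(prev)
    return job_ids
```

Passing a fake `submit` callable makes it possible to check the generated command lines without a cluster; on a real system, `submit_chain(["prep.sh", "train.sh", "eval.sh"])` would submit three jobs in a linear chain.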

dhrvji retweeted

Arc Institute

@arcinstitute

5 months ago

Predicting cell state in previously unseen conditions such as disease or in response to a drug has typically required retraining for each new biological context. Today, Arc is releasing Stack, a foundation model that learns to simulate cell state under novel conditions directly at inference time, no fine-tuning required.

36

961

206

594

403K

Dhruv Gautam @dhrvji

5 months ago

Read more here: https://t.co/3lhBqTvtdS

0

2

0

180

Dhruv Gautam @dhrvji

5 months ago

LLMs needed post-training to become useful to end users, leading to the advent of prompt engineering. We're excited to announce STACK, a SOTA cell foundation model that leverages self-distillation based post-training to enable prompt engineering for cells.

2

14

1

4

2K

Dhruv Gautam @dhrvji

6 months ago

@jiaxinwen22 @SonglinYang4 their discussion on evals and preventing optimization for making memorization easier is great; there's still so much unresolved on understanding in weights learning/icl dynamics in pretraining

0

1

0

654

Dhruv Gautam @dhrvji

6 months ago

@kenbwork i agree, i meant rather that when these models start outperforming humans bc of RL, we can start analyzing their CoTs to find new strategies (our current ones are probably not optimal). this is more likely to work with datasets where the data isn’t heavily human annotated/biased

0

33

Dhruv Gautam @dhrvji

6 months ago

@kenbwork though i could imagine that RL on these sorts of tasks (if sufficiently diverse enough), and inspecting the CoTs will actually give us "new" tools for bioinformatics analysis in the next year

1

0

49

Dhruv Gautam @dhrvji

6 months ago

@kenbwork yeah i'd imagine tool design for bio will stay for now. agents working on software engineering often can iterate and test things in one off scripts; enabling agents to manipulate biological data into various forms that give denoised & dif learning signals is not the simplest RL

1

0

48

Dhruv Gautam @dhrvji

6 months ago

@kenbwork Do you imagine this result to be null once models start training on these sorts of tasks? kind of like how early into swe bench (su24) the best agents had these really complex workflows and now longc / basic harnesses (https://t.co/Ybx6dnK2gY, https://t.co/uJnOsJ35fp) are ~sota

1

0

75

Dhruv Gautam @dhrvji

6 months ago

@4ndyXu gpt4b is a good example of 1. working with just midtraining, with a domain specific plm you can distill protein sequences and align spaces with a pretrained LLM I imagine that the midtraining recipe will be very difficult to get right (ie annotating the sequences properly)

0

1

0

136

Dhruv Gautam @dhrvji

7 months ago

@DimitrisPapail def could fit in a rank 1 lora 😅

0

1

0

2K

dhrvji retweeted

darya @daryakaviani

7 months ago

Here's how LLM providers (& anyone) should be doing age verification in 2025: Keep the ID private; prove "≥18" with ZK proofs. Our new paper with @srinathtv "🌟Vega: Low‑Latency Zero-Knowledge Proofs over Existing Credentials" makes this practical today. https://t.co/o5suZaW3pj

6

111

15

36

13K

dhrvji retweeted

Teknium 🪽

@Teknium

7 months ago

When y'all make rl envs and get a sota model for a single task can you please distill a few thousand samples from it and share that dataset 🙏

6

105

8

18

13K
