@simocristea @jmschreiber91 @BoWang87 @GorinGennady If we put as much commitment, creativity, and human hours into a biological model of RNA+ATAC state and spatial cell-cell relationships as OpenAI put into GPT-4+RLHF, the results could be similarly good.
Things that are not required to be a scientist:
-Having a PhD
-Publishing peer-reviewed papers
-Getting paid to do research
Things that do make you a scientist:
-Using the scientific method to study stuff
That's it, that's the whole list.
ze frank, first and best vlogger, had a concept called "brain crack"
brain crack is that amazing idea you've been holding onto that you'll get to Later so you can Do It Right
it addicts you bc the more you build it up the less likely you'll ever do it
poast your brain crack!
For investors not wanting to wait for 100 clinical readouts from "techbio" companies, what are leading indicators about whether ML is making a dent in the biotech space?
This was a question asked during the 'ROI for AI in Healthcare' panel at the TD Cowen conference last week.
I answered it briefly then, but here are a few more thoughts about one place to look.
There could be significant creative destruction in the services strata of biotech, specifically in biologics (first).
Biotech is built on a foundational layer of services companies (CROs, CDMOs, etc). Some are large and general, others are small and specialized. This space has produced some big outcomes.
Money flows from biotechs/pharma companies to these services companies via small upfront cash fees, technical/commercial milestones, and royalties on sales.
This is also a place where new, ML-enabled companies are appearing left and right. Why?
Well, a lot of these companies spun out of academia or large ML research groups at big tech companies. Almost invariably, these groups raised funding because the founders contributed to key open-source releases that gained traction across the ecosystem.
These v.1 tools were helpful and generally most successful in the biologics sphere - as with designing or optimizing antibodies, for example.
Quickly, and at an uneven pace, these v.1 models are turning into v.2 and v.3 models, often becoming closed-source in the process. At the same time, many ML-enabled companies in this group are building ancillary models and/or infrastructure.
When you tape together multiple, single-point software solutions into a workflow, you have something that resembles a digitally-enabled service.
Biotech and pharma pay way more for services than they do for software. Using software effectively requires internal talent/bandwidth at a biotech, which is a distraction from what biotechs are usually good at. Services are titratable, variable costs, and they allow all parties to focus on where they have a unique advantage.
The difference between the old and the new guard of biotech service providers (CROs) is that the new ones may have a wildly different cost structure. ML is a force multiplier. You need far, far fewer people to do the same amount of work.
Sure, the models aren't quite there yet, and they aren't stitched together perfectly. But in 12 months? 24 months? Who knows.
With little to no overhead or fixed CapEx outlay, how will the economics change? Will ML-enabled CROs be able to deliver a similar product (e.g., affinity maturation of an antibody) at 10x the speed? Will they be able to hit new targets/problems in a 0-1 sense?
If so, will they keep this margin and become vastly more profitable? How will traditional CROs react?
Sure, these ML-companies will need to acquire data. They'll maybe have a tiny lab in an active loop or they'll partner with a CRO (ironic) to get it. Some may have grand ambitions and others won't.
Traditional CROs also aren't standing still. If data is the rare, valuable thing (and I believe it is), then maybe we'll see these companies transform wholesale and jettison their physical infrastructure over time to more closely resemble the ML-enabled players trying to usurp them.
Business transformation like this isn't easy, though. The innovator's dilemma is real with companies that have inertia.
I don't think anyone is exactly sure how this will all play out, but we certainly have opinions, and I'm sure you do too. All I know for sure is that the services industry in the life sciences will look very different in five years, up and down the stack from pre-clinical discovery to clinical development.
We're launching ACHIRA, a newco DIMENSION cofounded alongside @jchodera & @tkaraletsos. We're building foundation simulation models for the smallest, atomic-level resolution units of the universe. Assembling here: https://t.co/UiDTLR2ip8
NEW: Achira, a startup combining AI- and physics-based methods for drug discovery, launched Friday with a $33 million seed round
I talked with co-founders @jchodera, @Tkaraletsos, and @zavaindar on their venture: https://t.co/hDjaNKzeql
Announcing Pillar IV, our new $175M fund.
Thank you to our founders, investors, and community.
Science built the world we know. Investing in science is how we will build the world we want.
Breakthrough to impact has never been faster. Pillar IV is here to accelerate it.
Hey everyone! While I had a great on-site interview, I did not get the job of being @dwarkesh_sp's COO (he, very fairly, wanted someone further along in their career for this position).
One thing I noticed during this process was the importance of having intensity, and how few people choose to be intense. Much to consider here.
Anyways, I have a lot of inspiration now.
Time to go build an empire.
It took us about 9 months of exploration to build agents that can do superhuman scientific literature summary and Q&A. @m_skarlinski wrote up what failed and what was essential in an engineering blog post:
https://t.co/MLdThGHmW5
@itsclivetime Certainly you could think about "speaking textures", or "speaking molecules", etc. What I've seen though is that the word "language" is misleading people to think LLMs are restricted to text applications.
@karpathy on the other hand maybe everything that you can express autoregressively is a language
and everything can be stretched out into a stream of tokens, so everything is language!
It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; the name is just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something.
They don't care if the tokens happen to represent little text chunks. It could just as well be little image patches, audio chunks, action choices, molecules, or whatever. If you can reduce your problem to that of modeling token streams (for any arbitrary vocabulary of some set of discrete tokens), you can "throw an LLM at it".
Actually, as the LLM stack becomes more and more mature, we may see a convergence of a large number of problems into this modeling paradigm. That is, the problem is fixed at that of "next token prediction" with an LLM, it's just the usage/meaning of the tokens that changes per domain.
If that is the case, it's also possible that deep learning frameworks (e.g. PyTorch and friends) are way too general for what most problems want to look like over time. What's up with thousands of ops and layers that you can reconfigure arbitrarily if 80% of problems just want to use an LLM?
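A minimal sketch of the "it's just token streams" point, under loud assumptions: a toy bigram counter stands in for the transformer (it is not an LLM), and the names `fit_bigram` and `predict_next` plus both example streams are made up for illustration. The only thing it is meant to show is that the next-token code never looks at what the tokens mean.

```python
from collections import Counter, defaultdict


def fit_bigram(stream):
    """Count, for each token, how often each other token follows it.

    Toy stand-in for an autoregressive model (assumption: bigram counts,
    not a transformer). Tokens can be any hashable values.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(stream, stream[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent successor of `token`, or None if unseen."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]


# Two hypothetical vocabularies fed through identical code:
# little text chunks, and atom-like symbols standing in for molecules.
text_stream = ["the", "cat", "sat", "the", "cat", "ran"]
molecule_stream = ["C", "C", "O", "C", "C", "O", "C", "C", "N"]

text_model = fit_bigram(text_stream)
mol_model = fit_bigram(molecule_stream)

print(predict_next(text_model, "the"))  # most frequent token after "the"
print(predict_next(mol_model, "O"))     # most frequent token after "O"
```

Swapping the vocabulary changes the data, not the modeling code, which is the sense in which the problem stays fixed at next-token prediction.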
I don't think this is true but I think it's half true.
6/ Nischal Jain - @nischal is building open source autonomous web agents!
My mind was blown when he just typed “subscribe to inverted passion blog” and the agent Googled the blog, filled in the form, and hit subscribe.
He is open sourcing what these guys have closed source:
https://t.co/MEDgYNiQYw
@praveen_chavali @observerforever 3/ Aaditya Salgarkar - @salgarkarap is a PhD in string theory (yes, really) who is interested in grokking vision transformers.
Specifically, he is exploring how to model 3D data (like in videos) for denoising (in a geology-related application).