Sam Cox

over 1 year ago

Today, we're announcing BixBench, built in collaboration with @SciMac - a benchmark for AI agents tackling real bioinformatics tasks. We've created 53 scenarios with 296 questions that test how AI handles computational biology challenges. BixBench includes evaluation metrics and an open-source environment for LLMs to execute these tasks. #AIinScience

SamCox822 retweeted

Chemistry x AI (she/her) 🇱🇰 AI4Science @FutureHouseSF

about 1 year ago

Today, FutureHouse is launching the FutureHouse Platform, bringing the first-ever superintelligent scientific AI agents to scientists everywhere via a web interface and API. The Platform is launching with four agents, each with their own specialization:

SamCox822 retweeted

Sam Rodriques

@SGRodriques

about 1 year ago

Today, we are launching the first publicly available AI Scientist, via the FutureHouse Platform. Our AI Scientist agents can perform a wide variety of scientific tasks better than humans. By chaining them together, we've already started to discover new biology really fast. With the platform, we are bringing these capabilities to the wider community. Watch our long-form video, in the comments below, to learn more about how the platform works and how you can use it to make new discoveries, and go to our website or see the comments below to access the platform.

We are releasing three superhuman AI Scientist agents today, each with their own specialization: a general-purpose agent (Crow); an agent to automate literature reviews (Falcon); and an agent to answer the question “Has anyone done X before” (Owl). We are also releasing an experimental agent, Phoenix, that has access to a wide variety of tools for planning experiments in chemistry. More on that below.

The three literature search agents (Crow, Falcon, and Owl) have benchmarked superhuman performance. They also have access to a large corpus of full scientific texts, which means that you can ask them more detailed questions about experimental protocols and study limitations that general-purpose web search agents, which usually only have access to abstracts, might miss. Our agents also use a variety of factors to distinguish source quality, so that they don’t end up relying on low-quality papers or pop-science sources.

Finally, and critically, we have an API, which is intended to allow researchers to integrate our agents into their workflows.

Phoenix is an experimental project we put together recently just to demonstrate what can happen if you give the agents access to lots of scientific tools. It is not better than humans at planning experiments yet, and it makes a lot more mistakes than Crow, Falcon, or Owl. We want to see all the ways you can break it!

The agents we are releasing today cannot yet do all (or even most!) aspects of scientific research autonomously. However, as we show in the video, you can already use them to generate and evaluate new hypotheses and plan new experiments way faster than before. Internally, we also have dedicated agents for data analysis, hypothesis generation, protein engineering, and more, and we plan to launch these on the platform in the coming months as well. Within a year or two, it is easy to imagine that the vast majority of desk work that scientists do today will be accelerated with the help of AI agents like the ones we are releasing today.

The platform is currently free-to-use. Over time, depending on how people use it, we may implement pricing plans. If you want higher rate limits, especially for research projects, get in touch.

@m_skarlinski, @andrewwhite01, @_tnadolski, Remo Storni, @semajazarb, @ludomitch, @MichaelaThinks, as well as @jasonjoyride and his team for making such fantastic videos of us!

142

683

724K

SamCox822 retweeted

about 1 year ago

We are launching our FutureHouse Platform today! Our platform gives researchers public access to FutureHouse AI Scientist Agents for the first time. Check it out at our website and at https://t.co/BlavgMdyrc.

254

127

17K

SamCox822 retweeted

about 1 year ago

We are launching a closed beta today for our data analysis agent. We are looking for extremely talented bioinformaticians and computational biologists to help us test it. Sign up here: https://t.co/DAuOOJYHGk

101

12K

SamCox822 retweeted

Sam Rodriques

@SGRodriques

over 1 year ago

The next frontier for AI Agents in Science will be data analysis. Today, we're releasing BixBench, the most sophisticated benchmark yet for data analysis in biology. Agents that can do these tasks will be powerful tools for discovery. So far, they're not even close.

269

161

24K

SamCox822 retweeted

over 1 year ago

Molecular dynamics requires a lot of expert knowledge to set up and analyze simulations. We set out to automate it with LLM agents: MDCrow!

573

388

43K

SamCox822 retweeted

Marta Skreta @martoskreto

over 1 year ago

Finishing 2024 with one more research result! We’ve trained small language agents to do hard sci tasks: engineering proteins, manipulating DNA, and working with sci literature in a new library called Aviary. We beat humans and frontier LLMs on these tasks!

788

115

513

86K

Sam Cox @SamCox822

over 1 year ago

@andrewwhite01 we really need less data

100

SamCox822 retweeted

almost 2 years ago

(🧵1/3) presenting 2 posters at #ICML2024 @AI_for_Science in‼️5 mins‼️on (1) improving protein function representations (2) sample-efficient molecule generation more info in comments below -- come by and say hi! 🤗

SamCox822 retweeted

Kevin M Jablonka @kmjablonka

almost 2 years ago

If you want to extract data using LLMs, you should check out our review 📝 arXiv https://t.co/y0gwlf6SPg 💻 hands-on online book https://t.co/V5148DFlOI

12K

SamCox822 retweeted

almost 2 years ago

This is probably one of the best opportunities for a junior bio researcher right now who wants to learn how the wet lab and AI will interact in the future of AI-powered science. If that sounds like you, get in touch!

SamCox822 retweeted

almost 2 years ago

LAB-Bench, our new benchmark for language models and agents on scientific research tasks in biology, is available today on HF here: https://t.co/7fFO49gnZc

SamCox822 retweeted

almost 2 years ago

We released today ~2,500 hard bio evals that test lab protocol design, scientific literature RAG, sequence design, figure understanding, table understanding, and more. We benchmarked against >10 independent PhD-level experts. Frontier LLMs do not exceed humans...yet 1/2

168

14K

SamCox822 retweeted

about 2 years ago

ChemCrow was one of the first serious demonstrations of using AI to automate science. There will be many more to come. Major congratulations to the team: @SamCox822 @CarloBalda97 @OSchilter @andrewwhite01 @pschwllr

SamCox822 retweeted