We built a semantic search engine for millions of galaxy images by having LLMs write the captions.
These images are completely unlabeled, but our method enables astronomers to search for rare phenomena via text. Try our app! 🔭👇
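For readers curious how caption-based search works in principle, here is a toy sketch. It is not the pipeline behind the app: a real system would presumably compare learned text embeddings, whereas this example swaps in plain bag-of-words cosine similarity so it runs with the standard library alone. The image IDs and captions are made up for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector (a stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, captions, top_k=1):
    """Return the IDs of the top_k images whose captions best match the query."""
    q = vectorize(query)
    ranked = sorted(
        captions,
        key=lambda image_id: cosine(q, vectorize(captions[image_id])),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical captions, as an LLM might write them for unlabeled images.
captions = {
    "img_001": "spiral galaxy with two bright arms",
    "img_002": "elliptical galaxy with a faint tidal tail",
    "img_003": "ring galaxy around a compact core",
}

print(search("ring galaxy", captions))
```

The point of the sketch is only that once every image has a caption, text search over images reduces to text search over captions.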
What will science look like in the age of agentic AI?
Come build the answer: 30 of us, 4 days in Berkeley, prototyping open-source tools for AI-assisted research.
July 28–31 → https://t.co/dRbEyGC0y1
📣 Announcing Terminal-Bench Science: benchmarking AI agents on real scientific workflows – now open for task contributions👇
https://t.co/MSPMwnbhVt
@AnthropicAI, @OpenAI, and @GoogleDeepMind use Terminal-Bench to evaluate AI on coding tasks. We're now extending it to scientific workflows.
1/6🧵
With agents, it's much easier to make science reproducible and expandable.
At Lightcone Research (a new initiative!), we're working on an open spec to support AI-driven research: not papers, not code, but a well-thought-out YAML file that keeps your agent in check. Come contribute!
Today, we’re launching Lightcone Research.
AI is changing what’s tractable in scientific research, putting more ambitious questions within reach.
We build open-source infra that turns this expanded reach into results other scientists can reproduce, inspect, and build on.
I'm honoured to be part of the inaugural ChatGPT Futures, Class of 2026, alongside an impressive cohort!
Supported by amazing collaborators and organizations, we made 100M+ galaxy images searchable by text. I’m excited to keep exploring how AI can help with scientific discovery.
Introducing the ChatGPT Futures Class of 2026—26 honorees from the first graduating class to have had ChatGPT throughout all four years of university, who used AI to:
- Map 1.5M previously unknown objects in space
- Detect disaster survivors through walls and debris
- Make 100M+ galaxy images searchable
- Preserve endangered languages
- Build infrastructure to reroute 5M+ pounds of unsold inventory from landfills
Everyone working on verifiable scientific code should read this blog by @kdqg1 on how he got Claude to build a JAX-based cosmological solver in a few days.
Give it success criteria (in this case, matching existing code to 0.1%) and iterate until success!
https://t.co/8g02v7iuPx
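A minimal sketch of what such a success criterion could look like, assuming the goal is elementwise agreement with a trusted reference to within 0.1% relative error. The function names and sample values here are hypothetical and are not taken from the blog post.

```python
def max_relative_error(candidate, reference):
    """Largest elementwise relative error of candidate against reference."""
    if len(candidate) != len(reference):
        raise ValueError("candidate and reference must have the same length")
    worst = 0.0
    for c, r in zip(candidate, reference):
        if r == 0.0:
            # Relative error is undefined at zero; require an exact match there.
            err = 0.0 if c == 0.0 else float("inf")
        else:
            err = abs(c - r) / abs(r)
        worst = max(worst, err)
    return worst


def matches_reference(candidate, reference, rel_tol=1e-3):
    """True if every value agrees with the reference to within rel_tol (0.1% by default)."""
    return max_relative_error(candidate, reference) <= rel_tol


# Hypothetical outputs from a trusted code and a new solver.
reference = [1.0, 2.0, 4.0]
candidate = [1.0005, 1.999, 4.002]
print(matches_reference(candidate, reference))
```

A check like this gives the agent an unambiguous pass/fail signal to iterate against, which is the role the success criterion plays above.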
@cgeorgiaw Such a great point. It's quite hard to think of many problems in astrophysics (not AI+astrophysics) that are hill-climbing problems. Maybe improving the speed of simulations is one example.
@DimitrisPapail We worked towards this in ReplicationBench (https://t.co/WqoDt64Rza), which was adapted to the Harbor format. But the LLM-generated tasks still had to be reviewed by experts (in this case, astrophysicists).
https://t.co/NFSJwnVmfM
New blog post: Running experiments with Claude Code overnight
An account of letting Claude Code run experiments while I sleep, getting suspiciously good results, and then finding the subtle bug it missed. General musings as I test out this new paradigm!
https://t.co/SEIMP2X3Zj