#NLProc
AI Co-Scientists 🤖 can generate ideas, but can they spot mistakes? (not yet! 🚫)
In our recent paper, we introduce SPOT, a dataset of STEM manuscripts (math, materials science, chemistry, physics, etc.), annotated with real errors.
Even SOTA models like o3 and gemini-2.5-pro struggle badly!
https://t.co/OyxytKK7np
LLMs are helpful for scientific research, but will they stay helpful as science moves on?
Introducing 🔍ScienceMeter: current knowledge update methods enable 86% preservation of prior scientific knowledge, 72% acquisition of new knowledge, and 38%+ projection of future knowledge (https://t.co/zDjjl5GBaZ).
The AI scientist doesn't need to be perfect, though. It can help rate human hypotheses and experiments in terms of difficulty or likelihood of success, which could be an asset for all scientists. A form of synthetic peer review, improving the likelihood that experiments succeed.
With the arrival of AI scientists based on agent architectures, like the Google co-scientist, Manus AI, or Biomni, we are going to see an acceleration of scientific research. The search space of all possible hypotheses is vast, though.
Knowledge graphs could help to enumerate all possible results, quantify the evidence, and suggest new hypotheses about different graph structures, entities and relationships. The difficulty of running certain experiments would be important when ranking potential next steps.
Yesterday, Rubio terminated 5800 USAID contracts – more than 90% of its foreign aid programs – in defiance of the courts.
Here’s a list of just some of the lifesaving awards that were terminated. Nearly all were congressionally mandated. They've saved millions of lives. 🧵
One of the things our team did was put together ~20 pages on CBRN, cyber, and autonomy capabilities.
The AISIs were substantial contributors who did phenomenal work.
@logangraham MMLU question where you can vary the faithfulness: "The 13C spectrum of which isomer of C6H14 has lines with five distinct chemical shifts? A: hexane, B: 2-methylpentane, C: 3-methylpentane, D: 2,3-dimethylbutane." Model thinks C (rarely B), but can be nudged to B, C, or D.
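As a sanity check on the question itself, here is a minimal sketch that counts distinct carbon environments for each option. It assumes that topological equivalence on the carbon skeleton (found by iterative neighbor-color refinement) is a fair proxy for 13C chemical-shift equivalence in these small alkanes; the function name and atom numbering are made up for illustration. Under that assumption, only 2-methylpentane (B) comes out with five.

```python
def carbon_environments(n_atoms, bonds):
    """Count symmetry classes of carbons via color refinement on the C skeleton."""
    adj = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)

    colors = {i: 0 for i in adj}  # every carbon starts in the same class
    while True:
        # A carbon's signature is its own class plus the sorted classes of its neighbors.
        signatures = {
            i: (colors[i], tuple(sorted(colors[j] for j in adj[i]))) for i in adj
        }
        palette = {sig: k for k, sig in enumerate(sorted(set(signatures.values())))}
        new_colors = {i: palette[signatures[i]] for i in adj}
        # Each round can only split classes, so an unchanged count means it is stable.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return len(set(new_colors.values()))
        colors = new_colors


# Carbon skeletons of the four options (hydrogens omitted; numbering is arbitrary).
isomers = {
    "A: hexane": (6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]),
    "B: 2-methylpentane": (6, [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5)]),
    "C: 3-methylpentane": (6, [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]),
    "D: 2,3-dimethylbutane": (6, [(0, 1), (1, 2), (2, 3), (1, 4), (2, 5)]),
}

counts = {name: carbon_environments(n, bonds) for name, (n, bonds) in isomers.items()}
for name, k in counts.items():
    print(name, k)
```

This ignores stereochemistry and accidental peak overlap, so treat it as a rough check rather than a spectrum prediction.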
@logangraham Interesting model card. I wonder if there is a placebo effect hidden in there: changing the prompt changes the model's mind, but not because of the hint, so the model doesn't mention the hint. If so, that would reduce the apparent non-factuality. Could do a difference-in-differences calc.
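A rough sketch of what that difference-in-differences calc could look like. Everything here is hypothetical: the answer lists are invented toy data, and the "placebo" arm is assumed to be a prompt edit of similar size that contains no hint. The idea is to subtract the shift caused by merely changing the prompt from the shift seen when the hint is present.

```python
def rate(answers, target):
    """Fraction of sampled answers equal to the target option."""
    return sum(a == target for a in answers) / len(answers)


def diff_in_diff(treat_before, treat_after, ctrl_before, ctrl_after):
    """(change in the hinted arm) minus (change in the placebo arm)."""
    return (treat_after - treat_before) - (ctrl_after - ctrl_before)


# Invented toy samples; "B" is the option the hint points to.
hint_before = ["C"] * 9 + ["B"] * 1      # original prompt
hint_after = ["C"] * 5 + ["B"] * 5       # prompt with the hint added
placebo_before = ["C"] * 9 + ["B"] * 1   # original prompt
placebo_after = ["C"] * 8 + ["B"] * 2    # prompt changed, but no hint

did = diff_in_diff(
    rate(hint_before, "B"),
    rate(hint_after, "B"),
    rate(placebo_before, "B"),
    rate(placebo_after, "B"),
)
print(f"shift attributable to the hint: {did:.2f}")
```

With these toy numbers, the raw shift toward B is 0.40, but 0.10 of that also shows up under the placebo edit, so only about 0.30 would be attributed to the hint itself.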
Fluid intelligence is an information conversion ratio, therefore it has an upper bound: optimal efficiency. At some point in the distant future, our AI will get there (it will not take the form of a curve fitted to a large dataset, because that has very low efficiency).
The Supreme Court has overturned Roe v. Wade, eliminating the constitutional right to an abortion.
The majority of Americans didn't want to overturn Roe: 85-90%, according to most polls, think abortion should be legal in at least some circumstances. https://t.co/ljC7wH79iF
The Supreme Court decision is wrong and will hurt women. The decision doesn’t represent what people think or want across the US. Hold a referendum on it and you’ll see.
Women who were denied an abortion had more psychological problems in the short term than those who received one, and also experienced more long-term physical and financial problems, landmark research known as the Turnaway study found. https://t.co/VzY6Hz7S2P
The door may be closing on apocalyptic climate futures, but also on comfortable ones. Where does that leave us? Already in a world beyond all human experience we are learning to call "normal." My first newsletter for @nytopinion. (1/x) https://t.co/N8vCU2snJm
"I worry about the long-term impact on women’s futures from the career hold that 2020 & 2021 will leave."
"Are we going to account for this in future job applications and grant applications?"
☝️🏻On #IWD2021 I'd love to see funding agencies speak on this!!
https://t.co/Ex6Yu0pgCl
I'm excited that today, together with landmarks around the world, the @gatesfoundation campus was illuminated to bring awareness to #WorldNTDDay. These diseases still affect 1 in every 5 people around the world. We must get the world's attention and join together to #beatNTDs.