Datamap @datamapio - Twitter Profile

Pinned Tweet

Datamap @datamapio

over 2 years ago

#v1 #vélopolitain #paris #vélib

0

6

0

869

datamapio retweeted

Alex Prompter

@alex_prompter

6 months ago

This paper from Harvard and MIT quietly answers the most important AI question nobody benchmarks properly:

Can LLMs actually discover science, or are they just good at talking about it?

The paper is called “Evaluating Large Language Models in Scientific Discovery”, and instead of asking models trivia questions, it tests something much harder:

Can models form hypotheses, design experiments, interpret results, and update beliefs like real scientists?

Here’s what the authors did differently 👇

• They evaluate LLMs across the full discovery loop: hypothesis → experiment → observation → revision
• Tasks span biology, chemistry, and physics, not toy puzzles
• Models must work with incomplete data, noisy results, and false leads
• Success is measured by scientific progress, not fluency or confidence

What they found is sobering.

LLMs are decent at suggesting hypotheses, but brittle at everything that follows.

✓ They overfit to surface patterns
✓ They struggle to abandon bad hypotheses even when evidence contradicts them
✓ They confuse correlation for causation
✓ They hallucinate explanations when experiments fail
✓ They optimize for plausibility, not truth

Most striking result:

`High benchmark scores do not correlate with scientific discovery ability.`

Some top models that dominate standard reasoning tests completely fail when forced to run iterative experiments and update theories.

Why this matters:

Real science is not one-shot reasoning.

It’s feedback, failure, revision, and restraint.

LLMs today:

• Talk like scientists
• Write like scientists
• But don’t think like scientists yet

The paper’s core takeaway:

Scientific intelligence is not language intelligence.

It requires memory, hypothesis tracking, causal reasoning, and the ability to say “I was wrong.”

Until models can reliably do that, claims about “AI scientists” are mostly premature.

This paper doesn’t hype AI. It defines the gap we still need to close.

And that’s exactly why it’s important.

378

8K

2K

6K

1M

datamapio retweeted

François Chollet

@fchollet

7 months ago

You rarely solve hard problems in a flash of insight. It's more typically a slow, careful process of exploring a branching tree of possibilities. You must pause, backtrack, and weigh every alternative. You can't fully do this in your head, because your working memory is too limited. Writing is the external medium that affords the time and precision necessary. Serious thinking must be done in writing. And that's why you can't outsource your writing, because then you're outsourcing your thinking.

92

3K

312

1K

127K

datamapio retweeted

Rohan Paul

@rohanpaul_ai

8 months ago

🧵5/n. 🧪 Critical thinking effects

When answers arrive instantly, people practice evaluation and reasoning less, and the paper reports measurable declines in critical‑thinking scores among heavy users explained by offloading behavior.

The punchline is not anti‑tool, it is that over‑delegation breeds standardized critical thinking, where everyone leans on the same shortcuts.

1

18

3

5

883

datamapio retweeted

Ben Norton

@BenjaminNorton

8 months ago

"As a result of [China's] massive supply, the cost of generating electricity from solar has now fallen to a global average of around $0.04 per kilowatt hour—making it the cheapest energy source in history". https://t.co/dazMdg5ndJ Meanwhile, Western officials complain about China's so-called "overcapacity", which is precisely what is making a transition away from fossil fuels possible for the world. As this physicist writes: "one thing is clear: while China is making political decisions based on scientific evidence and while it is flooding the market with cheap solar energy, the Western world is sinking in a quagmire of self-righteous debate consisting of right-wing lies and left-wing virtue signaling. We need to get serious about how China is offering a way to combat climate change".

80

3K

855

347

119K

datamapio retweeted

François Chollet

@fchollet

9 months ago

I like the analogy of the "bicycle for the mind", because riding a bike requires effort from you, and the bike multiplies the effect of that effort. I don't think the end goal of technology should be to let you sit around and twiddle your thumbs.

64

1K

168

238

91K

datamapio retweeted

François Chollet

@fchollet

9 months ago

Software engineers shouldn't fear being replaced by AI. They should fear being asked to maintain the sprawling mess of AI-generated legacy code their employer's systems will soon run on. Because that one will actually happen.

342

8K

943

745

332K

Datamap @datamapio

9 months ago

@drfeifei Sounds a lot like this: https://t.co/m5qfaTADRr

0

13

datamapio retweeted

Prof. Eliot Jacobson

@EliotJacobson

11 months ago

Will this be the last Keeling curve update from NOAA for CO2 at Mauna Loa? June 2025: 429.61 ppm. This may be a tragic moment:

119

3K

812

316

204K

datamapio retweeted

Gabriel Zucman

@gabriel_zucman

11 months ago

An international call for action just got louder: Today, 7 Nobel Laureates have issued a powerful call for a minimum tax on the ultra-wealthy in Le Monde Here’s a quick breakdown of the debate—and where things stand globally https://t.co/kJKbMhPj5R 🧵

18

508

211

82

74K

datamapio retweeted

ChrisO_wiki

@ChrisO_wiki

about 1 year ago

1/ This graph from @JonBruner tells an important story: America's current dominance in science only began after the mid-1930s, when persecuted scientists began fleeing universities in Germany and then elsewhere in occupied Europe.

https://t.co/EzqXTzAfkr

105

6K

2K

1K

343K

datamapio retweeted

John B. Holbein

@JohnHolbein1

about 1 year ago

You've heard of the studies where they give the same dataset/research question to a bunch of researchers and they tend to get different answers, right? Why is that? This new working paper shows that it has a lot to do with data cleaning. This is consistent with Gelman's "garden of forking paths" analogy. Small researcher coding decisions greatly influence results, often without being explicitly acknowledged.

24

2K

350

1K

211K

datamapio retweeted

Albert Pinto @70sBachchan

about 1 year ago

Mexico's president Claudia Sheinbaum is an energy systems expert. She is positioning Mexico to lead in the global green economy, from EVs & batteries to renewables, critical minerals, and HVAC manufacturing. Her Plan Mexico is at a critical juncture. Our report: https://t.co/YFPeRpGN3Z

https://t.co/E99wP8YnyN

9

352

121

163

41K

datamapio retweeted

John B. Holbein

@JohnHolbein1

about 1 year ago

Overall, many employers in their sample have a distinct Democratic tilt. Look at how few sectors are dominated by Republicans! {This could have something to do with what orgs are in their database, but their sample is quite large!}

https://t.co/H0KfB6o5g2

11

592

42

104

60K

datamapio retweeted

Robert Sterling

@RobertMSterling

about 1 year ago

Now, looping back around, how does a dive bar tie into this? When our team visited the mill, we would stay in Memphis, about an hour away. At the end of the day, we would swing by the closest bar—Bar Dog—for a nightcap. And the bar was usually full of people speaking German.

https://t.co/IPqK5sR9Xu

11

1K

63

34

69K

datamapio retweeted

Kyla Scanlon

@kylascan

about 1 year ago

21

915

93

265

100K

datamapio retweeted

Virat Chauhan @viratzzs

about 1 year ago

amazing graph on open source access to research and its byproducts by @percyliang

0

17

6

9

6K

Datamap @datamapio

about 1 year ago

@JeffWeniger @Noahpinion I lived in France, Switzerland and the US. I had the highest salary in the US, but it still felt way less than in the other two countries.

0

22

Datamap @datamapio

about 1 year ago

@JeffWeniger @Noahpinion On average. Look at the median: https://t.co/YAsfhWD5EP

0

15

datamapio retweeted

John Burn-Murdoch

@jburnmurdoch

about 1 year ago

NEW 🧵: Is human intelligence starting to decline? Recent results from major international tests show that the average person’s capacity to process information, use reasoning and solve novel problems has been falling since around the mid 2010s. What should we make of this?

https://t.co/WyZoNEcTd3

2K

17K

5K

8K

4M

Datamap @datamapio

about 1 year ago

https://t.co/zSTd0zMWTl

0

53
