Andy Xor @andy_xor - Twitter Profile

Pinned Tweet

over 5 years ago

Actual AI manifesto: "Actual" means "not fake". We believe 'deep learning' is a dead-end and actual AI will come from understanding animal cognition (including human neuroscience). One has to respect the constraints of biological learning & memory in order to imitate them.

27

49

4

6

0

andy_xor retweeted

Rohan Paul

@rohanpaul_ai

about 14 hours ago

Beautiful paper from Google DeepMind.

Explains the pathways from AGI to ASI, and why that jump could happen through several routes.

The authors frame the AGI-to-ASI transition around 4 technical pathways:

- continued scaling of compute, model size, data, and test-time inference;

- algorithmic paradigm shifts beyond today’s transformer-based foundation-model stack;

- recursive self-improvement, where AI accelerates AI R&D and improves future systems; and

- multi-agent collective intelligence, where large populations of specialized agents coordinate into a superhuman group agent.

Scaling may work for a while, but it could hit limits in data, compute, energy, or weaker returns from making systems larger.

Recursive improvement is the most uncertain path, because AI could speed up AI research, but that loop may also slow if hard research problems need real-world testing, scarce hardware, or new ideas.

Multi-agent collectives may be the most underappreciated path, because a society of competent digital workers could outperform a brilliant individual model through specialization, speed, and coordination.

The big point is that ASI may not arrive as 1 sudden event, but as a chain of faster changes as AI helps create better AI and stronger scientific tools.

----

Link – arxiv. org/abs/2606.12683

Title: "From AGI to ASI"

17

362

79

279

17K

andy_xor retweeted

Jeff Dean

@JeffDean

about 19 hours ago

Quite interesting thread on capabilities of real biological neurons (spoiler: they're way more capable than classical artificial neurons in a perceptron). Nice work @IdoAizenbud and collaborators!

25

574

69

354

76K

andy_xor retweeted

Daily Wire

@realDailyWire

4 days ago

You mean after an African migrant tried to behead him.

26

10K

519

65

236K

andy_xor retweeted

signüll

@signulll

4 days ago

this is absolutely incredible.

102

2K

99

299

222K

andy_xor retweeted

Science Simplified

@Scivf4

9 days ago

This is what the AI brain looks like.

290

11K

2K

4K

1M

andy_xor retweeted

Rohan Paul

@rohanpaul_ai

8 days ago

Nemotron 3 Ultra vs GPT-5.5 on atomic[.]chat, a desktop app that runs LLMs locally.

Nemotron 3 Ultra gave a very similar result on a test to build an HTML5 canvas with real physics, while being 10X cheaper.

- Nemotron 3 Ultra: 11.3k tokens, $0.051

- GPT 5.5: 11.0k tokens, $0.57

Nemotron 3 Ultra has 550 bn total parameters (55 bn active per token), because it is a Mixture-of-Experts model.
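
The figures quoted in this tweet can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the tweet; the dictionary layout and the helper name `cost_per_1k_tokens` are illustrative choices, not part of any tool mentioned above. On these numbers the cost ratio comes out at roughly 11x, consistent with the "10X cheaper" claim, and the active share of parameters per token is one tenth of the total.

```python
# Figures as quoted in the tweet (token counts and total cost per run).
nemotron = {"tokens": 11_300, "cost_usd": 0.051}
gpt = {"tokens": 11_000, "cost_usd": 0.57}


def cost_per_1k_tokens(run):
    """Hypothetical helper: average cost in USD per 1,000 tokens for one run."""
    return run["cost_usd"] / run["tokens"] * 1000


# How many times more expensive the GPT run was than the Nemotron run.
cost_ratio = gpt["cost_usd"] / nemotron["cost_usd"]

# Mixture-of-Experts: share of parameters active per token (55 bn of 550 bn).
active_fraction = 55 / 550

print(f"cost ratio: {cost_ratio:.1f}x")
print(f"Nemotron cost per 1k tokens: ${cost_per_1k_tokens(nemotron):.4f}")
print(f"GPT cost per 1k tokens: ${cost_per_1k_tokens(gpt):.4f}")
print(f"active parameter fraction: {active_fraction:.0%}")
```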

10

139

14

64

30K

andy_xor retweeted

OpenAI

@OpenAI

9 days ago

We’ve been researching new ways for ChatGPT memory to carry context across conversations and keep it useful over time. Today, that work is rolling out as a more capable memory system in ChatGPT. https://t.co/0MyFKCe2Mu

743

10K

1K

3K

3M

andy_xor retweeted

Greg Brockman

@gdb

9 days ago

much better ChatGPT memory:

127

1K

44

71

133K

andy_xor retweeted

Flo Crivello

@Altimor

9 days ago

Pulled the trigger today and switched 100% of Lindy traffic to DeepSeek v4, churning from Anthropic models. Saves us millions of $ and we're actually seeing an *increase* in performance on many core use cases. Transformative for the business.

169

3K

162

1K

928K

andy_xor retweeted

Chubby♨️

@kimmonismus

9 days ago

Holy moly, Anthropic is getting very serious about recursive self-improvement!

One word: acceleration.

Insane blog article.

Tl;dr:

•We are close to an AI capable of fully autonomously designing and building its own successor

•They stress this isn’t here yet and isn’t inevitable, but could arrive sooner than most institutions are ready for

•Anthropic engineers now ship on average 8x as much code per quarter as they did in 2021–2025

•Task length AI can reliably complete is doubling roughly every 4 months (up from every 7 months)

•Opus 3 (Mar 2024) handled ~4-minute tasks; Sonnet 3.7 (a year later) ~90-minute tasks; Opus 4.6 (a year after that) 12-hour tasks

•SWE-bench went from low single digits to saturated in two years; CORE-bench (research reproduction) went ~20% to saturated in 15 months

•METR found Claude Mythos Preview could work “at least” 16 hours, at the top of what they can currently measure

•As of May 2026, Claude authored 80%+ of code merged into Anthropic’s codebase (low single digits before Claude Code launched in Feb 2025)

•A March 2026 poll of 130 research staff: median respondent estimated ~4x output with Mythos Preview

•One April 2026 example: Claude shipped 800+ fixes cutting a class of API errors 1,000x, work an engineer estimated would have taken a human four years

•Claude-written code quality: worse than human in late 2025, roughly at parity now, expected to be strictly better within the year

•On the hardest open-ended tasks, Claude’s success rate hit 76% in May 2026, up 50 points in six months

•Code-speedup test: Opus 4 averaged ~3x speedup (May 2025), Mythos Preview ~52x (April 2026); a skilled human needs 4–8 hours to hit 4x

•In an AI-safety research project, Claude agents recovered 97% of a performance gap (vs ~23% for two human researchers in a week), over 800 compute-hours and ~$18K

•On picking the better “next step” in research sessions, the best model beat the human choice 51% (Nov 2025, Opus 4.5) rising to 64% (April 2026, Mythos Preview)

•Human comparative advantage, for now: research taste and judgment, i.e. choosing which problems matter and when an approach is a dead end

Three possible futures

•The trend stalls (S-curve), but today’s capabilities still diffuse widely; they consider this least likely

•Compounding efficiency gains, with humans still setting direction; 100-person firms doing the work of 10,000+; they think this is the likely path

•Full recursive self-improvement, where AI builds its successors and pace is set by compute; the alignment outcome here is what they’re least certain about
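
The task-length figures in the summary above allow a rough doubling-time check. The sketch below assumes clean exponential growth between each pair of quoted data points and exactly 12 months between them ("a year later"); both are simplifying assumptions, and the function name `doubling_time_months` is illustrative only. Under those assumptions, the step from ~90 minutes to 12 hours is three doublings in a year, i.e. one every 4 months, while the step from ~4 to ~90 minutes implies a doubling roughly every 2.7 months. Two points per interval are too few to confirm or refute the quoted 4-month and 7-month trend figures, which presumably come from a broader fit.

```python
import math


def doubling_time_months(start_minutes, end_minutes, months_elapsed):
    """Implied doubling time, assuming exponential growth between two points."""
    doublings = math.log2(end_minutes / start_minutes)
    return months_elapsed / doublings


# Task lengths as quoted in the summary; 12-month gaps are an assumption.
opus3_to_sonnet37 = doubling_time_months(4, 90, 12)
sonnet37_to_opus46 = doubling_time_months(90, 12 * 60, 12)

print(f"~4 min -> ~90 min: doubling every {opus3_to_sonnet37:.1f} months")
print(f"~90 min -> 12 h:   doubling every {sonnet37_to_opus46:.1f} months")
```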

97

2K

162

572

258K

andy_xor retweeted

NVIDIA

@nvidia

9 days ago

Introducing NVIDIA Nemotron 3 Ultra. A frontier smart open model built for long-running agents that need to plan, reason, use tools and keep working across complex coding, research and enterprise workflows. Up to 5x faster inference and up to 30% lower cost for agentic tasks. Learn more: https://t.co/h9XLqqYPFf

120

2K

268

449

224K

andy_xor retweeted

Sundar Pichai

@sundarpichai

10 days ago

Our new Gemma 4 12B model hits a sweet spot between size + performance: it can run locally on a laptop, while enabling powerful multi-step reasoning and agentic workflows. Can’t wait to see what the community does with this one!

239

5K

368

775

427K

andy_xor retweeted

Elon Musk

@elonmusk

9 days ago

Hadamard thought in image space

2K

19K

4K

5K

7M

Andy Xor @andy_xor

9 days ago

@ProtectOurCare Stop transing kids

0

12

andy_xor retweeted

hardmaru

@hardmaru

17 days ago

For over a decade, we’ve accepted that end-to-end backprop is the only way to train deep networks. But holding the entire network in memory all at once is why AI training is hitting a resource wall. We found a new way to break the network into blocks and train them independently. The trick? Treating the network’s forward pass like a diffusion model denoising a signal. This reinterpretation slashes the memory needed to train deep models. In our #ICLR2026 paper (https://t.co/PK5h0mqQSo), we matched end-to-end performance across ViTs, DiTs, and LLMs. We did this while training just one isolated block at a time.

154

6K

637

4K

741K

andy_xor retweeted

Luke Martin

@VentureCoinist

17 days ago

the new @Starlink videos are crazy because it makes you realize there's a decent chance we become aliens to someplace else before they ever visit us here

imagine showing this to someone from 100 years ago on earth, they would think we've already been invaded

81

2K

79

200

442K

andy_xor retweeted

Daniel Friedman

@DanFriedman81

23 days ago

After seeing the footage of the October 7 atrocities and watching American leftists celebrate with paraglider memes, I can’t overstate how important it is to me for every single Hamas paraglider guy to be killed. I want the activists to know that Israel killed every single one of them.

396

9K

1K

369

270K

andy_xor retweeted