Prudent AI @prudent_AI - Twitter Profile

Pinned Tweet

Prudent AI @prudent_AI

27 days ago

Is this for real 😂 @radbackwards @BerntBornich

1

3

0

60

prudent_AI retweeted

Chubby♨️

@kimmonismus

about 9 hours ago

Holy moly, Anthropic is getting very serious about recursive self-improvement!

One word: acceleration.

Insane blog article.

Tl;dr:

•We are close to an AI capable of fully autonomously designing and building its own successor

•They stress this isn’t here yet and isn’t inevitable, but could arrive sooner than most institutions are ready for

•Anthropic engineers now ship on average 8x as much code per quarter as they did in 2021–2025

•Task length AI can reliably complete is doubling roughly every 4 months (up from every 7 months)

•Opus 3 (Mar 2024) handled ~4-minute tasks; Sonnet 3.7 (a year later) ~90-minute tasks; Opus 4.6 (a year after that) 12-hour tasks

•SWE-bench went from low single digits to saturated in two years; CORE-bench (research reproduction) went ~20% to saturated in 15 months

•METR found Claude Mythos Preview could work “at least” 16 hours, at the top of what they can currently measure

•As of May 2026, Claude authored 80%+ of code merged into Anthropic’s codebase (low single digits before Claude Code launched in Feb 2025)

•A March 2026 poll of 130 research staff: median respondent estimated ~4x output with Mythos Preview

•One April 2026 example: Claude shipped 800+ fixes cutting a class of API errors 1,000x, work an engineer estimated would have taken a human four years

•Claude-written code quality: worse than human in late 2025, roughly at parity now, expected to be strictly better within the year

•On the hardest open-ended tasks, Claude’s success rate hit 76% in May 2026, up 50 points in six months

•Code-speedup test: Opus 4 averaged ~3x speedup (May 2025), Mythos Preview ~52x (April 2026); a skilled human needs 4–8 hours to hit 4x

•In an AI-safety research project, Claude agents recovered 97% of a performance gap (vs ~23% for two human researchers in a week), over 800 compute-hours and ~$18K

•On picking the better “next step” in research sessions, the best model beat the human choice 51% (Nov 2025, Opus 4.5) rising to 64% (April 2026, Mythos Preview)

•Human comparative advantage, for now: research taste and judgment, i.e. choosing which problems matter and when an approach is a dead end

Three possible futures

•The trend stalls (S-curve), but today’s capabilities still diffuse widely; they consider this least likely

•Compounding efficiency gains, with humans still setting direction; 100-person firms doing the work of 10,000+; they think this is the likely path

•Full recursive self-improvement, where AI builds its successors and pace is set by compute; the alignment outcome here is what they’re least certain about

65

1K

114

386

137K

prudent_AI retweeted

Bernt Bornich

@BerntBornich

about 10 hours ago

We’re going all in on World Models.

Today we’re launching the 1X World Model Lab.

The bet is simple:

You can’t fine-tune your way to AGI.

And you definitely can’t fine-tune your way to robots that can operate in the physical world.

General-purpose humanoids need models that understand space, motion, objects, causality, affordances, physics, and action before they ever see a specific task.

The frontier is not better VLA wrappers.

The frontier is embodied world models.

The 1X World Model Lab will focus on large-scale embodied world model pretraining: building the most generalizable foundation model for humanoid robots from the ground up.

The next frontier in AI requires scaling:

web-scale media + egocentric human videos + sim + dexterous remote operated robot data + on-policy NEO data → real-world deployment for robot data collection and RL → abundance of data → physical AI

The robot collects data.
The model gets better.
The robot gets better.
Repeat.

To lead this, we brought in one of the best for the mission: @_sam_sinha_ , as Head of World Models.

Sam was a founding research scientist at Luma AI and has been at the frontier of scaling multimodal generative video models his whole career.

If you’re the best in the world at large-scale pretraining, video models, robotics, RL, infra, or data — and you want your models to move atoms, not just pixels — join us.

Send background + evidence of exceptional ability to:

wmlab@1x.tech

We’re building the model that makes autonomous labor real.

87

2K

126

638

131K

prudent_AI retweeted

Boston Dynamics @BostonDynamics

2 days ago

If you've ever worked on your feet all day, you know how important a good pair of shoes is. Mechanical design engineer Chastity Kelly shares how we built Atlas functional feet. Learn more about Atlas' design, Chastity's path to robotics, and starting on the right foot: https://t.co/7rdiNnMIYc

16

392

60

69

37K

prudent_AI retweeted

Humanoids daily

@humanoidsdaily

4 days ago

🚨Breaking: NVIDIA and Sharpa have just unveiled the NVIDIA Isaac GR00T Reference Humanoid Robot at GTC Taipei, a standardized, open hardware and software reference design for physical AI research.

The platform unifies hardware, sensing, and compute into a single out-of-the-box development stack to eliminate fragmented workflows and speed up sim-to-real deployment.

The hardware specs:

• Body: Unitree H2 Plus full-sized chassis (31 DOF)

• Hands: Dual Sharpa Wave dexterous hands (22 DOF per hand) featuring a visuo-tactile array with >1,000 pixels per fingertip and 0.02N sensitivity for complex, contact-rich manipulation.

• Brain: NVIDIA Jetson AGX Thor T5000 onboard compute delivering 2,070 FP4 TFLOPS of Blackwell-architecture AI performance.

The reference platform comes preloaded with the NVIDIA GR00T 1.7 humanoid model and works natively with Isaac Lab simulation and teleoperation libraries.

Timing is everything: This massive partnership drops on the exact day of Unitree’s STAR Market IPO listing hearing in Shanghai, and amidst a proposed U.S. federal procurement ban on Chinese-made unmanned ground vehicles.

Shipping is slated to begin in October 2026 to research labs worldwide.

5

111

25

32

57K

prudent_AI retweeted

Nikita Bier

@nikitabier

3 days ago

Commentary is one of the most important pillars of X. And sometimes the best way to share your thoughts is with video.

Today we're launching a whole new way to make them: React with Video

Tap the repost button and start recording with green screen, split screen, or picture-in-picture.

Now available on iOS

4K

14K

2K

7M

Prudent AI @prudent_AI

2 days ago

@nikitabier This is amazing. Finally, a good feature from the X team.

0

1

21

prudent_AI retweeted

Lukas Ziegler

@lukas_m_ziegler

4 days ago

🚨 BREAKING: NVIDIA just announced the Isaac GR00T Reference Humanoid Robot.

The first fully open humanoid robot reference design built on Jetson Thor, and it's going straight to the world's top research institutions.

This is Jensen Huang's bet on open physical AI infrastructure.

The hardware stack is serious:

→ Unitree H2 Plus chassis, 6 feet tall, 150 pounds, 31 degrees of freedom

→ Sharpa Wave tactile five-finger hands, 22 degrees of freedom, bringing total to 75 across the full body

→ NVIDIA Jetson AGX Thor onboard compute, 2,070 FP4 teraflops of AI performance, 128GB unified memory

→ Multi-view sensing, stereo head camera, wrist cameras, IMU

Alongside this announcement, @UnitreeRobotics also introduced the H2 Plus as a standalone product, a frontier humanoid combining Unitree's own body, Sharpa's five-finger hands and NVIDIA Robotics Jetson Thor compute into one fully integrated research platform.

The full Isaac GR00T software stack ships with it, teleoperation for data capture, open foundation models, Isaac Sim for training, Isaac Lab for evaluation, and accelerated ROS middleware for deployment. The complete loop from data to real-world robot in one unified platform.

ETH Zürich, Stanford Robotics Center, UC San Diego and Ai2 are already on board as launch research partners.

@NVIDIARobotics did to AI what it's now doing to robotics, build the platform, open the ecosystem, let the world build on top of it.

Whoever owns the infrastructure layer wins. NVIDIA knows this better than anyone. 👀

Read more here: https://t.co/rQWJTquS0X

~~

♻️ Join the weekly robotics newsletter, and never miss any news → https://t.co/GoA3ZuwoPB

11

229

55

72

15K

Prudent AI @prudent_AI

5 days ago

Mistral enters the robotics race 🔥

Olivier Duchenne

@inventorOli

7 days ago

Robostral can now follow natural language instructions. It responds to voice commands and pointing. It is also getting better at fine-grained manipulation where precise force control matters. It generalizes to new objects and tasks not present in the training data.

25

450

63

153

41K

0

17

Prudent AI @prudent_AI

5 days ago

@kimmonismus I am actually surprised that Google held back the Veo 4 release from the I/O event. It would undoubtedly have been the highlight of the day.

0

71

prudent_AI retweeted

Space and Technology

@spaceandtech_

7 days ago

RAI Institute’s Ultra mobility vehicle is showing off advanced new stunts with smooth 360-degree spins, kip jumps, flips, and bunny hops. The AI-powered machine performs sharp turns, mid-air rotations, and clean landings with impressive balance and precision. And this is only the beginning of what the vehicle can achieve.

23

591

156

129

56K

Prudent AI @prudent_AI

8 days ago

@ai_for_success Perplexity, OpenAI, Google all announced paid subscriptions and then offered these plans free for a period. Curious what Meta is going to do.

0

37

Prudent AI @prudent_AI

9 days ago

IT industry that is built on outsourcing, but not on building products. Sorry to say we are doomed with this leadership in the entire industry. And Nandan Nilekani is clearly not the face India should look at, if we're serious about survival.

0

1

0

31

Prudent AI @prudent_AI

9 days ago

I am actually surprised to see that Indian IT majors are not at all aggressive about investing in R&D or in promising AI startups. HCL recently invested $150 million in Sarvam AI. Apart from this, I see absolutely no big initiative so far.

1

0

61

Prudent AI @prudent_AI

9 days ago

Now don't tell me about Zoho investing in Zia, or Mahindra currently investing in small language models with plans to build LLMs later. If this is the aggression India is prepared to show in the face of a dinosaur that is coming to swallow the entire IT industry.

1

0

26

prudent_AI retweeted

NIK

@ns123abc

11 days ago

🚨 Google DeepMind CEO Sir Demis Hassabis: “Today’s systems, are nowhere near [AGI]. Doesn’t matter how many Erdős problems you solve… I think it’s far, far from what a true invention or someone like a Ramanujan would have been able to do” it’s over for the Erdős hype

172

5K

455

1K

729K

Prudent AI @prudent_AI

11 days ago

@kimmonismus 😂

0

58

prudent_AI retweeted

Rohan Paul

@rohanpaul_ai

12 days ago

🇨🇳 China's Hangzhou Airport is now using its first track-guided bird-dispersion robot. Has directional sound devices, insect-killing lamps & cameras. Gives runways 24/7 protection with smart patrols, HD cameras, and a greener way to keep birds away.

10

120

23

27

13K

Prudent AI @prudent_AI

12 days ago

@kimmonismus With hype and fear, enterprises around the world will be ready to pay a premium price for the model. That's the strategy.

0

1

0

130

prudent_AI retweeted

Dwarkesh Patel

@dwarkesh_sp

13 days ago

New blackboard lecture w @reinerpope

How do chips actually work – starting with basic logic gates, and working up to why GPUs, TPUs, FPGAs, and the human brain each look the way they do.

0:00:00 – Building a multiply-accumulate from logic gates
0:16:20 – Muxes and the cost of data movement
0:25:59 – How systolic arrays work
0:39:00 – Clock cycles and pipeline registers
0:51:40 – FPGAs vs ASICs
1:03:14 – Cache vs scratchpad
1:07:16 – Why CPU cores are much bigger than GPU cores
1:11:49 – Brains vs chips
1:15:22 – A GPU is just a bunch of tiny TPUs

Look up Dwarkesh Podcast on YouTube/Spotify/etc to watch. Enjoy!

93

6K

721

7K

919K

prudent_AI retweeted

shirish

@shiri_shh

12 days ago

chinese startup built an AI collar that translates barks and meows into full sentences. 95% accuracy. cost $118. 10k people have already pre-ordered it. It uses mics, motion sensors, and AI to read body language and vocalizations.

406

5K

335

3K

2M
