Diego @diegocaples - Twitter Profile

diegocaples retweeted

Markov Robotics

@markovrobotics

11 days ago

0-shot pick and place on unseen objects with less than an hour of teleoperation data

0

8

3

1

493

diegocaples retweeted

Greg Tarr

@Greg_Tarr

8 months ago

we @agi_inc just achieved 76.3% on the OSWorld benchmark taking the #1 spot from ByteDance (53.1%)

2

11

1

0

793

diegocaples retweeted

AGI, Inc. @agi_inc

8 months ago

AGI surpasses human-level performance at computer use. We’re excited to announce that AGI, Inc. is now the global leader on OSWorld-Verified, the industry benchmark for AI computer-control. agi-0 is the first agent to reach a superhuman score on OSWorld, with a score of 76.2%.🔥 Learn more about it in our company blog post from @_gundawar: 👇 https://t.co/2V36lNC3M1

14

177

9

20

48K

diegocaples retweeted

Weights & Biases

@wandb

8 months ago

🏆 Grand Prize Winners: Daydreamer @diegocaples @_gundawar They're tackling the "GPT Moment for Robotics." Their agent uses a video diffusion model to imagine a successful outcome, executes it in the real world, and then uses VLM feedback to self-improve, training only on its successes.

1

10

1

8

1K

diegocaples retweeted

Stephen James

@stepjamUK

8 months ago

DLR researchers gave a robotic arm full-body touch sensitivity with no artificial skin needed. They used internal force-torque sensors at 8 kHz + deep learning. The robot can feel where you touch it, recognize letters drawn on its surface, and respond to virtual buttons placed anywhere on its body. What's interesting is the infrastructure behind it. To train these models, you need high-frequency sensor streams, manifold learning to unfold trajectories, and the ability to iterate fast. They collected 2,300 samples from 20 people and hit 95.5% accuracy on digit recognition. This is what's possible when you have the right data infrastructure. 📄 https://t.co/yadvb1iKnW Video credit: @DLR_en

56

2K

340

983

174K

diegocaples retweeted

AGI, Inc. @agi_inc

8 months ago

AGI, Inc. is now the global leader on the AndroidWorld benchmark, with state-of-the-art verified performance of 97.4%. This is a huge milestone for Android use, and just a sneak preview of what's coming - bringing trustworthy, reliable agents to every screen 🚀

https://t.co/pg0J6bZRp5

31

301

87

21

51K

diegocaples retweeted

Lucas Beyer (bl16)

@giffmana

9 months ago

Did you know that when they say stuff like "The A18 uses TSMC's 3nm process" or "announced the 2nm node" The 3nm, 2nm actually doesn't mean anything?! It's just like a version number. They make it up. Literally nothing measures 2nm or 3nm. I certainly didn't know.

https://t.co/oPvPsqsLs6

333

9K

521

2K

773K

diegocaples retweeted

Crémieux

@cremieuxrecueil

9 months ago

Waymo is so safe that if every car was driven like a Waymo, about 9% of America's life expectancy gap would disappear. 9 percent. Americans die in car accidents *that often*.

https://t.co/GaxmnBuyQZ

228

7K

644

908

1M

Diego

@diegocaples

12 months ago

@tiff_soerianto Thanks!

0

11

Diego

@diegocaples

12 months ago

@jxmnop Because the models produced by this method are very different than the models learned by gradient descent. While this does give us a “ground truth” to benchmark interp methods on, the results don’t generalize to actual learned models.

0

7

0

320

diegocaples retweeted

PicoCreator - AI builder @ ✈️🌉

@picocreator

about 1 year ago

SOTA AI agent that reliably works... where Claude, Gemini, and o3 fail... to do the boring chores in life... @FeatherlessAI is making this possible, as part of our work into AI reliability Surpassing existing frontier models & agents by 50%+

4

118

21

74

27K

diegocaples retweeted

AGI, Inc. @agi_inc

about 1 year ago

🚀 INTRODUCING REAL Bench: Our New Standard for Web AI Agent Evaluation We're thrilled to announce the release of REAL Bench - our groundbreaking benchmark to transform how web AI agents are evaluated! Why we created REAL Bench: ✅ We built functional replicas of popular websites to test what agents can REALLY do ✅ We wanted to measure ACTUAL performance, not academic abstractions ✅ We compared leading frameworks including BrowserUse (31%) and StageHand (19%) What web tasks would YOU like to see AI agents tackle? Join our community to be part of the agentic revolution reshaping AI! ⚡ 👉 Explore REAL Bench → [https://t.co/wdDqtPhk2a] 🛠️ Try REAL Bench and get your REAL score today → [https://t.co/baXfnhs2pC]

18

160

16

29

120K

diegocaples retweeted

Div Garg

@divgarg

about 1 year ago

Learn to build AGI agents you actually want to work with 🔥 Sign up and follow 👉: https://t.co/xzjlVPtBFl In collaboration with @AndrewYNg and @DeepLearningAI!

6

70

8

16

15K

diegocaples retweeted

DeepLearning.AI

@DeepLearningAI

about 1 year ago

AI agents that can browse the web, fill out forms, and even place online orders are no longer just research demos—they’re being built today. But real-world websites are complex. Layouts change. Popups appear. And one wrong click can cascade into booking the wrong flight or buying the wrong product. In our new course, Building AI Browser Agents, made in collaboration with @agi_inc, you’ll learn how to build web agents and how to make them more reliable using AgentQ, a framework that helps agents self-correct. Guided by instructors @divgarg and @namangarg0, you’ll build agents step-by-step: from scraping and summarizing, to signing up for newsletters, to navigating the open web and choosing optimal actions. 👉 Learn for free: https://t.co/Poa7kJ4WM7

11

420

73

267

38K

diegocaples retweeted

AGI, Inc. @agi_inc

about 1 year ago

Good AGI agents complete tasks. Great ones check their own work. Discover how to build them in our new course with @DeepLearningAI. Enroll now! https://t.co/P1R495nEXQ

7

64

31

15

14K

diegocaples retweeted

Meghna Natraj @NatrajMeghna

about 1 year ago

We won 1st Place! 🏆 Our hackathon project 'AutoRL: Reinforcement Learning is all you Need' trains open-source LLMs via RL to master tools (MCPs) rivaling closed-source models. Proud of the team: @diegocaples, @thomastjoshi, @xdotli! Thank you @JvNixon! #RL #LLM #ML #AI #AGIHouse

3

11

2

6

2K

diegocaples retweeted

AGI House SF

@AGIHouseSF

about 1 year ago

Anthropic brought Model Context Protocol to life. We gathered 200+ elite hackers for 12 hours to build the open source future of AI agent connections. Here's what we saw at the Finally Connected MCP Hackathon, where LLMs met the real world, with @AnthropicAI, @SmitheryDotAI, @kodjima33, @ExaAILabs, by @JvNixon:

5

95

21

56

21K

diegocaples retweeted

AGI House SF

@AGIHouseSF

about 1 year ago

1/ AutoMCP 🥇 1st Place ToolMaster RL - Training open-source LLMs to excel with MCPs through reinforcement learning. This project creates an environment where models learn tool usage through trial and error rather than prompt engineering. "Reinforcement Learning is All You Need" for transforming mediocre open-source models into tool-using experts that rival closed-source alternatives. Diego Caples, @diegocaples Thomas Joshi, @thomastjoshi Meghna Natraj, @NatrajMeghna Xiangyi Li, @xdotli

6

31

6

29

6K

Diego

@diegocaples

over 1 year ago

@rg7777777777 @realDonaldTrump

0

2

0

120
