LDJ @ldjconfirmed - Twitter Profile

Pinned Tweet

LDJ

@ldjconfirmed

about 2 years ago

Moore's law created AI to save itself.

5

67

7

13

11K

LDJ

@ldjconfirmed

about 15 hours ago

@kelstar_ @NousResearch @OpenRouter Not sure why that would impact the rankings in this way. But further update now; Hermes Agent is now #1 on the monthly global charts too, and is so far ahead that it has more usage than OpenClaw, Kilo Code and Claude Code combined.

Tweet photo: https://t.co/yjtYPS1r74

1

0

28

LDJ

@ldjconfirmed

11 days ago

@willdepue https://t.co/KiveCK9mmm

LDJ

@ldjconfirmed

over 1 year ago

GODMAAX is the new FAANG
G = Google (DeepMind)
O = OpenAI
D = DeepSeek
M = Meta
A = Anthropic
A = Alibaba (Qwen)
X = xAI

7

83

3

19

5K

0

65

LDJ

@ldjconfirmed

24 days ago

@NousResearch @kelstar_ @OpenRouter Already now #1 on the weekly chart too😎

1

0

37

about 1 month ago

@anko_979 @BlackHC It’s a common misconception that it was pulled. It wasn’t; its placement in the code of conduct was simply changed from the preface to the ending section, where it still exists today.

1

0

44

LDJ

@ldjconfirmed

about 1 month ago

Roughly related, but I don't think they intend to use any GDPval score as evidence for AGI being achieved. They say in the paper themselves that it's largely work with a time horizon of only a few hours, and that the context is much more assisted than a real job. GDPval is also far easier and more saturated than something like RemoteLaborIndex, which consists of real Upwork tasks (but still not typical employment positions). Current GDPval SOTA is over 80%. Current RemoteLaborIndex SOTA is less than 5%.

0

1

0

27

LDJ

@ldjconfirmed

about 1 month ago

@daniel_mac8 @deredleritt3r In Microsoft's October 2025 blog post about their latest partnership terms with OpenAI: "Once AGI is declared by OpenAI, that declaration will now be verified by an independent expert panel."

0

2

0

21

LDJ

@ldjconfirmed

about 1 month ago

@deredleritt3r @daniel_mac8 "Economically valuable work" is further defined by people internally at OpenAI as the jobs tracked by the US Bureau of Labor Statistics. So I suppose it's a majority of those jobs that they mean.

1

2

0

53

LDJ

@ldjconfirmed

about 1 month ago

@deredleritt3r In Feb 2026, Sam Altman said at a Stanford hackathon: "If you are a sophomore now, you will graduate into a world with AGI in it". Sophomores in Feb 2026 are set to graduate around mid-2028. I believe this is the first and only time he's stated such a near-term AGI prediction.

1

5

0

1

510

LDJ

@ldjconfirmed

about 2 months ago

@ChaosEmergent @haider1 GPT-4.5 started training ~May 2024, almost exactly 2 years ago now. (Based on official OpenAI statements at the time that mentioned starting training on their next-generation model, along with corroboration from The Wall Street Journal and others.)

0

1

0

155

LDJ

@ldjconfirmed

3 months ago

@zephyr_z9 That quote is not true to what he said. His statement was the direct opposite of what you put in quotation marks. Here is his actual quote on that topic: "Even by 2028, I don’t expect that we’ll get systems as smart as people in all ways"

0

4

0

183

LDJ

@ldjconfirmed

3 months ago

@juristr L9: You have the AI itself write the optimal coordination layer on the fly for spawning, routing, and managing agents programmatically, in whatever way works best for a given project and your preferences.

1

0

106

LDJ

@ldjconfirmed

3 months ago

@otium33 @BasedBiohacker @bryan_johnson He has already publicly talked about results of his personal peptide experimentation prior to doing his shroom experiments.

0

72

LDJ

@ldjconfirmed

3 months ago

@haider1 It's been confirmed that some devs outside of OpenAI had early access to GPT-5.4 for at least "a few weeks" prior to public release. Exhibit A:

Pietro Schirano

@skirano

3 months ago

This model is absolutely insane. I’ve been using it for a few weeks, and it’s the first model that made the impossible feel possible for me. The pro version in particular is capable of solving even the hardest problems.

34

470

25

67

56K

1

13

0

1

6K

LDJ

@ldjconfirmed

3 months ago

@thebasedcapital I think it did well. It lasted nearly 3 full years.

0

5

0

1K

LDJ

@ldjconfirmed

3 months ago

In November 2023, Yann LeCun, Thomas Wolf and others from Meta and Hugging Face created a benchmark called GAIA, which described itself as: "A benchmark for General AI Assistants that, if solved, would represent a milestone in AI research." Most of the problem solutions were kept private, not released online.

It proposed 466 "real-world questions that require a set of fundamental abilities such as reasoning, multi-modality handling, web browsing, and generally tool-use proficiency."

On the hardest level, the average human score was 87%, while the leading systems scored less than 3%. Ten months later, OpenAI released o1-preview, which reached ~30% on that level.

Now, in 2026, the human baseline for the hardest level has officially been surpassed: the best agent systems are scoring 88.9% on GAIA's hardest level (level 3).

25

789

57

240

79K

LDJ

@ldjconfirmed

3 months ago

@ThomasScialom Unfortunately I can’t find any human baselines for GAIA 2.

1

0

1K

LDJ

@ldjconfirmed

3 months ago

@xundecidability @WaveTheoryAI The difference here is that GAIA consists of real-world questions involving highly specific information that exists across human civilization in a diverse set of modalities, not abstract puzzles.

1

0

49

LDJ

@ldjconfirmed

3 months ago

@nithin_k_anil The human baseline score was also matched/surpassed by GPT-5 and Gemini-3-Pro working together without any specialized orchestrator in the loop; that pairing scored only ~2% below the top score by Nvidia. I imagine Opus 4.6, GPT-5.4 and Gemini-3.1 together would get an even better score.

0

1

0

194

LDJ

@ldjconfirmed

3 months ago

The current highest level 3 score was achieved by Nvidia, leveraging a multi-agent system that includes Nvidia's own tool orchestrator model. It scores 89.8% on Lvl 3 (even higher than the 88.9% typo I wrote above). The public leaderboard can be seen here: https://t.co/nv4KWfiFYa