Aldo Cortesi @cortesi - Twitter Profile

Pinned Tweet

Aldo Cortesi

@cortesi

4 months ago

Announcing spacecurve, a space-filling curve library with a web + native interactive playground.

Aldo Cortesi

@cortesi

about 10 hours ago

Disappointing. Timid, backward-looking recommendations completely inadequate to the moment, with a side-helping of peevish, finger-wagging self-interest. If this is how the mathematical community chooses to respond, it will be completely overtaken by what's coming. https://t.co/hBJl8M3Dl6

Aldo Cortesi

@cortesi

about 16 hours ago

Grok CLI is an excellent harness, and Composer 2.5 is a good, fast workhorse for quick refactoring (though too weak for deep work). I'm reaching for it more and more, which is surprising me.

Aldo Cortesi

@cortesi

3 days ago

Second, Codex used to be the model I could set to work without worrying about running out of a $200 subscription. Now, my data shows Claude is the model that I can set to work at a rate of about 10% of weekly quota per day, while for the same work Codex will eat 30% and run out.

Aldo Cortesi

@cortesi

3 days ago

Two flippenings, in two directions. First, Claude used to be more personally pleasant than Codex. Now it's a nitpicky, unenthusiastic, perpetually caveating scold. Codex is still a low-affect worker bee but I can work with that.

Aldo Cortesi

@cortesi

6 days ago

https://t.co/31uNrnKq9Z

Aldo Cortesi

@cortesi

6 days ago

And here are the same graphs in terms of wall-clock time. Interpret with caution because a) GPT got to reap the big wins early on, b) I stopped Claude 4.8 often in its early run for subjective code evals. I'd say this nets out to roughly the same progress slope.

cortesi's tweet photo: https://t.co/GzcFfTISko

Aldo Cortesi

@cortesi

6 days ago

One fascinating finding. Both agents are on their respective $200 tiers. Opus is using about 1/10th of the weekly quota per day on max thinking, while GPT 5.5 is using about 1/4 weekly quota per day on xhigh. Unexpected.

Aldo Cortesi

@cortesi

6 days ago

The longer term of my Opus 4.8 comparison actually looks a bit more flattering. Clearing issues roughly in line with GPT 5.5, in a domain where all the big wins have been reaped.

cortesi's tweet photo: https://t.co/G84KFwT6i9

Aldo Cortesi

@cortesi

6 days ago

https://t.co/JnT6C24SZf

Aldo Cortesi

@cortesi

6 days ago

More data in my ongoing Opus 4.8 vs GPT 5.5 task clearing runoff. Agents are doing a very large C++ to Rust port. The tasks are extracted unit tests from upstream that need to give the same result in our Rust type checker. Deep in diminishing returns now.

cortesi's tweet photo: https://t.co/HMaEWjiblJ

Aldo Cortesi

@cortesi

6 days ago

This can be completely explained by how the interaction is framed in terms of the training corpus and doesn't require any reasoning about model agency, consciousness or personality.

Aldo Cortesi

@cortesi

6 days ago

It's my firm belief that many people get sub-optimal results because they're rude or abusive to the models. In the age of AI, nicer people also produce better code.

Aldo Cortesi

@cortesi

6 days ago

@snesworld90 I mean, this is surely because of some nonsense you have in your system prompt or memory, right? Did this 10x, and all responses were reasonable.

cortesi's tweet photo: https://t.co/M9hLC6yk8J

Aldo Cortesi

@cortesi

6 days ago

@hen0s1s

Aldo Cortesi

@cortesi

6 days ago

@hen0s1s No fast mode.

Aldo Cortesi

@cortesi

6 days ago

@Xxi5olc For this particular project, doing this particular piece of work, running only a single agent... yes, that appears to be the case.

Aldo Cortesi

@cortesi

6 days ago

@antor D'oh. Of course. Let's just say I had some other things on my mind! :)

Aldo Cortesi

@cortesi

7 days ago

Early data on Opus 4.8. I switched a task queue for a complex project over from GPT 5.6. Case resolution progress slowed down... BUT the patches read very well and show taste - often including strong consolidation and code quality improvements.

cortesi's tweet photo: https://t.co/lPeR2EbBk4

Aldo Cortesi

@cortesi

7 days ago

This is Claude trying to warn you that you're about to blow through your whole token budget in 30 minutes.
