Centipede5

Verified account

@Centipede5dev

Rutgers CS - web game dev of 5M+ Players - Currently Agent Puppeteer

New Jersey

Joined April 2018

174 Following

334 Followers

130 Posts

Pinned Tweet

2 months ago

I took @karpathy’s autoresearch loop and applied it to game development, here's what it built last night: Agents read player data → plan improvements → spin up new git branches in a game evolution tree → ship playable HTML5 variants → repeat forever. The system optimizes toward the game variant most likely to be chosen by players. Live leaderboard + playable games at https://t.co/S1qoFgEAVJ MIT open source: https://t.co/agVgN9zNgy I feel like we're just beginning to unlock the potential of this beyond ML, soon we'll have self-improvement loops in every product. Very excited to see what comes next.

1

2

1

0

661

about 13 hours ago

@systematicls Wait until you hear about EV depending on the utility function

0

0

0

0

636

24 days ago

@raw_works How much of the improvement for the explicit problems actually comes from code execution as a reasoning substrate vs the recursive subagents doing an Atom-of-Thought-like decomposition? How many of the traces actually contain code doing heavy lifting beyond just calling agents?

0

1

0

0

213

27 days ago

@simpsoka Constant cybersecurity warnings suddenly for frontend work / data science.

Photo: https://t.co/l2GDYLrRKf

0

1

0

0

23

27 days ago

Also somehow it completely stopped late at night when usage probably drops...

0

1

0

0

57

27 days ago

The whole "flagged for possible cybersecurity risk" thing is bullshit, right? They just don't have enough compute to serve 5.5 for the exploding number of Codex users. I've been getting this every few hours for the most benign SWE work ever.

Photo: https://t.co/pKeNVqgyyB

1

2

0

0

102

about 1 month ago

3D spatial reasoning is probably the weakest technical link. I've tried all the frontier models, but they almost always struggle to correctly scale/rotate assets in 3D. Not too big of a deal for me to do manually, but annoying that it breaks the loop for a non-taste reason. I have a few scripts that work ~80% of the time, but it's definitely not solved.

5

13

0

1

948

about 1 month ago

This is kind of a meaningless metric if you think about it. Adding a 1-digit number takes a person maybe a second, a 4-digit number 4 seconds, etc. You could make the exact same graph with the task of addition and show how there was an "intelligence explosion" in the 1940s. If you use AI regularly, you know that long-context tasks are not really the bottleneck anymore, outside of maybe frontier math. Jagged intelligence.

0

2

0

0

135

about 1 month ago

Autogamestudio has been absolutely insane after gpt-image-2 dropped, custom sprite animations are finally feasible

0

3

1

0

160

about 1 month ago

@Alibaba_Qwen @arena Embarrassing

0

0

0

0

106

about 2 months ago

@mdancho84 Why does this shit keep getting posted every single time a new model comes out???

0

34

0

0

592

about 2 months ago

Surrogate models are going to be the next big thing in LLM harness development

0

0

0

0

101

about 2 months ago

@thomasfbloom Human mathematicians won't scale with Moore's law though

0

1

0

0

236

about 2 months ago

Interesting read. I imagine true in-context learning will appear when the memory systems themselves are more integrated into training beyond just learning tool calls, maybe some kind of recurrent attention model. Bitter lesson will eventually come for all of the engineering hacks currently deployed.

0

0

0

0

589

about 2 months ago

@alexandr_wang Wow, new SOTA in bioweapons refusal! Never doubt meta superintelligence

0

1

0

0

3K

about 2 months ago

@joelniklaus Very interesting read! Have you experimented at all with prompt optimization techniques such as GEPA?

1

3

0

0

53

about 2 months ago

@cloneofsimo Thank you for this, the original 2d "donut" chart is extremely misleading

0

3

0

0

1K

2 months ago

Huge productivity hack for vibe-building: Have your agent build a simple streamlit / chartjs admin dashboard for whatever you’re working on. Visual debugging >> logs + manual testing. ~60% of the time it exposes a broken or weird architectural choice from a 5 second scan

1

0

0

0

250

3 months ago

Real ones will know we're on the good AI timeline

Photo: https://t.co/OPFlri5JQ7

0

0

0

0

162

3 months ago

@mirofish_ai is a stupid project created by people who have already forgotten the bitter lesson of ML. Zero chance it goes anywhere, just a really shitty untrained predictive model. Nonetheless there will be people hyping it for a while because it sounds like sci-fi...

0

0

0

0

32
