enty @chronurgist - Twitter Profile

enty

@chronurgist

5 days ago

@chaotictransfem Huuuge congrats!!

1

2

0

25

enty

@chronurgist

6 days ago

authoring a benchmark and this is quite hard, because I can’t really predict in advance which task it’ll fail at. For example, the main task of adding a giant feature was completely one-shotted in under 50k tokens (not counting sub agents). But the smaller one took roughly 200k more before the agent gave up on it entirely. The fix was quite simple too, and I think it missed it because it didn’t read the files itself.

0

3

0

159

chronurgist retweeted

tender

@tenderizzation

6 days ago

claude code (scroll), codex (scroll)
git push (scroll), git pull (scroll)
compiling (scroll), testing (scroll)
failed (scroll), failed (scroll)
https://t.co/1RIXQbA7ze

2

26

3

1

2K

enty

@chronurgist

7 days ago

sucks they don't support sm75 so I gotta rent a small card but hey it's really fun still

0

1

0

20

enty

@chronurgist

7 days ago

cutedsl is so fun to write wtf I don't have to tile shit anymore and the indexing just magically falls out of shapes and strides, it makes me wanna actually write the kernel

1

0

1

43

enty

@chronurgist

8 days ago

learning QR decomposition for the GPU MODE competition but i notice it’s so much more difficult to learn in a TUI than just opening up a chat with an LLM and seeing the latex and everything else directly inline, GUIs > TUIs.

0

6

0

2

797

enty

@chronurgist

8 days ago

@oleksoleksoleks where bro???

0

433
