@thsottiaux Any plans for better handling of background tasks? Having codex launch and manage experiments is great but they get buried in walls of text, I can't attach etc
@petergostev Ok, that's an interesting data point for current models. But still, I worry that a human given the same task might disappoint because they simply stop earlier than you wanted them to. Shouldn't we control for budget in some way (could be time, tokens, or $)?
There's a lot of excitement about AI for scientific discovery. But can AI really be creative enough to do this autonomously? I put Demis Hassabis's definition to the test
Had a lot of fun with this side project to test whether AI could invent new board games as good as Go
@thsottiaux When codex runs a long-running terminal job, there's no way to attach to it, no way to see all terminals, and no way to pop them out
@trajektoriePL While this description definitely has appeal, it seems contrary to the view of universal explainers by @DavidDeutschOxf. Is there really just a bunch of different stuff or is there bright line universality?
Ran out of @codex credits for the week. Opened @antigravity to continue. 3.1 Pro credits gone after 20 minutes. Switch to Opus 4.6 and get "ran into an error" message every other message and then credits gone. Switch to 3 Flash and ... "ran into an error" @OfficialLoganK
@romainhuet great work on codex. It won back my subscription.
Feature Request: continue local sessions from my phone. I much prefer local dev, but now I have a constant feeling of time wasting when out and about and can't tell an agent to keep working.
Lots of work went into this one co-led with John M. O'Toole. Thanks to our co-authors Geraldine Boylan, Sean Griffin, Aurel Luca, Sean Mathieson, and Soraia Ventura.
Blog: https://t.co/52kOXfhIjJ
Paper: https://t.co/0PJkGPitKq
Supported by Enterprise Ireland DTIF grant
Our seizure detection paper is finally out! First of many for @CergenX with our friends in INFANT.
Published in npj Digital Medicine.
This represents a significant milestone for the field, with the first-ever demonstration of human expert-level performance on held-out datasets.