Sean 🔨 @darcys22 - Twitter Profile

1 day ago

Training job: "CUDA out of memory" Me: what did my code allocate?? My code: nothing. literally nothing. it crashed on line one. Turns out the GPU had 40GB held by ghost processes from jobs that finished hours ago and never let go.

0

1

0

11

Sean 🔨 @darcys22

10 days ago

So playing with Hermes Agent and I'm having a bunch of trouble asking it to set up cron jobs. Like it'll create the script in the wrong directory and the cron won't point at it. What's the default directory we should be putting cron scripts into?

0

1

0

24

Sean 🔨 @darcys22

11 days ago

@SolSt1ne Rule 2: slow down and pause is great for clients. But this 23-minute video could have been 10 if he didn't talk so slowly

1

11

0

3K

Sean 🔨 @darcys22

17 days ago

America hits token limits on all their agents, meanwhile Australia has a tiny population getting the off-peak GPU buffet. We are no longer "remote". We are computationally advantaged nomads

0

21

Sean 🔨 @darcys22

17 days ago

@shannholmberg How are you getting the agents to communicate to each other?

0

301

Sean 🔨 @darcys22

about 1 month ago

@_avichawla The smaller fine-tuned model isn't able to reach the same understanding as the stronger teacher. But it can with the weaker teacher

0

1

0

210

Sean 🔨 @darcys22

about 1 month ago

RAG systems fell away because agents were able to navigate a bunch of files and figure out the important information themselves. This is similar to how Karpathy was talking about self driving cars. Moving from C++ where the rules were enforced, to letting the AI make decisions

0

14

Sean 🔨 @darcys22

about 2 months ago

Models trained in clean environments learn brittle strategies, over-rely on structure and don't develop robustness. So they reward architectures that are fragile in the real world

0

1

0

16

Sean 🔨 @darcys22

about 2 months ago

@SMB_Attorney Put their new document into ChatGPT. Ask it to review and find any issues. Rinse and repeat

0

32

Sean 🔨 @darcys22

about 2 months ago

Reviewing an AI paper and it's like: "We compare against: 1) weak baselines, 2) small synthetic models, and 3) synthetic tasks. We added some inductive bias and you can see our HUGE GAINS." They are just patching weaknesses in underpowered setups instead of improving strong models

0

13

Sean 🔨 @darcys22

2 months ago

@yoheinakajima are you gunna pay the tax on the billion dollars in revenue in each company?

0

69

darcys22 retweeted

i14.ai

@i14labs

2 months ago

i14 Journal Club: Foundation Models Where Math Meets Cognitive Science i14 is starting a weekly online discussion group for AI researchers and engineers exploring the intersection of generative AI, mathematics, and cognitive science. We analyze how architectural design impacts learning, memory, and reasoning in foundation models. Join us to dissect training dynamics and explore how cognitive principles can inform the next generation of architectures, with our first session hosted via Google Meet on Monday, March 30 · 12:00 PM AEDT (Melbourne time), which is Sunday, March 29 · 6:00 PM PDT (San Francisco time) Apply to join HERE: https://t.co/fOU6DQWqKY

0

3

2

0

164

Sean 🔨 @darcys22

3 months ago

Maybe I’m reading too many posts on Reddit. But DLSS 5 sounds awesome. The game engine can do the “rough” sketch of what should be on the screen quickly, then let AI polish that into a super realistic frame. It's like the perfect pipeline for parallel processing

0

48

Sean 🔨 @darcys22

6 months ago

@levelsio Is this just because the scans are actually too low resolution to be useful for that task? Like the false positive rate is so high it’s only useful if you know something is already wrong. Would this be the same issue with a better scanner?

1

4

0

1

347

Sean 🔨 @darcys22

8 months ago

@redtachyon @tenobrus Similar story here lol

0

1

0

61

Sean 🔨 @darcys22

10 months ago

@dejavucoder Also apparently you can add "includeCoAuthoredBy": false, into .claude/settings.json

1

0

30

Sean 🔨 @darcys22

10 months ago

@dejavucoder Learn to accept your job isn't to write the code anymore. There is no shame in keeping your coauthor there.

1

2

0

138

Sean 🔨 @darcys22

10 months ago

@GrantSlatton Finance people also get annoyed at Excel when you have to rounddown() everything because 0 != 0

0

9

0

1K

Sean 🔨 @darcys22

11 months ago

@mitchellh Yeah nice!

0

282

Sean 🔨 @darcys22

11 months ago

@nielsandriesse https://t.co/G9NLcfRwmH Posted in Feb. Appears they need to learn how to ship

Sam Altman

@sama

over 1 year ago

OPENAI ROADMAP UPDATE FOR GPT-4.5 and GPT-5: We want to do a better job of sharing our intended roadmap, and a much better job simplifying our product offerings. We want AI to “just work” for you; we realize how complicated our model and product offerings have gotten. We hate the model picker as much as you do and want to return to magic unified intelligence. We will next ship GPT-4.5, the model we called Orion internally, as our last non-chain-of-thought model. After that, a top goal for us is to unify o-series models and GPT-series models by creating systems that can use all our tools, know when to think for a long time or not, and generally be useful for a very wide range of tasks. In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3. We will no longer ship o3 as a standalone model. The free tier of ChatGPT will get unlimited chat access to GPT-5 at the standard intelligence setting (!!), subject to abuse thresholds. Plus subscribers will be able to run GPT-5 at a higher level of intelligence, and Pro subscribers will be able to run GPT-5 at an even higher level of intelligence. These models will incorporate voice, canvas, search, deep research, and more.

4K

37K

4K

6K

7M

0

1

0

22
