apple is wasting so many developers’ entire careers on rendering beautiful glassmorphic squircles and not enough on rethinking all of human-computer interaction in the context of AI
Friend who is a doctor told me everyone in his hospital uses ChatGPT now.
Me: “Do you all use o3?”
Him: “No, 4o. Isn’t it best to use the latest model? 4 vs 3?”
@OpenAI we really gotta fix these model names 🤦‍♂️
OpenAI deleted my account without explanation. The email they sent said "OpenAI's Usage Policies restrict the use of our services in a number of areas. We have identified ongoing activity in your account that is not permitted under our policies."
I only used ChatGPT for coding, and lately not so much, because the o3 model was unavailable most of the time.
The only reason I can see is my online posts about Altman's nose. It's the lowest thing I've ever witnessed from a tech company.
Introducing container use for agents.
Go from babysitting one agent at a time to enabling many agents to work safely and independently with your preferred stack.
https://t.co/xvUFqCrStM
Our interpretability team recently released research that traced the thoughts of a large language model.
Now we’re open-sourcing the method. Researchers can generate “attribution graphs” like those in our study, and explore them interactively.
🤯 We cracked RLVR with... Random Rewards?!
Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵
Blogpost: https://t.co/jBPlm7cyhr
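To make the three reward types in the list above concrete, here is a minimal sketch in Python. The function names, the 0/1 reward scale, the probability `p`, and the reading of "incorrect rewards" as simply rewarding non-matching answers are all assumptions made for illustration; they are not taken from the linked blog post and may differ from what the authors actually did.

```python
import random

def ground_truth_reward(answer: str, reference: str) -> float:
    """Reward 1.0 only when the model's answer matches the reference."""
    return 1.0 if answer == reference else 0.0

def incorrect_reward(answer: str, reference: str) -> float:
    """Assumed reading of 'incorrect rewards': reward answers that do NOT match."""
    return 1.0 if answer != reference else 0.0

def random_reward(answer: str, reference: str, p: float = 0.5, rng=random) -> float:
    """Reward 1.0 with probability p, ignoring the answer entirely."""
    return 1.0 if rng.random() < p else 0.0

if __name__ == "__main__":
    print(ground_truth_reward("42", "42"))  # 1.0
    print(incorrect_reward("42", "42"))     # 0.0
    print(random_reward("42", "42", rng=random.Random(0)))  # 0.0 or 1.0
```

The point of the sketch is only that the random variant never looks at `answer`, which is what makes the reported gain surprising.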
man-made horrors beyond your comprehension lie within the healthcare software world
1-3 letter function names
no comments
no types, implicit str <> int <> float conversions
no operator precedence
callee can mutate caller's local variables
persistent global variables encouraged
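For anyone who hasn't met "no operator precedence" before, a minimal sketch in Python of what it can look like. It assumes the language evaluates binary operators strictly left to right; the post doesn't name the language or its exact rule, so treat `eval_left_to_right` as a hypothetical illustration, not a model of any specific system.

```python
import operator

# Binary operators supported by this toy evaluator.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_left_to_right(tokens):
    """Evaluate [num, op, num, op, num, ...] strictly left to right."""
    result = tokens[0]
    # Walk the (operator, operand) pairs in order, ignoring precedence.
    for i in range(1, len(tokens), 2):
        op, operand = tokens[i], tokens[i + 1]
        result = OPS[op](result, operand)
    return result

if __name__ == "__main__":
    print(eval_left_to_right([2, "+", 3, "*", 4]))  # 20, i.e. (2 + 3) * 4
    print(2 + 3 * 4)                                # 14 with Python's usual precedence
```

Same expression, two answers, which is why code ported in or out of such a language needs explicit parentheses everywhere.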
Has anyone written up a deep dive into any of the common LLM benchmark suites, like SWE-bench Lite?
I'd love to read a post that explains one of the popular benchmarks in some detail, shows some example questions from it, explains how to run it etc
Here's a plug-and-play AI Agent that automates the creation of UGC videos for your brand using Gumloop and Arcads.
– Starts from competitor product research
– Analyzes video ads w/ Gemini 2.5 Flash
– Crafts ad scripts tailored to your brand
– Generates quality AI videos on demand
– Stores & refines successful strategies for future campaigns
Follow + Repost + Reply “UGC Agent” and I’ll DM you the full workflow (100% free, no email opt-in).
watching the lectures of mit’s missing semester is one of the best pieces of advice a senior once gave me.
just watch 2 videos/weekend, and trust me, you’ll know really cool stuff by the end of your summer break :)
🔗 here’s the link: https://t.co/6Uu4tam2Gp