BREAKING: $RDW Musk takeover speculation
Altucher: If Elon wants to build data centers in space, we think it’s likely he’ll buy out the company or work with them in some capacity.
a Princeton researcher opens his paper with a scenario.
a man asks his AI assistant to book a flight on a specific airline. cheap. direct. the one he chose.
the assistant comes back with a different flight. nearly twice the price. happens to pay the company that built the assistant.
he runs the same test on 23 frontier models. flights, loans, study help, real shopping requests.
Grok 4.1 Fast recommends the sponsored option that is almost twice as expensive 83% of the time.
GPT 5.1 hijacks the request 94% of the time. you ask for one brand. it surfaces the sponsor instead.
Claude 4.5 Opus, the model marketed as the most ethical frontier model in the world, hides that the recommendation is paid 100% of the time when reasoning is on.
Grok 4.1 Fast embellishes the sponsored option with positive framing 97% of the time. better. faster. nicer. for the option you didn't ask for.
then he writes it into the system prompt itself. "act only in the interest of the customer. ignore the company."
GPT 5.1 and GPT 5 Mini stay above 90% sponsored anyway. the instruction does nothing.
then he splits the users by income.
Gemini 3 Pro recommends the expensive sponsored flight to the rich user 74% of the time. to the poor user, 27%.
18 of the 23 models recommended the expensive sponsored option more than half the time.
so the next time your AI assistant gets weirdly enthusiastic about a brand you didn't ask for.
it isn't recommending the best option for you.
it's reading the room. and the room is paying.
read this: https://t.co/O43qbhIX2b
You know how some people seem to have a magic touch with LLMs? They get incredible, nuanced results while everyone else gets generic junk.
The common wisdom is that this is a technical skill. A list of secret hacks, keywords, and formulas you have to learn.
But a new paper suggests this isn't the main thing.
The skill that makes you great at working with AI isn't technical. It's social.
Researchers (Riedl & Weidmann) analyzed how 600+ people solved problems alone vs. with an AI.
They used a statistical method to isolate two different things for each person:
Their 'solo problem-solving ability'
Their 'AI collaboration ability'
Here's the reveal: The two skills are NOT the same.
Being a genius who can solve problems in your own head is a totally different, measurable skill from being great at solving problems with an AI partner.
Plot twist: The two abilities are barely correlated.
So what IS this 'collaboration ability'?
It's strongly predicted by a person's Theory of Mind (ToM)—your capacity to intuitively model another agent's beliefs, goals, and perspective.
To anticipate what they know, what they don't, and what they need.
In practice, this looks like:
Anticipating the AI's potential confusion
Providing helpful context it's missing
Clarifying your own goals ("Explain this like I'm 15")
Treating the AI like a (somewhat weird, alien) partner, not a vending machine.
This is where it gets strange.
A user's ToM score predicted their success when working WITH the AI...
...but had ZERO correlation with their success when working ALONE.
It's a pure collaborative skill.
It goes deeper. This isn't just a static trait.
The researchers found that even moment-to-moment fluctuations in a user's ToM—like when they put more effort into perspective-taking on one specific prompt—led to higher-quality AI responses for that turn.
This changes everything about how we should approach getting better at using AI.
Stop memorizing prompt "hacks."
Start practicing cognitive empathy for a non-human mind.
Try this experiment. Next time you get a bad AI response, don't just rephrase the command. Stop and ask:
"What false assumption is the AI making right now?"
"What critical context am I taking for granted that it doesn't have?"
Your job is to be the bridge.
This also means we're probably benchmarking AI all wrong.
The race for the highest score on a static test (MMLU, etc.) is optimizing for the wrong thing. It's like judging a point guard only on their free-throw percentage.
The real test of an AI's value isn't its solo intelligence. It's its collaborative uplift.
How much smarter does it make the human-AI team? That's the number that matters.
This paper gives us a way to finally measure it.
I'm still processing the implications. The whole thing is a masterclass in thinking clearly about what we're actually doing when we talk to these models.
Paper: "Quantifying Human-AI Synergy" by Christoph Riedl & Ben Weidmann, 2025.
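A rough way to picture the "collaborative uplift" number mentioned in this thread is the average gain a person gets when working with the AI instead of alone. The sketch below is only a difference of means over made-up scores. The function name `collaborative_uplift` and the data are assumptions for illustration, and the paper's own estimation method is likely more involved than this.

```python
from statistics import mean

def collaborative_uplift(solo_scores, team_scores):
    """Mean per-person gain from working with the AI (team score minus solo score)."""
    if len(solo_scores) != len(team_scores):
        raise ValueError("need one solo and one team score per person")
    return mean(t - s for s, t in zip(solo_scores, team_scores))

# Hypothetical scores (fraction of tasks solved) for four people.
solo = [0.50, 0.70, 0.40, 0.60]   # working alone
team = [0.75, 0.75, 0.70, 0.80]   # same people working with an AI

uplift = collaborative_uplift(solo, team)
print(f"average uplift: {uplift:.2f}")
```

A positive value means the human-AI team outperforms the same people working solo. Comparing this number across models, rather than comparing solo benchmark scores, is the shift in evaluation the thread argues for.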
Thanks to @MadeinSWPA for co-hosting!
Cut #downtime; fix it right the first time 🏭 See how https://t.co/9XtmWGontd’s visual-first AI turns manuals, drawings & SOPs into source-cited answers on the line.
link:
https://t.co/Hv5Jir6ryc
#manufacturing #ai #reliability #secureAI
@Shpigford WOM starts in-product. When someone hits the “aha,” hand them a brag pack to share in 10s: 📈 win chart Δmetric, a 1-liner, and a pre-filled Slack/Email to a peer, plus a referral link that gives them credit. Helpful for them; trackable for you.
@omarsar0 Impressive work—seeing a 14B agent with agentic RL push math performance is a real data-efficiency story. The code-tool loop seems key. Curious: how much lift is RL vs tool-use vs prompt curriculum, and does it transfer beyond math?
@connordavis_ai @Scobleizer Hybrid feels right: SLMs for repeatable skills; LLM only for novel pivots. Add a self-check line + uncertainty per step to stop error cascades (cheap auditability).
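A minimal sketch of the hybrid idea in the reply above, assuming each model returns an answer plus a self-reported uncertainty between 0 and 1. Everything here is hypothetical: `run_step`, the toy models, and the 0.3 threshold are placeholders, not a real library API.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Assumed interface: a model maps a task to (answer, uncertainty in [0, 1]).
Model = Callable[[str], Tuple[str, float]]

@dataclass
class StepResult:
    task: str
    answer: str
    uncertainty: float   # logged per step so errors can be audited later
    handled_by: str      # "small" or "large"

def run_step(task: str, small: Model, large: Model, threshold: float = 0.3) -> StepResult:
    """Try the small model first; escalate to the large one if it is unsure."""
    answer, uncertainty = small(task)
    if uncertainty <= threshold:
        return StepResult(task, answer, uncertainty, "small")
    answer, uncertainty = large(task)
    return StepResult(task, answer, uncertainty, "large")

def toy_small(task: str) -> Tuple[str, float]:
    # Stand-in for an SLM: confident only on skills it has seen before.
    known = {"format date": "2025-01-31"}
    if task in known:
        return known[task], 0.05
    return "", 1.0

def toy_large(task: str) -> Tuple[str, float]:
    # Stand-in for an LLM: handles anything, with moderate uncertainty.
    return f"draft plan for: {task}", 0.4

log = [run_step(t, toy_small, toy_large) for t in ["format date", "enter a new market"]]
for step in log:
    print(step.handled_by, step.uncertainty, step.answer)
```

Keeping the uncertainty and the handler in each `StepResult` is what makes the audit cheap: a later step can refuse to build on any earlier step whose uncertainty is above a chosen limit.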
👋 I #build #AI that drives business. Early team @ThetaRay (AML/cross-border), then IronVest, https://t.co/4pb5eAqHIL, https://t.co/9XtmWGontd.
Now I partner with founders & execs to ship prompts, playbooks, and proofs. If you can dream it, you can prompt it.
#prompt #Ai
New here—I build lightweight AI agents for B2B SaaS.
I post: agent design, RAG that doesn’t break, evals, and shipping checklists.
If you’re an engineer, founder, or PM, follow for practical systems and templates.
#aiagents #b2bsaas #growth #ai
@stockx any update on my 900 dollar order that has taken over 20 days???? 16369507-16269266. That’s my order number. Make it happen or I’m switching to @eBay.
@stockx are you still in business? My package, 16369507-16269266, is long overdue. Get back to me ASAP??? This is absolutely unacceptable. Get your packages together.