Jordan

@snibbmaster

The champagne of friends.

Joined May 2017

7.1K Following

582 Followers

7.1K Posts

snibbmaster retweeted

Plastic Soldier

@PlastiqSoldier

about 23 hours ago

@ben_golub No, that doesn't work, because there would still be alien AIs.

snibbmaster retweeted

Mark Kretschmann

@mark_k

about 16 hours ago

OpenAI is reportedly considering drastic token price cuts to pull customers away from Anthropic, per WSJ. This follows rising complaints from enterprise customers about AI costs, while Anthropic has been gaining traction with Claude Code. The message is clear: @OpenAI does not want Anthropic to own the developer market. A real AI price war may be starting.

661

55K

snibbmaster retweeted

H. B.

@theaichbee

2 days ago

@Miles_Brundage I used Fable all day today. The hype takes were right. It's great. An improvement. But normal, incremental improvement, not silicon god improvement. Still had to fix lots of its mistakes.

snibbmaster retweeted

Conor

@jconorgrogan

1 day ago

TLDR:
1. Declare AI too dangerous for ordinary competition so you propose a regulatory regime where only the largest incumbents can survive
2. Warn about labor displacement while selling the product to executives as a labor-displacement tool
3. Warn about state overreach while asking the state to license and gatekeep frontier models
4. Warn about corporate power while sketching a corporate-state cartel over compute, release, security, export controls, and deployment

125

190

76K

snibbmaster retweeted

spor

@sporadica

2 days ago

okay, after a few hours of playing around with Fable i've determined it to be borderline unusable. wtf @AnthropicAI every other request is getting rerouted to Opus. requests with NOTHING AT ALL to do with bio or cybersec are STILL getting flagged and rerouted. like 50-60%. who the hell let this go live. this is atrocious. i understand the safeguards but you basically didn't even release a model, you released a braindead, nerfed model that can't even handle the simplest request without being told to nuke itself and use a different model. what are we even doing? just don't even release it next time smh

925

99K

snibbmaster retweeted

Andy Masley

@AndyMasley

2 days ago

Fable is AGI

107

207K

snibbmaster retweeted

zerohedge

@zerohedge

3 days ago

One week later: token prices are tumbling after industrywide blowback against surging token costs.

575

101

285K

snibbmaster retweeted

Arpit Gupta

@arpitrage

3 days ago

Things look good for AI as a “normal technology” so far. LLMs have rapidly impacted coding, but few other professions; and even in coding it’s hard to see value generation. Bottlenecks, regulation, other frictions keeping humans complementary and relevant

149

38K

snibbmaster retweeted

Gary Marcus

@GaryMarcus

3 days ago

I have said exactly this before and still believe it to be true

515

127

40K

snibbmaster retweeted

swyx

@swyx

3 days ago

It's finally out!!! @METR_Evals found that more than half of SWEBench results are unmergeable slop. FrontierCode represents 1000+ hours of maintainer-validated software engineering work that most frontier models cannot yet solve, much less solve with high quality.

Cog had IOI Gold medalists and top code maintainers Look At The Data: FrontierCode includes 3000+ rubrics covering code quality and anticheat reward hacking plaguing other benchmarks.

FC Diamond is so hard that Opus 4.8 scores 13.8%.

Three eras of AI coding : Three eras of benchmarks

2021 • Autocomplete : HumanEval
2023 • Passing Tests: SWEBench, TerminalBench
2026 • Maintainable Code: FrontierCode

To me the most beautiful chart came when I requested a special historical run on all extant old models: the data showed that the easiest third of FC tasks (in FC Extended) were rapidly and suddenly solved over late 2025. Opus almost doubled from a 41% pass rate to 74% in 4 months.

This describes the "WTF happened in Dec 2025" vibe shift that a lot of folks from @dhh to @karpathy have called out: it is the difference between getting 95% success in 2 rerolls vs 6, making it finally feasible to go up the next layer of abstraction in agentic coding, e.g. @GeoffreyHuntley's ralph loops or @bcherny's /goals or @steipete's "loops that prompt your agents", without fearing too much that things go off the rails.

My guess: as AI accelerates from here, each FrontierCode tier will saturate in sequence, hopefully ~annually. I've already asked the team to prepare FrontierCode 2027....

The old mountains will be destroyed. Their rubble becomes regolith. And from that regolith, the next model forest grows. Circle of life.

773

384

182K

snibbmaster retweeted

Dominick Romano

@dromanocpm

3 days ago

More people need to listen to what Gary is saying here. Many, MANY of these talking heads who suck the oxygen out of the room with just their ego say things that are so wildly wrong that it's misguiding an entire generation, and it's all for ego and greed. Literally, not a clue what they are talking about... Yet the next idiot calls them an AI Expert 🤦‍♂️

snibbmaster retweeted

Neeraja Deshpande

@neerajadeshp

3 days ago

@TheStalwart I agree that phones are socially concerning—I just took a 2 hour walk without one and felt great!—but “it’s the phones” is intellectually lazy, and I don’t buy this study at all, even as I agree that getting off our phones is important. https://t.co/A30967zkNz

snibbmaster retweeted

Alexander Panetta

@Alex_Panetta

4 days ago

Striking paper from Wharton. The big conclusion: AI must increase productivity 2.7x -- and quickly -- or tech companies risk bankruptcy with all that entails for the economy. For context: this is how a quickie 2.7x productivity boom would compare to historical precedent. Paper linked in my daily AI digest. Useful context for OpenAI reportedly talking to the US government about a bailout (ahem, I mean ownership stake).

213

718

182K

snibbmaster retweeted

Shaiel Ben-Ephraim

@academic_la

4 days ago

I sacrificed friends, family, community and quite possibly the ability to return to my home in order to defend Palestinians against Israel. Spent thousands of dollars of my money and raised much more for Gaza. And I have found myself so viciously attacked by the people who are supposed to be pro-Palestine. You are definitely setting an example that no one should ever cross over or they will pay an awful price and be completely isolated. I do not regret the choice I made. It was the right one. But it has been costly.

846

183

195

256K

snibbmaster retweeted

Karri Saarinen

@karrisaarinen

4 days ago

The thing about people talking about building with AI is that they always talk about how they’re building, what tools they use, and how much they use them. Much less is said about what they actually built, or what impact it had. The tool becomes the job. And the purpose.

140

158

89K

snibbmaster retweeted

Noah Smith 🐇🇺🇸🇺🇦🇹🇼

@Noahpinion

4 days ago

Between the "machines of loving grace" scenario and the "humans become horses" scenario there's a scenario where everyone uses a lot of AI, nothing changes, and the world continues to limp along and argue about stupid TikTok videos til the population collapses

363

46K

snibbmaster retweeted

Aaron Levie

@levie

6 days ago

Coding is basically the pinnacle of what you could reasonably automate with AI, and yet we still need human engineers to oversee agents for them to be effective. The AI models are trained on an incredible amount of sophisticated code. The users are highly technical and can use the latest tools quickly. The work is “verifiable” because you can test an app. The outcomes are often removed from the quality of the code (you can have sloppy code but the app can still work). And the context for the agent is often already digitized and sitting in the codebase. That’s an incredible amount of benefits that AI coding agents get to work with. Some of those apply to knowledge work, but most don’t in areas where the work needs to be fully reviewed to be useful, or where data isn’t as abundantly digitized. This makes the job for agents in knowledge work more complicated. So if with all of that, engineers still remain in very high demand, the risks are going to be less than what’s perceived for other areas of knowledge work. Agents will let people do far more than they did before, but the people don’t go away.

568

251

144K

snibbmaster retweeted

Cat McGee

@catmcgee

6 days ago

I feel like Anthropic is on the verge of losing a lot of trust. Too much marketing disguised as AI safety.

468

30K

snibbmaster retweeted

Hunter Biden

@HunterBiden

6 days ago

@Samwise_Ganji Fiat is a sham, the banking class is corrupt, decentralized digital currency and the blockchain are the inevitable future, and the incumbents will fight it to the death.

291

277

255

699K

snibbmaster retweeted

Joe Weisenthal

@TheStalwart

6 days ago

I recently tried to use Codex to go through a modestly sized (online) database of photos (from our recent live show) and identify ones worth posting. Not only did it require a lot of hand-holding, but what it was able to do still required lots of trial and error.

128

36K
