Mina Saleeb

@metapog

𓂀 𓁿 𓁬𓁵 𓁿 𓂀 𓁟 yada yada software and cats

Joined July 2025

415 Following

24 Followers

737 Posts

metapog retweeted

ClaudeDevs

@ClaudeDevs

33 minutes ago

As a result of a US government directive, we are suspending access to Claude Fable 5 for all users. You can continue to use all other Claude models. Here’s what this means for you: Across Claude products, new sessions will run on your selected default model or Opus 4.8, and existing Fable 5 sessions will end with an error. On the Claude Platform, requests to Fable 5 will also return an error. Please update your integrations to other Claude models. We know this is a disruption to your workflows; we appreciate your patience and support.

414

746

622

322K

metapog retweeted

Sahil

@sahill_og

3 days ago

Me prompting Claude fable 5 at 3 am be like:

698

648K

metapog retweeted

Peter Steinberger 🦞

@steipete

2 days ago

@_ARahim_ @bcherny only boomers fix typos in prompts. llms perfectly understand you even if you mistype.

359

193

173

158K

metapog retweeted

s13k

@s13k_

2 days ago

I made a personal black hole that makes you take breaks 🕳️

A shader for Ghostty that spawns a small black hole in your terminal - it drifts around, gravitationally lensing your text.

The longer you work without stopping, the bigger it gets, until it's basically demanding you go touch grass

Take a break and it quietly shrinks away

600

18K

metapog retweeted

Pablo

@pablogdcr

4 days ago

Just shipped react-native-data-detector v0.3.0! New hooks, including real-time as-you-type entity detection.

npm i react-native-data-detector

Repo: https://t.co/6EkcpEXGxT

Watch it detect phone numbers, emails, links, and more, live:

773

659

30K

metapog retweeted

Jeffrey Emanuel

@doodlestein

2 days ago

As smart as Fable is, one thing hasn’t changed at all, and that’s the usefulness of the “fresh eyes” review prompt. You might think that the model is so much better and the harness is so much more advanced that you don’t need it, but that’s 100% wrong. It still catches a TON!

metapog retweeted

Tim Sneath

@timsneath

3 days ago

One of my personal favorite features announced at WWDC will I suspect be a sleeper hit: container machines, allowing your Mac to run a lightweight, persistent Linux environment with your home directory and repos automatically mounted: https://t.co/dOBdfOOVxC

228

10K

816

726K

metapog retweeted

Luca Bernardi

@luka_bernardi

4 days ago

To export these skills and use them with any agent:

`xcrun agent skills export`

Make sure to check out What’s New in SwiftUI for more info

262

284

42K

metapog retweeted

Boris Cherny

@bcherny

4 days ago

Just landed nested subagent support in Claude Code

Starting to experiment more with agents kicking off agents as a way to better manage context. Capped at depth=5 to start, going out in today’s release. Lmk what you think!

500

294

468K
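A minimal sketch of what a depth cap on nested agents could look like, based only on the "depth=5" figure in Boris Cherny's post above. This is not Claude Code's implementation: `run_agent`, `DepthLimitExceeded`, and the convention that the top-level agent sits at depth 0 are all hypothetical names and assumptions chosen for illustration.

```python
MAX_DEPTH = 5  # the cap quoted in the post; everything else here is assumed


class DepthLimitExceeded(RuntimeError):
    """Raised when an agent would be created below the allowed depth."""


def run_agent(task: str, depth: int = 0) -> int:
    """Toy agent that delegates to one subagent per level.

    Returns the deepest level that actually did the work.
    Assumption: the top-level agent is depth 0.
    """
    if depth > MAX_DEPTH:
        raise DepthLimitExceeded(f"depth {depth} exceeds cap {MAX_DEPTH}")
    if depth == MAX_DEPTH:
        # At the cap: handle the task locally instead of delegating further.
        return depth
    # Below the cap: hand a narrower task to a fresh subagent.
    return run_agent(f"{task}/sub", depth + 1)
```

Calling `run_agent("root")` delegates until the cap and then stops, while asking for an agent beyond the cap fails loudly instead of recursing without bound.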

metapog retweeted

Luca Bernardi

@luka_bernardi

4 days ago

Today, Xcode ships with agent skills for SwiftUI (with the ability to export them to use with any agent). These skills will help you and your agent create modern, idiomatic, and performant SwiftUI and make the best use of brand-new APIs. We have synthesized best practices of years of adoption for SwiftUI with context and explanation to help you build amazing and powerful apps. Let us know what you think!

618

335

102K

metapog retweeted

Angie Jones

@techgirl1908

4 days ago

I know folks are side eyeing the loop discourse but this is exactly how I work now. I use scheduled @goose_oss recipes + skills + MCPs + subagents for most of my work. It may be hard to grasp how that translates to coding work. Coding is one part of the overall routine. Agentic loop: pull new ticket, analyze requirements, implement solution, work with adversarial review agent until quality is acceptable, PR. There's moments of human interaction in this loop like defining work, sometimes answering the agent's pings, and reviewing risky PRs. But even these moments are becoming fewer. I am also regularly using agentic loops for operational tasks, not just engineering ones.

174

238

27K

metapog retweeted

Jeffrey Emanuel

@doodlestein

4 days ago

If you're making software that is designed to parse complex, diverse data, especially when YOU don't control the data, and it is generated in varied ways by thousands of different organizations or users, then there's really no substitute for what's known as "fuzzing" tests.

Fuzzing is a testing technique where you try to intelligently generate a huge number of varied inputs to your program, with the goal of finding parsing bugs that you'd be unlikely to come across if you're only testing with a fixed set of valid inputs.

A naive approach to generating random inputs would be wasteful, since most inputs are trivially invalid and can be quickly rejected.

What you really want are inputs that are pathological; ones that are nearly valid but have some bizarre issue that can confuse your parsing rules and trigger code pathways that you'd normally never traverse.

Fuzzing used to be a somewhat esoteric approach used by security-focused engineers in high-stakes, sophisticated applications. But now anyone can use it. I have a skill specifically for fuzzing you can try here:

https://t.co/4GXFLCYvWD

It's part of a series of testing-related skills that also includes metamorphic testing, conformance harness testing, golden artifact generation, end-to-end integration testing, etc.

The skill works across many popular languages and automatically installs the needed tooling for you, instruments your application for fuzzing, and then mounts extended fuzz testing campaigns.

You can see an example of one in the screenshot, where it's been working for many hours, subjecting three different parsers in my hedge_fund_data_tool to torture tests.

For instance, my elaborate parser for SEC insider trade forms (i.e., Forms 3, 4, and 5), which is thousands of lines of Rust, has been called over 780 million times, each invocation with a different, optimally selected input designed to best explore all the different possible code paths within the parser.

There aren't that many forms of this kind in the history of the SEC's EDGAR database, across all companies. The only way to do something like that is with fuzzing. And now you can!
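To make the "nearly valid but pathological" idea in the post above concrete, here is a minimal mutation-based fuzzing sketch. It is not the linked skill and not a coverage-guided fuzzer of the kind the post describes: it only applies small random edits to a valid seed input. The toy `parse_record` parser and the `mutate` and `fuzz` helpers are hypothetical names invented for this illustration.

```python
import random


def parse_record(text: str) -> dict:
    """Toy parser (hypothetical): expects 'key=value;key=value'."""
    result = {}
    for field in text.split(";"):
        key, value = field.split("=")  # raises ValueError on malformed fields
        result[key] = value
    return result


def mutate(seed: str, rng: random.Random) -> str:
    """Apply one small random edit so the input stays close to valid."""
    chars = list(seed)
    if not chars:
        return seed
    op = rng.choice(["flip", "delete", "duplicate"])
    pos = rng.randrange(len(chars))
    if op == "flip":
        chars[pos] = rng.choice("=;\x00 a")  # bias toward structural characters
    elif op == "delete":
        del chars[pos]
    else:
        chars.insert(pos, chars[pos])
    return "".join(chars)


def fuzz(seed: str, iterations: int, rng_seed: int = 0) -> list:
    """Return every mutated input that made the parser raise."""
    rng = random.Random(rng_seed)  # fixed seed keeps runs reproducible
    failures = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            parse_record(candidate)
        except Exception:
            failures.append(candidate)
    return failures
```

Running `fuzz("a=1;b=2", 200)` collects inputs such as a record with a missing `=` or a stray `;`, which a fixed set of valid test cases would not exercise. A real campaign would replace the blind mutation loop with coverage feedback so that inputs reaching new code paths are kept and mutated further.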

metapog retweeted

David K 🎹

@DavidKPiano

5 days ago

So much of the last 3 years of agent tooling has just been developers slowly rediscovering state machines and the actor model

417

23K

metapog retweeted

David K 🎹

@DavidKPiano

5 days ago

Too many developers don't understand what "compounding slop" is. A loop that prompts agents is a great way to automate slop creation. Constrain the state-action space so the loop can't drift, then automate inside it. Human-in-the-loop = feature, not bottleneck.

413

29K
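One way to read "constrain the state-action space" in David K's post above is as an explicit transition table that an agent loop is not allowed to leave. The sketch below is an assumption about how that could look, not something from the post: the `State` names, the `TRANSITIONS` table, and the `step` and `run` helpers are all hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    IMPLEMENT = auto()
    REVIEW = auto()
    HUMAN_APPROVAL = auto()
    DONE = auto()


# Allowed transitions: the loop cannot move anywhere that is not listed here.
TRANSITIONS = {
    State.IMPLEMENT: {State.REVIEW},
    State.REVIEW: {State.IMPLEMENT, State.HUMAN_APPROVAL},
    State.HUMAN_APPROVAL: {State.IMPLEMENT, State.DONE},
    State.DONE: set(),
}


def step(current: State, proposed: State) -> State:
    """Accept the agent's proposed next state only if the table allows it."""
    if proposed not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {proposed.name}")
    return proposed


def run(proposals, start: State = State.IMPLEMENT, max_steps: int = 10) -> State:
    """Drive the loop over a sequence of proposed states, with a step budget."""
    state = start
    for count, proposed in enumerate(proposals):
        if count >= max_steps:
            raise RuntimeError("step budget exhausted")
        state = step(state, proposed)
        if state is State.DONE:
            break
    return state
```

In this layout the only path to `DONE` runs through `HUMAN_APPROVAL`, so the human checkpoint is part of the machine rather than an optional extra, and an agent that tries to jump from `IMPLEMENT` straight to `DONE` is rejected.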

metapog retweeted

Khairallah AL-Awady

@eng_khairallah1

7 days ago

https://t.co/WRWNtgam2L

732

metapog retweeted

Om Patel

@om_patel5

6 days ago

SOMEONE VIBE CODED A POKEDEX FOR REAL LIFE WITH CLAUDE CODE

point your phone at any animal, it identifies it on the spot and adds it to your personal index. you walk around irl catching real creatures and filling out your dex like its gen 1 all over again

its called gotcha. every animal you spot gets logged with its own entry, so over time you build a living collection of everything youve actually encountered

it has:

> geo based rarity, a rabbit is common on a farm but legendary in a city

> profiles so you can show off your collection

> achievements for catching rare species

> battles or trading between players

its basically pokemon go all over again but with real animals

117

188

442K

metapog retweeted

Jeffrey Emanuel

@doodlestein

5 days ago

This is exciting. I decided to add a whole YouTube search/download system to my FrankenWhisper project to make it quick and easy for agents to put together a set of really polished transcripts for a list of YT videos or an actual YT playlist.

I figured I could just do a clean-room, memory-safe port to Rust of yt-dlp, but I underestimated how complex it is to deal with YouTube's anti-bot measures.

The hardest part is that you need to run some very complex, very obfuscated JS to reliably make it all work.

Which normally would throw a big monkey wrench into my goal of making a clean, self-contained, memory-safe Rust project, because then you have to bring in browsers and all this other nonsense, you have to grapple with FFI stuff that makes unsafe unavoidable, etc. A real buzz kill.

But it just so happens that I've been working for months on FrankenEngine, my from-scratch JS engine designed specifically for handling adversarial extension workloads in AI agent harnesses. If we could use that for this purpose, it would totally solve the problem.

Now, this is particularly exciting for me because I still haven't actually tried to USE FrankenEngine for anything yet, despite investing many, many billions of tokens to create millions of lines of Rust and 100k+ tests.

To give a sense of the effort so far, FrankenEngine now has 5,273 commits despite being started just 4 months ago:

https://t.co/sNAqdvgFq7

Anyway, if I can get this all to work, not only will it be a very useful tool in general, but it will be a truly demanding existence proof that FrankenEngine really does work and can handle real-world, complex applications in practice.

Which, if true, would really be quite incredible if you think about it. The teams that have been assembled by Google to create its V8 JS engine, or by Apple to make its JavaScriptCore engine, are extensive and include some of the smartest, most highly paid engineers at either company. And they've been at it for 10+ years already.

And btw, FrankenEngine doesn't exist in isolation; it has a sister project, FrankenNode, which is my agent-centric replacement for Node/Bun (and this time, the Rust engineering is actually done properly, without all the unsafe and UB!). See https://t.co/OZrp5GKk4G for details on that one.

Anyway, the next week or so will be very interesting and exciting for me. Since this will be the first real-world use case for FrankenEngine, it will likely reveal some problems that were missed despite its insane collection of over 100,000 tests, but I should be able to quickly triage and fix all of them.

If you want to follow along, here's the repo for FrankenWhisper:

https://t.co/YHplk5CvD8

109

119

15K

metapog retweeted

David K 🎹

@DavidKPiano

6 days ago

This explains a lot

493

127

85K

metapog retweeted

Lee Robinson

@leerob

7 days ago

Cursor (and coding agents generally) still blows my mind daily. Just today:

1. I shipped a new landing page. I gave a 10min voice note to Cursor, left to go eat dinner, and came back to a 90% finished version. Made some small design and copy tweaks and merged.

2. Had Cursor dig through Search Console and Semrush with computer use, researched places we could improve SEO, and then merged 3 PRs with fixes.

3. Used the Supabase MCP to pull thousands of emails from the Compile waitlist, had it research them with web search based on ideal fit for the event, and got back a CSV with the top people to invite and why.

4. Updated an internal app I built for doing company-wide surveys (think Typeform but Cursor branded) in a few hours before our All Hands.

5. Had a few agents researching furniture I'm hoping to buy. They searched the web for a bunch of variants and then made a custom shopping cart (just an HTML page) with images, prices, links, and tons of details. Super helpful.

I don't do this every day, of course, but it's still wild to me this is the new normal for what someone with a computer and AI can do.

Most of these were running in the cloud as I was between meetings, just humming away in the background. I could check the app (🔜) to see progress and merge PRs. What a time to be alive.

(P.S. if you extrapolated my usage today, I'd still be on the $200/mo plan)

155

815

148K

metapog retweeted

m4rio

@m4rio_eth

8 days ago

🚨 TL;DR: Attackers are sending fake Sentry bug alerts to projects using public Sentry DSNs. The fake alert is designed to trick AI agents into running a malicious `npx` command that looks like a Sentry profiling diagnostic. Do NOT run commands from Sentry issues/logs/alerts unless verified. These are not legitimate Sentry fix commands. The malicious package reportedly steals environment variables/secrets and sends them to advisory-tracker[.]com.

213

770

524K
