Felipe P

@TheSilvDev

Gauntlet AI Cohort 4 Champion. Current Superbuilder. Cosplayer. Loving all things AI, education, costume and prop making. Oh, and I fly planes.

Austin, TX

Joined February 2026

55 Following

12 Followers

28 Posts

Felipe P

@TheSilvDev

about 3 hours ago

@Austen If the stadiums don't make the spectacle, the fans will! We are strong in the teams we root for.

TheSilvDev retweeted

Dissident Teacher

@educatedandfree

1 day ago

No, America. Your best and brightest are no longer at Harvard, Yale, Stanford and the like.

Your best and brightest are kids like my tenth graders coming up through mission-aligned classical schools with teachers who know American kids in particular hunger for that which is True, Good, and Beautiful and are willing to GRIND for it, as Americans do.

The kids are here, in every town and city. We can all help build them.

This is what my tenth graders read this year:

The Symposium
The Apology
The Phaedo
The Death of Ivan Ilych
1984
Brave New World
The New Organon
The New Atlantis
Gulliver’s Travels
The Abolition of Man
Beowulf
The Canterbury Tales
Purgatorio
Inferno
King Lear
Pride and Prejudice
The Rime of the Ancient Mariner

@O_TooleKathleen @CLT_Exam @JeremyTate41 @soren_schwab @Jordan_C_Adams

891

143

327

54K

Felipe P

@TheSilvDev

about 4 hours ago

@educatedandfree This was roughly my 10th-grade reading list in high school as well, but we never 'finished' the list. It is unfortunate to see the curriculum bent rather than kids pushed to be better.

Felipe P

@TheSilvDev

4 days ago

Google's web browsing AI tells me it is similar to Gemini 1.5 Pro, which compares to Sonnet 3.5. Then it drops "I sit below Fable 5". Yeah... duh. Way below. And I sit below @karpathy. #technicallytrue

Felipe P

@TheSilvDev

4 days ago

Me, a Windows OS user, trying to work on a Mac for the last 2 months, running into bugs that slow down (or freeze) my system, or crash my Mac entirely (just happened), and being met with "Oh, good thing I never heard of bugs on Windows". It should. Not be. Breaking. On. Basics.

Felipe P

@TheSilvDev

5 days ago

This is awesome! But I just thought about all my skills and preferences dying. Fable 5 (seemingly) won't need the template, the layout, the small notes that lived through Opus 4.5-4.8, evolving with every release to handle the new quirks. I am sure skills matter, but how detailed?

Taelin

@VictorTaelin

6 days ago

this is my personal singularity moment

this post may sound like a paid ad. I only wish. I'm concerned, more so than happy. the world is changing, and, among the scenarios where AI goes terribly wrong, inequality is the most realistic, yet, the one Anthropic seems to be the least concerned about. I'm glad OpenAI is taking the opposite stance: *personal AGI for everyone*. I think this is a commendable position in the times we live. but who am I in the queue of the bread?

anyway, Fable is here, so I'll just report my first-hour experience

first of all, all my pet prompts are solved.
→ λ-calculus puzzles
→ bug questions
→ one-shot apps
all are trivial to it.

I don't have anything harder other than my ongoing work

so, in the last several days, I've been toying with HVM5, a new interaction net evaluator with a faster loop.

after writing the first version, I left 32 GPT-5 agents working for ~20 hours each. this resulted in up to 2x speedups, but the file size increased by 2-fold and quality decreased significantly.

I then simplified the whole thing into an even simpler core, and left Opus 4.8 and GPT 5.5 optimizing it for 8 hours. Opus got a legit 6% - 34% speedup in most benches. GPT got better results, but, sadly, an unusable file.

I then asked Fable to optimize it.

2 hours later, it landed a 1770% speedup in one case, 100%+ in other 4, and 22% in average. yes, in 2 hours it outperformed me, opus 4.8 and a swarm of gpt 5.5 agents, by one order of magnitude.

that could not possibly be legit. "it must be hardcoding the benchmarks" (GPT trauma). so I read its explanation and what it did was, indeed, the most high impact optimization one could try first. seems like HVM5 was wasting a lot of time garbage-collecting unused branches of pattern-match nodes. I had optimized that for static mats, but not for dynamic mats. skill issue. Fable figured how to do it for these, resulting in a massive speedup in some benches

but wait, is that *correct*? I'm not sure yet, it is credible, but this is the kind of thing that is very easy to get wrong on interaction nets. the problem is, when I was ready to start auditing Fable's solution so I could tell whether it was buggy or legit, it interrupted me to tell me it had found a massive bug on the code *I* had written.

... wait, what?

so... for garbage collection purposes, I stored a bit on lambda term pointers that meant "the variable bound by this lambda has been freed, so, its lambda must free whatever argument it is applied to". that's fine. yet, on duplicator nodes, I also used the same bit to mean "one of the duplicated variables was freed, so, treat this dup as a passthrough no-op". so, if a lambda entered a duplicator, it would mistake the lambda's collection bit for its own, resulting in corrupted interaction!

that's a mouthful, why I'm writing this?

just so you can appreciate the sheer absurdity of what just happened. I didn't ask it to find bugs. I asked it for an optimization. and even if I did ask it to find bugs, this bug is so astonishingly subtle and specific, identifying it takes mastering the domain to an extent that is beyond even me. I'd easily need hours or days to fix it, *if* I ever came across it. chances are it would just go unnoticed. and Fable found it and fixed it like it was nothing, while it was busy adding a 17x speedup to a file that neither I, nor Opus 4.8, nor a fleet of GPT 5.5 managed to barely make 2x faster.

oh and there is also another tab where it is also ripping through Bend's codebase and finishing everything I had to do

I don't know what to say anymore

this isn't about Anthropic or OpenAI, this is about our collective future as a species. the world is changing, and we need to be aware of it, and discuss how to handle this change.

receipt below . . .
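The flag-bit collision described in the post above can be illustrated with a toy model. This is only a sketch under stated assumptions: the term layout, the tag values, and every function name here are hypothetical and are not HVM5's actual encoding. It shows one way a single shared bit, given different meanings on lambdas and duplicators, could be misread when a lambda arrives at a duplicator.

```python
# Toy model of a shared flag bit. Hypothetical layout, NOT HVM5's real encoding.
TAG_LAM = 0          # lambda term
TAG_DUP = 1          # duplicator node
FLAG_BIT = 1 << 7    # one bit reused by both kinds, with different meanings:
                     #   on LAM: "bound variable was freed, so free the argument"
                     #   on DUP: "one duplicated variable was freed, act as passthrough"


def make_term(tag: int, flag: bool, payload: int) -> int:
    """Pack tag (low 7 bits), the flag bit, and a payload into one integer."""
    return (payload << 8) | (FLAG_BIT if flag else 0) | tag


def tag_of(term: int) -> int:
    return term & 0x7F


def flag_of(term: int) -> bool:
    return bool(term & FLAG_BIT)


def dup_is_passthrough_buggy(dup: int, incoming: int) -> bool:
    # Assumed bug shape: the flag is read off the term that entered the
    # duplicator, so a lambda's "variable freed" bit looks like the dup's own.
    return flag_of(incoming)


def dup_is_passthrough_fixed(dup: int, incoming: int) -> bool:
    # Only the duplicator's own flag decides passthrough behavior.
    return tag_of(dup) == TAG_DUP and flag_of(dup)


lam = make_term(TAG_LAM, flag=True, payload=1)    # lambda whose variable was freed
dup = make_term(TAG_DUP, flag=False, payload=2)   # ordinary duplicator
print(dup_is_passthrough_buggy(dup, lam), dup_is_passthrough_fixed(dup, lam))
```

In this model the buggy check treats an ordinary duplicator as a passthrough whenever a flagged lambda enters it, while the fixed check ignores the incoming term's flag. Giving each meaning its own bit would be another way to avoid the ambiguity.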

252

679

Felipe P

@TheSilvDev

5 days ago

As much as I want to remain semi-professional here about Fable 5 and what it means for software, there is a single GIF I can think of that explains this moment.

Andrej Karpathy

@karpathy

6 days ago

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on a tap. The Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

25K

Felipe P

@TheSilvDev

10 days ago

This is genuinely amazing if it works out the way they say it does. I've been bitten by hype from a reputable source before, but if this is another confirmable step for research ideation, I am all for it.

Markus J. Buehler

@ProfBuehlerMIT

10 days ago

We've made a breakthrough in self-evolving AI scientists moving from "search" to "principled discovery": Scientific discovery requires that the search space itself changes, and an AI scientist must perceive this shift without intervention. We built an AI that achieves this for the first time with the ability to discover the scientific vocabulary it reasons in. Evidence, tools, artifacts, verifiers, failures & claims become typed provenance. We show three distinct modalities: 1) retrieval, adding known objects; 2) search, exploring a fixed schema; and critically: 3) discovery, a verified regime transition. We solve the open-endedness evaluation problem by lifting agentic workflows into a typed copresheaf and proving, via a Kan obstruction, that true discovery is not unbounded generation but a verifiable schema expansion: old evidence is transported by Left Kan extension, and genuine novelty is mathematically quantified by the pointwise residual beyond the transported image - separating discovery from mere search and making novelty objective and measurable rather than a subjective judgment or benchmark delta. Our AI scientist is built in a way that does not pre-conceive the approach it chooses; instead, we endow the system with formal power to adapt, evolve, and reason from first principles. Case studies include: 1⃣Builder/Breaker model that discovers mode-conditioned compliance in proteins; 2⃣CategoryScienceClaw that finds anisotropic fiber-network stiffness rules. Great work in collaboration with my graduate student @fwang108_ @MITdeptofBE F.Y. Wang & M.J. Buehler, Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence, arXiv:2606.01444, 2026

104

378

782K

Felipe P

@TheSilvDev

2 months ago

Anthropic moving fast as ever. Agent building was a request of many businesses. I've built a few, added evals, feedback, customized controls. Now we have a new tool to do most of that for us. I am excited to test it out and see how much faster it can improve our workflows.

Claude

@claudeai

2 months ago

Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.

57K

50K

22M

TheSilvDev retweeted

Jack Lindsey

@Jack_W_Lindsey

2 months ago

Before limited-releasing Claude Mythos Preview, we investigated its internal mechanisms with interpretability techniques. We found it exhibited notably sophisticated (and often unspoken) strategic thinking and situational awareness, at times in service of unwanted actions. (1/14)

155

767

980K

Felipe P

@TheSilvDev

2 months ago

Developers have to build to our scheme, including the tools the LLM can use. The LLM check focuses on catching bad actors in code. The rest is teacher verification. That is my answer. We need more HITL for high-importance topics like content that will reach children.

Felipe P

@TheSilvDev

2 months ago

Another Gauntlet project done, and an interesting one. We had to read a case study and think about the scalability and security of a chat app for K-12 students. I built out a verification system and will talk about it in the comments as a thread.

Felipe P

@TheSilvDev

2 months ago

Verification layers: an LLM check, an admin, a community of teachers, and finally a single teacher. We can get the application to students and set up several checkpoints along the way.
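The four layers listed above could be sketched as an ordered chain of checkpoints. This is a minimal illustration, not the project's implementation: the layer names, the `first_rejection` helper, and both approval functions are hypothetical, and a keyword filter stands in for the LLM call.

```python
from typing import Callable, List, Optional, Tuple

# A layer is a (name, approve) pair; approve returns True to let content through.
Layer = Tuple[str, Callable[[str], bool]]


def llm_check(content: str) -> bool:
    # Placeholder for a model call: a simple keyword filter stands in for the LLM.
    return "malicious" not in content.lower()


def human_approves(content: str) -> bool:
    # Placeholder for a human decision (admin, teacher community, single teacher).
    return True


# Order follows the post: LLM check, admin, community of teachers, single teacher.
LAYERS: List[Layer] = [
    ("llm_check", llm_check),
    ("admin", human_approves),
    ("teacher_community", human_approves),
    ("single_teacher", human_approves),
]


def first_rejection(content: str, layers: List[Layer] = LAYERS) -> Optional[str]:
    """Run the layers in order; return the first rejecting layer's name, else None."""
    for name, approve in layers:
        if not approve(content):
            return name
    return None


print(first_rejection("a quiz tool for fractions"))  # prints None: every layer approved
print(first_rejection("a malicious script"))         # prints llm_check
```

Stopping at the first rejection keeps the cheap automated check in front, so human reviewers only see submissions that already passed it.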

TheSilvDev retweeted

Simplifying AI

@simplifyinAI

2 months ago

🚨 Someone just built a fully open-source mocap system that works with any camera. It's called FreeMoCap, a markerless 3D tracking system that runs on ordinary webcams. It turns multiple camera feeds into research-grade skeletal data automatically. 100% Open Source.

729

360K
