🧵 THREAD: I'm looking for solo founders who run their ENTIRE company with AI agents. No employees. No contractors. Just you and the machines.
If that's you, or you know someone, read on. 👇
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor.
It’s happening faster than we thought, and the implications deserve greater attention. https://t.co/OVVPJO7VQx
We are so happy to announce our new model Aion 1.0 today!
Our team at AI Frontiers Lab at Microsoft Research has been cooking hard on this for quite a while.
Aion 1.0 is a 14B model that can run locally with reasoning + tool calling capabilities. You can choose whatever agentic harness you like or make your own. Calls to the model never leave your device and no one charges you for any tokens you use 🥳.
Claude Code quietly accumulates gigabytes of data across sessions.
These files stay on disk indefinitely, even when they no longer serve any purpose.
So I started building a tool that scans, measures, and safely deletes this data, with a clear indication of the risk before each deletion.
Link in the comments, help yourself!
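For readers curious what the scan-and-measure step of such a tool might look like, here is a minimal sketch. It is not the author's tool: the function names `dir_size` and `scan` are hypothetical, the `~/.claude` location is an assumption used only as an example, and the sketch deliberately reports sizes without deleting anything.

```python
import os
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = Path(root) / name
            if not fp.is_symlink():
                try:
                    total += fp.stat().st_size
                except OSError:
                    pass  # file vanished or is unreadable; ignore it
    return total


def scan(base: Path) -> list[tuple[str, int]]:
    """Return (entry name, size in bytes) per top-level entry, largest first."""
    if not base.is_dir():
        return []
    report = []
    for entry in base.iterdir():
        if entry.is_symlink():
            continue
        size = dir_size(entry) if entry.is_dir() else entry.stat().st_size
        report.append((entry.name, size))
    return sorted(report, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumption: ~/.claude is used here only as an example location.
    for name, size in scan(Path.home() / ".claude"):
        print(f"{size / 1e6:10.1f} MB  {name}")
```

Keeping the report separate from any deletion step means you can review what is taking up space before deciding what is safe to remove.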
Building autonomous agents for scientific discovery? 🧬🤖
@GoogleDeepMind Science Skills is now available on GitHub. We've open-sourced this specialized toolkit to accelerate your agentic workflows with scientific grounding and higher token efficiency.
Download now ↓
https://t.co/cwp1HOeKvo
How do you get Claude Code to check its own work before handing it back?
Watch how you can encode your manual checks so Claude closes its own feedback loop:
Today we are saying goodbye to Windsurf
…and we are transforming it into Devin Desktop
Windsurf has been an absolutely amazing experience for me and the team. Though it has been rocky at times, we have seen every phase of AI coding and we want to keep embracing where things are going. That means we need to once again reorient ourselves towards a more focused goal and remove the Windsurf branding.
Believe it or not, the Windsurf brand has been around less than a year and a half, and before that, the previous name Codeium was only around for a similar timeframe as well. I’ve actually had to change my email every year all the way to the eventual acquisition by Cognition. In AI, most products only have a 1 year lifespan before you need to drastically change them into the next thing.
Devin now encompasses all our form factors, whether it’s the cloud agent, the agent command center (with IDE), CLI, review, or our other products. This way we can really focus our efforts around one name. We are doubling down on our neutrality and making Devin Desktop compatible with other agents via ACP. We may be the only “Switzerland” of AI left and we embrace this role.
As for me, I’ll be transitioning from CEO of Windsurf to Cognition’s President of New Enterprise, helping open new regions and verticals, accelerating velocity, and filling in gaps as usual.
The story of Windsurf doesn’t end here, it continues on as part of Devin’s journey.
Today we're announcing that hybrid agentic inference is coming to Perplexity Computer.
Computer can split tasks between a local model running on your machine and frontier models in the cloud. This keeps private data on your device and maximizes token efficiency.
Coming soon.
Computer-use agents are moving from the cloud to your local machine. Fast.
When we launched Holo3 two months ago, the production feedback was clear: digital agents need to be blazing fast, cost-effective, and versatile.
Today, we're dropping Holo 3.1, engineered to run anywhere, instantly.
Massive token throughput. Low latency. Ready for your local workflow!
Building apps has never been easier.
With Sites, Codex can turn your work, ideas, and plans into an interactive website or app your team can explore, use, and share with a URL.
Rolling out to Business and Enterprise plans, before expanding more broadly.
microsoft MAI tech report is a gold mine, one of the most transparent for a model at this scale.
this model uses zero synthetic data or distillation from previous models. this means reasoning, agentic behavior, tool use are all learned fully during post-training with no cold start. bold choice that makes it harder and requires more iterations to reach sota, but you get FULL control over your model series and it proves they are serious about being a frontier lab.
the tech report is insanely detailed and precise about numbers. to give an example, they give the exact MFU across all the iterations of the model, with the exact changes etc. they also share the full scaling ladder recipe, to my knowledge this is the first time i've seen this in a tech report at this scale
let's look at all of this in this likely very long thread 🧵
❗️ Over 30 official Red Hat npm packages were compromised. How they got in:
- A Red Hat employee's GitHub account was compromised.
- Attackers pushed "orphan commits" (detached from branch history) straight in, bypassing code review with no pull request.
- Payload "Miasma" (Mini Shai-Hulud variant) steals GitHub/cloud/Vault/SSH/npm secrets. Rotate all secrets that have been in use since June 1.
- The commits added a workflow (ci.yaml) + script (_index.js) that abused npm trusted publishing, requesting a real OIDC token to publish backdoored versions.
Introducing Search as Code, our new search architecture for AI agents.
It writes Python that calls our search stack directly, instead of looping through function calls one at a time.
Available in the Perplexity Agent API, and now default in Computer.
https://t.co/ut6GGWQTVO
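To illustrate the general idea of an agent writing one program that calls search directly, rather than issuing one function call per step, here is a toy sketch. Everything in it is an assumption for illustration: `CORPUS`, `search`, and `research` are hypothetical stand-ins, not the Perplexity Agent API.

```python
# Hypothetical in-memory stand-in for a search backend; not Perplexity's API.
CORPUS = {
    "doc1": "orphan commits bypass code review",
    "doc2": "npm trusted publishing uses OIDC tokens",
    "doc3": "local models keep private data on device",
}


def search(query: str) -> list[str]:
    """Return ids of documents containing every query term as a substring."""
    words = query.lower().split()
    return [doc_id for doc_id, text in CORPUS.items()
            if all(w in text.lower() for w in words)]


def research(queries: list[str]) -> list[str]:
    """Run several searches in one program and merge results, first hit first."""
    seen: dict[str, None] = {}  # dict keeps insertion order, so it dedups stably
    for q in queries:
        for doc_id in search(q):
            seen.setdefault(doc_id, None)
    return list(seen)


print(research(["npm oidc", "orphan commits", "npm"]))  # → ['doc2', 'doc1']
```

The fan-out, deduplication, and ordering all happen inside a single script, so the model does not need a separate round trip for each query.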
We've reset 5-hour and weekly rate limits for all users on Pro and Max plans.
We fixed an issue that caused some Claude Code sessions to spawn excessive parallel subagents, burning through usage faster than expected.
Anthropic Opus 4.8 is new SOTA on ARC-AGI-3
Score: 1.5%, ~$10K
ARC-AGI-3 analysis notes:
* Opus 4.8 read the environment one level of abstraction *above* Opus 4.7, as objects & systems, not pictures
* Opus 4.8 succeeded on early levels, but still committed to a wrong sub-goal
Composer 2.5 is now available inside Grok Build.
Composer 2.5 is a fast, highly intelligent model that excels on long-running tasks and following complex instructions.