Big fan of tech & startups. Doing AI things with @FulbrightPrgrm. Director at @StartupGrindNYC. Art & Tech chair at @NatnlArtsClub. Skipper at @Manhattan_YC.
@ValeRicciulli I think it's because the CAPTCHA isn't loading properly. Happened to me yesterday, but it initialized after getting "registration failed" 3-4 times.
Did you know there were recently 4 AI supply-chain security incidents in 50 days? They hit OpenAI, Anthropic, and Meta-adjacent infrastructure... things like release pipelines, CI runners, and dependency hooks.
It got almost zero coverage, which is unfortunate.
The attack surface for AI systems is turning out to be the software factory around the model... not necessarily the model itself.
Red teams test for things like jailbreaks and system card violations, but they don't typically test the publish gates, package managers, and CI pipelines that ship the models into production.
While I wish this were a coincidence, four incidents in 50 days across three major labs is systemic. When something breaks publicly (and it will, soon), this will suddenly get all the coverage it deserves.
Check out the @VentureBeat article for a bigger breakdown.
https://t.co/SHCUEcwkxe
@TheEntAIShow @sema4 @ramvzz I think the real constraint for AI agents moving forward will be procurement. Institutions are demanding audit trails and explainability in RFPs, which means vendors without these built into their core architecture will be eliminated before regs like the EU AI Act are even enforced.
@TheEntAIShow @sema4 Great episode with @ramvzz on this. It's notable that the market's already moving faster than regulation. Accuracy ≠ Accountability in regulated work. Finance and law teams need provenance: full lineage of how decisions were made, not just what the decision was.
Notion just opened its workspace to outside agents. Anthropic just locked Claude's agents inside your own infrastructure.
Same week, opposite architectures. One of them is wrong about where enterprise trust lives.
@NotionHQ's bet: the platform is the trusted intermediary. Any agent, anywhere, can plug in.
@AnthropicAI's bet: enterprise buyers need control at the perimeter. Your data never leaves your environment.
Whichever architecture wins will eventually set the default for how AI agents integrate with corporate workflows through 2028. It's governance philosophy masquerading as a product decision.
Anthropic's @StainlessAPI acquisition is underrated. It's being covered as a minor infrastructure deal, but it's one of the more aggressive competitive moves in recent memory.
Anthropic now controls tooling that OpenAI and Google depended on to ship developer products... and they're shutting down the hosted version.
The industry covered @karpathy, but the Stainless deal deserves equal attention.
This week, Google I/O announced Gemini as the agentic runtime embedded in Search, Workspace, Chrome, and hardware.
On the same day, Anthropic grabbed @StainlessAPI (SDK layer) and announced @karpathy will lead AI-powered training automation.
Neither company is competing purely on model benchmarks anymore. Both are racing to own the full stack: training pipeline, developer tooling, product integration, and hardware (Nvidia Vera arriving at both companies simultaneously).
Frontier model quality is becoming commoditized. The companies that win this phase will be the ones with the most locked-in infrastructure.
Two different AI papers this week made the same point from opposite directions.
@thinkymachines says the real frontier is real-time interactivity over static output quality. Josh and Palmer at @AICollectiveCo say it's recoverability: whether failures are visible and bounded.
Both are right. Neither shows up on a benchmark. The field has been measuring the wrong things. Deployment is forcing the issue.
The practical implication: If you're evaluating AI tools using vendor benchmarks or public leaderboards, you're selecting for performance in conditions that don't exist in production.
The model you pick because it aced MMLU might be the one that wipes a folder without telling you, or breaks when your users don't communicate in tidy, complete prompts.
For anyone making AI purchasing or deployment decisions, the honest criteria aren't "what's the benchmark score?" They're: "What does this model do when it's wrong? Does it tell me? Can I recover from it?" Almost no vendor answers those questions directly, because almost no benchmark has required them to.
@sciencevs @MerylEHorn @Penn The problem is that offices and classrooms aren't teaching workers and students to use AI correctly. Structured, intentional prompts produce specific, non-generic outputs that are way more helpful than what you get from typing, "tell me how to have a healthy lifestyle."
I just listened to the @sciencevs episode on #AI making us stupid w/ @MerylEHorn & Dr. Shiri Melumad at @Penn. The panic is real, but misplaced. AI doesn't fry your brain. Bad prompting fries your ROI. If you treat a reasoning engine like a vending machine, you get mediocrity.
I'm sitting in a startup pitch session, wishing more of these companies with good ideas made an effort to differentiate between standalone and presentation decks. So much text! What are your go-to slide deck resources for effective presentations?
@CNN why is @thelauracoates not in a better time slot? She's great on script and even better off. Way more interesting and fun to watch than AP... maybe even AC. No anchor is above the law.