Ben Bieler

@benbieler

VC @Ventech_VC | Cloud & AI Infra, Robotics. Built things before funding them (ex-founder, DevOps). Obsessed with the infra layer under the AI hype 🇬🇧🇩🇪

Germany

Joined February 2015

524 Following

183 Followers

152 Posts

Ben Bieler

@benbieler

about 2 months ago

The TanStack attack is a pretty massive wake-up call. The scary part is not that malware got into npm packages. It’s that:
the packages were officially signed
provenance checks passed
trusted publishing worked
and it was still malicious
“Trust the CI/CD pipeline” is no longer enough. Software supply-chain security is becoming one of the biggest infrastructure categories in tech.

101

Ben Bieler

@benbieler

2 months ago

@GG_Observatory exactly. and the concerning part is that explicit security prompting barely moves the needle, which suggests the vulnerability patterns are deeply embedded in the generation prior, not in the instruction-following surface. you can't prompt your way out of a training data problem.

Ben Bieler

@benbieler

2 months ago

55.8% of AI-generated code contains exploitable security flaws in security-sensitive benchmarks. The surprising part: models correctly identify their own vulnerable code 78.7% of the time when asked to review it. They still generate the same flaws by default. 1/3

202

Ben Bieler

@benbieler

2 months ago

The problem is structural. Vulnerable patterns are baked into training data. The internet doesn't write safe code. Models learned that as the default. Every vibe-coded codebase in production right now is carrying this debt silently. As AI-driven attacks increase and attack surfaces expand, securing code will require stronger guarantees than today’s tooling provides (e.g. formal methods). The attack surface is growing faster than it’s being verified. The verification layer for AI-generated code is still open. That’s the gap. Would love to talk to any founder trying to solve this. Paper: https://t.co/pgKHlzFzyB 3/3

Ben Bieler

@benbieler

2 months ago

In this benchmark, CodeQL caught 0%. Six tools combined ~7.6%. ~97.8% of formally proven flaws were missed by standard tools in this setup. Seven frontier models were tested. The best scored a D. None passed. Adding "write secure code" to your prompt reduces the rate by 4 points. 2/3

118

Ben Bieler

@benbieler

2 months ago

@tg31679 fair, but every frontier lab is subsidizing right now, no one knows the real prices yet. the structural direction is what matters. deepseek v4 uses 27% of the FLOPs of v3.2. that's not a promo rate, that's an architecture improvement

Ben Bieler

@benbieler

2 months ago

deepseek v4 pro: $1.74/M input. claude opus 4.7: $5. and deepseek is only trailing frontier models by a few months. every time they ship, the same question gets louder: where does the margin live in AI? inference costs keep collapsing. open weights keep closing the capability gap. APIs get arbitraged the moment a cheaper model exists. consumer AI will end up with better unit economics than enterprise on inference costs. model-agnostic startups like perplexity or lovable can swap to deepseek overnight and pocket the margin. enterprise is locked into frontier pricing. that gap only widens...

459

Ben Bieler

@benbieler

2 months ago

@stevencheng yes, that's exactly what OpenRouter and LiteLLM are positioning to solve. model routing as infrastructure. the question is whether that layer builds a real moat or gets commoditized just as fast.

Ben Bieler

@benbieler

2 months ago

"SaaS is dead" keeps trending and horizontal SaaS targeting individuals and startups is under real pressure. vertical enterprise SaaS is a different story. maintenance, compliance, and switching costs are massively underestimated. facebook was written off for missing mobile in 2012. everyone said CRM was dead, now Salesforce is forecasting 800m+ ARR for AgentForce and 170% growth. both adapted. incumbents are harder to kill than the narrative suggests. but surviving and recovering 2021 multiples are two different things. the compression might be permanent even for the winners.

318

Ben Bieler

@benbieler

2 months ago

good point. density of the training signal matters a lot. code has millions of verified examples with clear pass/fail. but even within “verifiable” domains the gap is huge. math proofs vs legal contract review are both technically verifiable but the feedback loop quality is completely different. that’s probably where the next wave of progress comes from, making the signal denser in domains where verification is possible but currently expensive.

Ben Bieler

@benbieler

2 months ago

Why does AI feel like a genius one minute and clueless the next? Because it doesn’t optimize for “intelligence.” It optimizes for what can be verified.
Code + math → clear feedback → rapid improvement
Everything else → weaker signals → slower progress

197

Ben Bieler

@benbieler

2 months ago

true but i’d argue AI design is good at the visual layer (making things look nice) and still weak at the UX reasoning layer (understanding why a flow should work a certain way). the verification gap still applies, it’s just that “does this look good” is easier to verify than “does this solve the user’s problem.”

Ben Bieler

@benbieler

2 months ago

health being #1 makes sense but the question is: does this create a new category or does it get absorbed? every health AI startup i see pitches "we're building the AI doctor." but if people are already getting 80% of that value from a general purpose chatbot for $20/month, the vertical health AI company has to explain why someone would pay separately for something claude already does well enough. the biggest consumer AI use cases might be the hardest ones to build standalone companies around.

Ben Bieler

@benbieler

2 months ago

@andrewchen When building is free, focus becomes the scarcest resource.

138

Ben Bieler

@benbieler

2 months ago

@hiarun02 @zeddotdev is the new 🔥

Ben Bieler

@benbieler

2 months ago

They're in there, just not as a named line. Two theories:
1. FF participated in the Apr '23 Series E ($300M at $27B post) and Mar '25 Series F ($40B at $300B post), both times as a smaller participant. At $852B their aggregate position likely falls below the threshold for a named line, same reason Tiger Global, Flat Capital and K2 Global don't show up despite being in the same rounds.
2. FF invested via an SPV sitting inside line 26 ("D1/Appalossa/Sands/Winslow/Goanna/Pare" at 0.48%). An opaque legal entity name would be invisible to an analyst reconstructing this from partial data.
Either way, this is an estimated reconstruction, not an official disclosure, so I'd take it with a grain of salt.

857

benbieler retweeted

WillC

@willchen500

2 months ago

Harvey is valued at $11B. Legora just raised at $5.5B. I built their entire web application in two weeks and I'm making it open-source and free for everyone to use. Say hi to Mike: https://t.co/NdtTt5MSJ2. When I got the chance to try Harvey and Legora, I was surprised by how simple they were. A thought came to mind: I could probably build something similar in no time at all with Claude. And so I did. Assistant, project, tabular review and workflows. You get it all without vendor lock-in. Mike offers law firms an alternative, where they own the application layer and aren't stuck with a vendor they're renewing forever. You can try Mike in the demo on the website, or go to the GitHub link on the site to download the code and run a local version yourself.

250

234

Ben Bieler

@benbieler

2 months ago

@blwiertz lfg 🚀
