We've been doing a lot of scanning of open source repositories in the past month or so since @OpenAI opened up its Trusted Access for Cyber program.
We’ve had impressive results with GPT 5.5 without guardrails as part of that program. This led us to test out the effectiveness of other, more widely available models.
The question was, could good (or bad) actors without access to a formal program like those from OpenAI or Anthropic use other models to find similar vulnerabilities in source code as well?
The short answer is yes. Different models also tend to find different types of vulnerabilities, so you might want a multi-model approach to your AI code scanning efforts.
And for us the surprising star of the show? @cursor_ai's widely available Composer 2.5 model. Best price / performance by a significant margin.
So if you're not in a geography from which you can get access to either Anthropic or OpenAI's security programs, you do have options (and of course, so do the bad guys, so let's get going!).
I tested the code scanning capabilities of 10 different, widely available, large language models (without cyber guardrails) using the same real-world codebase and asked: how good are the LLMs at finding security vulnerabilities?
I used the same methodology that I’ve been using to find 100s of vulnerabilities in open source software and libraries over the past month.
After model self-review, deduplication of the same findings across models and an independent model assessment, I found 350 distinct vulnerabilities.
The headline: on a single run, no model found more than ~35% of them, and false-positive rates ranged from ~2% to ~30%.
These were our results: 🧵
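A minimal sketch of how per-model numbers like these could be tallied once findings are deduplicated and reviewed. Everything here is an assumption for illustration, not the author's actual pipeline: the `(file, line, weakness)` dedup key, the `summarize` helper, and the definition of false-positive rate as the share of a model's distinct reports that were not confirmed.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical dedup key: (file path, line number, weakness label).
Key = Tuple[str, int, str]


def summarize(
    reports: Iterable[Tuple[str, Key]],
    confirmed: Set[Key],
) -> Dict[str, Dict[str, float]]:
    """Per-model share of confirmed findings and false-positive rate.

    reports   -- (model name, finding key) pairs, duplicates allowed
    confirmed -- keys that survived review (assumed to be given)
    """
    # Collapse duplicate reports of the same finding within each model.
    by_model: Dict[str, Set[Key]] = {}
    for model, key in reports:
        by_model.setdefault(model, set()).add(key)

    summary: Dict[str, Dict[str, float]] = {}
    for model, keys in by_model.items():
        true_pos = keys & confirmed
        summary[model] = {
            # Share of all confirmed findings that this model reported.
            "found_share": len(true_pos) / len(confirmed) if confirmed else 0.0,
            # Share of this model's distinct reports that were not confirmed.
            "false_positive_rate": (len(keys) - len(true_pos)) / len(keys),
        }
    return summary


# Toy data, not real results.
k1, k2, k3 = ("a.c", 10, "overflow"), ("b.c", 20, "injection"), ("c.c", 30, "xss")
reports = [("model_a", k1), ("model_a", k2), ("model_b", k2), ("model_b", k3), ("model_b", k3)]
result = summarize(reports, confirmed={k1, k2})
```

Taking the union of confirmed keys across models is what makes a multi-model approach pay off in this sketch: each model contributes the findings the others missed.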
LiveSponsors swapped a brittle MySQL + MongoDB stack for SurrealDB Cloud and dropped query times from 20 seconds to 7 milliseconds, while halving infra complexity. 🚀
Read how they did it: https://t.co/Oj13fUkZyZ
If you're looking for a framework to understand and communicate the components of an agentic system, I've found this talk by @EnoReyes particularly helpful. Well worth a watch.
https://t.co/WNiaK0P5aw
Today we’re joined by Kamyar (@Azizzadenesheli), a staff researcher at @nvidia. We talk about the rise of AI agents powered by LLMs, and how RL is being used today to solve real-world problems.
🎧/📷: https://t.co/GJ9PJ7rICC
Here are some key takeaways: (1/6)
@FlyAirNZ @RCarlyleAuthor You mention 'green hydrogen' - it might be 'green' to make it (e.g. via hydro or geothermal) but if you then burn it you get nitrogen oxides, which have a climate change effect. Or did you mean you will be using it in fuel cells?
Pumped to be partnering with @0xMagmar, @BPIV400 and the @SkipProtocol team building critical infrastructure to democratize and enhance value to all ecosystem participants across the Cosmos! ⏭ 🚀
Shoutout to @Georgian_io, whose team were users before they were investors.
Thanks to existing investors @KickstartFund and @ForwardVc for being so supportive, and doubling down in a meaningful way.
Thanks to new investors at @founderscoop and @OsageVP for your conviction.
3 years ago today, we started Trinsic to make decentralized identity real.
Today we announce our $8.5M seed round. 🎉
What happened in between? And what problem does decentralized identity solve anyway?
Here’s a recap of our journey so far: https://t.co/cPw6iIYYXI
@ceramicnetwork is one of the most interesting decentralized data projects out there right now and @3boxlabs is one of the core teams behind it. We recently spoke to @web3lauren about the project and it's a great overview, well worth a listen...
https://t.co/RWhUKmY8mE
Also check out @Georgian_io's most recent episode of the Impact Podcast with 3Box Labs’ @web3lauren on how decentralized data networks are the foundation for building powerful applications in #web3
As an early-stage founder, you know how important it is to get your GTM strategy right.
Head of Marketing and Growth, @benrwilde, will be leading a GTM Strategy workshop on May 25th @ 3pm-4pm EDT.
Register now 👇
@trusona_inc Introduces Authentication Cloud, Delivering Passwordless Sign-Ins Without an App to Improve Business Growth and Profitability https://t.co/tHTaPdMlV7 #martech #marketing #Technology #Trusona
🚨 ANNOUNCEMENT:
Trusona appointed to @FIDOAlliance Board of Directors with Kevin Goldman, Chief Experience Officer, as Primary Board Delegate — providing human-centered design influence to the global #passwordless standard.
🔗 Read the release:
https://t.co/pdgvAwOSdf
@PoolChess I had a ZX-81 and then a Spectrum 48K here in New Zealand. Thanks for your work on the lecture notes for the Plutus Pioneer Program. Great stuff Chris.