Dan Robinson @danlovesproofs - Twitter Profile

Pinned Tweet

6 months ago

We built a bug finder. We're finding serious, "let's fix that right now" issues in every codebase we run it on. Introducing Detail!

https://t.co/BGgVFvRlBH

28

358

24

293

113K

Dan Robinson

@danlovesproofs

1 day ago

twelve alien spacecraft arrive around the world; the military assembles a team led by an expert linguist to attempt to make contact.

https://t.co/Y515QyicW6

0

1

0

138

danlovesproofs retweeted

˗ˏˋ Jesse Hanley ˎˊ˗

@jessethanley

7 days ago

@BentoumiTech @danlovesproofs Detail is the best poorly marketed product in the AI tool space.

4

8

1

808

Dan Robinson

@danlovesproofs

8 days ago

If you want to run this on your own codebase, check out Detail.

The biggest difference vs Ramp's scan pipeline is that Detail has an additional feedback loop: we track which vulns get fixed, and we use that data to prioritize better going forward. So when you scan your code with Detail, we're making use of fix / no-fix decisions from across our customer base.

In fact, we don't even look for vulns in particular – just high-value bugs. But if you find many bugs and optimize for which ones get fixed, you'll organically find a lot of vulns, because when a vuln comes in that is bad (and real) engs jump all over it.

You can think of this as the bitter lesson coming for security scanners. Don't write a scanner that looks for SQL injections, write a scanner that looks for bugs and do a good job picking the bugs that engs are going to fix, and you'll end up finding the SQL injections.

Ramp Labs

@RampLabs

8 days ago

https://t.co/YHN5Hy4Ddf

15

206

24

281

303K

1

7

1

7

1K

Dan Robinson

@danlovesproofs

8 days ago

@BenjaminHouy @detaildotdev 🙏🙏🙏

0

1

0

282

Dan Robinson

@danlovesproofs

9 days ago

@0x15f @detaildotdev unbothered. moisturized. happy. in his lane. focused. flourishing. 258 bugs lighter.

0

3

0

134

Dan Robinson

@danlovesproofs

29 days ago

@sachiniyergreen @tankots get this man on a billboard, he has become wispr's strongest soldier.

0

1

0

128

Dan Robinson

@danlovesproofs

29 days ago

you may not like it but this is what peak performance looks like

1

7

1

0

260

Dan Robinson

@danlovesproofs

about 1 month ago

@BenjaminHouy @detaildotdev Send feature requests anytime!

1

0

62

danlovesproofs retweeted

Benjamin Houy

@BenjaminHouy

about 1 month ago

Really impressed with @detaildotdev. Connected it to GitHub, received an email with a list of bugs a few hours later. All real bugs. One critical. Will make my overwhelmed indie hacker life much easier.

1

3

1

0

321

Dan Robinson

@danlovesproofs

about 1 month ago

The hard part about building a bug finder is that every codebase has thousands of bugs, and the vast majority do not matter. Human attention is the bottlenecking resource in building software right now, so the right goal is: how do we pick out the 1% of bugs that are important, and put those ones in front of engineers? Fun post here about how we used a chess tournament to determine that the bugs we're finding are several sigmas more important than the typical CR bot comment.

Detail

@detaildotdev

about 1 month ago

We ran a chess tournament for bugs.

The question we wanted to answer: are bugs from Detail "important"? How do they compare to what code review bots catch?

One of the most important ways we benchmark ourselves is that we want the bugs we generate to be significantly more important than the typical comment from a code review bot.

We took a week of findings from CR bots running on OpenClaw and vLLM, plus findings from Detail on the same week of changes. We put them through an LLM-as-judge tournament.

We fed the head-to-head results into a Bradley-Terry model to compute Elo ratings for bugs. Out comes a global ranking from most to least important.

Awesome exploration from @sachiniyergreen below, with methodology, charts, and a PostHog secret exfiltration vuln that four code review bots missed.
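For readers curious how head-to-head judgments turn into a ranking, here is a minimal Python sketch of fitting a Bradley-Terry model and mapping the strengths onto an Elo-style scale. It is not Detail's code: the function names, the 1500 base rating, the small smoothing prior, and the bug IDs in the sample data are all assumptions made for illustration, and the judge verdicts are supplied by hand rather than produced by an LLM.

```python
import math
from collections import defaultdict


def bradley_terry(matches, prior=0.5, iterations=500, tol=1e-10):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Uses the standard minorization-maximization update. Each item also gets
    `prior` wins and `prior` losses against a virtual opponent of strength 1.
    That smoothing is an assumption of this sketch; it keeps items with zero
    wins at a finite rating and pins the overall scale.
    """
    wins = defaultdict(float)
    games = defaultdict(float)  # games[(i, j)] = number of times i met j
    items = set()
    for winner, loser in matches:
        wins[winner] += 1.0
        games[(winner, loser)] += 1.0
        games[(loser, winner)] += 1.0
        items.update((winner, loser))
    if not items:
        return {}

    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            # Contribution of the virtual opponent (2 * prior games).
            denom = 2.0 * prior / (strength[i] + 1.0)
            for j in items:
                if j != i and (i, j) in games:
                    denom += games[(i, j)] / (strength[i] + strength[j])
            updated[i] = (wins[i] + prior) / denom
        shift = max(abs(updated[i] - strength[i]) for i in items)
        strength = updated
        if shift < tol:
            break
    return strength


def to_elo(strength, base=1500.0):
    """Map strengths to an Elo-style scale: 400 points = 10x odds of winning."""
    return {item: base + 400.0 * math.log10(p) for item, p in strength.items()}


if __name__ == "__main__":
    # Hypothetical judge verdicts: (more important bug, less important bug).
    judged = [
        ("secret-exfil", "off-by-one"),
        ("secret-exfil", "unused-import"),
        ("off-by-one", "unused-import"),
        ("off-by-one", "unused-import"),
        ("unused-import", "off-by-one"),
    ]
    ratings = to_elo(bradley_terry(judged))
    for bug, rating in sorted(ratings.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{bug}: {rating:.0f}")
```

The Elo mapping works because a Bradley-Terry win probability of p_i / (p_i + p_j) equals the Elo expected score when each rating is 400 * log10 of the strength, so the two scales carry the same information.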

4

10

3

0

1K

0

5

0

2

612

Dan Robinson

@danlovesproofs

about 1 month ago

@KeiraArts @jessethanley @MostlyTechPod @IanLandsman @aarondfrancis Amazing! Send feedback & feature requests anytime, DM or myfirstname at product domain.

0

165

Dan Robinson

@danlovesproofs

about 1 month ago

I'll update the site! The backstory, if you were curious:

- We built a bug finder and optimized around finding bugs that were confidently worth an engineer's attention.
- One of the main signals we use for whether a bug is worth an engineer's attention is: if we put a bug in front of an engineer, will they fix it? (We track this and optimize against it.)
- Security vulns perform very well here. So, organically, Detail started finding a lot of vulns.

We aren't security researchers by background. We aren't building SAST or managing your supply chain. We're trying to find bugs you'll care about, and low-hanging vulns are a category of bugs that you'll probably care a lot about. We stumbled into this use case, and we're figuring out how to serve it better.

In some sense, this is bitter lesson in action: don't build a security scanner, instead build a codebase defect scanner and optimize for what engineers judge to be worth fixing, and in practice you'll find a lot of the vulns that matter as part of doing that.

But we find other high value issues too: data loss bugs, billing mistakes, etc.
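As a rough illustration of the "will they fix it?" signal, here is a minimal Python sketch that orders new findings by a smoothed historical fix rate for their category. Everything in it is hypothetical: the tweet does not say how Detail tracks or uses fix decisions, and the category names, counts, and the add-one style smoothing are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class CategoryStats:
    shown: int  # findings of this category put in front of engineers
    fixed: int  # how many of those were subsequently fixed


def smoothed_fix_rate(stats, prior_fixed=1.0, prior_shown=2.0):
    """Fix rate with a weak prior so unseen categories start near 50%."""
    return (stats.fixed + prior_fixed) / (stats.shown + prior_shown)


def prioritize(findings, history):
    """Sort (finding_id, category) pairs by descending smoothed fix rate."""
    unseen = CategoryStats(shown=0, fixed=0)
    return sorted(
        findings,
        key=lambda finding: smoothed_fix_rate(history.get(finding[1], unseen)),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical history of fix / no-fix decisions per category.
    history = {
        "sql_injection": CategoryStats(shown=40, fixed=36),
        "style_nit": CategoryStats(shown=200, fixed=10),
    }
    findings = [("f1", "style_nit"), ("f2", "sql_injection"), ("f3", "data_loss")]
    for finding_id, category in prioritize(findings, history):
        print(finding_id, category)
```

A scanner built this way never needs a rule that says "look for SQL injections"; categories that engineers reliably fix simply float to the top as decisions accumulate.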

0

16

danlovesproofs retweeted

Darian Moody

@djm_

about 2 months ago

We have spent over £30k on annual pen tests over the last 3yrs. They found things, but nothing major. In the last months, https://t.co/5vkny9csbH has uncovered nearly 10 complex high-sev auth/idor vulns on our endpoints. Less than $500 bucks.

1

6

1

570

Dan Robinson

@danlovesproofs

about 2 months ago

@chadwhitacre @_m27e It's satisfying to watch, which matches the feeling we want our product to give you: the "ahhhh that's better" that people get from powerwash / weedwacking / deepclean videos. Except for your codebase.

0

23