Over a decade of security research and engineering channeled into securing emerging threats
CTO @audit_wizard
Co-Founder @hackstackapp
Auditing is finding bugs others missed.
Evolution of solving smart contract security (aka how to find bugs in 2026):
- Static analysis is blind to logic
- Fuzzing may find logic bugs by accident, but is blind to integrations
- Invariant fuzzing covers logic + state, but is blind to spec gaps and unknown invariants
- Formal verification proves what you specify, but is blind to what you didn't specify
- Spec-to-code compliance catches spec gaps, but is blind to implicit assumptions that were never written down
- Human adversarial reasoning covers the rest, but doesn't scale
- AI pattern reasoning scales human thinking based on past patterns, but not novel ones ("This looks like a reentrancy. I've seen reentrancy before. Check for reentrancy.")
Then there was first-principles reasoning:
"This contract holds ETH. ETH can move. Who controls when it moves? What happens if it moves at an unexpected time? What state is inconsistent if that happens?"
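To make that question concrete, here is a minimal Python sketch. It is a toy model, not a real contract: `ToyVault`, `force_send` and the invariant are all hypothetical names made up for illustration. The idea is that a fuzzer only explores the actions you hand it, so an invariant that looks solid stays solid until you ask "can ETH arrive some other way?" and add that action.

```python
import random


class ToyVault:
    """Toy model of an ETH vault; every name here is hypothetical."""

    def __init__(self):
        self.balance = 0          # ETH actually held by the contract
        self.total_deposits = 0   # ETH the accounting believes it holds

    def deposit(self, amount):
        self.balance += amount
        self.total_deposits += amount

    def withdraw(self, amount):
        amount = min(amount, self.total_deposits)
        self.balance -= amount
        self.total_deposits -= amount

    def force_send(self, amount):
        # ETH that arrives without running the vault's code (for example via
        # selfdestruct of another contract): balance moves, accounting doesn't.
        self.balance += amount


def invariant_holds(vault):
    # The invariant someone wrote down: accounting matches the real balance.
    return vault.balance == vault.total_deposits


def fuzz(actions, runs=200, steps=20, seed=0):
    """Return True if any random action sequence breaks the invariant."""
    rng = random.Random(seed)
    for _ in range(runs):
        vault = ToyVault()
        for _ in range(steps):
            getattr(vault, rng.choice(actions))(rng.randint(1, 10))
            if not invariant_holds(vault):
                return True
    return False


# Fuzzing only the "known" entry points never breaks the invariant.
known_actions_break = fuzz(["deposit", "withdraw"])
# Adding the action found by asking "who controls when ETH moves?" does.
first_principles_break = fuzz(["deposit", "withdraw", "force_send"])
print(known_actions_break, first_principles_break)
```

The fuzzer itself is identical in both runs. The only difference is the one extra action that came from reasoning about how ETH can move, which is the part no tool supplied.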
With Claude Code + Skills in 2026 we can remove the blind spot, but it requires:
🧠 deep domain expertise
🥷 creativity
💰 funds
AI skills are the solution for first-principles reasoning. However, auditors ignore the 3 requirements.
They do so by:
- 🧠 Skipping the domain expertise and asking AI to generate the checklists
- 🥷 Skipping creativity by copying each other's fully AI-generated logic
- 💰 Being careless with token spend
The alpha is that to find bugs in 2026 you need to research the deep, extra-specific domain expertise yourself, add your creativity and personal takes to it, and optimize it like every word matters (which it does). This doesn't scale right away, but it does over time.
Exactly how it's done today:
- 3 terminal tabs, each running tmux with 2-4 Claude Code panes
- Claude Sonnet 4.6
- 🪨 caveman skill for token optimization
- 🧠 Obsidian vault for memory and organization
- secret ingredient: research what you audit
- 🧪 occasional experiments with guest skills from our community's finest
If you are doing this too, are interested in learning and researching together, and want to hunt bugs, we should be friends
Comment below, and I'll DM
Hey Protocol Labs founders and builders
We're hosting a hands-on OpSec workshop exclusively for @ProtocolLabs portfolio founders and developers in New York City.
Tuesday, June 9 · 10:00-11:00 AM
OASIS by Workville, NYC
We'll be covering practical security practices for teams building in Web3:
- Key management & storage
- Device security
- Wallet & multisig handling
- Incident response fundamentals
- Auth & security policies
Led by our CEO, @joe_vanloon
Want a head start or can't make it in person? Check out the training materials here:
https://t.co/9Xh3vOi9yJ
Grab your spot below
https://t.co/1isu8Nv99j
See you there!
I'm in love with this SaaS deployment GDPR-vibe-bypass architecture
Bring Your Own Cloud
Taken from the ClickHouse public docs: a perfect demonstration of how to deploy software into customer cloud infrastructure while the customer stays regulated on their own terms (they are in charge of the infrastructure and data retention, you still manage updates and fixes)
agent-onboarding skill is another easy trick to make agents "register" to a workforce
‼️ not only for devs, but useful for audits too
I know there are many built-in team features in various solutions,
tbh there's also just git lol
but if you just want to randomly run 5 different agents on the same codebase with minimal collisions and 0 setup, you just invoke this at the start of each context
the agent will:
- give itself a name based on contextual purpose
- read the current project TODO.md (or create one)
- constantly look out for joint work
- update the tasks in the TODO when done
quite cool for coverage tracking too
https://t.co/2moZ8svoIr
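For a feel of the TODO.md part, here is a rough Python sketch of the claim/update bookkeeping. It is not the skill's actual logic (that lives in the linked skill). The checklist format with an `(@agent-name)` suffix, the agent names, and both function names are assumptions made up for this example.

```python
import re

# Assumed line format: "- [ ] task title (@owner)" where the owner is optional.
TASK_RE = re.compile(r"^- \[( |x)\] (?P<title>.+?)(?: \(@(?P<owner>[\w-]+)\))?$")


def claim_next_task(todo_text, agent_name):
    """Claim the first open, unowned task; return (new_text, claimed_title)."""
    lines = todo_text.splitlines()
    for i, line in enumerate(lines):
        m = TASK_RE.match(line)
        if m and line.startswith("- [ ]") and m.group("owner") is None:
            lines[i] = f"- [ ] {m.group('title')} (@{agent_name})"
            return "\n".join(lines), m.group("title")
    return todo_text, None


def complete_tasks(todo_text, agent_name):
    """Tick off every task owned by agent_name."""
    lines = todo_text.splitlines()
    for i, line in enumerate(lines):
        m = TASK_RE.match(line)
        if m and m.group("owner") == agent_name:
            lines[i] = f"- [x] {m.group('title')} (@{agent_name})"
    return "\n".join(lines)


todo = "\n".join([
    "# TODO",
    "- [x] map entry points (@scope-mapper)",
    "- [ ] review Vault.sol (@fuzz-scout)",
    "- [ ] review Oracle.sol",
])
todo, claimed = claim_next_task(todo, "oracle-hunter")
todo = complete_tasks(todo, "oracle-hunter")
print(claimed)
print(todo)
```

This only handles the text bookkeeping. Two agents writing the file at the same moment can still collide, which is where "there's also just git" comes back in.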
tiny-auditor is a 48-line skill that was hand-crafted based on 10 years of experience in writing reports
not trying to show off, but this is how we need skills to be made:
Author
> the author's LinkedIn history matches what the skill does
> it takes a few hours to write down without AI, but it took a decade of hard work and real expertise to learn enough to even know where the sweet spots are
No AI
> knowledge is not enough if you dilute it by having AI write the skill for you based on your instructions (it also increases token cost)
Write while doing the WORK
> it needs to be polished WHILE working with it. For example, if you are writing a report manually, you are more focused on the edge cases of the skill user because you are one. Without active work on the side it won't be as good
In 10 years I must've issued thousands of reports, reviewed, presented to Fortune 500 customers and board members, got feedback, got burned, iterated, written knowledge bases - you get it.
I put all that experience and all my focus into writing down what a PERFECT security report should look like
Please share your thoughts! These are the skills we need, not mega-AI spam from all directions
https://t.co/H6RGIAKXai
@mikotabrbrbr public portfolio with report links: https://t.co/KXJ1aA181d
personal LinkedIn (shows I've been doing pentests since 2017; web2 clients usually like their reports NDA'd): https://t.co/srlR1rFgBJ
my AI-assisted audit strategy (May 2026): https://t.co/Dzwoio8DC0
npm user?
One small change to stay safe, FREE
Add these aliases
With them, package installs refuse known malware
I run this:
- locally, to stay safe
- in my CI to detect compromised transitive dependencies early for my lib consumers
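The aliases themselves aren't reproduced in this text, so the following is a different, swapped-in technique for the CI half of the idea: a small Python check that walks `package-lock.json` and flags denylisted packages, transitive ones included. The denylist entries and package names are invented placeholders. In practice the list would come from whatever malware feed you trust.

```python
import json


def find_flagged(lockfile_text, denylist):
    """Return sorted (name, version) pairs from the lockfile that are denylisted."""
    lock = json.loads(lockfile_text)
    hits = set()
    # lockfileVersion 2 and 3 list every installed package, transitive ones
    # included, under "packages", keyed by its node_modules path.
    for path, meta in lock.get("packages", {}).items():
        if not path:  # the empty key is the root project itself
            continue
        name = path.split("node_modules/")[-1]
        if (name, meta.get("version")) in denylist:
            hits.add((name, meta.get("version")))
    return sorted(hits)


# Hypothetical denylist and lockfile, for illustration only.
DENYLIST = {("evil-pkg", "1.0.1")}
SAMPLE_LOCK = json.dumps({
    "name": "my-lib",
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "my-lib", "version": "0.1.0"},
        "node_modules/good-pkg": {"version": "2.3.0"},
        "node_modules/good-pkg/node_modules/evil-pkg": {"version": "1.0.1"},
    },
})

flagged = find_flagged(SAMPLE_LOCK, DENYLIST)
print(flagged)
```

In CI you would read the real lockfile from disk and fail the job when `flagged` is non-empty, so consumers of the library hear about a compromised transitive dependency early.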
supply-chainability
a term to describe how prone your devs are to getting rekt by a supply chain attack
what would you say is the average supply-chainability % in a small dev team nowadays?
Claude Code steering behaviour that causes your agent to forget stuff:
(for devs and auditors alike)
> implement X
>> Thinking...
> oh also implement Y
the latter creates an interrupt signal that causes X to be left half-finished, creating hidden slop
FIX:
> implement X
>> Thinking...
> side job: implement Y
depending on the task, it will either auto-trigger a background task or at least be clearly told that this is not a steer but an added request
Also token-efficient!
Agent Harness Engineering Pattern #8
Steering (and I bet you didn't know you cared)
Ever thought about how interrupting Claude Code mid-turn affects your chat?
Partial turn still in transcript
>> Hi
>> [interrupted]
>> I mean bye
All remains in context
Good because:
- Model can push back - "was 90% done, should I finish first?"
- Can reference the cut itself e.g. "when I stopped you.."
Bad because:
- Many interruptions fill the context window with junk
- A late interrupt means all prior tool calls are still in the window and still cost tokens ‼️
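A back-of-the-envelope Python model of that last point. This is not how Claude Code counts context internally: the transcript shape, the `[interrupted]` marker as a message, and "one word = one token" are all simplifying assumptions for illustration.

```python
def rough_tokens(text):
    # Assumption: whitespace-separated words as a crude stand-in for tokens.
    return len(text.split())


def context_cost(transcript):
    """Rough token count of everything still sitting in the context window."""
    return sum(rough_tokens(content) for _, content in transcript)


def run_turn(prompt, tool_outputs, interrupt_after):
    """Build the transcript of a turn interrupted after N tool calls."""
    transcript = [("user", prompt)]
    for output in tool_outputs[:interrupt_after]:
        transcript.append(("tool", output))  # partial work stays in context
    transcript.append(("system", "[interrupted]"))
    return transcript


# Six hypothetical tool results of 100 "tokens" each.
tool_outputs = ["file contents " * 50] * 6
early = run_turn("implement X", tool_outputs, interrupt_after=1)
late = run_turn("implement X", tool_outputs, interrupt_after=5)
early_cost, late_cost = context_cost(early), context_cost(late)
print(early_cost, late_cost)
```

Same prompt and same interruption, but the late one drags five tool results along into every following turn.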
Literally GitHub gets infected by a malicious VS Code extension
AI is lowering all the security standards we worked on raising over the last decade, and the last 2 months are clear evidence of that
We are investigating unauthorized access to GitHub's internal repositories. While we currently have no evidence of impact to customer information stored outside of GitHub's internal repositories (such as our customers' enterprises, organizations, and repositories), we are closely monitoring our infrastructure for follow-on activity.
I gave @VitalikButerin's "A shallow dive into formal verification" article to the caveman skill haha
You gotta love gen z knowledge consumption
🪨 What it is
Formal verification = math proofs machine can check. Lean language. AI
writes proofs now. Paradigm shift happening in Ethereum + broader computing.
🪨 Why care (this one is my favorite)
Bugs scary. Bugs in smart contracts → North Korea drain funds. Bugs in ZK
proofs → steal silently, no trace. Powerful AI models automate bug
discovery. Need stronger guarantees.
🪨 Core idea
Program is math object. Prove it behaves correctly = math theorem.
Example:
- Signal's X3DH key exchange proven as hard as DDH assumption
- AES impl proven correct
- Together → Signal encryption secure vs passive attackers
End-to-end FV means: not just "protocol is secure in theory" - specific user code proven secure in practice. User checks statement claimed, not
entire codebase.
🪨 Key insight
Safe programming = express intent multiple ways, verify all consistent.
FV extends this infinitely: optimized impl + readable impl → verify
match. 10 friends each write property list → check all pass. AI does all
of it fast.
🪨 How to use today
Don't write proofs by hand - too hard. Instead:
- Ask AI write program in Lean (or assembly)
- AI proves desired properties along way
- Task self-verifying → let AI run hours unsupervised
- You only check: final theorem statement matches what you wanted
Best models for Lean proofs: Claude, Deepseek 4 Pro, Leanstral (119B, 6B
active, runs locally ~15 tok/sec).
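A tiny Lean 4 illustration of "you only check the statement". This is a toy written for this summary, not an example from the article: `double` and the theorem name are made up, and the proof leans on the core lemma `Nat.two_mul`.

```lean
-- Implementation: the part an AI could write and optimize however it likes.
def double (n : Nat) : Nat := n + n

-- Statement: the only line a human reviewer has to read and agree with.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n :=
  (Nat.two_mul n).symm
```

If Lean accepts the file, the reviewer's job shrinks to deciding whether `double n = 2 * n` is really the property they wanted.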
🪨 Limits
- Easy forget to prove what actually matters
- Easy sneak false assumptions into proofs
- Unverified code parts still bite you
- Even Lean itself can have bugs
- Only value = clarity of theorem statement you check at end
🪨 Bottom line
FV = "final form of software development" (Yoichi Hirai). AI makes it viable now.
Write spec, AI proves code matches spec. Check statement, trust code.
Many people have claimed that with AI-assisted bug finding, secure code (and hence trustless anything) will be impossible.
I have a much more optimistic take, and AI-assisted formal verification is a major part of the reason why:
https://t.co/0ceMBZ6uqj
The core principle is that you start hunting on a codebase that went through audits, contests, and thousands of eyes, where even onchain attackers failed to find an exploit
This means that you have to dive deeper than anyone else, either in a niche area of expertise or on a peripheral detail (e.g. a condition the protocol handles that is not obvious and affects state in a very unexpected way)
Also read this: https://t.co/6CDtlCqr42
For all the auditors getting scared by this contests market shift - let me walk you through bugonomics history
1995: Netscape (old browser) paid researchers for bugs, which was radical at the time
2012: @Hacker0x01 and @Bugcrowd dominated the bounty space, with no notion of contests
they had private invite-only events, which is close, but a contest model didn't fit large web2 companies (e.g. Uber, Airbnb): they don't want 500 hackers hammering their servers in a single week
2021: @code4rena realized that smart contract contests are of a different nature:
- Smart contracts store loads of money directly, and get hacked like crazy
- Smart contracts are "immutable": bugs must be found before launch
- Open source means auditors can fully understand the logic, not just probe blindly
- More auditor attention, better results
For protocols, contests cost more than bounties
Let's think like a protocol for a second
contest = coverage, more eyes, pre-launch safety net
- Pay $200k pool upfront
- Runs 1-4 weeks
- Payout regardless of findings quality (money still gone)
bounty = sparse coverage, reactive not proactive
- Pay $0 until valid bug reported
- Only pay on confirmed severity
- Treasury preserved until hit
in bull markets - protocols don't want to get hacked, they spend what they can (contests + bounty after)
in bear markets - same, but now protocols have no funds - bounty is cheaper
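The treasury math behind "bounty is cheaper", as a deliberately naive Python sketch. Only the $200k pool comes from the post. The bug probability and the bounty payout are hypothetical numbers picked for illustration, and the model ignores the cost of actually getting hacked, which is the whole reason contests exist.

```python
def contest_cost(pool):
    # The pool is committed upfront and paid out regardless of findings.
    return pool


def expected_bounty_cost(p_valid_bug, payout):
    # Nothing leaves the treasury until a valid bug is confirmed.
    return p_valid_bug * payout


pool = 200_000        # contest pool, figure from the post
p_valid_bug = 0.25    # hypothetical chance a valid bug gets reported
payout = 100_000      # hypothetical bounty for that bug

contest = contest_cost(pool)
bounty = expected_bounty_cost(p_valid_bug, payout)
print(contest, bounty)
```

With an empty treasury the comparison is lopsided, which is why bear-market protocols drift toward bounties even though coverage gets sparser.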
2025: bear market gets worse, AI spamming submissions left and right, making triage costs increase exponentially
2026: even worse - still a bear market, MORE (way more) AI, and fewer new protocols on top of it all
That's why today we are back to web2-style bounties: the protocols that make real money, real impact.
In 2015 people made a living off web2 bounties, this ain't different
@immunefi @HackenProof @xyz_remedy are all live and kicking, and there's money on the table for you to take. Harder than before, true, but since when has hard stopped us?
DAILYWARDEN IS DOWN
DAILYWARDEN IS DOWN
DAILYWARDEN IS DOWN!!!
I guess we are all officially transitioning back to bounties now
https://t.co/Zi1nv2As8i
Here's where I'd go next
https://t.co/qDyrtLy9oi
https://t.co/162cYocydu
@hugoderby230593 @Hacker0x01 @Bugcrowd Bugs are harder to find, because they're very rare and the code is already audited, and you only sometimes get paid
Programs wait for your bugs for a long time rather than 2 weeks
C4 SHUTS DOWN (and what it means)
> since June last year, @zellic_io took no profit for keeping @code4rena alive, despite the platform's obvious costs
> why? we can infer that Zellic's customers enjoyed the services there, and that it's a hell of a business lead-gen to be this middleman, even for free
> bear market + AI submission spam is a bad combo, and it's even worse when it continues over time with no room to breathe
to many, the OG stepping down might signal "contests are dead" (which was already the vibe with thedailywarden homepage), but to me it just says that running a contest platform is a hard business nowadays
if you're an auditor, don't use it as an excuse to give up - take the lesson here that "easy" wins are no longer valuable - contests that pay need real criticals, real impact, hard research, niche focus areas and strengths
it's your time to shine
thanks @code4rena for reimagining crowdsourced security