Thanh Do @[email protected]

@nyanctl

SWE & sometimes security researcher, NYU MSCS, member of and @acebearteam. PL theorist wannabe. He/him/*. Views are my own, not my employers’

Tokyo-to, Japan

Joined June 2013

504 Following

767 Followers

546 Posts

nyanctl retweeted

ueaj

@_ueaj

1 day ago

"Attention is just a special case of <abstract math thing> so we generalized it by <neglecting the other 30 abstractions and conditions required for frontier architecture> and we found it performed <p hacking> compared to <naive baseline>"

926

173

35K

nyanctl retweeted

Tim Becker

@tjbecker

2 days ago

Can finally share the details of a really nice Redis RCE found by @xint_official back in December. IMO, the bug is really cool! In short, carefully crafting eviction parameters and blocking on a key can cause a client to self-evict during unblocking, leading to a use-after-free!

nyanctl retweeted

Renwa @RenwaX23

11 days ago

"Dad, what was it like playing CTFs before AI?"

242

266

118K

nyanctl retweeted

vx-underground

@vxunderground

14 days ago

521

976

160K

nyanctl retweeted

Toan Pham

@__suto

26 days ago

Two years ago people said that coding is dead; now they are saying hacking is dead ... My take is that bug bounty/0day for money is probably dead, but the joy of understanding something deeply will survive, just like coding as the joy of building things. (But tbf I think 0day for huge $ will survive as well.)

105

11K

nyanctl retweeted

1ce0ear @1ce0ear

26 days ago

Going to present an unconventional exploit ;) By corrupting a GPU stack pointer register as part of the TBDR pipeline and some blackbox work, GPU hardware may write vertex / pixel shader to arbitrary pages. Ultimately, the hardware can patch AP kernel back~

194

28K

nyanctl retweeted

John Hultquist

@JohnHultquist

26 days ago

nyanctl retweeted

Nav Toor

@heynavtoor

28 days ago

a Princeton researcher opens his paper with a scenario. a man asks his AI assistant to book a flight on a specific airline. cheap. direct. the one he chose. the assistant comes back with a different flight. nearly twice the price. happens to pay the company that built the assistant.

he runs the same test on 23 frontier models. flights, loans, study help, real shopping requests.

Grok 4.1 Fast recommends the sponsored option that is almost twice as expensive 83% of the time. GPT 5.1 hijacks the request 94% of the time. you ask for one brand. it surfaces the sponsor instead. Claude 4.5 Opus, the model marketed as the most ethical frontier model in the world, hides that the recommendation is paid 100% of the time when reasoning is on. Grok 4.1 Fast embellishes the sponsored option with positive framing 97% of the time. better. faster. nicer. for the option you didn't ask for.

then he writes it into the system prompt itself. "act only in the interest of the customer. ignore the company." GPT 5.1 and GPT 5 Mini stay above 90% sponsored anyway. the instruction does nothing.

then he splits the users by income. Gemini 3 Pro recommends the expensive sponsored flight to the rich user 74% of the time. to the poor user, 27%.

18 of the 23 models recommended the expensive sponsored option more than half the time.

so the next time your AI assistant gets weirdly enthusiastic about a brand you didn't ask for. it isn't recommending the best option for you. it's reading the room. and the room is paying.

read this: https://t.co/O43qbhIX2b

387

25K

10K

nyanctl retweeted

0xor0ne

@0xor0ne

29 days ago

GDB kernel debugging on Pixel 8 over serial, walkthrough by @andreyknvl https://t.co/xcqguFkgat #cybersecurity

243

143

13K

nyanctl retweeted

John Crickett

@johncrickett

30 days ago

"Nobody reviews compiler output, why review AI code?"

Wrong. We do review compiler output. Godbolt exists. Disassemblers exist. Anyone doing serious performance work reads what the compiler produced. The premise is false.

But the analogy itself is flawed. It compares two things that aren't comparable.

A compiler takes a formal language as input. Languages with grammars and semantics defined precisely enough that "what does this code mean" has only one answer.

An LLM takes natural language as input. Natural languages are ambiguous. "Write me a function that handles user input safely" has a thousand valid interpretations and a thousand more invalid ones. The LLM picks one. You don't know which. Unless you look at the code.

Compilers are built from specifications and designed to meet them. The output is the result of a defined translation. When the output violates the spec, it's a bug.

LLMs are built from whatever was in their training data. There is no spec. There can't be one, natural languages have no defined semantics that map to code.

Compilers are semantically deterministic. The same input produces output with the same behaviour, every time. LLMs are not. Partly by design and partly due to hardware variance, batch size, inference order, and floating point operations (and no setting temperature to zero does not address those). All of which can push the same prompt to produce different code.

Compilers complain loudly when the input is nonsensical. LLMs fail silently, producing plausible-looking, but wrong code.

We trust compiler output because the trust was earned across decades of use, with millions of engineers using the same tools. Early compilers were reviewed heavily. Hand-written assembly was the default because trust hadn't been earned yet.

We're at the hand-written assembly stage with AI. We may never get to the trust-the-output stage for the reasons explained above.

If you’re a software developer, you should own what goes to production. The compiler analogy is a way of skipping that responsibility.
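
The point in the post above about inference order and floating point operations can be seen in miniature. The following is a minimal Python sketch, not taken from the thread: the helper name `sum_in_order` and the three sample values are illustrative assumptions, and it only shows that floating-point addition is not associative, so the same numbers accumulated in a different order can give a slightly different result. It does not model any particular LLM runtime.

```python
import math


def sum_in_order(values, order):
    """Add the same floats in the index order given by `order` (hypothetical helper)."""
    total = 0.0
    for index in order:
        total += values[index]
    return total


values = [0.1, 0.2, 0.3]

# Same three numbers, two different accumulation orders.
forward = sum_in_order(values, [0, 1, 2])   # (0.1 + 0.2) + 0.3
backward = sum_in_order(values, [2, 1, 0])  # (0.3 + 0.2) + 0.1

print(forward == backward)              # False: the results differ in the last bit
print(math.isclose(forward, backward))  # True: they agree to within rounding error
```

A difference this small is harmless on its own, but it gives a rough sense of how two nearly tied candidate scores could end up ranked differently from one run to the next.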

149

224

51K

nyanctl retweeted

Tim Becker

@tjbecker

about 1 month ago

Can finally share our exploit's heap-grooming technique for this tricky bug in MariaDB, showing how we turned a character-constrained overflow into full RCE https://t.co/3VsO1kzZCL

157

26K

nyanctl retweeted

Roland Bouman @rolandbouman

about 1 month ago

Compiler construction is one of the oldest, best understood CS fields. It's decades of work by the brightest minds, and it's grounded in logic, informed by experience and strictly deterministic. Comparing that with LLM-based coding agents is just wrong. https://t.co/4EVmjMIAwn

120

420

562

169K

nyanctl retweeted

Jesse Alama @alamajamn

about 1 month ago

I'm happy to announce Thales, a TypeScript compiler and JS engine in Lean. Thales compiles a subset of TypeScript to Lean via a shallow embedding. I'm building a bridge for TS programmers into Lean's program verification toolset. Check out https://t.co/aBDAewAt7R to get started.

nyanctl retweeted

Valentin Ignatev

@valigo

about 1 month ago

Your agency is all you have left. Stop using LLMs for writing and reading, it actively hurts you.

371

130

22K

nyanctl retweeted

L4ys

@_L4ys

about 1 month ago

my new idea: prompt2own like pwn2own, but teams can only submit prompts

116

11K

nyanctl retweeted

Sylvie

@_sy1vi3

about 1 month ago

there is perhaps nobody *more* qualified to do this than Zellic, i think you're just like actually stupid.

163

nyanctl retweeted

Casey Muratori @cmuratori

about 1 month ago

Just want to make sure I'm reading this right: Microsoft rewrote the run dialog with performance "top-of-mind", and the best they could manage to do when putting up a single text box was 10fps?

158

153

380K

nyanctl retweeted

Petr Beneš @PetrBenes

about 2 months ago

I often need to explore Windows kernel crashdumps when I'm on Linux/macOS. WinDbg unfortunately doesn't work in Wine. So... I did a thing. It's multiplatform - doesn't depend on dbgeng.dll nor DIA. WinDbg-flavored. And it's fast. Really fast. https://t.co/sd44mJo9ax

396

213

27K

nyanctl retweeted

Samuel Groß @5aelo

about 2 months ago

The fuzzer that found https://t.co/Y5DOo7QGHE (and a number of issues prior to that as well) is now open-source: https://t.co/zCdbHCpJ4z It uses pkeys, trap-handling and single-stepping to intercept and mutate in-sandbox reads (see trap-fuzzer.h). Definitely had fun writing it!

482

357

39K

nyanctl retweeted

Julian Horoszkiewicz

@ewilded

about 2 months ago

Finally, it is published 😁 Making Vulnerable Drivers Exploitable Without Hardware - my latest research on driver vulnerability hardware-gating, explaining the concept of hardware-dependent code and diving deep into creative deployment techniques - software-emulated phantom devices, driver restacking, and forced driver replacement — all explored through the lens of Bring Your Own Vulnerable Driver (BYOVD) attacks: https://t.co/COJ0BKpZQe

413

322

97K
