spinpx @spinpx - Twitter Profile

spinpx retweeted

7 months ago

Happy to share our work "Cottontail: LLM-Driven Concolic Execution for Structured Test Input Generation" will appear in S&P'26! Paper: https://t.co/vIxZD5BGE2 Code: https://t.co/NxuD4wwNF4 Special thanks to @nim_gnoes_eel, @JNUYUXIAN, @spinpx, @LingxiaoJiang, and @mboehme_ ♥️

tuhaoxin's tweet photo: https://t.co/2wqfYiFaNe

spinpx @spinpx

over 1 year ago

(Prompt) Fuzzing is fundamentally a form of inference scaling law. https://t.co/ZEuMCjU8R8

spinpx retweeted

Dongdong She @DongdongShe

almost 2 years ago

What happens if you write buggy code and misconfigure the experimental setup when evaluating a fuzzer’s performance? Wrong and misleading conclusions!

We found several fatal bugs and wrong experimental settings in MLFuzz (https://t.co/KoDGAUYJ95), a revisit of NEUZZ published at a top-tier software engineering conference, ASE 2023 (@AndreasZeller, @ASE_conf). The following bugs led to wrong and misleading conclusions in MLFuzz:
• An initialization bug ⇒ Failed setup of persistent-mode fuzzing.
• A program crash ⇒ Unexpected early termination of NEUZZ.
• An error in training dataset collection ⇒ A poorly trained neural network model.
• An error in result collection ⇒ Incomplete code coverage report.

We confirmed these bugs with MLFuzz’s authors and wrote a rebuttal paper (https://t.co/Pyp3RalEqt) to explain the errors in MLFuzz and summarize the lessons for a fair and scientific fuzzing experiment or revisit.
1. Ensure the correctness of the code implementation. Careful and rigorous debugging is needed. If you patch a prior work, double-check that your setting or patch is correct, and seek help from the original developers if needed. MLFuzz introduced 3 implementation bugs that led to wrong experimental results and conclusions.
2. Diverse benchmark selection. Try to evaluate your fuzzer on multiple benchmarks, such as FuzzBench, Magma, and UniFuzz.
3. Uniform code coverage metric. Convert different code coverage metrics, such as AFL XOR hash, LLVM coverage sanitizer (pruned), LLVM coverage sanitizer (no-pruned), and AFL++ code coverage, into a uniform one by replaying.
4. Complete test case collection. Be sure to collect all the test cases generated by the fuzzer.
5. Uniform fuzzing mode. Ensure all fuzzers run in the same mode, either the default mode or the faster persistent mode. An apple-to-banana comparison like MLFuzz only leads to wrong conclusions.
6. Open-source your fuzzing corpus. Fuzzing is an optimization process, and a different seed corpus (starting point) can lead to drastically different results.

spinpx @spinpx

over 2 years ago

https://t.co/X4u8bPKPYZ Wei Cao and I did most of this work and wrote the first draft while we were at Ant Group. However, they removed us from the author list. Sad story. This work was shepherded by Alex Liu, but he is not on the list either.

spinpx retweeted

Dongdong She @DongdongShe

over 2 years ago

@AndreasZeller @ririnicolae @MaxCamillo @FSEconf Andreas, you are a renowned researcher in the fuzzing community, and your fuzzing book is amazing. But this work draws a completely WRONG conclusion due to the careless comparison of a file-retrieval fuzzer against an in-memory fuzzer, where the fuzzing throughput gap is up to 10X.

spinpx @spinpx

over 2 years ago

Hopper supports LLVM instrumentation now. https://t.co/Oia198e5O2

spinpx @spinpx

over 2 years ago

Hopper is released at: https://t.co/JHKslYAq9L

spinpx @spinpx

almost 3 years ago

We presented HOPPER, which generates fuzzing test cases for libraries automatically via interpretative fuzzing. It transforms the problem of library fuzzing into the problem of interpreter fuzzing. The paper can be found at https://t.co/b0ao9C8pL6

spinpx @spinpx

about 7 years ago

Our recent work on fuzzing nested branches: https://t.co/52UJCbhuMo

spinpx @spinpx

over 7 years ago

binutils, tcpdump, mupdf, and ffmpeg are the most popular programs in the evaluation of fuzzing papers. 🙂🙂 https://t.co/S7mTUJrom0

spinpx retweeted

spinpx @spinpx

over 8 years ago

@dgryski We do plan to release the software in the future. Whether Angora works with other languages depends on the taint analysis engine. We used DFSan in the paper, and Angora also supports libdft now.

spinpx retweeted

Richard Johnson

@richinseattle

over 8 years ago

Congrats to my @TalosSecurity colleague @emd3l and the other accomplished authors of papers accepted at IEEE S&P https://t.co/rRkEPOgGao

spinpx retweeted

Emanuele Cozzi @invano

over 8 years ago

Very excited to announce that my first paper “Understanding Linux Malware” was accepted @IEEESSP 2018! A study on more than 10k #Linux #malware documenting challenges and Linux-specific malicious techniques. With @emd3l @reyammer @balzarot https://t.co/uK56uu2BgI

spinpx retweeted

Software Engineering Papers @ComputerPapers

over 8 years ago

Angora: Efficient Fuzzing by Principled Search. https://t.co/9xGO4R8JOd

spinpx retweeted

✨ Lizard Queen | @pvineetha.bsky.social ✨ @pvineetha

over 8 years ago

“We figured out a way to trick your voice assistants to respond to our commands but since it might be too obvious to you if we do that, we embedded our commands in songs, and every time your voice assistant hears our songs it executes our commands”. 🔥 This is fine 🔥 https://t.co/vP0vHbO9rC
