
> Most code doesn't express subtle logic paths. If I test if a million inputs are correctly sorted, I've probably implemented the sorter correctly.

This just rings of famous last words to me. There are many errors that would pass this test; edge cases in arbitrary code are not easy.

Makes me wonder how fuzzers do it. Just random data? How guided is it?



Modern fuzzers mutate the input so that execution travels through as many different code paths as possible.

One of the better-known "new gen" fuzzers is AFL (American Fuzzy Lop). Wikipedia has a high-level overview of its fuzzing algorithm: https://en.wikipedia.org/wiki/American_Fuzzy_Lop_(software)#...

With AFL you can take a JPEG decoder and have the fuzzer come up with a "valid" JPEG picture, i.e. one acceptable to the decoder: https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-th...
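To make the idea concrete, here is a toy sketch of that mutate-measure-keep loop. This is not AFL's actual algorithm (AFL uses compile-time instrumentation and far smarter mutation strategies); `target`, `mutate`, and `fuzz` are made up for illustration, and the "coverage" signal is faked by having the target report which branches it took:

```python
import random

def target(data: bytes) -> set:
    """Toy program under test: returns the set of branch IDs it executed.
    A real fuzzer gets this signal from instrumentation, not the target itself."""
    hits = set()
    if data[:1] == b"F":
        hits.add("F")
        if data[1:2] == b"U":
            hits.add("FU")
            if data[2:3] == b"Z":
                hits.add("FUZ")
    return hits

def mutate(data: bytes) -> bytes:
    """Flip one bit, insert a random byte, or delete a byte."""
    buf = bytearray(data or b"\x00")
    op = random.randrange(3)
    pos = random.randrange(len(buf))
    if op == 0:
        buf[pos] ^= 1 << random.randrange(8)
    elif op == 1:
        buf.insert(pos, random.randrange(256))
    elif len(buf) > 1:
        del buf[pos]
    return bytes(buf)

def fuzz(iterations: int):
    """Keep only inputs that reach branches no earlier input reached."""
    corpus = [b"seed"]
    seen = set()
    for _ in range(iterations):
        candidate = mutate(random.choice(corpus))
        cov = target(candidate)
        if not cov <= seen:  # new coverage: save this input for further mutation
            seen |= cov
            corpus.append(candidate)
    return seen, corpus
```

Purely random inputs would almost never stumble into the nested `b"FUZ"` prefix, but because each coverage-increasing input is kept and mutated further, the loop climbs through the branches one level at a time.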


Thanks. This is pretty damn cool, and sounds much more useful than purely random input for real-world use cases.

Question: does this work for interpreted languages? Or is this an assembly only thing?


I suppose it does work for interpreted languages (you just need to define what counts as success and what as failure), but AFL's instrumentation might be too far removed from the program's actual branching, making it less effective, possibly critically so. Additionally, in fuzzing the bottleneck is running the program, millions of times (though maybe not billions), so the slower your function is, the slower the fuzzing will be.

I think, though, that the same approach could be used within interpreters, and I expect it would be easier to do there. E.g. for Python there is this, but I haven't tried it: https://github.com/jwilk/python-afl
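For what it's worth, collecting coverage inside an interpreter really is straightforward: CPython's `sys.settrace` hook reports every line as it executes, which is exactly the feedback signal a coverage-guided fuzzer consumes. A minimal sketch (not necessarily how python-afl does it internally; `parse` is just a stand-in target):

```python
import sys

executed = set()

def trace_lines(frame, event, arg):
    """Trace callback: record (function name, line number) per executed line."""
    if event == "line":
        executed.add((frame.f_code.co_name, frame.f_lineno))
    return trace_lines

def parse(data: str) -> str:
    # Stand-in for the interpreted program under test.
    if data.startswith("{"):
        return "object"
    if data.startswith("["):
        return "array"
    return "scalar"

def coverage_of(func, arg):
    """Run func(arg) under the tracer and return the set of lines it executed."""
    executed.clear()
    sys.settrace(trace_lines)
    try:
        func(arg)
    finally:
        sys.settrace(None)
    return {ln for name, ln in executed if name == func.__name__}
```

Different inputs light up different line sets (e.g. `coverage_of(parse, "{}")` vs `coverage_of(parse, "x")`), and that difference is the guidance signal. The catch is the point made above: tracing every line this way is slow, and execution speed is the fuzzing bottleneck.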



