Comments Page - Fuzzing the PHP Interpreter via Dataflow Fusion

Fuzzing the PHP Interpreter via Dataflow Fusion (arxiv.org). Submitted by todsacerdoti 2 days ago

mmsc 2 hours ago
This is cool! If anybody is interested in looking at the bugs that were found, they are on GitHub: https://github.com/php/php-src/issues?q=author%3AYuanchengJi...
If you work with some interpreted language that is written in C or C++, it is actually quite easy to fuzz the interpreter using the scripting language natively. I outlined how to add fuzzing functions into the Pike scripting language to interact with the AFL++ fuzzer here: https://joshua.hu/aflplusplus-fuzzing-scripting-languages-na...
If you have a very large codebase in such a language, you can just replace whatever function introduces data highest on the call stack with the (new, introduced) function that retrieves fuzzing data from AFL++, and effectively fuzz all of the internal functions that your codebase uses. PHP, Perl, Ruby, or some esoteric language are all pretty good targets for this.
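To make the substitution idea concrete, here is a rough sketch in Python rather than Pike or PHP. Every name in it (read_fuzz_input, parse_header, handle_request) is hypothetical and made up for illustration. It only shows the pattern of swapping the top-level input source for one that reads the fuzzer's test case. It assumes the test case arrives on stdin, which is how AFL++ delivers input when no @@ file argument is given, and it leaves out the instrumentation an actual AFL++ run would need.

```python
import sys


def read_fuzz_input(stream=None, max_len=4096):
    """Hypothetical replacement for the function that normally introduces data.

    Reads one test case from the given binary stream, defaulting to stdin
    (assumption: the fuzzer delivers its test case on stdin).
    """
    if stream is None:
        stream = sys.stdin.buffer
    return stream.read(max_len)


def parse_header(data):
    """Hypothetical internal function that ends up being exercised."""
    key, sep, value = data.partition(b":")
    if not sep:
        raise ValueError("missing ':' separator")
    return {key.strip().decode("latin-1"): value.strip().decode("latin-1")}


def handle_request(read_input=read_fuzz_input):
    """Top of the call stack: in production read_input would be, say, a
    socket read; under the fuzzer it is swapped for read_fuzz_input."""
    data = read_input()
    try:
        return parse_header(data)
    except ValueError:
        # Cleanly rejected input is expected behaviour, not a finding.
        return None


if __name__ == "__main__":
    handle_request()
```

Everything below handle_request runs unchanged, so the fuzzer reaches the internal functions through the same path real data would take. Only unexpected exceptions or crashes count as findings.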
firer 2 days ago
Fuzzing data flow separately from control flow is an interesting idea.
I can believe that it dramatically speeds up finding certain bugs, but I doubt that it can reach a large class of complex vulnerabilities, which in the case of high value targets is probably all that's left.
The PHP interpreter isn't a very interesting target, since it (usually) doesn't accept user input, even if it does power a significant part of the web.
For that reason, it's much less researched and still has low complexity bugs.
More robust interpreters such as JavaScript's V8 will probably fare much better against data-flow-only fuzzing. Bugs in V8 tend to combine both data flow and control flow [1].
[1] https://googleprojectzero.blogspot.com/2021/01/in-wild-serie...