Building Efficient Fuzzers using Automata Learning

Suman, Jana (PI)

Columbia University

Project

Description

Evolutionary mutational fuzzers such as American Fuzzy Lop (AFL) and libFuzzer have been quite successful at finding bugs in large, complex software. The code-coverage-driven evolutionary approach of these fuzzers begins with a set of seed inputs, applies mutations repeatedly to produce a new generation of inputs, and selects inputs for further mutation based on whether they increased code coverage. Despite their popularity, evolutionary fuzzers are not very efficient at finding deep bugs, because the random and unsystematic nature of the evolutionary process fails to increase code coverage beyond a certain threshold. This problem is especially noticeable for event-driven code that processes highly structured inputs, e.g., GUI-driven applications and parsers for different file formats. In this proposal, we plan to investigate and design a novel type of mutational fuzzer that learns automata models (e.g., Deterministic Finite Automata (DFA)) of a test program by executing it with different inputs, and uses the learned models to achieve higher code coverage and find deeper bugs than evolutionary fuzzers. Automata models are especially suitable for fuzzers for two main reasons: they can represent a wide variety of programs, and they can be learned efficiently through blackbox queries. Unlike evolutionary fuzzers, which determine future test inputs solely from the current generation of inputs, an automata-learning-based fuzzer will learn an automata model of the test program from the program behaviors observed on all prior inputs. Such a fuzzer will leverage the inferred model to drive input generation towards unexplored parts of the program, and will use the program's behavior on the new inputs to further refine the learned model. Therefore, automata learning can achieve high code coverage in an efficient manner.
Besides code coverage, the automata-learning-based fuzzer will have several other major advantages over evolutionary fuzzers: (i) the automata models, once learned for a test program, can be reused for testing newer versions of the same application or other applications with similar input formats and functionality; (ii) unlike the models produced by traditional supervised machine learning algorithms, the automata models can be easily interpreted by security analysts, and will thus significantly help analysts understand and reverse engineer the logic of the test program.
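
The model-guided loop described above can be illustrated with a minimal sketch. It is not the project's implementation: the toy target `run_target`, its hidden transition table, the three-letter alphabet, and the assumption that the fuzzer can observe a state identifier after every input symbol are all hypothetical choices made for illustration. The fuzzer records every observed transition in a table (a partial DFA), repeatedly picks a (state, symbol) pair it has not tried yet, uses the learned model to build an input that reaches that state, and refines the model with the result.

```python
from collections import deque

ALPHABET = "abc"  # hypothetical input alphabet


def run_target(inp):
    """Hypothetical program under test: returns the state observed after each symbol."""
    # Hidden logic the fuzzer does not see; the deep state 3 needs the exact input "abc".
    hidden = {(0, "a"): 1, (1, "b"): 2, (2, "c"): 3}
    state, trace = 0, []
    for ch in inp:
        state = hidden.get((state, ch), 0)  # any unexpected symbol resets to state 0
        trace.append(state)
    return trace


def access_string(model, target_state):
    """Breadth-first search over learned transitions for a shortest input reaching target_state."""
    queue, seen = deque([(0, "")]), {0}
    while queue:
        state, path = queue.popleft()
        if state == target_state:
            return path
        for sym in ALPHABET:
            nxt = model.get((state, sym))
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + sym))
    return None


def learn_and_fuzz():
    model = {}      # learned transitions: (state, symbol) -> next state
    states = {0}    # states observed so far; 0 is the start state
    executions = 0
    while True:
        # Unexplored (state, symbol) pairs are the frontier of the learned model.
        frontier = [(s, a) for s in sorted(states) for a in ALPHABET if (s, a) not in model]
        if not frontier:
            return model, states, executions
        state, sym = frontier[0]
        inp = access_string(model, state) + sym  # reach the state, then try the new symbol
        trace = run_target(inp)
        executions += 1
        prev = 0
        for ch, nxt in zip(inp, trace):  # refine the model with every observed transition
            model[(prev, ch)] = nxt
            states.add(nxt)
            prev = nxt


model, states, executions = learn_and_fuzz()
print(sorted(states), executions, access_string(model, 3))
```

In this toy setting every execution explores exactly one new transition, so the deep state is reached without relying on a lucky mutation. The resulting table also hints at the interpretability point made above: the learned model can be read directly as a state diagram of the target.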

Status	Finished
Actual start/end date	7/17/18 → 7/16/21

Funding

U.S. Army: US$360,000.00

Keywords

Artificial Intelligence
Social Sciences (all)
