Genetic Algorithms for Software Test Data Generation
Administered by: National Science Foundation
In software testing, one is often interested in judging how well a series of test inputs tests a piece of code—the goal being to uncover as many faults as possible with a potent set of tests. Thus, a test series that has the potential to uncover many faults is better than one that can only uncover a few.
Unfortunately, it is almost impossible to say quantitatively how many faults are potentially uncovered by a given test set. This is not only because of the diversity of the faults themselves, but because the very concept of a "fault" is only vaguely defined. This has lead to the development of test adequacy criteria—criteria that are believed to distinguish good test sets from bad ones.
When a test adequacy criterion has been selected, the question that arises next is how one should go about creating a test set that is "good" with respect to that criterion. Since this can be difficult to do by hand, there is an obvious need for automatic test data generation. We propose to research a specific approach to automatic test case generation for incorporation in RST's coverage test tools, Deepcover for C/C++ and Deepcover for Java.
Test data generation is important for several reasons. First, empirical results seem to indicate that tests that are selected on the basis of test adequacy criteria are, in fact, good at uncovering faults. Secondly, test adequacy criteria are objective criteria by which the quality of software tests can be judged. Neither of these benefits can be realized unless adequate test data (i.e., test data that satisfy the adequacy criteria) can be found.
We propose to enhance and commercialize a test data generation originally paradigm proposed by [Korel 90]. This paradigm treats parts of the program as functions that can be evaluated by executing the program, and whose value is minimal for those inputs that satisfy the adequacy criterion. Therefore, the problem of generating test data is reduced to the well-understood problem of function minimization. Additionally, it is assumed that the structure of the program is visible to the tester, and this enables the use of heuristics during function minimization. Some such heuristics are described in [Korel 90] and [Korel 96].
The problem with this test case generation method (and all others) is that test data generation leads to an undecidable problem: one cannot take a test adequacy criterion and determine whether an input exists that satisfies it. The reason is that the problem of finding test-adequate is a variant of the halting problem — a problem not soluble by any algorithm.
To solve this dilemma, practical test data generation algorithms are usually given resource limitations (e.g., limitations on the amount of computation time they may expend). If no solution is found before the resources are exhausted then the algorithm is considered to have failed. In short, test generation algorithms do not always succeed in finding an adequate test input.
Clearly, it is desirable to have test data generation algorithms that are more powerful in the sense of being more capable of finding adequate tests. The research proposed here specifically addresses that need.
We propose to explore two extensions of Korel's paradigm. First, we will use more powerful function minimization methods in place of Korel's gradient-descent algorithms. Although many sophisticated algorithms exist for this purpose, such as simulated annealing, tabu search, and genetic algorithms, none of these methods have been applied to automatic test data generation. In this effort, we will research the use of genetic algorithms.
References
[Korel 90] B. Korel. Automated Software Test Data Generation IEEE Transactions on Software Engineering 16(8):870-879, August, 1990.
[Korel 96] B. Korel. Automated Test Data Generation for Programs with Procedures Proceedings of the 1996 International Symposium on Software Testing and Analysis pages 209-215, ACM Press, 1996.
