An empirical study of the robustness of Windows NT applications using random testing 论文

2000引用 266
Software Testing and Debugging TechniquesSoftware Reliability and Analysis ResearchAdvanced Malware Detection Techniques

摘要

We report on the third in a series of studies on the reliability of application programs in the face of random input. In 1990 and 1995, we studied the reliability of UNIX application programs, both command line and X-Window based (GUI). In this study, we apply our testing techniques to applications running on the Windows NT operating system. Our testing is simple black-box random input testing; by any measure, it is a crude technique, but it seems to be effective at locating bugs in real programs. We tested over 30 GUI-based applications by subjecting them to two kinds of random input: (1) streams of valid keyboard and mouse events and (2) streams of random Win32 messages. We have built a tool that helps automate the testing of Windows NT applications. With a few simple parameters, any application can be tested. Using our random testing techniques, our previous UNIXbased studies showed that we could crash a wide variety of command-line and X-window based applications on several UNIX platforms. The test results are similar for NT-based applications. When subjected to random valid input that could be produced by using the mouse and keyboard, we crashed 21% of applications that we tested and hung an additional 24 % of applications. When subjected to raw random Win32 messages, we crashed or hung all the applications that we tested. We report which applications failed under which tests, and provide some analysis of the failures. 1INTRODUCTION We report on the third in a series of studies on the reliability of application programs in the face of random input. In 1990 and 1995, we studied the reliability of UNIX command line and X-Window based (GUI) application programs[8,9]. In this study, we apply our techniques to applications running on the Windows NT operating system. Our testing, called fuzz testing, uses simple black-box random input; no knowledge of the application is used in generating the random input. Our 1990 study evaluated the reliability of standard UNIX command line utilities. It showed that 25-33 % of such applications crashed or hung when reading random input. The 1995 study evaluated a larger collection of