TestCom-Fates 2007

Software Fault Diagnosis

Peter Zoeteweij, Delft University of Technology, The Netherlands

Peter Zoeteweij is a researcher in the computer science department at Delft University of Technology, from which he also received an MSc in 1996. From 1996 to 2001, he worked for Logica (now LogicaCMG) as a software engineer, mainly on software for the oil industry. From 2001 to 2005 he was a researcher at CWI, the Netherlands Institute for Mathematics and Computer Science. He received a PhD from the University of Amsterdam in 2005. Peter has a background in parallel computing and constraint programming, and currently works in the context of the TRADER project, hosted by the Embedded Systems Institute in Eindhoven. The goal of this project is to develop techniques for improving the user-perceived reliability of embedded systems in the area of consumer electronics.

Background

Automated diagnosis techniques help to localize faults that are the root causes of discrepancies between expected and observed behavior of systems. As such, these techniques are a natural companion to testing efforts, which aim at exposing such discrepancies. In software development, automated diagnosis can reduce the effort spent on manual debugging, which shortens the test-diagnose-repair cycle, and can hence be expected to lead to more reliable systems, and a shorter time-to-market. Outside the software development cycle, diagnosis results can also be used for maintenance, and as the basis for (automated) recovery strategies.

Description

This 3-hour tutorial aims to give an overview of automated diagnosis applied to software faults. The emphasis is on a particular technique called spectrum-based fault localization, which is well-suited for diagnosing software systems, and which can easily be integrated with existing testing schemes. The tutorial consists of two parts: in the first part we introduce spectrum-based fault localization and related concepts and techniques, and in the second part we survey research and applications in this area.

Part I: Concepts

We will start by introducing a high-level view of the diagnosis problem. Central to this view is the notion of a model of a system, which serves to define its intended behavior, and may contain additional information about its composition and operation. In the context of this high-level view, we will describe and relate two specific approaches to automated diagnosis: model-based diagnosis and spectrum-based fault localization.

Model-based diagnosis (MBD) entails reasoning about possible explanations for observed (faulty) behavior using an operational model of the system. If a suitable formalism is chosen, this reasoning can be automated.
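To make the idea of reasoning with an operational model concrete, the following minimal sketch enumerates sets of components that, if assumed faulty, make the model consistent with an observation. The three-inverter circuit, the signal names and the function names are hypothetical and chosen only for illustration; the brute-force search stands in for the more refined reasoning a real MBD engine would use.

```python
from itertools import combinations, product

# Hypothetical system: signal a feeds inv1, whose output b feeds inv2 and inv3.
# Each entry maps a component to its (input signal, output signal).
WIRING = {"inv1": ("a", "b"), "inv2": ("b", "c"), "inv3": ("b", "d")}
SIGNALS = ["a", "b", "c", "d"]


def consistent(faulty, observation):
    """True if some signal assignment matches the observation while every
    component NOT in `faulty` behaves as an inverter (faulty ones are unconstrained)."""
    for values in product([0, 1], repeat=len(SIGNALS)):
        state = dict(zip(SIGNALS, values))
        if any(state[s] != v for s, v in observation.items()):
            continue
        if all(state[out] == 1 - state[inp]
               for name, (inp, out) in WIRING.items() if name not in faulty):
            return True
    return False


def minimal_diagnoses(observation):
    """Smallest sets of components whose failure explains the observation."""
    found = []
    for size in range(len(WIRING) + 1):
        for candidate in combinations(sorted(WIRING), size):
            candidate = set(candidate)
            if any(d <= candidate for d in found):
                continue  # a subset already explains the observation
            if consistent(candidate, observation):
                found.append(candidate)
    return found


# A healthy system with a=1 would give c=1 and d=1, so observing c=0 signals a fault.
diagnoses = minimal_diagnoses({"a": 1, "c": 0, "d": 1})
print(diagnoses)
```

In this example the observation on d rules out inv1 as a single-fault explanation, which illustrates how additional observations narrow the set of candidate diagnoses.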

Spectrum-based fault localization (SFL) compares the activity of parts of a system with the pattern in which failures occur across a test suite, or across several usage scenarios. Based on this comparison, likely candidates for causing these failures can be identified.
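The comparison can be sketched as follows. Each row of the activity matrix records which components were active in one run, and a separate vector records which runs failed. The sketch scores each component with the Ochiai similarity coefficient, which is only one of several coefficients that can be used for this purpose; the choice of coefficient, the example data and all names are assumptions made for illustration.

```python
import math


def ochiai_scores(activity, failed):
    """Similarity between each component's activity column and the failure vector.

    activity[i][j] is 1 if component j was active in run i, else 0.
    failed[i] is 1 if run i failed, else 0.
    """
    scores = []
    for j in range(len(activity[0])):
        active_failed = sum(1 for row, f in zip(activity, failed) if row[j] and f)
        active_passed = sum(1 for row, f in zip(activity, failed) if row[j] and not f)
        inactive_failed = sum(1 for row, f in zip(activity, failed) if not row[j] and f)
        denominator = math.sqrt(
            (active_failed + inactive_failed) * (active_failed + active_passed)
        )
        scores.append(active_failed / denominator if denominator else 0.0)
    return scores


def ranking(scores):
    """Component indices ordered from most to least suspicious."""
    return sorted(range(len(scores)), key=lambda j: -scores[j])


# Hypothetical data: four runs over three components; runs 2 and 3 failed.
activity = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
failed = [0, 1, 1, 0]

scores = ochiai_scores(activity, failed)
print(ranking(scores))
```

Here the third component is active in exactly the failing runs, so it receives the highest score and heads the ranking.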

While MBD has been applied successfully for diagnosing complex mechanical systems, finding a formalism that facilitates its application to software has proven to be difficult. In contrast, SFL applies naturally to software. Comparing MBD and SFL is useful for understanding the possibilities and limitations of both methods.

Part II: Research and Applications

The survey of research and applications related to SFL consists of the following topics:

An overview of existing systems and projects where SFL is applied.
An overview of recent SFL-related research, both in our own group and by others. Here we concentrate on the effect that several parameters of the approach have on the quality of the diagnosis. We will present experimental results obtained on a benchmark known as the Siemens set, which contains over 100 software faults in seven small C programs.
A case study of applying SFL in collaboration with NXP, a semiconductor company developing embedded systems for consumer electronics devices. The case study involves two faults in the control software of a product line of television sets. We discuss the implementation of SFL for this particular application domain, which is characterized by a high level of concurrency, and scarce CPU and memory resources.
A discussion of further possibilities for utilizing SFL outside the software development cycle, including (automated) recovery.
A survey of other approaches to automated diagnosis of software systems, in particular attempts at applying MBD to software.

It is not necessary to bring a laptop, but participants with laptops that have gcc and gcov installed (standard in most GNU/Linux distributions) will be able to get some hands-on experience with spectrum-based fault localization. The necessary files will be available on a USB drive, but can also be obtained from http://www.st.ewi.tudelft.nl/~peterz/getting_started.tar.gz.
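As a rough indication of how gcov output relates to program spectra, the sketch below extracts the set of executed source lines from the text of a .gcov report, such as the one produced by compiling with gcc --coverage, running the program, and then running gcov on the source file. It assumes the usual line-oriented report layout of "count: line number: source text"; the sample report is made up, and this is not necessarily how the tutorial files work. Repeating the extraction for each test run would give one row of activity information per run.

```python
def executed_lines(gcov_text):
    """Return the line numbers that a .gcov report marks as executed at least once."""
    hit = set()
    for line in gcov_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) < 3:
            continue
        # Assumption: some gcov versions append '*' to the count of partially executed lines.
        count = parts[0].strip().rstrip("*")
        line_number = parts[1].strip()
        # '-' marks non-executable lines and '#####' marks lines that never ran.
        if count.isdigit() and int(count) > 0 and line_number.isdigit():
            hit.add(int(line_number))
    return hit


# Hypothetical report for a small C file called demo.c.
SAMPLE_REPORT = """\
        -:    0:Source:demo.c
        -:    1:#include <stdlib.h>
        1:    2:int main(void) {
        1:    3:    int x = 0;
        1:    4:    if (x)
    #####:    5:        abort();
        1:    6:    return 0;
        -:    7:}
"""

covered = executed_lines(SAMPLE_REPORT)
print(sorted(covered))
```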