Alattin: Mining Alternative Patterns for Defect Detection

NOTE: This paper is an extended version of our ASE 2009 paper, "Alattin: Mining Alternative Patterns for Detecting Neglected Conditions". The project website and the results of our ASE 2009 paper have been moved here.

PROJECT SUMMARY

To improve software quality, static or dynamic defect-detection tools accept programming rules as input and report violations of those rules in software as defects. Because these programming rules are often not well documented in practice, previous work developed various approaches that mine programming rules as frequent patterns from program source code. These approaches then use static or dynamic defect-detection techniques to detect pattern violations in the source code under analysis. However, these existing approaches often produce many false positives due to various factors. To reduce the false positives produced by these mining approaches, we develop a novel approach, called Alattin, that includes new mining algorithms and a technique for detecting neglected conditions based on those algorithms. Our new mining algorithms mine patterns in four pattern formats: conjunctive, disjunctive, exclusive-disjunctive, and combinations of these patterns. We show the benefits and limitations of these four pattern formats with respect to false positives and false negatives among detected violations by applying the mined patterns to the problem of detecting neglected conditions.
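To make the four pattern formats concrete, the following minimal sketch shows one way such patterns could be represented and checked against the condition checks observed around an API call. It is not Alattin's implementation: the class names (`Check`, `And`, `Or`, `Xor`), the function `satisfied`, and the two condition labels are hypothetical, and reading exclusive-disjunctive as "exactly one alternative holds" is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Check:
    """A single condition check, identified by a free-form label (hypothetical)."""
    label: str


@dataclass(frozen=True)
class And:
    """Conjunctive pattern: every part must hold."""
    parts: Tuple["Pattern", ...]


@dataclass(frozen=True)
class Or:
    """Disjunctive pattern: at least one part must hold."""
    parts: Tuple["Pattern", ...]


@dataclass(frozen=True)
class Xor:
    """Exclusive-disjunctive pattern: assumed here to mean exactly one part holds."""
    parts: Tuple["Pattern", ...]


# Nesting And/Or/Xor inside one another gives the combined format.
Pattern = Union[Check, And, Or, Xor]


def satisfied(pattern, observed):
    """Return True if the set of observed condition labels satisfies the pattern."""
    if isinstance(pattern, Check):
        return pattern.label in observed
    hits = sum(1 for part in pattern.parts if satisfied(part, observed))
    if isinstance(pattern, And):
        return hits == len(pattern.parts)
    if isinstance(pattern, Or):
        return hits >= 1
    if isinstance(pattern, Xor):
        return hits == 1
    raise TypeError(f"unsupported pattern type: {type(pattern).__name__}")


# Illustrative labels only; they are not taken from the mined results.
has_next = Check("Iterator.hasNext() returned true")
non_empty = Check("ArrayList.size() > 0")

# A call site that performs only the size check.
call_site = frozenset({"ArrayList.size() > 0"})
print(satisfied(Or((has_next, non_empty)), call_site))   # disjunctive
print(satisfied(And((has_next, non_empty)), call_site))  # conjunctive
```

Under these assumptions, a call site would be reported as a neglected condition when `satisfied` returns False. The example call site passes the disjunctive pattern but would be flagged by the conjunctive one, which shows how the choice of format affects the number of reported violations.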

The paper submitted to ASE Journal can be found as an attachment below.

PEOPLE

Faculty

Tao Xie (Principal Investigator)

Graduate Students

Suresh Thummalapenta (PhD Student)

EMPIRICAL RESULTS

Mined Patterns Sheet Format: Describes results of Section 5.3 for all four pattern formats

The results are available for three subject libraries (Java Util, Java Sql, and Java JTA) in all four pattern formats. The format of each Excel sheet is described below:

    • ID: Pattern ID.
    • API Name: Name of the API method with which the rule is associated.
    • Pattern: Gives one alternative of the pattern. Only pattern IDs are shown here; more details about the pattern are available in the text file <SubjectName>_AllPatterns.txt.
    • Support: Shows the support value assigned by our mining algorithms.
    • Category: Category assigned manually by inspecting documentation or source code. Possible values are Rule, Partial Rule, and False Positive.
    • Comments: Additional comments regarding the pattern.
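As a reading aid, one row of the mined-patterns sheet could be modeled as a small record. This is only a sketch: the class `MinedPatternRow`, its field names, and the helper `group_by_api` are hypothetical, the `pattern` field is kept as raw cell text because the ID notation is defined in <SubjectName>_AllPatterns.txt rather than here, and the sample values are placeholders, not actual results.

```python
from collections import defaultdict
from dataclasses import dataclass

# Category values as listed in the sheet description.
ALLOWED_CATEGORIES = {"Rule", "Partial Rule", "False Positive"}


@dataclass(frozen=True)
class MinedPatternRow:
    """One row of the mined-patterns sheet (hypothetical field names)."""
    pattern_id: str
    api_name: str
    pattern: str       # raw cell text; the ID notation is not interpreted here
    support: float
    category: str
    comments: str = ""

    def __post_init__(self):
        # Reject categories that the sheet description does not list.
        if self.category not in ALLOWED_CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def group_by_api(rows):
    """Group rows by API name, preserving the input order within each group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.api_name].append(row)
    return dict(grouped)


# Placeholder rows for illustration only.
rows = [
    MinedPatternRow("1", "example.Api.first", "<ids>", 0.8, "Rule"),
    MinedPatternRow("2", "example.Api.first", "<ids>", 0.3, "False Positive"),
    MinedPatternRow("3", "example.Api.second", "<ids>", 0.6, "Partial Rule"),
]
print({api: len(group) for api, group in group_by_api(rows).items()})
```

Grouping by API name is just one convenient view; it assumes that several rows may refer to the same API method.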

Detected Violations Sheet Format: Describes results of Section 5.4 for all four pattern formats

The results are available for four subject applications: Columba, Hibernate, HsqlDB, and BCEL.

    • Filename: Name of the source file.
    • Method: Name of the method in the preceding source file that contains the violation.
    • Violated API: The API whose properties are violated.
    • Violated Pattern: Actual pattern used to detect the violation.
    • Support: Support for the associated pattern. The value ranges between 0 and 1.
    • Category: Manually identified category of the violation. Possible values are Defect and False Positive.
    • Comments: Any comments that describe the reasons for selecting the described violation category.
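The violations sheet lends itself to simple summaries, such as counting defects and false positives above a support threshold. The sketch below assumes the sheet has been exported to CSV with a header row that matches the column names above; the function `category_counts`, the `min_support` parameter, and every row in `SAMPLE_CSV` are made up for illustration and do not reflect the actual results.

```python
import csv
import io
from collections import Counter

# Made-up rows in the column layout described above (assumed CSV export).
SAMPLE_CSV = """Filename,Method,Violated API,Violated Pattern,Support,Category,Comments
Foo.java,load,example.Api.first,<ids>,0.90,Defect,placeholder row
Bar.java,save,example.Api.second,<ids>,0.40,False Positive,placeholder row
Baz.java,run,example.Api.first,<ids>,0.90,False Positive,placeholder row
"""


def category_counts(csv_text, min_support=0.0):
    """Count violations per category, keeping rows with Support >= min_support."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        support = float(row["Support"])
        # The sheet description states that support lies between 0 and 1.
        if not 0.0 <= support <= 1.0:
            raise ValueError(f"support out of range: {support}")
        if support >= min_support:
            counts[row["Category"]] += 1
    return counts


print(dict(category_counts(SAMPLE_CSV)))
print(dict(category_counts(SAMPLE_CSV, min_support=0.5)))
```

Raising `min_support` keeps only violations of higher-support patterns, which is one way to explore the trade-off between false positives and false negatives in the sheet.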

SPONSORS

Army Research Office Award Program (09/08/2008-08/30/2011)

National Science Foundation Award CNS-0725190, Computer Systems Research (CSR) Program (08/01/2007-07/31/2008)

For any details, please contact