
Automated Testing of API Mapping Relations

PROJECT SUMMARY

Software companies and open source organizations often release their applications in multiple languages to address business requirements such as platform independence. To produce the same application in different languages, an existing application in one language, such as Java, is often translated into a different language, such as C#. To translate applications from one language (L1) to another language (L2), programmers often use automatic translation tools. These tools use Application Programming Interface (API) mapping relations from L1 to L2 as the basis for translation. During translation, it is essential that API elements (i.e., classes, methods, and fields of API libraries) of L1 and their mapped API elements of L2 (as described by API mapping relations) exhibit the same behaviors, since any inconsistency in these mapping relations can result in behavioral differences and defects in translated applications. Therefore, to detect behavioral differences between mapped API elements described in mapping relations, and thereby to translate applications more effectively, we propose the first approach to this problem, called TeMAPI (Testing Mapping relations of APIs). In particular, given a translation tool and a test-generation tool, TeMAPI automatically uses the test-generation tool to generate test cases that expose behavioral differences between mapped API elements in the mapping relations described by the translation tool. To show the effectiveness of our approach, we applied it to five popular translation tools. The results show that TeMAPI effectively detects various behavioral differences between mapped API elements. Some behavioral differences indicate defects in translation tools, and two such new defects were confirmed after we reported them. We summarize the detected differences as eight findings and their implications.
Our approach enables us to produce these findings, which can improve the effectiveness of translation tools and also assist programmers in understanding the differences between mapped API elements of different languages. To show the significance of our findings, we conducted an additional evaluation on translating real projects. The results show that existing translation tools can translate most API elements in real projects, and thus confirm the significance of our findings, since our findings are most valuable for those translatable API elements.
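As a minimal sketch of the differential-testing idea behind TeMAPI (all names below are illustrative, not the actual implementation), the outcome of a Java API call can be encoded as a string and compared against the outcome recorded for its mapped C# element; a mismatch signals a behavioral difference. Here the recorded C# outcome for int.Parse(string) on a null input is assumed for illustration:

```java
// Sketch of the differential-testing idea (illustrative only; the class and
// method names are ours, and the recorded C# outcome is an assumption).
public class DifferentialSketch {

    // Run the Java side of a mapped API pair and encode its outcome:
    // either the return value or the type of the thrown exception.
    static String javaOutcome(String input) {
        try {
            return "RET:" + Integer.parseInt(input);
        } catch (RuntimeException e) {
            return "EXN:" + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Outcome assumed to be recorded for the mapped C# element int.Parse(string):
        String csharpOutcome = "EXN:ArgumentNullException";
        String outcome = javaOutcome(null); // Java throws NumberFormatException here
        if (!outcome.equals(csharpOutcome)) {
            System.out.println("behavioral difference: " + outcome);
        }
    }
}
```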

The paper is currently under review.

PEOPLE

Hao Zhong (iTechs, ISCAS), Suresh Thummalapenta (NCSU), Tao Xie (NCSU)

SUBJECT TOOLS

WRAPPERS

Since we are the first to test behavioral differences of API methods between two different languages, no previous benchmark exists. Therefore, we release all the wrappers used in our approach to serve as a benchmark for future work.

Java to CSharp benchmarks

CSharp to Java benchmarks

RESULT 1: API TRANSLATION CAPABILITY.

This table lists translatable API methods and fields for each translation tool. Please click the corresponding worksheet for the list of each tool:

Name format of API methods and fields

API method name = class name + method name + (parameter types), e.g., java.lang.Byte.equals(java.lang.Object)

API constructor name = class name + .<init> + (parameter types), e.g., java.lang.Byte.<init>(java.lang.String)

API field name = class name + .<get> + field name, e.g., java.lang.Long.<get>MIN_VALUE
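For illustration of the <get> notation (the wrapper shape below is our assumption based on the naming convention, not necessarily the released wrappers), a field read such as java.lang.Long.MIN_VALUE can be wrapped in a getter-style method so that it can be tested and translated like a method call:

```java
// Illustrative wrapper for a field access; corresponds to the name
// java.lang.Long.<get>MIN_VALUE in the convention above (shape assumed).
public class FieldWrapperExample {
    public static long getMinValue() {
        return Long.MIN_VALUE; // the wrapped field read
    }
    public static void main(String[] args) {
        System.out.println(getMinValue());
    }
}
```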

Please note that the list for each tool is shorter than the list of translatable wrappers. Consider the following six wrappers:

public w1(){ c obj = c1(...); obj.m1(...); }

public w2(){ c obj = c1(...); obj.m2(...); }

public w3(){ c obj = c1(...); obj.m3(...); }

public w4(){ c obj = c2(...); obj.m1(...); }

public w5(){ c obj = c2(...); obj.m2(...); }

public w6(){ c obj = c1(...); obj.m3(...); }

If all six wrappers are translatable, then five API methods/fields in total (i.e., c1, c2, m1, m2, m3) are translatable.
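To make the counting concrete, here is one possible instantiation of the six schematic wrappers, using java.lang.StringBuilder purely for illustration (the class and members are our choice, not from the released benchmark): the two constructors play the roles of c1 and c2, and three methods play the roles of m1, m2, and m3, so the six wrappers together exercise exactly five API elements.

```java
// One possible concrete instance of the six schematic wrappers (illustrative:
// StringBuilder and its members stand in for c, c1/c2, and m1/m2/m3).
public class SixWrappers {
    public static String w1() { StringBuilder obj = new StringBuilder();     obj.append("x"); return obj.toString(); } // c1, m1
    public static String w2() { StringBuilder obj = new StringBuilder();     obj.reverse();   return obj.toString(); } // c1, m2
    public static int    w3() { StringBuilder obj = new StringBuilder();     return obj.length();                    } // c1, m3
    public static String w4() { StringBuilder obj = new StringBuilder("ab"); obj.append("x"); return obj.toString(); } // c2, m1
    public static String w5() { StringBuilder obj = new StringBuilder("ab"); obj.reverse();   return obj.toString(); } // c2, m2
    public static int    w6() { StringBuilder obj = new StringBuilder();     return obj.length();                    } // c1, m3
}
```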

RESULT 2: BEHAVIORAL DIFFERENCES OF API MAPPING RELATIONS.

1. Generating C# Test Cases for Java Code

Extended tool 1: Pex.

Figure 1 Distribution of found unique behavioral differences by Pex

(1) 36.8% test cases show the behavioral differences caused by null inputs.

(2) 22.3% test cases show the behavioral differences caused by stored string values.

(3) 11.5% test cases show the behavioral differences caused by illegal inputs or inputs out of ranges.

(4) 10.7% test cases show the behavioral differences caused by different understandings.

(5) 7.9% test cases show the behavioral differences caused by exception handling.

(6) 2.9% test cases show the behavioral differences caused by static values.

(7) 7.9% failing test cases are related to the API methods that can return random values or values that depend on time.
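As a hedged illustration of category (2) above, "stored string values": the mapped string-conversion APIs can render the same number differently. The Java side below is executable; the contrasting C# behavior (double.ToString() rendering the value as "10000000") is stated from general knowledge of the .NET libraries, not executed here.

```java
// Illustrative stored-string-value difference (Java side only; the C#
// behavior described in the comment is from general knowledge).
public class StoredStringDiff {
    public static void main(String[] args) {
        // Java's Double.toString switches to scientific notation at 1e7;
        // C#'s double.ToString() would render the same value as "10000000".
        System.out.println(Double.toString(1.0e7)); // prints 1.0E7
    }
}
```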

2. Generating Java Test Cases for C# Code

Extended tool 2: Randoop

Figure 2 Distribution of found behavioral differences by Randoop

(1) 45.0% test cases show the behavioral differences caused by illegal inputs or inputs out of ranges.

(2) 34.0% test cases show the behavioral differences caused by stored string values.

(3) 5.3% test cases show the behavioral differences caused by different understandings.

(4) 4.0% test cases show the behavioral differences caused by exception handling.

(5) 3.0% test cases show the behavioral differences caused by null inputs.

(6) 2.0% test cases show the behavioral differences caused by static values.

(7) 0.3% failing test cases are related to the API methods that can return random values or values that depend on time.

(8) 3.4% test cases fail because of invocation sequences.

(9) 3.0% test cases fail because translation tools such as Java2CSharp translate Java API elements to C# API elements that are not yet implemented.
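As a hedged sketch of category (1) above, "illegal inputs or inputs out of ranges": mapped constructors can reject the same out-of-range argument with differently typed exceptions, which a generated test can observe. The Java side is executed below; the C# side (new List&lt;T&gt; with a negative capacity throwing ArgumentOutOfRangeException) is stated from general knowledge, not executed.

```java
import java.util.ArrayList;

// Illustrative illegal-input difference: Java's ArrayList rejects a negative
// capacity with IllegalArgumentException, whereas C#'s List<T> constructor
// throws ArgumentOutOfRangeException (C# side from general knowledge).
public class IllegalInputDiff {
    static String exceptionFor(int capacity) {
        try {
            new ArrayList<Object>(capacity);
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }
    public static void main(String[] args) {
        System.out.println(exceptionFor(-1)); // prints IllegalArgumentException
    }
}
```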

When generating test cases, Pex explores feasible paths, whereas Randoop uses a feedback-directed random strategy. As a result, the distribution in Figure 1 differs from the distribution in Figure 2. The distribution in Figure 1 is more reasonable, since each feasible path reflects a unique behavior and each failing test case reflects a unique behavioral difference, whereas the distribution in Figure 2 is affected by redundancies in the test cases generated by Randoop. Still, Randoop's random strategy helps find another behavioral difference, as described in the submitted draft.