Discussion Notes SQ 2009 Week 9

Parseweb: a programmer assistant for reusing open source code on the web.

Thummalapenta, S. and Xie, T. 2007.

In Proceedings of the Twenty-Second IEEE/ACM International Conference on Automated Software Engineering (Atlanta, Georgia, USA, November 05 - 09, 2007).

ASE '07. ACM, New York, NY, 204-213.

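Basic idea: transform an object of an "input type" (what the developer has) into an object of a "final type" (what the developer needs).

This is a common problem: developers know what they have and what they need, but not the method calls that connect the two.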

Key Concepts:

- Mine from examples (open source files)

- Analyze the code using backward slicing

- Rank the results by frequency

- Recommend solutions to the user

Advantage:

- Instead of mining API signatures like other related work, it mines code examples.

This handles the case where an object must be downcast before a method can be called on it.

DAG: directed acyclic graph

- Acts as a control flow graph, but ignores all loops

- Each node is a statement or a type; each edge is a type transformation.
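
A minimal sketch of the type-transformation idea, not Parseweb's actual implementation: types are nodes, each edge is labeled with the call (or cast) that performs the transformation, and a breadth-first search finds the shortest chain from the input type to the final type. The `EDGES` table and the function name `find_transformation` are hypothetical; the Eclipse type names are used only as an illustration.

```python
from collections import deque

# Hypothetical graph: type -> list of (resulting type, call that produces it).
EDGES = {
    "IEditorPart": [("IEditorInput", "getEditorInput()")],
    "IEditorInput": [("IFileEditorInput", "(IFileEditorInput) cast")],
    "IFileEditorInput": [("IFile", "getFile()")],
}

def find_transformation(source, target, edges):
    """Breadth-first search for the shortest call chain from source to target."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        current, path = queue.popleft()
        if current == target:
            return path
        for next_type, call in edges.get(current, []):
            if next_type not in seen:
                seen.add(next_type)
                queue.append((next_type, path + [call]))
    return None  # no chain of transformations reaches the target

print(find_transformation("IEditorPart", "IFile", EDGES))
```

Because the graph is acyclic, the search always terminates; the `seen` set only avoids revisiting a type that is reachable along more than one path.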

Type Heuristic:

- Because an object's class may be a subclass of many classes, it is difficult to know the exact class of the object.

- Use other nearby method calls to guess the exact type of the object
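
One way to picture this heuristic, as a sketch under assumptions rather than the paper's exact rules: keep only the candidate classes that declare every method called on the object nearby. The `CANDIDATES` table is hypothetical and abbreviated, and `guess_type` is an illustrative name.

```python
# Hypothetical, abbreviated class table: class name -> methods it declares.
CANDIDATES = {
    "IEditorInput": {"getName", "exists"},
    "IFileEditorInput": {"getName", "exists", "getFile"},
}

def guess_type(nearby_calls, candidates):
    """Return the candidate classes that declare every nearby method call."""
    calls = set(nearby_calls)
    return sorted(name for name, methods in candidates.items() if calls <= methods)

# A call to getFile() rules out the more general type.
print(guess_type(["getFile"], CANDIDATES))
```

If several candidates remain, the nearby calls alone are not enough to decide and some other tie-breaker is needed.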

Ranking:

- Rank by size (length of the method sequence) and popularity (how often the sequence occurs)

- Use the first 200 results from Google Code Search as the relevant results
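
A rough sketch of ranking by these two criteria. It assumes that more frequent sequences rank higher and that, among equally frequent ones, shorter sequences rank higher; that ordering and the name `rank_sequences` are assumptions of this sketch, not details taken from these notes.

```python
from collections import Counter

def rank_sequences(mined):
    """Order distinct method sequences by frequency (descending), then length (ascending)."""
    counts = Counter(tuple(seq) for seq in mined)
    return sorted(counts, key=lambda seq: (-counts[seq], len(seq)))

# Hypothetical sequences mined from downloaded code samples.
mined = [
    ["a()", "b()"],
    ["c()"],
    ["a()", "b()"],
    ["d()", "e()", "f()"],
]

for seq in rank_sequences(mined):
    print(" -> ".join(seq))
```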

Is Perl the best text processor?

- Syntax:

programs take fewer lines, but are not easy to write because every variable needs a symbol prefix and the syntax is complex.

- Automatic type conversion:

hard to keep track of the type of each variable.

- Regular expression capability:

regex support is built in, but other programming languages now have regex functions too

- Google uses Python
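
To illustrate the point that regex support is no longer unique to Perl, here is a small Python sketch using the standard `re` module to pull method-call names out of one line of Java source. The sample line and the name `method_calls` are made up for illustration, and a regex like this is only a rough approximation of real parsing.

```python
import re

# Matches ".name(" and captures the method name.
CALL_PATTERN = re.compile(r"\.(\w+)\(")

def method_calls(line):
    """Return the names of methods invoked with dot syntax in a line of source."""
    return CALL_PATTERN.findall(line)

line = "IFile f = ((IFileEditorInput) part.getEditorInput()).getFile();"
print(method_calls(line))
```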

Using the Google Code Search Engine to retrieve relevant data

provides Parseweb with fast, scalable, and fresh search results.