Chapter 11 Future Work
A Ph.D. Thesis by Andrew Le Gear
“The future, according to some scientists, will be exactly like the past, only far more expensive.”
-John Sladek
The work of this thesis has prompted several interesting avenues of research that could form the basis for a substantial amount of future work. All the proposed hypotheses are directly related to Reconn-exion. The ideas presented are far from being fully developed, but should serve as a form of guidance for future researchers to expand beyond the work of Reconn-exion. Sections 11.1 to 11.4 explore potential future work that aims to better refine the Reconn-exion technique and copperfasten the results of the work begun by this thesis. This is followed in sections 11.5 to 11.12 by suggestions for work that will expand upon Reconn-exion.
Further studies are needed to develop a catalogue of guidelines for using Reconn-exion. Topics where guidance is needed include:
• Choice of test cases.
• Choice of features.
• Appropriate use of tool support.
• How to create an initial high-level model.
• How to create initial mappings.
Such research would create a “best practices” body of knowledge for Reconn-exion.
In the case study in section 9, the prospect of using database accesses as part of the source model is explored, although not evaluated in detail. Future work could evaluate the effects and usefulness of incorporating database accesses into the reflexion model. A further useful refinement would be to distinguish between reads and writes on particular database accesses in the analysis.
The solution presented in this thesis stops at component encapsulation. Future work exists in investigating the component wrapping stage that allows the extraction and reuse of that component. The topic was previously visited in section 3.5.5. In a publication arising from this thesis, a prototype wrapping solution using the xADL 2.0 architectural description language was implemented with success (Le Gear et al., 2004). However, this was a fairly trivial example. Much work exists in comparing wrapping solutions and in documenting best practices when wrapping.
The field of reengineering and maintenance now presents an array of useful reengineering algorithms and techniques that the software maintainer can avail of, many of which were reviewed in the literature review chapters of this thesis. They have become even more valuable as they have been integrated together into large tool sets such as Bauhaus (Koschke, 2005) and Dali (Kazman and Carrière, 1997). These tool sets will in turn become even more useful as they become integrated into widely used development environments such as Eclipse (Eclipse IDE Homepage, 2005) or Visual Studio (Microsoft, 2006c). This thesis evaluates Reconn-exion in isolation. However, much work remains in investigating its use in conjunction with other techniques as part of an architectural recovery process (Christl et al., 2005).
The reuse perspective is the union of the SHARED sets calculated for each feature. The evidence gathered in chapter 8 suggested that these SHARED sets contain core architectural elements of a system. It may be possible to create an architectural view of a system based upon the relationship between these various SHARED sets. For
example, given three features and their corresponding SHARED sets:
• Feature1, SHARED(Feature1) = {a, b, c, d, e, f}.
• Feature2, SHARED(Feature2) = {b, d, e, g, h, i}.
• Feature3, SHARED(Feature3) = {a, c, j, k, l}.
Notice that these SHARED sets overlap in places (as they would in practice):
• SHARED(Feature1) ∩ SHARED(Feature2) = {b, d, e}.
• SHARED(Feature1) ∩ SHARED(Feature3) = {a, c}.
One could then create a model of the system with high-level elements corresponding to:
• Shared by feature one only.
• Shared by feature two only.
• Shared by feature three only.
• Shared by features one and two only.
• Shared by features one and three only.
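The decomposition above amounts to plain set operations. A minimal sketch in Python, using the hypothetical SHARED sets from the worked example, assigns each software element to the exact combination of features that share it:

```python
# SHARED sets from the worked example above (illustrative data).
shared = {
    "Feature1": {"a", "b", "c", "d", "e", "f"},
    "Feature2": {"b", "d", "e", "g", "h", "i"},
    "Feature3": {"a", "c", "j", "k", "l"},
}

def decompose(shared):
    """Assign each software element to the exact set of features whose
    SHARED sets contain it, yielding the high-level regions
    ('shared by feature one only', 'shared by one and two only', ...)."""
    regions = {}
    elements = set().union(*shared.values())
    for elem in elements:
        owners = frozenset(f for f, s in shared.items() if elem in s)
        regions.setdefault(owners, set()).add(elem)
    return regions

regions = decompose(shared)
# e.g. regions[frozenset({"Feature1", "Feature2"})] == {"b", "d", "e"}
```

Each key of `regions` names one high-level element of the model, so the bullet list above is generated rather than enumerated by hand.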
Figure 11.1: Decomposing a system in terms of its SHARED sets.
Figure 11.2: Including the common software elements in the model of the system.
In this way a summary of the system could be produced, similar to a Reflexion Model, that is compared against the call graph of the system. This is shown in figure 11.1. This process, unlike other design recovery techniques, could provide an automated route from profiled features to a recovered design.
The set of common software elements (CELEMS) is the intersection of all profiles retrieved from the system. This set represents utility and initialisation code. Including this set in the model would add further meaning to the model (figure 11.2). Notice the one-way relationship that will often exist between CELEMS and the remainder of the system in figure 11.2, indicating that the source code in CELEMS will usually be executed before the remainder of the system.
Finally, the unique elements to a feature could also be included in the model, as with figure 11.3. What is important about these models of a system is that they can be generated from a behavioral specification of a system without consulting the source code in a process moving from system execution to a recovered architecture in a single, automatic step. Furthermore, the models recovered are not in terms of generic architectural concepts such as “data layer,” or “user interface,” rather a domain specific architectural recovery could potentially be achieved.
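As a sketch, both CELEMS and the per-feature unique sets can be derived directly from the feature profiles. The profile contents below are invented for illustration and are not taken from the thesis:

```python
from functools import reduce

# Hypothetical feature profiles: the software elements executed when
# exercising each feature.
profiles = {
    "Feature1": {"init", "log", "a", "b", "c"},
    "Feature2": {"init", "log", "b", "d"},
    "Feature3": {"init", "log", "a", "e"},
}

# CELEMS: elements common to every profile -- typically the utility and
# initialisation code executed before the remainder of the system.
celems = reduce(set.intersection, profiles.values())

# Unique elements: those appearing in exactly one feature's profile.
unique = {
    f: p - set().union(*(q for g, q in profiles.items() if g != f))
    for f, p in profiles.items()
}
```

Here `celems` comes out as `{"init", "log"}`, while each feature retains its own unique elements, matching the three-way decomposition of figure 11.3.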
Figure 11.3: A feature based decomposition of a software systems that shows shared, unique and common software elements of a system.
Software product lines are the underlying, generic architectures (to that domain) of software products that allow for shorter time-to-market and shorter, cheaper development cycles through the reuse of a set of domain assets across a range of applications (the product line) (Greenfield et al., 2004; Eisenbarth and Simon, 2001; Priéto-Diáz, 1991). The usefulness of the software product line has been effectively shown for new product families. However, where legacy applications already exist, migrating to a product line philosophy can be difficult (Simon and Eisenbarth, 2002) and the process of modernization (Seacord et al., 2003) unclear.
Investigating whether the SHARED and UELEMS (unique) sets can be used as a means of identifying potential software to reuse in a new product line architecture for an organisation would be useful. Of even more interest to an organisation would be the potential to identify product lines in their organisation that implicitly exist across their
products already. This could possibly be achieved through a combination of Software Reconnaissance and clone detection (section 3.3.1) in the following steps:
1. Software Reconnaissance is performed on a catalogue of products within an organisation. The SHARED and UELEMS sets identify the commonalities and variabilities for each individual product.
2. Clone detection is applied to the catalogue. This is different to the normal application of clone detection: normally, one tries to identify clones within a single existing system, whereas here clones of source code across different systems are being searched for.
3. Correspondence between clones and SHARED sets is searched for across systems. If the same, cloned, SHARED set, or a portion of it, appears in more than one product, then a portion of an existing, implicit product line within that organisation may have been identified.
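Step 3 could be sketched as follows. The products, element names, and clone pairs below are invented purely for illustration; a real clone detector would supply the clone relation:

```python
# Hypothetical SHARED sets for two products in an organisation's catalogue.
shared_a = {"parse", "validate", "report"}
shared_b = {"parse2", "check", "report2"}

# Hypothetical cross-system clone relation: pairs (element in product A,
# element in product B) judged to be clones of one another.
clones = {("parse", "parse2"), ("report", "report2"), ("validate", "fmt")}

# Elements of product A's SHARED set whose clones also lie in product B's
# SHARED set: candidate fragments of an implicit product line.
candidates = {a for a, b in clones if a in shared_a and b in shared_b}
# candidates == {"parse", "report"}
```

Note that `validate` is excluded: its clone `fmt` is not in product B's SHARED set, so it is not evidence of a shared, reused core.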
The means of identifying an interface in Reconn-exion could be extended to identify the complicated set of join points for an aspect during aspect-oriented software development (given the appropriate source model).
To implement this, a far more detailed source model would be necessary that contains data flow information, allowing the model generator to know what data is entering and leaving the aspect. The technology to achieve this goal exists in other solutions. For example, the “extract method” facility that exists in Visual Studio .Net (Microsoft,
2006a), allows one to highlight consecutive lines of code and automatically create a new method from them. The data model in the IDE determines the data items entering and leaving that region of code. This type of model is necessary for an automated aspect recovery technique, since information regarding the data entering and leaving an aspect is vital for aspect weaving.
Reflexion Modelling has traditionally been used as a means of analysing existing software that is unfamiliar to the user. Interestingly, the reverse use of this structural summarization technique could be of use as a means of design control for a software architect in a development team. The process could proceed as follows (Le Gear et al., 2006):
1. At the design phase of the software lifecycle, the software architect creates a high-level architectural model of the system.
2. During the development phase of the lifecycle, each new software element implemented is mapped to part of the high-level model.
3. Each time new mappings are made, a Reflexion Model is generated. Any divergences produced represent a violation of the prescribed architecture. Thus, architectural violations can be identified immediately when they occur. Once identified, the architect can choose to:
• Have that portion of the system altered to conform to the architecture.
• Update his architectural model to accommodate the unanticipated architectural need.
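The check in step 3 amounts to lifting each call in the source model through the mapping and testing it against the prescribed architecture. A minimal sketch, with invented element and layer names:

```python
# Hypothetical prescribed architecture: the dependencies the architect
# permits between high-level nodes.
allowed = {("UI", "Logic"), ("Logic", "Data")}

# Mapping from implemented source elements to high-level nodes.
mapping = {"MainForm": "UI", "OrderService": "Logic",
           "OrderDao": "Data", "ReportForm": "UI"}

# Calls observed in the source model as (caller, callee) pairs.
calls = [("MainForm", "OrderService"), ("OrderService", "OrderDao"),
         ("ReportForm", "OrderDao")]  # the last call skips the Logic layer

def divergences(calls, mapping, allowed):
    """Lift each call to the high-level model and report edges the
    prescribed architecture does not permit (architectural violations)."""
    out = []
    for caller, callee in calls:
        hi = (mapping[caller], mapping[callee])
        if hi[0] != hi[1] and hi not in allowed:
            out.append((caller, callee, hi))
    return out

# Flags ("ReportForm", "OrderDao", ("UI", "Data")) as a divergence.
```

Running this after every new mapping would surface violations as soon as they are introduced, at which point the architect makes one of the two choices above.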
This approach presents a viable means for an architect to control the architecture of his developing system or, at the very least, to document changes to his architecture. Gail Murphy suggested a similar idea in (Murphy et al., 2001), the key difference being that her suggestion was intended for existing systems and not to be incorporated from the beginning of the software development process.
During the evaluation it was noticed that the models developed by the participants were useful in stimulating a learning dialogue between the participants and the architects. Both sides found an opportunity to learn from these dialogues. This is particularly well illustrated in section 8.2.
This suggests that useful research could be undertaken in investigating a collaborative version of Reconn-exion or Reflexion Modelling, whereby a group of people can create high-level models and maps, and interpret their Reflexion Models, as a team. Such a process could be incorporated into the regular design meetings of a development team.
Implementing the collaborative approach would also be relatively inexpensive in effort, since all that would be necessary is for each team member to be logged in over a remote desktop, using a single application. A study by the author is currently underway to investigate this.
Reflexion modelling, as used in this thesis, is a form of structural summarization. That is, the models created by the user are compared against a source model that describes the structure of the system, and more specifically the call relationships between procedures and data sources in the source code. However, different source models can also be produced. One that is potentially useful is a source model that represents the sequence, or temporal relationship, between procedures of a program. This can easily be derived from program traces (section 4.2.1). Figure 11.4 shows a simple source model using temporal relations.
Using a source model like this, the normal Reflexion Modelling process could be implemented, except this time a software engineer would be performing a temporal summarization of some business process, rather than a structural summarization of the architecture. That is, business process recovery could be undertaken as opposed to architectural recovery. Figure 11.5 shows an example of temporal summarization using the source model in figure 11.4.
Figure 11.4: A simple temporal source model.
Figure 11.5: An example temporal summarization.
Temporal summarisation in this way could allow a software engineer to reason over a large business process in a system.
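A minimal sketch of lifting a temporal source model through a mapping follows. The procedure and step names are invented; in practice the temporal relation would be derived from program traces, and omissions and divergences would be computed exactly as in ordinary Reflexion Modelling:

```python
# Hypothetical temporal source model: (p, q) means procedure q was
# observed executing after procedure p in a program trace.
temporal = [("read_order", "check_stock"), ("check_stock", "bill_card"),
            ("bill_card", "ship_order")]

# Mapping from procedures to high-level business-process steps.
mapping = {"read_order": "Intake", "check_stock": "Intake",
           "bill_card": "Payment", "ship_order": "Fulfilment"}

# Lifting the temporal relation yields a summary of the business
# process rather than of the call structure.
summary = {(mapping[p], mapping[q]) for p, q in temporal
           if mapping[p] != mapping[q]}
# summary == {("Intake", "Payment"), ("Payment", "Fulfilment")}
```

The resulting edges read as "Intake happens before Payment, which happens before Fulfilment", i.e. a recovered business process rather than a recovered architecture.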
As explained in the previous section, the source model used in the examples in this thesis is structural (the call graph). Already in this chapter, two more types of source model have been named as necessary to implement some of the suggested future work:
• A data-flow source model for aspect recovery.
• A temporal source model for temporal summarization.
Different types of source model enable analyses to be performed from different viewpoints. Describing a system from different viewpoints is standard practice during design (Kruchten, 1995) and should be no different during design recovery. The types of source model that would be interesting to investigate further would be:
• Temporal - for business process recovery.
• State machines - automated recovery of activity diagrams or state charts.
• Events and publish/subscribe - a message-oriented model of a system, useful for design recovery on enterprise systems.
• Database access - provides a data-oriented view of a system.
• Feature mapping - a domain-oriented recovery of a system.
• Data type usage - a domain concept breakdown of a system.
• Data flow - another route to business process recovery.
• Physical deployment - allows the recovery of deployment diagrams.
• Concurrency - useful for the design recovery of state charts and sequence diagrams.
In terms of software comprehension or component recovery, richer source models can provide a software engineer with more information on dependencies, thus allowing for more informed decisions to be made.
Two interesting observations were made during evaluation that could be developed further as useful metrics and heuristics when analysing software:
1. A new reuse metric could be developed from the concept of the reuse perspective. One possible measure, for example, could capture how many features a software element is shared across. This could be an indication of domain reuse.
2. For each of the studies of Reconn-exion, it took each of the participants about ten iterations to arrive at what they felt was a finished encapsulation of the component, in spite of differences in age, experience, system size and application domain. Perhaps this is a “magic number” that could allow the number of iterations required to encapsulate a component using Reconn-exion to be estimated in general.
Of course, much work remains to develop these concepts into hypotheses. Nonetheless, they appear to be interesting avenues for future work.
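The reuse metric in point 1 could be as simple as counting, for each software element, the number of SHARED sets it appears in. The feature data below is purely illustrative:

```python
# Hypothetical SHARED sets; elements appearing in more sets are shared
# across more features and so score higher on the proposed metric.
shared = {
    "Feature1": {"a", "b", "c"},
    "Feature2": {"b", "c", "d"},
    "Feature3": {"c", "e"},
}

reuse = {}
for feature_set in shared.values():
    for elem in feature_set:
        reuse[elem] = reuse.get(elem, 0) + 1
# reuse["c"] == 3: "c" is shared across all three features, suggesting
# a high degree of domain reuse.
```

Whether such a count correlates with genuine domain reuse is, of course, exactly the hypothesis that the proposed future work would need to test.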
Component Reconn-exion by Andrew Le Gear 2006