SOVA is supported by a three-step process.
First, the (textual) artifacts are parsed. The parsing step uses natural language processing (NLP) techniques, including semantic role labeling, temporal ordering, and pronoun replacement, as well as an ontological model.
Second, the similarity of the parsed behaviors is calculated. Because behavior is perceived from an external point of view, the behavioral similarity is defined as the weighted average of the semantic similarities of the initial states (pre-conditions), external events (triggers), and final states (post-conditions). The semantic similarity calculation can follow different knowledge-based or corpus-based measures.
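The weighted average described in the second step can be illustrated with a minimal sketch. The names `Behavior`, `jaccard`, and `behavior_similarity` are hypothetical, the weights are arbitrary, and the token-overlap (Jaccard) measure is only a stand-in for whichever knowledge-based or corpus-based semantic similarity measure is actually chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    """A parsed behavior, viewed externally (hypothetical structure)."""
    initial_state: str   # pre-condition
    external_event: str  # trigger
    final_state: str     # post-condition


def jaccard(a: str, b: str) -> float:
    """Placeholder similarity: overlap of lower-cased tokens.

    Assumption: a real setup would plug in a semantic measure here.
    """
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def behavior_similarity(b1, b2, weights=(1.0, 1.0, 1.0), sim=jaccard) -> float:
    """Weighted average of the three component similarities."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    parts = (
        sim(b1.initial_state, b2.initial_state),
        sim(b1.external_event, b2.external_event),
        sim(b1.final_state, b2.final_state),
    )
    return sum(w * s for w, s in zip(weights, parts)) / total


borrow = Behavior("book is available", "user borrows book", "book is borrowed")
give_back = Behavior("book is available", "user returns book", "book is available")

# Component similarities are 1.0, 0.5, and 0.5; weighting the initial
# state twice as heavily gives (2*1.0 + 0.5 + 0.5) / 4 = 0.75.
score = behavior_similarity(borrow, give_back, weights=(2.0, 1.0, 1.0))
```

Passing the similarity function as a parameter keeps the weighting scheme independent of the chosen semantic measure.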
Finally, a feature diagram capturing the variability of the software products is created using a hierarchical agglomerative clustering algorithm. The created feature diagrams are the basis of the domain model.
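As a rough illustration of the clustering step, the sketch below implements a generic hierarchical agglomerative clustering with average linkage over an arbitrary pairwise similarity function. The function name `agglomerate`, the linkage criterion, and the stopping threshold are assumptions made for the example; the resulting nested tuples are only a cluster hierarchy, and turning such a hierarchy into an actual feature diagram involves further steps not shown here.

```python
def agglomerate(items, sim, threshold=0.0):
    """Average-linkage agglomerative clustering (illustrative sketch).

    Repeatedly merges the two most similar clusters until one cluster
    remains or the best similarity falls below `threshold`.
    Returns a list of trees; a merged cluster is a 2-tuple of subtrees.
    """
    # Each cluster is (tuple of member indices, tree built so far).
    clusters = [((i,), item) for i, item in enumerate(items)]

    def linkage(a, b):
        # Mean pairwise similarity between the members of two clusters.
        return sum(sim(items[i], items[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x][0], clusters[y][0])
                if best is None or s > best[0]:
                    best = (s, x, y)
        s, x, y = best
        if s < threshold:
            break
        merged = (clusters[x][0] + clusters[y][0], (clusters[x][1], clusters[y][1]))
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]

    return [tree for _, tree in clusters]


# Toy data: numbers stand in for behaviors, closeness for similarity.
def closeness(a, b):
    return 1.0 / (1.0 + abs(a - b))


full_tree = agglomerate([1, 2, 10, 11], closeness)
groups = agglomerate([1, 2, 10, 11], closeness, threshold=0.3)
```

With the toy data, `full_tree` holds a single hierarchy in which 1 and 2 are merged first, then 10 and 11, while the threshold of 0.3 stops before the two groups are joined. In practice `sim` would be the behavioral similarity from the second step.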