Abstract
Ontology matching (OM) is a key obstacle to overcome before, for example, Semantic Web standards can be used across the World Wide Web. Several kinds of OM techniques exist. Instance-based OM (IBOM) is a promising technique that is gaining popularity amongst researchers. IBOM uses the extensions of concepts to determine whether a pair of concepts is related. The extension of a concept is the set of instances with which that concept is associated.
While IBOM has many strengths, one weakness is that matching two ontologies requires a suitable data-set, which generally means instances that are associated with concepts of both ontologies, i.e. dually annotated instances. In practice, instances are often associated with concepts of a single ontology only, rendering IBOM rarely applicable. In this thesis, however, we propose a method that enables IBOM using two disjoint data-sets. This is done by enriching every instance of each data-set with the concept associations of the most similar instances from the other data-set, thereby creating dually annotated instances. We call this technique instance-based ontology matching by instance enrichment (IBOMbIE). The IBOMbIE method has proved successful, making it a promising direction for IBOM research.
We have applied the IBOMbIE algorithm to two real-life scenarios, in which large data-sets are used to match the ontologies of European libraries. In both scenarios we have invaluable gold standards and dually annotated instances at our disposal, which we use to evaluate the resulting alignments. Using these evaluation techniques, we test the impact and significance of several design choices of the IBOMbIE algorithm, such as the instance similarity measure and the number of instances used to enrich an instance. Finally, we compare the IBOMbIE algorithm to other OM algorithms.
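The enrichment idea summarised above can be illustrated with a minimal sketch. It is not the implementation evaluated in this thesis: the instance representation (a set of feature tokens plus a set of concept labels), the use of Jaccard similarity both between instances and between concept extensions, the parameter name k, and all names and example data are assumptions made purely for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def enrich(source, target, k=1):
    """Return copies of the source instances, each extended with the
    concepts of its k most similar target instances.
    Instance similarity is Jaccard over feature tokens (an assumption)."""
    enriched = []
    for inst in source:
        ranked = sorted(
            target,
            key=lambda t: jaccard(inst["tokens"], t["tokens"]),
            reverse=True,
        )
        borrowed = set()
        for neighbour in ranked[:k]:
            borrowed |= neighbour["concepts"]
        enriched.append(
            {"tokens": inst["tokens"], "concepts": inst["concepts"] | borrowed}
        )
    return enriched


def concept_similarity(instances, concept_a, concept_b):
    """Compare two concepts by the overlap of their extensions,
    i.e. the sets of instances annotated with each concept."""
    ext_a = {i for i, inst in enumerate(instances) if concept_a in inst["concepts"]}
    ext_b = {i for i, inst in enumerate(instances) if concept_b in inst["concepts"]}
    return jaccard(ext_a, ext_b)


# Hypothetical toy data: two disjoint data-sets, each annotated with
# concepts from a single ontology (prefixes "A:" and "B:").
library_a = [
    {"tokens": {"dutch", "painting", "rembrandt"}, "concepts": {"A:Art"}},
    {"tokens": {"quantum", "physics", "introduction"}, "concepts": {"A:Science"}},
]
library_b = [
    {"tokens": {"rembrandt", "painting", "biography"}, "concepts": {"B:Painters"}},
    {"tokens": {"physics", "quantum", "mechanics"}, "concepts": {"B:Physics"}},
]

# Enrich in both directions, then pool the dually annotated instances.
dual = enrich(library_a, library_b, k=1) + enrich(library_b, library_a, k=1)

art_vs_painters = concept_similarity(dual, "A:Art", "B:Painters")
art_vs_physics = concept_similarity(dual, "A:Art", "B:Physics")
```

In this toy setting, k plays the role of the number of instances used to enrich an instance, and swapping out jaccard in enrich corresponds to changing the instance similarity measure, the two design choices named above.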