LLMs4OL 2026: Large Language Models for Ontology Learning

The 3rd LLMs4OL Challenge @ ISWC 2026

ISWC 2026, Bari, Italy | 25-29 October

Reuse Task

Ontology Extension

Definition: Given a partial ontology, add new concepts and relations extracted from text.

Motivation: In practice, ontologies are rarely created from scratch, domains evolve over time, and knowledge grows incrementally. This task tests whether systems can adapt and expand an ontology, not just generate one from raw text.

Objective

Given an existing ontology, the participants must extend it using the given new terms and types with accompanying raw text. Unlike the End-to-End OL task, this task starts from a partially built ontology and focuses on incremental knowledge addition rather than constructing a complete ontology from scratch.

What must participants build?

A system that, starting from unstructured text, extends a primitive ontology through the following stages:

Term Typing (mapping instances to classes)
Taxonomic Discovery (is-a/subclass)
Non-taxonomic relations

All stages must be derived automatically from the text and the given primitive ontology via an integrated pipeline.

A toy example

Provided raw text in input:

A humidity sensor monitors humidity in the house. The smart humidifier receives readings from the humidity sensor and adjusts the water flow. A mobile app allows the user to control the humidifier remotely.

Provided terms:

user, mobile app, humidity level

Provided types:

application, person, measurement, humidity sensor

Existing ontology:

(thermostat, instance-of, device)

(temperature sensor, instance-of, sensor)

(sensor, is-a, device)

(system, is-a, device)

➡️ 1. Term Typing

humidity level ---> measurement

user ---> person

mobile app ---> application

➡️ 2. Taxonomic Discovery

(application, is-a, device)

➡️ 3. Non-Taxonomic RE

None

The final output would be as follows:

(humidity level, instance-of, measurement)

(user, instance-of, person)

(mobile app, instance-of, application)

(application, is-a, device)
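
The assembly of the final output in the toy example can be sketched in Python. This is only an illustration under assumptions: the function name `assemble_extension` and the in-memory representation (tuples of strings) are hypothetical and not part of the challenge kit, and the typing of "humidity level" follows the Term Typing step above.

```python
# Hypothetical helper: merges the three stage outputs into new triples.
Triple = tuple[str, str, str]


def assemble_extension(
    typings: dict[str, str],
    taxonomy: list[tuple[str, str]],
    relations: list[Triple],
    existing: list[Triple],
) -> list[Triple]:
    """Return stage outputs as triples, skipping duplicates and existing triples."""
    candidates: list[Triple] = []
    # Stage 1: term typing becomes instance-of triples.
    candidates += [(term, "instance-of", typ) for term, typ in typings.items()]
    # Stage 2: taxonomic discovery becomes is-a triples.
    candidates += [(child, "is-a", parent) for child, parent in taxonomy]
    # Stage 3: non-taxonomic relations are already triples.
    candidates += relations

    seen = set(existing)
    new_triples: list[Triple] = []
    for triple in candidates:
        if triple not in seen:
            seen.add(triple)
            new_triples.append(triple)
    return new_triples


existing = [
    ("thermostat", "instance-of", "device"),
    ("temperature sensor", "instance-of", "sensor"),
    ("sensor", "is-a", "device"),
    ("system", "is-a", "device"),
]
typings = {
    "humidity level": "measurement",
    "user": "person",
    "mobile app": "application",
}
taxonomy = [("application", "device")]

extension = assemble_extension(typings, taxonomy, [], existing)
for triple in extension:
    print(triple)
```

Only newly added triples are returned, which mirrors the toy output: the existing ontology triples are not repeated.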

Dataset

The training set for this task consists of 2,774 samples, each with an id, a context, initial-primitive-ontology-triples (the existing ontology triples), and extended-primitive-ontology-triples (the expected triples). Participants can use the training data for fine-tuning or for developing their own approaches. Each test sample contains only the id, the context, the initial-primitive-ontology-triples, and the candidate terms and types; participants submit the extended-primitive-ontology-triples for evaluation.

The following is an example from the dataset:

{

'id': 'f6413e1acd254d02ad9b52153daa9d16',

'context': "Title: Agricultural Product Classifications: Lime Derivatives and Frozen Meat Types \n\n Content: \nIn the study of food agriculture, precise categorization is essential for inventory and culinary standards. The classification system groups various items based on their botanical origin and processing state. For citrus-derived goods, the fresh lime is fundamentally defined as a lime fruit food product. This parent category encompasses a wide range of derivatives; for instance, a lime drink mix is formulated and sold as a lime fruit food product to provide convenient flavoring. Similarly, spreadable options such as a lime preserve or jam food product are also categorized as a lime fruit food product, highlighting the fruit's versatility in confectionery applications. Additionally, industrial processing often yields a lime juice beverage base, which is explicitly classified as a lime fruit food product to ensure consistency in soft drink production. In the protein sector, classification relies heavily on physical state and preparation methods. When dealing with frozen meats, distinctions are drawn between whole cuts and processed forms. Specifically, the animal meat (ground, fresh frozen) is a specialized preparation that falls under the broader category of piece(s) of animal meat (frozen). This hierarchical relationship ensures that animal meat (ground, fresh frozen) is correctly identified as a variant of piece(s) of animal meat (frozen), facilitating accurate storage requirements and quality control within the supply chain.",

'terms': [],

'types': ['lime fruit food product', 'lime juice beverage base', 'lime preserve or jam food product'],

'initial-primitive-ontology-triples': [ ['lime drink mix', 'is-a', 'lime fruit food product'],

['lime', 'is-a', 'lime fruit food product'],

['animal meat (ground, fresh frozen)', 'is-a', 'piece(s) of animal meat (frozen)']],

'extended-primitive-ontology-triples': [['lime preserve or jam food product', 'is-a', 'lime fruit food product'],

['lime juice beverage base', 'is-a', 'lime fruit food product']]

}

Each context consists of a title and a content body, combined into a single string that describes the primitive ontology. The content is a short scientific-style text that elaborates on the title.
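
As a minimal sketch of working with one sample, the snippet below splits a context into its title and content and lists the provided terms and types that do not yet appear in the initial ontology. The helper names `split_context` and `unplaced_names` are hypothetical, the "Title:" / "Content:" markers are assumed from the single example shown here, and the context string is truncated to its first sentence for brevity.

```python
def split_context(context: str) -> tuple[str, str]:
    """Split a context into (title, content), assuming 'Title:' and 'Content:' markers."""
    head, sep, body = context.partition("Content:")
    if not sep:
        # No marker found: treat the whole string as content.
        return "", context.strip()
    title = head.strip()
    if title.startswith("Title:"):
        title = title[len("Title:"):].strip()
    return title, body.strip()


def unplaced_names(sample: dict) -> list[str]:
    """Provided terms/types that do not occur in any initial ontology triple."""
    known = set()
    for head, _relation, tail in sample["initial-primitive-ontology-triples"]:
        known.update((head, tail))
    return [name for name in sample["terms"] + sample["types"] if name not in known]


sample = {
    "id": "f6413e1acd254d02ad9b52153daa9d16",
    # Context truncated to its first sentence for this illustration.
    "context": (
        "Title: Agricultural Product Classifications: Lime Derivatives and "
        "Frozen Meat Types \n\n Content: \nIn the study of food agriculture, "
        "precise categorization is essential for inventory and culinary standards."
    ),
    "terms": [],
    "types": [
        "lime fruit food product",
        "lime juice beverage base",
        "lime preserve or jam food product",
    ],
    "initial-primitive-ontology-triples": [
        ["lime drink mix", "is-a", "lime fruit food product"],
        ["lime", "is-a", "lime fruit food product"],
        ["animal meat (ground, fresh frozen)", "is-a", "piece(s) of animal meat (frozen)"],
    ],
}

title, content = split_context(sample["context"])
todo = unplaced_names(sample)
print(title)
print(todo)
```

In this sample the two names left over are exactly the heads of the expected extended triples. That is an observation about one example, not a documented property of the dataset.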

The Reuse Task train dataset is available for download via https://github.com/sciknoworg/LLMs4OL-Challenge/tree/main/2026/TaskB-Reuse

Important notes:

The dataset spans multiple domains. Domain labels are not provided in the dataset; modeling from a multi-domain perspective is worth considering, but it is not mandatory.
The dataset went through a multi-phase quality check to make sure it is well suited for modeling. This also makes it a good basis for data augmentation, if needed.
Note that a sample may have an empty terms or types list; for instance, some samples have only types.
List of ontologies to avoid using for training, in compliance with the challenge policy: OBI, FOAF, CopyrightOnto, Metadata4Ing, PROCO, PTO, SWEET, SPDocument, MDSOnto, MatOnto, AgrO, TimelineOntology, MusicOntology, GTS, PeriodicTable, GND, QUDT, SchemaOrg, GeoNames, FoodOn, DOID, GoodRelations, BFO, ENVO, Conference, VIMMP, VIBSO, OM, DoCO, AUTO, Wine, DOLCE, CCO, DBpedia, MaterialInformation, LexInfo.

Evaluation Metrics

Standard Metrics: Precision, Recall, F1
Standard Task-specific Metrics: Precision, Recall, and F1 scores for Term Typing, Taxonomic Discovery, and Non-Taxonomic RE
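
For local validation, precision, recall, and F1 over triples can be sketched as below. This is not the official scorer: exact string matching on whole triples and the mapping from relation name to task ("instance-of" to Term Typing, "is-a" to Taxonomic Discovery, everything else to Non-Taxonomic RE) are assumptions, and the function names are hypothetical.

```python
def prf1(predicted, gold):
    """Set-based precision, recall, and F1 with exact triple matching (assumed protocol)."""
    pred = {tuple(t) for t in predicted}
    ref = {tuple(t) for t in gold}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def task_of(triple):
    """Assumed mapping from a triple's relation to the task it belongs to."""
    relation = triple[1]
    if relation == "instance-of":
        return "term-typing"
    if relation == "is-a":
        return "taxonomy"
    return "non-taxonomic"


def prf1_by_task(predicted, gold):
    """Score each task separately by filtering triples on their relation."""
    scores = {}
    for task in ("term-typing", "taxonomy", "non-taxonomic"):
        scores[task] = prf1(
            [t for t in predicted if task_of(t) == task],
            [t for t in gold if task_of(t) == task],
        )
    return scores


gold = [
    ["lime preserve or jam food product", "is-a", "lime fruit food product"],
    ["lime juice beverage base", "is-a", "lime fruit food product"],
]
# Hypothetical system output: one correct triple and one wrong triple.
predicted = [
    ["lime juice beverage base", "is-a", "lime fruit food product"],
    ["lime", "is-a", "lime juice beverage base"],
]

overall = prf1(predicted, gold)
per_task = prf1_by_task(predicted, gold)
print(overall)
```

With one of two predictions correct and one of two gold triples recovered, precision, recall, and F1 are all 0.5 in this example.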

What approaches can be developed?

Graph-aware reasoning: Leverages the existing ontology structure to infer new relationships through connectivity and transitive patterns.
Multi-task learning: Trains a single model to jointly predict multiple ontology tasks such as term typing and taxonomy relations.
Pattern mining: Extracts recurring linguistic patterns from text to identify ontology relationships with high precision.
Constrained LLM: Guides the LLM to generate triples within the predefined terms, types, and ontology rules to reduce hallucination.
or ...
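
To illustrate the graph-aware reasoning idea from the list above, the sketch below follows is-a edges transitively to collect every class an instance belongs to. It assumes that instance-of propagates along is-a (as with class membership and subclassing in RDFS); the challenge description does not state this, and the function names are hypothetical.

```python
from collections import defaultdict


def ancestors(triples, node):
    """All classes reachable from `node` by following is-a edges."""
    parents = defaultdict(set)
    for head, relation, tail in triples:
        if relation == "is-a":
            parents[head].add(tail)
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:  # also guards against is-a cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(triples, term):
    """Direct types of `term` plus their is-a ancestors (assumed semantics)."""
    direct = {
        tail
        for head, relation, tail in triples
        if head == term and relation == "instance-of"
    }
    result = set(direct)
    for typ in direct:
        result |= ancestors(triples, typ)
    return result


# Toy ontology from the example: existing triples plus two extension triples.
ontology = [
    ("thermostat", "instance-of", "device"),
    ("temperature sensor", "instance-of", "sensor"),
    ("sensor", "is-a", "device"),
    ("system", "is-a", "device"),
    ("mobile app", "instance-of", "application"),
    ("application", "is-a", "device"),
]

print(sorted(inferred_types(ontology, "temperature sensor")))
print(sorted(inferred_types(ontology, "mobile app")))
```

Such inferred memberships could serve as features or consistency checks. Whether implied triples should be submitted is a separate question that this sketch does not answer.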

There is no restriction in terms of approaches!
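
As one possible reading of the constrained-generation idea, candidate triples produced by a model can be filtered after the fact against the provided vocabulary. The rule used here (both ends of a triple must be a provided term, a provided type, or an entity already in the ontology) is an assumption rather than a challenge rule, and `constrain` is a hypothetical helper.

```python
def constrain(candidates, terms, types, existing):
    """Keep candidates whose head and tail are in the known vocabulary.

    Assumed rule: vocabulary = provided terms + provided types + entities
    that already occur in the existing ontology.
    """
    existing_set = {tuple(t) for t in existing}
    vocabulary = set(terms) | set(types)
    for head, _relation, tail in existing_set:
        vocabulary.update((head, tail))

    kept = []
    for candidate in candidates:
        head, _relation, tail = candidate
        triple = tuple(candidate)
        if head == tail:
            continue  # drop self-loops
        if head not in vocabulary or tail not in vocabulary:
            continue  # drop triples that mention unknown entities
        if triple in existing_set or triple in kept:
            continue  # drop triples already present or duplicated
        kept.append(triple)
    return kept


terms = ["user", "mobile app", "humidity level"]
types = ["application", "person", "measurement", "humidity sensor"]
existing = [
    ("thermostat", "instance-of", "device"),
    ("temperature sensor", "instance-of", "sensor"),
    ("sensor", "is-a", "device"),
    ("system", "is-a", "device"),
]
# Hypothetical raw model output, including one out-of-vocabulary type
# and one triple that is already in the ontology.
candidates = [
    ("mobile app", "instance-of", "application"),
    ("mobile app", "instance-of", "software"),
    ("sensor", "is-a", "device"),
    ("user", "instance-of", "person"),
]

filtered = constrain(candidates, terms, types, existing)
print(filtered)
```

Filtering after generation is the simplest variant; constraining the decoding itself is another option that this sketch does not cover.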
