This webpage offers two distinct datasets designed to support a range of research activities. The first dataset focuses on visual affordance understanding, providing valuable resources for studying how objects are perceived for action in various contexts. The second dataset addresses complex word identification, aiding in natural language processing tasks. Both datasets are curated to facilitate innovative research in robotics, AI, and NLP.
The physical and textural attributes of objects have been widely studied for recognition, detection, and segmentation tasks in computer vision. A number of datasets, such as the large-scale ImageNet, have been proposed both for feature learning with data-hungry deep neural networks and for hand-crafted feature extraction. To interact intelligently with objects, robots and intelligent machines need the ability to infer beyond these traditional physical/textural attributes and to understand/learn visual cues, called visual affordances, for affordance recognition, detection and segmentation. To date, there has been no publicly available large-scale dataset for visual affordance understanding and learning.
We introduce a large-scale multi-view RGBD visual affordance learning dataset: a benchmark of 47,210 RGBD images from 37 object categories, annotated with 15 visual affordance categories: 'grasp', 'wrap grasp', 'containment', 'liquid-containment', 'openable', 'dry', 'tip-push', 'display', 'illumination', 'cut', 'pourable', 'rollable', 'absorb', 'grip' and 'stapling'. The dataset also contains 35 cluttered/complex scenes with different objects and multiple affordances.
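As a rough illustration, each training example pairs an RGB image, a depth map, and a per-pixel affordance annotation. The Python sketch below assumes a hypothetical file layout and label encoding (0 for background, 1-15 for the affordance classes listed above); the actual files in the released dataset may be organised and encoded differently.

import numpy as np
from PIL import Image

# The 15 affordance categories listed above.
AFFORDANCES = [
    "grasp", "wrap grasp", "containment", "liquid-containment", "openable",
    "dry", "tip-push", "display", "illumination", "cut", "pourable",
    "rollable", "absorb", "grip", "stapling",
]

# Hypothetical file layout for one view of one object; illustrative only.
rgb = np.array(Image.open("mug/view_01_rgb.png"))          # H x W x 3 colour image
depth = np.array(Image.open("mug/view_01_depth.png"))      # H x W depth map
mask = np.array(Image.open("mug/view_01_affordance.png"))  # H x W per-pixel labels

# List the affordance categories present in this view,
# assuming 0 denotes background and 1..15 index the classes above.
present_ids = sorted(set(mask.flatten().tolist()) - {0})
print([AFFORDANCES[i - 1] for i in present_ids])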
Here are some example affordance annotations from our dataset.
-----------------------------------------------------------------------------------------------------------------------------------------------
Please cite the following papers if you use this dataset:
Hierarchical Transformer for Visual Affordance Understanding using a Large-scale Dataset - IROS2023
Syed Afaq Ali Shah and Zeyad Khalifa
Link: https://ieeexplore.ieee.org/document/10341976
A large scale multi-view RGBD visual affordance learning dataset - ICIP2023
Zeyad Khalifa and Syed Afaq Ali Shah.
-----------------------------------------------------------------------------------------------------------------------------------------------
Download Dataset
The Affordance Learning Dataset provided here is for non-commercial research/educational use only.
Download Links:
We have developed a complex word identification (CWI) dataset in binary and probabilistic settings, consisting of the same examples but with different labels: in the binary setting, a word is classified as either complex (1) or simple (0), while in the probabilistic setting, the label is a probability indicating the complexity of the word, calculated as the proportion of annotators who consider the word to be complex. The labels in the transformed dataset are as follows:
Binary setting. Original labels (1 - complex class, 0 - simple class) have been converted to "yes" and "no".
Probabilistic setting. Original labels have been converted to text, e.g., for a probability of 0.5 that a word is complex, the model has to output "0.5".
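As a rough illustration of this label conversion (not the dataset's actual processing code), the following Python sketch assumes the original complex label (1) maps to "yes" and that the probabilistic label is the proportion of annotators marking the word complex.

def binary_label_to_text(label):
    # Assumed mapping: original complex label (1) -> "yes", simple (0) -> "no".
    return "yes" if label == 1 else "no"

def probabilistic_label_to_text(annotations):
    # Proportion of annotators who consider the word complex, rendered as text.
    return str(sum(annotations) / len(annotations))

print(binary_label_to_text(1))                    # "yes"
print(probabilistic_label_to_text([1, 0, 1, 0]))  # "0.5"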
Additionally, our transformed dataset introduces two questions:
"question: complex", where the model has to predict the binary label for the word or, depending on the setting, the probability of the word being complex;
"question: simple", where the model has to predict the binary label for the word or, depending on the setting, the probability of the word being simple.
The "simple" question is the opposite of the "complex" question.
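Below is a hypothetical sketch of how one annotated word could be expanded into the two question variants in the probabilistic setting; the exact prompt wording and field layout of the released transformed dataset may differ.

def build_question_examples(word, sentence, prob_complex):
    # One "complex" and one "simple" example per annotated word; the target
    # of the "simple" question is the complementary probability.
    return [
        {"input": f"question: complex word: {word} context: {sentence}",
         "target": str(prob_complex)},
        {"input": f"question: simple word: {word} context: {sentence}",
         "target": str(round(1.0 - prob_complex, 2))},
    ]

for ex in build_question_examples("ubiquitous", "Smartphones are ubiquitous today.", 0.75):
    print(ex["input"], "->", ex["target"])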
Download Dataset and Code
Please cite the following papers if you use this dataset:
Text-To-Text Generative Approach for Enhanced Complex Word Identification. Neurocomputing 2024.
Patrycja Śliwiak and Syed Afaq Ali Shah.