Dataset

Multi-view data

METABRIC-BRCA: gene expression profiles, CNV profiles, and clinical information

Multi-modal: text and image [TIP 2023, Graph Embedding Contrastive Multi-Modal Representation Learning for Clustering]

Corel Image Features: no labels

CUHK Face Sketch FERET Database (CUFSF): face and sketch

Advertisement Dataset 

CCV: Columbia Consumer Video (CCV) Database, a benchmark for consumer video analysis

FCVID: Fudan-Columbia Video Dataset

MNIST: edge view and gray view

n-MNIST handwritten digit dataset

Prokaryotic phyla 

Other data

Corel10K and GHIM-10k 

Reuters, Cora, WebKB, Movies, Newsgroup

SenITVehicle_2views_300samples_3clusters.mat

toyViews.RData  

airlines_raw.csv.bz2, askubuntu_processed.csv.bz2

MSRC

WikipediaArticles.mat 

synthetic dataset for paper 'Auto-weighted Multi-view Constrained Spectral Clustering' 

scene.mat 

ImageNet-10 

MovieLens-1M

Jester joke dataset

VidTIMIT Audio-Video Dataset 

SensIT Vehicle (acoustic, seismic) 

Wikipedia articles: full set of 2,866 multimedia documents (image + text) and features (MATLAB format)

LINQS: CiteSeer for Document Classification, Cora, Social Spammer, Drug-Target Interaction, Stance Classification, CiteSeer for Entity Resolution, ArXiv, PubMed Diabetes, WebKB, Terrorists, Terrorist Attacks

https://github.com/thuiar/AWESOME-MSA/tree/ce3cc6f805f57a8e92c5a58d23bd73515426316b#related-datasets


Multi-label-Multi-view

COREL 5K, IAPR TC-12, ESP GAME, PASCAL VOC 2007, MIR FLICKR

MULAN package

The Extreme Classification Repository: Multi-label Datasets & Code

ESP GAME

Single-view

UL-FMTV: thermal infrared face dataset

Visual and Thermal face dataset

MPEG-7 CE Shape-1 Part B 

ALOI (1000 classes with small objects)



Defect detection

PKU-Market-Phone, PKU-Market-PCB