Datasets

New datasets for domain adaptation

House-price

This dataset contains images and structural information for houses in four U.S. cities (Los Angeles, Washington, New York, and Boston), collected from 2017 to 2020. We randomly and manually selected 1,000 houses in each city. For the visual features, we kept the most representative picture from each of four perspectives: the front, bedroom, bathroom, and kitchen. For the structural information, we collected the number of bedrooms, number of bathrooms, housing area, postcode, construction year, renovation year, house type, parking space, and sunshine number of each house. Together with the actual sold price, this gives ten properties per house.
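As a rough illustration of the record layout described above, the following Python sketch models one house as ten structural properties plus four image views. The class names, field names, and field types are assumptions made for this example (the dataset's actual column names, units, and file layout are not specified here), and the image paths are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional

# The four image perspectives kept for each house.
VIEWS = ("front", "bedroom", "bathroom", "kitchen")


@dataclass
class HouseRecord:
    """Ten structural properties of one house (names and types are assumed)."""
    bedrooms: int
    bathrooms: float
    area: float                     # housing area; unit not stated in the text
    postcode: str
    construction_year: int
    renovation_year: Optional[int]  # None if the house was never renovated
    house_type: str
    parking_space: int
    sunshine_number: float          # encoding not stated in the text
    sold_price: float               # actual sold price (regression target)


# Names of the ten properties, in declaration order.
STRUCTURAL_FIELDS = [f.name for f in fields(HouseRecord)]


@dataclass
class HouseSample:
    """One house: its city (the domain), its properties, and one image per view."""
    city: str
    record: HouseRecord
    images: Dict[str, str]          # view name -> image path (hypothetical)

    def __post_init__(self) -> None:
        # Reject samples that lack any of the four perspectives.
        missing = set(VIEWS) - set(self.images)
        if missing:
            raise ValueError(f"missing image views: {sorted(missing)}")
```

Treating each city as one domain, a domain adaptation experiment would train on samples from one city and evaluate on another, with `sold_price` as the target.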

Plant-Identification

PlantCLEF 2020 contains four domains (herbarium, herbarium_photo_associations, photo, and test). The herbarium domain contains 320,750 images of 997 species, and the number of images per species is unbalanced. This domain consists of herbarium sheets, whereas the test set is composed of field pictures. The validation set consists of two domains, herbarium_photo_associations and photo. The herbarium_photo_associations domain includes 1,816 images from 244 species; it contains both herbarium sheets and field pictures for a subset of species, which enables learning a mapping between the herbarium-sheet domain and the field-picture domain. The photo domain has 4,482 images from 375 species; its images are plant pictures taken in the field, similar to the test set. The test set contains 3,186 unlabeled images.
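The domain sizes quoted above can be collected in a small lookup table, which makes the gap between the source and target domains easy to see. This is only a sketch: the dictionary name and the helper function are introduced for illustration and are not part of any PlantCLEF tooling.

```python
from typing import Optional

# Image and species counts per domain, as quoted in the description above.
# The test domain is unlabeled, so no species count is given for it.
PLANTCLEF_2020_DOMAINS = {
    "herbarium": {"images": 320_750, "species": 997},
    "herbarium_photo_associations": {"images": 1_816, "species": 244},
    "photo": {"images": 4_482, "species": 375},
    "test": {"images": 3_186, "species": None},
}


def mean_images_per_species(domain: str) -> Optional[float]:
    """Average number of images per species, or None for an unlabeled domain."""
    stats = PLANTCLEF_2020_DOMAINS[domain]
    if stats["species"] is None:
        return None
    return stats["images"] / stats["species"]


if __name__ == "__main__":
    for name in PLANTCLEF_2020_DOMAINS:
        mean = mean_images_per_species(name)
        label = "n/a" if mean is None else f"{mean:.1f}"
        print(f"{name}: {label} images per species on average")
```

The averages hide the per-species imbalance noted for the herbarium domain, so they only indicate overall scale: the herbarium domain has hundreds of images per species on average, while the two field-picture domains have roughly ten.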