Grounded Image Editing Request (GIER)

A language-driven image editing dataset that supports local and global editing with unconstrained images and free-form language

Overview


An overview of the GIER dataset. Each data sample is a triplet consisting of a source image, a target image, and a language request. We provide annotations of the possible editing operations, the operation type (global or local), and the mask corresponding to each local operation. We also provide additional language requests collected from Photoshop experts and amateurs. The data triplets are crawled from Zhopped and Reddit; the Photoshop experts are hired through Upwork and the amateurs through Amazon Mechanical Turk (AMT).

Statistics

The following figures show the distribution of the different editing operations.




  • Unique image pairs: 6179

  • Average number of requests per image: 4.83

  • Average number of operations per edit: 3.21

Download

GIER.json: contains all image URLs, requests, and operation annotations.
split.json: contains the train/val/test split.
images.zip: contains all the images.
masks.zip: contains all the masks.
features.zip: contains all the features.

For specific usage, please refer to our repository.
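As a first look at the downloaded annotation files, the following minimal Python sketch loads a JSON file and reports its top-level structure. The schema of GIER.json and split.json is not described on this page, so the sketch deliberately assumes no field names; the helper name `summarize_json` is hypothetical and is not part of the official repository.

```python
import json
from pathlib import Path


def summarize_json(path, max_keys=5):
    """Load a JSON file and return a small summary of its top-level structure.

    No GIER-specific field names are assumed here; the function only
    reports what it finds in the file.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    is_container = isinstance(data, (dict, list))
    summary = {
        "type": type(data).__name__,
        "size": len(data) if is_container else None,
    }
    if isinstance(data, dict):
        # Top-level object: show the first few keys.
        summary["sample_keys"] = list(data)[:max_keys]
    elif isinstance(data, list) and data and isinstance(data[0], dict):
        # Top-level list of records: show the keys of the first record.
        summary["sample_keys"] = list(data[0])[:max_keys]
    return summary


if __name__ == "__main__":
    # File names as listed in the Download section; paths assume the
    # files sit in the current working directory.
    for name in ("GIER.json", "split.json"):
        if Path(name).exists():
            print(name, summarize_json(name))
        else:
            print(name, "not found; download it first")
```

Once the actual keys are known, the same pattern can be extended to pair each request with its images and masks; the repository remains the authoritative reference for the real format.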

Paper

@inproceedings{shi2020benchmark,
  title={A Benchmark and Baseline for Language-Driven Image Editing},
  author={Shi, Jing and Xu, Ning and Bui, Trung and Dernoncourt, Franck and Wen, Zheng and Xu, Chenliang},
  booktitle={Proceedings of the Asian Conference on Computer Vision},
  year={2020}
}

Contact

For further questions, please contact Jing Shi.