Leveraged wisely, new datasets can inspire new multimedia methods and algorithms, as well as challenge us to rethink how we evaluate their efficacy, efficiency, and generalizability. The availability of massive, open multimedia datasets like the Yahoo Flickr Creative Commons 100 Million (YFCC100M), which spans 99.2 million images and 0.8 million videos, offers unique opportunities for advancing the state of the art in multimedia processing, analysis, search, and visualization.
The Multimedia Commons initiative, a multi-institution collaboration, was launched last year to compute features, generate annotations, and develop analysis tools, principally focusing on the YFCC100M. These resources are being released into the public domain, hosted via Amazon’s Public Data Sets program. They have already been used for research in several multimedia subfields, including computer vision, image processing, and video content analysis. With additional annotation and curation, this data has the potential to enable major leaps forward in research.
The MMCommons’16 workshop will provide a forum for the community of current and potential users of the Multimedia Commons -- i.e., everybody! -- to share novel research using the YFCC100M dataset, emphasizing approaches that were not possible with smaller or more restricted multimedia collections; ask new questions about the scalability, generalizability, and reproducibility of algorithms and methods; re-examine how we use data challenges and benchmarking tasks to catalyze research advances; and discuss priorities, methods, and plans for continuously expanding annotation efforts.
In encouraging collaboration around an open, shared dataset, we hope to inspire participation from a diverse set of multimedia researchers working on a broad set of tasks. Major themes of the MMCommons’16 workshop will include deriving insights about the content and structure of multimedia from the Multimedia Commons data -- and from comparisons with other datasets; addressing scale; predicting dataset bias and measuring reproducibility; re-examining evaluation paradigms; optimizing annotations; and laying the groundwork for broad applications.