Learn2Augment has three core components: a Selector that learns to choose good videos to augment, a Semantic Matching method that shrinks the space of candidate pairs to ease optimization, and a Video Compositing module that composites video pairs for augmentation.
Learns to select good video pairs for augmentation
Semantic matching reduces the search space of candidate video pairs by 100x
Although the Selector is costly to train, it can be trained once on a large dataset and then reused without fine-tuning
Achieves state-of-the-art results on few-shot learning as well as semi-supervised learning
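
To make the search-space reduction concrete, here is a minimal Python sketch of one way semantic matching could prune candidate pairs. The summary above does not say how the matching is computed, so everything here is an assumption for illustration: the random vectors stand in for some semantic embedding of each video, cosine similarity is the assumed matching score, and `prune_pairs` and `keep_per_item` are hypothetical names, not part of Learn2Augment.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def prune_pairs(embeddings, keep_per_item):
    """Keep, for each item i, only its `keep_per_item` most similar partners j.

    Returns ordered pairs (i, j) with i != j. This is an assumed pruning
    rule, not the method from the source.
    """
    n = len(embeddings)
    kept = []
    for i in range(n):
        scored = [(cosine(embeddings[i], embeddings[j]), j)
                  for j in range(n) if j != i]
        scored.sort(reverse=True)  # most similar partners first
        kept.extend((i, j) for _, j in scored[:keep_per_item])
    return kept


random.seed(0)
# Hypothetical stand-in embeddings: 200 videos, 16 dimensions each.
embeddings = [[random.gauss(0, 1) for _ in range(16)] for _ in range(200)]

all_pairs = len(embeddings) * (len(embeddings) - 1)    # every ordered pair
candidates = prune_pairs(embeddings, keep_per_item=2)  # pruned candidate set
reduction = all_pairs / len(candidates)                # shrink factor
```

With 200 items and two partners kept per item, the 39,800 ordered pairs shrink to 400, a factor of about 100. The Selector would then only need to score the pruned candidates rather than every possible pair.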