Our work introduces three new ways to perform data-level dropout: Difficulty-Based Progressive Dropout (DBPD), Scalar-Based Random Dropout (SRD), and Schedule-Matched Random Dropout (SMRD).
We propose a hardware-agnostic measure of the efficiency of training a neural network, called Effective Epochs.
We conduct experiments on a wide variety of datasets and models and show that our techniques reduce the number of effective epochs to as little as 12.4% of the baseline while consistently increasing accuracy across models.
We also show that our technique achieves superior performance, measured in effective epochs, when pre-training self-supervised models.
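
To make the Effective Epochs idea concrete, the following minimal sketch assumes one plausible reading of the metric: the total number of samples actually used for training updates, expressed in units of full passes over the dataset. This definition, the function name `effective_epochs`, and the halving schedule are illustrative assumptions, not the formulation used in our experiments.

```python
from typing import Sequence


def effective_epochs(samples_used_per_epoch: Sequence[int], dataset_size: int) -> float:
    """Assumed definition: total samples trained on, divided by the dataset size.

    One full pass over the dataset counts as exactly 1.0 effective epoch,
    so the value does not depend on the hardware used for training.
    """
    if dataset_size <= 0:
        raise ValueError("dataset_size must be positive")
    return sum(samples_used_per_epoch) / dataset_size


# Hypothetical schedule: each epoch keeps half of the previous epoch's samples.
dataset_size = 1000
dropout_schedule = [1000, 500, 250, 125]

baseline = effective_epochs([dataset_size] * len(dropout_schedule), dataset_size)
reduced = effective_epochs(dropout_schedule, dataset_size)

print(f"baseline: {baseline} effective epochs")
print(f"with data dropout: {reduced} effective epochs")
print(f"fraction of baseline: {reduced / baseline:.1%}")
```

Under this reading, a percentage such as the one reported above would be the ratio between the effective epochs of a dropout run and those of a baseline run that trains on the full dataset every epoch.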