From Coarse To Fine: Efficient Training For Audio Spectrogram Transformers