Delving Deep Into the Generalization of Vision Transformers Under Distribution Shifts