Learning and understanding single image depth estimation in the wild