Synthetic video to language