Scene Understanding in Deformable Object Manipulation

via Taxonomy-Guided Vision-Language Models