Deformable Cluster Manipulation via Whole-Arm Policy Learning