David Blanco-Mulero, Julia Borras, Carme Torras
Accepted to IEEE/RSJ IROS 2025
Preprint | Code (Coming soon)
Robotic-assisted dressing has the potential to significantly aid both patients and healthcare personnel, reducing workload and improving efficiency in clinical settings. While substantial progress has been made in robotic dressing assistance, prior works typically assume that garments are already unfolded and ready for use. However, in medical applications, gowns and aprons are often stored in a folded configuration, requiring an additional unfolding step.
In this paper, we introduce the pre-dressing step, the process of unfolding garments prior to assisted dressing. We leverage imitation learning to learn three manipulation primitives, covering both high- and low-acceleration motions. In addition, we employ a visual classifier to categorise the garment state as closed, partly opened, or fully opened. We conduct an exhaustive empirical evaluation of the learned manipulation primitives as well as their combinations. Our results show that highly dynamic motions alone are not enough to unfold the garment, and that further refinement via quasi-static motions is required to achieve at least a partly opened configuration.
Table of Contents
Fling
Shake
Twist
The demonstrations are recorded using an OptiTrack system, which outputs the Cartesian positions and quaternion orientations of both hands.
We use the left hand as a reference and assume the motion to be symmetric.
The raw hand trajectories include movements along non-essential axes. To address this, we pre-process the demonstrations to retain rotation only along the essential axis of each primitive (see the sketch after the list below):
Fling. Primary rotation around the Y-axis.
Shake. Primary rotation around the Y-axis.
Twist. Primary rotation around the Z-axis.
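As an illustration, the snippet below shows one possible way to implement this filtering with SciPy's rotation utilities; the function name and the quaternion ordering are assumptions, not the exact pre-processing code used in the paper.

```python
# A possible pre-processing step (illustrative): keep only the rotation
# about a single axis of each recorded hand orientation.
import numpy as np
from scipy.spatial.transform import Rotation as R

def retain_axis_rotation(quaternions, axis="y"):
    """quaternions: (N, 4) array in SciPy's scalar-last (x, y, z, w) order.
    Returns quaternions with the rotation about the non-essential axes removed."""
    idx = "xyz".index(axis)
    euler = R.from_quat(quaternions).as_euler("xyz")
    filtered = np.zeros_like(euler)
    filtered[:, idx] = euler[:, idx]   # zero out the non-essential axes
    return R.from_euler("xyz", filtered).as_quat()

# e.g. fling and shake keep the Y-axis rotation, twist keeps the Z-axis rotation:
# fling_quats = retain_axis_rotation(raw_fling_quats, axis="y")
# twist_quats = retain_axis_rotation(raw_twist_quats, axis="z")
```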
The image below illustrates the main axes and the robot set-up.
Then we use the inverse kinematics of the Python robotics-toolbox to compute the joint trajectory for each motion, as shown below.
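A minimal sketch of this step, assuming the roboticstoolbox-python and spatialmath-python APIs (rtb.models.UR5, ikine_LM); the function name and quaternion convention are illustrative, not the exact implementation from the paper.

```python
# Minimal sketch: retarget a Cartesian hand trajectory to a UR5 joint
# trajectory via numerical inverse kinematics.
import numpy as np
import roboticstoolbox as rtb
from spatialmath import SE3, UnitQuaternion

robot = rtb.models.UR5()  # kinematic model of the UR5/UR5e arm

def cartesian_to_joint_trajectory(positions, quaternions, q0=None):
    """positions: (N, 3) in metres; quaternions: (N, 4), scalar-first (w, x, y, z)."""
    q_prev = np.zeros(robot.n) if q0 is None else q0
    joint_traj = []
    for p, quat in zip(positions, quaternions):
        # Target end-effector pose: translation from p, rotation from the quaternion
        T = SE3(*p) * UnitQuaternion(quat).SE3()
        # Levenberg-Marquardt IK, seeded with the previous solution for continuity
        sol = robot.ikine_LM(T, q0=q_prev)
        if sol.success:
            q_prev = sol.q
        joint_traj.append(q_prev)
    return np.asarray(joint_traj)
```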
It can be observed that, in all motions, the resulting trajectories would cause the left and right robots to collide.
To prevent potential collisions and minimise excessive tension on the garment, as discussed in the paper, we apply a constraint after computing the DMP: the Y-axis position of both end-effectors is fixed throughout the trajectory.
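One possible way to apply this constraint, sketched below under the same robotics-toolbox assumptions; the helper name and the fixed Y value are illustrative, not the exact implementation from the paper.

```python
# Illustrative post-processing: hold the end-effector Y coordinate constant
# along a joint-space trajectory by re-solving IK for each sample.
import numpy as np
import roboticstoolbox as rtb
from spatialmath import SE3

robot = rtb.models.UR5()

def fix_y_position(joint_traj, y_fixed):
    """joint_traj: (N, 6) rollout; returns joints whose end-effector Y stays at y_fixed."""
    q_prev = joint_traj[0]
    constrained = []
    for q in joint_traj:
        T = robot.fkine(q)                                 # pose of the rollout sample
        T_fixed = SE3.Rt(T.R, [T.t[0], y_fixed, T.t[2]])   # overwrite the Y position
        sol = robot.ikine_LM(T_fixed, q0=q_prev)
        q_prev = sol.q if sol.success else q_prev
        constrained.append(q_prev)
    return np.asarray(constrained)
```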
Fling
Shake
Twist
The joint trajectory of each motion is used as the demonstration for a separate DMP.
Following Hannus, E. et al., 2024, we use the constrained DMP approach proposed by Sidiropoulos, A., Papageorgiou, D. and Doulgeri, Z., 2023, to constrain the joint position, velocity, and acceleration. You can have a look at the constrained DMP code.
We use the same DMP and optimisation parameters as in Hannus, E. et al., 2024 (code available here).
For our experiments we utilise a UR5e. The limits used as constraints for the DMP are:
Joint position limits. Max = [2π, 2π, 2π, 2π, 2π, 2π]; Min = [-2π, -2π, -2π, -2π, -2π, -2π]
Joint velocity limits. According to the UR data sheet, the maximum velocity of each joint is 180 deg/s.
Joint acceleration limits. The advised maximum acceleration for each joint is 800 deg/s² for joint motions.
For safety reasons, both velocity and acceleration limits are restricted to 90% of their maximum values.
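For reference, a minimal sketch of how these limits could be assembled as constraint arrays; the variable names are assumptions, and the actual constrained-DMP code is linked above.

```python
# Illustrative constraint arrays for the UR5e (variable names are assumptions).
import numpy as np

N_JOINTS = 6
SAFETY = 0.9  # velocity and acceleration limits restricted to 90%

q_max = np.full(N_JOINTS, 2 * np.pi)                     # joint position limits (rad)
q_min = -q_max
qd_max = SAFETY * np.full(N_JOINTS, np.deg2rad(180.0))   # joint velocity limits (rad/s)
qd_min = -qd_max
qdd_max = SAFETY * np.full(N_JOINTS, np.deg2rad(800.0))  # joint acceleration limits (rad/s^2)
qdd_min = -qdd_max
```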
Constrained DMP Fling
Constrained DMP Shake
Constrained DMP Twist
To clearly demonstrate the need for the constrained DMP, the plots below illustrate the position, velocity, and acceleration of the six robot joints during the twist trajectory.
These profiles are shown for both the unconstrained DMP (UC-DMP) and the constrained DMP (Opt-DMP). The dashed lines indicate the joint position, velocity, and acceleration limits.
Per-joint plots: Joint 1 – Joint 6 (position, velocity, and acceleration).
Our dataset consists of 5,081 images at a resolution of 640×480. The dataset includes:
Closed gowns: 2,473 images.
Partly opened gowns: 1,376 images.
Fully opened gowns: 1,233 images.
We used YOLO11n as the classifier with the following training parameters (a minimal training sketch is shown after the augmentation list below):
Epochs = 150
Batch size = 64
Optimizer = AdamW
Initial learning rate = 0.01
Final learning rate fraction (lrf) = 0.1
Weight decay = 0.0005
Early stopping patience = 10 epochs
We also apply data augmentation, using the following augmentation types and parameters:
Color augmentation (hue shift). hsv_h = 0.015
Saturation augmentation. hsv_s = 0.7
Brightness augmentation. hsv_v = 0.4
Rotation augmentation. degrees = 10.0
Translation augmentation. translate = 0.1
Scaling augmentation. scale = 0.5
Shearing augmentation. shear = 2.0
Perspective distortion. perspective = 0.0005
Flip upside down. flipud = 0.5
Flip left-right. fliplr = 0.5
For more information about the data augmentation, please check the Ultralytics YOLO Data augmentation documentation.
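For reference, a minimal training sketch assuming the Ultralytics Python API; the dataset path "gown_states" and the checkpoint "yolo11n-cls.pt" are assumptions, not the exact files used in the paper.

```python
# Minimal training sketch with the parameters listed above.
from ultralytics import YOLO

model = YOLO("yolo11n-cls.pt")  # YOLO11n classification checkpoint (assumed variant)

model.train(
    data="gown_states",      # hypothetical dataset folder with closed / partly opened / fully opened classes
    epochs=150,
    batch=64,
    optimizer="AdamW",
    lr0=0.01,                # initial learning rate
    lrf=0.1,                 # final learning rate fraction (final LR = lr0 * lrf)
    weight_decay=0.0005,
    patience=10,             # early stopping patience
    # data augmentation
    hsv_h=0.015,
    hsv_s=0.7,
    hsv_v=0.4,
    degrees=10.0,
    translate=0.1,
    scale=0.5,
    shear=2.0,
    perspective=0.0005,
    flipud=0.5,
    fliplr=0.5,
)
```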
Below you can find the performance evaluation of the fine-tuned YOLO11n model.
The normalised confusion matrix shows the classification accuracy per class.
The F1-Confidence curve shows how the model’s F1 score varies with the confidence threshold, reflecting the balance between precision and recall across different classes.
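A minimal evaluation sketch under the same Ultralytics assumptions; the weights path and dataset name below are hypothetical.

```python
# Evaluate the fine-tuned model and save the evaluation plots to the run directory.
from ultralytics import YOLO

model = YOLO("runs/classify/train/weights/best.pt")   # fine-tuned weights (assumed path)
metrics = model.val(data="gown_states", plots=True)   # writes the confusion matrix and curves
print(metrics.top1)                                   # top-1 classification accuracy
```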
Júlia Borràs
Institut de Robòtica i Informàtica Industrial, CSIC-UPC
Spain
To cite this work, please use the following BibTeX entry:
@article{blancomulero2025predressing,
title={Evaluating the Pre-Dressing Step: Unfolding Medical Garments Via Imitation Learning},
author={David Blanco-Mulero and Júlia Borràs and Carme Torras},
year={2025},
eprint={2507.18436},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2507.18436},
journal={arXiv preprint arXiv:2507.18436},
}