SuPer: A Surgical Perception Framework

Traditional control and task automation have been successfully demonstrated in a variety of structured, controlled environments through the use of highly specialized, modelled robotic systems in conjunction with multiple sensors. However, applying autonomy to endoscopic surgery is very challenging, particularly in soft-tissue work, due to the lack of high-quality images and the unpredictable, constantly deforming environment.

In this work, we propose a novel surgical perception framework, SuPer, for surgical robotic control. This framework continuously collects 3D geometric information that allows for mapping of a deformable surgical field while tracking rigid instruments within the field. To achieve this, a model-based tracker is employed to localize the surgical tool with a kinematic prior in conjunction with a model-free tracker to reconstruct the deformable environment and provide an estimated point cloud as a mapping of the environment.

The proposed framework was implemented on the da Vinci Surgical System in real time, with an end-effector controller whose target configurations are set and regulated through the framework. The framework successfully completed autonomous soft-tissue manipulation tasks with high accuracy. This demonstration is promising for the future of surgical autonomy.


Framework and Setup


SuPer Deep

Robotic automation in surgery requires precise tracking of surgical tools and mapping of deformable tissue. Previous works on surgical perception frameworks require significant effort in developing features for surgical tool and tissue tracking.

In this work, we overcome this challenge by exploiting deep learning methods for surgical perception. We integrated deep neural networks, capable of efficient feature extraction, into the tissue reconstruction and instrument pose estimation processes. By leveraging transfer learning, the deep-learning-based approach requires minimal training data and less feature engineering effort to fully perceive a surgical scene.

The framework was tested on publicly available datasets, which use the da Vinci Surgical System, for comprehensive analysis. Experimental results show that our framework achieves state-of-the-art tracking performance in a surgical environment by utilizing deep learning for feature extraction.




SuPer Framework

Follow Trajectory

Without Framework

Grab Meat

Dataset / Links

The dataset is stored in ROSBag format and records the repeated tissue manipulation experiment. The data streams are the raw left and right stereoscopic images, the encoder readings from the surgical robot, and a mask generated by our tool tracking. For details on the surgical robot, including the locations of the blue markers, refer to the LND.json file. An initial hand-eye transform is also provided for surgical tool tracking. It is not needed for testing deformable tissue tracking, since a mask is already provided. Note that the mask is generated in the left, rectified, undistorted camera frame and is not perfect, so dilating it is recommended.
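The recommended dilation can be sketched as follows. This is a minimal NumPy illustration, not the authors' code. It assumes the mask has already been read from the bag as a 2D array in which nonzero pixels mark the tool; the polarity, the function names `dilate_mask` and `tissue_pixels`, and the default radius are all hypothetical and should be checked against the actual data.

```python
import numpy as np


def dilate_mask(mask: np.ndarray, radius: int = 1) -> np.ndarray:
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element.

    Assumption: nonzero pixels in `mask` mark the surgical tool.
    """
    mask = mask.astype(bool)
    if radius <= 0:
        return mask.copy()
    h, w = mask.shape
    # Pad with False so the window never reads outside the image.
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    # OR together every shifted copy of the mask inside the window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def tissue_pixels(image: np.ndarray, tool_mask: np.ndarray, radius: int = 5) -> np.ndarray:
    """Return the pixels of `image` that lie outside the dilated tool mask."""
    grown = dilate_mask(tool_mask, radius)
    return image[~grown]
```

A larger radius removes more pixels around the tool boundary, trading a little tissue coverage for fewer tool pixels leaking into the tissue tracker. A suitable value depends on image resolution and is left to the user.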


  • SuPer: A Surgical Perception Framework for Endoscopic Tissue Manipulation with Surgical Robotics

Yang Li*, Florian Richter*, Jingpei Lu, Emily K. Funk, Ryan K. Orosco, Jianke Zhu, and Michael C. Yip (* Equal contributions)

IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 2294-2301, April 2020. [arXiv]

  • SuPer Deep: A Surgical Perception Framework for Robotic Tissue Manipulation using Deep Learning for Feature Extraction

Jingpei Lu*, Ambareesh Jayakumari*, Florian Richter, Yang Li, and Michael C. Yip (* Equal contributions)



Ambareesh S N Jayakumari

Emily K. Funk

Ryan K. Orosco

Jianke Zhu