The Confluence of Vision and Control

American Control Conference 2020 Workshop

Held online on June 30, 2020

Workshop Overview

The use of visual sensors in feedback control has been an active research topic for decades. As hardware costs fall and computational capabilities increase, vision-based control is reaching new levels of capability and application. Recent innovations in computer vision can provide greater capabilities to control applications such as autonomous vehicles and robotics. At the same time, open problems in computer vision can be addressed with tools from control theory, such as nonlinear and adaptive control.

We presented eleven discussions on recent work in vision-based control, the application of control to computer vision, and topics in which vision and control are uniquely intertwined. We sought to highlight recent developments and open problems that exist at the intersection of vision and control and spur further research and development in the community.

Several speakers have agreed to post their presentations below. If you are interested in the outcomes or would like more information, feel free to contact the speakers or the organizers listed at the bottom of this page.

Agenda for the Workshop

8:30 AM – 8:45 AM Welcome & Introductions

8:45 AM – 9:15 AM Warren Dixon - Intermittent Image-Based Feedback

Description: Image feedback can be used to estimate Euclidean distances between features in an image and/or relative motion between an image feature and the camera. Typical current methods assume that the image feature is continuously observed; yet, in most practical scenarios, the image feature can be occluded. Image occlusion/intermittency segregates the image dynamics into two subsystems: when the feature is visible, feedback is available and the estimator/observer is stabilizable; otherwise, it is unstable. This talk discusses the use of switched/hybrid systems methods, including the development of sufficient dwell-time conditions, to ensure the stability of such estimators/observers.
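As a rough illustration of the dwell-time idea in the description above, the following sketch uses a toy scalar error model; it is not the estimator discussed in the talk. It assumes the estimation error decays at a hypothetical rate `a` while the feature is visible and grows at a hypothetical rate `b` while it is occluded, so the error shrinks over one visible/occluded cycle only when `a * t_visible > b * t_occluded`.

```python
import math


def cycle_gain(a, b, t_visible, t_occluded):
    """Factor by which |e| is scaled over one visible/occluded cycle.

    Toy model (an assumption for illustration only):
      visible:  de/dt = -a * e  (error decays)
      occluded: de/dt = +b * e  (error grows)
    """
    return math.exp(-a * t_visible + b * t_occluded)


def visible_time_threshold(a, b, t_occluded):
    """Visible dwell time above which the cycle gain drops below 1."""
    return b * t_occluded / a


def error_after_cycles(e0, a, b, t_visible, t_occluded, cycles):
    """Error magnitude after a number of identical cycles."""
    return abs(e0) * cycle_gain(a, b, t_visible, t_occluded) ** cycles


if __name__ == "__main__":
    a, b, t_occluded = 2.0, 1.0, 0.5  # hypothetical rates and occlusion time
    print("visible-time threshold:", visible_time_threshold(a, b, t_occluded))
    for t_visible in (0.4, 0.2):
        e = error_after_cycles(1.0, a, b, t_visible, t_occluded, cycles=10)
        print(f"t_visible={t_visible}: error after 10 cycles = {e:.4f}")
```

In this toy setting, a visible time longer than the threshold makes the error contract from cycle to cycle, while a shorter one lets it grow. Dwell-time conditions for real image-based observers are derived from Lyapunov arguments and are more involved.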

Biosketch: Prof. Warren Dixon received his Ph.D. in 2000 from the Department of Electrical and Computer Engineering at Clemson University. He worked as a research staff member and Eugene P. Wigner Fellow at Oak Ridge National Laboratory (ORNL) until 2004, when he joined the University of Florida in the Mechanical and Aerospace Engineering Department. His main research interest has been the development and application of Lyapunov-based control techniques for uncertain nonlinear systems. He is an ASME Fellow and IEEE Fellow, and formerly an IEEE Control Systems Society (CSS) Distinguished Lecturer.

Description: In this talk, new developments in concurrent-learning-based full-order and reduced-order observers for a perspective dynamical system (PDS) will be presented. The PDS is a widely used model for estimating the depth of a feature point from a sequence of camera images and camera motion information. Building on current progress in concurrent learning (CL) for parameter estimation in adaptive control, a state observer is developed for a PDS model in which the inverse depth appears as a time-varying parameter in the dynamics. Using data recorded over a sliding time window in the recent past, information about recent depth values is used in a CL term to design a reduced-order state observer. Results of real-world depth estimation experiments using a 7-DOF manipulator in an eye-in-hand configuration will be shown to demonstrate the efficiency of the proposed observers. A result on extended object tracking will also be presented, along with a method to estimate the shape (extent) of an object and its kinematic state from sparse, noisy point measurements.

Biosketch: Dr. Ashwin Dani received M.S. and Ph.D. degrees from the University of Florida (UF), Gainesville. He is currently an Associate Professor in the Department of Electrical and Computer Engineering at the University of Connecticut, Storrs. He was a post-doctoral research associate in the Aerospace Engineering department at the University of Illinois at Urbana-Champaign. His research interests are in the areas of estimation, machine learning for control, human-robot collaboration, and vision-based control and autonomous navigation. He is a senior member of IEEE and a member of the Conference Editorial Board of the IEEE Control Systems Society.

Description: Data association is concerned with matching elements of two (or multiple) sets of data that are measured by sensors or known from prior knowledge. As such, it arises in a broad range of control and robotic applications, including estimating camera motion, loop closure in SLAM, and map merging based on feature correspondences in images. The main challenge in data association is the existence of wrong matches, which occur due to noise, outliers, or similar-looking features and, if not corrected, can drastically affect the results in these applications. In this talk, we review classical and recent techniques for robust data association, ranging from model-based methods between two sets (e.g., RANSAC) to model-free techniques based on the notion of cycle consistency across multiple sets of data. We show that these techniques can achieve considerable improvement in the accuracy of existing pipelines such as SLAM.
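To make the model-based case concrete, here is a minimal RANSAC sketch for rejecting wrong matches between two sets of 2D points. It is a generic textbook-style illustration rather than the method presented in the talk, and it assumes a deliberately simple motion model (a pure translation between the two sets). The function name `ransac_translation` and its parameters are hypothetical.

```python
import random


def ransac_translation(matches, threshold=0.1, iterations=100, seed=0):
    """Estimate a 2D translation from putative matches that include outliers.

    matches: list of ((x1, y1), (x2, y2)) pairs. An inlier is assumed to
    satisfy p2 = p1 + t (up to noise) for one common translation t.
    Returns (t, inlier_indices).
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: a single match fully determines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        tx, ty = x2 - x1, y2 - y1
        # Consensus set: matches whose displacement agrees with this model.
        inliers = [
            i for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if (bx - ax - tx) ** 2 + (by - ay - ty) ** 2 <= threshold ** 2
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the best consensus set by averaging inlier displacements.
    n = len(best_inliers)
    tx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    ty = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return (tx, ty), best_inliers


if __name__ == "__main__":
    rng = random.Random(1)
    true_t = (2.0, -1.0)
    pts = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(30)]
    # 20 correct matches followed by 10 wrong matches to random locations.
    matches = [(p, (p[0] + true_t[0], p[1] + true_t[1])) for p in pts[:20]]
    matches += [(p, (rng.uniform(0, 10), rng.uniform(0, 10))) for p in pts[20:]]
    t, inliers = ransac_translation(matches)
    print("estimated translation:", t, "inlier count:", len(inliers))
```

Practical pipelines replace the translation model with, for example, an essential matrix or a homography, which need larger minimal samples, but the sample/score/refit loop has the same shape.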

Biosketch: Kaveh Fathian is a Postdoctoral Associate at the Massachusetts Institute of Technology. He received the PhD degree in Electrical Engineering in 2018, the MS degree in Mathematics in 2018, and the MS degree in Electrical Engineering in 2013 from the University of Texas at Dallas, USA. He is a member of the IEEE Control Systems Society and the IEEE Robotics and Automation Society. His research interests include linear and nonlinear control theory, distributed and multi-robot systems, and object detection and motion estimation from images and point clouds.

10:15 AM – 10:45 AM Coffee Break

10:45 AM – 11:15 AM Nicholas Gans - Five-Point Algorithms for Estimating Pose and Velocity

Description: We present a solution to the problem of recovering the rotation and translation, as well as the linear and angular velocity, of a moving camera from captured images using the minimal number of feature points. Rotation and translation are commonly recovered using the epipolar constraint, by decomposing the essential matrix. This leads to problems in cases of pure rotation or planar scenes. Velocity estimation is often overlooked, with velocity typically approximated from pose estimates over short time frames. We will present a new formulation based on the quaternion representation of rotation to recover the rotation and translation from five matched feature points. The same methodology is then followed to estimate angular and linear velocity from five optical flow points. The estimates are then fused in an extended Kalman filter. Experimental results using public vision datasets will be presented.

Biosketch: Nicholas Gans earned his Ph.D. in Systems and Entrepreneurial Engineering from the University of Illinois Urbana-Champaign in 2005. He is currently Division Head of Automation and Intelligent Systems at the University of Texas at Arlington Research Institute. Prior to this position, he was a professor in the Department of Electrical and Computer Engineering at The University of Texas at Dallas. His research interests are in the fields of robotics, nonlinear and adaptive control, machine vision, and autonomous vehicles. He is a senior member of IEEE and an Associate Editor for the IEEE Transactions on Robotics.

Description: Man-made multi-robot systems have been advancing apace with the help of high-performance hardware and computational technologies. Despite the high-performance computing, communication, sensing, and power devices used in these systems, their effectiveness in uncertain environments still appears to fall behind that of natural systems such as a swarm of ants, a flock of birds, or a pack of wolves. One of the challenges in multi-robot coordination is the lack of effective distributed algorithms and designs that enable the robots to work cooperatively and safely in uncertain environments. This talk will present some recent research results on distributed algorithms and robust control methods for multi-robot coordination, and on the exploration of image information in the feedback loop.

Biosketch: Guoqiang Hu received his Ph.D. degree in Mechanical Engineering from the University of Florida in 2007. He is currently with the School of Electrical and Electronic Engineering at Nanyang Technological University, Singapore. His research focuses on analysis, control, design and optimization of distributed intelligent systems. More specifically, he works on distributed control, optimization and games, with applications to multi-robot systems and smart city systems. He serves as Associate Editor for IEEE Transactions on Automatic Control and IEEE Transactions on Control Systems Technology.

Description: This talk considers the problem of localizing a network of cameras using vision-based measurements. In the first part of the talk, we will examine distributed algorithms that fuse local pairwise rotation and direction measurements into a globally consistent localization using local computations and discrete-time updates. In the second part of the talk, we will consider new theoretical results that allow us to attach a statistical confidence value to the results obtained by localization algorithms that optimize robust costs based on the L1 norm.

Biosketch: Roberto Tron is an Assistant Professor in the Mechanical Engineering department at Boston University. He received his B.Sc. (2004) and M.Sc. (2007) degrees (highest honors) from the Politecnico di Torino, Italy. He received a Diplôme d’Ingénieur from the Eurecom Institute and a DEA degree from the Université de Nice Sophia-Antipolis in 2006. He received his Ph.D. in Electrical and Computer Engineering from The Johns Hopkins University in 2012, and was a post-doctoral researcher with the GRASP Lab at the University of Pennsylvania until 2015.

His research spans automatic control, robotics, and computer vision, with particular interest in applications of Riemannian geometry and optimization, and in distributed perception, control, and planning for teams of multiple agents. He was recognized at the IEEE Conference on Decision and Control with the “General Chair’s Interactive Presentation Recognition Award” (2009), the “Best Student Paper Runner-up” (2011), and the “Best Student Paper Award” (2012).

12:15 PM – 1:30 PM Lunch Break

1:30 PM – 2:00 PM Romeil Sandhu - 2D/3D Interactive Feedback Control for Autonomous Systems

Description: While significant progress in autonomy and robotics has been made, most recently with advances in machine learning, there still exists no universal framework capable of handling complex real-world scenarios due to “unknown unknowns.” In particular, during black-swan events for which loss of life must be mitigated, operators often mistrust and abandon their autonomous counterparts in favor of (computationally intractable) experience, for which errors may still arise yet accountability is upheld. While such operators are experts in their domain, they can be considered non-experts given that the construction of the autonomous model is opaque to them, leading to mistrust, especially in situations of duress. To combat this, we introduce a geometric interactive 2D/3D feedback control approach for imaging systems in order to properly incorporate a 3D human operator based on single or multiple 2D image observations. Ultimately, given the ubiquitous definition of trust under ambiguous situations, we aim to provide the non-expert expert with as-needed intervention without abandoning their autonomous counterpart.

Biosketch: Romeil Sandhu is currently an Assistant Professor at Stony Brook University and is the recipient of the 2018 AFOSR YIP Award for work on interactive feedback control for autonomous systems and the 2018 NSF CAREER Award for work on geometric optimization of time-varying networks. Romeil received his B.S. and M.S. degrees in Electrical Engineering from the Georgia Institute of Technology in 2006 and 2009, respectively. Then, under the direction of Professor Allen Tannenbaum, he completed his Ph.D. in 2011. His current research interest lies at the intersection of control, geometry, and statistics applied to problems rooted in networks, imaging, and learning.

Description: Visual simultaneous localization and mapping (SLAM) has long been considered an important component of autonomous robotics, especially when navigating unknown environments. Recently, several robust real-time methods have been introduced. These methods must make a trade-off between computational latency and pose estimation accuracy, with many opting to reduce latency over improved accuracy. Instead, we explore acceleration methods for existing high-accuracy approaches based on actively enhancing the conditioning of key SLAM optimization sub-problems through measurement sub-selection. In doing so, we identify a way to mitigate the trade-off and obtain low-latency, high-accuracy pose estimation in visual SLAM. These results hold for open-loop SLAM benchmarks. The question is then: How much does improving latency versus accuracy impact the closed loop? To answer this question, we incorporate several stereo visual-inertial SLAM algorithms and task them to perform closed-loop trajectory tracking. Exploring their relative positioning in the latency-drift parameter space suggests that accuracy is more important than latency once a certain latency threshold is met. As a consequence, accelerating accurate but slow SLAM methods may provide the best solution to SLAM on modern mobile robots.

Biosketch: Patricio A. Vela is an associate professor in the School of Electrical and Computer Engineering at the Georgia Institute of Technology. Dr. Vela's research focuses on geometric perspectives to control theory and computer vision, particularly how concepts from control and dynamical systems theory can serve to improve computer vision algorithms used in the decision-loop. These efforts are part of a broad program to understand research challenges associated with autonomous robotic operation in uncertain environments. Dr. Vela received a B.S. (1998) and a Ph.D. (2003) from the California Institute of Technology. He was a post-doctoral researcher at the Georgia Institute of Technology from 2003 to 2005.

2:30 PM – 3:00 PM Coffee Break

Participants to Be Determined

Description: In this talk we will describe the development of a visual multiple target tracker for tracking multiple ground objects from fixed-wing and multi-rotor UAS. The algorithm is based on the recently introduced Recursive-RANSAC algorithm, in conjunction with a visual front end. We will also describe vision-based orbiting algorithms that guarantee target observability and asymptotic target following. We will describe practical issues related to implementation and describe our efforts to enable long-duration target following.

Biosketch: Prof. Randal W. Beard received his PhD in 1995 from the Department of Electrical, Computer, and Systems Engineering at Rensselaer Polytechnic Institute. Since 1996, he has been with the Electrical and Computer Engineering Department at Brigham Young University, Provo, UT, where he is currently a professor. His primary research focus is autonomous control of small air vehicles and multivehicle coordination and control. He is a past associate editor for the IEEE Transactions on Automatic Control, IEEE Control Systems Magazine, and the Journal of Intelligent and Robotic Systems. He is a fellow of the IEEE and an associate fellow of AIAA.

3:30 PM – 4:00 PM Eddie Tunstel - Leveraging the Confluence of Vision & Control Toward Intelligent Robotics

Description: This talk offers thoughts and considerations regarding vision and control in the context of robotics through discussion of applications from research projects and deployed systems associated with different domains such as planetary robotics, disaster response, and advanced manufacturing & service. The driver is the set of capabilities of interest for real-world applications that would leverage the confluence of vision and control for intelligent robotics. Motivating the discourse are considerations for next-level robotic intelligence, such as enhancing perception capabilities beyond the visual modality, moving beyond object recognition and grasping to knowledge and reasoning about object properties, enabling smart human-collaborative robots that are responsive to human activity observation or prediction, and advancing from robot learning for perception or control to autonomous/developmental learning and knowledge/skill transfer. The aim is to broaden the aperture of current research so as to boost robotic intelligence to the next level, enabling robots that are multi-functional in the real world.

Biosketch: Eddie Tunstel earned the Ph.D. in electrical engineering from the University of New Mexico and mechanical engineering degrees from Howard University. He is an Associate Director and Group Leader for Robotics at United Technologies Research Center, East Hartford, CT. Previously he was a Sr. Roboticist and Space Robotics & Autonomous Control Lead at Johns Hopkins APL and worked for two decades at NASA JPL as a Sr. Robotics Engineer and Group Leader of its Advanced Robotic Controls Group. There he served as a Mars rover systems engineer for autonomous navigation and as rover engineering team lead for mobility and robotic arm operations on the surface of Mars. He maintains expertise in robotic autonomy & intelligent systems with authorship of over 160 publications. He is an IEEE Fellow and was 2018-2019 President of the IEEE SMC Society.

4:00 PM – 4:30 PM Takeshi Hatanaka - Control of PTZ/Drone Networks for Visual Monitoring

Description: Large-scale visual monitoring has become crucial in order to prevent crimes, to evaluate aging infrastructure, and to mitigate the damage of natural disasters. This task generally requires efficiently collecting dense data in real time over the environment, and a network of multiple mobile cameras can be a key technology for meeting this requirement. In this talk, we start by briefly reviewing coverage control for mobile sensor networks and then investigate how this control technology is applied to networks of Pan-Tilt-Zoom (PTZ) cameras and drones with cameras. Specifically, we highlight how to manage overlaps of limited fields of view for a drone network in a distributed fashion, for which we present a novel control barrier function approach, and how to put vision data in the loop, taking visual surveillance of human activities with a PTZ network as an example scenario.

Biosketch: Takeshi Hatanaka received the Ph.D. degree in applied mathematics and physics from Kyoto University in 2007. He then joined Tokyo Institute of Technology in 2007, where he held positions as an assistant and associate professor. Since April 2018, he has been an associate professor at Osaka University. He is the co-author of Passivity-Based Control and Estimation in Networked Robotics (Springer, 2015). His research interests include cyber-physical & human systems, networked robotics, and energy management systems. He is an AE for IEEE TCST and SICE JCMSI, and a member of the Conference Editorial Board of IEEE CSS.

4:30 PM – 5:30 PM Panel Discussion and Concluding Remarks

Prerequisite skills for participants:

A basic understanding of vision-based control/estimation and of nonlinear and adaptive control is beneficial. For registrants without sufficient background in these topics, basic tutorial material will be provided prior to the workshop.

You can request access to the prerequisite material using the following link

Workshop Organizers

Nicholas Gans

Division Head of Automation and Intelligent Systems

University of Texas at Arlington Research Institute


Ashwin Dani

Associate Professor, Electrical and Computer Engineering

University of Connecticut