Introduction
At the beginning of the semester, we were inspired by the maneuverability and grace quadcopters were capable of. We wanted to attempt a project that took advantage of these characteristics.
Our goal was a drone capable of keeping itself at a fixed position and orientation relative to a person's face. This was an ambitious end goal considering our inexperience and the short duration of the project, but what better way to learn? We found the idea of a robot that moves in three dimensions appealing because of the extra freedom of movement it allows; of course, the control system becomes more complicated in exchange. We addressed this by abstracting away the details of flight control: the firmware on board the quadcopter handles flight while we supply thrust, pitch, and yaw commands (ignoring roll for the time being).
In addition to control, we also had to worry about communicating with the quad. We needed to process images and pass commands to the quad quickly enough, or risk a crash landing, or worse, a crash into a person. For this reason, we decided to do our image processing off-board and then relay commands to the drone. To keep communication lag to a minimum, we planned to transmit video and control commands over separate dedicated channels.
This technology would likely find use in film and media. Users would have an easier time taking pictures or videos of themselves, particularly during events involving a lot of movement, and its usefulness could be extended further if general object tracking were implemented. Check out the YouTube video we attached of Nixie, a wearable camera that unfolds from your wrist into a flying photo and video catcher.
Design Requirements:
Stays at fixed position and orientation relative to subject's face.
Must stay at a safe distance from subject.
Can only move by yaw and pitch motion.
Must be able to stay at a fixed altitude.
Design Decisions:
Modular approach: use a controlled testbed to develop the control system while attempting to interface with the quad's hardware.
A 2-DoF gimbal carrying a webcam, with an Arduino driving the servos by subscribing to ROS topics published by a computer, is used to develop yaw and pitch control.
"Ground Control" computer to handle image processing and controls.
Commands relayed to the quadrotor via the ArduPilot Mission Planner software using a Python script on a computer.
Face tracking based on the Haar cascade classifier from OpenCV.
Controls Design:
A controller type must be chosen. Roughly 99 percent of SISO (single input, single output) controllers in industry are PI (proportional with integral action). The reason to include integral action is to avoid a steady-state error in the step response, which in our case would appear as a slight offset of the face from the center of the camera. The derivative term of a PID controller is often avoided because it risks amplifying measurement noise instead of the actual signal. For those reasons the PI controller seems like an obvious choice, but one may argue that a slight offset from the center of the camera is not critical here. Furthermore, a PI controller depends on a fixed sampling time: when the Laplace-domain form of the PI controller is transformed into a discrete-time difference equation (see the sketch below), Ki is multiplied by the sampling time Ts. Since our setup only runs the controller when a measurement arrives, the system effectively has a variable sampling time. A good initial controller is therefore a simple proportional controller, which does meet the system requirement of keeping the face centered to a reasonable level.
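As a rough sketch of that discretization argument (our own notation: K_p and K_i are the proportional and integral gains, T_s the sampling period, e[k] the pixel error and u[k] the controller output at sample k):

C(s) = K_p + K_i / s
u[k] = u[k-1] + K_p * (e[k] - e[k-1]) + K_i * T_s * e[k]

Because T_s appears explicitly in the integral term, an irregular sampling interval makes the integral action inconsistent, which is the main argument for starting with a pure P controller.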
Tradeoffs Made:
Separate vs integrated image processing unit - Off-board processing was chosen for better image processing performance.
3DR Quad vs CrazyFlie - Better payload capacity and battery life were prioritized over maneuverability and size.
Arduino + laptop vs BeagleBone Black - Easier to set up and work with, since we were provided with a VM; this came at the cost of a less portable system.
Real-world Impact:
Our design decisions were more in line with a proof-of-concept system than a marketable product.
This is because of the large number of independent systems that must run at the same time for the drone to work. Setup would take too long and be too difficult for the average consumer.
The use of ROS also impacts performance: passing messages between nodes adds overhead, so the system cannot process frames as fast as a single integrated program could.
However, the separate systems and the use of ROS allow for rapid development. This is ideal for prototyping, since we only need to worry about the reliability of each independent node in the system.
Implementation
Hardware Used:
3DR ArduCopter Quad Kit (mostly unused)
Scavenged hobby servos and gimbal
Arduino Uno
Logitech Webcam (from lab)
Setting up the gimbal:
We built a light wooden frame to mount the gimbal on. The servos plug into the Arduino as in the picture below:
Block diagram and how the system works:
Facial recognition portion:
The sensor for our setup is a webcam. The face recognition node subscribes to the camera topic.
It searches sections of the image for facial features using the Haar cascade algorithm. To speed up the search, the minimum and maximum face sizes passed to the detector are bounded around the size of the last detected face.
Once a match is found, the node publishes the coordinates of the center of the detected square and draws a green square around the face in the processed image window.
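For reference, the quantity published by face_track.py below is the offset of the face center from the center of our 640x472 camera frame, computed from the bounding box corners (x1, y1) and (x2, y2):

x_offset = (x1 + x2) / 2 - 320
y_offset = (y1 + y2) / 2 - 236

(x_offset and y_offset are just names for this writeup; in the code they are published as the x and y fields of the FaceTF message.)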
Controller portion:
Two parallel proportional controllers make sure that the camera keeps the face centered at all times: one controller for pitch and one for yaw. The proportional gains were carefully chosen from experiments in the lab, which revealed that gains above a certain size make the system unstable. So, without any actual modeling, we found gain values that ensure stability and a reasonable rate of change.
To accommodate a variable reference, i.e. the face being tracked, the new angle passed to a servo equals the old angle plus the adjustment needed to compensate for the face's offset from the center of the screen, as calculated by the P controller.
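In equation form, a sketch using the variable names from tf_listener.py below (the minus sign simply reflects the direction the servos are mounted):

u[k] = u[k-1] - K_p * e[k],   where e[k] = ref - measurement and ref = 0

so the commanded angle is nudged in proportion to how far the face sits from the image center.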
Additionally, the controllers are implemented with a variable gain that depends on the size of the error, where the error is defined as the measurement subtracted from the reference (which is zero). This improves the speed at which the servos re-center the camera when the face is far from the center; when the face is almost centered, the error is small and a smaller gain is used for fine adjustment.
Two important filters are applied after the controllers. The first is a low-pass filter, implemented as a limit on how much the commanded angle can change per update, which keeps fast transients out. It was added because testing revealed occasional unstable, jumping measurements that sometimes made the camera seek towards its extreme positions, and the low-pass filter proved valuable there. The second is a limiter that ensures we do not send commands that would move the servos outside their physical boundaries.
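With the numbers used in tf_listener.py below (the threshold is applied to the measured offset, which equals |e[k]| since the reference is zero):

K_p = 15/640 if |e[k]| > 180 pixels, else 10/640   (horizontal axis)
K_p = 12/472 if |e[k]| > 180 pixels, else 9/472    (vertical axis)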
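Concretely, with the values from tf_listener.py below (in the code the rate limit is only applied to the horizontal channel u), and with clip(v, lo, hi) denoting saturation of v to the interval [lo, hi]:

u[k] <- u[k-1] + clip(u[k] - u[k-1], -5, +5)        (at most 5 degrees of change per update)
u[k] <- clip(u[k], 20, 120),   uy[k] <- clip(uy[k], 30, 90)        (servo joint limits, in degrees)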
Finally, the control block publishes the angles on the servo1 and servo2 topics, thereby passing these values to the Arduino.
Further details can be found in the code below.
Arduino portion:
The Arduino subscribes to the ROS topics "servo1" and "servo2" published by the controller, then commands the servos to move to the corresponding angles using PWM signals.
Results
Our camera tracks a face as we specified in our goal. It is able to track a person standing up, sitting down, moving left or right, moving towards or away from the camera, and even jumping.
Check out the video of the gimbal tracking with facial recognition on the right.
Conclusion
Although our finished solution met our design criteria, we found that the facial recognition software we used and edited is not fast or consistent enough for a stable, smooth control system. We suspect the facial recognition was slow because it ran inside a virtual machine, and inconsistent because it only detects faces that are upright (vertically aligned).
Troubles Faced:
Interfacing with the quad: we were unable to get Mission Planner to communicate correctly with ROS, and it was not compatible with our virtual machine.
Getting the wireless camera to stream like a webcam (we used a wired webcam in the end).
ROS on the BeagleBone (uP): the BeagleBone's Ubuntu OS could not download and install ROS packages because it would not connect to the internet.
Improvements:
Better and faster facial recognition algorithm → (Smooth out the movement).
Fixed sampling time → (required for advanced controllers).
Kalman filter → (estimate missing measurements).
Use a quad and microcontroller that are easier to interface with.
Team
Nicholas Gan
AA in Liberal Arts, Computer Science Emphasis
Computer, Electrical, and Automotive Engineer
Raghid Mardini
Mechanical Engineering major, focusing on design and controls.
Christian Koecks Lykkegaard
BSc in Electrical engineering (real time signal processing)
MSc student in Electrical engineering (automation and control)
Danish exchange student from Aalborg University
The Equipment
Videos
Code:
face_track.py:
#!/usr/bin/env python
import rospy
from rospy.numpy_msg import numpy_msg
from sensor_msgs.msg import Image
from std_msgs.msg import String
from ar_track_alvar.msg import FaceTF
import cv,cv2,time,sys
from cv_bridge import CvBridge, CvBridgeError
from numpy.linalg import *
import numpy as np
minFaceSz = 20
maxFaceSz = 640
def detect(img, cascade):
    global minFaceSz
    global maxFaceSz
    rects = cascade.detectMultiScale(img, scaleFactor=1.8, minNeighbors=4,
                                     minSize=(minFaceSz, minFaceSz),
                                     maxSize=(maxFaceSz, maxFaceSz),
                                     flags=cv2.CASCADE_SCALE_IMAGE)
    if len(rects) == 0:
        return []
    # Convert (x, y, w, h) rectangles to (x1, y1, x2, y2) corner coordinates
    rects[:, 2:] += rects[:, :2]
    return rects

def draw_rects(img, rects, color):
    for x1, y1, x2, y2 in rects:
        cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)

class FaceTracker:
    def __init__(self):
        self.cascade = cv2.CascadeClassifier("/home/ee125/opencv-3.0.0-beta/data/haarcascades/haarcascade_frontalface_alt.xml")
        # ROS to CV image bridge
        self.bridge = CvBridge()
        # Publishes an image after image processing
        self.pub = rospy.Publisher('processed_image', Image)
        # Publishes the face center offset from the image center
        self.pub2 = rospy.Publisher("face_tf", FaceTF)
        # Initialize the node
        rospy.init_node('face_tracker')
        # Subscribe to the image topic
        rospy.Subscriber("/usb_cam/image_raw", Image, self.img_received, queue_size=1)

    def run(self):
        try:
            rospy.spin()
        except KeyboardInterrupt:
            cv2.destroyAllWindows()

    def img_received(self, message):
        # Convert ROS message to Numpy image
        self.np_image = np.array(self.bridge.imgmsg_to_cv(message, 'bgr8'))
        gray = cv2.cvtColor(self.np_image, cv2.COLOR_BGR2GRAY)
        gray = cv2.equalizeHist(gray)
        self.rects = detect(gray, self.cascade)
        self.vis = self.np_image.copy()
        draw_rects(self.vis, self.rects, (0, 255, 0))
        self.publish_output()
        cv2.waitKey(1)

    # Publish the annotated image and the face offset from the image center
    def publish_output(self):
        global minFaceSz
        global maxFaceSz
        cv_image = cv.fromarray(self.vis)
        ros_msg = self.bridge.cv_to_imgmsg(cv_image, encoding="bgr8")
        self.pub.publish(ros_msg)
        if len(self.rects) != 0:
            diffx = self.rects[0][0] + self.rects[0][2]
            diffy = self.rects[0][1] + self.rects[0][3]
            # Offset of the face center from the center of the 640x472 frame
            self.pub2.publish(1 * (diffx / 2 - 320), 1 * (diffy / 2 - 236))
            # Bound the detector's search size around the last detection
            minFaceSz = int(0.5 * diffx)
            maxFaceSz = int(1.5 * diffx)
        else:
            # No face found: reset to a wide search range
            minFaceSz = 30
            maxFaceSz = 500

if __name__ == '__main__':
    node = FaceTracker()
    node.run()
tf_listener.py:
#!/usr/bin/env python
import roslib
import rospy
import tf
import math
from ar_track_alvar.msg import FaceTF
from std_msgs.msg import String
from std_msgs.msg import UInt16
from tf.transformations import euler_from_quaternion
import numpy as np
import time
# Controller state and parameters
u = 0             # servo command for the horizontal (x) axis
ref = 0           # reference pixel offset (face centered)
k = 10.0 / 640    # proportional gain, x axis
ky = 10.0 / 472   # proportional gain, y axis
uy = 0            # servo command for the vertical (y) axis
refy = 0
xcenter = 62.5    # initial angle for the x-axis servo
ycenter = 62.5    # initial angle for the y-axis servo
uold = 50         # previous x-axis command
uyold = 62.5      # previous y-axis command
xsensorold = 0    # previous x measurement
ysensorold = 0    # previous y measurement
ysensorold=0
def callback(msg):
    global u
    global ref
    global xcenter
    global ycenter
    global uold
    global uyold
    global xsensorold
    global ysensorold
    pub.publish("*******************************")
    pub.publish(time.strftime("%H:%M:%S +0000", time.gmtime()))
    xsensor = msg.x
    pub.publish('x:' + str(xsensor))
    ysensor = msg.y
    pub.publish('y:' + str(ysensor))
    ### Controller START
    # Filter for outliers: ignore jumps of more than 300 pixels
    if (abs(xsensor - xsensorold) > 300):
        xsensor = xsensorold
    if (abs(ysensor - ysensorold) > 300):
        ysensor = ysensorold
    # Two levels of proportional controller gains
    if (abs(xsensor) > 180):
        k = 15.0 / 640   # When far from center
    else:
        k = 10.0 / 640   # For fine adjustment when close to the center
    if (abs(ysensor) > 180):
        ky = 12.0 / 472  # When far from center
    else:
        ky = 9.0 / 472   # For fine adjustment when close to the center
    # Controller part: new angle = old angle + gain * pixel error
    err = ref - xsensor
    u = -1 * err * k + uold
    erry = refy - ysensor
    uy = -1 * erry * ky + uyold
    # Filter to limit angle change to 5 degrees per update
    if (u - uold > 5):
        u = uold + 5
    if (u - uold < -5):
        u = uold - 5
    # Filter to limit total angle to the servos' joint limits
    if (u > 120):
        u = 120
    elif (u < 20):
        u = 20
    if (uy > 90):
        uy = 90
    elif (uy < 30):
        uy = 30
    # Publishing to terminal listener
    pub.publish('u:' + str(u))
    pub.publish('uy:' + str(uy))
    # Publishing to servos only when the face is outside a dead band
    limit = 50
    if (xsensor < -1 * limit or xsensor > limit):
        servo2.publish(u)
    if (ysensor < -1 * limit or ysensor > limit):
        servo1.publish(uy)
    # Setting history values
    uold = u
    uyold = uy
    xsensorold = xsensor
    ysensorold = ysensor
    ### Controller END
if __name__ == '__main__':
    rospy.init_node('tf_ar_tag')
    pub = rospy.Publisher("tf_publisher", String)
    servo2 = rospy.Publisher("servo2", UInt16)
    servo1 = rospy.Publisher("servo1", UInt16)
    time.sleep(1)
    pub.publish('initializing angle')
    # Initializing servos to the center
    servo2.publish(xcenter)
    servo1.publish(ycenter)
    rospy.Subscriber("face_tf", FaceTF, callback)
    rospy.spin()
Arduino.ino:
#if defined(ARDUINO) && ARDUINO >= 100
#include "Arduino.h"
#else
#include <WProgram.h>
#endif
#include <Servo.h>
#include <ros.h>
#include <std_msgs/UInt16.h>
ros::NodeHandle nh;
Servo servo1; //Yaw Servo
Servo servo2; //Pitch Servo

//Servo1 Callback Function
void servo1_cb(const std_msgs::UInt16& cmd_msg){
  servo1.write(cmd_msg.data);             //set servo angle, should be from 0-180
  digitalWrite(13, HIGH-digitalRead(13)); //toggle led for debugging
}

//Servo2 Callback Function
void servo2_cb(const std_msgs::UInt16& cmd_msg){
  servo2.write(cmd_msg.data);             //set servo angle, should be from 0-180
  digitalWrite(13, HIGH-digitalRead(13)); //toggle led for debugging
}

//Subscribing to servo1 topic
ros::Subscriber<std_msgs::UInt16> sub1("servo1", servo1_cb);
//Subscribing to servo2 topic
ros::Subscriber<std_msgs::UInt16> sub2("servo2", servo2_cb);

void setup(){
  //led for debugging
  pinMode(13, OUTPUT);
  nh.initNode();
  nh.subscribe(sub1);
  nh.subscribe(sub2);
  servo1.attach(8); //attach servo1 to pin 8
  servo2.attach(9); //attach servo2 to pin 9
}

void loop(){
  nh.spinOnce();
  delay(1);
}