The aim of this project is to quantify transfer learning and cross-game compatibility in Atari. Inspired by OpenAI's Requests for Research 2.0, we train 5 policies on 5 Atari games and collect 450,000 steps from the policy of each game. With these trajectories, a generative model (an action-conditioned autoencoder) is trained on 2 of the games, then fine-tuned and tested on each of the other 3. The goal is to quantify the effectiveness of this method and the benefit (or lack thereof) of pre-training on the 2 other games. The effects of pre-training and of the amount of available fine-tuning data will be explored to further assess the feasibility and practicality of this approach.
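The experimental design described above can be sketched as a small run grid. This is only an illustration under stated assumptions: the game names are placeholders (the text does not name the five games), and the fine-tuning fractions are hypothetical values chosen to show how "available fine-tuning data" might be varied.

```python
# Placeholder names: the project text does not say which five games were used.
GAMES = ["game_a", "game_b", "game_c", "game_d", "game_e"]
STEPS_PER_GAME = 450_000  # from the text: steps collected from each game's policy


def make_split(pretrain_games, games=GAMES):
    """Split the games into 2 pre-training games and 3 fine-tune/test targets."""
    pretrain = list(pretrain_games)
    if len(set(pretrain)) != 2 or not set(pretrain) <= set(games):
        raise ValueError("expected two distinct games from the game list")
    targets = [g for g in games if g not in pretrain]
    return {"pretrain": pretrain, "targets": targets}


def experiment_grid(pretrain_games, fractions=(0.1, 0.5, 1.0)):
    """Enumerate (target game, pretrained?, fine-tune steps) runs to compare.

    The fractions are illustrative assumptions, not values from the project.
    """
    split = make_split(pretrain_games)
    runs = []
    for target in split["targets"]:
        for pretrained in (False, True):  # baseline from scratch vs. pre-trained
            for frac in fractions:
                runs.append((target, pretrained, round(STEPS_PER_GAME * frac)))
    return runs


if __name__ == "__main__":
    for run in experiment_grid(["game_a", "game_b"]):
        print(run)
```

Pairing every pre-trained run with a from-scratch run on the same target and data budget is what makes the benefit (or lack thereof) of pre-training measurable.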
The aim of this project is to create a distributed embedded vision system for real-time object detection, tracking, and anomalous-action indication across multiple cameras. Each camera is equipped with an embedded vision computing device and acts as an edge node. Each embedded vision platform operates as a single IoT device capable of running real-time video analytics. Multiple embedded vision platforms will communicate directly with each other to realize distributed detection and tracking in a larger environment. A combination of deep learning-based models and classical computer vision algorithms will be developed and integrated to realize distributed visual tracking at the edge. With privacy as a priority, all video must be processed at the edge, no personally identifying methods (such as face detection) may be used, and only the minimal necessary encoded metadata is communicated across the network.
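As a rough illustration of the privacy constraint, the sketch below shows what a metadata-only message between edge nodes might look like. The `TrackMessage` type, its fields, and the JSON encoding are all hypothetical; the project text does not specify the actual message format or transport.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrackMessage:
    """Hypothetical per-detection metadata; no pixels or identity features."""
    node_id: str      # which edge node produced the detection
    timestamp: float  # capture time in seconds
    track_id: int     # track identifier local to the node
    label: str        # coarse object class, e.g. "person"
    bbox: tuple       # (x, y, width, height) in pixels


def encode(msg: TrackMessage) -> bytes:
    """Serialize the message to compact JSON bytes for sending to a peer."""
    return json.dumps(asdict(msg), separators=(",", ":")).encode("utf-8")


def decode(payload: bytes) -> TrackMessage:
    """Rebuild a TrackMessage from bytes received from a peer."""
    data = json.loads(payload.decode("utf-8"))
    data["bbox"] = tuple(data["bbox"])  # JSON arrays come back as lists
    return TrackMessage(**data)


if __name__ == "__main__":
    msg = TrackMessage("cam-01", 12.5, 7, "person", (100, 80, 40, 120))
    payload = encode(msg)
    print(len(payload), "bytes:", decode(payload))
```

A message like this is tens of bytes per detection, whereas raw frames never leave the device.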
[Report] [Code] [Expo Video] [Short Demo]
In this work, we explore the use of a state-of-the-art convolutional neural network architecture for single-shot detection (YOLOv3) and adapt it to mid-wave infrared imagery. Over 80% accuracy is achieved on the Military Sensing Information Analysis Center Automatic Target Recognition dataset across all ranges (1000 m to 5000 m), with inference running at approximately 10 FPS on the Nvidia Xavier embedded GPU platform.