Aerial Image Understanding

Welcome! 


We are a team of PhD students and experienced researchers with an ambition to push the frontiers of AI and take important steps towards making the technologies of tomorrow possible. We aim to design and implement efficient image processing algorithms that can run onboard UAVs and UGVs. Our main objectives are:

1) To create fast methods for online and unsupervised learning in large spatiotemporal volumes of data, capable of functioning in dynamic, real-world scenarios. We intend to use different kinds of imaging and 4D (3D + time) sensing capabilities, ranging from fixed sensors to cameras mounted on UAVs. Given the huge amounts of unlabeled data available and the high cost of manual annotation, unsupervised learning is crucial for the development of new AI technologies. Therefore, we will focus on efficient learning methods that require no human supervision.

2) To develop methods capable of complete scene understanding, from the level of objects and activities involving objects to translating the visual scene into natural language. 

3) To give drones the capacity to “see” and understand the world in which they fly.

Our research interests are divided into three main directions:


Scene understanding

We aim to fully understand a scene. For us, this means reaching consensus among as many visual representations as possible and updating models as new RGB data arrives. While it is intuitive that multiple visual representations can help boost the performance of an additional one, it is unclear how to achieve performance gains without any new labels. This is what we aim to achieve using consensus within a hypergraph of representations. Ideally, we would like to train a model on a small number of labels and then gradually improve its scene understanding using only new, unlabeled RGB data.
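As a minimal sketch of the consensus idea (the function name and weighting scheme are illustrative, not our actual hypergraph formulation), per-pixel predictions from several representation pathways can be fused into a single map that serves as a pseudo-label for the next, label-free training round:

```python
import numpy as np

def consensus_pseudolabels(predictions, weights=None):
    """Fuse per-pixel predictions from several representation pathways
    into one consensus map. `predictions` is a list of (H, W) arrays
    with values in [0, 1]; the result can be used as a pseudo-label."""
    stacked = np.stack(predictions)                      # (K, H, W)
    if weights is None:
        weights = np.full(len(predictions), 1.0 / len(predictions))
    # Weighted average across pathways: pixels where the pathways agree
    # keep confident values, while disagreements are averaged out.
    return np.tensordot(weights, stacked, axes=1)

# Toy example: three pathways voting on a 2x2 confidence map.
paths = [np.array([[1.0, 0.0], [1.0, 0.2]]),
         np.array([[0.8, 0.0], [1.0, 0.4]]),
         np.array([[0.9, 0.3], [1.0, 0.0]])]
consensus = consensus_pseudolabels(paths)
```

In practice the weights themselves could be learned from the level of agreement among pathways; uniform weights are only the simplest instance of the idea.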


Semantic segmentation

There are tasks where unsupervised learning does not yet excel. One of the most common is semantic segmentation: due to the wide variety of environments, it is difficult to go from raw pixels to semantic classes without any prior knowledge. However, given a set of initial labels, we can deploy custom temporal propagation methods to improve segmentation performance. Our methods yield large improvements using only optical flow and geometric constraints, both of which can be obtained at no additional labelling cost.
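The core of temporal propagation can be sketched as warping a label map along dense optical flow. The snippet below is a simplified illustration (nearest-neighbour warping with a backward flow field; names and conventions are assumptions, not our exact method):

```python
import numpy as np

def propagate_labels(labels_t, backward_flow):
    """Warp an integer (H, W) label map from frame t to frame t+1.
    `backward_flow` has shape (H, W, 2) and gives, for each pixel of
    frame t+1, the (dy, dx) offset back to its source in frame t.
    Nearest-neighbour sampling keeps the labels discrete."""
    h, w = labels_t.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + backward_flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + backward_flow[..., 1]).astype(int), 0, w - 1)
    return labels_t[src_y, src_x]

# Toy example: a backward flow that shifts content one pixel to the right.
labels_t = np.array([[1, 0], [0, 0]])
backward_flow = np.zeros((2, 2, 2))
backward_flow[..., 1] = -1.0
labels_t1 = propagate_labels(labels_t, backward_flow)
```

Propagated labels obtained this way can then supervise the segmentation network on frames that were never manually annotated, which is where the label-free gains come from.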


Localization & obstacle avoidance

Accurate localization, both absolute and relative, can help higher-level tasks such as semantic segmentation and depth estimation. The goal is to iteratively improve the labels of static objects using successive scans of a target region, while avoiding obstacles that could damage the unmanned vehicle. Our work addresses both the localization and the avoidance problems in ways that require as few labels as possible: synthetic data and reconstructed views are the main sources of training data. Furthermore, some of our models have been tested on embedded devices and achieve reasonable performance even for real-time, on-board use.
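Assuming the successive scans have been aligned to a common frame by the localization step, the iterative label refinement can be sketched as a per-pixel majority vote across scans (a deliberately simple stand-in for the actual fusion method; the function name is hypothetical):

```python
import numpy as np

def fuse_scans(label_maps, num_classes):
    """Fuse aligned integer label maps from successive scans of the same
    static region by per-pixel majority vote. Each scan contributes one
    vote per pixel; the most-voted class wins."""
    votes = np.zeros(label_maps[0].shape + (num_classes,), dtype=int)
    for m in label_maps:
        for c in range(num_classes):
            votes[..., c] += (m == c)
    return votes.argmax(axis=-1)

# Toy example: three noisy scans of a 2x2 region with 3 classes.
scans = [np.array([[0, 1], [1, 1]]),
         np.array([[0, 1], [2, 1]]),
         np.array([[0, 2], [2, 1]])]
fused = fuse_scans(scans, num_classes=3)
```

Because the objects are static, every additional pass over the region adds evidence, so label quality improves with the number of scans at no extra annotation cost.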

Team Members

PhD Supervisors & Research Coordinators

Prof. Dr. Marius Leordeanu

Institute of Mathematics of the Romanian Academy & University Politehnica of Bucharest

Prof. Dr. Ing. Emil Slusanschi

University Politehnica of Bucharest

PhD Students & Researchers

Alina Marcu

PhD Student, Institute of Mathematics of the Romanian Academy

Dragos Costea

PhD Student, University Politehnica of Bucharest

Mihai Pirvu

PhD Student, University Politehnica of Bucharest

Vlad Licaret

AI Engineer, University Politehnica of Bucharest

Mihai Masala

PhD Student, University Politehnica of Bucharest

Collaborators

Prof. Dr. Traian Rebedea

University Politehnica of Bucharest

Iulia Paraicu

PhD Student, Institute of Mathematics of the Romanian Academy

Funding & Partnerships


EEA and Norway Grant 2019-2022: EEA-RO-2018-0496 (1.5 Million Euro) “Spacetime Vision – Towards Unsupervised Learning in the 4D World”

UEFISCDI Grant 2018-2020: PN-III-P1-1.2-PCCDI2017-0734 (1.7 Million Euro) “Robots and Society: Cognitive Systems for Personal Robots and Autonomous Vehicles” (Marius Leordeanu is the PI of the IMAR Partner).

UEFISCDI Grant 2018-2020: TE-2016-2182 (100K Euro) “Vision in Words: Automatic Linguistic Description of Objects, People and their Interactions in Indoor Videos”.

UEFISCDI ERC-like Grant 2016-2018: ERC-2016-0007 (170K Euro) “The Classifier Graph: A Recursive Multiclass Network for Deep Category Recognition in Images and Video”.

UEFISCDI Grant 2016-2018: PED-2016-1842 (105K Euro) “Automatic linguistic descriptions of objects, people and their interactions in indoor videos”. 

UEFISCDI Grant 2012-2016: PCE-2012-4-0581 (300K Euro), “Automatic Video Understanding at Middle and Higher Levels of Interpretation”. 

European Funds Grant 2015-2019: POC-A1.2.1D-2015-P39-287 (1 Million Euro) “Automatic interpretation of images and video sequences using natural language processing” (PI with Traian Rebedea).