Camera Calibration Through Camera Projection Loss

Camera calibration is a necessity in various tasks including 3D reconstruction, hand-eye coordination for robotic interaction, autonomous driving, etc. In this work we propose a novel method to predict extrinsic (baseline, pitch, and translation) and intrinsic (focal length and principal point offset) parameters using an image pair. Unlike existing methods, instead of designing an end-to-end solution, we propose a new representation that incorporates camera model equations as a neural network in a multi-task learning framework. We estimate the desired parameters via a novel camera projection loss (CPL) that uses the camera model neural network to reconstruct the 3D points and uses the reconstruction loss to estimate the camera parameters. To the best of our knowledge, ours is the first method to jointly estimate both the intrinsic and extrinsic parameters via a multi-task learning methodology that combines analytical equations in a learning framework for the estimation of camera parameters. We also propose a novel dataset generated using the CARLA Simulator [1]. Empirically, we demonstrate that our proposed approach achieves better performance than both deep learning-based and traditional methods on 8 out of 10 parameters, evaluated using both synthetic and real data.
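The idea of a loss computed on reconstructed 3D points, rather than directly on the parameters, can be illustrated with a minimal sketch. It assumes a simple pinhole stereo model (depth from disparity) and ignores pitch and translation; the names CameraParams, reconstruct, and projection_loss are hypothetical and are not taken from the paper or its code.

```python
from dataclasses import dataclass


@dataclass
class CameraParams:
    focal: float     # focal length in pixels (assumed unit)
    u0: float        # principal point offset along x, in pixels
    v0: float        # principal point offset along y, in pixels
    baseline: float  # stereo baseline in metres (assumed unit)


def reconstruct(params, u, v, disparity):
    """Back-project pixel (u, v) with a known disparity to a 3D point."""
    z = params.focal * params.baseline / disparity
    x = (u - params.u0) * z / params.focal
    y = (v - params.v0) * z / params.focal
    return (x, y, z)


def projection_loss(pred, true, pixels):
    """Mean absolute error between the 3D points reconstructed with the
    predicted parameters and those reconstructed with the true ones."""
    total = 0.0
    for u, v, d in pixels:
        p = reconstruct(pred, u, v, d)
        t = reconstruct(true, u, v, d)
        total += sum(abs(a - b) for a, b in zip(p, t))
    return total / (3 * len(pixels))


true = CameraParams(focal=100.0, u0=50.0, v0=50.0, baseline=0.5)
pred = CameraParams(focal=100.0, u0=50.0, v0=50.0, baseline=1.0)
pixels = [(60.0, 50.0, 10.0)]  # (u, v, disparity) samples, made up for illustration
loss = projection_loss(pred, true, pixels)
```

In this sketch an error in any single parameter shows up as an error in the reconstructed points, which is why one loss can supervise several parameters at once.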

Paper

Code

ACF Based Region Proposal Extraction for YOLOv3 Network Towards High-Performance Cyclist Detection

Implementation of the paper "ACF Based Region Proposal Extraction for YOLOv3 Network Towards High-Performance Cyclist Detection in High Resolution Images".

Demo

Article

Code

Autonomous driving in car simulations

Built a model for steering angle prediction on Euro Truck Simulator 2.

Demo

AlphaGo Zero on Tic Tac Toe

Mastering Tic Tac Toe using self-play and reinforcement learning.

Article

Presentation

Code

Cross-View Image Retrieval - Ground to Aerial Image Retrieval Through Deep Learning

Cross-modal retrieval aims to measure the content similarity between different types of data. The idea has been previously applied to visual, text, and speech data. In this paper, we present a novel cross-modal retrieval method specifically for multi-view images, called Cross-view Image Retrieval (CVIR). Our approach aims to find a feature space as well as an embedding space in which samples from street-view images are compared directly to satellite-view images (and vice versa). For this comparison, a novel deep metric learning based solution, "DeepCVIR", has been proposed. Previous cross-view image datasets are deficient in that they (1) lack class information; (2) were originally collected for the cross-view image geolocalization task with coupled images; (3) do not include any images from off-street locations. To train, compare, and evaluate the performance of cross-view image retrieval, we present a new 6-class cross-view image dataset termed CrossViewRet, which comprises images of freeway, mountain, palace, river, ship, and stadium scenes, with 700 high-resolution dual-view images for each class. Results show that the proposed DeepCVIR outperforms conventional matching approaches on the CVIR task for the given dataset and would also serve as the baseline for future research.
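Retrieval in a shared embedding space can be sketched as follows. The abstract does not state which metric learning objective DeepCVIR uses, so the contrastive loss here is a swapped-in, commonly used stand-in, and the function names, the margin value, and the toy embeddings are all assumptions for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_class, margin=1.0):
    """Pull same-class pairs together; push other pairs at least `margin` apart."""
    d = euclidean(a, b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


def retrieve(query, gallery, k=1):
    """Return the indices of the k gallery embeddings closest to the query."""
    order = sorted(range(len(gallery)), key=lambda i: euclidean(query, gallery[i]))
    return order[:k]


# Toy example: a street-view query against three satellite-view embeddings.
street_query = [0.0, 0.0]
satellite_gallery = [[5.0, 5.0], [1.0, 0.0], [0.0, 2.0]]
top2 = retrieve(street_query, satellite_gallery, k=2)
```

Once both views are embedded in the same space, retrieval in either direction reduces to a nearest-neighbour search, which is what retrieve does here.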

Paper

Article

Object detection and tagging

Worked on object detection and tagging from videos, starting with compiling OpenCV, followed by optimization using UMat objects and deployment using PostgreSQL, Flask, and RabbitMQ to handle multiple requests.

Article

Nerve segmentation

The aim of this project was to segment a collection of nerves called the Brachial Plexus (BP) in ultrasound images of the neck by applying different machine learning techniques.

Report

Skin classification

Building a deep learning model that classifies skin images with samples of 8 common skin pathologies and carcinoma.

Code

Box office prediction

Box office prediction on the TMDB dataset.

Code

Sentiment analysis

Visualization along with analysis using LSVC, Regression, ANN, and LSTM on movie reviews, utilizing NLTK, Keras, Sklearn, Gensim, WordCloud, and Yellowbrick.

Code

JPEG compression

JPEG compression using DCT (Discrete Cosine Transform) and DWT (Discrete Wavelet Transform) in MATLAB.
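The DCT half of this pipeline can be sketched on a single 8x8 block. The project itself is in MATLAB; this is a pure-Python illustration only, and it uses a single uniform quantization step (a hypothetical simplification) instead of a JPEG quantization table. The DWT variant is not shown.

```python
import math

N = 8  # JPEG operates on 8x8 blocks


def _alpha(k):
    """Normalization factor of the orthonormal DCT-II."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(x, u):
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """2D DCT-II of an NxN block (direct, unoptimized definition)."""
    return [[_alpha(u) * _alpha(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse of dct2."""
    return [[sum(_alpha(u) * _alpha(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs, step):
    """Lossy step: round each coefficient to a multiple of `step`."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    return [[q * step for q in row] for row in levels]


flat = [[128.0] * N for _ in range(N)]          # a uniform grey block
levels = quantize(dct2(flat), step=16)          # only the DC term survives
restored = idct2(dequantize(levels, step=16))
```

For a uniform block all of the energy lands in the top-left (DC) coefficient, so every other quantized value is zero; that sparsity is what the later entropy-coding stage of JPEG exploits.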

Article

Code

PySpark examples

Some code examples in PySpark.

Article

Code

Algorithm comparison using R

Comparison of Classification and Clustering Algorithms using different datasets.

Code

Erdős-Rényi and small world

Exploring how random graphs work by applying them to real-world datasets.
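For reference, the Erdős-Rényi G(n, p) model named in the title can be generated in a few lines. This is a minimal sketch, not the project's code; the function names and the seed parameter are assumptions, and the small-world model is not covered.

```python
import random


def erdos_renyi(n, p, seed=None):
    """Sample G(n, p): each of the n*(n-1)/2 possible edges appears
    independently with probability p."""
    rng = random.Random(seed)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                edges.add((i, j))
    return edges


def average_degree(n, edges):
    """Each undirected edge contributes to the degree of two nodes."""
    return 2 * len(edges) / n


graph = erdos_renyi(200, 0.05, seed=1)
observed = average_degree(200, graph)  # should be close to p * (n - 1)
```

Comparing statistics such as the average degree of a sampled graph with those of a real network is one simple way to see where the random model does and does not fit.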

Code

Girvan-Newman algorithm

Implementation of the Girvan-Newman Algorithm to identify communities.
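One step of the algorithm, removing the edges with the highest betweenness until the graph splits, can be sketched in pure Python. This is an illustrative version for small undirected, unweighted graphs, not the project's implementation; all function names are hypothetical.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness. Each path is counted from both
    endpoints, which doubles every score but leaves the ranking unchanged."""
    bet = {}
    for s in adj:
        dist, order = {s: 0}, []
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate from the farthest nodes back
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1 + delta[w])
                edge = tuple(sorted((v, w)))
                bet[edge] = bet.get(edge, 0.0) + c
                delta[v] += c
    return bet


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman_split(adj):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        bet = edge_betweenness(adj)
        if not bet:
            break
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Two triangles joined by the bridge 2-3.
demo = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
communities = girvan_newman_split(demo)
```

Betweenness is recomputed after every removal, which is the costly part of the method; calling girvan_newman_split again on each resulting community would give the next level of the hierarchy.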

Code

Analysis of who-trust-whom online social network using PageRank and other components

Exploring the structure of a directed social network, namely the Epinions social network. This is a who-trust-whom online social network of the general consumer review site Epinions.com. Members of the site can decide whether to "trust" each other. All the trust relationships interact and form the Web of Trust, which is then combined with review ratings to determine which reviews are shown to the user.
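PageRank on a directed trust graph can be sketched with plain power iteration. The graph below is a made-up toy example, not Epinions data, and the function name, damping factor, and iteration count are assumptions chosen for illustration.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank. `out_links` maps every node to the list
    of nodes it points to (here: the members it trusts); every target
    must also appear as a key."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new[w] += damping * rank[v] / len(out_links[v])
        rank = new
    return rank


# "a" and "b" both trust "c"; "c" trusts "a".
trust = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(trust)
most_trusted = max(ranks, key=ranks.get)
```

A member who is trusted by highly ranked members ends up with a high rank, which is the intuition behind using PageRank on a trust network.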

Code

Tree implementation in C++

This project consisted of first implementing a (general) tree in C++. The tree was then used to load a dataset of weekly sales by various departments in multiple stores owned by a single retail chain, and to answer interesting queries on that data.

Code