Done with my MS-Thesis
A necessity for my thesis
Header files in C++
Camera Calibration through Camera Projection Loss
PythonRobotics: a Python code collection of robotics algorithms
Camera calibration is a necessity in various tasks including 3D reconstruction, hand-eye coordination for robotic interaction, autonomous driving, etc. In this work we propose a novel method to predict extrinsic (baseline, pitch, and translation) and intrinsic (focal length and principal point offset) parameters using an image pair. Unlike existing methods, instead of designing an end-to-end solution, we propose a new representation that incorporates camera model equations as a neural network in a multi-task learning framework. We estimate the desired parameters via a novel camera projection loss (CPL) that uses the camera model neural network to reconstruct the 3D points and uses the reconstruction loss to estimate the camera parameters. To the best of our knowledge, ours is the first method to jointly estimate both the intrinsic and extrinsic parameters via a multi-task learning methodology that combines analytical equations with a learning framework. We also propose a novel dataset created using the CARLA Simulator [1]. Empirically, we demonstrate that our proposed approach achieves better performance than both deep learning-based and traditional methods on 8 out of 10 parameters evaluated using both synthetic and real data. Our code and generated dataset will be made publicly available to facilitate future research.
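To make the idea of a reconstruction-based loss concrete, here is a minimal sketch. It is not the implementation from the paper: it assumes a simplified pinhole stereo model (depth from disparity, no pitch or translation), and the function names, the parameter dictionary keys, and the use of mean absolute error are all illustrative assumptions.

```python
def reconstruct_point(u, v, disparity, params):
    """Back-project pixel (u, v) with a stereo disparity into 3D camera coordinates.

    Assumes a simplified pinhole stereo model; `params` is a hypothetical dict
    with keys focal_length, baseline, u0, v0 (principal point offset).
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    f = params["focal_length"]
    z = f * params["baseline"] / disparity   # depth from disparity
    x = (u - params["u0"]) * z / f           # horizontal coordinate
    y = (v - params["v0"]) * z / f           # vertical coordinate
    return (x, y, z)


def projection_loss(pred_params, true_params, observations):
    """Mean absolute error between 3D points reconstructed from the predicted
    parameters and from the ground-truth parameters.

    `observations` is a list of (u, v, disparity) tuples.
    """
    total = 0.0
    for u, v, d in observations:
        p = reconstruct_point(u, v, d, pred_params)
        t = reconstruct_point(u, v, d, true_params)
        total += sum(abs(a - b) for a, b in zip(p, t))
    return total / (3 * len(observations))


true_params = {"focal_length": 500.0, "baseline": 0.5, "u0": 320.0, "v0": 240.0}
pred_params = dict(true_params, baseline=0.6)  # a deliberately wrong baseline
observations = [(420.0, 340.0, 10.0), (300.0, 200.0, 20.0)]

print(projection_loss(true_params, true_params, observations))  # zero for a perfect prediction
print(projection_loss(pred_params, true_params, observations))  # positive for a wrong baseline
```

The point of the sketch is that an error in any single parameter shows up as an error in the reconstructed 3D points, so one loss on the points can supervise all parameters jointly.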
IBM Watson
Probability Primer
Last of the questions which I remember from my interview
Yet another interview question
Another question I got in an interview
Comparison of Grimson’s background subtraction method with Gaussian and simple background subtraction
A gentle revisit
jpaketest.c:1:10: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘.’ token
W: The repository ‘http://dist.carla.org/carla xenial Release’ does not have a Release file.
N: Data from such a repository can’t be authenticated and is therefore potentially dangerous to use.
Installing the NVIDIA display driver...
A system reboot is required to continue installation. Please reboot then run the installer again.
An attempt has been made to disable Nouveau.
If this message persists after reboot, please see the display driver log file at /var/log/nvidia-installer.log for more information.
It appears that an X server is running. Please exit X before installation. If you’re sure that X is not running, but are getting this error, please delete any X lock files in /tmp.
Literature Review
Gazebo Simulator of the Turtlebot
Create a ROS Workspace
Initial Setup
COVID-19 Global Forecasting
A paper implementation, or maybe just an aggregation of some open source implementations
What we had, What we thought, What we did
The course that haunted me while I was studying it
OpenCV with dnn module
Creating a web application with Python, Flask, PostgreSQL and deploying it on Heroku
What I learned from a failed experiment
I recently worked on camera calibration for stereo vision, and for that I studied the topic a little. What follows is my understanding of it.
I recently worked on JPEG image compression. What follows is what I understood along the way. The following articles really helped me during the process.
How I implemented AlphaGo Zero for the Tic-Tac-Toe game
Cross-modal retrieval aims to measure the content similarity between different types of data. The idea has been previously applied to visual, text, and speech data. In this paper, we present a novel cross-modal retrieval method specifically for multi-view images, called Cross-view Image Retrieval (CVIR). Our approach aims to find a feature space as well as an embedding space in which samples from street-view images are compared directly to satellite-view images (and vice versa). For this comparison, a novel deep metric learning based solution, “DeepCVIR”, is proposed. Previous cross-view image datasets are deficient in that they (1) lack class information; (2) were originally collected for the cross-view image geolocalization task with coupled images; (3) do not include any images from off-street locations. To train, compare, and evaluate the performance of cross-view image retrieval, we present a new 6-class cross-view image dataset termed CrossViewRet, which comprises images of freeway, mountain, palace, river, ship, and stadium scenes, with 700 high-resolution dual-view images for each class. Results show that the proposed DeepCVIR outperforms conventional matching approaches on the CVIR task for the given dataset and would also serve as a baseline for future research.