Projects

Ph.D. Research Projects

Q-RBSA: High-Resolution 3D EBSD Map Generation Using An Efficient Quaternion Transformer Network

Feb 2022 - March 2023

Gathering 3D material microstructural information is time-consuming, expensive, and energy-intensive. Acquisition of 3D data has been accelerated by developments in serial sectioning instrument capabilities; however, for crystallographic information, the electron backscatter diffraction (EBSD) imaging modality remains rate limiting. We propose a physics-based efficient deep learning framework to reduce the time and cost of collecting 3D EBSD maps. Our framework uses a quaternion residual block self-attention network (QRBSA) to generate high-resolution 3D EBSD maps from sparsely sectioned EBSD maps. In QRBSA, quaternion-valued convolution effectively learns local relations in orientation space, while self-attention in the quaternion domain captures long-range correlations. We apply our framework to 3D data collected from commercially relevant titanium alloys, showing both qualitatively and quantitatively that our method can predict missing samples (EBSD information between sparsely sectioned mapping points) as compared to high-resolution ground truth 3D EBSD maps.
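For readers unfamiliar with quaternion-valued layers, the following is a minimal sketch of the Hamilton product that such layers are built on. It assumes orientations are stored as quaternions in (w, x, y, z) order; the function name and array layout are illustrative and are not taken from the Q-RBSA code.

```python
import numpy as np

def hamilton_product(p, q):
    """Hamilton product of quaternions stored as (..., 4) arrays in (w, x, y, z) order.

    The inputs broadcast against each other, so a single quaternion weight
    can be applied to a whole map of orientations.
    """
    pw, px, py, pz = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    qw, qx, qy, qz = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    return np.stack([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ], axis=-1)

# Hypothetical usage: one unit quaternion applied to a 2 x 3 map of unit quaternions.
weight = np.array([np.cos(0.3), 0.0, 0.0, np.sin(0.3)])
orientation_map = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (2, 3, 1))
rotated_map = hamilton_product(weight, orientation_map)
```

Because the four real components are mixed by a fixed rule, a quaternion weight ties the components of an orientation together instead of treating them as four unrelated channels.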

[Paper] [Code]

EBSD-SR: Super-Resolution for Electron Backscatter Diffraction Maps using Physics-Guided Neural Networks

June 2020 - Dec 2022

The problem of single image super-resolution (SISR) using convolutional neural networks has been extensively explored in the computer vision field. Current architectures perform well in both the qualitative and quantitative output evaluation of optical images, but applications to experimental data in the physical sciences have not been well explored. This data is often gathered using other modalities, producing images that differ both in appearance and encoded information. One example of this is electron backscatter diffraction (EBSD), a widely used materials characterization technique that produces maps describing the crystal arrangement of solid materials. These maps are critically important to the understanding of materials behavior in a wide array of applications ranging from biomedical to aerospace. The ability to generate a physically accurate, high-resolution EBSD map from a captured low-resolution map would significantly impact both science and industry, allowing for the capture of three-dimensional EBSD experimental data in a high-throughput manner and with high spatial resolution. However, as EBSD maps are built by indexing electron diffraction patterns that follow crystallographic symmetry conventions, we cannot directly apply state-of-the-art SISR networks to generate physically accurate high-resolution images. For the neural network to have an appropriate understanding of the relevant domain knowledge, physics-based information must be incorporated into the learning process. Therefore, we introduce a novel physics-based loss for network training that accounts for both orientation representation and crystallographic symmetry. 
We apply this approach to EBSD data of a commercially relevant titanium alloy and show, both quantitatively and qualitatively, that results generated with the physics-based loss are significantly better than those obtained with the commonly used L_1 loss or with traditional upsampling algorithms such as bilinear, bicubic, and nearest-neighbor interpolation. Additionally, we present a new inference pipeline to convert network output into established visualization formats for EBSD maps.
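One way to picture a loss that accounts for both orientation representation and crystallographic symmetry is a rotation angle minimized over symmetry operators. The sketch below is an illustration under assumptions, not the loss used in the paper: quaternions are unit-norm in (w, x, y, z) order, symmetry operators are applied on the right, and `symmetry_ops` is a caller-supplied list. The two-element group in the usage lines is hypothetical and much smaller than a real crystal point group.

```python
import numpy as np

def quat_multiply(p, q):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ])

def rotation_angle_between(q1, q2):
    """Rotation angle (radians) between two unit quaternions, treating q and -q as equal."""
    dot = abs(float(np.dot(q1, q2)))
    return 2.0 * np.arccos(np.clip(dot, 0.0, 1.0))

def symmetry_aware_angle(q_pred, q_true, symmetry_ops):
    """Smallest rotation angle between q_pred and any symmetric equivalent of q_true."""
    return min(rotation_angle_between(q_pred, quat_multiply(q_true, s))
               for s in symmetry_ops)

# Hypothetical symmetry group: identity plus a 180 degree rotation about z.
toy_symmetry = [np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.0, 0.0, 0.0, 1.0])]
q_true = np.array([1.0, 0.0, 0.0, 0.0])
q_pred = np.array([0.0, 0.0, 0.0, 1.0])
plain_angle = rotation_angle_between(q_pred, q_true)
symmetric_angle = symmetry_aware_angle(q_pred, q_true, toy_symmetry)
```

In this toy case the plain angle reports a 180 degree error, while the symmetry-aware angle is zero because the two orientations are equivalent under the assumed symmetry.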

[Paper] [Code] [Bisque Module] [Poster]

3D Object Generation using Generative Adversarial Network

Feb 2020 - Jan 2022

Designed a 3D generative architecture using adaptive instance normalization, a mapping network, and a Wasserstein loss to learn the morphology of voxel-based 3D objects. The network generates a 3D object from a random latent vector without any 2D image information, and it achieves state-of-the-art classification accuracy among unsupervised methods when its discriminative features are evaluated on the ModelNet 3D computer vision dataset. We have also used this network to generate realistic microstructure morphology features and demonstrated its capabilities on a dataset of crystalline titanium grain shapes. Learning the morphology of crystalline materials is essential for predicting physical properties such as stress, strain, and conductivity.
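As a rough illustration of adaptive instance normalization on voxel features, the sketch below normalizes each channel of a 3D feature volume and then applies a per-channel scale and bias. In style-based generators these two vectors are typically derived from the mapping-network output; here they are simply passed in, and the function name and shapes are assumptions rather than the project's code.

```python
import numpy as np

def adaptive_instance_norm_3d(features, scale, bias, eps=1e-5):
    """Apply AdaIN to a feature volume of shape (C, D, H, W).

    Each channel is normalized to zero mean and unit standard deviation over
    its voxels, then rescaled and shifted by the per-channel style vectors
    `scale` and `bias`, both of shape (C,).
    """
    mean = features.mean(axis=(1, 2, 3), keepdims=True)
    std = features.std(axis=(1, 2, 3), keepdims=True)
    normalized = (features - mean) / (std + eps)
    return scale[:, None, None, None] * normalized + bias[:, None, None, None]

# Hypothetical usage with random features and a hand-picked style.
rng = np.random.default_rng(0)
voxel_features = rng.normal(size=(2, 4, 4, 4))
styled = adaptive_instance_norm_3d(voxel_features,
                                   scale=np.array([2.0, 0.5]),
                                   bias=np.array([1.0, -1.0]))
```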

[Paper]

Internship Projects

AI-based Image Sharpening of Mobile Camera Images

June 2021 - Sept 2021 (Patent in progress)

Images captured with a smartphone pass through a lens, a sensor, and several processing blocks before being saved as JPEG images in storage. First, the light rays collected by the lens pass through the Bayer sensor and form Bayer raw images. The Bayer raw images are then processed by an image signal processing (ISP) block and finally saved as RGB JPEG images. The ISP pipeline contains processing modules such as white balance, de-mosaicking, de-noising, color space transform, and tone reproduction. Some of these modules, such as de-mosaicking and de-noising, introduce blur into the images; this is called in-processing blur. The other kind of blur, which comes from the optical system (for example, the lens of a mobile camera), is called lens blur. Both in-processing blur and lens blur worsen in mobile-captured images under non-ideal conditions such as high noise and low light. Existing non-AI-based methods cannot remove these blurs and sharpen images, whereas AI-based methods have shown better performance in other image restoration tasks such as de-blurring, de-noising, and super-resolution. It is therefore important to design an AI-based network that removes in-processing and lens blur and sharpens images. Because no in-processing or lens blur datasets are publicly available, we designed our own data generation pipeline to simulate both kinds of blur in real mobile camera images and in synthetic images generated from the dead leaves model. Our deep neural network architectures are trained on these simulated blurred datasets. We have shown that our networks remove in-processing and lens blur and sharpen mobile images with reasonable computational complexity.
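To give a feel for this kind of synthetic data, here is a minimal sketch of a dead-leaves-style image (random occluding disks) paired with a blurred copy. It is not the internship pipeline: the uniform radius distribution, the Gaussian kernel standing in for lens blur, and every function name and parameter are assumptions made for illustration.

```python
import numpy as np

def dead_leaves_image(size=64, num_disks=200, max_radius=12, seed=0):
    """Grayscale image built from random disks, each occluding the ones drawn before it."""
    rng = np.random.default_rng(seed)
    image = np.full((size, size), 0.5)
    yy, xx = np.mgrid[0:size, 0:size]
    for _ in range(num_disks):
        cy, cx = rng.uniform(0, size, 2)
        radius = rng.uniform(2, max_radius)
        inside = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
        image[inside] = rng.uniform(0, 1)
    return image

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma=1.5):
    """Separable Gaussian blur with reflect padding; output has the input's shape."""
    radius = int(3 * sigma + 0.5)
    kernel = gaussian_kernel_1d(sigma, radius)
    padded = np.pad(image, radius, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

# A (sharp, blurred) training pair under the assumptions above.
sharp = dead_leaves_image(size=32, num_disks=50, seed=1)
blurred = gaussian_blur(sharp, sigma=1.5)
```

A sharpening network would then be trained to map `blurred` back to `sharp`; a realistic pipeline would use measured or modeled blur kernels and add sensor noise.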

[This work was done during a summer'21 internship at Samsung Research America and submitted for a patent.]

Efficient Single Image Super-Resolution

June 2022 - Sept 2022 (Patent in progress)

The zoom factor of smartphone camera images is limited by the physical constraints of optical lenses. It is therefore important to design computational algorithms that super-resolve smartphone camera images by a factor beyond the physical limit of the optical lenses. The super-resolved images should have better perceptual quality than those produced by standard interpolation algorithms. The goal of this internship was to design an efficient deep learning framework to super-resolve smartphone camera images. Traditional super-resolution algorithms are trained on low-resolution images created synthetically from ground-truth high-resolution images, which makes them unsuitable for real-world applications. Therefore, the MPI lab has created low-resolution images that carry real degradation from the smartphone camera image signal processing (ISP) pipeline.

[This work was done during a summer'22 internship at Samsung Research America and submitted for a patent.]

[Image Source: Samsung Newsroom U.S.]

Detection Scheme at the Receiver in Amplify and Forward Relay Network over Rayleigh Fading Channel

May 2014 - July 2014

In this project, we consider a maximum likelihood (ML) detection approach for signals corrupted by Middleton Class A noise. These signals are transmitted from source to destination through N amplify-and-forward (AF) relays. A rich literature on cooperative diversity already exists, but its results are mainly restricted to the conventional assumption of additive white Gaussian noise (AWGN); few papers consider the impulsive nature of noise. Maximum ratio combining (MRC) is the optimal detector in the presence of AWGN, but this does not hold for impulsive noise. Most papers on AF relay systems in the presence of impulsive noise discuss the MRC detection scheme rather than the ML detection scheme. We simulated both detection schemes (ML and MRC) and found that the ML detection scheme performs much better than the MRC detection scheme.
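The difference between the two detectors can be sketched on a heavily simplified model: real-valued BPSK, known branch gains, and independent Middleton Class A noise on each branch. This is not the relay simulation from the report. The Class A density is truncated to a finite number of terms, and all parameter names and default values below are assumptions chosen for illustration.

```python
import math
import numpy as np

def class_a_pdf(noise, impulsive_index=0.1, gaussian_ratio=0.01,
                total_variance=1.0, num_terms=20):
    """Truncated Middleton Class A density: a Poisson-weighted mixture of Gaussians."""
    noise = np.asarray(noise, dtype=float)
    pdf = np.zeros_like(noise)
    for m in range(num_terms):
        weight = math.exp(-impulsive_index) * impulsive_index ** m / math.factorial(m)
        variance = total_variance * (m / impulsive_index + gaussian_ratio) / (1.0 + gaussian_ratio)
        pdf += weight * np.exp(-noise ** 2 / (2 * variance)) / np.sqrt(2 * np.pi * variance)
    return pdf

def ml_detect(received, gains, **noise_params):
    """Pick the BPSK symbol that maximizes the joint Class A log-likelihood."""
    candidates = (-1.0, 1.0)
    scores = [np.sum(np.log(class_a_pdf(received - gains * s, **noise_params) + 1e-300))
              for s in candidates]
    return candidates[int(np.argmax(scores))]

def mrc_detect(received, gains):
    """Maximum ratio combining: weight each branch by its gain and take the sign."""
    return 1.0 if np.dot(gains, received) >= 0 else -1.0

# Symbol +1 sent over three branches; the third branch is hit by one large impulse.
gains = np.array([1.0, 1.0, 1.0])
received = np.array([1.05, 0.95, -9.0])
ml_decision = ml_detect(received, gains)
mrc_decision = mrc_detect(received, gains)
```

In this example the single impulse drags the MRC sum negative, while the ML detector effectively discounts the corrupted branch because a large residual is far more plausible as an impulse than as background noise.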

[Report]

[This work was done during summer internship at Jacobs University, Bremen as DAAD-WISE Scholar]

[Image Source: BITS Wordpress]

Course Projects

Image Registration

Sept'19-Oct'19

Image registration is the process of transforming two or more images or datasets into the same coordinate system. We used OpenCV's built-in SIFT algorithm for feature detection and implemented the standard DLT (direct linear transform) and normalized DLT algorithms to estimate the parameters of the homography. The RANSAC algorithm is used to estimate the homography in the presence of outliers.
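The normalized DLT step can be sketched in a few lines of NumPy. This is an independent illustration rather than the repository code: the function names are assumptions, the point correspondences are taken as given (no SIFT matching), and no RANSAC loop is included.

```python
import numpy as np

def normalize_points(points):
    """Move the centroid to the origin and scale so the mean distance is sqrt(2)."""
    centroid = points.mean(axis=0)
    mean_dist = np.linalg.norm(points - centroid, axis=1).mean()
    scale = np.sqrt(2) / mean_dist
    T = np.array([[scale, 0.0, -scale * centroid[0]],
                  [0.0, scale, -scale * centroid[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([points, np.ones(len(points))])
    return (homog @ T.T)[:, :2], T

def dlt_homography(src, dst):
    """Estimate H such that dst ~ H @ src from at least 4 correspondences."""
    src_n, T_src = normalize_points(src)
    dst_n, T_dst = normalize_points(dst)
    rows = []
    for (x, y), (u, v) in zip(src_n, dst_n):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.array(rows))
    H_normalized = vt[-1].reshape(3, 3)
    H = np.linalg.inv(T_dst) @ H_normalized @ T_src
    return H / H[2, 2]  # assumes H[2, 2] is not close to zero

def apply_homography(H, points):
    """Map 2D points through H and divide by the homogeneous coordinate."""
    homog = np.column_stack([points, np.ones(len(points))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Synthetic check: recover a known homography from noise-free correspondences.
H_true = np.array([[1.2, 0.1, 5.0], [-0.05, 0.9, -3.0], [1e-3, 2e-3, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 100], [0, 100], [50, 20], [30, 80]], dtype=float)
dst = apply_homography(H_true, src)
H_est = dlt_homography(src, dst)
```

A RANSAC wrapper would call `dlt_homography` on random 4-point subsets and keep the estimate with the most inliers.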

[GitHub]

Visual Question Generation

Jan'19-March'19

In this project, we undertake the task of generating a natural question about an image that can engage a human and start a conversation. We implemented the existing VGG19 and LSTM architecture from VQA. A pre-trained VGG19 extracts features from the images, and these features are fed to an LSTM network of step size 26, which generates a natural question as output.

Road Damage Detection

March'19-June'19

We used a pre-trained YOLO v2 to detect potholes and SSD MobileNet to detect 10 different types of road damage in images and videos. We also used class activation maps to detect potholes without any bounding box information.
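For context, a class activation map is a weighted sum of the final convolutional feature maps, using the classifier weights of the class of interest. The sketch below assumes a network that ends in global average pooling followed by one fully connected layer; the array shapes and the function name are illustrative and not taken from the project.

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_index):
    """Weighted sum of feature maps, rescaled to the range [0, 1].

    feature_maps: (K, H, W) activations of the last convolutional layer.
    fc_weights:   (num_classes, K) weights of the layer after global average pooling.
    """
    cam = np.tensordot(fc_weights[class_index], feature_maps, axes=(0, 0))
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy example: channel 0 fires at row 1, column 2 and only class 0 uses channel 0.
features = np.zeros((2, 3, 4))
features[0, 1, 2] = 5.0
features[1, 0, 0] = 9.0
weights = np.array([[1.0, 0.0], [0.0, 1.0]])
cam_class0 = class_activation_map(features, weights, class_index=0)
hot_spot = np.unravel_index(cam_class0.argmax(), cam_class0.shape)
```

Thresholding such a map gives a coarse localization of the object from image-level labels only.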

[Video Credit: The original video, without the detection overlay, is taken from YouTube]

Image Forensic

Oct'19-Nov'19

The task of this project was to detect copy-paste forgery without using deep learning. I used the expectation-maximization (EM) algorithm given by A.C. Popescu to detect forgery. The EM algorithm outputs a probability map. The Fourier transform of the probability map is then computed for small patches of the image, covering both the modified and the original parts, to detect forgery.

[GitHub] [Report]

Image Compression Algorithms

Oct'19-Nov'19

The Karhunen–Loève transform (KLT) is the best choice for image compression because it results in the least possible mean square error. However, the KLT depends on the input image, which makes it impractical for compression. The discrete cosine transform (DCT) is the closest approximation of the KLT, and therefore the DCT is used for JPEG compression. In this project, it is shown that the KLT reconstructs the compressed image perfectly, while the DCT and DFT do not. We used the DCT for JPEG compression with threshold and zonal coding.
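Zonal and threshold coding can be illustrated on a single square block. The sketch below builds an orthonormal DCT-II matrix by hand and zeroes coefficients either outside a low-frequency triangle (zonal) or below a magnitude threshold. It omits quantization tables and entropy coding, and the function names, zone shape, and default values are assumptions rather than the project's implementation.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that coefficients = C @ block @ C.T."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def zonal_mask(n=8, zone=4):
    """Keep the low-frequency triangle where row index + column index < zone."""
    u, v = np.indices((n, n))
    return (u + v) < zone

def compress_block(block, mode="zonal", zone=4, threshold=10.0):
    """Transform a square block, drop coefficients, and reconstruct it."""
    n = block.shape[0]
    C = dct_matrix(n)
    coeffs = C @ block @ C.T
    if mode == "zonal":
        keep = zonal_mask(n, zone)
    else:  # threshold coding: keep only coefficients with large magnitude
        keep = np.abs(coeffs) >= threshold
    return C.T @ (coeffs * keep) @ C

# Hypothetical usage on a smooth 8 x 8 ramp.
block = np.arange(64, dtype=float).reshape(8, 8)
approx_zonal = compress_block(block, mode="zonal", zone=4)
approx_threshold = compress_block(block, mode="threshold", threshold=10.0)
```

Zonal coding fixes which coefficients survive in advance, while threshold coding adapts to each block at the cost of having to record which positions were kept.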

[GitHub] [Report]

A Short Theoretical Survey on Training of GAN Network

March'20-June'20

Generative Adversarial Networks (GANs) have attracted attention in the machine learning research community due to their impressive results. Despite this success, training GANs remains difficult because of the nature of the underlying optimization problem. To reduce training instability, researchers have proposed a Wasserstein loss with gradient penalty. This project investigated why W-GAN with gradient penalty works better than a traditional GAN.
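The gradient penalty term can be sketched without a deep learning framework by estimating the critic's input gradient numerically. This is only meant to show the shape of the objective: a real implementation would use automatic differentiation, and the critic below is a hypothetical linear function rather than a trained network. The penalty weight of 10 is the commonly used default and is an assumption here.

```python
import numpy as np

def numerical_gradient(critic, x, step=1e-5):
    """Central-difference estimate of the critic's gradient at the point x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        offset = np.zeros_like(x)
        offset.flat[i] = step
        grad.flat[i] = (critic(x + offset) - critic(x - offset)) / (2 * step)
    return grad

def gradient_penalty(critic, real, fake, penalty_weight=10.0, seed=0):
    """Penalize deviation of the gradient norm from 1 at random real/fake interpolates."""
    rng = np.random.default_rng(seed)
    mix = rng.uniform(size=(len(real), 1))
    interpolated = mix * real + (1 - mix) * fake
    norms = np.array([np.linalg.norm(numerical_gradient(critic, x)) for x in interpolated])
    return penalty_weight * np.mean((norms - 1.0) ** 2)

def critic_loss(critic, real, fake, penalty_weight=10.0):
    """Wasserstein critic loss: E[D(fake)] - E[D(real)] plus the gradient penalty."""
    wasserstein_term = np.mean([critic(x) for x in fake]) - np.mean([critic(x) for x in real])
    return wasserstein_term + gradient_penalty(critic, real, fake, penalty_weight)

# Hypothetical linear critics: one with unit gradient norm, one with norm 2.
unit_w = np.array([0.6, 0.8])
real_batch = np.ones((4, 2))
fake_batch = np.zeros((4, 2))
penalty_unit = gradient_penalty(lambda x: float(unit_w @ x), real_batch, fake_batch)
penalty_steep = gradient_penalty(lambda x: float(2 * unit_w @ x), real_batch, fake_batch)
```

The critic with unit gradient norm incurs almost no penalty, while the steeper one is penalized, which is how the penalty softly encourages the 1-Lipschitz behavior that the Wasserstein formulation relies on.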

[Image Credit: Jonathan Hui]

[Report]