Cloud properties are important because they influence Earth's energy balance and are essential for understanding climate change and making accurate weather forecasts. In remote sensing, satellites use solar radiation reflected from clouds to estimate these properties. However, this estimation is challenging because of 3D radiative transfer effects, that is, the complex interactions between solar radiation and the three-dimensional structure of clouds, as well as the geometric limitations of satellite observations. Traditionally, the Independent Pixel Approximation (IPA) method has been used for cloud property estimation, but it introduces significant bias due to its simplified assumptions. Recently, deep learning-based methods have shown improved accuracy in retrieving cloud properties by incorporating 2D spatial features to reduce radiative transfer effects. However, variations in radiance intensity, distortion, and cloud shadows still cause substantial errors in cloud property estimation under different solar and viewing zenith angles. To address these issues, we propose a new deep learning model, called Cloud-Attention-Net with Angle Coding (CAAC), which uses attention mechanisms and angle embeddings to account for satellite viewing geometry and 3D radiative transfer effects, enabling more accurate retrieval of cloud optical thickness. Our multi-angle training strategy ensures that our CAAC model is angle-invariant. Through comprehensive experiments, we demonstrate that our model outperforms IPA and state-of-the-art deep learning methods, reducing cloud property retrieval errors by at least ninefold.
Language: Python (Developed using PyTorch framework)
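The angle-coding idea can be illustrated with a small sketch: solar and viewing zenith angles are mapped to sinusoidal features before being combined with the image branch. This is a hypothetical NumPy illustration; the function name `encode_angles` and the frequency schedule are assumptions, not CAAC's actual embedding.

```python
import numpy as np

def encode_angles(theta_deg, n_freq=4):
    """Encode an angle (in degrees) as sinusoidal features,
    similar in spirit to positional encodings."""
    theta = np.deg2rad(theta_deg)
    freqs = 2.0 ** np.arange(n_freq)  # frequencies 1, 2, 4, 8
    return np.concatenate([np.sin(freqs * theta),
                           np.cos(freqs * theta)])

# Example: embed a 30-degree solar zenith angle as an 8-dim feature vector
emb = encode_angles(30.0)
```

Such an embedding gives the network a smooth, bounded representation of viewing geometry, so nearby angles map to nearby feature vectors.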
The Earth’s radiation budget relies on cloud properties such as Cloud Optical Thickness obtained from cloud radiance observations. Traditional physics-based cloud retrieval methods face challenges due to 3D radiative transfer effects. Deep learning approaches have emerged to address this, but their performance is limited by simple deep neural network architectures and vanilla objective functions. To overcome these limitations, we propose CloudUNet, a modified UNet-style architecture that captures spatial context and mitigates 3D radiative transfer effects. We introduce a cloud-sensitive objective function with regularized L2 and SSIM losses to learn thick cloud regions often underrepresented in input radiance data. Experiments using realistic atmospheric and cloud Large-Eddy Simulation data demonstrate that our proposed CloudUNet achieves a 5-fold improvement over existing state-of-the-art deep learning and physics-based methods.
Language: Python (Developed using PyTorch framework)
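The shape of the combined L2 + SSIM objective can be sketched as follows. This is a simplified NumPy stand-in assuming a global (single-window) SSIM and a weighting factor `lam`; the actual CloudUNet loss operates on PyTorch tensors and its exact regularization and SSIM variant are defined in the paper.

```python
import numpy as np

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Global (single-window) SSIM between two images scaled to [0, 1]."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

def cloud_loss(pred, target, lam=0.5):
    """L2 term for pixel-wise accuracy plus an SSIM term that
    rewards matching spatial structure (e.g. thick-cloud regions)."""
    l2 = np.mean((pred - target) ** 2)
    return l2 + lam * (1.0 - ssim_global(pred, target))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
# A perfect prediction yields (near-)zero loss; perturbations increase it.
perfect = cloud_loss(img, img)
perturbed = cloud_loss(img, np.clip(img + 0.1, 0.0, 1.0))
```

The SSIM term penalizes structural mismatch that a plain L2 loss can average away, which is the motivation for combining the two.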
The goals of this ongoing work are two-fold. First, to provide a comprehensive survey of existing adversarial federated learning approaches for speech-related tasks. Second, to demonstrate federated learning algorithms on edge devices such as the Raspberry Pi, with applications to automatic speech recognition. The challenges faced so far stem from the low computational power and memory of the Raspberry Pi.
The main goal of this project was twofold. First, it studies existing methods for interpreting CNNs and for measuring the similarity between CNNs. Second, it proposes novel schemes for comparing CNNs based on their intermediate characteristics.
Language: Python (Developed using PyTorch framework)
Object tracking is a very challenging task in video-surveillance systems. Among the many possible approaches, this report focuses solely on tracking a single object based on histograms. In this assignment we first explored color-based and gradient-based tracking; both models have strengths and weaknesses depending on the application scenario, so we eventually combined them to get the best of both worlds. We tested our models on the dataset provided by the lab and worked to overcome the challenges posed by each video sequence. Further, we evaluated the performance of each tracking algorithm both individually and in combination.
Language: C++
Fig: Estimated Bounding Box Shifted to Wrong Target Model (similar appearance) due to Large Candidate Search Area.
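The core of histogram-based tracking can be sketched in a few lines: build a normalized histogram of the target, then slide a candidate window over the frame and keep the position whose histogram is most similar. This is a minimal grayscale NumPy sketch using an exhaustive search and the Bhattacharyya coefficient; the report's implementation is in C++ and uses color histograms, so all names and constants here are illustrative.

```python
import numpy as np

def hist(patch, bins=16):
    """Normalized intensity histogram of a patch with values in [0, 1]."""
    h, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    h = h.astype(float)
    return h / (h.sum() + 1e-12)

def bhattacharyya(p, q):
    """Similarity in [0, 1]; 1 means identical distributions."""
    return np.sum(np.sqrt(p * q))

def track(frame, template_hist, box_size, stride=4):
    """Exhaustive search: return the top-left corner of the candidate
    window whose histogram best matches the template."""
    H, W = frame.shape
    bh, bw = box_size
    best, best_pos = -1.0, (0, 0)
    for y in range(0, H - bh + 1, stride):
        for x in range(0, W - bw + 1, stride):
            s = bhattacharyya(hist(frame[y:y + bh, x:x + bw]), template_hist)
            if s > best:
                best, best_pos = s, (y, x)
    return best_pos

# Toy frame: a bright square target on a dark background
frame = np.zeros((64, 64))
frame[20:36, 28:44] = 0.9
tmpl = hist(frame[20:36, 28:44])
pos = track(frame, tmpl, (16, 16))
```

The failure mode shown in the figure above follows directly from this formulation: if the candidate search area is large and contains a second region with a similar histogram, the maximum-similarity window can lock onto the wrong target.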
Object detection and identification are very challenging tasks in video-surveillance systems. Among the many possible approaches, this report focuses solely on a method based on foreground segmentation, which can be divided into two sub-tasks: blob extraction and blob classification. In the blob extraction stage, OpenCV’s floodfill module was used, while in the classification stage a statistical classifier was implemented. Another challenge in video sequence analysis is detecting stationary foreground objects. For this reason, a routine was developed to detect stationary foreground blobs in the frame based on foreground history images. Further, a MATLAB snippet was written to build an improved classification model.
Language: C++ and MATLAB
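The stationary-blob routine can be sketched along these lines, assuming a simple gain/decay update of a foreground history image: pixels that stay foreground accumulate evidence, pixels that return to background decay quickly, and pixels whose history exceeds a threshold are flagged as stationary. The constants and function names below are illustrative, not the report's actual parameters.

```python
import numpy as np

def update_history(history, fg_mask, gain=1, decay=4, max_val=30):
    """Accumulate evidence for pixels that remain foreground;
    decay quickly when a pixel returns to background."""
    history = history + gain * fg_mask - decay * (1 - fg_mask)
    return np.clip(history, 0, max_val)

def stationary_mask(history, thresh=25):
    """Pixels whose accumulated foreground history exceeds the threshold."""
    return history >= thresh

# Simulate a blob that stays still for 30 frames
hist_img = np.zeros((8, 8))
mask = np.zeros((8, 8), dtype=int)
mask[2:5, 2:5] = 1
for _ in range(30):
    hist_img = update_history(hist_img, mask)
stat = stationary_mask(hist_img)
```

The asymmetry between gain and decay is the key design choice: a briefly occluded background pixel is forgiven quickly, while only persistently foreground pixels (abandoned or stopped objects) reach the stationary threshold.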
The project started by obtaining the intrinsic parameters of our camera. Next, local matches between several views of an object were found using interest points detected with methods such as SIFT, SURF, and KAZE. Finally, the 3D scene was reconstructed from the 2D images using multi-view geometry.
Language: MATLAB
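The final reconstruction step rests on triangulating matched points from calibrated views. A minimal linear (DLT) triangulation sketch in NumPy might look like the following; the camera pair and test point are synthetic assumptions for checking, and the project itself was implemented in MATLAB.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence.
    x1, x2 are pixel coordinates (u, v) seen by 3x4 projections P1, P2."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # null vector of A is the homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with a 3x4 camera matrix."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Synthetic check: two cameras separated by a unit baseline along x
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.3, -0.2, 4.0])
X_hat = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

With noiseless correspondences the recovered point matches the ground truth; in practice the interest-point matches from SIFT/SURF/KAZE are noisy, so many correspondences are triangulated and outliers are rejected.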
The project focused on applying Naïve Bayes and a Gaussian Model (GM) to classify body positions from the Kinect dataset. Naïve Bayes is a simple model that assumes all variables are mutually independent, while the GM considers dependencies between variables, making it more practical.
Language: Python
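A Gaussian Naïve Bayes classifier of the kind described can be sketched in a few lines of NumPy: fit a per-class mean and variance for each feature (the independence assumption), then classify by maximum log-posterior. The toy two-class data below stands in for the Kinect features and is purely illustrative.

```python
import numpy as np

def fit_gnb(X, y):
    """Per-class feature means/variances and class priors."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-9, len(Xc) / len(y))
    return params

def predict_gnb(X, params):
    """Assign each row to the class with the highest log-posterior,
    assuming independent Gaussian features."""
    cls = sorted(params)
    scores = []
    for c in cls:
        mu, var, prior = params[c]
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var)
                                + (X - mu) ** 2 / var, axis=1)
        scores.append(log_lik + np.log(prior))
    return np.array(cls)[np.argmax(scores, axis=0)]

# Toy 2-class, 3-feature data standing in for Kinect joint features
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 3)),
               rng.normal(5.0, 1.0, (50, 3))])
y = np.array([0] * 50 + [1] * 50)
pred = predict_gnb(X, fit_gnb(X, y))
```

A full Gaussian model would instead fit a per-class covariance matrix, capturing the dependencies between variables that Naïve Bayes ignores.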
X-Ray Computed Tomography (CT) is an indirect and non-destructive imaging method that visualizes the local absorption properties of a specimen. Because it is indirect, CT requires a numerical reconstruction algorithm to retrieve these absorption coefficients. We implemented the Filtered Backprojection (FBP) algorithm and applied it to the mouse data. First, the sinogram data are filtered with a ramp filter. The filtered data are then back-projected along the rays that generated the unfiltered data. The filtering compensates for the oversampling of low-frequency and under-sampling of high-frequency content in the sinogram data.
Language: MATLAB
Fig: Effect of Ramp Filter on reconstructed image
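The two FBP steps described above, ramp filtering followed by backprojection, can be sketched in NumPy. This is a minimal sketch assuming parallel-beam geometry and an analytically projected disc phantom (a disc of radius R projects to 2*sqrt(R^2 - t^2) at every angle) rather than the mouse data; the actual implementation was in MATLAB.

```python
import numpy as np

def fbp(sinogram, angles_deg):
    """Minimal parallel-beam filtered backprojection.
    sinogram has shape (n_angles, n_detectors)."""
    n_ang, n_det = sinogram.shape
    # Step 1: ramp filter |f| in the Fourier domain along the detector axis
    ramp = np.abs(np.fft.fftfreq(n_det))
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))
    # Step 2: smear each filtered view back across the image along its rays
    xs = np.arange(n_det) - n_det // 2
    X, Y = np.meshgrid(xs, xs)
    recon = np.zeros((n_det, n_det))
    for view, ang in zip(filtered, np.deg2rad(angles_deg)):
        t = X * np.cos(ang) + Y * np.sin(ang)  # detector coordinate of each pixel
        recon += np.interp(t, xs, view, left=0.0, right=0.0)
    return recon * np.pi / n_ang

# Disc phantom with analytically known projections
n, R = 64, 10.0
t = np.arange(n) - n // 2
proj = 2.0 * np.sqrt(np.clip(R**2 - t**2, 0.0, None))
angles = np.linspace(0.0, 180.0, 90, endpoint=False)
recon = fbp(np.tile(proj, (len(angles), 1)), angles)
```

Skipping the ramp filter and back-projecting the raw sinogram produces the characteristic blurred image shown in the figure: low frequencies are over-represented because every projection passes through the center of Fourier space.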