In this project, I implemented an RL agent that plays the classic Atari game Pong using two different learning algorithms: policy gradients and value-based deep Q-learning. I used the OpenAI Gym Pong-v0 environment to train and evaluate the agent. The videos below show the performance of the agent (in green) at different stages of training. I also found it interesting to visualize what the neurons in the first layer of the neural network had learned to recognize and what inputs triggered them.
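For a flavor of the policy-gradient side, here is a minimal sketch in the spirit of Karpathy's "Pong from Pixels" (not my exact code; the hyperparameters are illustrative and the REINFORCE update is omitted):

```python
# Minimal policy-gradient Pong sketch: frame preprocessing and policy network.
# Assumes `gym` with Atari support; hyperparameters are illustrative.
import numpy as np

H, D = 200, 80 * 80                                # hidden units, input size
model = {"W1": np.random.randn(H, D) / np.sqrt(D),
         "W2": np.random.randn(H) / np.sqrt(H)}

def preprocess(frame):
    """Crop, downsample, and binarize a 210x160x3 Atari frame to 80x80."""
    f = frame[35:195:2, ::2, 0].copy()             # drop scoreboard, every 2nd pixel
    f[(f == 144) | (f == 109)] = 0                 # erase background colors
    f[f != 0] = 1
    return f.astype(np.float64).ravel()

def policy_forward(x):
    """Return P(move UP) and the hidden activations for the backward pass."""
    h = np.maximum(0.0, model["W1"] @ x)           # ReLU hidden layer
    p = 1.0 / (1.0 + np.exp(-(model["W2"] @ h)))   # sigmoid output
    return p, h

# Each step: sample UP if np.random.uniform() < p, else DOWN; at episode end,
# do a REINFORCE update weighted by discounted returns.
```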
At Qualcomm Research, my team and I participated in and won first place in the first edition of a Kaggle-style data mining challenge. The problem was to predict the throughput a user will experience, given data about the handset, carrier, and other radio-related and spatio-temporal signals. We modeled it as a regression problem and used a stacked ensemble of neural nets, gradient-boosted regression trees, and random forests.
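As an illustration of the architecture, here is a scikit-learn sketch of such a stacked ensemble (hyperparameters are made up, and this is not the tooling we actually used):

```python
# A scikit-learn sketch of a stacked regression ensemble (illustrative only).
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import RidgeCV
from sklearn.neural_network import MLPRegressor

stack = StackingRegressor(
    estimators=[
        ("gbrt", GradientBoostingRegressor(n_estimators=500)),
        ("rf", RandomForestRegressor(n_estimators=500)),
        ("mlp", MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=1000)),
    ],
    final_estimator=RidgeCV(),   # meta-learner over out-of-fold predictions
    cv=5,
)
# X: handset/carrier/radio/spatio-temporal features, y: observed throughput
# stack.fit(X_train, y_train); throughput_pred = stack.predict(X_test)
```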
In this project, three of my friends and I participated in the 2011 KDD Cup competition hosted by Yahoo! Labs. Specifically, we used the Yahoo! Music ratings dataset to predict future user ratings and to classify items based on user rating patterns. To accomplish this, we leveraged state-of-the-art techniques in Collaborative Filtering: we implemented an ensemble of neighborhood-based methods, co-clustering, and latent factor models, and incorporated the temporal dynamics and hierarchical structure of the dataset. Our approaches combined item-based and user-based prediction to improve performance. Additionally, we proposed modifications to existing similarity measures that provably increase their robustness. By combining the approaches in an ensemble, we achieved competitive results in both tracks of the competition. Please click here to read the full technical report.
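As a taste of the neighborhood-based component, here is a small sketch of item-based prediction on a user-item rating matrix (the plain cosine similarity here is a simplification, not the modified measures we actually used):

```python
# Item-based neighborhood CF sketch on a matrix R (users x items, NaN = unrated).
import numpy as np

def item_similarities(R):
    """Cosine similarity between item columns, treating missing ratings as zero."""
    X = np.nan_to_num(R)
    norms = np.linalg.norm(X, axis=0) + 1e-9
    return (X.T @ X) / np.outer(norms, norms)

def predict_rating(R, sims, u, i, k=20):
    """Predict user u's rating of item i from the k most similar items u rated."""
    rated = np.where(~np.isnan(R[u]))[0]
    rated = rated[rated != i]
    nbrs = rated[np.argsort(-sims[i, rated])][:k]   # k nearest rated neighbors
    w = sims[i, nbrs]
    return float(w @ R[u, nbrs] / (np.abs(w).sum() + 1e-9))
```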
Compressed Sensing provides an elegant framework for recovering sparse signals from a highly under-sampled set of linear measurements. In the absence of any measurement noise, one can exactly recover a sparsely representable signal from only a few observations if the measurement matrix satisfies the Restricted Isometry Property (RIP). For example, in the case of i.i.d. Gaussian measurement matrices, a K-sparse vector x can be exactly recovered (with high probability) from N = O(K log(P/K)) measurements via l1-minimization. If the observations contain additive noise bounded by e, then conventional compressed sensing algorithms recover the sparse signal with error at most Ce (where C is a small positive constant). Recently, the field of Robust Compressed Sensing has gained considerable interest. It deals with recovering sparse signals when one or more measurements are corrupted (possibly in an adversarial manner). Variants of the classic compressed sensing algorithms have been proposed for the case when the additive noise is sparse but possibly of high magnitude (for example, due to shot noise, narrowband interference, etc.). However, analyzing sparse signal recovery when the measurement matrix and/or the measurements are corrupted arbitrarily is, in general, a hard problem. In this project, I investigated the recovery of sparse signals in the following settings:
1) the measurement matrix is corrupted arbitrarily on a constant fraction of the rows;
2) the measurement matrix and the observations are corrupted arbitrarily on a constant fraction of the rows, with the same row support.
First, an algorithm for exact signal recovery was developed in the noiseless setting (but with adversarial corruption), along with theoretical analysis proving recovery guarantees. This approach was then extended to handle noisy observations as well. A link to a detailed report can be found here.
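For context, here is a minimal sketch of the noiseless l1-minimization baseline (basis pursuit) that the robust variants build on, posed as a linear program. This is the textbook formulation, not the project's robust algorithm; SciPy is assumed.

```python
# Basis pursuit: min ||x||_1 s.t. A x = y, via LP with the split x = u - v.
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    n, p = A.shape
    c = np.ones(2 * p)                     # objective: sum(u) + sum(v) = ||x||_1
    A_eq = np.hstack([A, -A])              # A u - A v = y, with u, v >= 0
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    uv = res.x
    return uv[:p] - uv[p:]

# Recover a K-sparse x from N = O(K log(P/K)) Gaussian measurements.
rng = np.random.default_rng(0)
P, N, K = 200, 60, 5
A = rng.standard_normal((N, P)) / np.sqrt(N)
x_true = np.zeros(P)
x_true[rng.choice(P, K, replace=False)] = rng.standard_normal(K)
x_hat = basis_pursuit(A, A @ x_true)
print(np.max(np.abs(x_hat - x_true)))      # ~0 with high probability
```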
There has been a huge increase in the amount of video data delivered to a wide variety of users over a wide variety of viewing media, and this trend is expected to continue, with Internet video contributing a significant portion of the overall IP traffic in next-generation wireless networks. Video networks must operate at high capacity, maximizing the net goodput; however, good perceptual quality of the video must be the ultimate goal, since it is humans who watch the received video. This project aims at developing a link adaptation algorithm for wireless stored-video delivery that maximizes the perceptual quality of the received video. Perceptual video quality is quantified using full-reference and no-reference perceptual image quality/distortion metrics, SSIM and BLIINDS respectively. This report provides detailed information on the project: Video Link Adaptation Report
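To make the full-reference side concrete, here is a short sketch of per-frame SSIM scoring, assuming scikit-image's implementation (BLIINDS has no similarly standard library routine, so it is omitted):

```python
# Average SSIM between reference and received frames (grayscale uint8 arrays).
import numpy as np
from skimage.metrics import structural_similarity as ssim

def mean_video_ssim(ref_frames, recv_frames):
    scores = [ssim(r, d, data_range=255) for r, d in zip(ref_frames, recv_frames)]
    return float(np.mean(scores))   # drive link adaptation off this quality score
```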
Speaker and speech recognition is a topic that has intrigued me since my high school days; I always wanted to build my own speech recognition system. One day, I stumbled upon this article on Speaker Recognition by the IFP group at UIUC here. In this project, I learnt about the Linde-Buzo-Gray (LBG) algorithm for Vector Quantization (VQ), which classifies high-dimensional data efficiently by exploiting the correlation inherent in the data samples. The input utterance (framed every 20-30 ms) is transformed into acoustic vectors of Mel Frequency Cepstrum Coefficients (MFCC). During training, these vectors are mapped into a VQ codebook for each speaker. While testing, the new speaker's voice is transformed into similar acoustic vectors and the error against each trained codebook is computed. If the error for the input voice is less than a specified threshold, the user is identified; otherwise, the system reports that the user is someone new.
After completing the basic speaker recognition aspects, I worked on making the system robust to environmental noise and language dependency. The project was implemented in MATLAB and the complete source code is available here.
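For a flavor of the pipeline, here is a compact Python sketch (the project itself was in MATLAB; librosa and SciPy are assumed, and k-means stands in for LBG codebook training):

```python
# MFCC features -> per-speaker VQ codebook -> nearest-codebook verification.
import numpy as np
import librosa
from scipy.cluster.vq import kmeans, vq

def features(wav_path):
    """13 MFCCs per ~25 ms frame with a 10 ms hop."""
    y, sr = librosa.load(wav_path, sr=None)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                             n_fft=int(0.025 * sr), hop_length=int(0.010 * sr))
    return m.T.astype(float)

def train_codebook(wav_path, n_codewords=16):
    """Per-speaker VQ codebook (k-means here, standing in for LBG)."""
    cb, _ = kmeans(features(wav_path), n_codewords)
    return cb

def avg_distortion(wav_path, cb):
    """Mean distance of test vectors to the nearest codeword; compare against
    a threshold to accept the claimed speaker or declare a new user."""
    _, dists = vq(features(wav_path), cb)
    return float(dists.mean())
```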
This has been one of the most interesting projects I worked on independently. I had just learnt the concept of Principal Component Analysis (PCA) and was eager to apply it to build an application. That is when I came across the classic paper "Face Recognition Using Eigenfaces" by Turk and Pentland.
I studied the proposed algorithm and built a face recognition system based on it. I simulated the system in MATLAB and was very happy with the results. The system was able to detect and identify faces of the same class showing different emotions against different backgrounds at a very high success rate. The training faces belong to the standard "Yale Faces" database used for testing face recognition systems. The simulation results for a four-image input database (out of which only three were used to form the eigenfaces) are shown here. The source code is available here and the training/testing faces are available here 1 2 3 4 5.
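A bare-bones sketch of the eigenfaces computation, including Turk and Pentland's small-matrix trick (Python here; my implementation was in MATLAB):

```python
# Eigenfaces: PCA on flattened faces, then nearest-neighbor matching in face space.
import numpy as np

def train_eigenfaces(faces, k):
    """faces: (n_images, h*w) matrix of flattened grayscale training faces."""
    mean = faces.mean(axis=0)
    A = faces - mean
    # Turk & Pentland trick: eigendecompose the small n x n matrix A A^T,
    # then map its eigenvectors back to pixel space.
    _, V = np.linalg.eigh(A @ A.T)
    U = A.T @ V[:, ::-1][:, :k]                   # top-k eigenfaces as columns
    U /= np.linalg.norm(U, axis=0)
    return mean, U

def project(face, mean, U):
    return U.T @ (face - mean)                    # weights in face space

def identify(face, mean, U, train_weights):
    """Return the index of the closest training face in the eigenface subspace."""
    w = project(face, mean, U)
    return int(np.argmin(np.linalg.norm(train_weights - w, axis=1)))
```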
Here is a nice and intuitive reading on PCA. In the process of implementing this project, I also learnt about Nonlinear PCA and Kernel-based PCA for image compression and feature extraction.
In this project, I performed video coding of real-time data from a webcam using frame-by-frame DCT: I sampled the video input frame by frame, computed an independent DCT on each frame, and transmitted the coefficients in binary-encoded form continuously. One observation from this experiment was that the spread of the DCT coefficients grew with the amount of detail in the image. This was verified by comparing two scenarios: (a) blocking the camera completely and (b) focusing it on high-detail regions. The latter case required more DCT coefficients to be transmitted. Thus, for a given energy compaction, the number of bits required depended on the image being coded.
The source code for this project is available here. Please change the video input to the appropriate source before executing the program.
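For reference, the per-frame transform step looks roughly like this (a Python sketch with SciPy; the original was MATLAB driven by a live webcam source, and the keep-fraction is illustrative):

```python
# 2-D DCT of one grayscale frame, keeping only the largest coefficients.
import numpy as np
from scipy.fft import dctn, idctn

def code_frame(frame, keep_fraction=0.05):
    C = dctn(frame.astype(float), norm="ortho")
    thresh = np.quantile(np.abs(C), 1 - keep_fraction)
    C[np.abs(C) < thresh] = 0.0       # surviving coefficients get encoded and sent
    return C

def decode_frame(C):
    return idctn(C, norm="ortho")     # receiver-side reconstruction
```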
Image transforms play a vital role in transform coding of images and video. The ultimate goal of any compression algorithm is to represent the original image in the best possible basis with the fewest coefficients that do the job!
Hence, I analyzed the energy compaction property of the common transforms: DCT, DST, DFT, Walsh-Hadamard Transform, Haar Transform, Slant Transform, and the Karhunen-Loève Transform (KLT).
From the simulation results, I concluded that the KLT performed best in terms of energy compaction. This is intuitively expected because it is the only transform among those compared whose basis is image-dependent; all the other basis functions are predefined at both the Tx and the Rx. A disadvantage is that one has to transmit the new basis vectors, along with the projections of the image onto them, to the Rx. However, the number of vectors needed to capture most of the signal energy is small, so the overhead is modest.
The next best performance came from the DCT. This is expected because its cosine basis functions closely approximate the "ideal" basis for natural images and represent slow variations between pixels. The results of the DCT applied to a 160x120-pixel image, with reconstruction from the coefficients containing 98% of the signal energy, are shown here; only 917 of the total 19200 coefficients were required. The DCT was used in JPEG image compression; the newer JPEG 2000 standard replaced it with the more efficient Wavelet Transform. Here you can view the complete source code.
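The energy-compaction measurement itself is simple: sort the transform coefficients by energy and count how many are needed to reach the target fraction. A sketch, assuming SciPy's `dctn`:

```python
# Count coefficients needed to capture a given fraction of signal energy.
import numpy as np
from scipy.fft import dctn

def coeffs_for_energy(C, fraction=0.98):
    e = np.sort(np.abs(C).ravel() ** 2)[::-1]     # coefficient energies, descending
    return int(np.searchsorted(np.cumsum(e), fraction * e.sum()) + 1)

# e.g. for the DCT of a 160x120 grayscale image `img`:
# n = coeffs_for_energy(dctn(img.astype(float), norm="ortho"))
```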
Modern multiple-input multiple-output (MIMO) wireless communication systems often employ non-linear detection strategies. These non-linear detection methods achieve near-capacity performance on a MIMO fading channel, but their complexity grows exponentially with spectral efficiency. On the other hand, there exist linear detection techniques that are computationally much simpler but sub-optimal in most regimes. This work proposes a novel selection strategy for channel-adaptive MIMO (CA-MIMO) detection based on instantaneous channel state information (CSI) at the receiver. The performance of two soft-output MIMO detection techniques, viz. the Qualcomm Sphere Decoder (QSD) and the Linear MMSE (LMMSE) detector, is analyzed for a 2x2 MIMO system. Based on this analysis, a selection strategy is proposed that achieves a substantial reduction in complexity (and hence power) with almost no loss in throughput.
The selection rule was applied in a wireless system simulator to estimate the power gain from adaptive switching. Simulation results show that under uncorrelated Rayleigh fading the proposed CA-MIMO detector can achieve a ~30-50% reduction in complexity with near-MAP performance. This research was part of my summer internship at Qualcomm Inc., San Diego in 2011.
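The QSD is Qualcomm-internal, but the linear half of the comparison is standard. A sketch of an LMMSE detector for the 2x2 case, assuming unit-energy transmit symbols and the SNR given as a linear ratio:

```python
# LMMSE detection for y = H x + n.
import numpy as np

def lmmse_detect(H, y, snr_linear):
    """Linear MMSE estimate of x; slice/demap to the constellation afterwards."""
    nt = H.shape[1]
    G = np.linalg.inv(H.conj().T @ H + (nt / snr_linear) * np.eye(nt)) @ H.conj().T
    return G @ y
```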
In Spring 2011, I worked on the baseband simulator design for prototyping the IEEE 802.11n PHY layer on USRP N210 hardware using LabVIEW, under the supervision of Prof. Robert W. Heath, Jr., ECE Dept., UT Austin. This project is a part of the Intel-Cisco 'Video Aware Wireless Networks' (VAWN) program. The new USRP N210 platform supports a Gigabit Ethernet interface to the host PC and has options for setting up multiple-antenna systems. Information on the 802.11n standard can be obtained here.
I carried out this survey project on PAPR reduction for OFDM systems as part of my coursework for EE 381V Wireless Communications Lab at UT Austin. In this project, I built and tested an end-to-end OFDM system across a wireless link and studied the performance of different Peak-to-Average Power Ratio (PAPR) reduction techniques.
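To make PAPR concrete, here is a small sketch of measuring it on an OFDM symbol, with amplitude clipping shown as one illustrative reduction technique (not necessarily the exact set I compared):

```python
# PAPR of an OFDM symbol, and the simplest reduction technique: clipping.
import numpy as np

def papr_db(x):
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

def clip(x, clip_ratio_db=3.0):
    """Clip |x| at clip_ratio_db above its RMS level, preserving phase."""
    a = np.sqrt(np.mean(np.abs(x) ** 2)) * 10 ** (clip_ratio_db / 20)
    return np.where(np.abs(x) > a, a * x / np.abs(x), x)

# OFDM symbol = IFFT of random QPSK subcarriers.
sym = np.fft.ifft(np.exp(1j * np.pi / 2 * np.random.randint(0, 4, 256)))
print(papr_db(sym), papr_db(clip(sym)))   # clipping lowers PAPR (at some distortion)
```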
For more information on this project visit the project website: PAPR Reduction Project.
The ANUSAT micro-satellite is the first of its kind in India: the first university-based micro-satellite project, completely tested and integrated by undergraduate and graduate students of Anna University. ANUSAT was started by the Indian Space Research Organisation (ISRO) primarily to inculcate interest in satellite construction and space research among students.
ANUSAT took her seat in space on April 20th, 2009, launched by the PSLV-C12 rocket from Sriharikota, India at 06:45 IST. ANUSAT would not have been possible without the combined efforts of a number of students of Anna University: members of the Integrated Systems Laboratory (ISL), EEE students at the College of Engineering, Guindy (CEG), and students at the Madras Institute of Technology (MIT).
My work on this project taught me a lot about the practical difficulties that arise when building real communication systems. In ANUSAT, I worked on designing and simulating a GPS receiver. A GPS antenna was on board ANUSAT (the white circular antenna in the picture). The 50 bps GPS data collected by the antenna would be down-linked by activating the auxiliary payload. The GPS data had to be demodulated and Doppler-rectified; then, using a trilateration algorithm, a position fix for the satellite was computed. With basic GPS positioning algorithms, I was able to obtain a location accuracy of about 500 m. The minimum time required for a guaranteed position fix was about 30 seconds from a cold start (as is the case with pure GPS data; later variants such as D-GPS, E-GPS, and A-GPS can find a fix in a few seconds using additional side information).
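The core of the position fix is a standard iterative least-squares solve over pseudoranges. A simplified sketch of that textbook step (hypothetical inputs, and far less than a full receiver chain):

```python
# Iterative least-squares GPS position fix from pseudoranges to >= 4 satellites.
import numpy as np

def position_fix(sat_pos, pseudoranges, iters=10):
    """sat_pos: (n, 3) ECEF satellite positions; returns [x, y, z, c*dt]."""
    est = np.zeros(4)                                  # position + clock bias
    for _ in range(iters):
        d = np.linalg.norm(sat_pos - est[:3], axis=1)  # geometric ranges
        pred = d + est[3]                              # predicted pseudoranges
        # Geometry matrix: unit vectors from satellites to the estimate, plus 1
        # for the clock-bias column.
        G = np.hstack([(est[:3] - sat_pos) / d[:, None], np.ones((len(d), 1))])
        est += np.linalg.lstsq(G, pseudoranges - pred, rcond=None)[0]
    return est
```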
I learnt a lot working on this project: demodulation in the presence of Doppler, correlation properties of C/A codes, tracking and acquisition (of C/A codes) using serial search and FFT-based tracking, PDOP problems, and Kalman filtering, to name a few. Further, hands-on experience in real-time satellite SNR measurements, payload technology validation, antenna design, positioning, and related experiments proved really valuable.
My visit to ISRO with my professor P. V. Ramakrishna during May-June 2009 for post-launch mission operations and satellite health analysis at ISTRAC/ISRO was another unforgettable experience. There I had the rare opportunity to interact with space scientists at ISRO, and I could see real satellite communication systems in action, along with the huge tracking antennas at the ground station. Although it initially took me some time to understand the complete process that is executed when a satellite passes over an antenna's field of view, I was able to follow along after a few days. It was indeed a great learning experience!
In this project, I worked on simulating, in MATLAB, the approach laid out in the paper "R Conjugate Codes for Multi Code CDMA" by Adam Tarr and M. D. Zoltowski. You can find the original paper here.
An interesting detail about this approach to dealing with "crosstalk" is that the correlation matrix used to find the codes is derived from the eigenvectors of the channel's noise-plus-interference statistics (sent to the transmitter through feedback). The codes' performance in terms of orthogonality in noisy channels was better than that of other schemes; I was able to infer this from the constellation diagram of the received signal even at fairly low SNRs. In addition to being orthogonal to each other, the codes are nearly orthogonal to the noise in the channel (provided the noise-plus-interference distribution remains approximately constant between the training and testing sessions). The results of the simulation on one of the channels are shown here.
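As I understood it, the construction boils down to an eigendecomposition of the fed-back noise-plus-interference covariance, favoring the quietest directions. A very rough sketch of that idea (my reading, not the paper's exact algorithm):

```python
# Derive mutually orthogonal spreading codes from the noise+interference
# covariance R_n estimated at the receiver and fed back to the transmitter.
import numpy as np

def eigen_codes(R_n, n_codes):
    _, V = np.linalg.eigh(R_n)       # eigenvalues ascending
    return V[:, :n_codes]            # orthonormal codes in the low-noise subspace
```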
Future work could look into Multi-Code CDMA in conjunction with Multi-Carrier CDMA (OFDM-CDMA), compositely called MC-MC-CDMA, and evaluate its performance in frequency-selective fading channels.
This was my first project at the Integrated Systems Lab (ISL) and is close to my heart! Just out of my freshman year at college, I took up the ambitious task of designing an AM crystal radio with my own "hand-wound" inductors and capacitors. First, I set out to wind a copper wire around a pencil to make my inductor. The toughest part was making a variable capacitor from aluminum plates with paper strips in between them. After a couple of days' work, I finally "synthesized" these components and found their reactance values by constructing a simple LC resonant circuit. Then, I connected them to my circuit and attached an antenna. After debugging the circuit for a few hours, I finally succeeded in building a radio that can operate forever without external power! A crystal radio actually harnesses the energy of the RF waves in the air to power itself into life. This project helped me learn the basics of amplitude modulation, envelope detection, loading effects resulting from impedance mismatch, antenna length considerations, noise suppression, etc. I found this site very helpful while working on this project.
Design of a Digital FM Demodulator based on a 2nd-Order All-Digital Phase-Locked Loop – In this project, my friends and I implemented this interesting paper on digital FM reception. It was an unforgettable learning experience: we worked together for almost a week just to lock the PLL! The theory behind PLL-based signal detection and the choice of the IIR loop filter coefficients (which I later found to be the most critical component for closed-loop PLL operation) became clear to me after this experiment. Designing the fundamental blocks - NCO, phase detector, LPF - brought out the signal processing aspects of wireless communications. For the first time, I was able to observe in practice the SNR levels at which the PLL breaks lock, the lock and capture ranges, the capture effect in FM, and so on. We simulated the receiver in MATLAB and VHDL. This reference also proved very useful for the execution of this project. Later, the design was ported to an FPGA.
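A bare-bones sketch of such a loop, phase detector plus a proportional-and-integral loop filter plus an NCO, with the frequency control word serving as the demodulated FM output (coefficients illustrative; our implementation was in MATLAB/VHDL):

```python
# Minimal 2nd-order digital PLL used as an FM demodulator on complex baseband x.
import numpy as np

def fm_demod_pll(x, kp=0.05, ki=0.002):
    phase, integ = 0.0, 0.0
    out = np.zeros(len(x))
    for n, s in enumerate(x):
        err = np.angle(s * np.exp(-1j * phase))   # phase detector
        integ += ki * err                         # integral path (makes it 2nd order)
        freq = kp * err + integ                   # loop filter output
        phase += freq                             # NCO phase accumulator
        out[n] = freq                             # proportional to instantaneous freq
    return out
```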
This was a project for the Defence Research and Development Laboratory (DRDL), Hyderabad, India that my team carried out at the Integrated Systems Lab. The specifications for the design were as follows.
IF frequency: 70 MHz; data rate: 20 Mbps; Doppler: Mach 5 at C band; optional RRC filtering at Tx and Rx; differential encoding on the data. We had to simulate the complete QPSK transceiver with a non-decision-directed approach to carrier recovery, and achieve symbol timing synchronization as well. A Costas loop with a 2nd-order PLL was used for carrier recovery. We could track fixed phase and frequency offsets with zero error, and small Doppler (frequency offsets ramping +/- 5 kHz about the carrier) with a finite steady-state error (whose magnitude depended on the loop gain parameter we chose). Differential encoding was used to ward off the phase ambiguity associated with Costas-loop-based carrier recovery.
The most difficult part was symbol timing synchronization with clock drift incorporated. We used the Gardner detector, one of the most powerful and efficient timing synchronization methods in the literature; Gardner's classic paper, "A BPSK/QPSK Timing Error Detector for Sampled Receivers", can be found here. This paper gave me insight into the issues involved in timing error detection and correction, and its dependence on the input bit patterns (especially when an all-ones or all-zeros stream appears for long stretches). Early-late timing synchronizers were later employed and compared against the Gardner detector. We spent almost a month getting this to work fully. The Eb/No vs. BER plot from the simulation can be found here. The book "Digital Communications" by John G. Proakis and the paper "Timing Recovery in Digital Synchronous Data Receivers" by K. Mueller and M. Müller were very useful references for this simulation study.
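The Gardner detector itself is only a line of arithmetic at two samples per symbol; a sketch (interpolation control and loop filtering, the hard parts, are omitted):

```python
# Gardner timing-error detector at 2 samples/symbol on complex samples x,
# where x[k] and x[k-2] are successive symbol strobes and x[k-1] the midpoint.
import numpy as np

def gardner_ted(x, k):
    """e > 0 / e < 0 indicates sampling late/early; e ~ 0 at correct timing."""
    return float(np.real((x[k] - x[k - 2]) * np.conj(x[k - 1])))
```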
This was one of my course projects - the term project in my penultimate semester, to be specific. Here, my friends and I took up the ambitious task of designing a prototype Automated Campus Navigation Vehicle that can move through my campus from one place to another with the aid of GPS data, processing the video signals along its path. This was challenging because the entire navigation was in an untrained environment, except for the GPS coordinate mapping. We used ultrasound sensors for obstacle detection and avoidance. The idea for this project was sparked when we came upon the PATH project at UC Berkeley. Although developing an automated guided vehicle for the road is a massive undertaking, a small effort in a similar direction would definitely serve as a good starting point for future research.
We worked on this for close to 5 months. Here is the region of our campus that we mapped using Google Earth. We learnt a lot about what goes into building such rich signal processing systems for real-time applications.
This project was my team's proposal for the TI Analog Design Contest 2009 and was carried out at the Integrated Systems Lab, CEG, AU under the guidance of Prof. P. V. Ramakrishna. Specifically, our goal was to provide position location and navigation aid for the blind. We proposed three methods for indoor navigation, viz. IR-based, ultrasound-triangulation-based, and RFID-based navigation, plus GPS-based navigation for outdoors.