Developed a knowledge-guided encoder-decoder model to improve an existing multi-document summarization system.
The model encodes each document separately, thereby capturing the thematic diversity within the corpus.
The work illustrated the strengths and weaknesses of the attention-based model, and the results were compared against several baseline methods.
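A minimal PyTorch sketch of the per-document encoding idea, with a decoder step attending over one representation per document; the class, dimensions, and random inputs are illustrative only and do not reflect the knowledge-guidance component of the original model.

```python
import torch
import torch.nn as nn

class PerDocumentEncoder(nn.Module):
    """Encodes each document in a cluster separately so that a decoder
    can attend over one representation per document (illustrative sketch)."""
    def __init__(self, vocab_size=10000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)

    def forward(self, documents):
        # documents: list of (1, doc_len) tensors of token ids, one per document
        doc_vectors = []
        for doc in documents:
            _, hidden = self.gru(self.embed(doc))
            doc_vectors.append(hidden[-1])       # (1, dim) summary of one document
        return torch.cat(doc_vectors, dim=0)     # (num_docs, dim)

# Encode a cluster of three documents of different lengths (random token ids).
encoder = PerDocumentEncoder()
docs = [torch.randint(0, 10000, (1, n)) for n in (40, 65, 52)]
doc_memory = encoder(docs).unsqueeze(0)          # (1, num_docs, dim)

# One decoder step attends over the per-document representations.
attention = nn.MultiheadAttention(embed_dim=256, num_heads=4, batch_first=True)
decoder_state = torch.zeros(1, 1, 256)           # placeholder decoder query
context, weights = attention(decoder_state, doc_memory, doc_memory)
print(weights.shape)                             # (1, 1, num_docs): attention per document
```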
Region proposals were extracted from input images/videos to compute CNN features; each proposed region was then classified and an object mask was predicted (a brief inference sketch follows below). Training was done using ResNet and GoogLeNet network architectures.
Virginia Tech Campus Tour: Trained Mask R-CNN
Virginia Tech Campus Tour: Trained YOLO
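A brief sketch of running instance segmentation with torchvision's Mask R-CNN (ResNet-50 FPN backbone, COCO weights); the image path is a placeholder, and this is not the project's actual training setup, which also involved a GoogLeNet backbone.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load a Mask R-CNN with a ResNet-50 FPN backbone, pretrained on COCO.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# "campus_frame.jpg" is a placeholder for a campus-tour frame.
image = to_tensor(Image.open("campus_frame.jpg").convert("RGB"))

with torch.no_grad():
    # The network proposes regions, classifies each one, and predicts
    # a per-instance segmentation mask.
    output = model([image])[0]

keep = output["scores"] > 0.7
boxes = output["boxes"][keep]     # refined bounding boxes per detected region
labels = output["labels"][keep]   # predicted class for each region
masks = output["masks"][keep]     # soft object masks, one per detection
print(f"{boxes.shape[0]} objects detected above the 0.7 score threshold")
```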
Estimated where a person would pay attention while driving, and which parts of the scene around the vehicle are most critical for the task.
The multi-branch deep architectures used were the Berkeley Deep-Drive Attention, ML-Net, and LSTM Saliency Attentive models. Compared the average per-pixel MSE, per-pixel Z-norm, and SSIM scores of the predicted gaze maps against the ground-truth fixation maps.
The Berkeley Deep-Drive Attention model performed better than the other models at identifying critical situations on the road.
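A small sketch of how such gaze-map metrics can be computed with NumPy and scikit-image; the Z-norm score is assumed here to be the per-pixel MSE between z-normalized maps, and the random arrays merely stand in for a predicted gaze map and a fixation map.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def zscore(x):
    """Z-normalize a map to zero mean and unit variance."""
    return (x - x.mean()) / (x.std() + 1e-8)

def compare_gaze_maps(pred, gt):
    """Compare a predicted gaze map to a ground-truth fixation map.
    Both inputs are 2-D float arrays of the same shape, scaled to [0, 1]."""
    mse = np.mean((pred - gt) ** 2)                    # avg. MSE per pixel
    znorm = np.mean((zscore(pred) - zscore(gt)) ** 2)  # MSE on z-normalized maps (assumed definition)
    ssim_score = ssim(pred, gt, data_range=1.0)        # structural similarity
    return {"mse": mse, "znorm": znorm, "ssim": ssim_score}

# Random maps standing in for a model prediction and the ground truth.
rng = np.random.default_rng(0)
pred = rng.random((72, 128))
gt = rng.random((72, 128))
print(compare_gaze_maps(pred, gt))
```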
Implemented a set interface that combines two key synchronization ideas: elimination and software combining.
Elimination means using operations with opposite semantics (e.g., a stack’s push and pop) to directly exchange elements, instead of synchronizing at a central location (e.g., as in the elimination back-off stack).
Software combining means having one thread iteratively do the work of multiple operations with identical semantics (e.g., push) while other threads wait, instead of all threads synchronizing at a central location.
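A minimal Python sketch of the elimination fast path, assuming a lock-based stack as the central fallback; the software-combining path is omitted, and the class names, timeout, and single-slot design are illustrative rather than the project's actual implementation.

```python
import threading
import time

class EliminationSlot:
    """A single exchange slot where a push and a pop can meet and
    cancel each other out without touching the central stack."""
    def __init__(self):
        self._lock = threading.Lock()
        self._value = None
        self._has_value = False

    def try_push(self, value, timeout=0.001):
        """Offer `value`; return True if a concurrent pop took it."""
        with self._lock:
            if self._has_value:
                return False          # slot busy with another push
            self._value = value
            self._has_value = True
        time.sleep(timeout)           # wait briefly for a partner pop
        with self._lock:
            if not self._has_value:   # a pop consumed the value
                return True
            self._has_value = False   # no partner showed up; withdraw the offer
            self._value = None
            return False

    def try_pop(self):
        """Return (True, value) if a waiting push was found, else (False, None)."""
        with self._lock:
            if self._has_value:
                value = self._value
                self._has_value = False
                self._value = None
                return True, value
        return False, None

class EliminationStack:
    """Lock-based stack with an elimination slot on the fast path."""
    def __init__(self):
        self._lock = threading.Lock()
        self._items = []
        self._slot = EliminationSlot()

    def push(self, value):
        if self._slot.try_push(value):
            return                     # eliminated against a concurrent pop
        with self._lock:
            self._items.append(value)  # fall back to the central stack

    def pop(self):
        ok, value = self._slot.try_pop()
        if ok:
            return value               # eliminated against a concurrent push
        with self._lock:
            return self._items.pop() if self._items else None
```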