In this project, we extracted terms closely associated with the term "Pedestrian" and filtered them using various word2vec models. After filtering, we applied a semantic analysis to verify that each remaining term's meaning was close to that of a pedestrian. Finally, we augmented existing datasets with images of these terms to improve the pedestrian detection system.
Barzamini, Hamed, Murtuza Shahzad, Hamed Alhoori, and Mona Rahimi. "A multi-level semantic web for hard-to-specify domain concept, Pedestrian, in ML-based software." Requirements Engineering (2022): 1-22.
Techniques used - Data Preprocessing, Deep Learning, Neural Networks, and Computer Vision.
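The word2vec filtering step above can be sketched as a cosine-similarity check over term embeddings. This is a minimal illustration, not the project's actual pipeline: the toy vectors and the 0.6 threshold below are hypothetical stand-ins for vectors produced by a trained word2vec model.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def filter_related_terms(target_vec, candidates, threshold=0.6):
    # Keep candidate terms whose embedding points in roughly the same
    # direction as the target term's embedding.
    return [term for term, vec in candidates.items()
            if cosine_similarity(target_vec, vec) >= threshold]

# Toy embeddings standing in for vectors from a trained word2vec model.
target = np.array([1.0, 0.0, 0.5])          # "pedestrian"
candidates = {
    "walker":  np.array([0.9, 0.1, 0.4]),   # similar direction -> kept
    "vehicle": np.array([-1.0, 0.2, 0.0]),  # dissimilar -> filtered out
}
print(filter_related_terms(target, candidates))  # ['walker']
```

In practice a library such as gensim exposes this directly (e.g. `most_similar` on a trained model); the point here is only the similarity-threshold idea.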
In this project, the research question was how the perception of a concept ("Pedestrian") changes over a timespan. For this purpose, we collected tweets related to pedestrian accidents caused by autonomous vehicles. We filtered this data and extracted the terms of significant importance to the accident event. Significance was calculated statistically: a term was flagged if its probability of occurrence fell more than three standard deviations from the mean.
Techniques used - Text Processing, Data Analysis, and Statistics.
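The three-standard-deviation rule above can be sketched as a simple outlier test over term counts. The counts and term names below are hypothetical examples, not the project's data:

```python
import statistics

def significant_terms(term_counts, k=3.0):
    # Flag terms whose count lies more than k standard deviations
    # from the mean count across all terms.
    counts = list(term_counts.values())
    mu = statistics.mean(counts)
    sigma = statistics.pstdev(counts)
    return [t for t, c in term_counts.items() if abs(c - mu) > k * sigma]

# Hypothetical counts: most terms appear ~10 times, one spikes after the event.
counts = {f"term_{i}": 10 for i in range(19)}
counts["collision"] = 200
print(significant_terms(counts))  # ['collision']
```

Note that with very few terms no point can exceed three population standard deviations, so the test is only meaningful over a reasonably large vocabulary.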
Online software repositories often introduce bugs that lead to vulnerable code, and the fixed version of the code may, in turn, introduce a new vulnerability. In this project, I extract the vulnerability-introducing code from different GitHub repositories. After extraction, I analyze the code and convert it to vectors using CodeBERT. I then build ML models that predict whether a future vulnerability will occur, given the type of bug-fixed code.
Techniques used - Classification, Natural Language Processing, CodeBERT, Software Engineering.
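The classification stage can be illustrated with a minimal sketch. The two-dimensional vectors below are toy stand-ins for CodeBERT embeddings (which are actually 768-dimensional), and the nearest-centroid rule is just a simple example classifier, not the models used in the project:

```python
import numpy as np

# Toy stand-ins for CodeBERT embeddings of bug-fix commits (hypothetical data);
# label 1 means the fix later introduced a new vulnerability.
train_vecs = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
train_labels = np.array([1, 1, 0, 0])

def predict(vec):
    # Nearest-centroid classifier: assign the label whose class mean
    # (centroid) lies closest to the embedding of the new bug fix.
    centroids = {label: train_vecs[train_labels == label].mean(axis=0)
                 for label in np.unique(train_labels)}
    return min(centroids, key=lambda lbl: np.linalg.norm(vec - centroids[lbl]))

print(predict(np.array([0.85, 0.15])))  # 1 -> likely to introduce a new vulnerability
```

In a real pipeline the embeddings would come from running the code tokens through a pretrained CodeBERT model, and the classifier would be trained on many labelled fixes.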
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Altmetrics have been proposed as a complement to scholarly metrics such as citations. They form a growing area of interest that intends to measure the societal impact of research based on the dissemination of research outcomes via multiple social media platforms such as Facebook and Twitter, reference managers such as Mendeley, and information sources such as Wikipedia, online news outlets, blogs, and other peer-review websites.
In this project, we used altmetrics to predict the citations a scholarly publication could receive. I built various classification and regression models and evaluated their performance; tree-based models performed best in classification. We found that Mendeley readership, publication age, post length, maximum followers, and academic status were the most important factors in predicting citations.
Techniques used - Data Preprocessing, Classification, Regression.
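Since tree-based models performed best here, the core mechanism can be sketched with a single decision stump, the building block such trees are grown from. The feature rows and thresholds below are hypothetical altmetric values, not the project's dataset:

```python
def best_stump(rows, labels):
    # Search every (feature, threshold) split and keep the most accurate one:
    # a single decision stump, the unit a tree-based model is built from.
    best = (0.0, None, None)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            preds = [1 if r[f] > t else 0 for r in rows]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if acc > best[0]:
                best = (acc, f, t)
    return best  # (accuracy, feature index, threshold)

# Hypothetical rows: [mendeley_readers, publication_age_years]
rows = [[5, 1], [8, 2], [120, 3], [200, 5]]
labels = [0, 0, 1, 1]  # 1 = highly cited
print(best_stump(rows, labels))  # (1.0, 0, 8): split on Mendeley readers > 8
```

Libraries such as scikit-learn compose many such splits (with impurity criteria rather than raw accuracy) into full decision trees and random forests.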
In this project, I used the Altmetrics dataset to build various models that predict the long-term impact of an article on various online platforms. I clustered the research articles by publication year with respect to their citation counts, and on each cluster I built machine learning and deep learning models to predict whether an article received more than the median number of citations. Through a detailed analysis of the results, I found that Mendeley counts are the key factor in determining the long-term online impact of an article, with policy counts also contributing strongly. Random Forest and Bernoulli Naive Bayes classifiers outperformed the other classifiers for this prediction.
Techniques used - Classification, Clustering, and Neural Networks.
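The prediction target described above can be sketched as follows: group articles, then label each one by whether it beats its group's median citation count. This simplification groups only by publication year (the project clustered by year with respect to citation counts), and the articles below are hypothetical:

```python
import statistics
from collections import defaultdict

def label_by_year_median(articles):
    # Group articles by publication year, then label each article 1 if its
    # citation count exceeds the median of its group (the prediction target).
    by_year = defaultdict(list)
    for art in articles:
        by_year[art["year"]].append(art["citations"])
    return {art["id"]: int(art["citations"] > statistics.median(by_year[art["year"]]))
            for art in articles}

# Hypothetical articles from a single publication-year cluster.
arts = [
    {"id": "a", "year": 2015, "citations": 3},
    {"id": "b", "year": 2015, "citations": 40},
    {"id": "c", "year": 2015, "citations": 10},
]
print(label_by_year_median(arts))  # {'a': 0, 'b': 1, 'c': 0}
```

These 0/1 labels would then serve as the training target for the per-cluster classifiers.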
The goal of this project was to provide authors with the sentiment score their research article would receive after publication. For this purpose, I used the Facebook reactions feature and Twitter sentiment analysis. This research was very interesting, as the outcome helps authors predict the community's reaction before formally submitting a paper for peer review or final approval. Random Forest and Naive Bayes classifiers were best at predicting the Facebook reaction ("Like", "Love", "Haha", "Wow", "Sad", or "Angry") an article would receive.
Techniques used - Classification, NLP and Neural Networks.
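Since Naive Bayes performed well here, a from-scratch multinomial Naive Bayes sketch shows the idea: score each reaction class by word likelihoods and pick the best. The training posts and labels below are invented for illustration only:

```python
from collections import Counter
import math

# Hypothetical posts labelled with the dominant Facebook reaction they drew.
train = [
    ("great breakthrough amazing result", "Love"),
    ("amazing wonderful study", "Love"),
    ("tragic finding sad outcome", "Sad"),
    ("sad loss reported", "Sad"),
]

def train_nb(data):
    # Count word frequencies per class for multinomial Naive Bayes.
    word_counts, class_counts = {}, Counter()
    for text, label in data:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts, class_counts

def predict(text, word_counts, class_counts):
    vocab = {w for c in word_counts.values() for w in c}
    scores = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        # Log prior plus Laplace-smoothed log likelihoods of each word.
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for w in text.split():
            score += math.log((counts[w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train_nb(train)
print(predict("amazing result", *model))  # 'Love'
```

A real model would be trained on far more posts and would predict all six reaction classes rather than two.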
Working at Infosys Limited was a great experience. I was exposed to real-world project development and its environment, and the opportunity to interact with people from diverse backgrounds was in itself a huge learning experience.
As a trainee at Infosys, I developed a flight and hotel booking application, taking it from requirements gathering through all phases of the waterfall model and providing users with various booking options (both flights and hotels).
Technologies Used - Python and SQL.
In this project, I developed a shopping web application called "JCart". I analyzed business requirements, added validations, and wrote JUnit test cases covering all phases of test-driven development.
Technologies Used - SQL, JSF, and Hibernate Framework.
Bristow is a helicopter services company supporting operations such as search and rescue (SAR) and oil and gas. The Flight project catered to the operations required to provide these helicopter services: pre-flight, flight, and post-flight. I developed the back-end application for calculating the helicopter's Center of Gravity (CoG), so that baggage and freight could be adjusted accordingly. In addition, I developed the admin part of the system, where the admin has privileges to grant system access to various helicopter crew members based on their designation (pilot, co-pilot, crew).
Technologies Used - HTML, CSS, jQuery, Spring, and Hibernate framework.
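The CoG calculation behind that back-end service is a standard weight-and-balance formula: the total moment (weight times arm from the reference datum) divided by the total weight. The station weights and arms below are hypothetical figures for illustration, not Bristow data, and the sketch is in Python rather than the project's Java stack:

```python
def center_of_gravity(stations):
    # CoG = sum(weight * arm) / sum(weight), where the arm is each station's
    # distance from the reference datum. Stations are (weight, arm) pairs.
    total_weight = sum(w for w, _ in stations)
    total_moment = sum(w * arm for w, arm in stations)
    return total_moment / total_weight

# Hypothetical stations (weight in kg, arm in metres):
# airframe, crew, baggage, freight.
stations = [(2200, 3.0), (240, 2.0), (80, 5.0), (150, 4.5)]
print(round(center_of_gravity(stations), 3))  # 3.054
```

In the real system, the computed CoG is checked against the aircraft's permitted envelope, and baggage or freight is moved between stations until the value falls inside it.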