Selected Projects

"Linear Regression in C" (in progress, Github repository )

Implemented a module that performs linear regression from scratch in C, together with a companion linear algebra module that provides the necessary matrix operations. The program takes a dataset from the user, who can specify the learning rate and the number of training iterations. Work still in progress.
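The learning rate and iteration count mentioned above suggest an iterative fit such as gradient descent. The following is a minimal sketch of that idea for a single feature, not the code in the repository: the `LinearModel` type and the `fit_linear_gd` function are hypothetical names, and the use of batch gradient descent on mean squared error is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model type for illustration: y = w * x + b */
typedef struct {
    double w;
    double b;
} LinearModel;

/*
 * Fit a one-feature linear model with batch gradient descent on the
 * mean squared error. Assumed approach; names are not from the project.
 */
LinearModel fit_linear_gd(const double *x, const double *y, size_t n,
                          double learning_rate, size_t iterations)
{
    assert(x != NULL && y != NULL && n > 0);

    LinearModel m = {0.0, 0.0};
    for (size_t it = 0; it < iterations; ++it) {
        double grad_w = 0.0;
        double grad_b = 0.0;

        /* Accumulate the error terms over the whole dataset. */
        for (size_t i = 0; i < n; ++i) {
            double err = (m.w * x[i] + m.b) - y[i];
            grad_w += err * x[i];
            grad_b += err;
        }

        /* d(MSE)/dw = (2/n) * sum(err * x), d(MSE)/db = (2/n) * sum(err) */
        m.w -= learning_rate * (2.0 / (double)n) * grad_w;
        m.b -= learning_rate * (2.0 / (double)n) * grad_b;
    }
    return m;
}
```

Calling `fit_linear_gd(x, y, 5, 0.05, 5000)` on five points that lie on the line y = 2x + 1 should return parameters close to w = 2 and b = 1. A learning rate that is too large for the scale of the data makes the updates diverge, which is one reason to leave it as a user-specified setting.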

Keywords: C, Linear Regression, Linear Algebra, Machine Learning

"Integrating Domain Knowledge for Financial QA: A Multi-Retriever RAG Approach with LLMs" (with Yukun Zhang and Samyak Jain, report, poster)

We address errors in financial numerical reasoning QA tasks that stem from a lack of domain knowledge in finance. Despite recent advances in LLMs, financial numerical questions remain challenging because they require specific domain knowledge in finance as well as complex multi-step numeric reasoning. We build a RAG-like multi-retriever system that retrieves both external domain knowledge and internal question context as inputs to our generator. Our model outperforms the non-expert human crowd, but it does not yet reach the expert human level. Our best neural-symbolic generator model outperforms the FinQA baseline on both execution and program accuracy.

Keywords: Python, PyTorch, NLP, RAG, BERT, DPR, FAISS, FinRAD

"Solar Panel Detection on Satellite Images" (with Camila Nicollier, poster)

We train and deploy a model that automates solar panel detection in satellite images. We use the new state-of-the-art YOLOv10 architecture, only a few days after its release, with satisfactory results. Automating solar panel identification matters for renewable energy: the need to keep track of these installations has grown rapidly, and solar developers have few or no tools for quickly identifying existing projects in a specific area.

Keywords: Python, PyTorch, Computer Vision, Object Detection, YOLO, YOLOv10, YOLOv9, CNN, Fast R-CNN

"Rock Song Lyric Generator" (Github, Deployed App)

A generative AI project that produces song lyrics from a user-provided first line. The model is GPT-2 fine-tuned on a novel dataset of over 50,000 rock-song lyrics, and it can generate songs in different genres and with different song structures. See the demos page in the deployed app for examples.

Keywords: Python, GenAI, PyTorch, GPT-2, Hugging Face, Transformers, Pandas, rock, lyrics, music, text generation, fine-tuning, scraping