This project addresses the growing complexity of research ethics in an era of advanced AI technologies. The AI-powered tool is designed to complement Institutional Review Boards (IRBs) by providing a structured, detailed analysis of research protocols. It goes beyond legal compliance, helping researchers identify ethical concerns early in their study design, which can improve research quality and reduce the workload on IRBs.
It uses a retrieval-augmented generation system backed by a vector database to retrieve relevant ethical guidelines dynamically. It aligns research practices with diverse ethical protocols and frameworks and generates actionable feedback tailored to the specific research context.
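The retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the bag-of-words `embed` function stands in for a learned embedding model, the in-memory list stands in for the vector database, and the names `retrieve`, `build_prompt`, and `GUIDELINES` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (token -> count).

    Assumption: a real system would call a learned embedding model here.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, guidelines, k=2):
    """Return the k guidelines most similar to the query."""
    q = embed(query)
    ranked = sorted(guidelines, key=lambda g: cosine(q, embed(g)), reverse=True)
    return ranked[:k]


def build_prompt(protocol, guidelines, k=2):
    """Assemble a prompt that pairs the protocol with retrieved guidelines."""
    context = retrieve(protocol, guidelines, k)
    lines = ["Relevant guidelines:"] + [f"- {g}" for g in context]
    lines += ["", "Protocol:", protocol, "", "List ethical concerns and suggested fixes."]
    return "\n".join(lines)


# Hypothetical guideline snippets, for illustration only.
GUIDELINES = [
    "Participants must give informed consent before data collection",
    "Personal data must be anonymized before sharing",
    "Studies involving minors require parental consent",
]

if __name__ == "__main__":
    print(build_prompt("we will collect personal data and share it", GUIDELINES, k=1))
```

The assembled prompt would then be passed to a language model; that call is omitted because the source does not name the model or API used.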
This study evaluated the viability of building a more impartial evaluation dataset to reduce biases, and examined how these biases affect the performance of Stable Diffusion models. Evaluation becomes more challenging when text-to-image (T2I) models hallucinate, replacing the expected picture with an unexpected one, such as an apple pie instead of an apple.
The project began with creating a vocabulary to generate prompts of increasing complexity. Next, these prompts were fed to the Stable Diffusion v1.4 and v2.1 models to generate images. Finally, to compare and evaluate the models' understanding of the prompt, the T2I-CompBench evaluation benchmark was applied to the images.
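One way to picture the first step, generating prompts of increasing complexity from a vocabulary, is the sketch below. The vocabulary, the three complexity levels, and the function names are all assumptions made for illustration; the source does not specify how its vocabulary or levels were defined.

```python
import itertools

# Hypothetical vocabulary; the project's actual word lists are not given.
VOCAB = {
    "objects": ["apple", "dog", "chair"],
    "colors": ["red", "blue"],
    "relations": ["next to", "on top of"],
}


def article(word):
    """Pick 'a' or 'an' from the first letter (a simple heuristic)."""
    return "an" if word[0] in "aeiou" else "a"


def prompts_for_level(level, vocab=VOCAB):
    """Generate prompts at an assumed complexity level.

    Level 1: a single object.
    Level 2: an object with a color attribute.
    Level 3: two attributed objects joined by a spatial relation.
    """
    if level == 1:
        return [f"{article(o)} {o}" for o in vocab["objects"]]
    if level == 2:
        pairs = itertools.product(vocab["colors"], vocab["objects"])
        return [f"{article(c)} {c} {o}" for c, o in pairs]
    if level == 3:
        phrases = prompts_for_level(2, vocab)
        return [
            f"{a} {r} {b}"
            for a, b in itertools.permutations(phrases, 2)
            for r in vocab["relations"]
        ]
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    for level in (1, 2, 3):
        prompts = prompts_for_level(level)
        print(level, len(prompts), prompts[0])
```

Each generated prompt would then be sent to the image models and the resulting images scored with the benchmark; those steps depend on external models and are not shown.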
Habituate is an innovative AI-powered mobile app designed to enhance habit tracking by addressing the limitations of traditional manual logging methods. By integrating computer vision and advanced language models, Habituate provides personalized insights into dietary and workspace organization habits. Users upload images of their meals and workspaces, which are processed through ResNet for object detection. These detections are further analyzed by the Moondream2 visual-language model and Mistral-7B, combined with retrieval-augmented generation, to generate tailored feedback.
By detecting subtle signs of disorganization or declining motivation, Habituate aims to enable early interventions, enhance self-awareness, and support long-term habit formation.
AutoML-Viz is an interactive visualization tool designed to help users refine the search space of Automated Machine Learning (AutoML) processes and analyze the results. It aims to ease the burden of manually selecting machine learning algorithms and tuning hyperparameters by giving users a visual interface to the AutoML process. Users can monitor the process, analyze the searched models, and refine the search space in real time. Searched models can be analyzed at three levels of detail: algorithm, hyperpartition, and hyperparameter.
The tool will be available to users as a dashboard whose elements they can interact with to derive insights that inform their decisions.
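The three levels of detail can be illustrated with a small data sketch. It assumes that a hyperpartition is a fixed choice of an algorithm's categorical hyperparameters, within which the numeric hyperparameters are tuned; the records, scores, and function names below are hypothetical and do not come from the tool.

```python
from collections import defaultdict
from statistics import mean

# Each searched model: (algorithm, hyperpartition, hyperparameters, score).
# All values are made up for illustration.
RESULTS = [
    ("svm", ("kernel=rbf",), {"C": 1.0, "gamma": 0.1}, 0.91),
    ("svm", ("kernel=rbf",), {"C": 10.0, "gamma": 0.01}, 0.93),
    ("svm", ("kernel=linear",), {"C": 1.0}, 0.85),
    ("knn", ("weights=uniform",), {"n_neighbors": 5}, 0.80),
    ("knn", ("weights=distance",), {"n_neighbors": 9}, 0.84),
]


def summarize(results, level):
    """Group scores at the 'algorithm' or 'hyperpartition' level."""
    if level not in ("algorithm", "hyperpartition"):
        raise ValueError(f"unknown level: {level}")
    groups = defaultdict(list)
    for algo, part, _params, score in results:
        key = algo if level == "algorithm" else (algo, part)
        groups[key].append(score)
    return {
        key: {"n": len(scores), "mean": mean(scores), "best": max(scores)}
        for key, scores in groups.items()
    }


def refine(results, min_best):
    """Return hyperpartitions whose best score reaches min_best.

    These are the candidates a user might keep in the search space.
    """
    summary = summarize(results, "hyperpartition")
    return sorted(key for key, stats in summary.items() if stats["best"] >= min_best)


if __name__ == "__main__":
    print(summarize(RESULTS, "algorithm"))
    print(refine(RESULTS, min_best=0.9))
```

The hyperparameter level corresponds to the individual records themselves, which a dashboard could plot directly once a hyperpartition is selected.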
This study evaluated how well large language models (LLMs) can convert natural language descriptions of planning tasks into accurate Planning Domain Definition Language (PDDL) code. It revealed key limitations of LLMs, including difficulties with domain-specific queries and inaccuracies in the generated plans.
To address these challenges, the research proposed a novel approach using scene graphs to represent the initial and goal states of planning problems. This enabled a semantic evaluation method that went beyond traditional syntactic checks, providing a more meaningful comparison between the generated PDDL code and its ground truth. By refining the assessment process, the project offered deeper insights into LLMs' capabilities and limitations in structured planning tasks.
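The idea of comparing states semantically rather than syntactically can be sketched as follows. This is an illustration under stated assumptions, not the study's method: states are taken to be flat lists of ground atoms, the graph encoding (unary predicates as node attributes, other predicates as labeled edges) is one possible choice, and the F1 overlap score is an assumed metric that the source does not name.

```python
import re


def facts(pddl_fragment):
    """Extract ground atoms such as (on a b) as lowercase tuples.

    Only innermost parenthesized groups are matched, so wrappers like
    (:init ...) or (and ...) are skipped.
    """
    atoms = re.findall(r"\(([^()]+)\)", pddl_fragment)
    return {tuple(atom.lower().split()) for atom in atoms if atom.strip()}


def scene_graph(atoms):
    """Encode atoms as a set of graph elements.

    Unary predicates become (node, attribute); predicates with two or more
    arguments become (source, label, target, ...); zero-argument predicates
    attach to a placeholder '<world>' node.
    """
    graph = set()
    for pred, *args in atoms:
        if not args:
            graph.add(("<world>", pred))
        elif len(args) == 1:
            graph.add((args[0], pred))
        else:
            graph.add((args[0], pred, *args[1:]))
    return graph


def graph_f1(generated, reference):
    """F1 overlap between the scene graphs of two PDDL state fragments."""
    g = scene_graph(facts(generated))
    r = scene_graph(facts(reference))
    if not g and not r:
        return 1.0
    overlap = len(g & r)
    if overlap == 0:
        return 0.0
    precision = overlap / len(g)
    recall = overlap / len(r)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "(:init (on a b) (clear a) (ontable b))"
    reordered = "(:init (clear a)\n  (ontable b) (on a b))"
    wrong = "(:init (on b a) (clear a) (ontable b))"
    print(graph_f1(reordered, reference))
    print(graph_f1(wrong, reference))
```

A plain string comparison would flag the reordered fragment as different from the reference even though it describes the same state, whereas the graph comparison treats the two as identical and penalizes only the fragment whose `on` relation is reversed.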
VeRA explored how vector databases (VecDBs) can work alongside LLMs to improve information retrieval and natural language generation.
The research identified key factors that influence Retrieval-Augmented Generation (RAG) performance, such as embedding quality and GPU memory allocation, and showed their impact on retrieval efficiency. By examining varied data sources such as blog posts and images, the project demonstrated how VecDBs can complement LLMs, helping to reduce hallucinations and gaps in domain-specific knowledge. Among the findings, FAISS consistently outperformed ChromaDB in retrieval tasks, and the results underscored the critical role of high-quality embeddings in making RAG workflows effective.
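A comparison of retrieval backends needs a shared quality measure. The sketch below uses recall@k, which is an assumption: the source does not state which metric the project used. The document ids and the backend label are hypothetical and do not represent the project's results.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant ids that appear among the top-k retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant set must not be empty")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall_at_k(runs, k):
    """Average recall@k over (retrieved, relevant) pairs, one per query."""
    if not runs:
        raise ValueError("runs must not be empty")
    return sum(recall_at_k(ret, rel, k) for ret, rel in runs) / len(runs)


# Hypothetical query results for one backend: (ranked ids, relevant ids).
BACKEND_A_RUNS = [
    (["d1", "d4", "d2"], {"d1", "d2"}),
    (["d7", "d3", "d9"], {"d3"}),
]

if __name__ == "__main__":
    print(mean_recall_at_k(BACKEND_A_RUNS, k=2))
```

Running the same queries through each backend with the same embeddings and comparing the averages separates the effect of the index from the effect of embedding quality.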