CRII: III: Improving the Utilization of Humans in Data Integration and Discovery (NSF)
This project will develop new data integration and discovery solutions that account for human-in-the-loop processes and the emergence of LLMs. It will introduce methods to support humans in data integration and discovery, lay the foundations for studying human involvement in these pipelines, and establish new methods to evaluate and benchmark them. This project has the potential to create new human jobs, such as prompt engineers and response validators, and will also contribute to the transparency of solutions via proper prompting and validation of AI responses. More information about the award can be found in the abstract online.
FairPrep: Fair Data Preparation, from Discovery to Integration (BSF) w/ Avigdor Gal
Data integration and discovery are cornerstones of contemporary data science, aiming to understand datasets and to extend and improve them, with a focus on the data itself. Various effective solutions have been proposed over the years; however, a proper emphasis on fairness has yet to be established. We propose FairPrep, which highlights fairness in data preparation, focusing on data discovery and integration. Our project will allow scientists to identify data that must be protected based on regulations or policies (sensitive attributes) and find new sources of information that guarantee fair distributions across datasets.