A Knowledge Graph Warehouse for Neighborhood Information
National Science Foundation Award # 2333790
Project Abstract
This project aims to establish a robust and sustainable data infrastructure that integrates neighborhood-level data to assist and inform local stakeholders. Drawing on local records, census data, and other neighborhood-level data, the project will construct a unified database that captures crucial connections among the many neighborhood-level information sources. Project outcomes include integrated neighborhood-level data and software for constructing and operating a knowledge graph warehouse. The educational component of the project will integrate these outcomes into course content, foster student mentoring, and promote educational innovation with a focus on inclusivity and diversity within the associated STEM programs.
Working in partnership with the National Institute of Justice (NIJ) and other expert entities, this project addresses critical issues in unifying disparate data sources at the neighborhood level (e.g., demographics, land use, local incidents and injuries, and proximity to trauma centers) by leveraging advanced data extraction and record linkage methods. The proposed knowledge graph warehouse is designed to organize and maintain pertinent neighborhood-level information, with free-text data transformed through zero-shot extraction techniques and key-phrase generation methods. The warehouse will support efficient querying and summarization with techniques adapted to its unique structure, including novel pattern mining methods for trend detection. It is designed for sustainability and extensibility: it will remain compatible with other knowledge graphs and will accept incremental updates and extensions for new data and entity types. To ensure data accuracy, the project plans to integrate data from various local agencies, provide user feedback mechanisms, and maintain a robust metadata record. To mitigate biases and provide a comprehensive view, the project will continuously update the infrastructure with new data sources, ensuring transparency by making metadata accessible and recording data provenance.
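As a rough illustration of the ideas in the abstract (incremental updates, multiple contributing agencies, and recorded provenance), the following minimal Python sketch stores facts as subject-predicate-object triples and tracks which source asserted each one. It is not the project's software: the class names, the identifiers such as "tract:001", and the source labels are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """One fact in the graph: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


@dataclass
class KnowledgeGraph:
    # Maps each triple to the set of sources that asserted it (provenance).
    provenance: dict = field(default_factory=dict)

    def add(self, subject, predicate, obj, source):
        """Incrementally add one fact and record which source supplied it."""
        triple = Triple(subject, predicate, obj)
        self.provenance.setdefault(triple, set()).add(source)

    def query(self, subject=None, predicate=None, obj=None):
        """Return triples matching every field that is not None, sorted."""
        matches = [
            t for t in self.provenance
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)
        ]
        return sorted(matches, key=lambda t: (t.subject, t.predicate, t.obj))

    def sources(self, subject, predicate, obj):
        """Return the sorted list of sources that asserted a given fact."""
        key = Triple(subject, predicate, obj)
        return sorted(self.provenance.get(key, set()))


# Hypothetical data: two agencies report the same population figure.
kg = KnowledgeGraph()
kg.add("tract:001", "has_population", "4200", source="census")
kg.add("tract:001", "nearest_trauma_center", "hospital:A", source="health_dept")
kg.add("tract:001", "has_population", "4200", source="city_records")
```

Because a repeated fact is merged rather than duplicated, a query for "tract:001" returns two triples, while the population fact lists both of its sources. A production warehouse would need far more (entity resolution, schema management, persistent storage), none of which this sketch attempts.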
Investigators
Jing Gao (Purdue University, Principal Investigator)
Fenglong Ma (Pennsylvania State University, Co-Principal Investigator)
Jingbo Shang (University of California San Diego, Co-Principal Investigator)
Daniel Semenza (Rutgers University, Co-Principal Investigator)
Partners
Elizabeth Groff (National Institute of Justice)
Barbara E Lopez (National Institute of Justice)
Students
Yi-Hsiang (Purdue University)
Xiaoze Liu (Purdue University)
Xiaochen Wang (Pennsylvania State University)
Zilong Wang (University of California San Diego)
Tianle Wang (University of California San Diego)
Sipeng Zhang (University of California San Diego)
Knowledge Graph Warehouse Architecture
Interface
TBD Link
This website is designed for a broad range of stakeholders—such as policymakers, local government agencies, law enforcement, researchers, health professionals, and community organizations—seeking to understand neighborhood-level firearms data. By providing an integrated view of local incidents, injuries, demographics, proximity to trauma centers, and other relevant factors, the platform empowers users to run targeted queries, uncover trends, and gain meaningful insights into community-specific challenges. These insights can inform policy decisions, guide resource allocation, shape preventative strategies, and facilitate community outreach. Through an intuitive interface and robust metadata features, the website ensures transparent and accountable data use, making it an essential tool for evidence-based decision-making and targeted interventions.
Broader Impacts
This project has significant potential to advance theory and practice in data integration, information extraction, knowledge graphs, data warehouses, databases, pattern mining, and other relevant data science fields. The outcomes of this project include benchmark integrated firearm datasets at both incident and aggregate levels, as well as software implementations of the proposed knowledge graph warehouse construction and operations. By integrating and structuring firearms data that were originally scattered and disconnected in a multi-dimensional space, the project will make critical information and patterns that are essential for firearm policymaking and academic research more searchable, accessible, interoperable, and reusable. With this new firearms data infrastructure, we will be able to give practitioners, policymakers, and researchers in criminal justice and public health effective tools for answering important questions that were previously unanswerable because fine-grained information was lacking and information sources were disconnected. A repository of the developed software with a user interface, datasets, and other research and education materials will be disseminated in the computer science, criminal justice, and public health communities via websites, seminars, tutorials, and presentations. Moreover, the proposed research will be integrated tightly with education: we plan to leverage it for course content adaptation, student mentoring, and educational innovation tasks. We will also encourage the participation of undergraduate and minority students in computing and other major aspects of the project.
Publications
Che, Liwei and Wang, Jiaqi and Liu, Xinyue and Ma, Fenglong. Leveraging Foundation Models for Multi-modal Federated Learning with Incomplete Modality. ECML PKDD 2024.