UnCoRe 2021

An Investigation to Enhance User’s Information Secrecy and Security under A Back-end Retargetable Programming Infrastructure to Facilitate Data Science

NSF REU Summer 2021

About the Project

Across science and engineering disciplines, it is becoming increasingly common to work with large datasets, which enable researchers to make profound new discoveries and contributions. However, most of these scientists and engineers do not specialize in data science. As a result, they may face technical barriers when trying to draw insights from their data. DV^f is a domain-specific functional programming language designed with general engineers and scientists in mind, offering a set of declarative language-based facilities that address this problem. Because it works with datasets and machine learning models, DV^f must consider how it handles potentially sensitive information. As a foundational design principle, the functional DV^f programming language uses a state-of-the-art scientific workflow framework. Research efforts to address common security issues in scientific workflows, such as provenance access control policies, should therefore also be extended to the DV^f infrastructure. Furthermore, a user of DV^f may use their domain knowledge to determine which features would result in the most successful model, which essentially makes the related DV^f program user-associated intellectual property. Exposing such information to unauthorized people would violate the user's intellectual property rights. Finally, some adversaries may attempt to reverse engineer a model to learn about the dataset used to train it. The aim of this project is to investigate and enhance the security and confidentiality of information within this programming infrastructure, with the goal of protecting personal data and intellectual property.
