We are investigating the need for security-aware resource brokering, as opposed to traditional one-dimensional resource allocation, for multi-cloud data collaboration. Data-intensive science applications in bioinformatics and other areas often use federated multi-cloud infrastructures to support their compute-intensive processing needs. However, lack of knowledge about (a) each domain's security policies, (b) how those policies translate into security assurance for the applications, and (c) the nature of the performance and security trade-offs can cause performance-security conflicts for applications and inefficient or expensive resource usage (see Figure 1). We are developing a security-aware resource brokering middleware framework, the MCPS (Multi-Cloud Performance and Security) Broker, that operates within federated multi-cloud systems and allocates application resources so that both performance and security requirements are satisfied. Figure 2 illustrates the architecture of the MCPS Broker for a bioinformatics workflow within a science gateway environment, SoyKB. Our implementation of the MCPS Broker and our case study evaluation with workflows have demonstrated the benefits of the proposed middleware in ensuring both performance optimization and security compliance for SoyKB and other workflows in KBCommons.
Figure 1: End-to-end lifecycle stages of a data-intensive application with dynamic security requirements using federated multi-cloud resources from domains with diverse resource policy specifications.
Figure 2: Security-aware resource brokering middleware services with the MCPS Broker and the interactions among its underlying components.
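To make the contrast between one-dimensional and security-aware allocation concrete, the following is a minimal sketch and not the MCPS Broker implementation. The `Domain` fields, the ordinal security scale, and the runtime estimates are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Domain:
    name: str
    security_level: int      # hypothetical ordinal scale: higher means stronger assurance
    est_runtime_min: float   # hypothetical estimated workflow runtime on this domain


def performance_only(domains: list[Domain]) -> Domain:
    """One-dimensional allocation: pick the fastest domain, ignoring security."""
    return min(domains, key=lambda d: d.est_runtime_min)


def security_aware(domains: list[Domain], required_level: int) -> Optional[Domain]:
    """Pick the fastest domain among those that meet the security requirement."""
    compliant = [d for d in domains if d.security_level >= required_level]
    if not compliant:
        return None  # no domain satisfies the application's security requirement
    return min(compliant, key=lambda d: d.est_runtime_min)


# Hypothetical federation of three cloud domains.
domains = [
    Domain("cloud-a", security_level=1, est_runtime_min=40.0),
    Domain("cloud-b", security_level=3, est_runtime_min=55.0),
    Domain("cloud-c", security_level=2, est_runtime_min=48.0),
]

fastest = performance_only(domains)
chosen = security_aware(domains, required_level=2)
print(fastest.name, chosen.name)
```

In this toy setup the performance-only choice lands on a domain that does not meet the required security level, which is the kind of performance-security conflict the broker is meant to avoid.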
We have investigated multi-cloud template solution deployment issues that arise when user resource specifications involve performance, agility, cost and security (PACS) factors. We have addressed the challenge of multi-cloud resource selection using cloud template solutions based on user requirements. Figure 3 shows the collection, provisioning, consumption and monitoring steps that we have designed to support a custom scientific workflow. We have proposed an optimizer that computes an optimal combinatorial composition based on PACS factors. It is designed to be integrated with a novel resource broker (the PACS Broker) that gives users prescriptive recommendations of template solutions with intelligent choices. In our most recent efforts, we have evaluated the PACS Broker with several bioinformatics workflows in KBCommons.
Figure 3: Architecture design and implementation steps for PACS Broker framework components and their interactions.
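As a rough illustration of combinatorial composition over PACS factors, the sketch below enumerates every combination of template components and keeps the best one that fits a budget. It is an assumption-laden toy, not the PACS Broker's optimizer: the catalog, the normalized scores, the weights, and the treatment of cost as a hard budget constraint are all hypothetical.

```python
from itertools import product

# Hypothetical catalog: scores are normalized to [0, 1]; cost is in arbitrary units per hour.
CATALOG = {
    "compute": {
        "small-vm": {"performance": 0.4, "agility": 0.9, "security": 0.5, "cost": 1.0},
        "large-vm": {"performance": 0.9, "agility": 0.6, "security": 0.5, "cost": 4.0},
    },
    "storage": {
        "object-store": {"performance": 0.5, "agility": 0.8, "security": 0.4, "cost": 0.5},
        "encrypted-block": {"performance": 0.7, "agility": 0.5, "security": 0.9, "cost": 2.0},
    },
}


def benefit(option, weights):
    """Weighted sum of the non-cost PACS factors for one component option."""
    return sum(weights[factor] * option[factor] for factor in weights)


def best_template(catalog, weights, budget):
    """Exhaustively search all component combinations that stay within budget."""
    categories = list(catalog)
    best, best_score = None, float("-inf")
    for combo in product(*(catalog[c].items() for c in categories)):
        total_cost = sum(opt["cost"] for _, opt in combo)
        if total_cost > budget:
            continue  # infeasible: violates the user's cost constraint
        total = sum(benefit(opt, weights) for _, opt in combo)
        if total > best_score:
            best = {c: name for c, (name, _) in zip(categories, combo)}
            best_score = total
    return best


# Hypothetical user preferences over performance, agility and security.
weights = {"performance": 0.4, "agility": 0.2, "security": 0.4}

tight = best_template(CATALOG, weights, budget=5.0)
loose = best_template(CATALOG, weights, budget=10.0)
print(tight, loose)
```

Exhaustive enumeration is only practical for small catalogs; a larger template space would presumably need a smarter search, which this sketch does not attempt.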
We are addressing the adoption of big data analytics in healthcare applications that analyze very large volumes of data and must deal with data heterogeneity and sensitivity. We have proposed a novel community cloud architecture to help clinicians and researchers automate any existing semi-automated big data health application that uses large healthcare-related databases to test bold hypotheses. Our approach involves a co-design of high-scale performance and security compliance through alignment of user requirements and data provider policies via a semi-automated Honest Broker module.
The functions of our semi-automated Honest Broker are shown in Figure 4. Using a case study involving an ophthalmological illness data analysis with multiple data sources (e.g., a Health Facts database, imaging data from scientific instruments, I2B2, Millennium), we have investigated how our community cloud architecture featuring virtual desktop thin-clients can mitigate the query response latency of large-scale queries over billion-scale transaction records, while also ensuring compliance with the heterogeneous data classification levels in the various lifecycle stages of the health big data application.
Figure 4: Illustration of how a semi-automated Honest Broker facilitates protected health big data sharing to meet the requirements of researchers and clinicians.
Using the reference architecture of the above community cloud approach, we conducted an expanded study that was published in a 2019 issue of the Journal for Modeling in Ophthalmology. In this work, we addressed the needs of healthcare researchers (e.g., in ophthalmology) who require easier and broader access to protected data sets from multiple sources in order to find new trends and enhance patient care, without compromising the security compliance of data providers. Specifically, we tackled the inherent lack of trust in the current healthcare community ecosystem between data custodians (i.e., health care organizations, hospitals) and data consumers (i.e., researchers, clinicians). This lack of trust typically results in a manual governance approach that slows data access for researchers, owing to concerns such as auditability of any authorization granted to data consumers and assurance of compliance with health data security standards. We addressed this issue of long-drawn data accessibility by proposing a semi-automated “honest broker” framework that can be implemented in an online health application. The framework establishes trust between the data consumers and the custodians by: (a) improving the efficiency of compliance checking for data consumer requests using a risk assessment technique, (b) incorporating auditability for consumers to access protected data by including a custodian-in-the-loop only when essential, and (c) increasing the speed of large-volume data actions (such as view, copy, modify and delete) using a popular common data model. Our experimental results featured an ophthalmology case study on age-related cataract research and demonstrated how our solution approach practically implements natural language processing concepts, risk assessment (following NIST SP 800 guidelines) and a common data model.
Thus, our results showed how we can improve timely data access and secure computation of protected data for ultimately achieving data-driven eye health insights.
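The custodian-in-the-loop idea can be sketched as a simple risk gate. The following is a hypothetical illustration, not the published framework: the 1 to 5 likelihood and impact scales, the data classification names, the likelihood-times-impact score, and the threshold are all assumptions made for the example; only the action names (view, copy, modify, delete) come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    CUSTODIAN_REVIEW = "custodian_review"


# Hypothetical 1-5 ordinal scales; real values would come from a formal risk assessment.
ACTION_LIKELIHOOD = {"view": 1, "copy": 3, "modify": 4, "delete": 5}
DATA_IMPACT = {"deidentified": 1, "limited": 3, "identified": 5}


@dataclass(frozen=True)
class Request:
    consumer: str
    action: str       # one of: view, copy, modify, delete
    data_class: str   # hypothetical data classification label


def risk_score(req: Request) -> int:
    """Assumed scoring rule: likelihood of misuse times impact of exposure."""
    return ACTION_LIKELIHOOD[req.action] * DATA_IMPACT[req.data_class]


def decide(req: Request, audit_log: list, threshold: int = 6) -> Decision:
    """Approve low-risk requests automatically; escalate the rest to a custodian."""
    score = risk_score(req)
    decision = Decision.AUTO_APPROVE if score <= threshold else Decision.CUSTODIAN_REVIEW
    audit_log.append((req, score, decision))  # every decision is recorded for auditability
    return decision


audit_log = []
low = decide(Request("researcher-1", "view", "deidentified"), audit_log)
high = decide(Request("researcher-2", "copy", "limited"), audit_log)
print(low.value, high.value, len(audit_log))
```

The design intent mirrored here is that the custodian is only involved when the score exceeds the threshold, while the audit log captures both automatic and escalated decisions.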
This project has been supported by the National Science Foundation (OAC-1827177) | Updated December 2021.