Research areas:
My research focuses on Big Data Security, IoT Security, Cyber-Security, Computer Networking, Cryptography, Vulnerability Assessment, Penetration Testing, Ethical Hacking, and Software Test Automation.
Data security has both technical and social dimensions. Understanding and leveraging the interplay of these dimensions can help design more secure systems and more effective policies. However, the majority of data security research has focused only on the technical dimension. In my research, I take a different approach that combines big data techniques, computational models, and network science methods with IoT networks to ensure network reliability by assuring the security of sensitive data.
Current Research
I am currently conducting my research in the area of data security for the EXPOSOME project. The Texas Tech University (TTU) EXPOSOME Project is funded by both NIH and NSF to identify the exposures people experience from the prenatal stage through death and the impact of these exposures on health. Raw data for analysis arrives in various formats; hence, it is a very difficult task for data analysts to analyze the raw data and gain insight from it. A transdisciplinary team is currently implementing a secure system to convert the different data formats for storage in a MongoDB database for further analysis of the TTU EXPOSOME Project files.
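To illustrate the kind of conversion step involved, the following is a minimal Python sketch, not the project's actual pipeline: it assumes hypothetical CSV and JSON inputs and a hypothetical "exposome.raw_records" collection, and simply normalizes records into documents before loading them into MongoDB with pymongo.

# Minimal sketch (not the project's actual pipeline): normalize heterogeneous
# raw files (CSV and JSON assumed here for illustration) into documents and
# load them into MongoDB for downstream analysis.
import csv
import json
from pathlib import Path

from pymongo import MongoClient  # pip install pymongo


def load_records(path: Path):
    """Yield dict records from a raw file, dispatching on its extension."""
    if path.suffix == ".csv":
        with path.open(newline="") as f:
            yield from csv.DictReader(f)
    elif path.suffix == ".json":
        with path.open() as f:
            data = json.load(f)
            yield from (data if isinstance(data, list) else [data])
    # other formats (Excel, XML, ...) would get their own branches


def ingest(raw_dir: str, mongo_uri: str = "mongodb://localhost:27017"):
    client = MongoClient(mongo_uri)
    collection = client["exposome"]["raw_records"]   # hypothetical names
    for path in Path(raw_dir).iterdir():
        docs = [dict(r, source_file=path.name) for r in load_records(path)]
        if docs:
            collection.insert_many(docs)


if __name__ == "__main__":
    ingest("./raw_data")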
Since more data is being added continuously, including de-identified but still protected personal health records [1], the data must be secured and its privacy assured. In addition, authenticated researchers' access to the data should not be impaired while keeping the data safe. As a first solution, the transdisciplinary team designed a secure data analytic model with the TTU EXPOSOME Project data files, a Linux server, a Linux Singularity container, and MongoDB, due to their availability at the TTU High Performance Computing Center (HPCC).
In this project, my role is that of a security analyst: to ensure the security of the entire data set so that the system complies with the Health Insurance Portability and Accountability Act (HIPAA). After implementing the initial system with basic security features, we recently published the paper "NoSQL Based Medical Data Management and Retrieval: Exposome Project securely" in the ACM Companion Proceedings of the 10th International Conference on Utility and Cloud Computing (acceptance rate 27%). As the next step, in order to enhance the security of the current system, my research investigated how to assure the TTU EXPOSOME Project data security by proposing a secure data analytic framework with the Singularity Linux container and the MongoDB NoSQL database, both commonly available at TTU. At this stage, I further investigated the advantages of Linux Containers (LXCs) [2] over Virtual Machines (VMs) [3] from security and performance perspectives. I then focused on four main HIPAA-required areas of data security, namely authentication, authorization, encryption, and auditing, to ensure the system is secure enough to handle healthcare data. After finishing this stage, my second paper, "A Review of MongoDB and Singularity Container Security in regards to HIPAA Regulations," was accepted and published in the ACM Companion Proceedings of the 10th International Conference on Utility and Cloud Computing (acceptance rate 27%).
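As a rough illustration of those four areas, the sketch below shows how the first three can be exercised from Python with pymongo; this is a hedged sketch of the general technique, not the framework's actual configuration. The host name, credentials, and CA file path are placeholders, and auditing appears only as a comment because it is configured on the MongoDB server side (an Enterprise feature), not from the client.

# Minimal sketch of the HIPAA-oriented controls discussed above, applied to
# MongoDB from Python. All names and credentials are placeholders.
from pymongo import MongoClient

# Encryption in transit: connect over TLS to a mongod assumed to be started
# with TLS required on the server side.
admin_client = MongoClient(
    "mongodb://dbadmin:admin_password@mongo.example.org:27017/?authSource=admin",
    tls=True,
    tlsCAFile="/etc/ssl/exposome-ca.pem",   # hypothetical CA bundle
)

# Authentication + authorization: create a least-privilege analyst account
# that can only read the research database, not administer the server.
admin_client["exposome"].command({
    "createUser": "analyst",
    "pwd": "analyst_password",
    "roles": [{"role": "read", "db": "exposome"}],
})

# Auditing (MongoDB Enterprise): enabled in the mongod configuration, e.g. an
# auditLog destination and filter that records authentication and
# authorization events; it cannot be turned on from this client code.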
After finishing the initial survey of security and performance issues with VMs, LXCs, and MongoDB on the Linux cluster, I implemented the system proposed in the literature survey. To find the vulnerabilities present in any system, it is then mandatory to conduct vulnerability assessments as regularly scheduled tasks. Thus, an easily deployable, easily maintainable, and accurate vulnerability assessment testbed or model, as facilitated by Linux containers, is helpful. Nowadays, Linux containers (LXCs), which provide operating-system-level virtualization, are preferred over virtual machines (VMs), which rely on hypervisor- or kernel-level virtualization, in high performance computing (HPC) for reasons such as high portability, high performance, efficiency, and high security [4]. Hence, LXCs can support an efficient and scalable vulnerability assessment testbed or model that uses already developed analysis tools, such as OpenVas, Dagda, PortSpider, OWASP Zed Attack Proxy, and OpenSCAP, to assure the required security level of a given system very easily. To verify the overall security of any given software system, this research first introduces a virtual, portable, and easily deployable general vulnerability assessment testbed within a Linux container network. From this stage I published one more conference proceeding, "Dynamic & portable vulnerability assessment testbed with Linux containers to ensure the security of MongoDB in Singularity LXCs," at Supercomputing-2018 (SC18), and a journal research article, "Security Assurance of MongoDB in Singularity LXCs: An Elastic & Convenient Testbed Using Linux Containers to Explore Vulnerabilities," in Springer Cluster Computing (journal special issue): the Journal of Networks, Software Tools and Applications. These two publications present how to conduct experiments with this testbed on a MongoDB database implemented in Singularity Linux containers to find the vulnerabilities present in the images accompanying the containers, the host, and the network by integrating three tools, OpenVas, Dagda, and PortSpider, into the container-based testbed. Finally, they discuss how to use the generated results to improve the security level of the given system.
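The snippet below is a minimal stand-in for the network-facing side of such a testbed: a simple TCP port probe of a target container host, of the kind PortSpider-style scanning automates. It does not reproduce the actual OpenVas, Dagda, or PortSpider integrations, and the host name and port list are illustrative assumptions.

# Minimal sketch of the network-facing side of the testbed: probe a target
# container host for listening TCP ports (the kind of exposure PortSpider-style
# scanning reports). Host name and port list are illustrative only.
import socket

COMMON_PORTS = [22, 80, 443, 8080, 27017, 28017]   # 27017/28017: MongoDB


def probe(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def scan(host: str):
    open_ports = [p for p in COMMON_PORTS if probe(host, p)]
    for port in open_ports:
        print(f"{host}:{port} is open -- verify it is intentionally exposed")
    return open_ports


if __name__ == "__main__":
    scan("mongodb-container.local")   # hypothetical container host name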
In the aforementioned steps, only the basic secure data analytic framework with MongoDB and Singularity is considered, which complies closely with the HIPAA data security requirements. In order to improve the security of the proposed (partially implemented) system in the future, it is important to make sure that the entire system complies with further HIPAA requirements [5], such as ensuring confidentiality, integrity, availability, risk analysis, risk management, administrative safeguards, technical safeguards, physical safeguards, and organizational needs. Three main tasks were identified to achieve such system security: 1. vulnerability and threat analysis of the proposed system; 2. a proposal for a new security model with enhanced features that protects the system from any vulnerability found in task one, yielding a virtual data analytic framework with MongoDB and Singularity; and 3. introduction of security mechanisms to ensure network security, web application security, and physical security. My continuous research efforts completed these three sub-steps. From the first sub-step's results, I published another conference proceeding, "Vulnerability Prioritization, Root Cause Analysis, and Mitigation of Secure Data Analytic Framework Implemented with MongoDB on Singularity Linux Containers," at the 4th International Conference on Compute and Data Analysis (ICCDA-2020). The next article, "Mechanisms and Techniques to Enhance the Security of Big Data Analytic Framework with MongoDB and Linux Containers," combines the results of the second and third sub-steps and was recently published in the Elsevier journal Array.
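As a simplified illustration of the prioritization step in task one, the sketch below ranks scanner findings by CVSS score and groups them by affected component so the highest-impact root causes surface first; the findings shown are placeholders, not results from the actual assessment.

# Minimal sketch of vulnerability prioritization: rank findings by CVSS score
# and group them by component to suggest where one mitigation removes several
# issues. The entries are illustrative placeholders.
from collections import defaultdict

findings = [
    {"id": "CVE-XXXX-0001", "component": "mongodb", "cvss": 9.8},
    {"id": "CVE-XXXX-0002", "component": "host-kernel", "cvss": 7.5},
    {"id": "CVE-XXXX-0003", "component": "mongodb", "cvss": 5.3},
]

# Highest severity first.
ranked = sorted(findings, key=lambda f: f["cvss"], reverse=True)

# Group by affected component for root cause analysis.
by_component = defaultdict(list)
for f in ranked:
    by_component[f["component"]].append(f)

for component, items in by_component.items():
    worst = max(i["cvss"] for i in items)
    print(f"{component}: {len(items)} findings, worst CVSS {worst}")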
The big picture of the first proposed data analytic framework, implemented on a Windows cluster at the TTU HPCC, is shown below in Figure-01 (1st publication). I then conducted a vulnerability analysis, followed by a literature survey, to move the current system to a Linux cluster with LXCs, as shown in Figure-02, to improve its security (2nd publication). After implementing the system proposed in the 2nd publication, as shown in Figure-03, I built a vulnerability assessment testbed, shown in Figure-04, with Linux containers and a set of vulnerability assessment tools to assess how vulnerable the current Secure Big Data Analytic Framework with MongoDB and Linux Containers is to attacks (3rd and 4th publications). Finally, this research finds the root causes of the major vulnerabilities discovered in the vulnerability assessment, and then proposes and implements security mechanisms to remove those major vulnerabilities from the data analytic framework (5th publication). Based on new findings, we can keep improving the system and repeating this process. Later, this system can be used as a virtual secure data analytic framework in any data analytic research where security is essential.
Ultimately, my research focus will be in the direction of designing and building novel algorithms and systems that improve the performance, privacy, and security of data. The knowledge I gained through my prior research work, which revealed several security issues and new ideas, will be immensely instrumental in continuing the current doctoral research and the forthcoming publications on this topic, through which I intend to combine computer science and its users to enable a secure and reliable data network experience in the future.