Big Data PRIVACY
PhD in Computer Science and Engineering
A.Y. 2024-2025
University of Bologna, Italy
Prof. Alfredo Cuzzocrea
Speaker
Dist. Prof. Alfredo Cuzzocrea
iDEA Lab, Founder and Director
University of Calabria, Rende, Italy
Excellence Chair in Computer Engineering –
Big Data Management and Analytics
Department of Computer Science
University of Paris City, Paris, France
web: https://sites.google.com/unical.it/cuzzocrea/
Prof. Alfredo Cuzzocrea - Biographical Notes
Alfredo Cuzzocrea is Distinguished Professor of Computer Engineering, and Founder and Director of the Big Data Engineering and Analytics Laboratory (iDEA Lab) at the University of Calabria, Rende, Italy. He is also Full Professor of Computer Engineering at the University of Paris City, Paris, France, where he holds the Excellence Chair in Big Data Management and Analytics. He is Honorary Professor of Computer Engineering at the School of Engineering and Technology of Amity University, Noida, India. He is also Research Associate of the National Research Council (CNR), Rome, Italy.
Course Overview
Abstract and Lecture Summary
Big data privacy is gaining momentum in the research community because of the many challenges involved in ensuring the privacy of such data in real-life applications and systems. The course starts from the foundations of big data privacy and moves on to specialized topics, with particular regard to multidimensional big data.
Lecture 1: Big Data Privacy: Foundations
Summary: Lecture 1 covers the foundations of big data privacy and explores the critical aspects of privacy in the era of big data. As the volume, variety, and velocity of data continue to grow, ensuring the privacy and security of personal information has become increasingly challenging. We will discuss the following key topics: (i) Introduction to Big Data: understanding the fundamentals of big data and its impact on various industries; (ii) Privacy Concerns: identifying the primary privacy issues associated with big data, including data breaches, unauthorized access, and data misuse; (iii) Regulatory Frameworks: examining the legal and regulatory frameworks that govern data privacy, such as GDPR and other international standards; (iv) Privacy-Preserving Techniques: exploring techniques and technologies designed to protect data privacy, including anonymization, encryption, and differential privacy; (v) Case Studies: analyzing real-world examples of big data privacy challenges and how organizations have addressed them; (vi) Future Trends: discussing emerging trends and future directions in big data privacy, including advances in AI and machine learning for enhanced data protection.
Lecture 2: State-Of-The-Art Algorithms for Big Data Privacy
Summary: Lecture 2 focuses on the most relevant algorithms for achieving big data privacy. These algorithms form the baseline knowledge for building solid privacy-preserving big data systems. Among them, we will consider the following: (i) Differential Privacy: a formal framework that bounds how much any single individual's data can influence the output of an analysis, typically by adding calibrated random noise to query results; (ii) Homomorphic Encryption: allows computations to be performed on encrypted data without decrypting it; (iii) Secure Multi-Party Computation: enables multiple parties to jointly compute a function over their inputs while keeping those inputs private; (iv) k-Anonymity: ensures that each record in a dataset is indistinguishable from at least k-1 other records with respect to a set of quasi-identifying attributes; (v) l-Diversity: extends k-anonymity by requiring that each equivalence class contain at least l well-represented values of the sensitive attribute; (vi) t-Closeness: requires that the distribution of a sensitive attribute in each equivalence class be within a threshold t of its distribution in the overall dataset.
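As an illustration of three of the notions listed above, the following is a minimal Python sketch, not course material. It assumes a counting query with sensitivity 1 released through the Laplace mechanism, and it checks k-anonymity and the simplest ("distinct") form of l-diversity on a toy table. The function names, the toy records, and the choice of quasi-identifiers are all hypothetical.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(records, predicate, epsilon, rng):
    # Laplace mechanism for a counting query: adding or removing one
    # record changes the count by at most 1 (sensitivity 1), so the
    # noise scale is 1 / epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


def is_k_anonymous(records, quasi_identifiers, k):
    # Every combination of quasi-identifier values must occur >= k times.
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(size >= k for size in groups.values())


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    # Distinct l-diversity: every equivalence class must contain at
    # least l different values of the sensitive attribute.
    classes = {}
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        classes.setdefault(key, set()).add(r[sensitive])
    return all(len(values) >= l for values in classes.values())


# Hypothetical, already-generalized toy table.
records = [
    {"age": "30-39", "zip": "401**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "401**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "402**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "402**", "diagnosis": "flu"},
]

rng = random.Random(42)
print(dp_count(records, lambda r: r["diagnosis"] == "flu", epsilon=1.0, rng=rng))
print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_l_diverse(records, ["age", "zip"], "diagnosis", l=2))
```

In this toy table every quasi-identifier group has two records, so the 2-anonymity check passes, while the second group contains a single diagnosis value, so the distinct 2-diversity check fails. This is the kind of gap that l-diversity is meant to expose.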
Lecture 3: Federated Learning over Big Data
Summary: Lecture 3 presents topics related to Federated Learning (FL) over big data. The lecture will focus on understanding the concept of FL and its importance in the context of big data, addressing privacy concerns, data security, and the need for decentralized data processing. Afterwards, some key concepts of FL will be explored, such as: (i) Federated Averaging (FedAvg): a core algorithm that aggregates model updates from multiple clients to create a global model; (ii) Client-Server Architecture: the structure in which multiple clients (devices) collaborate with a central server to train a model; (iii) Privacy-Preserving Techniques: methods such as differential privacy and secure multi-party computation used to ensure data privacy. Finally, various FL algorithms will be presented in detail.
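The aggregation step of Federated Averaging can be sketched in a few lines of Python. This is only an illustrative toy, not the lecture's material: it assumes a one-dimensional linear model y ≈ w·x + b, full client participation in every round, and plain gradient descent as the local optimizer. The function names, the learning rate, and the client datasets are hypothetical, and no differential privacy or secure aggregation is applied.

```python
def local_update(weights, data, lr=0.05, epochs=5):
    # Client side: a few epochs of gradient descent on the mean squared
    # error of y ~ w * x + b, using only this client's local data.
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def fed_avg(client_weights, client_sizes):
    # Server side: average the client models, weighting each client by
    # the number of local samples it trained on.
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return tuple(
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    )


def train(clients, rounds=200):
    # Each round: broadcast the global model, train locally, aggregate.
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = fed_avg(updates, [len(data) for data in clients])
    return global_weights


# Hypothetical client datasets, both drawn from y = 2x + 1.
# The raw (x, y) pairs never leave the client; only weights are shared.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]

print(train(clients))
```

Because both toy clients hold noise-free samples of the same line, the global model should move toward w = 2 and b = 1. With heterogeneous (non-IID) client data the averaged model generally behaves less predictably.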
Lecture 4: Privacy-Preserving Big Multidimensional Data Management
Summary: Making big multidimensional data privacy-preserving is a major research topic in Big Data research, and several approaches have been proposed and explored in different computational settings. Lecture 4 will discuss the fundamentals of privacy-preserving OLAP and will present a novel technique for obtaining privacy-preserving data cubes, a special case of big multidimensional data, that “balance” accuracy and privacy constraints for a wide family of next-generation applications, including those in the modern context of Cloud Computing. The technique is based on flexible sampling-based methods, which improve “query coverage” at the cost of some computational overhead. An analysis of experimental results will be provided, with a discussion of trade-offs and an assessment of pros and cons.
Lecture 5: Privacy-Preserving Big Data Analytics via Drill-Across Multidimensional Analytics over Big Co-Occurrence Aggregate Hierarchical Data
Summary: Lecture 5 presents the Drill-CODA framework, which supports drill-across multidimensional big data analytics over big co-occurrence aggregate hierarchical data, with privacy-preservation features. Drill-CODA is a composite framework that combines several data processing and analytics metaphors over hierarchical data, all in a multidimensional fashion, with the goal of providing useful insights over large-scale big data repositories while protecting their privacy. It demonstrates how complex techniques can be combined to achieve privacy preservation over big data. The lecture covers the principles, architecture, algorithms, and functionalities of Drill-CODA, along with its experimental evaluation.
Course Duration
20 hours
Course Prerequisites
Big data foundations
Room
TBA, Department of Computer Science and Engineering - DISI - University of Bologna
Examination and Grading
Oral presentation on a course topic, plus a general examination on all course topics. Upon passing the exam, students will receive an official certificate, which can be submitted to the UniBo PhD Office as proof of the credits earned.
Course Calendar
November 17, 2025 – 09:00-13:00
November 18, 2025 – 09:00-13:00
November 19, 2025 – 09:00-13:00
November 20, 2025 – 09:00-13:00
November 21, 2025 – 09:00-13:00
Examination Days (Tentative)
December 15, 2025 – 09:30-12:30 (Microsoft Teams)
December 16, 2025 – 09:30-12:30 (Microsoft Teams)
Course Material
Delivered to students attending the course