Big Data Privacy - Prof. Alfredo Cuzzocrea

Big Data PRIVACY

PhD in Computer Science and Engineering
A.Y. 2024-2025
University of Bologna, Italy
Prof. Alfredo Cuzzocrea

Speaker

Dist. Prof. Alfredo Cuzzocrea
Full Professor
iDEA Lab, Founder and Director
University of Calabria, Rende, Italy
Excellence Chair in Computer Engineering –
Big Data Management and Analytics
Department of Computer Science
University of Paris City, Paris, France
web: https://sites.google.com/unical.it/cuzzocrea/

Prof. Alfredo Cuzzocrea - Biographical Notes

Alfredo Cuzzocrea is Full Professor and Distinguished Professor of Computer Engineering, and Founder and Director of the Big Data Engineering and Analytics Laboratory (iDEA Lab) of the University of Calabria, Rende, Italy. He also holds the Excellence Chair in Big Data Management and Analytics at the University of Paris City, Paris, France. He is Honorary Professor of Computer Engineering at the School of Engineering and Technology of the Amity University, Noida, India. He is also Research Associate of the National Research Council (CNR), Rome, Italy.

Course Overview

Abstract and Lecture Summary
Big data privacy is gaining momentum in the research community because ensuring the privacy of such data in real-life applications and systems poses several challenges. The course starts from the foundations of big data privacy and progresses to specialized topics, with particular regard to multidimensional big data.

Lecture 1: Big Data Privacy: Foundations
Summary: Lecture 1 covers the foundations of big data privacy. In this lecture, we will explore the critical aspects of privacy in the era of big data. As the volume, variety, and velocity of data continue to grow, ensuring the privacy and security of personal information has become increasingly challenging. We will discuss the following key topics: (i) Introduction to Big Data: understanding the fundamentals of big data and its impact on various industries; (ii) Privacy Concerns: identifying the primary privacy issues associated with big data, including data breaches, unauthorized access, and data misuse; (iii) Regulatory Frameworks: examining the legal and regulatory frameworks that govern data privacy, such as GDPR and other international standards; (iv) Privacy-Preserving Techniques: exploring various techniques and technologies designed to protect data privacy, including anonymization, encryption, and differential privacy; (v) Case Studies: analyzing real-world examples of big data privacy challenges and how organizations have addressed them; (vi) Future Trends: discussing emerging trends and future directions in big data privacy, including advancements in AI and machine learning for enhanced data protection.

Lecture 2: State-Of-The-Art Algorithms for Big Data Privacy
Summary: Lecture 2 focuses on the most relevant algorithms for achieving big data privacy. These algorithms are the baseline knowledge needed to build solid privacy-preserving big data systems. Among the relevant privacy-preserving algorithms, we will consider the following: (i) Differential Privacy: a technique that limits what can be learned about any individual in a dataset, typically by adding calibrated random noise to the data or to query results; (ii) Homomorphic Encryption: allows computations to be performed on encrypted data without decrypting it; (iii) Secure Multi-Party Computation: enables multiple parties to jointly compute a function over their inputs while keeping those inputs private; (iv) k-Anonymity: ensures that each record in a dataset is indistinguishable from at least k-1 other records with respect to certain identifying attributes; (v) l-Diversity: extends k-anonymity by ensuring that sensitive attributes have at least l well-represented values in each equivalence class; (vi) t-Closeness: ensures that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall dataset.
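
As an illustration of the k-anonymity and l-diversity definitions given in the Lecture 2 summary, the following is a minimal Python sketch and not course material. It assumes records are plain dictionaries and that "well-represented" is interpreted in its simplest form, namely l distinct sensitive values per equivalence class. The function names, attribute names, and toy records are all hypothetical.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values.

    Assumption: this is the simple "distinct" variant of l-diversity.
    """
    groups = equivalence_classes(records, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in groups.values()
    )


# Hypothetical, already-generalized toy table.
records = [
    {"age": "30-39", "zip": "401**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "401**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "402**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "402**", "diagnosis": "flu"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_distinct_l_diverse(records, ["age", "zip"], "diagnosis", l=2))
```

In this toy table every equivalence class has two records, so the table is 2-anonymous, yet the second class contains a single diagnosis value, so it is not 2-diverse. This is the kind of gap that l-diversity is meant to close.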

Lecture 3: Federated Learning over Big Data
Summary: Lecture 3 presents topics related to Federated Learning (FL) over big data. The lecture will focus on understanding the concept of FL and its importance in the context of big data, addressing privacy concerns, data security, and the need for decentralized data processing. Next, some key concepts of FL will be explored, such as: (i) Federated Averaging (FedAvg): a core algorithm that aggregates model updates from multiple clients to create a global model; (ii) Client-Server Architecture: the structure where multiple clients (devices) collaborate with a central server to train a model; (iii) Privacy-Preserving Techniques: methods such as differential privacy and secure multi-party computation used to ensure data privacy. Finally, various FL algorithms will be presented in detail.
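
The aggregation step of Federated Averaging mentioned in the Lecture 3 summary can be sketched as follows. This is a minimal illustration and not course material. It assumes each client sends its locally trained parameters as a flat list of floats together with the number of local training examples, and that the server combines them as an example-weighted average. The function name and the toy values are hypothetical, and local training and communication are omitted.

```python
def fed_avg(client_updates):
    """Aggregate client parameters into global parameters.

    client_updates: list of (weights, num_examples) pairs, where weights
    is a list of floats of equal length for every client.
    """
    total_examples = sum(n for _, n in client_updates)
    if total_examples <= 0:
        raise ValueError("at least one client with examples is required")

    dim = len(client_updates[0][0])
    global_weights = [0.0] * dim
    for weights, n in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must send vectors of equal length")
        # Each client contributes in proportion to its local data size.
        for i, w in enumerate(weights):
            global_weights[i] += w * n / total_examples
    return global_weights


# Hypothetical round with two clients holding 10 and 30 examples.
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
print(fed_avg(updates))
```

Only model parameters and example counts reach the server in this sketch; raw client data never leaves the clients, which is the property that makes FL relevant to privacy.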

Lecture 4: Privacy-Preserving Big Multidimensional Data Management
Summary: Making big multidimensional data privacy-preserving is a major topic in Big Data research, and several approaches have been proposed and explored in different computational settings. Lecture 4 will discuss the fundamentals of privacy-preserving OLAP and present a novel technique for obtaining privacy-preserving data cubes, a special case of big multidimensional data, that “balance” accuracy and privacy constraints for a wide family of next-generation applications, including those in the modern context of Cloud Computing. The proposed technique is based on flexible sampling-based methods, which improve “query coverage” at the cost of some computational overhead. An analysis of experimental results will be provided, with a discussion of trade-offs and an assessment of pros and cons.

Lecture 5: Advanced Privacy-Preserving Big Data Analytics
Summary: Lecture 5 presents frameworks that support advanced privacy-preserving big data analytics methods and systems. In particular, we will first describe the QUALITOP Federated Big Data Analytics Learning System (QFLS), a Cloud-based framework supporting big healthcare data management and analytics for big data lakes, with an emphasis on its main privacy-preserving data management and analytics functionalities. We will then describe an innovative algorithmic framework called Advanced Privacy-Preserving Big Data Publishing in Hierarchical DOMains (AB-DOM), which combines state-of-the-art anonymization techniques with a graph coloring algorithm and an integrated data sampling method to keep sensitive data well protected.

Course Duration
20 hours

Course Prerequisites
Big data foundations

Room
On November 17, 2025: Anthropology Hall, third floor, BiGeA Department.

From November 18, 2025 to November 21, 2025: Nadia Busi Hall, ground floor, Department of Computer Science and Engineering.

Examination and Grading
Oral presentation on one course topic, plus a general examination on all course topics. Upon passing the exam, an official certificate will be issued. This certificate can be submitted to the UniBo PhD Office as proof of the credits earned.

Course Calendar
November 17, 2025 – 09:00-13:00
November 18, 2025 – 09:00-13:00
November 19, 2025 – 09:00-13:00
November 20, 2025 – 09:00-13:00
November 21, 2025 – 09:00-13:00

Examination Days (Tentative)
December 15, 2025 – 09:30-12:30 (Microsoft Teams)
December 16, 2025 – 09:30-12:30 (Microsoft Teams)

Course Material
Delivered to students attending the course
