Contributed talks

Tuesday morning (13/02): Security

Timing side channels have been shown to undermine the security of various systems and cryptographic protocols. Likewise, many differentially private algorithms provide vacuous privacy guarantees when their running time is revealed along with their output. Consequently, many implementations of differentially private algorithms are not well suited for the interactive query setting. In this work, we establish a general framework for differential privacy in the presence of timing side channels. We define new notions of running-time stability and running-time privacy with respect to the joint distribution of the program’s output and running time. Our framework enables chaining running-time stable programs with timing-private delay programs (programs that delay releasing their output) such that the entire execution achieves running-time privacy. Importantly, our definitions allow for measuring running-time privacy and output privacy using different privacy measures which can simplify the overall privacy analysis. 
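
As a rough illustration of the chaining idea, here is a minimal sketch that wraps a (presumed) running-time stable program in a hypothetical delay program releasing its output at a randomized target time. The function name, the Laplace delay distribution, and the parameters are illustrative assumptions, not the paper's construction.

```python
import time
import numpy as np

def delay_release(program, data, base_delay=1.0, scale=0.1, rng=None):
    """Hypothetical timing-private delay wrapper: run `program`, then hold
    its output until a randomized release time, so that the observable
    wall-clock time reveals little about the input."""
    rng = rng or np.random.default_rng()
    start = time.monotonic()
    result = program(data)  # assumed to be running-time stable
    # Target release time = start + base delay + Laplace noise,
    # clamped so we never release before the program has finished.
    target = start + base_delay + rng.laplace(0.0, scale)
    time.sleep(max(0.0, target - time.monotonic()))
    return result
```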

(slides)

In “The Incognito Conundrum” we explore the challenge of balancing privacy with accountability in online identity management, highlighting state-of-the-art methods for anonymous authentication and the difficulties that remain in private identity systems.

(slides)

Tuesday afternoon (13/02): Machine Learning

We consider the setting where machine learning models are retrained on updated datasets in order to incorporate the most up-to-date information or reflect distribution shifts. We investigate whether one can infer information about these updates to the training data (e.g., changes to attribute values of records). Here, the adversary has access to snapshots of the machine learning model before and after the change in the dataset occurs. Contrary to the existing literature, we assume that an attribute of one or more training data points is changed, rather than entire data records being removed or added. We propose attacks based on the difference in prediction confidence between the original model and the updated model. We evaluate our attack methods on two public datasets, using multi-layer perceptron and logistic regression models. We validate that two snapshots of the model can result in higher information leakage than access to only the updated model. Moreover, we observe that data records with rare values are more vulnerable to attacks, which points to a disparate vulnerability to privacy attacks in the update setting. When multiple records with the same original attribute value are updated to the same new value (i.e., repeated changes), the attacker is more likely to correctly guess the updated values, since repeated changes leave a larger footprint on the trained model. These observations point to the vulnerability of machine learning models to attribute inference attacks in the update setting.
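
As a rough sketch of a confidence-difference attack in this spirit, assuming scikit-learn-style models that expose predict_proba (as logistic regression and multi-layer perceptrons do), one could score candidate attribute values by the confidence gain between snapshots. The function and parameter names are hypothetical, and the features used by the actual attacks may differ.

```python
import numpy as np

def guess_updated_value(model_before, model_after, record, attr_idx, candidates):
    """Hypothetical attack sketch: score each candidate attribute value by how
    much the model's confidence changes between the two snapshots, and guess
    the value the updated model gained the most confidence in."""
    scores = {}
    for v in candidates:
        probe = np.array(record, dtype=float)
        probe[attr_idx] = v  # plug in the candidate attribute value
        p_before = model_before.predict_proba([probe])[0].max()
        p_after = model_after.predict_proba([probe])[0].max()
        scores[v] = p_after - p_before
    return max(scores, key=scores.get)
```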

Multiparty computation is a key privacy-enhancing technology, as it allows computing on distributed data without revealing that data to any individual participant. I will introduce its core aspects and present results on using it for machine learning.
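
As a concrete taste of the core idea, below is a minimal additive secret-sharing sketch, one standard MPC building block (not necessarily the protocols covered in the talk): each party holds a random-looking share, any strict subset of shares reveals nothing about the input, and sums can be computed locally on shares.

```python
import secrets

P = 2**61 - 1  # public prime modulus for arithmetic on shares

def share(x, n=3):
    """Additively secret-share x among n parties: any n-1 shares are
    uniformly random and reveal nothing about x."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def add_shares(a, b):
    """Each party adds its own shares locally; no value is ever revealed."""
    return [(sa + sb) % P for sa, sb in zip(a, b)]

def reconstruct(shares):
    return sum(shares) % P

# Example: two inputs summed without any single party seeing either one.
a, b = share(42), share(100)
assert reconstruct(add_shares(a, b)) == 142
```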

(slides)

Wednesday morning (14/02): Privacy Measures and Accounting

Third-party cookies have been a privacy concern since cookies were first developed in the mid-1990s, but stricter cookie policies were introduced by Internet browser vendors only in the early 2010s. More recently, due to regulatory changes, browser vendors have started to block third-party cookies entirely, with both Firefox and Safari already doing so.

The Topics API is being proposed by Google as an additional and less intrusive source of information for interest-based advertising (IBA), following the upcoming deprecation of third-party cookies. Initial results published by Google estimate that the probability of correctly re-identifying a random individual would be below 3%, while still supporting IBA.

We analyze the re-identification risk for individual Internet users introduced by the Topics API from the perspective of Quantitative Information Flow (QIF), an information- and decision-theoretic framework. Our model allows a theoretical analysis of both privacy and utility aspects of the API and their trade-off, and we show that the Topics API does have better privacy than third-party cookies while providing high utility to IBA companies.
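
For readers unfamiliar with QIF, the following sketch computes the prior and posterior Bayes vulnerability of a channel, the basic quantities behind this kind of re-identification analysis. The numbers are toy values, not the paper's Topics API model.

```python
import numpy as np

def bayes_vulnerability(prior, channel):
    """Prior and posterior Bayes vulnerability of a channel C[x, y] = P(y|x):
    the standard QIF measures of a one-try guessing adversary's success."""
    prior = np.asarray(prior)
    C = np.asarray(channel)
    joint = prior[:, None] * C           # P(x, y)
    v_prior = prior.max()                # best guess with no observation
    v_post = joint.max(axis=0).sum()     # best guess for each observation y
    return v_prior, v_post

# Toy example (hypothetical numbers):
prior = [0.5, 0.3, 0.2]
channel = [[0.8, 0.2],
           [0.4, 0.6],
           [0.3, 0.7]]
v0, v1 = bayes_vulnerability(prior, channel)
print(v0, v1, v1 / v0)  # multiplicative leakage = posterior / prior
```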

We present the notion of reasonable utility for binary mechanisms, which applies to all utility functions in the literature. This notion induces a partial ordering on the performance of all binary differentially private (DP) mechanisms. DP mechanisms that are maximal elements of this ordering are optimal DP mechanisms for every reasonable utility. By viewing differential privacy as a randomized graph coloring, we characterize these optimal DP mechanisms in terms of their behavior on a certain subset of the boundary datasets, which we call a boundary hitting set. In the process of establishing our results, we also introduce a useful notion that generalizes DP conditions for binary-valued queries, which we coin suitable pairs. Suitable pairs abstract away the algebraic roles of epsilon and delta in the DP framework, making our proofs simpler to derive and understand. Additionally, the notion of a suitable pair can potentially capture privacy conditions in frameworks other than DP and may be of independent interest.
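
To make the binary setting concrete, the sketch below checks the standard (epsilon, delta)-DP constraints for a binary mechanism on one pair of neighboring datasets, the kind of algebraic condition that suitable pairs abstract away. The function name and the randomized-response example are illustrative, not taken from the paper.

```python
import math

def satisfies_binary_dp(p, q, eps, delta):
    """Check the (eps, delta)-DP constraints for a binary mechanism on a pair
    of neighboring datasets, where p = P[M(D) = 1] and q = P[M(D') = 1];
    both output values and both directions must be constrained."""
    e = math.exp(eps)
    return (p <= e * q + delta and q <= e * p + delta and
            (1 - p) <= e * (1 - q) + delta and (1 - q) <= e * (1 - p) + delta)

# Randomized response that reports truthfully with probability e^eps/(1+e^eps)
# satisfies pure eps-DP (delta = 0):
eps = 1.0
p = math.exp(eps) / (1 + math.exp(eps))
q = 1 / (1 + math.exp(eps))
assert satisfies_binary_dp(p, q, eps, 0.0)
```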

(slides)

The study of leakage measures for privacy has been a subject of intensive research and is an important aspect of understanding how privacy leaks occur in computer systems. Differential privacy has been a focal point in the privacy community for some years, and yet its leakage characteristics are not completely understood. In this paper we bring together two areas of research, information theory and the g-leakage framework of quantitative information flow (QIF), to give an operational interpretation for the epsilon parameter of local differential privacy. We find that epsilon emerges as a capacity measure in both frameworks: via (log)-lift, a popular measure in information theory, and via max-case g-leakage, which we introduce to describe the leakage of any system to Bayesian adversaries modelled using "worst-case" assumptions within the QIF framework. Our characterisation resolves an important question about the interpretability of epsilon and consolidates a number of disparate results from the literatures of both information theory and quantitative information flow.
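
As a small illustration of the capacity view, the following sketch computes the max (log-)lift of a channel under a fixed prior; for an epsilon-LDP channel this quantity is at most epsilon, as the randomized-response example suggests. This is a toy computation, not the paper's full capacity argument over all priors and gain functions.

```python
import numpy as np

def max_log_lift(prior, channel):
    """Max-case (log-)lift: max over x, y of log(P(x|y) / P(x)), equivalently
    log(P(y|x) / P(y)). For an eps-LDP channel this is bounded by eps."""
    prior = np.asarray(prior)
    C = np.asarray(channel)           # C[x, y] = P(y | x)
    p_y = prior @ C                   # marginal P(y)
    lift = C / p_y                    # P(y|x) / P(y)
    return np.log(lift.max())

# Binary randomized response with parameter eps stays below the eps bound:
eps = 1.0
e = np.exp(eps)
C = np.array([[e, 1], [1, e]]) / (1 + e)
print(max_log_lift([0.5, 0.5], C), "<=", eps)
```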

(slides)

Thursday afternoon (15/02)

The increasing interest in Differential Privacy (DP) necessitates tools that simplify its implementation, making it accessible to a wide range of users. These tools, which include libraries and frameworks, differ in functionality, security, usability, and performance, presenting a challenge in selecting the appropriate one for specific needs.

To assist practitioners in this selection process, we conducted a comparative analysis of leading DP tools, evaluating them both qualitatively and quantitatively. Our analysis focuses on features, scalability, and accuracy across common analytical queries.
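
For concreteness, a typical analytical query in such comparisons is a noisy count; the generic Laplace-mechanism sketch below (not code from any of the evaluated tools) shows the shape of such a query.

```python
import numpy as np

def dp_count(data, predicate, eps, rng=None):
    """Laplace-mechanism count, the kind of analytical query the compared
    DP tools all support: a count has sensitivity 1, so noise scale is 1/eps."""
    rng = rng or np.random.default_rng()
    true_count = sum(1 for row in data if predicate(row))
    return true_count + rng.laplace(0.0, 1.0 / eps)

print(dp_count(range(100), lambda x: x % 2 == 0, eps=0.5))
```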

This presentation shares insights from our analysis and from our experience developing an early version of a web interface for applying DP, complemented by a user study with government agencies to understand user expectations of this technology. Our findings are aimed at helping organizations overcome engineering challenges, identify future research directions, and encourage investment in DP.

(slides)

National Privacy and De-Identification in Clinical Imaging on the Australian Imaging Service: Current State and New Directions.

The Australian legal system is still living in the 1990s despite roughly 25 years of technological advances. It relies heavily on printed paper, with the NSW Local and District Courts furthest behind, requiring physical attendance and/or production of documents for valid judicial processes. The federal courts and some tribunals have implemented remote eLodgement portals; however, the way this data is kept and used behind the scenes raises public concern. External postal and electronic transactions containing sensitive evidence under non-publication orders are also maintained in a centralised database. This legal system is neither optimal nor fully private, and it also lacks integrity. How exactly can we make use of this valuable database while still preserving individual privacy? Drawing on a background in information security and mathematics, and a recent year of several New South Wales court appearances, I will share details of major Australian legal processes and possible privacy applications as of 2024.

(slides)