Implicit Class-Conditioned Domain Alignment
for Unsupervised Domain Adaptation

Xiang Jiang, Qicheng Lao, Stan Matwin, Mohammad Havaei

TLDR: We propose a simple sampling-based implicit alignment approach to tackle within-domain class imbalance and between-domain class distribution shift in unsupervised domain adaptation. It addresses the domain-discriminator shortcut while removing the need to explicitly optimize model parameters against pseudo-labels.

Abstract: We present an approach for unsupervised domain adaptation—with a strong focus on practical considerations of within-domain class imbalance and between-domain class distribution shift—from a class-conditioned domain alignment perspective. Current methods for class-conditioned domain alignment aim to explicitly minimize a loss function based on pseudo-label estimations of the target domain. However, these methods suffer from pseudo-label bias in the form of error accumulation. We propose a method that removes the need to explicitly optimize model parameters from pseudo-labels. Instead, we present a sampling-based implicit alignment approach, where the sample selection procedure is implicitly guided by the pseudo-labels. Theoretical analysis reveals the existence of a domain-discriminator shortcut in misaligned classes, which is addressed by the proposed implicit alignment approach to facilitate domain-adversarial learning. Empirical results and ablation studies confirm the effectiveness of the proposed approach, especially in the presence of within-domain class imbalance and between-domain class distribution shift.

Problem Setup

Domain Shift

Domain shift is an important challenge facing real-world deployments of machine learning models. It can be understood through a generative process in which the observation X is determined by a latent variable Y together with a domain variable D. The goal of domain adaptation is to obtain a predictive model p(y|x) that is independent of the domain variable D. In medical imaging, for example, a disease Y has different manifestations X depending on the scanner D, and we hope to learn an anti-causal model p(disease|image) that is robust with respect to different scanners D.
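As a purely illustrative sketch of this generative view, the toy simulation below draws observations from hypothetical class and domain effects (the numbers and the function sample_observation are ours, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_observation(y, d):
    """Toy generative process: the observation x depends on both the
    latent class y and the domain d (e.g., a scanner-specific offset)."""
    class_effect = np.array([0.0, 3.0])[y]    # contribution of the class y
    domain_effect = np.array([0.0, 1.5])[d]   # contribution of the domain d
    return class_effect + domain_effect + rng.normal(scale=0.5)

# The same condition (y=1) observed under two scanners (d=0 vs. d=1)
# yields systematically shifted observations; a robust p(y|x) must be
# invariant to this shift.
print(sample_observation(y=1, d=0), sample_observation(y=1, d=1))
```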


Unsupervised Domain Adaptation

Unsupervised domain adaptation is a type of transfer learning where:

  1. Domain shift gives rise to the source and target domains;

  2. The source domain is labeled while the target domain is unlabeled;

  3. Both domains share the same classification task, as sketched below.
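A minimal sketch of this data layout, with hypothetical file names and class indices:

```python
# Hypothetical UDA data layout: the label space is shared, but only the
# source domain carries labels.
num_classes = 10                                        # shared task
source = [("src_img_0.png", 3), ("src_img_1.png", 6)]   # (x, y) pairs
target = ["tgt_img_0.png", "tgt_img_1.png"]             # x only, no y
```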

Motivations

Applied motivation: prior probability shift

The applied motivation arises from prior probability shift between domains. For instance, the proportion of the population that develops a certain illness may vary across countries, and the assumption of identical class distributions can result in inaccurate models when adapted to the target domain.

Theoretical motivation: domain-discriminator shortcut

The domain discriminator aims to distinguish between the source and target domains. Misaligned samples, however, create a shortcut: the domain label can be determined directly from class labels that appear in only one domain (e.g., classes 3 and 6). The decision boundary of the resulting shortcut is independent of the covariate that causes the domain difference, so it does not contribute to adversarial domain-invariant learning. This shortcut interferes with adversarial domain adaptation because the model can bypass the optimization for domain-invariant representations and instead optimize a shortcut function that ignores the covariate responsible for the domain difference.
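To make the shortcut concrete, here is a toy numeric illustration (class indices, batch sizes, and the shortcut function are hypothetical): a "discriminator" that never looks at the input x still separates the domains well above chance whenever the class sets are misaligned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical misaligned mini-batch: classes {3, 6} appear only in the
# source half and classes {4, 5} only in the target half.
src_y = rng.choice([0, 1, 3, 6], size=32)  # source class labels
tgt_y = rng.choice([0, 1, 4, 5], size=32)  # target class labels

def shortcut(y):
    """Predict the domain from the class label alone, ignoring x."""
    if y in (3, 6):
        return "source"
    if y in (4, 5):
        return "target"
    return "source"  # no better than a guess on the overlapping classes

correct = sum(shortcut(y) == "source" for y in src_y) \
        + sum(shortcut(y) == "target" for y in tgt_y)
print(f"domain accuracy without using the covariate: {correct / 64:.2f}")
# Well above chance: the adversarial game can be "won" without learning
# anything about what actually distinguishes the domains.
```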

The Proposed Approach: Implicit Class-Conditioned Domain Alignment


We aim to align p_S(x) and p_T(x) jointly in the input and label spaces, using the factorization p(x,y) = p(x|y)p(y), while ensuring that the sampled classes are aligned between the two domains. The alignment distribution p(y) is pre-specified, e.g., a uniform distribution, so that samples are aligned in the shared label space despite the different empirical label distributions of the two domains.

This algorithm addresses class imbalance within each domain, as well as class distribution shift between domains, by specifying the sampling strategy p(y) in the label space. The pseudo-labels are used implicitly to construct class-aligned mini-batches in the input space. This also prevents the learning of the domain-discriminator shortcut by maximizing the label-space overlap between the source and target domains.
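A minimal sketch of such a sampler follows (the function and variable names are ours, not from the paper's code; sampling with replacement and a uniform p(y) over the shared classes are simplifying assumptions):

```python
import random
from collections import defaultdict

def group_by_label(indices, labels):
    """Group sample indices by their (pseudo-)label."""
    by_class = defaultdict(list)
    for i, y in zip(indices, labels):
        by_class[y].append(i)
    return by_class

def implicit_aligned_batch(src_by_class, tgt_by_class,
                           classes_per_batch=8, samples_per_class=4):
    """Draw a mini-batch whose source and target halves cover the same
    classes. Source samples are indexed by ground-truth labels; target
    samples by the model's current pseudo-labels, which only steer the
    sampler and never enter the loss directly."""
    # p(y): uniform over the classes currently present in both domains.
    shared = sorted(set(src_by_class) & set(tgt_by_class))
    classes = random.sample(shared, min(classes_per_batch, len(shared)))
    src_batch, tgt_batch = [], []
    for c in classes:
        src_batch += random.choices(src_by_class[c], k=samples_per_class)
        tgt_batch += random.choices(tgt_by_class[c], k=samples_per_class)
    return src_batch, tgt_batch
```

Because the pseudo-labels only decide which target samples enter the class-aligned mini-batch, re-estimating them as training proceeds does not feed label errors back through an explicit loss term.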

Conclusion

We introduce an approach for unsupervised domain adaptation—with a strong focus on practical considerations of within-domain class imbalance and between-domain class distribution shift—from a class-conditioned domain alignment perspective. We show theoretically that the proposed implicit alignment provides a more reliable measure of empirical domain divergence, which facilitates adversarial domain-invariant representation learning that would otherwise be hampered by class-misaligned domain divergence. We show that our approach leads to superior UDA performance under extreme within-domain class imbalance and between-domain class distribution shift, as well as competitive results on standard UDA tasks. We emphasize that the proposed method is robust to pseudo-label bias, is simple to implement, has a unified training objective, and does not require additional parameter tuning. We also show that the approach is orthogonal to the choice of domain adaptation algorithm and offers consistent improvements to both feature-based and classifier-based domain adaptation algorithms.

Future Work

Future work includes extensions to cost-sensitive learning for domain adaptation, as well as to other domain adaptation setups such as open-set domain adaptation and partial domain adaptation. More work on domain adaptation in the presence of within-domain class imbalance and between-domain class distribution shift is needed to facilitate safer use of machine learning models in the real world.