Cheems Wang*, Yiqin Lv*, Yixiu Mao*, Yun Qu, Yi Xu, Xiangyang Ji
Affiliations: Tsinghua University, Dalian University of Technology
Our purpose is to automatically generate an explicit task distribution that serves as an adversary for robustifying meta-learning methods 🙂.
Meta-learning is a practical learning paradigm for transferring skills across tasks from a few examples. Nevertheless, task distribution shifts tend to weaken meta-learners' generalization capability, particularly when the task distribution is naively hand-crafted or based on simple priors that fail to cover typical scenarios sufficiently. Here, we consider explicit generative modeling of task distributions placed over task identifiers and propose robustifying fast adaptation via adversarial training. Our approach, which can be interpreted as a model of a Stackelberg game, not only uncovers the task structure during problem-solving from an explicit generative model but also theoretically increases adaptation robustness in the worst case. This work has practical implications, particularly for handling task distribution shifts in meta-learning, and contributes theoretical insights to the field. Our method demonstrates robustness in the presence of task subpopulation shifts and improved performance over SOTA baselines in extensive experiments.
Meaning of task identifiers: They configure the task, such as the topic type in the corpus for large language models [1,2], the amplitude and phase of sinusoid functions, or the degrees of freedom (DoFs) of robotic manipulators [3,4].
Some Task Identifiers in Meta-Learning
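To make the notion concrete, here is a minimal Python sketch of the classic sinusoid regression family, where the task identifier is simply the (amplitude, phase) pair; the sampling ranges and function names are illustrative assumptions, not the exact setup used in our experiments.

```python
import numpy as np

def sample_task_identifier(rng):
    """A task identifier here is just (amplitude, phase); ranges are illustrative."""
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, np.pi)
    return amplitude, phase

def sample_task_data(identifier, num_shots, rng):
    """Materialize a few-shot regression task y = A * sin(x + b) from its identifier."""
    amplitude, phase = identifier
    x = rng.uniform(-5.0, 5.0, size=(num_shots, 1))
    y = amplitude * np.sin(x + phase)
    return x, y

rng = np.random.default_rng(0)
task_id = sample_task_identifier(rng)                      # e.g., (2.3, 1.1)
x_support, y_support = sample_task_data(task_id, num_shots=5, rng=rng)
```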
Task distribution matters: Many studies adopt hand-crafted or simple prior task distributions, such as uniform distributions [5]. The resulting naive task samplers might encounter catastrophic failures in real-world scenarios when distribution shifts occur in the task space. Hence, can we develop a strategy to automatically generate task distributions with meaningful structures 🤔?
Some practical scenarios: learning task distributions in risk-sensitive settings enables robust few-shot fast adaptation!
NVIDIA DRIVE SIM: Create Distribution over Rarely-Seen Traffic Scenes
What constitutes this work: Instead of uniform or manually designed task distributions, this paper considers an explicit task distribution that is captured along with the learning progress and then automatically creates task distribution shifts for the meta-learner to adapt to robustly.
Diagram of Adversarially Robust Meta-Learning.
Task distribution shift with constraints: We now translate robust adaptation under generative task distributions into the following constrained optimization problem:
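As a rough sketch (the notation here is assumed: $\theta$ the meta-parameters, $p_0$ the initial task distribution, $q$ the generated adversarial task distribution, $\mathcal{L}(\theta;\tau)$ the meta-loss on task $\tau$, and $\epsilon$ the shift budget, with the KL divergence as one example of the constraint), the objective reads:

$$
\min_{\theta} \max_{q} \; \mathbb{E}_{\tau \sim q}\big[\mathcal{L}(\theta; \tau)\big]
\quad \text{s.t.} \quad D_{\mathrm{KL}}\big(q \,\|\, p_0\big) \le \epsilon .
$$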
Equivalently, we can rewrite the above optimization objective in an unconstrained form with the help of a Lagrange multiplier:
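With a multiplier $\lambda \ge 0$, a sketch of the unconstrained counterpart under the same assumed notation is:

$$
\min_{\theta} \max_{q} \; \mathbb{E}_{\tau \sim q}\big[\mathcal{L}(\theta; \tau)\big] - \lambda \, D_{\mathrm{KL}}\big(q \,\|\, p_0\big).
$$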
Understanding optimization: The above can be further simplified as:
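As one illustrative simplification (not necessarily the exact form in the paper), if the inner maximization over $q$ is solved in closed form under the KL penalty via the Gibbs variational principle, the game collapses into a single minimization for the meta-learner, and the optimal adversary up-weights harder tasks exponentially:

$$
\min_{\theta} \; \lambda \log \mathbb{E}_{\tau \sim p_0}\Big[\exp\big(\mathcal{L}(\theta;\tau)/\lambda\big)\Big],
\qquad
q^{\star}(\tau) \propto p_0(\tau)\,\exp\big(\mathcal{L}(\theta;\tau)/\lambda\big).
$$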
Practical effects: The distribution adversary attempts to transform the initial task distribution into one that proposes challenging tasks with higher probability. Such a setup adaptively shifts the task sampling probabilities under constraints, which is crucial for generalization across risky scenarios. Technically, the optimization objective amounts to robust optimization of the meta-learner under a bounded deviation from the initial task distribution.
Best Response Approximation: The commonly used strategy to compute the equilibrium is the Best Response (BR), which means:
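Under the assumed notation above, the two players' best-response maps can be sketched as:

$$
q^{\mathrm{BR}}(\theta) \in \arg\max_{q} \; \mathbb{E}_{\tau\sim q}\big[\mathcal{L}(\theta;\tau)\big] - \lambda \, D_{\mathrm{KL}}\big(q \,\|\, p_0\big),
\qquad
\theta^{\mathrm{BR}}(q) \in \arg\min_{\theta} \; \mathbb{E}_{\tau\sim q}\big[\mathcal{L}(\theta;\tau)\big].
$$

Computing exact best responses at every round is expensive, which motivates the gradient-based approximation below.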
For implementation, we employ alternating gradient descent-ascent (GDA):
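Below is a minimal PyTorch-style sketch of alternating GDA between the meta-learner (parameters $\theta$) and the distribution adversary (generator parameters $\phi$); the names `meta_loss`, `sample_tasks`, and `kl_to_initial` are placeholders for the corresponding components, not our exact implementation.

```python
import torch

def alternating_gda(meta_learner, adversary, meta_loss, sample_tasks, kl_to_initial,
                    lam=1.0, lr_theta=1e-3, lr_phi=1e-3, steps=10000, batch_size=16):
    """Alternate one ascent step for the task-distribution adversary with one
    descent step for the meta-learner (placeholder components, illustrative only)."""
    opt_theta = torch.optim.Adam(meta_learner.parameters(), lr=lr_theta)
    opt_phi = torch.optim.Adam(adversary.parameters(), lr=lr_phi)

    for _ in range(steps):
        # Adversary ascent: raise the meta-loss while staying close to the initial distribution.
        tasks = sample_tasks(adversary, batch_size)          # reparameterized task identifiers
        adv_objective = meta_loss(meta_learner, tasks) - lam * kl_to_initial(adversary)
        opt_phi.zero_grad()
        (-adv_objective).backward()                          # ascent via negated descent
        opt_phi.step()

        # Meta-learner descent: fast adaptation on tasks drawn from the shifted distribution.
        tasks = sample_tasks(adversary, batch_size)
        loss = meta_loss(meta_learner, tasks)
        opt_theta.zero_grad()
        loss.backward()
        opt_theta.step()

    return meta_learner, adversary
```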
Built on typical risk minimization principles, the compared methods include vanilla MAML [5], DRO-MAML, TR-MAML [6], DR-MAML [7], and AR-MAML (ours).
In summary, this work:
develops a game-theoretical approach for generating explicit task distributions in an adversarial way;
contributes to theoretical understanding, e.g., of convergence and generalization, through sequential game theory;
improves adaptation robustness under constrained distribution shifts and discovers interpretable task structures across extensive scenarios.
Our approach can analytically compute the density/entropy of the generated task distribution, provides more insight into the optimization process, and effectively robustifies meta-learners 😝.
[1] Wang, Q., Feng, Y., Huang, J., Lv, Y., Xie, Z., and Gao, X. (2023). Large-scale generative simulation artificial intelligence: The next hotspot. The Innovation, 4(6).
[2] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901.
[3] Faverjon, B. and Tournassoud, P. (1987). A local based approach for path planning of manipulators with a high number of degrees of freedom. In Proceedings of the 1987 IEEE International Conference on Robotics and Automation, volume 4, pages 1152–1159. IEEE.
[4] Anne, T., Wilkinson, J., and Li, Z. (2021). Meta-learning for fast adaptive locomotion with uncertainties in environments and robot dynamics. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 4568–4575. IEEE.
[5] Finn, C., Abbeel, P., and Levine, S. (2017). Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning, pages 1126–1135. PMLR.
[6] Collins, L., Mokhtari, A., and Shakkottai, S. (2020). Task-robust model-agnostic meta-learning. Advances in Neural Information Processing Systems, 33:18860–18871.
[7] Wang, Q., Lv, Y., Feng, Y., Xie, Z., and Huang, J. (2023). A simple yet effective strategy to robustify the meta learning paradigm. Advances in Neural Information Processing Systems, 36.
We thank all members of the Tsinghua decision-making group and all readers who take an interest in this work 🌹. Thanks for reading this blog!