Workshop Overview

Note: the 2016 NIPS workshop on reliable machine learning can be found here.

How and when can we be confident that a system that performed well in the past will continue to do so in the future, in the presence of novel and potentially adversarial input distributions, and on a scale where human monitoring becomes difficult? Answering these questions is critical to guaranteeing the safety of emerging “high stakes” applications of AI, such as self-driving cars (Geiger et al., 2012) and automated surgical assistants (Taylor et al., 2008), as well as for reasoning reliably about large-scale machine learning systems (Sculley et al., 2015). This workshop explores approaches that are principled or can provide performance guarantees, ensuring AI systems are robust and beneficial in the long run (Russell et al., 2015). We will focus on three aspects — robustness, adaptation, and monitoring — that can aid us in designing and deploying reliable machine learning systems.

In addition to traditional machine learning work, we hope to import ideas from many fields adjacent to machine learning. For instance, robotics, especially in the context of autonomous vehicle control, has developed tools for ensuring robustly good behavior in novel situations, yielding reachability analysis (Lygeros et al., 1999; Mitchell et al., 2005) and H∞ control (Başar and Bernhard, 2008). Causal identification has traditionally been the purview of econometrics, but has recently made inroads into machine learning as well (Bottou et al., 2013; Wager and Athey, 2015; Athey and Imbens, 2015). Such cross-pollination of ideas has historically been extremely fruitful for machine learning, and we hope to continue in this tradition.


We gratefully acknowledge funding from the Future of Life Institute.


Organizers

Jacob Steinhardt, Stanford
Tom Dietterich, OSU
Percy Liang, Stanford
Andrew Critch, MIRI
Jessica Taylor, MIRI
Adrian Weller, Cambridge


References

S. Athey and G. Imbens. A measure of robustness to misspecification. The American Economic Review,
    105(5):476–480, 2015.
K. Balasubramanian, P. Donmez, and G. Lebanon. Unsupervised supervised learning II: Margin-based classification
    without labels. Journal of Machine Learning Research (JMLR), 12:3119–3145, 2011.
T. Başar and P. Bernhard. H-infinity optimal control and related minimax design problems: a dynamic game
    approach. Springer Science & Business Media, 2008.
M. Basseville. Detecting changes in signals and systems–A survey. Automatica, 24(3):309–326, 1988.
D. Bertsimas, D. B. Brown, and C. Caramanis. Theory and applications of robust optimization. SIAM review,
    53(3):464–501, 2011.
J. Blitzer, S. Kakade, and D. P. Foster. Domain adaptation with coupled subspaces. In Artificial Intelligence and
    Statistics (AISTATS), pages 173–181, 2011.
A. Blum, Y. Mansour, and J. Morgenstern. Learning valuation distributions from partial observation. arXiv, 2014.
L. Bottou. Two high stakes challenges in machine learning. Invited talk at the 32nd International Conference on
    Machine Learning, 2015.
L. Bottou, J. Peters, J. Quiñonero-Candela, D. X. Charles, D. M. Chickering, E. Portugaly, D. Ray, P. Simard, and E.
    Snelson. Counterfactual reasoning and learning systems: The example of computational advertising. Journal of
    Machine Learning Research (JMLR), 14:3207–3260, 2013.
Y. Chow, A. Tamar, S. Mannor, and M. Pavone. Risk-sensitive and robust decision-making: a CVaR optimization
    approach. In Advances in Neural Information Processing Systems (NIPS), pages 1522–1530, 2015.
S. Cook, C. Conrad, A. L. Fowlkes, and M. H. Mohebbi. Assessing Google flu trends performance in the United States
    during the 2009 influenza virus A (H1N1) pandemic. PLoS ONE, 6(8), 2011.
A. Daniely, A. Gonen, and S. Shalev-Shwartz. Strongly Adaptive Online Learning. In International Conference
    on Machine Learning (ICML), 2015.
J. Gama, P. Medas, G. Castillo, and P. Rodrigues. Learning with drift detection. In Advances in Artificial
    Intelligence, pages 286–295, 2004.
A. Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In
    Computer Vision and Pattern Recognition (CVPR), pages 3354–3361, 2012.
A. Globerson and S. Roweis. Nightmare at test time: robust learning by feature deletion. In International
    Conference on Machine Learning (ICML), pages 353–360, 2006.
I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv, 2014.
Y. Halpern, Y. Choi, S. Horng, and D. Sontag. Using anchors to estimate clinical state without labeled data. In
    American Medical Informatics Association Annual Symposium, pages 606–615, 2014.
A. Hans, D. Schneegaß, A. M. Schäfer, and S. Udluft. Safe exploration for reinforcement learning. ESANN, pages
    143–148, 2008.
M. Herbster and M. K. Warmuth. Tracking the best linear predictor. The Journal of Machine Learning Research,
    1:281–309, 2001.
A. Jaffe, B. Nadler, and Y. Kluger. Estimating the accuracies of multiple classifiers without labeled data. In Artificial
    Intelligence and Statistics (AISTATS), pages 407–415, 2015.
Y. Kawahara and M. Sugiyama. Change-point detection in time-series data by direct density-ratio estimation. SDM,
    9:389–400, 2009.
L. Li, M. L. Littman, T. J. Walsh, and A. L. Strehl. Knows what it knows: a framework for self-aware learning.
    Machine learning, 82(3):399–443, 2011.
S. Liu, M. Yamada, N. Collier, and M. Sugiyama. Change-point detection in time-series data by relative density-ratio
    estimation. Neural Networks, 43:72–83, 2013.
J. Lygeros, C. Tomlin, and S. Sastry. Controllers for reachability specifications for hybrid systems. Automatica,
    35(3):349–370, 1999.
S. Mei and X. Zhu. The security of latent Dirichlet allocation. In Artificial Intelligence and Statistics (AISTATS),
    2015a.
S. Mei and X. Zhu. Using machine teaching to identify optimal training-set attacks on machine learners. In
    Association for the Advancement of Artificial Intelligence (AAAI), 2015b.
I. M. Mitchell, A. M. Bayen, and C. J. Tomlin. A time-dependent Hamilton-Jacobi formulation of reachable sets for
    continuous dynamic games. IEEE Transactions on Automatic Control, 50(7):947–957, 2005.
T. M. Moldovan and P. Abbeel. Safe exploration in Markov decision processes. In International Conference on
    Machine Learning (ICML), pages 1711–1718, 2012.
W. K. Newey and D. McFadden. Large sample estimation and hypothesis testing. In Handbook of Econometrics,
    volume 4, pages 2111–2245. 1994.
N. Nisan, T. Roughgarden, E. Tardos, and V. V. Vazirani. Algorithmic game theory, volume 1. Cambridge
    University Press, 2007.
E. A. Platanios. Estimating accuracy from unlabeled data. Master’s thesis, Carnegie Mellon University, 2015.
E. A. Platanios, A. Blum, and T. M. Mitchell. Estimating accuracy from unlabeled data. In Uncertainty in Artificial
    Intelligence (UAI), 2014.
J. L. Powell. Estimation of semiparametric models. In Handbook of Econometrics, volume 4, pages 2443–2521.
    1994.
J. Quiñonero-Candela, M. Sugiyama, A. Schwaighofer, and N. D. Lawrence. Dataset shift in machine learning.
    The MIT Press, 2009.
R. Raina, A. Battle, H. Lee, B. Packer, and A. Y. Ng. Self-taught learning: transfer learning from unlabeled data. In
    International Conference on Machine Learning (ICML), pages 759–766, 2007.
S. Russell, D. Dewey, M. Tegmark, J. Kramar, and R. Mallah. Research priorities for robust and beneficial artificial
    intelligence. 2015.
D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J. Crespo, and D.
    Dennison. Hidden technical debt in machine learning systems. In Advances in Neural Information Processing
    Systems (NIPS), pages 2494–2502, 2015.
G. Shafer and V. Vovk. A tutorial on conformal prediction. Journal of Machine Learning Research (JMLR),
    9:371–421, 2008.
S. Shalev-Shwartz. Online learning and online convex optimization. Foundations and Trends in Machine Learning,
    4(2):107–194, 2011.
H. Shimodaira. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal
    of Statistical Planning and Inference, 90:227–244, 2000.
A. Swaminathan and T. Joachims. Counterfactual Risk Minimization: Learning from Logged Bandit Feedback. In
    International Conference on Machine Learning (ICML), 2015.
R. H. Taylor, A. Menciassi, G. Fichtinger, and P. Dario. Medical robotics and computer-integrated surgery. In
    Springer Handbook of Robotics, pages 1199–1222. 2008.
S. Temizer, M. J. Kochenderfer, L. P. Kaelbling, T. Lozano-Pérez, and J. K. Kuchar. Collision avoidance for
    unmanned aircraft using Markov decision processes. In AIAA Guidance, Navigation, and Control Conference,
S. Wager and S. Athey. Estimation and inference of heterogeneous treatment effects using random forests. arXiv,
    2015.
L. Xu, F. Hutter, H. H. Hoos, and K. Leyton-Brown. SATzilla: portfolio-based algorithm selection for SAT. Journal of
    Artificial Intelligence Research (JAIR), 32:565–606, 2008.