Friday Dec 9th, Room 113 

How can we build systems that will perform well in the presence of novel, even adversarial, inputs? What techniques will let us safely build and deploy autonomous systems on a scale where human monitoring becomes difficult or infeasible? Answering these questions is critical to guaranteeing the safety of emerging high-stakes applications of AI, such as self-driving cars and automated surgical assistants.

This workshop will bring together researchers in areas such as human-robot interaction, security, causal inference, and multi-agent systems in order to strengthen the field of reliability engineering for machine learning systems. We are interested in approaches that have the potential to provide assurances of reliability, especially as systems scale in autonomy and complexity.

We will focus on five aspects (robustness, awareness, adaptation, value learning, and monitoring) that can aid us in designing and deploying reliable machine learning systems. Some possible questions touching on each of these categories are given below, though we also welcome submissions that do not directly fit into these categories.
  • Robustness: How can we make a system robust to novel or potentially adversarial inputs? What are ways of handling model misspecification or corrupted training data? What can be done if the training data is potentially a function of system behavior or of other agents in the environment (e.g., when collecting data on users who respond to changes in the system and might also behave strategically)?
  • Awareness: How do we make a system aware of its environment and of its own limitations, so that it can recognize and signal when it is no longer able to make reliable predictions or decisions? Can it successfully identify “strange” inputs or situations and take appropriately conservative actions? How can it detect when changes in the environment have occurred that require re-training? How can it detect that its model might be misspecified or poorly calibrated?
  • Adaptation: How can machine learning systems detect and adapt to changes in their environment, especially large changes (e.g., low overlap between train and test distributions, poor initial model assumptions, or shifts in the underlying prediction function)? How should an autonomous agent act when confronting radically new contexts?
  • Value Learning: For systems with complex desiderata, how can we learn a value function that captures and balances all relevant considerations? How should a system act given uncertainty about its value function? Can we make sure that a system reflects the values of the humans who use it?
  • Monitoring: How can we monitor large-scale systems in order to judge if they are performing well? If things go wrong, what tools can help?

Invited speakers:


We gratefully acknowledge support from the Open Philanthropy Project, the Center for the Study of Existential Risk, and the Leverhulme Centre for the Future of Intelligence.


Percy Liang, Stanford

