Note: the 2016 NIPS workshop on reliable machine learning can be found here.

How and when can we be confident that a system that performed well in the past will continue to do so in the future, in the presence of novel and potentially adversarial input distributions, and on a scale where human monitoring becomes difficult? Answering these questions is critical to guaranteeing the safety of emerging “high stakes” applications of AI, such as self-driving cars (Geiger et al., 2012) and automated surgical assistants (Taylor et al., 2008), as well as for reasoning reliably about large-scale machine learning systems (Sculley et al., 2015). This workshop explores approaches that are principled or can provide performance guarantees, ensuring AI systems are robust and beneficial in the long run (Russell et al., 2015). We will focus on three aspects — robustness, adaptation, and monitoring — that can aid us in designing and deploying reliable machine learning systems. In addition to traditional machine learning work, we hope to import ideas from many fields adjacent to machine learning. For instance, robotics, especially in the context of autonomous vehicle control, has developed tools for ensuring robustly good behavior in novel situations, yielding reachability analysis (Lygeros et al., 1999; Mitchell et al., 2005) and H∞-control (Başar and Bernhard, 2008). Causal identification has traditionally been the purview of econometrics, but has recently made inroads into machine learning as well (Bottou et al., 2013; Wager and Athey, 2015; Athey and Imbens, 2015). Such cross-pollination of ideas has historically been extremely fruitful for machine learning, and we hope to continue in this tradition.
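One concrete instance of the adaptation theme is covariate-shift correction (Shimodaira, 2000): when the input distribution drifts between training and deployment but the labeling function stays fixed, training examples can be reweighted by the density ratio p_test(x)/p_train(x) before fitting. The sketch below illustrates the idea on synthetic one-dimensional data with known Gaussian train and test densities; all variable names and the toy setup are illustrative, not drawn from any cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Covariate shift: inputs drift from N(0, 1) at train time to N(1, 1) at
# test time, while the labeling function y = sin(x) + noise stays fixed.
x_train = rng.normal(0.0, 1.0, size=500)
y_train = np.sin(x_train) + rng.normal(0.0, 0.1, size=500)

def density_ratio(x, mu_train=0.0, mu_test=1.0, sigma=1.0):
    """Importance weight w(x) = p_test(x) / p_train(x) for known Gaussians."""
    log_w = ((x - mu_train) ** 2 - (x - mu_test) ** 2) / (2 * sigma ** 2)
    return np.exp(log_w)

w = density_ratio(x_train)

# Weighted vs. unweighted linear least squares on features [1, x].
X = np.stack([np.ones_like(x_train), x_train], axis=1)
beta_weighted = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y_train))
beta_plain = np.linalg.solve(X.T @ X, X.T @ y_train)

# The weighted fit approximates the best linear predictor under the test
# distribution (slope of sin near x = 1) rather than the training one.
print("weighted:", beta_weighted, "plain:", beta_plain)
```

In realistic settings the density ratio is unknown and must itself be estimated from samples, for example by relative density-ratio estimation (Liu et al., 2013); the known-Gaussian weights here only keep the sketch self-contained.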
Jacob Steinhardt, Stanford
Tom Dietterich, OSU
Percy Liang, Stanford
Andrew Critch, MIRI
Jessica Taylor, MIRI
Adrian Weller, Cambridge
S. Athey and G. Imbens. A measure of robustness to misspecification. The American Economic Review, 105(5):476–480, 2015.
K. Balasubramanian, P. Donmez, and G. Lebanon. Unsupervised supervised learning II: Margin-based classification without labels. Journal of Machine Learning Research (JMLR), 12:3119–3145, 2011.
T. Başar and P. Bernhard. H-infinity optimal control and related minimax design problems: a dynamic game approach. Springer Science & Business Media, 2008.
M. Basseville. Detecting changes in signals and systems–A survey. Automatica, 24(3):309–326, 1988.
D. Bertsimas, D. B. Brown, and C. Caramanis. Theory and applications of robust optimization. SIAM Review, 53(3):464–501, 2011.
J. Blitzer, S. Kakade, and D. P. Foster. Domain adaptation with coupled subspaces. In Artificial Intelligence and Statistics (AISTATS), pages 173–181, 2011.
A. Blum, Y. Mansour, and J. Morgenstern. Learning valuation distributions from partial observation. arXiv, 2014.
L. Bottou. Two high stakes challenges in machine learning. Invited talk at the 32nd International Conference on Machine Learning, 2015.
L. Bottou, J. Peters, J. Quiñonero-Candela, D. X. Charles, D. M. Chickering, E. Portugaly, D. Ray, P. Simard, and E. Snelson. Counterfactual reasoning and learning systems: The example of computational advertising. Journal of Machine Learning Research (JMLR), 14:3207–3260, 2013.
Y. Chow, A. Tamar, S. Mannor, and M. Pavone. Risk-sensitive and robust decision-making: a CVaR optimization approach. In Advances in Neural Information Processing Systems (NIPS), pages 1522–1530, 2015.
S. Cook, C. Conrad, A. L. Fowlkes, and M. H. Mohebbi. Assessing Google flu trends performance in the United States during the 2009 influenza virus A (H1N1) pandemic. PLoS ONE, 6(8), 2011.
A. Daniely, A. Gonen, and S. Shalev-Shwartz. Strongly adaptive online learning. In International Conference on Machine Learning (ICML), 2015.
J. Gama, P. Medas, G. Castillo, and P. Rodrigues. Learning with drift detection. In Advances in Artificial Intelligence, pages 286–295, 2004.
A. Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Computer Vision and Pattern Recognition (CVPR), pages 3354–3361, 2012.
A. Globerson and S. Roweis. Nightmare at test time: robust learning by feature deletion. In International Conference on Machine Learning (ICML), pages 353–360, 2006.
I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv, 2014.
Y. Halpern, Y. Choi, S. Horng, and D. Sontag. Using anchors to estimate clinical state without labeled data. In American Medical Informatics Association Annual Symposium, pages 606–615, 2014.
A. Hans, D. Schneegaß, A. M. Schäfer, and S. Udluft. Safe exploration for reinforcement learning. In ESANN, pages 143–148, 2008.
M. Herbster and M. K. Warmuth. Tracking the best linear predictor. Journal of Machine Learning Research (JMLR), 1:281–309, 2001.
A. Jaffe, B. Nadler, and Y. Kluger. Estimating the accuracies of multiple classifiers without labeled data. In Artificial Intelligence and Statistics (AISTATS), pages 407–415, 2015.
Y. Kawahara and M. Sugiyama. Change-point detection in time-series data by direct density-ratio estimation. In SDM, 9:389–400, 2009.
L. Li, M. L. Littman, T. J. Walsh, and A. L. Strehl. Knows what it knows: a framework for self-aware learning. Machine Learning, 82(3):399–443, 2011.
S. Liu, M. Yamada, N. Collier, and M. Sugiyama. Change-point detection in time-series data by relative density-ratio estimation. Neural Networks, 43:72–83, 2013.
J. Lygeros, C. Tomlin, and S. Sastry. Controllers for reachability specifications for hybrid systems. Automatica, 35(3):349–370, 1999.
S. Mei and X. Zhu. The security of latent Dirichlet allocation. In Artificial Intelligence and Statistics (AISTATS), 2015a.
S. Mei and X. Zhu. Using machine teaching to identify optimal training-set attacks on machine learners. In Association for the Advancement of Artificial Intelligence (AAAI), 2015b.
I. M. Mitchell, A. M. Bayen, and C. J. Tomlin. A time-dependent Hamilton-Jacobi formulation of reachable sets for continuous dynamic games. IEEE Transactions on Automatic Control, 50(7):947–957, 2005.
T. M. Moldovan and P. Abbeel. Safe exploration in Markov decision processes. In International Conference on Machine Learning (ICML), pages 1711–1718, 2012.
W. K. Newey and D. McFadden. Large sample estimation and hypothesis testing. In Handbook of Econometrics, volume 4, pages 2111–2245. 1994.
N. Nisan, T. Roughgarden, E. Tardos, and V. V. Vazirani. Algorithmic game theory, volume 1. Cambridge University Press, 2007.
E. A. Platanios. Estimating accuracy from unlabeled data. Master’s thesis, Carnegie Mellon University, 2015.
E. A. Platanios, A. Blum, and T. M. Mitchell. Estimating accuracy from unlabeled data. In Uncertainty in Artificial Intelligence (UAI), 2014.
J. L. Powell. Estimation of semiparametric models. In Handbook of Econometrics, volume 4, pages 2443–2521. 1994.
J. Quiñonero-Candela, M. Sugiyama, A. Schwaighofer, and N. D. Lawrence. Dataset shift in machine learning. The MIT Press, 2009.
R. Raina, A. Battle, H. Lee, B. Packer, and A. Y. Ng. Self-taught learning: transfer learning from unlabeled data. In International Conference on Machine Learning (ICML), pages 759–766, 2007.
S. Russell, D. Dewey, M. Tegmark, J. Kramar, and R. Mallah. Research priorities for robust and beneficial artificial intelligence, 2015.
D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J. Crespo, and D. Dennison. Hidden technical debt in machine learning systems. In Advances in Neural Information Processing Systems (NIPS), pages 2494–2502, 2015.
G. Shafer and V. Vovk. A tutorial on conformal prediction. Journal of Machine Learning Research (JMLR), 9:371–421, 2008.
S. Shalev-Shwartz. Online learning and online convex optimization. Foundations and Trends in Machine Learning, 4(2):107–194, 2011.
H. Shimodaira. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90:227–244, 2000.
A. Swaminathan and T. Joachims. Counterfactual risk minimization: learning from logged bandit feedback. In International Conference on Machine Learning (ICML), 2015.
R. H. Taylor, A. Menciassi, G. Fichtinger, and P. Dario. Medical robotics and computer-integrated surgery. In Springer Handbook of Robotics, pages 1199–1222. 2008.
S. Temizer, M. J. Kochenderfer, L. P. Kaelbling, T. Lozano-Pérez, and J. K. Kuchar. Collision avoidance for unmanned aircraft using Markov decision processes. In AIAA Guidance, Navigation, and Control Conference, 2010.
S. Wager and S. Athey. Estimation and inference of heterogeneous treatment effects using random forests. arXiv, 2015.
L. Xu, F. Hutter, H. H. Hoos, and K. Leyton-Brown. SATzilla: portfolio-based algorithm selection for SAT. Journal of Artificial Intelligence Research (JAIR), 32:565–606, 2008.