Hands-On: AutoGluon
Long Title: The AutoML Revolution: Leveraging Text, Images, and the Kitchen Sink to Solve Complex ML Problems in 1 Line of Code with AutoGluon
Abstract
AutoGluon is an open-source AutoML framework, developed by AWS, that shattered the state of the art in AutoML on its initial release in 2019 and continues to push the boundaries of what AutoML can achieve. With over two million PyPI downloads, AutoGluon takes inspiration from competition-winning solutions on sites like Kaggle and redefines AutoML by ensembling multiple models and stacking them in multiple layers. Among the most challenging ML problems are those whose datasets combine multiple modalities, such as text, images, and tabular data. Properly leveraging each modality requires extensive experience and complicated engineering effort. AutoGluon can train on multimodal image-text-tabular data with a single line of code, producing a powerful multi-layer stack ensemble of ResNet image models, BERT language models, and a suite of tabular models, all working in tandem. This talk will give an overview of AutoGluon, followed by a deep dive into how (and why) it has proven so effective, and will finish with code examples that demonstrate how you can revolutionize your ML workflow.
Code
Code will be provided that you can use to automatically train and explain models on your own datasets.
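As a rough preview of the "single line of code" workflow described in the abstract, the following is a minimal sketch rather than the talk's actual material. It assumes the `autogluon.tabular` package is installed and that your data lives in CSV files; the function name `fit_and_predict`, the file paths, and the label column are hypothetical placeholders, and the `best_quality` preset is just one possible setting.

```python
def fit_and_predict(train_csv, test_csv, label):
    """Train an AutoGluon predictor on one CSV file and predict on another.

    train_csv, test_csv, and label are hypothetical placeholders for your
    own data; nothing here is specific to the datasets used in the talk.
    """
    # Imported lazily so this snippet can be loaded even on a machine
    # where AutoGluon is not installed (assumption: `pip install autogluon`).
    from autogluon.tabular import TabularDataset, TabularPredictor

    # TabularDataset is a pandas DataFrame subclass that can load a CSV path.
    train_data = TabularDataset(train_csv)
    test_data = TabularDataset(test_csv)

    # The "one line": AutoGluon infers the problem type from the label
    # column, trains a set of models, and ensembles them.
    predictor = TabularPredictor(label=label).fit(train_data, presets="best_quality")

    # Predict on the test features, dropping the label column if it is present.
    features = test_data.drop(columns=[label], errors="ignore")
    return predictor, predictor.predict(features)
```

A call might look like `predictor, predictions = fit_and_predict("train.csv", "test.csv", label="class")`, after which `predictor.leaderboard()` lists the individual models and ensembles that were trained.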
Bio
Nick Erickson is a Senior Software Development Engineer at Amazon AI. He obtained his master's degree in Computer Science and Engineering from the University of Minnesota Twin Cities. He is the co-author and lead developer of the open-source AutoML framework AutoGluon. AutoGluon started in 2018 as Nick's personal competition ML toolkit; he continually expanded its capabilities and joined Amazon AI in 2019 to open-source the project and work full time on advancing the state of the art in AutoML.
Alex Smola is a VP/Distinguished Scientist at Amazon AI. He received a Ph.D. in Computer Science from the Berlin University of Technology, Germany. He has held faculty positions at the Australian National University, UC Berkeley, and Carnegie Mellon University and has worked at NICTA, Yahoo Research, and Google. He has published over 250 papers and 5 books, and his work has been cited more than 140k times. His research interests include deep learning, Bayesian nonparametrics, kernel methods, statistical modeling, and scalable algorithms.