Link to Zotero AutoML group
Next meeting
24 January 2019 (Thursday) at 17h (Paris time)
Room 2014, DIGITEO (Bat. 660) or via video call link
Check the calendar (Zimbra login required)
Presenter: Lisheng Sun-Hosoya
Cycle of presenters: Guillaume, Zhengying, Heri, Lisheng, Pierre
To-Read List
Link to this list in Zotero group. Everyone is welcome to suggest and add papers to read (send your Zotero username or email address to the organizers).
Past Meetings
Meeting of 15/11/2018
Presenter: Heri
Participants: #TODO: @Heri
Main idea of the article: #TODO: @Heri
Slides #TODO: @Heri
Meeting of 25/10/2018
Presenter: Zhengying
Participants: Loris, Heri, Marc, Michèle, Guillaume Charpiat
Main idea of the article: In a lifelong learning + concept drift + AutoML setting, the authors combine auto-sklearn, a drift detector and several model adaptation methods (e.g. completely retraining the model, updating weights, adding a model) and report some basic results for the AutoML3 challenge.
Meeting of 11/10/2018
Paper: Andrew Brock, Theodore Lim, J.M. Ritchie, Nick Weston "SMASH: One-Shot Model Architecture Search through HyperNetworks".
Presenter: Guillaume Doquet
Participants: Loris, Heri, Zhengying, Pierre, Guillaume Charpiat
Main idea of the article: #TODO: Guillaume
Slides (#TODO: Guillaume)
Meeting of 20/09/2018
Paper: Lorraine, Jonathan, and David Duvenaud. "Stochastic Hyperparameter Optimization through Hypernetworks." arXiv preprint arXiv:1802.09419 (2018).
Presenter: Pierre
Participants: Michèle, Marc, Guillaume Charpiat, Heri, Zhengying
Main idea of the article: #TODO: Pierre
Slides (#TODO: Pierre)
Meeting of 14/06/2018
Paper: Linnan Wang, Yiyang Zhao, Yuu Jinnai "AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search".
Presenter: Heri
Participants: Michèle, Marc, Zhengying, Pierre
Main idea of the article: Use Monte Carlo Tree Search to find an optimal neural network architecture. The proposed approach exploits the "block" design introduced in NAS (Neural Architecture Search). Another component that speeds up the search is the meta-DNN model, which predicts the performance of a given architecture.
Meeting of 07/06/2018 (PhD seminar)
Paper: Wolpert, David H., and William G. Macready. "No free lunch theorems for optimization." IEEE transactions on evolutionary computation 1.1 (1997): 67-82.
Paper: Wolpert, David H. "The lack of a priori distinctions between learning algorithms." Neural computation 8.7 (1996): 1341-1390.
Presenter: Zhengying
Participants: Guillaume C., Lisheng, Victor B., Olivier, Diviyan, Théophile, Victor E., Giancarlo
Main idea of the article: Any two (optimization) algorithms work equally well when their performance is averaged across all possible problems.
Meeting of 17/05/2018
Paper: Al-Shedivat, Maruan, et al. "Continuous adaptation via meta-learning in nonstationary and competitive environments." ICLR 2018.
Presenter: Lisheng
Participants: Zhengying, Heri, Guillaume D.
Main idea of the article: Learn (by optimizing an over-task loss) to continuously adapt a policy to nonstationary environments (modeled as competitive agents). The paper addresses purely RL problems.
Meeting of 03/05/2018
Paper: Li, Lisha, et al. "Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization". ICLR 2017.
# TODO: Aris
Presenter:
Participants:
Main idea of the article:
Slides
Meeting of 12/04/2018
Evolving the Topology of Large Scale Deep Neural Networks
Presenter: Guillaume DOQUET
Participants: Isabelle, Marc, Michèle, Laurent, Aris, Heri, Pierre, Guillaume, Zhengying
Main idea of the article: Use an evolutionary strategy to learn the structure of a deep neural network. The algorithm operates on 2 levels simultaneously: the macro scale (number and type of layers) and the layer scale (parameters of that layer). Results are presented on CIFAR10, CIFAR100, MNIST, and Fashion-MNIST, beating or rivaling state-of-the-art performance.
Meeting of 05/04/2018
Large-Scale Evolution of Image Classifiers
Learning Transferable Architectures for Scalable Image Recognition
Presenter: Pierre Wolinski
Participants: Zhengying, Guillaume Charpiat, Heri, ?
Main idea of the articles:
Meeting of 29/03/2018
Monte Carlo tree search for algorithm selection (MOSAIC)
Presenter: Heri
Participants: Zhengying, Guillaume (Charpiat, Doquet), Pierre, Aris
Main idea of the article: Tackle the hyperparameter optimization problem using Monte Carlo tree search: state (values of the hyperparameters already chosen), action (value of the next hyperparameter to choose), reward (CV-score). The algorithm is composed of two parts: a bandit part (designed for algorithm selection: random forest, SVM, ...) and MCTS (for preprocessing and algorithm configuration). This new idea produces good (first) results, but many improvements remain to be made.
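The bandit part mentioned in these notes (choosing among algorithms such as random forest or SVM, with a CV-score as reward) can be illustrated with a minimal sketch. It assumes the classic UCB1 rule, which these notes do not name and which may differ from what MOSAIC actually uses. Real cross-validation is replaced by a hypothetical `simulated_cv_score` function, and the arm names and mean scores in `MEAN_SCORE` are made up for illustration.

```python
import math
import random


def ucb1_select(counts, sums, c=math.sqrt(2)):
    """Return the index of the arm with the highest UCB1 score.

    counts[i] is the number of pulls of arm i, sums[i] its total reward.
    Arms that were never pulled are tried first.
    """
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    scores = [
        sums[i] / counts[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(arms, cv_score, budget, seed=0):
    """Spend `budget` evaluations over `arms`; return the pull count per arm."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    for _ in range(budget):
        i = ucb1_select(counts, sums)
        reward = cv_score(arms[i], rng)  # stand-in for a cross-validation score
        counts[i] += 1
        sums[i] += reward
    return dict(zip(arms, counts))


# Hypothetical mean CV scores, used only to simulate noisy evaluations.
MEAN_SCORE = {"random_forest": 0.9, "svm": 0.6, "knn": 0.4}


def simulated_cv_score(arm, rng):
    """Noisy score clipped to [0, 1] around the hypothetical mean of the arm."""
    return min(1.0, max(0.0, rng.gauss(MEAN_SCORE[arm], 0.05)))


pulls = run_bandit(list(MEAN_SCORE), simulated_cv_score, budget=300)
print(pulls)
```

Under these assumptions most of the budget goes to the arm with the highest simulated score, while the weaker arms still receive occasional evaluations through the exploration bonus.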
Meeting of 22/03/2018
Synthesis of AutoML Reading Group (21/11/2017-22/03/2018) (yes with color! and you can do that too!)
Presenter: Zhengying
Participants: Isabelle, the two Guillaumes, Heri, Pierre, Lisheng
Main idea of the article: Describe the AutoML problem as an optimization problem in a comprehensive and intuitive way, formulate many existing AutoML approaches in a uniform way, attach each approach to one step of the classic machine learning pipeline, and discuss future research ideas.
Meeting of 08/03/2018
Paper : Swersky, Kevin, Jasper Snoek, and Ryan Prescott Adams. "Freeze-thaw Bayesian optimization." arXiv preprint arXiv:1406.3896 (2014).
Presenter: Lisheng
Participants: Isabelle, Marc, Guillaume Charpiat, Heri, Pierre, Zhengying
Main idea of the article: A strategy for efficiently choosing hyperparameters: pause the training of models that are not promising. Training curves are modeled as samples from a Gaussian process.
Meeting of 01/03/2018
Paper : M. Feurer, A. Klein, K. Eggensperger, J.T. Springenberg, M. Blum, F. Hutter "Efficient and Robust Automated Machine Learning" (NIPS 2015)
Presenter: Guillaume Doquet
Participants: Pierre, Zhengying, Isabelle, Guillaume Charpiat
Main idea of the article: Combine meta-learning, tree-based Bayesian Optimization (SMAC) and ensemble method (ensemble selection) to tackle AutoML problems.
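The ensemble selection step mentioned in these notes can be sketched as a greedy forward selection with replacement over validation predictions. This is a simplified illustration, not auto-sklearn's implementation: it assumes binary classification, uses plain accuracy as the metric, and the model names and validation predictions are made up.

```python
def ensemble_accuracy(members, predictions, labels):
    """Accuracy of the averaged class-1 probability of the chosen members."""
    correct = 0
    for j, y in enumerate(labels):
        avg = sum(predictions[m][j] for m in members) / len(members)
        correct += int((avg >= 0.5) == bool(y))
    return correct / len(labels)


def ensemble_selection(predictions, labels, n_rounds):
    """Greedy forward selection with replacement.

    predictions maps a model name to its predicted P(y=1) on validation data.
    Returns the list of selected names; a repeated name acts as a weight.
    """
    selected = []
    for _ in range(n_rounds):
        # Add the model that gives the best accuracy when joined to the
        # current ensemble; ties go to the alphabetically first name.
        best = max(
            sorted(predictions),
            key=lambda m: ensemble_accuracy(selected + [m], predictions, labels),
        )
        selected.append(best)
    return selected


# Made-up validation labels and per-model predicted probabilities.
val_labels = [1, 0, 1, 1, 0, 0]
val_predictions = {
    "model_a": [0.9, 0.2, 0.4, 0.8, 0.3, 0.6],
    "model_b": [0.6, 0.4, 0.9, 0.3, 0.1, 0.2],
    "model_c": [0.2, 0.7, 0.6, 0.9, 0.6, 0.1],
}

chosen = ensemble_selection(val_predictions, val_labels, n_rounds=2)
print(chosen, ensemble_accuracy(chosen, val_predictions, val_labels))
```

In this toy data no single model is perfect on the validation set, but averaging the two selected models classifies every validation example correctly, which is the effect ensemble selection is meant to exploit.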
Meeting of 22/02/2018
Paper: Frank Hutter, Holger H. Hoos, Kevin Leyton-Brown "Sequential Model-Based Optimization for General Algorithm Configuration" (International Conference on Learning and Intelligent Optimization 2011)
Presenter: Pierre
Participants: Zhengying, Guillaume D, Guillaume C, Lisheng, Isabelle
Main idea of the article: Presentation of the hyperparameter search algorithm SMAC. SMAC is an SMBO-based algorithm that uses random forests to model performance as a function of the hyperparameters. It also covers the case where hyperparameters are tuned over sets of multiple instances. (Slides)
Questions & Remarks: "instance set" possibly refers to a split of a data set into a train set and a validation set
Meeting of 08/02/2018
Paper: R. Bardenet, M. Brendel, B. Kégl, M. Sebag "Collaborative hyperparameter tuning" ICML (2013).
Presenter: Heri
Participants: Zhengying, Isabelle, Aris, Pierre, Heri
Main idea of the article: By tuning hyperparameters collaboratively on multiple datasets, one can incorporate (expert) knowledge from similar tasks to improve Bayesian hyperparameter search. Hyperparameter ranking is used (instead of the validation score) to assess the quality of a hyperparameter configuration. (Slides)
Meeting of 01/02/2018
Paper: Liu, C., Zoph, B., Shlens, J., Hua, W., Li, L. J., Fei-Fei, L., & Murphy, K. "Progressive neural architecture search." arXiv preprint arXiv:1712.00559 (2017).
Presenter: Aris
Participants: Guillaume Charpiat, Guillaume Doquet, Heri, Lisheng, Zhengying, Aris
Main idea of the article: Learn an RNN that estimates the quality of a CNN sub-module ("cell") built from multiple blocks (each chosen from a fixed set of convolution and pooling operators). When expanding the cell structure, the RNN is used to prune the search space. (Slides)
Meeting of 25/01/2018
Paper : Max Jaderberg, Karen Simonyan, Andrew Zisserman and Koray Kavukcuoglu. "Spatial transformer networks." arXiv preprint arXiv:1506.02025v3 (2016).
Presenter : Lisheng
Participants : Michèle, Guillaume Charpiat, Aris, Heri, Guillaume Doquet
Main idea of the article : A spatial transformer network (STN), which learns an appropriate transformation of the input feature map, can be inserted into an existing architecture to make the task (e.g. classification) easier for later layers. This is possible mainly because the STN is differentiable. (Slides)
To-do:
Meeting of 18/01/2018
Paper : Koutník, Jan, Juergen Schmidhuber, and Faustino Gomez. "A frequency-domain encoding for neuroevolution." arXiv preprint arXiv:1212.6521 (2012).
Presenter : Zhengying
Participants : Michèle, Isabelle, Guillaume Charpiat, Lisheng, Aris, Heri, Guillaume Doquet
Main idea of the article : Solve the Octopus Arm problem by using a few Fourier coefficients (the chromosome) to compactly represent recurrent neural networks, and by using a Natural Evolution Strategy to select a promising prior distribution over neural networks. (Slides)
To-do:
Meeting of 11/01/2018
Paper : Ravid Shwartz-Ziv and Naftali Tishby, "Opening the Black Box of Deep Neural Networks via Information" (arXiv, March 2017)
Presenter : Guillaume Doquet
Participants : Guillaume Charpiat, Cyril, Zhengying, Heri, Guillaume Doquet
Main idea of the article : Deep neural networks go through 2 distinct phases during training. In the first phase, the mutual information between each hidden layer and the labels increases. In the second phase, the mutual information between the layers and the data decreases. In other words, a compressed latent representation of the data is found. This is a byproduct of the stochastic nature of gradient descent. (Slides)
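The quantity tracked in these two phases, the mutual information between a hidden layer and the labels (or the data), can be estimated from samples once activations are discretized. The sketch below is a minimal plug-in estimator under stated assumptions: activations lie in [-1, 1] (e.g. tanh units), they are binned uniformly, and the tiny activation table and labels are made up. The helper names `mutual_information` and `discretize` are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # Sum of p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def discretize(activations, n_bins, low=-1.0, high=1.0):
    """Map each activation vector to a tuple of uniform bin indices."""
    width = (high - low) / n_bins
    return [
        tuple(min(n_bins - 1, max(0, int((a - low) / width))) for a in vec)
        for vec in activations
    ]


# Made-up activations of a 2-unit hidden layer for 4 samples, with labels.
layer = [[-0.9, 0.1], [-0.8, 0.2], [0.7, 0.1], [0.9, 0.2]]
labels = [0, 0, 1, 1]

binned = discretize(layer, n_bins=2)
print(mutual_information(binned, labels))
```

Here the first unit separates the two classes perfectly after binning, so the estimate equals the full label entropy of 1 bit. Plug-in estimates of this kind depend strongly on the number of bins and samples.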
Meeting of 04/01/2018
Paper: Saxe, Andrew M., et al. "On Random Weights and Unsupervised Feature Learning" (ICML 2011).
Presenter: Pierre Wolinski
Participants: Michèle, Guillaume Doquet, Guillaume Charpiat, Zhengying, Heri, Aris
Main idea of the article: In some cases, untrained neural networks are almost as accurate as trained neural networks. By studying the Fourier transform of the convolution, the authors explain these results. Moreover, the article gives a heuristic for architecture selection. (Slides)
To-do:
Meeting of 21/12/2017
Paper: Domhan et al. "Speeding up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves" (IJCAI 2015).
Presenter: Heri
Participants: Michèle, Guillaume (Doquet, Charpiat), Isabelle, Pierre, Zhengying
Main idea of the article: Speed up hyperparameter search by predicting the final performance of the model. A set of parametric functions is used to extrapolate the learning curve. Runs that are unlikely to beat the best performance observed so far are stopped. (Slides)
To-do:
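The learning-curve extrapolation idea from the meeting of 21/12/2017 can be sketched in a much simplified form. The sketch assumes a single parametric family, y(x) = c - a * x^(-alpha), fitted by least squares with a grid over alpha, and a point-estimate stopping rule. The paper summarized here combines several parametric functions and reasons about how likely a run is to beat the best one, so this is only an illustration; the function names and the synthetic curve are made up.

```python
def fit_pow3(curve, alphas=None):
    """Fit y(x) = c - a * x**(-alpha) to a partial learning curve.

    curve[i] is the validation accuracy after epoch i + 1 (at least 2 points).
    For each alpha on a grid, c and a come from ordinary least squares; the
    alpha with the smallest squared error wins. Returns (c, a, alpha).
    """
    if alphas is None:
        alphas = [k / 10 for k in range(1, 31)]
    n = len(curve)
    y_mean = sum(curve) / n
    best = None
    for alpha in alphas:
        z = [x ** -alpha for x in range(1, n + 1)]
        z_mean = sum(z) / n
        szz = sum((zi - z_mean) ** 2 for zi in z)
        szy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, curve))
        slope = szy / szz  # equals -a in the model above
        c = y_mean - slope * z_mean
        sse = sum((c + slope * zi - yi) ** 2 for zi, yi in zip(z, curve))
        if best is None or sse < best[0]:
            best = (sse, c, -slope, alpha)
    return best[1:]


def predict(params, epoch):
    """Extrapolated accuracy at the given epoch."""
    c, a, alpha = params
    return c - a * epoch ** -alpha


def should_stop(curve, best_so_far, horizon):
    """Stop the run if its extrapolated final accuracy is below the best seen."""
    return predict(fit_pow3(curve), horizon) < best_so_far


# Synthetic 10-epoch curve that saturates at 0.8 (made-up values).
partial_curve = [0.8 - 0.5 * epoch ** -0.5 for epoch in range(1, 11)]
params = fit_pow3(partial_curve)
print(predict(params, 100), should_stop(partial_curve, 0.9, horizon=100))
```

With this rule a run is abandoned as soon as its extrapolated accuracy at the horizon falls below the best accuracy already obtained by another configuration.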
Meeting of 30/11/2017
Paper: Klein, Aaron, et al. "Fast bayesian optimization of machine learning hyperparameters on large datasets." arXiv preprint arXiv:1605.07079 (2016).
Presenter: Zhengying
Participants: Isabelle, Michèle, Marc, Lisheng, Guillaume (Doquet), Heri
Main idea of the article: Use Bayesian Optimization for hyperparameter selection, with faster training (and thus faster loss evaluation) on sampled subsets of the data. The next point to evaluate is chosen by maximizing the information gain per unit of computational cost about the distribution of the global minimum of the objective function (e.g. validation error w.r.t. the hyperparameters).
Slides: 10 pages, also containing a very brief introduction to Bayesian Optimization and Gaussian Processes, with a small exercise ;)
Remarks & questions:
To-do:
Meeting of 21/11/2017
Paper: Munoz, Mario A., et al. "Instance Spaces for Machine Learning Classification." Mach. Learn. (2017).
Presenter: Guillaume Doquet
Participants: Michèle, Lisheng, Heri, Zhengying
Main idea of the article: Extend the Algorithm Selection Problem framework suggested by Rice to understand how well different combinations of algorithms and datasets perform, or to objectively measure the performance of an algorithm, using 2-d visualisation in a so-called instance space. For each instance (a dataset and a classification problem), many features are computed and then selected. A performance measure is adopted, then an SVM is used for fitting.