Machine Learning Workshop

Workshop Materials

All tutorials were recorded, and the recordings are available to members of the American Astronomical Society (AAS). To watch them, log in to the AAS237 conference webpage. See this video by Luisa Rebull for guidance on how to find the recordings.

You can download all the Jupyter Notebooks and data used in this workshop at this link.

To run the tutorials below, you need to have Jupyter Notebook installed. You can get it easily using Anaconda.

You will also need to install the Python packages that the notebooks depend on. We created two requirements files (one for pip and one for conda) that should include all the packages needed to run the code.

Below, you can find a list of all tutorials and required packages (you can also install them individually using "pip install" or "conda install").
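
As a quick sanity check before opening the notebooks, the sketch below reports which modules cannot be found in the current environment. The module list is an assumption assembled from the "Required packages" lines on this page; it is not taken from the workshop's requirements files. Keep in mind that some import names differ from their install names (for example, cv2 is installed as opencv-python and sklearn as scikit-learn).

```python
import importlib.util

# Assumed list, collected from the "Required packages" lines on this page.
# Standard-library modules (os, sys, time, random, pickle, statistics) are
# omitted; pymvpa2 and mpl_toolkits are left out for brevity.
REQUIRED_MODULES = [
    "numpy", "scipy", "matplotlib", "pandas", "cv2", "sklearn", "astropy",
    "tensorflow", "keras", "tensorflow_probability", "xgboost", "seaborn",
    "h5py", "regions",
]


def find_missing(modules):
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Anything reported as missing can then be installed individually with pip or conda.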

Introductory Tutorials

Here is a list of introductory tutorials presented on day 1 of the workshop.

1. Introduction to Convolutional Neural Networks (CNN)

Lead: Asad Khan (University of Illinois, Urbana-Champaign)

Description: An introduction to convolutional neural networks, starting from a simple linear regression example. It covers filters and convolution as well as the structure of a standard convolutional neural network.

Tutorial file: tutorial_intro.ipynb

Required packages: numpy, matplotlib, cv2, scipy
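
To make the notion of a filter concrete, here is a minimal NumPy sketch of the sliding-window operation that a convolutional layer applies to an image. It is not taken from tutorial_intro.ipynb; the function name, the toy image, and the Sobel-like kernel are illustrative assumptions. Like most deep learning libraries, it computes a cross-correlation (the kernel is not flipped), with no padding and a stride of 1.

```python
import numpy as np


def apply_filter(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel with the current image patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Toy 6x6 image: dark left half (0), bright right half (1).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Sobel-like kernel that responds to horizontal changes in brightness.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])

feature_map = apply_filter(image, kernel)
print(feature_map)
```

The resulting 4x4 feature map is nonzero only in the columns where the window straddles the dark-to-bright edge. A CNN learns kernel values like these from data instead of having them specified by hand.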

2. Simple Example of a Convolutional Neural Network

Lead: Sinan Deger (Caltech/IPAC)

Description: A simple CNN is applied to galaxy images. The network is trained to recognize spheroids (compact) and disk galaxies.

Tutorial file: tutorial_1a.ipynb

Presentation slides: N/A

Required packages: os, time, sys, random, numpy, pandas, matplotlib, pickle, tensorflow, keras, sklearn, astropy

3. Introduction to Decision Trees and Random Forest

Lead: Sinan Deger (Caltech/IPAC)

Description: A simple random forest classifier is used to split a galaxy sample into disk galaxies and spheroids (compact). The input features include several morphological parameters such as Sersic index, half-light radius, and moments of light.

Tutorial file: tutorial_1b.ipynb

Presentation slides: N/A

Required packages: os, sys, time, pandas, numpy, matplotlib, sklearn
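
As a rough illustration of the approach, the sketch below trains a scikit-learn random forest on synthetic data. The two features (stand-ins for Sersic index and half-light radius), their distributions, and the class labels are made-up assumptions for demonstration only; they are not the workshop's galaxy catalog, and the code is not taken from tutorial_1b.ipynb.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_per_class = 500

# Synthetic, hypothetical features: [Sersic index, half-light radius].
# Disks are drawn around n ~ 1, spheroids around n ~ 4 (toy values).
disks = np.column_stack([rng.normal(1.0, 0.4, n_per_class),
                         rng.normal(5.0, 1.5, n_per_class)])
spheroids = np.column_stack([rng.normal(4.0, 0.8, n_per_class),
                             rng.normal(2.0, 0.8, n_per_class)])

X = np.vstack([disks, spheroids])
y = np.concatenate([np.zeros(n_per_class, dtype=int),   # 0 = disk
                    np.ones(n_per_class, dtype=int)])   # 1 = spheroid

# Hold out a quarter of the sample to estimate accuracy on unseen objects.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
print("Feature importances:", clf.feature_importances_)
```

The feature importances give a first indication of which morphological parameters the forest relies on most when separating the two classes.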

4. Introduction to Unsupervised Machine Learning Techniques

Lead: Andreas Faisst (Caltech/IPAC)

Description: This set of two tutorials uses the unsupervised machine learning algorithms t-SNE and SOM to classify galaxies into disk galaxies and spheroids (compact) based on their images.

Tutorial files: tutorial_1c_TSNE.ipynb, tutorial_1d_SOM.ipynb

Required packages: numpy, matplotlib, pickle, time, random, regions, astropy, sklearn, pymvpa2, mpl_toolkits

Note: If you have problems installing the pymvpa2 package, try installing SWIG before installing pymvpa2 via pip.

Other materials: A handy Embedding Projector for visualization of datasets (e.g., using t-SNE, UMAP, PCA, etc.).
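
For orientation, here is a minimal sketch of a t-SNE embedding with scikit-learn. The input is synthetic: two groups of random 64-element vectors standing in for flattened image cutouts. The data, group sizes, and parameter values are assumptions for illustration and are not taken from tutorial_1c_TSNE.ipynb.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened 8x8 image cutouts (64 pixels each):
# two groups of 100 samples with different mean pixel values.
group_a = rng.normal(0.0, 1.0, size=(100, 64))
group_b = rng.normal(3.0, 1.0, size=(100, 64))
X = np.vstack([group_a, group_b])

# Embed the 64-dimensional vectors into 2 dimensions.
# No labels are passed: t-SNE is unsupervised.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedding = tsne.fit_transform(X)

print(embedding.shape)
```

Because no labels are used, any grouping into classes happens afterwards, for example by inspecting a scatter plot of the two embedding coordinates.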

5. Visualization of Convolutional Neural Networks

Lead: Asad Khan (University of Illinois, Urbana-Champaign)

Description: This tutorial explains how the filters of a convolutional neural network can be visualized.

Tutorial file: tutorial_1e.ipynb

Required packages: os, time, sys, numpy, pandas, matplotlib, pickle, tensorflow, keras, sklearn, astropy

Other materials: A handy Embedding Projector for visualization of datasets (e.g., using t-SNE, UMAP, PCA, etc.).

More Advanced Tutorials

These are the more advanced tutorials presented during day 2 of the workshop.

6. Reduction of Spitzer/IRAC Data using Random Forests

Lead: Jessica Krick (Caltech/IPAC)

Description: This tutorial uses decision trees and boosted random forests to do data reduction on Spitzer IRAC exoplanet light curves.

Tutorial file: tutorial_2_IRAC.ipynb

Required packages: pandas, numpy, matplotlib, statistics, time, xgboost, scipy, sklearn, seaborn

7. Using a SOM for Visualization and Analysis of Big Data

Lead: Dan Masters (Caltech/IPAC)

Description: This tutorial shows how a SOM can be used to visualize high-dimensional data and help with their analysis. Note that the SOM presented here has been preconstructed, as the idea in this presentation is to understand how to use the output of a SOM to analyze the properties of a large dataset. 

Tutorial file: tutorial_3_SOM.ipynb

Required packages: numpy, astropy, matplotlib, pandas
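
The way a preconstructed SOM is typically used can be sketched as follows: each object is assigned to its best-matching cell (the cell whose weight vector is closest in feature space), and a property of interest is then averaged per cell. The grid size, the random weights, and the property array below are placeholder assumptions standing in for the tutorial's SOM and catalog; the code is not taken from tutorial_3_SOM.ipynb.

```python
import numpy as np

rng = np.random.default_rng(1)

# Placeholder for a pretrained SOM: a 10x10 grid of cells, each holding a
# weight vector in a 5-dimensional feature space (e.g., galaxy colors).
n_rows, n_cols, n_features = 10, 10, 5
som_weights = rng.normal(size=(n_rows, n_cols, n_features))

# Placeholder catalog: 1000 objects with the same 5 features, plus one
# hypothetical extra property to be analyzed across the map.
data = rng.normal(size=(1000, n_features))
prop = rng.uniform(0.0, 3.0, size=1000)

# Best-matching unit: the cell whose weight vector is nearest (Euclidean).
flat_weights = som_weights.reshape(-1, n_features)            # (100, 5)
dists = np.linalg.norm(data[:, None, :] - flat_weights[None, :, :], axis=2)
bmu_flat = np.argmin(dists, axis=1)                           # (1000,)
bmu_row, bmu_col = np.unravel_index(bmu_flat, (n_rows, n_cols))

# Occupation and mean property per cell (NaN where a cell is empty).
counts = np.bincount(bmu_flat, minlength=n_rows * n_cols)
sums = np.bincount(bmu_flat, weights=prop, minlength=n_rows * n_cols)
mean_prop = np.full(n_rows * n_cols, np.nan)
occupied = counts > 0
mean_prop[occupied] = sums[occupied] / counts[occupied]
mean_prop = mean_prop.reshape(n_rows, n_cols)

print("Occupied cells:", int(occupied.sum()), "of", n_rows * n_cols)
```

The mean_prop array can then be displayed with matplotlib's imshow to see how the property varies across the map.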

8. Introduction to Bayesian Neural Networks and Measurement of Black Hole Spins from Gravitational Waves

Leads: Hongyu Shen and William Wei (University of Illinois, Urbana-Champaign)

Description: This set of tutorials introduces Bayesian Neural Networks and applies them to measure the spin of black holes from gravitational waves.

Tutorial files: Tutorial_4/LinearRegression.ipynb, Tutorial_4/BNN_GW_Estimation.ipynb

Presentation slides: You can download them here.

Required packages: os, sys, numpy, matplotlib, tensorflow, tensorflow_probability, seaborn, scipy, h5py, pickle, random