Deep Learning Problems

Groups:

Group 1 (xena01-03) - Koeller, Jordan; Holloway, Taylor; Walker, Blair

Group 2 (xena04-06) - Burton, Craig; Ang, Sam; Whitten, Marcus

Group 3 (xena07-09) - Skogman, Brett; Yang, Mary; Usiri, Calvin

Group 4 (xena10-12) - Samoray, Nicholas; Viltoft, Jorgen; Burnett, Jesse

Group 5 (xena13-15) - Chang, Stephen; Croxton, John; Reyes, Miguel

Group 6 (xena16-18) - Andres, Robbie; Witecki, Ian; Newton, Michael

Group 7 (xena19-21) - Bomer, Dan; Fordin, Sarah; Herbert, Emily

Data Set:

This week's dataset is the Trinity admissions dataset we have used previously. You can find it in /data/BigData/admissions/. I've made two files specifically for this project: AdmissionAnon2.tsv and AdmissionAnon3.tsv.

In Class Questions:

1. Convert the example at https://github.com/deeplearning4j/dl4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/dataexamples/CSVExample.java to Scala and adapt it to read the file AdmissionAnon2.tsv. What accuracy do you get using the network configuration from that example?

Note that new ClassPathResource("iris.txt").getFile() should be replaced with a plain new File pointing at the appropriate file. Also, you will need all of the following in your build.sbt.

libraryDependencies += "org.nd4j" % "nd4j-native-platform" % "0.9.1"

libraryDependencies += "org.deeplearning4j" % "deeplearning4j-core" % "0.9.1"

libraryDependencies += "org.datavec" % "datavec-api" % "0.9.1"
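
The file-loading change noted above might look like the following in Scala. This is a minimal sketch, assuming the DL4J/DataVec 0.9.1 APIs used by the linked example. The object name AdmissionsData, the method loadAll, and its parameters are hypothetical, and the sketch assumes the file has no header row, every column is numeric, and the labels are integers from 0 to numClasses - 1.

```scala
import java.io.File

import org.datavec.api.records.reader.impl.csv.CSVRecordReader
import org.datavec.api.split.FileSplit
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator
import org.nd4j.linalg.dataset.DataSet

// Hypothetical helper, not part of the linked example.
object AdmissionsData {
  // Reads up to batchSize rows of a tab-separated file into one DataSet.
  // labelIndex is the zero-based column holding the class label.
  def loadAll(path: String, labelIndex: Int, numClasses: Int, batchSize: Int): DataSet = {
    // Skip 0 header lines (an assumption) and split on tabs instead of commas.
    val recordReader = new CSVRecordReader(0, "\t")
    // A plain java.io.File takes the place of ClassPathResource(...).getFile().
    recordReader.initialize(new FileSplit(new File(path)))
    val iterator = new RecordReaderDataSetIterator(recordReader, batchSize, labelIndex, numClasses)
    iterator.next()
  }
}
```

A call such as AdmissionsData.loadAll("/data/BigData/admissions/AdmissionAnon2.tsv", labelIndex, 5, batchSize) would then stand in for the iris loading code, with labelIndex set to the zero-based index of the last column and batchSize at least as large as the number of rows.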

Before you leave class, one member of your group needs to send me an email with your group answers to these questions and the code you wrote to solve them. Make sure the email also includes the names of all the group members who were present to work on this.

Between Class Questions:

All the code that you write to answer these questions should be put in a package called deeplearn in the in-class repository. You should also make a file called deeplearn.md in the top level of your repository that includes a write-up with your answers to the questions and any requested plots.

1. Find the best classification scheme you can for the last column using the DeepLearning4J library. There are two ways to interpret the classification, and I want you to do those independently. (See a and b below.) I want you to try a few different configurations for your neural network to see which works best for this data. Document the configurations you use and how well they do.

a. Using AdmissionAnon2.tsv with 5 label classes.

b. Using AdmissionAnon3.tsv with 2 label classes.
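
Before configuring a network for either file, it can help to confirm how many columns the file has and which values appear in the last column. The following is a rough sketch using only the Scala standard library. The names LabelCheck and summarize are hypothetical, and the sketch assumes tab-separated rows with the label in the final column.

```scala
import scala.io.Source

// Hypothetical helper for inspecting a TSV file before training.
object LabelCheck {
  // Returns (column count of the first row, number of rows per distinct last-column value).
  def summarize(path: String, skipLines: Int = 0): (Int, Map[String, Int]) = {
    val source = Source.fromFile(path)
    try {
      val rows = source.getLines()
        .drop(skipLines)             // skip header lines, if any
        .filter(_.trim.nonEmpty)     // ignore blank lines
        .map(_.split("\t", -1))      // keep trailing empty fields
        .toVector
      val numColumns = rows.headOption.map(_.length).getOrElse(0)
      val labelCounts = rows.groupBy(_.last).map { case (label, rs) => label -> rs.size }
      (numColumns, labelCounts)
    } finally source.close()
  }
}
```

If the files match the descriptions above, the label map should have 5 keys for AdmissionAnon2.tsv and 2 keys for AdmissionAnon3.tsv, and the zero-based label index would be the column count minus one. The counts also show how balanced the classes are, which is useful context when comparing the accuracy of different configurations.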