In the spring of 2017, I developed a course, MATH 370: Machine Learning and Neural Networks. Please see the syllabus for more detail. I wanted to teach a course closely related to my research so that I could share my experience with students and impart skills highly sought after by major companies such as Amazon, Google, and State Farm. I co-taught this course with Dr. Harold Hastings, who covered the Neural Networks lectures.
I had the following set of goals for this course:
Students will successfully understand the math behind machine learning methods.
Students will use the mathematics to motivate the programming.
Students will correctly identify which machine learning methods are necessary given a particular problem.
Students will present their ideas cogently.
All of these goals were met with relative success.
We assessed students' understanding of the mathematics with quizzes and homework. Please see quiz 2 and the homework on Principal Component Analysis as examples. Many students appreciated the homework because they felt it better prepared them for the programming assignments. Unfortunately, the quizzes were not met with the same favor.
We gave the students programming assignments. So that students could determine whether their code was successful, they were provided with data against which to test their results. See the assignment prompt for DBSCAN on git. We were aware that some students did not use the math to motivate the programming, as this was evident from quiz scores. However, the students during this term were well equipped with computer science knowledge, which allowed them to program successfully.
While we did not give a test that determined whether students could identify the machine learning techniques needed, we did have students work on projects for 3 to 4 weeks. For these projects, students chose a problem to work on and needed to identify which machine learning methods were required to complete the project successfully. Many students did so. Please see this example.
The students were required to present their project findings and were given a grade based on organization, delivery, and clarity.
At the end of the term, we gave a questionnaire in order to receive feedback from the students about the course. I provide examples of both compliments and critiques.
(+) "enjoyable class"
(-) "class and lectures could use more structure"
(-) "would benefit from more in-class code examples"
(-) "make linear algebra a pre-requisite"
(-) "cover backprop/multi layer in more detail"
(+) "the focus on projects was my favorite part of the class"
(+) "great course! I particularly enjoyed the final project component of the class"
(+) "I really liked Amanda's red slides, having them up on Moodle was very helpful."
(-) "Have more rigorous code review, like the people who turn in assignments look at each other's"
(+) "a little more work than a standard comp sci course, but was a fun experience nonetheless"
This feedback is very helpful for improving the next version of the course.
Many of the criticisms had to do with teaching programming. While I did provide pseudocode for the students to follow, as you can see on slide 11 of the gradient descent method slides, it seems this was not delivered in an effective manner.
Rather than explaining each step myself, which carries a higher risk of not being retained by students, I plan to do the following in the future: after explaining one step, I will have them explain in their own words what the line means and then discuss how they would translate it into code. Afterward, I will have them discuss their individual answers with each other.
Another critique of the course had to do with the pre-requisites. For this first semester, the pre-requisite was any one of Linear Algebra, Probability Theory, Discrete Mathematics, or Algorithms and Data Structures. Because of this, we had a high enrollment of students who had significant computer science experience but no more than Calculus II as a math background. These students struggled the most with understanding the material. For example, to understand gradient descent, you must know how to take partial derivatives and understand the concept of the gradient. While many students could catch on, it is understandable that some could not. This forced us to design our course to focus more on algorithms and algorithm development than on the mathematics behind these methods. We did not lower our standards, but we did have to change them.
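To make the gradient descent example concrete, here is a minimal sketch of the method on a two-variable function. The function f, the starting point, the step size, and all names are illustrative assumptions and are not taken from the course slides or assignments. The point is that the code cannot be written without first computing the two partial derivatives by hand.

```python
# Illustrative sketch only: f, the step size, and the starting point
# are assumptions chosen for this example, not course material.

def f(x, y):
    """A simple bowl-shaped function with its minimum at (1, -2)."""
    return (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2


def grad_f(x, y):
    """Gradient of f: the pair of partial derivatives (df/dx, df/dy)."""
    df_dx = 2.0 * (x - 1.0)  # partial derivative with respect to x
    df_dy = 4.0 * (y + 2.0)  # partial derivative with respect to y
    return df_dx, df_dy


def gradient_descent(x, y, step_size=0.1, num_steps=200):
    """Repeatedly step in the direction opposite to the gradient."""
    for _ in range(num_steps):
        df_dx, df_dy = grad_f(x, y)
        x -= step_size * df_dx
        y -= step_size * df_dy
    return x, y


x_min, y_min = gradient_descent(0.0, 0.0)
print(x_min, y_min)  # approaches the minimizer (1, -2)
```

A student who has only seen single-variable derivatives can follow the loop, but is likely to struggle with writing grad_f, which is where the multivariable background comes in.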
In the future, we will require either Linear Algebra or Probability Theory, together with Algorithms and Data Structures. The students who enrolled in this course with this background did significantly better on the quizzes, math homework, programming assignments, and class discussion.
If we had had this constraint to begin with, we might have completed our entire schedule. Because we did not, our syllabus was a bit too ambitious.