Data Analysis

Data gathering, manipulation, analysis, and visualization

Visit the official UMSI website for the most up-to-date information on client-based courses.

Information on the site you are currently visiting is no longer being updated as of Summer 2021.


SI 670: Applied Machine Learning

What is Applied Machine Learning?

In SI 670: Applied Machine Learning, graduate students learn basic machine learning concepts and methods, along with how to select and apply these methods correctly to solve supervised and unsupervised machine learning problems on real-world datasets. Students will master the basic concepts of supervised (classification) and unsupervised (clustering) techniques and learn to identify which technique to apply to a particular dataset and problem.

Deliverables

What do clients receive for participating in this course?

  • Midterm Assignment

    • Report of Findings

  • Final Project

    • Initial Proposal

    • Final Report

      • Students will analyze the dataset and deliver insights identified through supervised and unsupervised machine learning techniques. This may include clustering, dimensionality reduction, regression, classification, or deep learning methods.

    • Poster Presentation

    • Code Repository

Client Eligibility

Who can participate?

Potential clients should meet the following criteria:

  • Able to provide one or more datasets that might offer insights into interesting questions posed by their organization

  • Able to provide data that is not too small (100-1000 rows is likely too small)

  • Willing to allow students to identify trends, insights, and questions about the data

  • Able to answer student questions about the data or the intended use of the data or insights
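
For clients who want a quick check of the dataset-size criterion above before submitting, here is a minimal sketch in Python. It assumes the data is a CSV file with a header row. The `MIN_ROWS` value and both function names are hypothetical, chosen for illustration from the "100-1000 rows is likely too small" guidance. They are not an official cutoff used by the course.

```python
import csv
import io

# Assumption: treat 1000 rows as the lower bound, based on the
# "100-1000 rows is likely too small" guidance; not an official cutoff.
MIN_ROWS = 1000


def count_data_rows(csv_file):
    """Count data rows in an open CSV file, excluding the header row."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row (assumed to be present)
    return sum(1 for row in reader if row)  # ignore blank lines


def is_probably_large_enough(row_count, min_rows=MIN_ROWS):
    """Return True when the row count exceeds the assumed threshold."""
    return row_count > min_rows


# Usage with an in-memory stand-in for a real file
sample = io.StringIO("id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(1500)))
rows = count_data_rows(sample)
print(rows, is_probably_large_enough(rows))
```

For a real file, the same function could be called inside `with open("your_data.csv", newline="") as f:`, where the file name is a placeholder. Row count is only a rough signal; whether a dataset suits the course also depends on the questions being asked of it.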

Projects

What kinds of projects are appropriate for the course?

Potential projects should meet the following criteria:

  • Be data-centric and revolve around large-scale datasets, with students working on problems of data manipulation, analysis, and visualization

  • Involve significant technical work, with corresponding amounts of programming and/or data analysis scripting

  • Pose questions about predictive modeling or about gaining deeper insights into data through machine learning techniques

Desirable projects and datasets should lend themselves to analysis with at least two of the following methods:

  • Classification or regression analysis

  • Clustering analysis

  • Random Forest, SVM, and other machine learning techniques

  • Deep learning analysis, such as neural networks

What kinds of projects are NOT appropriate for the course?

Less desirable projects may include the following:

  • Work on mission-critical components or processes with critical dependencies on other projects

  • Projects that do not involve significant technical or programming work

Participate

How do I become a client?

Potential clients should complete this brief form with their contact information and a short summary of their project idea. Our Client Engagement Team will review your submission and reach out to you within 3 business days with next steps.

What if I don't have a project right now, but I'm interested in future opportunities or want to learn more?

If you don't have a specific project in mind for the upcoming semester, but would like to stay informed about future opportunities to work with students through our client-based courses or other programs, complete this registration form to be added to our mailing list.

Timeline

This is a Fall semester course that runs from September to December.

June - August

  • Client submits project idea

  • Client Engagement Team (CET) reviews project idea and requests full project proposal

  • CET works with client to scope and refine proposal

  • Client sends sample of dataset to CET and faculty

August - September

  • Faculty choose proposals to present to students

  • Students choose their project

  • Client sends full dataset to CET

October - December

  • Students explore data and finalize anticipated scope and deliverables