Data Analysis

Data gathering, manipulation, analysis, and visualization

Visit the official UMSI website for the most up-to-date information on client-based courses

Information on the site you are currently visiting is no longer being updated as of Summer 2021


SI 699: Big Data Analytics

What is Big Data?

Projects for SI 699: Big Data Analytics must be data-centric and revolve around large-scale datasets. Master’s-level students will work on problems of data manipulation, analysis, and visualization, and use enterprise-scale data to improve performance, outcomes, or understanding of a problem. Clients receive a final report with findings from the analysis and have the opportunity to interact with students specializing in data analytics, some of whom may be early in their job search.

Deliverables

What do clients receive for participating in this course?

  • A written report

  • Additional deliverable(s) determined collaboratively between the students and the client, which may include any of the following:

    • New data sets

    • Additions to existing data sets

    • Code repositories

    • System-level documentation to instruct clients on using scripts generated for the project

    • Other negotiated deliverables

Client Eligibility

Who can participate?

Potential clients should meet the following criteria:

  • Transfer their data to students and answer questions about it

  • Engage with students mid-semester and at the end of the term

Data sets should meet the following criteria:

  • Data must be ready to go and sent to students by the start of the course (week of January 18th, 2021)

  • Data must be portable (capable of being taken outside the organization), so that students can upload and analyze it on a system of their choice

  • Data for this course should be large, complex, and messy: large enough to require computing resources beyond what students can run on their laptops.

    • At least 1M rows and a dozen or more columns, or a minimum size of 3-4 GB

    • Messy datasets that require technical skills to clean and multi-modal (text, images, etc.) data are preferred but not required.
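
As a rough illustration of the size guideline above, the sketch below checks whether a dataset qualifies. It is only one reading of the criteria, not an official screening tool: it assumes the row and column thresholds apply together, that the size threshold is an alternative to them, that "3-4G" means gigabytes with 3 as the lower bound, and that "a dozen" means 12. The names DatasetShape and meets_size_guideline are hypothetical.

```python
from dataclasses import dataclass

# Assumed thresholds, taken from one reading of the size guideline.
MIN_ROWS = 1_000_000       # "1M+ rows"
MIN_COLUMNS = 12           # "dozen or more columns"
MIN_BYTES = 3 * 1024**3    # assumption: lower end of "3-4G", read as gigabytes


@dataclass
class DatasetShape:
    """Basic size facts about a candidate dataset (hypothetical helper type)."""
    rows: int
    columns: int
    size_bytes: int


def meets_size_guideline(shape: DatasetShape) -> bool:
    """Return True if the dataset is large by row/column count or by size on disk."""
    large_by_shape = shape.rows >= MIN_ROWS and shape.columns >= MIN_COLUMNS
    large_by_bytes = shape.size_bytes >= MIN_BYTES
    return large_by_shape or large_by_bytes
```

Under these assumptions, a table with 2 million rows and 15 columns qualifies even if it is only a few hundred megabytes, and so does a 4 GB image collection with very few rows.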

Projects

What kinds of projects are appropriate for the course?

Potential projects should meet the following criteria:

  • Include pointed questions that students have the potential to answer through analysis of the data.

  • Be data-centric and revolve around large-scale datasets that require distributed computing, with data manipulation, analysis, and visualization that cannot be performed on a personal computer

  • Involve significant technical work, with corresponding amounts of programming and/or data analysis scripting

  • The projects that work best for this course relate to the following tasks:

    • Classification/categorical

    • Unsupervised

    • Supervised

    • Prediction

    • Search (optimization on networks)

    • Recommendation (based on relevance level)

Desirable projects may also include the following:

  • Parsing, analyzing, and interpreting data for your organization or commercial enterprise. Types of data include web logs and email traces

  • Using enterprise-scale data to improve performance, outcomes, or understanding of a problem

What kinds of projects are NOT appropriate for the course?

Less desirable projects may include the following:

  • Work on mission-critical components or processes with critical dependencies on other projects

  • Projects that do not involve significant technical or programming work

How many projects are selected for this course?

  • Winter 2021: 30 projects projected

  • Winter 2020: 10 projects selected

* Because the number of enrolled students varies each year, these numbers are subject to change and should be treated only as a rough estimate.

Participate

How do I become a client?

Potential clients should complete this brief form with their contact information and a short summary of their project idea. Our Client Engagement Team will review your submission and reach out to you within 3 business days with next steps.

What if I don't have a project right now, but I'm interested in future opportunities or want to learn more?

If you don't have a specific project in mind for the upcoming semester, but would like to stay informed about future opportunities to work with students through our client-based courses or other programs, complete this registration form to be added to our mailing list.

Timeline

This is a Winter semester course that runs from January through April.

June - November

  • Client submits project idea

  • Client Engagement Team (CET) reviews project idea and requests full project proposal

  • CET works with client to scope and refine proposal

  • Faculty choose proposals to present to students

January

  • Students begin project

April

  • Students finish project and provide deliverable(s) to client