
Ultimate Frisbee

Learning objectives (and summaries)

Compare and rank individuals within one or more quantitative distributions.
  • Develop a method to approach large spreadsheets of real data and use it to intelligently make decisions
  • Use multiple regression in MS Excel to identify correlations between key player attributes
  • Use ranking and z-scores of individuals to compare them within a distribution
  • Develop ways to rank individuals by combining ranks and z-scores from multiple variables
  • Assess consistency of individuals using histograms
Assessment
    • Formula, written justification, and graphs for simulation (8pts)
      • Each team needs to submit one Excel spreadsheet with multiple regression data
      • Create a formula to rank all players. (1pt)
      • As a team, describe all parts of your formula and explain why your team thought this would help you find the best players. (3pts)
      • Once you have a good idea of who you want to draft and what stats are most important, choose at least 3 good players to compare.  Use histograms with the same bins to compare the center, shape, and spread of the distribution of a statistic you care about.  See instructions in the Ultimate section below. (3pts)
      • In the write-up, add a small section on what information your histograms helped you learn about a few of your top individuals. (1pt)
      • Each team should submit their spreadsheet, their written description, and their Excel histograms via email, with subject "statsproject", cc-ing all teammates.
    • Article discussion: http://fivethirtyeight.com/features/billion-dollar-billy-beane/
    • Pre-draft analysis article: http://www.royalsreview.com/2011/2/14/1992424/success-and-failure-rates-of-top-mlb-prospects
    • Typed reflection on the Moneyball process, drafting the best possible team, and future applications (5pts)
      • INDIVIDUALLY reflect on your experience performing "Moneyball" on the game of Ultimate.
        • Discuss how you used your intuition and understanding of the game and how you also used pure statistical analysis to best identify talent.
        • Talk about how you approached spreadsheets with over 50,000 cells in a way that left you informed for the draft.  You can draw on the days we played Ultimate, watched Moneyball, and worked as teams on the formula/graphs.
        • Reference the article discussion (Billion Dollar Beane) to explain why this process is so incredibly valuable in any industry where you are searching for talent.
        • For full credit, I expect at least two solid paragraphs (think somewhere around 300 words, and no fluffy intros or conclusions to take up space).  There is plenty to write about, and if you're stuck, ask me for help in class.
      • Each person should submit their reflection via email with subject "statshw".
    Ultimate Frisbee simulation
        Your team's goal is to draft the best possible Ultimate Frisbee team from a set of statistics on fictional players.  The draft is competitive: on draft day, once a team selects a player, that player is off the market for every other team, so use the past performance data to your advantage and find the most valuable players that others may not notice.  Standardized scores, summary statistics, spreadsheet formulas, and a touch of intuition will aid you in this task.  Work in teams of 2-3.

        Getting started:
        • Download and save the _player_data.csv spreadsheet.  Open it in Excel and convert it to a table (see video [1] below)
        • Calculate relevant statistics from the given stats, such as catching percentage or number of turnovers
        • Create a formula that uses z-scores or ranks to standardize values and combines them into a single ranking number (see video [4] below)
        • Perform multiple regression to identify statistics that are highly correlated with wins
        • Create two histograms to further compare a few individuals around a specific stat (see last video below)
        • See the "Assessment" section above for details on what you need to turn in
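The "relevant statistics" step can be sketched outside Excel as well. The following is a minimal Python illustration, not the course's method. The field names are guessed from the attached file names and may not match the columns in _player_data.csv, and the definitions used here (catching percentage as catches divided by times thrown at, throwing turnovers as throws minus completions) are assumptions.

```python
# Hypothetical season totals for one player. The keys are guesses based on
# the attached file names, not confirmed column headers.
player = {
    "short_catches": 40, "long_catches": 10,
    "short_thrown_at": 50, "long_thrown_at": 20,
    "short_throws": 60, "long_throws": 15,
    "short_completions": 54, "long_completions": 9,
}

def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator

# Assumed definition: catches divided by the number of times thrown at.
catches = player["short_catches"] + player["long_catches"]
thrown_at = player["short_thrown_at"] + player["long_thrown_at"]
catch_pct = ratio(catches, thrown_at)

# Assumed definition: every throw that was not completed counts as a turnover.
throws = player["short_throws"] + player["long_throws"]
completions = player["short_completions"] + player["long_completions"]
throwing_turnovers = throws - completions

print(f"catching percentage: {catch_pct:.3f}")
print(f"throwing turnovers:  {throwing_turnovers}")
```

In Excel the same idea is one formula per derived column, filled down the table.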
        [1] Getting started in Excel


        [2] Installing the data analysis pack for Excel (if needed):


        [3] Using multiple regression to identify key stats
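As a rough illustration of what multiple regression computes, here is a minimal pure-Python sketch of ordinary least squares. It is not the Excel workflow from the video. The stat names (`completions`, `turnovers`), the tiny data set, and the helper functions `solve` and `fit` are all made up for illustration.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of ones for the intercept
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: (completions, turnovers) per row and the matching wins.
# The wins were generated as 1 + 2*completions - 0.5*turnovers, so the fit
# should recover those coefficients.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
wins = [2.0, 4.5, 5.0, 7.5, 8.5]

intercept, b_completions, b_turnovers = fit(rows, wins)
print(f"wins ~ {intercept:.2f} + {b_completions:.2f}*completions + {b_turnovers:.2f}*turnovers")
```

A coefficient's sign and size suggest how a stat moves with wins when the other stats are held fixed. Real data will not fit this cleanly, so Excel's regression output (R-squared, p-values) is still needed to judge which stats matter.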


        [4] Combining multiple z-scores into a weighted "super-rank":
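The weighted "super-rank" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the formula from the video: the players, the two stats, and the weights are hypothetical, and the population standard deviation is used (a sample standard deviation would rescale each stat's z-scores and could shift a weighted ranking slightly).

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # no spread, so every z-score is 0
    return [(x - mu) / sigma for x in values]

# Hypothetical players and stats.
players = ["Ana", "Ben", "Cat", "Dev"]
stats = {
    "catch_pct": [0.90, 0.80, 0.70, 0.80],
    "turnovers": [4, 10, 6, 8],
}

# Hypothetical weights: catching counts double, and turnovers get a
# negative weight because fewer turnovers is better.
weights = {"catch_pct": 2.0, "turnovers": -1.0}

z = {name: z_scores(column) for name, column in stats.items()}
super_score = [
    sum(weights[name] * z[name][i] for name in stats)
    for i in range(len(players))
]

# Highest combined score first.
ranking = sorted(zip(players, super_score), key=lambda pair: pair[1], reverse=True)
for place, (name, score) in enumerate(ranking, start=1):
    print(f"{place}. {name}: {score:+.2f}")
```

Because z-scores put every stat on the same scale, the weights express how much your team values each stat rather than how large its raw numbers happen to be.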


        [5] Using single-stat Excel files with individual game data to produce histograms that measure player consistency:
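To show why shared bins matter when comparing consistency, here is a small Python sketch that counts two players' per-game values into the same bins and prints text histograms. The players, the per-game numbers, and the bin edges are hypothetical. The bin convention (left edge included, right edge excluded except in the last bin) is a choice made for this sketch and may differ from how Excel assigns values to bins.

```python
def bin_counts(values, edges):
    """Count values per bin [edges[i], edges[i+1]); the last bin also includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return counts

# Hypothetical per-game values of one stat for two players.
games = {
    "Ana": [5, 6, 5, 7, 6, 5, 6, 6],
    "Ben": [1, 10, 3, 9, 2, 11, 4, 8],
}

# The same edges are used for every player so the shapes are comparable.
edges = [0, 3, 6, 9, 12]

histograms = {name: bin_counts(values, edges) for name, values in games.items()}
for name, counts in histograms.items():
    print(name)
    for i, count in enumerate(counts):
        print(f"  {edges[i]:>2}-{edges[i + 1]:<2} {'#' * count}")
```

Both hypothetical players average about six per game, but Ana's values cluster in two adjacent bins while Ben's spread across all four, which is the kind of consistency difference the histograms are meant to expose.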


        Notes
            http://nyloncalculus.com/

            Attached data files (uploaded by Andy Pethan, Jan 19, 2015, 7:15 PM):
            • _player_data.csv (7k)
            • long_catches_against_data.csv (8k)
            • long_catches_data.csv (8k)
            • long_completions_data.csv (8k)
            • long_thrown_at_against_data.csv (8k)
            • long_thrown_at_data.csv (8k)
            • long_throws_data.csv (8k)
            • short_catches_against_data.csv (9k)
            • short_catches_data.csv (9k)
            • short_completions_data.csv (10k)
            • short_thrown_at_against_data.csv (9k)
            • short_thrown_at_data.csv (9k)
            • short_throws_data.csv (10k)
            • td_catches_data.csv (8k)