Here is a list of future projects that I plan on working on as I get more time.
Gravity/Magnetism
Quantifying gravity has been a goal of mine for a handful of years now. As soon as I heard about tracking data for the NFL, I wanted to see if this was possible. Gravity is a fairly easy thing to talk about, and has some applicability in other sports. If you are a very good player, you will get more attention from the opponent. For example, "Justin Jefferson's gravity pulled the coverage away, making that a much easier completion to Jordan Addison."
The approaches I have tried so far include Voronoi tessellations, as well as pure distance or angle calculations from one player to every other player. I have not been super pleased with any of my results, since there are many edge cases that skew things pretty heavily one way or another. Moving forward, I want a much more defined approach, and I want to filter the data down to avoid a lot of these edge cases.
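As a rough illustration of the pure-distance idea, here is a minimal sketch that counts how many defenders are closest to each offensive player in a single frame. The function name, the player labels, and the coordinates are all made up for illustration, and real tracking data would need far more filtering than this. Assigning each defender to the nearest offensive player is the same as asking which offensive player's Voronoi cell that defender is standing in, so it is a crude stand-in for both approaches.

```python
import math
from collections import Counter


def nearest_defender_counts(offense, defense):
    """Count how many defenders are closest to each offensive player.

    offense, defense: dicts mapping a player label to an (x, y) position in yards.
    Both the labels and the coordinate system are assumptions for this sketch.
    """
    counts = Counter({name: 0 for name in offense})
    for d_pos in defense.values():
        # The nearest offensive player "owns" this defender for the frame.
        nearest = min(offense, key=lambda name: math.dist(offense[name], d_pos))
        counts[nearest] += 1
    return dict(counts)


# One invented frame: two receivers, three defenders.
offense = {"WR1": (10.0, 40.0), "WR2": (10.0, 10.0)}
defense = {"CB1": (12.0, 41.0), "S1": (20.0, 35.0), "CB2": (12.0, 9.0)}
counts = nearest_defender_counts(offense, defense)
print(counts)
```

In this toy frame the safety shades toward WR1, so WR1 draws two defenders and WR2 draws one. The edge cases mentioned above (bunch formations, zone drops, defenders equidistant from two players) are exactly where a count like this starts to mislead.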
Once I have the basics down of how much gravity (or magnetism, or attention, whatever you want to call it) a player gets, the next step would be to define how valuable that gravity actually is. This is the "So what?" part of the problem: how much credit should we give a player for pulling attention away from someone else?
Styles Optimization
For positions with lots of the same kinds of players doing the same tasks, especially if they get to be the ones dictating the interaction (DL or WR), figuring out the best combination of players is a goal of mine. This would pair quite well with gravity, so that you can see if having a Superman (truly elite on their own) and anyone else is more valuable than a Batman & Robin (someone really good who pairs well specifically with their partner). Basically, the idea is to see if there is actually an advantage to be had in comparing the styles of the players entering a room. Is getting the best players, regardless of playstyle, better than optimizing a room with complementary skillsets?
Yards Based Value
One of my main issues with my analysis is that a lot of the foundation is based on PFF grading and my homebrew Wins Above Replacement values. I want to be able to get another approach to valuation that can be derived from raw statistics instead. Basically, convert each statistic back into a common value like yards or expected points and compare the different positions and players against each other to see where the true value lies.
For example, a cornerback gets dinged for any yards they allow in their coverage, but gets credit for the yards they would have allowed on a pass breakup. For turnovers, they get credit for the yards they saved on that play, along with the expected yards or EPA that the offense would have gained on any future plays of that drive.
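To make the cornerback accounting concrete, here is a minimal sketch of how the ledger might add up. Everything in it is an assumption for illustration: the field names, the result labels, and especially the expected-yards inputs, which would have to come from some separate model that is not defined here.

```python
def cornerback_yards_value(plays):
    """Net yards-based value for a cornerback over a list of targeted plays.

    Each play is a dict with a "result" key plus hypothetical model inputs:
      - "yards_allowed": actual yards given up on a completion
      - "expected_yards": yards the pass would have gained if completed
      - "expected_drive_yards": yards the offense was expected to gain on
        the rest of the drive (only used for interceptions)
    """
    total = 0.0
    for play in plays:
        if play["result"] == "completion":
            total -= play["yards_allowed"]  # dinged for yards allowed
        elif play["result"] == "breakup":
            total += play["expected_yards"]  # credit for yards prevented
        elif play["result"] == "interception":
            # Credit for this play plus the rest of the drive that never happened.
            total += play["expected_yards"] + play["expected_drive_yards"]
    return total


# Invented numbers, purely to show the arithmetic.
plays = [
    {"result": "completion", "yards_allowed": 12},
    {"result": "breakup", "expected_yards": 15},
    {"result": "interception", "expected_yards": 8, "expected_drive_yards": 20},
]
value = cornerback_yards_value(plays)
print(value)
```

With these made-up plays the corner nets -12 + 15 + 28 = 31 yards of value. The same ledger could be kept in expected points instead of yards by swapping the inputs.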
Additionally, I want to be able to give some credit for things being interconnected, with something like gravity. For example, Myles Garrett gets double-teamed or chipped by the offense. On those plays, his odds of getting a pressure or sack decrease, but in theory his teammates' odds increase. I want to credit that increase in odds, and the corresponding value of those outcomes, to Myles Garrett specifically.
Championship Equity
This is an attempt to put a value on the saying "Putting their team in position to win a championship." My current thought is to look at how betting odds change over different parts of the off-season. For example, the change in a team's odds across free agency or the draft would get attributed to the GM.
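As a rough sketch of the odds-change idea, the snippet below converts American odds to an implied probability and takes the difference between two snapshots. The function names and the example odds are made up, and the conversion does not remove the sportsbook's margin (vig), so treat the output as a directional signal and not a true probability.

```python
def implied_probability(american_odds):
    """Convert American odds (e.g. +2000 or -150) to an implied probability.

    The bookmaker's margin is NOT removed here, so probabilities across
    all teams will sum to more than 1.
    """
    if american_odds > 0:
        return 100 / (american_odds + 100)
    return -american_odds / (-american_odds + 100)


def odds_shift(before, after):
    """Change in implied title probability between two odds snapshots."""
    return implied_probability(after) - implied_probability(before)


# Hypothetical team: +2000 before free agency, +1200 after.
shift = odds_shift(before=2000, after=1200)
print(f"{shift:+.3f}")
```

A positive shift means the market thinks the team's title chances improved over that window. Separating the GM's moves from everything else that moved the line (injuries elsewhere, other teams' signings) is the harder part and is not addressed by this sketch.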
It could also be useful to compare MVP odds back to this metric to see how much a QB affects a team's championship odds.
Additionally, comparing a team's odds to the actual results could show how a coach or team differed from expectation.
NLP Prospect Analysis
Similar to what Ben Brown has done in the past with PFF and Sumer Sports (https://sumersports.com/the-zone/using-text-analytics-to-evaluate-the-2024-nfl-draft-quarterback-class/)
I want to be able to compare different positions instead of just QBs, as well as determine which elements of the process align most with value. For example, say "flexibility" or "smoothness" show up a lot in wide receiver scouting. Does that showing up help with your projection to the NFL? (My guess is yes.) I also want to be able to compare what the data shows against what the scouting report says.
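A very simple version of the "does this word showing up matter?" question can be sketched as below: split prospects by whether a term appears in their report and compare the average outcome of each group. The report text, the outcome numbers, and the function name are all invented for illustration, and a plain substring match is a deliberately naive stand-in for real text analytics.

```python
from statistics import mean


def term_outcome_gap(reports, term):
    """Mean outcome for reports mentioning `term` minus the mean for the rest.

    reports: list of (report_text, outcome_value) pairs, where outcome_value
    is whatever NFL value metric you trust (hypothetical here).
    Returns None if either group is empty.
    """
    term = term.lower()
    with_term = [value for text, value in reports if term in text.lower()]
    without_term = [value for text, value in reports if term not in text.lower()]
    if not with_term or not without_term:
        return None
    return mean(with_term) - mean(without_term)


# Made-up wide receiver reports and made-up outcome values.
reports = [
    ("Smooth route runner with rare flexibility at the top of routes.", 0.8),
    ("Stiff hips, wins with size and contested catches.", 0.3),
    ("Shows flexibility to sink and separate.", 0.6),
    ("Straight-line speed threat, limited route tree.", 0.4),
]
gap = term_outcome_gap(reports, "flexibility")
print(gap)
```

A positive gap would suggest the term lines up with better outcomes, though with real data you would want to control for draft position and sample size before reading anything into it.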
One of my main concerns with my "scouting" approach is that it is largely derived from the data. I may only get around to watching a handful of players outside of the first two rounds, and only certain positions are easy for me to find tape or highlights of. Being able to pair my data-driven approach with elements pulled from scouting reports could really unlock something. Scouting is known for evaluating the physical elements of traits and talent, which are "easier" to project than stats because stats are more dependent on the situation. Being able to combine the two would be very useful.
Coaching Staff Network Analysis
I am a big fan of network analysis and think one of the most applicable use cases would be coaching staff analysis. I was first inspired by a presentation I saw at the Carnegie Mellon University Sports Analytics Conference from Robert Binion and Mark Wood (https://www.fromtherumbleseat.com/2022/3/24/22989513/the-racial-imbalance-in-college-football-coaching).
Essentially I want to be able to create a network showing a coach's connections and efficiency in each of their stops, and be able to use that to predict which coaches are most likely to become an NFL head coach and how successful we expect them to become. Not sure that it will be super useful or predictive, but I think it would be cool regardless.
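A first pass at the connection part of that network could look like the sketch below, which links any two coaches who were on the same staff in the same season. The function name, the input format, and the placeholder coaches and teams are assumptions for illustration. It only covers connections; the efficiency at each stop would have to be attached to nodes or edges separately.

```python
from collections import defaultdict
from itertools import combinations


def build_staff_network(stints):
    """Build an undirected coaching network from staff stints.

    stints: iterable of (coach, team, season) tuples (hypothetical format).
    Returns a dict mapping each coach to the set of coaches they shared
    a staff with in at least one season.
    """
    staffs = defaultdict(set)
    network = defaultdict(set)
    for coach, team, season in stints:
        staffs[(team, season)].add(coach)
        network.setdefault(coach, set())  # keep coaches with no overlaps
    for members in staffs.values():
        for a, b in combinations(sorted(members), 2):
            network[a].add(b)
            network[b].add(a)
    return dict(network)


# Placeholder data, not real coaching histories.
stints = [
    ("Coach A", "Team X", 2020),
    ("Coach B", "Team X", 2020),
    ("Coach B", "Team Y", 2021),
    ("Coach C", "Team Y", 2021),
    ("Coach D", "Team Z", 2021),
]
network = build_staff_network(stints)
for coach, links in sorted(network.items()):
    print(coach, sorted(links))
```

From here, the number of connections per coach (or something fancier like centrality) could become a feature in a model predicting who becomes a head coach.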
Expected Contracts from Production
Another element of my analysis that isn't very well rounded is comparing value to the salary cap. I have done work to implement something similar to what Brad Spielberger did at PFF regarding two-year WAR percentiles (https://www.pff.com/news/nfl-introducing-pff-contract-projections) for my GM toolbox Shiny app, but that is simply projecting based on what the NFL currently does. Ideally, I would like to be able to derive valuations outside of what has historically been done and base them instead on what the data shows is the most valuable. (Essentially the same push as the newly created draft models instead of using the Jimmy Johnson trade chart.)
Another approach that I thought was very interesting came from another presentation at the CMUSAC, this time from Albert Cohen at Michigan State University. He was deriving expected contract values from production-based stats and how those changed the effective win percentage of the team. Once you have the effective change in win percentage for a player, you work out how much a win is worth in dollars and cents and pay them accordingly. Albert was coming from a very strong finance background and drew on previous academic work in that field for his analysis, particularly within baseball. The idea there is to look at how much money a team makes from fans showing up to games, estimate how much winning changes that bottom line, and use that difference to put a dollar value on a win.
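The arithmetic behind that last step is simple enough to sketch. This is my own minimal reading of the idea and not Albert's actual model: the function name, the 17-game default, and every number in the example are assumptions chosen only to show how the pieces multiply together.

```python
def expected_contract_value(win_pct_added, dollars_per_win, games_per_season=17):
    """Convert a player's added win percentage into a dollar figure.

    win_pct_added: change in team win percentage credited to the player
    dollars_per_win: how much one extra win is worth to the team (assumed input)
    games_per_season: used to turn win percentage into wins
    """
    wins_added = win_pct_added * games_per_season
    return wins_added * dollars_per_win


# Invented inputs: +2 points of win percentage, $5M per win.
value = expected_contract_value(win_pct_added=0.02, dollars_per_win=5_000_000)
print(f"${value:,.0f}")
```

With these made-up inputs the player adds 0.34 wins, which comes out to roughly $1.7M per season. The hard parts are the two inputs, not the multiplication.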
For the NFL, it would be a little different for a few reasons that I have encountered. Namely, QBs are simply too darn valuable; they wreck the curve for everyone else. Even if you exclude QBs, you still run into the problem of a salary-capped league with equally shared revenue splits for every team. (Yes, I know teams also make money on top of that which is not shared with the other teams.) Still, it is harder to determine the relative value in terms of butts in seats for a WR when you only play 17 games and each game is usually pretty well attended regardless of who is playing or how much you are winning. No doubt winning more helps, but given the outsized influence of QBs on winning, the marginal effect of a single WR on the bottom line simply isn't that much. Even so, I think there is some value in quantifying that in a vacuum, regardless of what teams have historically done.