An idea I had before the playoffs was to programmatically map out the coverages an opposing defense plays based on what the offense is doing. I figured a decision tree would be the perfect method, since this is effectively what coordinators are doing anyway: "Oh, it's 3rd down with 7 yards to go in the high red zone. They just checked into 12 personnel, and they likely run this kind of play from that situation. Let me check that portion of my play sheet and pick one of those calls."
The plot below is for the Detroit Lions defense from the 2023 season prior to the playoffs. You read a decision tree like a flowchart: start at the top and work your way down. The features included in this version of the model (not necessarily the best features, just the ones I wanted to look at) were down, yards to go, the global yardline, and the offensive formation.
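As a rough sketch of how a tree decides where to put each fork in the flowchart (this is not the code behind the plots), the snippet below scores candidate thresholds on one numeric feature by weighted Gini impurity and keeps the best one. The plays, the column names (`down`, `ydstogo`, `yardline`, `coverage`), and the helper names are all made up for illustration. A real tree repeats this search inside each branch and also needs a way to handle a categorical feature like offensive formation, which is left out here.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means every label is the same)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, feature, label="coverage"):
    """Return (threshold, weighted_impurity) for the best split on one numeric feature.

    Rows with a value below the threshold go left, the rest go right.
    The threshold is None if no split improves on the unsplit impurity.
    """
    values = sorted({row[feature] for row in rows})
    best = (None, gini([row[label] for row in rows]))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [row[label] for row in rows if row[feature] < threshold]
        right = [row[label] for row in rows if row[feature] >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical plays, invented for this example (not real play-by-play data).
plays = [
    {"down": 1, "ydstogo": 10, "yardline": 3, "coverage": "Cover 0"},
    {"down": 2, "ydstogo": 3, "yardline": 5, "coverage": "Cover 0"},
    {"down": 3, "ydstogo": 2, "yardline": 6, "coverage": "Cover 1"},
    {"down": 1, "ydstogo": 10, "yardline": 40, "coverage": "Cover 3"},
    {"down": 2, "ydstogo": 7, "yardline": 55, "coverage": "Cover 3"},
    {"down": 3, "ydstogo": 8, "yardline": 70, "coverage": "Cover 1"},
    {"down": 1, "ydstogo": 10, "yardline": 25, "coverage": "Cover 3"},
    {"down": 3, "ydstogo": 12, "yardline": 60, "coverage": "Cover 4"},
]

# Score every numeric feature and use the purest split as the top of the tree.
splits = {f: best_split(plays, f) for f in ("down", "ydstogo", "yardline")}
root_feature = min(splits, key=lambda f: splits[f][1])
print(root_feature, splits[root_feature])
```

On this toy data the top split lands on `yardline` at 5.5, which peels off the two Cover 0 snaps closest to the goal line.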
With no information, the most expected call was Cover 3 (I think this was true for pretty much every team I checked). The values within each bubble are the share of plays for each call, listed in the order of the legend. For example, the top-left bubble, labeled "Cover 0", shows 0, 0.91, 0.09, and then 0s for the rest. That means that among all plays the Detroit Lions defended under the opponent's 7-yard line (and for which the NFL play-by-play data had a coverage classification), 91% were Cover 0 and the remaining 9% were Cover 1. The percentage at the bottom of each bubble is the share of all snaps that fall into that bubble, so only 4% of the Lions' defensive snaps came when the opponent was really backed up behind the 7-yard line.
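To make the bubble contents concrete, here is a minimal sketch of how those three pieces (the predicted call, the per-call proportions in legend order, and the share of all snaps) could be computed for one bubble. The legend names and their order are my assumption, chosen so that Cover 0 sits in the second slot as in the example above, and the snap counts are invented to reproduce the 0.91 / 0.09 / 4% numbers.

```python
from collections import Counter

# Assumed legend order (hypothetical names; the source only says there are 8 calls).
LEGEND = ["2-Man", "Cover 0", "Cover 1", "Cover 2",
          "Cover 3", "Cover 4", "Cover 6", "Prevent"]


def node_summary(node_labels, total_snaps, legend=LEGEND):
    """Summarize one bubble: (predicted call, proportions in legend order, % of all snaps)."""
    counts = Counter(node_labels)
    n = len(node_labels)
    proportions = [round(counts[call] / n, 2) for call in legend]
    predicted = max(legend, key=lambda call: counts[call])  # most common call in the bubble
    share = round(100 * n / total_snaps)  # percentage shown at the bottom of the bubble
    return predicted, proportions, share


# Invented counts: 11 snaps reach this bubble out of 275 labeled snaps overall.
node = ["Cover 0"] * 10 + ["Cover 1"]
predicted, proportions, share = node_summary(node, total_snaps=275)

print(predicted)    # Cover 0
print(proportions)  # [0.0, 0.91, 0.09, 0.0, 0.0, 0.0, 0.0, 0.0]
print(share)        # 4
```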
For another example, the 2023 Baltimore Ravens defense is shown below. They had 22% of their total defensive snaps come when the opponent wasn't backed up behind their 23-yard line, it was 3rd or 4th down, and there were fewer than 12 yards to go. On those snaps they played Cover 1 most often, at 35% of the time, with Cover 3 their second most common call at about 19%.
In conclusion, I think this kind of analysis is interesting, but I ultimately didn't do much with it because I found the labels in the play-by-play data lacking. They almost never show a team playing Cover 2, for example, and the dataset has only 8 play calls. If I ever get access to better-labeled data, this might be worth picking up again. The same could be done for offense, especially if you had both sets of labels and could see what teams most often do against different kinds of calls from the opponent. I also wanted to get some kind of edge over hand scouting, but there is a balance to strike: add too much detail and the flowchart becomes too complicated to follow.
Detroit Lions 2023 Defense
Baltimore Ravens 2023 Defense