To start, I had to find the results from the 2023 USL season, which I manually entered into my own spreadsheets (which can be found at the bottom) from the league's website. I decided to record any match that was level after extra time and decided by penalties (such as the final) as a draw. I recognize that there is an element of skill to winning a penalty shootout, but decided that the result after 120 minutes would be the best indicator of team performance for my investigation. Next, I had to decide how I wanted to build this Elo system. I looked at Club Elo's methodology and modeled mine on theirs. However, I had to make some small changes because of the differences between club soccer in Europe's top five leagues and semi-professional soccer in the United States.
After importing and cleaning the data in R, I gave every team a base Elo rating of 1000. I then added the Elo equation, which gives the expected result of a match from the difference between the two teams' ratings. The equation is E = 1/(10^((away Elo - home Elo)/400) + 1), where E is the expected result for the home team on a scale from 0 to 1; the away team's expected result is 1 - E.
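The equation can be sketched in a few lines of Python. This is only an illustration, not the R code used for the analysis, and the function name `expected_home_score` is a placeholder chosen for this example.

```python
def expected_home_score(home_elo: float, away_elo: float) -> float:
    """Expected result of the home team: E = 1 / (10^((away - home) / 400) + 1)."""
    return 1.0 / (10 ** ((away_elo - home_elo) / 400) + 1.0)


# Two teams on the same rating are each expected to score 0.5.
even = expected_home_score(1000, 1000)

# A home team rated 400 points above its opponent is expected to score 10/11.
favorite = expected_home_score(1400, 1000)

# Swapping the ratings gives the complement, so the two expectations sum to 1.
underdog = expected_home_score(1000, 1400)

print(even, round(favorite, 3), round(underdog, 3))
```

Because only the rating difference enters the formula, a base rating of 1000 versus any other base makes no difference to the expectations.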
To determine the number of points exchanged between teams after a match, I used the formula 𝛥Elo = (Result - Expected) * k, where Result is 1 for a win, 0.5 for a draw, and 0 for a loss, and Expected is E from the equation above. The value of k is the weighting of the Elo system and must be chosen to suit the context. A higher k makes the ratings converge more quickly toward their true values, at the cost of more variation. A smaller k provides more stable values that take longer to converge. Club Elo uses k = 20. However, they have data over a much longer period and a much larger pool of teams. For my findings, I needed a higher k because of the limited historical data in the USL (the league was only founded in 2011) and the smaller number of teams (currently 24 in the league).
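To make the role of k concrete, here is a minimal Python sketch of the update step (again an illustration rather than the R code behind the tables). It assumes, as in standard Elo, that the away team loses exactly what the home team gains. The function names and the 1000 versus 1100 match are hypothetical.

```python
def expected_home_score(home_elo, away_elo):
    """Expected result of the home team on a 0 to 1 scale."""
    return 1.0 / (10 ** ((away_elo - home_elo) / 400) + 1.0)


def update_ratings(home_elo, away_elo, home_result, k):
    """Return the new (home, away) ratings after one match.

    home_result is 1 for a home win, 0.5 for a draw, 0 for a home loss.
    """
    delta = (home_result - expected_home_score(home_elo, away_elo)) * k
    # Assumption: points are exchanged, so the away team moves by -delta.
    return home_elo + delta, away_elo - delta


# Hypothetical match: a 1000-rated home side beats a 1100-rated visitor.
results = {k: update_ratings(1000, 1100, 1, k) for k in (20, 50, 100, 250)}
for k, (home, away) in results.items():
    print(f"k={k}: home {home:.1f}, away {away:.1f}")
```

The same upset moves the ratings 12.5 times further at k = 250 than at k = 20, which is the trade-off between fast convergence and extra variation.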
Below are three tables showing the end-of-season Elo for every USL team in 2023, with every team starting at 1000 and k set to 50, 100, and 250.
Unsurprisingly, Phoenix Rising's Elo went up with each increase of k: they performed very well toward the end of the season, beating the Elo's expectations, and a higher k value weights recent results more heavily.
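The recency effect can be illustrated with a toy simulation in Python. This is a sketch under simplifying assumptions, not the model behind the tables: a single hypothetical team starts at 1000, every opponent is held fixed at 1000, and the two result sequences are invented for the example.

```python
def expected_score(team_elo, opponent_elo):
    """Expected result of the team on a 0 to 1 scale."""
    return 1.0 / (10 ** ((opponent_elo - team_elo) / 400) + 1.0)


def final_rating(results, k, start=1000.0, opponent_elo=1000.0):
    """Rate one team through a sequence of results (1 win, 0.5 draw, 0 loss).

    Simplification: every opponent is held at the same fixed rating.
    """
    rating = start
    for result in results:
        rating += (result - expected_score(rating, opponent_elo)) * k
    return rating


late_surge = [0] * 5 + [1] * 5   # five losses, then five wins
early_surge = [1] * 5 + [0] * 5  # the same record in reverse order

gaps = {}
for k in (50, 100, 250):
    gaps[k] = final_rating(late_surge, k) - final_rating(early_surge, k)
    print(f"k={k}: late surge finishes {gaps[k]:.0f} points ahead")
```

Both sequences have the same five wins and five losses, yet the team that wins late finishes higher, and the gap widens as k grows.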
I then decided to give the system some historical context. I looked at the end-of-season standings from the 2022 USL season and set each team's starting Elo according to its finishing position (ranked first by points, with goal difference as a tiebreaker), from 1600 to 580 in increments of 40. A few teams that played in the 2022 season did not continue into the 2023 season; they were ignored.
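The seeding step might look something like the following Python sketch. The four-team table is invented purely for illustration (it is not 2022 USL data), the constant names are placeholders, and the handling of teams that left or joined the league is left out.

```python
# Hypothetical standings: (team, points, goal difference). Not real data.
standings_2022 = [
    ("Team A", 60, 15),
    ("Team B", 72, 30),
    ("Team C", 60, 22),
    ("Team D", 41, -9),
]

TOP_RATING = 1600  # starting rating for the best-placed team
STEP = 40          # rating drop for each place further down the table

# Rank by points, then by goal difference as the tiebreaker (both descending).
ranked = sorted(standings_2022, key=lambda row: (row[1], row[2]), reverse=True)

# First place gets TOP_RATING, second place TOP_RATING - STEP, and so on.
starting_elo = {
    team: TOP_RATING - STEP * place
    for place, (team, _points, _goal_diff) in enumerate(ranked)
}
print(starting_elo)
```

Here Team C is placed above Team A because the two are level on points and Team C has the better goal difference.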
Below are three tables showing the end-of-season Elo for every USL team in 2023, with each team's starting Elo based on its previous season's results and k set to 50, 100, and 250.
As expected, the lower k values show larger differences between the even-start ratings and the past-performance-based ratings: a low k moves ratings slowly, so the starting values still matter at the end of the season, which suits a model meant to provide stable values over multiple seasons. The higher k values show less variation between the even starting Elo and the past-performance-based starting Elo.
I was also interested in seeing just how much of an underdog Phoenix Rising was on their way to USL championship glory. The following results are from the model that factored in the previous season's performance and used a k value of 100, which I considered a fair middle ground between variance and stability. In all four of their playoff matches, they had the lower Elo, meaning the model expected them to lose each game, yet they managed to pull off one of the greatest underdog runs in USL history.