As discussed on the "RPI Formula" page, the NCAA generally computes the RPI for Division I women’s soccer without regard to game locations. The one exception is the set of adjustments the NCAA makes to the unadjusted RPI in order to produce the Adjusted RPI. The adjustments award bonuses for good wins and ties and impose penalties for poor ties and losses, with the bonus and penalty amounts depending in part on whether the team receiving the adjustment played the game at home, at a neutral site, or away.
Some critics have argued that the basic RPI formula should make a distinction between home, away, and neutral site games, in recognition that there is a home field advantage. The critics have asserted, in particular, that the top teams and teams from the strong conferences have leverage that enables them to play more non-conference games at home than away and that this unfairly skews the RPI in their favor.
For Division I men's basketball, the RPI formula makes a game site distinction: in its Element 1 (Team's winning percentage) computation, it treats a home win as 0.6 of a win and an away win as 1.4 wins, and an away loss as 0.6 of a loss and a home loss as 1.4 losses. In baseball, there is a similar distinction except that the weights are 0.7 and 1.3, since home field advantage is statistically smaller in baseball than in basketball. The question is whether Division I women's soccer should convert to a comparable system. This would mean incorporating game location data into the basic RPI formula. As indicated on the "Getting the Correct Data" page, however, game location is the category of data in which the NCAA is most prone to make errors.
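To make the basketball-style weighting concrete, here is a minimal Python sketch of a location-weighted winning percentage. The 0.6 and 1.4 weights come from the description above. The neutral-site weight of 1.0, the function name, and the sample schedules are assumptions for illustration, and the sketch ignores ties, which basketball does not have but soccer does.

```python
# Location weights for wins and losses, as described for men's basketball.
# ASSUMPTION: neutral-site games count as 1.0; the text does not state this.
WIN_WEIGHTS = {"home": 0.6, "away": 1.4, "neutral": 1.0}
LOSS_WEIGHTS = {"home": 1.4, "away": 0.6, "neutral": 1.0}


def weighted_winning_percentage(games, win_weights=WIN_WEIGHTS,
                                loss_weights=LOSS_WEIGHTS):
    """Return weighted wins / (weighted wins + weighted losses).

    games: iterable of (result, site) pairs, where result is "W" or "L"
    and site is "home", "away", or "neutral". Ties are not handled.
    """
    games = list(games)
    wins = sum(win_weights[site] for result, site in games if result == "W")
    losses = sum(loss_weights[site] for result, site in games if result == "L")
    total = wins + losses
    return wins / total if total else 0.0


# Hypothetical schedules: both teams are 3-1, so both are 0.750 unweighted.
home_heavy = [("W", "home")] * 3 + [("L", "home")]
road_heavy = [("W", "away")] * 3 + [("L", "away")]

print(round(weighted_winning_percentage(home_heavy), 4))  # about 0.56
print(round(weighted_winning_percentage(road_heavy), 4))  # about 0.88
```

Under these weights, a 3-1 record earned entirely at home is worth considerably less than the same record earned entirely on the road, which is the effect the critics want the formula to capture.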
In order to test whether teams play balanced home/away schedules, I looked at conferences and regions over the 2007 to 2011 seasons to determine whether they have game location imbalances. I found that they do. I also found a pattern to the imbalances: Conferences and regions with higher average Adjusted RPIs tend to have favorable home game imbalances and conferences and regions with lower average Adjusted RPIs tend to have unfavorable home game imbalances. (The same is true for the Non-Conference RPI.) This is not true in all cases, but it is true on average.
The following table shows the relationship, for the 2007 through 2011 seasons, between conferences’ average NCAA ARPIs and conferences’ home game imbalances. It demonstrates that conferences with higher average NCAA ARPIs, on average, have favorable home game imbalances and weaker conferences have unfavorable imbalances:
The following table shows the relationship, for the 2007 through 2011 seasons, between regional playing pools’ average NCAA ARPIs and their home game imbalances. It demonstrates that regional pools with higher average NCAA ARPIs tend to have favorable home game imbalances and weaker regional pools tend to have unfavorable imbalances:
In summary, there are home/away imbalances. Further, stronger conferences and regions tend to have favorable home game imbalances and weaker conferences and regions tend to have unfavorable home game imbalances.
HOME FIELD ADVANTAGE
IS THERE A HOME FIELD ADVANTAGE IN DIVISION I WOMEN'S SOCCER?
Granted that there are home/away imbalances, the next question is whether there is a home field advantage and, if so, its extent. According to Massey, who produces ratings for Division I women's soccer and also produces college football ratings used in the BCS bowl selection process, home field advantage in Division I women's soccer is worth 0.35 goals per game. That is not helpful, however, for RPI computation purposes.
In order to determine the extent of home field advantage in relation to RPI ratings, I used my Correlator and performance percentage method of analysis applied to data for the five seasons from 2007 through 2011. (See the "RPI: Measuring the Correlation Between Teams' Performance and Their Ratings" page for information on the Correlator and performance percentage method of analysis.) In a performance percentage analysis, a percentage above 100% means that the group of teams, on average, is outperforming its ratings; and a percentage below 100% means that the group is underperforming its ratings.
Specifically, I used the performance percentage method to compare teams' performance percentages in home games with their performance percentages in away games. I did this for all games regardless of the rating difference between opponents, for the 1500 most closely rated games (~10% of all games), and for the 3000 most closely rated games (~20%). I did this using both the 2011 Adjusted RPI and the Unadjusted Non-Conference RPI. Using these as examples, the following table shows the results, with the performance percentage being that for the teams favored by the RPI to win the games:
These data show that teams significantly outperform their ratings at home and underperform away. Or, to put it differently, when teams are at home, they perform as though their ratings are higher than their NCAA-measured ratings; and when they are away, they perform as though their ratings are lower. Simply put, there is a home field advantage in Division I women's soccer; and the advantage has a significant effect on game results. This, of course, is not surprising.
WHAT IS THE EXTENT OF HOME FIELD ADVANTAGE?
Knowing that there are home field imbalances and that home field advantage affects game results, the next step is to measure the effect. Home teams perform as though their ratings are higher than their NCAA RPI ratings, and away teams perform as though their ratings are lower. This suggests that there should be an upward RPI correction one could add to home teams’ ratings when they host games, and a matching downward correction to away teams’ ratings, such that with those game-by-game corrections the teams as a whole would perform in accord with their "Home/Away/Neutral (or HAN) Corrected" ratings. In other words, their performance percentages would be right around 100% for both home and away games. With that in mind, for the five years 2007 through 2011, I tested a series of HAN Correction amounts, made on a game-by-game basis, to reflect game location. This included testing the Corrections in games in which opponents were closely rated (the 1500 most closely rated games, approximately 10% of all games, and the 3000 most closely rated games, approximately 20% of all games) to see what level of HAN Correction would produce correlations in which teams performed in accord with their ratings regardless of whether they were home or away. I did this for numerous variations of the RPI.
The HAN Corrections I initially tested ranged from +0.001 for home teams matched by -0.001 for away teams, to +0.015 for home teams matched by -0.015 for away teams. In conducting the tests, I found that the results would converge on a particular set of matching upward and downward Corrections at which teams’ performances matched their RPIs in closely rated games, regardless of game location. For Corrections of lesser amounts, home teams still outperformed and away teams underperformed their ratings; and for Corrections of greater amounts, home teams underperformed and away teams outperformed their ratings. I also found that the best games to focus on were the 1500 most closely rated and the 3000 most closely rated games. These were the best games because, if I focused on games where the rating differences were larger, the Corrections needed to get home and away teams performing equally well in relation to their RPIs had to be so large that, in games between more closely rated teams, the home teams would underperform their ratings and the away teams would outperform them.
The following table illustrates how this process works, using the performance percentages of the Unadjusted RPI as an example:
Thus for the URPI, the "just right" HAN Correction amount, at which both home and away teams perform in accord with their HAN Corrected ratings, is 0.009. A lesser Correction would leave the home teams outperforming and the away teams underperforming their ratings; and a greater Correction would leave the home teams underperforming and the away teams outperforming their ratings.
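As an illustration of how a game-by-game HAN Correction operates, here is a minimal Python sketch that applies the 0.009 URPI Correction to a single game. The function names and the sample ratings are hypothetical, and treating a neutral-site game as receiving no Correction is an assumption drawn from the "Home/Away/Neutral" label rather than something stated explicitly above.

```python
URPI_HAN_CORRECTION = 0.009  # the "just right" amount reported for the URPI


def han_corrected_ratings(home_rating, away_rating,
                          correction=URPI_HAN_CORRECTION, neutral=False):
    """Return (home, away) ratings corrected for game location.

    The home team's rating is raised and the away team's rating is lowered
    by the same amount. ASSUMPTION: neutral-site games get no correction;
    in that case the two arguments are simply the two teams' ratings.
    """
    if neutral:
        return home_rating, away_rating
    return home_rating + correction, away_rating - correction


def favored(home_rating, away_rating):
    """Name the side with the higher rating: "home", "away", or "even"."""
    if home_rating > away_rating:
        return "home"
    if away_rating > home_rating:
        return "away"
    return "even"


# Hypothetical closely rated game: the visitor has the higher raw rating.
raw_home, raw_away = 0.550, 0.560
adj_home, adj_away = han_corrected_ratings(raw_home, raw_away)

print(favored(raw_home, raw_away))  # away
print(favored(adj_home, adj_away))  # home
```

In this example the visiting team is rated 0.010 higher, but the Correction moves the two ratings a combined 0.018 toward the home team, so the home team becomes the favorite. This is why the Correction matters most in the closely rated games the analysis focuses on.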
The appropriate HAN Correction varies as the versions of the RPI vary. This appears to be due to both differences in the spreads between the high ends of the ratings and the low ends in moving from one version of the RPI to the next and differences in how the ratings are distributed in moving from one version to the next.
IS THE LOCATION-CORRECTED RPI A BETTER PERFORMANCE MEASURE THAN THE RPI?
As a "validating" test of whether the location-corrected RPI better represents teams' performance in games, my next step is to compare the RPI’s correlation with regular season game results to the HAN Corrected RPI's correlation.
Looking at all games played from 2007 through 2011, using the 2009 Adjusted RPI and NCRPI as an example, here is how the ratings perform overall and for the Top 60 ARPI teams, with and without HAN Corrections:
As the above table shows, the HAN Corrected RPI version correlates slightly better with game results than the uncorrected RPI. This also is true for other variations of the RPI. Thus this test validates that game-by-game HAN Corrections produce ratings, for individual games, that better reflect teams' strength in the games given the game locations.
Using HAN Corrected ratings is important when studying how well the RPI performs, when comparing one variation of the RPI to another, and when considering the effects of home game imbalances in relation to the RPI's conference and region playing pool problems.