RPI: Measuring the Correlation Between Teams' Ratings and Their Performance

[NOTE:  In 2022, a new NCAA rule eliminated overtimes except in conference tournament and NCAA tournament games.  Unless stated otherwise, information on this page is based on data from 2010 to the present, with game results adjusted as though the 2022 change to no overtimes had been in effect.  These game results adjustments are possible because the NCAA data system since 2010 has shown which games were decided in overtime.  Also, the Covid-affected 2020 season is excluded from the data.]

Updated April 2024

How good is the RPI at evaluating teams' performance over the course of the regular season (including conference tournaments)?  How does home performance compare to away performance, in relation to teams' RPI ratings?  How well does each conference perform in relation to its average rating?  How well do teams from different geographic regions perform in relation to their average ratings?  How well do different variations of the RPI perform when compared to each other?  How well does the RPI perform as compared to other systems?

To answer questions like these, I developed a computer tool for testing how well groups of teams perform in relation to their ratings -- for example, home teams and away teams, teams grouped by conference, and teams grouped by geographic region.  I use the tool to evaluate not only the RPI, but also other rating systems.   I call the tool the "Correlator."  I call it the Correlator because it determines the extent to which a group of teams' actual game results correlate with the teams' ratings.  So that readers, if they want, will know exactly how the Correlator works, this webpage provides a technical explanation of it.

The Correlator uses two methods to see how groups of teams perform in relation to their ratings:  the Performance Percentage Analysis method and the Actual Results Compared to Likely Results Analysis method.

Method 1: Performance Percentage Analysis

To measure a group of teams' performances in relation to their ratings, this method determines two sets of numbers:

a.  The number of games the ratings say the teams should have won (because they have higher ratings than their opponents) and, of those games, the number the teams actually won -- in other words, did not lose or tie; and

b.  The number of games the ratings say the teams should have lost (because they have poorer ratings than their opponents) and, of those games, the number the teams did not lose -- in other words, won or tied.

Together, these numbers show the extent to which teams performed as their ratings say they should have performed -- did they perform better, poorer, or just about as the ratings indicate?

I then determine the percent of games the teams in the group actually won, of the games the ratings say they should have won; and the percent of games the teams actually won or tied, of the games the ratings say they should have lost.  Next, I add these two percentages together to produce the group's "performance percentage."
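For readers who want a concrete picture of the arithmetic, here is a minimal sketch, in Python, of how a group's performance percentage could be computed.  This is not the Correlator's actual code; the game records, field names, and the handling of identically rated opponents are illustrative assumptions.

```python
# A minimal sketch of a performance percentage calculation.  The game
# records, field names, and handling of identically rated opponents are
# illustrative assumptions, not the Correlator's actual code.

def performance_percentage(games, group):
    """games: list of dicts with keys 'team', 'team_rating', 'opp_rating',
    and 'result' ('W', 'T', or 'L' from the team's perspective).
    group: set of team names whose performance is being measured."""
    should_win = won_when_should_win = 0
    should_lose = did_not_lose_when_should_lose = 0

    for g in games:
        if g['team'] not in group:
            continue
        if g['team_rating'] > g['opp_rating']:        # ratings say: should win
            should_win += 1
            if g['result'] == 'W':
                won_when_should_win += 1
        elif g['team_rating'] < g['opp_rating']:      # ratings say: should lose
            should_lose += 1
            if g['result'] in ('W', 'T'):             # won or tied anyway
                did_not_lose_when_should_lose += 1
        # games between identically rated opponents are ignored in this sketch

    pct_won = 100.0 * won_when_should_win / should_win if should_win else 0.0
    pct_not_lost = (100.0 * did_not_lose_when_should_lose / should_lose
                    if should_lose else 0.0)
    return pct_won + pct_not_lost   # the performance percentage
```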

If I look at all games played, without regard for the size of the rating difference between opponents, the teams with the higher ratings in games (as adjusted for home field advantage) win roughly 65% of the games and teams with the lower ratings in games win or tie about 35% of the games.  So if a team wins the percentage of games it "should" win -- 65% of them -- and of the games it "should" lose, wins or ties the percentage it should -- 35% of them -- then its performance percentage, the sum of the two numbers, will be 100%.  This 100% performance percentage is what one would expect from a group of teams, if the rating system rates the group perfectly.

When I break all teams down into groups, however, such as conferences, some groups have performance percentages above 100% and some groups have performance percentages below 100%.  The groups with performance percentages above 100% performed better than their ratings say they should have, in other words are underrated.  The groups with performance percentages below 100% performed more poorly than their ratings say they should have, in other words are overrated.

As an illustration, consider two groups of teams: (1) those playing at home; and (2) those playing away.  If I compute the performance percentages of these two groups, without their ratings adjusted for home field advantage, the performance percentage of teams playing at home is above 100% -- typically, for all games, a little over 120% -- and the performance percentage of teams playing away is below 100% -- typically, a little under 80%.  This means that teams playing at home do better than their ratings say they should (are underrated when playing at home) and teams playing away do more poorly than their ratings say they should (are overrated when playing away), which of course is not surprising.

Next comes the question: How do I account for the fact that some groups of teams will tend to have higher rating differences in their games than other groups of teams?

To illustrate what I mean by this question, consider a conference of teams with very high ratings.  Because the teams have very high ratings, in the conference's non-conference games a higher than ordinary number of games may have high rating differences between the conference's teams and their opponents.  That being the case, one would expect the conference's teams to win a high proportion of their non-conference games.  Compare this to a mid-level conference whose teams play primarily mid-level non-conference opponents.  In the mid-level conference's games, fewer games may have high rating differences than for the very strong conference.  That being the case, one would expect the mid-level conference's teams to win a lower proportion of their non-conference games than the very strong conference.  A performance percentage approach that lumps together all of the very strong conference's games, and compares the result to the performance percentage of the mid-level conference with all of its games likewise lumped together, quite naturally will show the very strong conference with the higher performance percentage.  This does not mean, however, that the very strong conference is underrated and the mid-level conference is overrated.  Rather, it simply means that the very strong conference, in games the ratings said it should win, had higher rating differences than did the mid-level conference in games the ratings said it should win, so naturally the very strong conference will have a higher performance percentage.

In order to address this problem, for groups of teams whose performance I want to measure, I look at how those teams performed in games at similar levels of "rating difference."  For two opponents, their "rating difference" is the difference between their ratings.  My database for performance percentage measurements consists of more than 40,000 games.  I arrange all of these games in order of the rating difference between the opponents.  Then, starting with the most closely rated games,  for some purposes I take the 1st 10% of the games (~4,000 games), the 2nd 10%, the 3rd 10%, and so on in 10% increments until I have covered all the games.  For other purposes, I take the 1st 20% of the games (~8,000 games), the 2nd 20%, and so on.  And when determining the value of home field I consider the most closely rated 3% of games.

When using the 10% and 20% increments, for each group of teams whose performance I'm measuring, I next calculate the group's performance percentage for each 10% or 20% increment.  I then average the performance percentages for all the 10%  or 20% increments.  By doing this, I eliminate the problem of different groups of teams having different proportions of games at the various rating difference levels.  For example, the very strong conference may have 30 games at a high rating difference level as compared to a mid-level conference having 10.  If each conference wins 90% of its games at this rating difference level, then this 90% amount will be averaged in with how the conference performed at each other rating difference level, so that the ultimate performance percentages are not influenced by the two conferences having different proportions of their games at the different rating difference levels.
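Continuing the sketch from above, the increment-averaging step might look like the following.  Again, the data structures are illustrative assumptions rather than the Correlator's actual ones.

```python
# A sketch of the increment-averaging step, reusing performance_percentage()
# from the sketch above.  The group's games within each slice of the full
# database (ordered by rating difference) are scored separately, and the
# slice results are then averaged.

def averaged_performance_percentage(all_games, group, n_increments=10):
    # order every game in the database by how closely the opponents are rated
    ordered = sorted(all_games,
                     key=lambda g: abs(g['team_rating'] - g['opp_rating']))
    size = len(ordered) // n_increments
    increment_pcts = []
    for i in range(n_increments):
        start = i * size
        end = (i + 1) * size if i < n_increments - 1 else len(ordered)
        chunk = ordered[start:end]
        # in practice an increment containing no games involving the group
        # would need to be skipped rather than scored as zero
        increment_pcts.append(performance_percentage(chunk, group))
    # averaging the increments removes the effect of a group having an
    # unusually large share of its games at particular rating difference levels
    return sum(increment_pcts) / len(increment_pcts)
```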

Another question is:  What rating difference levels should I look at?

There is a potential problem with using all of the rating difference levels when generating groups' performance percentages.  Let's assume that the rating system underrates a group of teams and that I want to determine the extent of the underrating.  A problem I have, if I compute a performance percentage using all rating difference levels, is that it will understate the extent of the underrating.  Think of it this way:  Suppose that Team A is rated 0.5400 but, due to a rating system problem, really should be rated 0.5450.  Suppose that Team B is correctly rated at 0.6100, so that in a game between the two the rating difference is 0.0700, although it really should be 0.0650.  Also suppose that, at either rating difference level, experience shows that the higher rated team wins 100% of its games.  And, also suppose that a rating difference level of 0.0700 is in the 10th 10% group, in other words the games with the greatest rating differences between opposing teams.  If I use this rating difference level's games when considering my group of teams' performance percentage, I will end up understating the extent of the underrating.  It's not that the underrating of Team A doesn't exist at this level of game rating difference, it's just that the underrating is not enough to affect the game result.  So what I need to do is not use this rating difference level to determine the exact extent of the underrating, but rather to look at those games where the rating difference level is in a range where the underrating is most likely to affect the performance percentage.  Ideally, this means looking at the rating difference levels at the very close end of the rating difference spectrum.

Another related problem, however, is the need to have sufficient data to produce reliable performance percentages.  Depending on how I group teams, this means I need to examine how near to the close end of the rating difference spectrum I safely can come when I am comparing performance percentages without taking an undue risk of not having enough data:

If I have grouped teams simply as Group 1 - Home Teams and Group 2 - Away Teams, insufficient data at the "closely rated" end of the spectrum is not a problem since all but the relatively few neutral site games have both a home team and an away team.  Thus when comparing the two groups' (home teams' and away teams') performance percentages, where I have a database of roughly 40,000 games, I look at the most closely rated 3% of games.  When doing this for the various versions of the RPI, 3% of all games is ~1,200 games, which is a good-sized data sample.

I have to engage in similar thinking when looking at conferences' and regional playing pools' performance percentages.  There, I am looking only at inter-conference or inter-region games; and each rating difference level's games are divided potentially among 31 conferences or among 4 regions.  This means I'm dealing with much smaller data sets.  Because of that, for these groupings of teams it is best to look at the closest 10% and 20% of games to determine the extent of any rating problem and at all the 10% increments to determine the overall effect of the problem.

And, a further question is: How do I take into account the effects of home field imbalances?

On the RPI: Home/Away/Neutral Issues page, I show how I use the Correlator to calculate the value, from a rating perspective, of a team playing at home rather than away.

When I use the Correlator to measure the performance of a group of teams (such as teams in a conference or geographic region) in relation to the group members' ratings, it is important that the performance percentage the Correlator produces not reflect the group's having a positive or negative home field imbalance, but rather reflect only the extent of any problem the RPI has in properly rating that group of teams in relation to other similar groups of teams.  This means that I have to filter out the effects of any home field imbalances in the process of generating the performance percentages.  Here is how I do this:

a.  First, for the particular rating system I am investigating, I calculate the rating value of a team playing at home rather than away, using the method I describe on the RPI: Home/Away/Neutral Issues page.  (The rating value of playing at home varies from one version of the RPI to another and from one alternative rating system to another.)

b.  Then, for each game, I adjust the ratings of the opponents by giving the home team's rating an upward adjustment and the away team's rating a downward adjustment in the appropriate amount as determined in step (a).

By doing this, I effectively filter out from the teams' performance percentage calculations any effects that are due to game locations.
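As a sketch of steps (a) and (b), and using the NCAA RPI's game-site adjustment of +0.0083/-0.0083 that appears in the example below, the adjustment could be written like this (for another rating system, the constant would be that system's own home field value):

```python
# A sketch of the game-location adjustment in steps (a) and (b).  The
# constant is the NCAA RPI figure quoted in the example below (+0.0083 for
# the home team, -0.0083 for the away team); another rating system would use
# the home field value calculated for that system.

HOME_ADJUSTMENT = 0.0083

def location_adjusted_ratings(home_rating, away_rating, neutral_site=False):
    """Return (home_rating, away_rating) adjusted for game location."""
    if neutral_site:
        return home_rating, away_rating
    return home_rating + HOME_ADJUSTMENT, away_rating - HOME_ADJUSTMENT
```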

Based on all of this, I believe the performance percentage system gives a fair representation of how well the NCAA RPI, other RPI variations, and other rating systems do at rating groups of teams in relation to their actual performance.

Example

Since the above description is abstract, here is a real life demonstration of the kinds of information the Correlator produces.  This specific information is from the Correlator's evaluation of the NCAA RPI.

a.  Overall Performance, by Rating Difference Level.  Overall, the NCAA RPI formula's ratings, as adjusted for game locations, are consistent with game results as shown in the following table.  For the NCAA RPI, the game site rating adjustments are +0.0083 for the home team and -0.0083 for the away team.  In other words, home field, by itself, is equal to a 0.0166 rating difference.

In the top row of the table, I show how well the RPI's ratings match all game results: The higher rated team in a game wins 65.3% of the time; ties 21.1%; and loses 13.6%.  In the lower portion of the table, I show what these percentages are for different 1% "slices" of games, from the 1st 1% -- the most closely rated games, near the top of the table -- to the 100th 1% -- the most one-sided games, at the bottom of the table.  Each 1% slice represents 407 games (except the 100th 1%, which represents slightly fewer).
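Here is a sketch of how such a table of win/tie/loss percentages by 1% slice could be tabulated.  The field names are illustrative assumptions, not the Correlator's actual fields.

```python
# A sketch of tabulating win/tie/loss percentages by 1% slice.  The field
# names are assumed: 'adj_rating_diff' is the location-adjusted rating
# difference between the opponents, and 'higher_rated_result' is the higher
# rated team's result ('W', 'T', or 'L').

def result_table(all_games, n_slices=100):
    ordered = sorted(all_games, key=lambda g: g['adj_rating_diff'])
    size = len(ordered) // n_slices
    rows = []
    for i in range(n_slices):
        start = i * size
        end = (i + 1) * size if i < n_slices - 1 else len(ordered)
        chunk = ordered[start:end]            # the last slice may be smaller
        n = len(chunk)
        wins = sum(1 for g in chunk if g['higher_rated_result'] == 'W')
        ties = sum(1 for g in chunk if g['higher_rated_result'] == 'T')
        rows.append({'slice': i + 1,
                     'win_pct': 100.0 * wins / n,
                     'tie_pct': 100.0 * ties / n,
                     'loss_pct': 100.0 * (n - wins - ties) / n})
    return rows
```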

b.  Performance by Ranking Groups.  In addition, I look at games by teams' ranks, broken down into groups of 10 teams.  Thus I look at teams ranked 1 to 10, 11 to 20, and so on.  When I do this, I also look at the top 60 teams as a group.  I do the top 60 teams because historically, #60 is slightly outside the boundary of the teams that have a chance of getting an at-large selection for the NCAA Tournament -- since 2007, using the NCAA RPI, the most poorly ranked team to get an at-large selection has been #57.  The following table applies to teams broken into these ranking groups:

c.  Performance for Conferences.  The following table, in the fifth column, shows the average NCAA RPI for each conference's teams, arranged in order from highest conference average rating to lowest.  Other columns show the conferences' teams' performance percentages for non-conference games.  By using only non-conference games, the table shows how well the RPI does at rating teams from the different conferences in relation to each other.

In this table, the second column is the conference's performance percentage for the most closely rated 10% of games.  The third is for the most closely rated 20%.  The fourth is for all ten 10% increments of  games averaged out.

In this table, the 1st 10% column provides the best indication of the extent of the RPI's problem rating the different conferences within a single system, because it is when teams are closely rated that over- and under-rating errors are most likely to result in differences between the ratings' predicted results and actual game results.  The All 10% Segments column shows the overall impact of the problem, taking into account that as the rating differences between opponents increase, the apparent effect of the rating problem diminishes.

In this table, there are three main things to look at:

The spread between the conference with the best performance percentage and the conference with the poorest.  This is a measure of the rating system's general "fairness" at rating conferences within a single system.  Thus for the NCAA RPI, looking at All 10% Segments, the most underrated conference is the ACC, with a performance percentage of 113.7%, and the most overrated is the Southwestern (SWAC) with a performance percentage of 74.4%, followed by the Southland with 87.8%.  In these discussions and related tables and charts, I'm going to disregard the SWAC, as it is so much weaker than even the next conference from the bottom that the RPI can't come close to properly ranking its teams.  Including an outlier like that would artificially exaggerate the NCAA RPI's conference problem.  The "High/Low Spread" between the ACC's overall performance percentage of 113.7% and the Southland's 87.8% is 25.9%.  This 25.9% is a measure of the general fairness of the NCAA RPI, across all games, when it rates teams from different conferences in relation to each other.  The lower the number, the better.

The total amount by which all of the conferences have performance percentages either above or below 100%.  This is a measure of the  amount by which the rating system misses perfect correlations between ratings and game results, totaled up for all the conferences.  The totals are at the bottom of the table under the three columns on the right (with the totals excluding outlier SWAC).  For the NCAA RPI, looking at All 10% Segments, the total "Over [100%]/Under [100%]" amount is 148.5%.  This is another measure of the general fairness of the NCAA RPI, across all games, when it rates teams from different conferences in relation to each other.  Again, the lower the number, the better.

The relationship between conference average rating and conference performance.  Whereas the previous two measures look at general fairness in rating conferences' teams in a single system, this looks to see whether the rating system has a pattern of discrimination related to conference strength.  The data in the above table provide the basis for this look; and the chart below uses the data (with SWAC excluded) to show whether there is a pattern of discrimination:

The chart shows the relationship between the conferences' average NCAA RPIs and the conferences' performance percentages.  The conferences are arranged from left to right in descending order of strength: the highest average NCAA RPI -- the strongest conference -- is on the left -- and the lowest average NCAA RPI -- the weakest conference -- is on the right.  The uneven red line represents the conferences' performance percentages for the most closely rated 10% of games.  The straight red line is a computer-generated straight trend line for the red data points.  The uneven and straight yellow lines are for the most closely rated 20% of games.  The uneven and straight blue lines are for the All 10% Segments averaged out.

The chart shows that teams from conferences with higher average NCAA RPIs (on the left) tend to have higher performance percentages and thus perform better in relation to their ratings than teams from conferences with lower average NCAA RPIs (on the right).  In other words, the NCAA RPI tends to underrate teams from conferences with higher average NCAA RPIs and to overrate teams from conferences with lower average NCAA RPIs.

In the chart, the overall extent of the NCAA RPI's discrimination is the spread between the high ends of the trend lines, on the left, and the low ends, on the right.  On the chart, in the upper right, you will see three formulas.  These are formulas for the trend lines, with the formula for the "10% of games" trend line at the top, followed by the formula for the "20% of games," with the formula for "All 10% Segments" at the bottom.  The formula tells the exact performance percentage at any point on the trend line.  When using the formula, for the strongest conference, on the extreme left, x = 1; and for the weakest conference, on the extreme right, x = 30.  Applying the  "All 10% Segments" formula using these x values, the trended performance percentage for the strongest conference is  108.0% and for the weakest is  92.3%.  The difference between these two numbers -- the Trend Spread -- is 15.7%.  This too is a measure of the fairness of the NCAA RPI, across all games, when it rates teams from different conferences, but rather than being only a measure of general fairness it is a measure of whether, and if so how much, the NCAA RPI systematically discriminates based on conference strength.
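As a sketch of the arithmetic behind the three measures, the following uses made-up performance percentages (the real figures are in the table above) and a computer-generated straight trend line, fit here with numpy:

```python
# A sketch of the three conference measures, using made-up performance
# percentages purely for illustration (the real figures are in the table
# above).  Conferences are assumed to be listed from the strongest average
# rating to the weakest, so x = 1 is the strongest conference.
import numpy as np

perf_pct = [112.0, 106.0, 101.0, 97.0, 93.0, 88.0]   # placeholder values

# 1. High/Low Spread: best performance percentage less the poorest
high_low_spread = max(perf_pct) - min(perf_pct)

# 2. Over/Under total: how far, in total, the conferences sit from 100%
over_under_total = sum(abs(p - 100.0) for p in perf_pct)

# 3. Trend Spread: fit a straight trend line, then take the difference
#    between its value for the strongest conference (x = 1) and its value
#    for the weakest (here x = 6; for the real table of 30 conferences,
#    SWAC excluded, x = 30)
x = np.arange(1, len(perf_pct) + 1)
slope, intercept = np.polyfit(x, perf_pct, 1)
trend_spread = (slope * 1 + intercept) - (slope * len(perf_pct) + intercept)

print(high_low_spread, over_under_total, trend_spread)
```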

d.  Performance for Regions.   When I refer to "regions," I mean geographic regions.  I group teams in four regions based on the states where they are located:  North, Middle, South, and West.  I place states in regions based on where teams from each state play the majority (or plurality) of their games. The RPI: Regional Issues page of this website has a map showing the teams in each region.

The following table shows the average NCAA RPIs for each region's teams, arranged in order from highest region average NCAA RPI to lowest, together with the regions' teams' performance percentages.  The region performance percentages are for inter-regional games only.

As with conferences, there are three things to look at:

The spread between the region with the best performance percentage and the region with the poorest.  This is a measure of the rating system's general "fairness" at rating the different regions' teams within a single system.  Thus looking at all games, the most underrated region is the West, with an All 10% Segments performance percentage of 113.4% and the most overrated is the South with a performance percentage of 94.3%.  The "High/Low Spread" between the West's performance percentage of 113.4% and the South's 94.3% is 19.0%.  This 19.0% is a measure of the general fairness of the NCAA RPI, across all games, when it rates teams from different regions in relation to each other.

The total amount by which all of the regions have performance percentages either above or below 100%.  This is a measure of the  amount by which the rating system misses perfect correlations between ratings and game results, totaled up for all the regions.  The amounts for the individual regions are in the three columns on the right, with the total amounts "missed" at the bottom.  Looking at All 10% Segments, the total "Over [100%]/Under [100%]" amount is 26.5%.  This too is a measure of the general fairness of the NCAA RPI, across all games, when it rates teams from different regions in relation to each other.

The relationship between region average rating and region performance.  Whereas the previous two measures look at general fairness in rating regions' teams in a single system, this looks to see whether the rating system has a pattern of discrimination related to region strength.  The following chart uses the data from the above table to show whether there is a pattern of discrimination:

The chart shows the relationship between the regions' average NCAA RPIs and the regions' performance percentages.  The regions are arranged in descending order of strength: the highest average NCAA RPI -- the strongest region -- is on the left -- and the lowest average NCAA RPI -- the weakest region -- is on the right.  The uneven red line represents the regions' performance percentages for the most closely rated 10% of games.  The straight red line is a computer-generated straight trend line for the red data points.  The uneven and straight yellow lines are for the most closely rated 20% of games.  The uneven and straight blue lines are for All 10% Segments.

Using the trend lines, the chart suggests that teams from regions with higher average NCAA RPIs (on the left) on average have higher performance percentages and thus perform better in relation to their ratings than teams from regions with lower average NCAA RPIs (on the right).  In other words, the NCAA RPI on average discriminates against teams from regions with higher average NCAA RPIs and in favor of teams from regions with lower average NCAA RPIs.  If this interpretation of the chart is correct, then the overall extent of the discrimination is the spread between the high ends of the trend lines, on the left, and the low ends, on the right.  On the chart, in the upper right, you will see three formulas.  These are formulas for the trend lines, with the formula for the "10% of games" trend line at the top,  the formula for the "20% of games" in the middle, and the formula for "All 10% Segments" at the bottom.  The formula tells the exact performance percentage at any point on the trend line.  For the strong region end of the trend line, on the extreme left, x = 1; and for the weak region end, on the extreme right, x = 4.  Applying the  "All 10% Segments" formula using these x values, the trended performance percentage for the strong end is  108.3% and for the weak end is  95.1%.  The difference between these two numbers -- the Trend Spread -- is 13.2%.  This too may be a measure of the fairness of the NCAA RPI in terms of how it rates teams from different regions, but rather than being only a measure of general fairness it also may be a measure of how much the NCAA RPI, across all games, systematically discriminates based on region strength.

Looking at the chart, the West region is significantly underrated, and one might conclude that this is due to its having the best average NCAA RPI -- in other words, that the NCAA RPI has the same problem with regions as it has with conferences.  However, if you look at the data points for the South, Middle, and North regions, their performance percentages go up and down rather than declining steadily as region strength declines.  The chart thus leaves open the possibility that the variations in region performance may be due partly to region strength, but also, or instead, to some other factor.  I discuss this further on the RPI: Regional Issues page.

Method 2:  Actual Results Compared to Likely Results Analysis

The first table above shows, for each 1% increment of games in terms of how closely the opponents are rated, the likelihood of the higher rated team winning, tying, and losing the game.  The following chart is based on the table:

In the chart, the data points are from the table.  Using the red data points as an example, they are the percent win likelihoods at different rating difference levels between opponents.  The rating differences run from the smallest on the left to the greatest on the right.  The solid red line is a computer-generated trend line for win likelihood based on the data.  Of the three formulas on the chart, the upper-most is the formula for the red trend line, in other words for win likelihoods.  For any rating difference, the formula lets you compute the expected win likelihood for the higher rated team.  The same can be done for the tie and loss likelihoods for any rating difference.

Using the formulas allows the production of a table that shows the win, tie, and loss likelihoods in any game based on the location-adjusted rating difference between the opponents.  (The table, which is too long to reproduce here, needs some manual adjustments at the extreme high end of the rating difference spectrum, where some of the computer-generated likelihoods dip below 0.)
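Here is a sketch of how such a likelihood lookup could be built from trend-line formulas.  The data and the straight-line fits below are placeholders purely for illustration, not the actual formulas shown on the chart; the clamping at the end is the code equivalent of the manual adjustments just described.

```python
# A sketch of building a likelihood lookup from fitted trend lines.  The
# arrays are placeholders standing in for the per-slice win and tie
# percentages, and the straight-line fits are purely for illustration; the
# Correlator would use the actual chart formulas.
import numpy as np

diffs   = np.linspace(0.0, 0.15, 100)     # location-adjusted rating differences
win_pct = np.linspace(50.0, 100.0, 100)   # placeholder per-slice win percentages
tie_pct = np.linspace(30.0, 0.0, 100)     # placeholder per-slice tie percentages

win_fit = np.polyfit(diffs, win_pct, 1)   # trend line for win likelihood
tie_fit = np.polyfit(diffs, tie_pct, 1)   # trend line for tie likelihood

def likelihoods(rating_diff):
    """Win/tie/loss likelihoods (in %) for the higher rated team."""
    win = np.polyval(win_fit, rating_diff)
    tie = np.polyval(tie_fit, rating_diff)
    # at the extreme high end of the rating difference spectrum the fitted
    # values can dip below 0 (or win can exceed 100), so clamp them -- the
    # code equivalent of the manual adjustments described above
    win = min(max(win, 0.0), 100.0)
    tie = min(max(tie, 0.0), 100.0 - win)
    loss = 100.0 - win - tie
    return win, tie, loss
```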

The Actual Results Compared to Likely Results analysis uses these likelihoods as follows: for each game a group's teams play, it computes the likely result from the opponents' location-adjusted rating difference; it then totals the group's likely results, expresses them as a likely winning percentage, and compares that to the group's actual winning percentage.  The analysis produces the following results:

a.  Performance for Conferences.  The following table, in the second column, shows the average NCAA RPI for each conference's teams, arranged in order from highest conference average rating to lowest.  The next two columns show the conferences' actual and likely winning percentages for non-conference games.  The last column shows the difference between the two.  By using only non-conference games, the table shows how well the RPI does at rating teams from the different conferences in relation to each other.
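As a sketch of the comparison itself, the following computes a group's Actual Less Likely Winning Percentage difference for its non-conference games, reusing the likelihoods function from the sketch above.  The field names are assumptions, and a tie is counted as half a win, which is the RPI's own winning percentage convention; the Correlator's actual bookkeeping may differ.

```python
# A sketch of the actual-versus-likely comparison for one group of teams,
# reusing likelihoods() from the sketch above.  Field names are assumptions:
# 'adj_rating_diff' is the team's location-adjusted rating less the
# opponent's.  A tie is counted as half a win.

def actual_less_likely(games, group):
    actual = likely = played = 0.0
    for g in games:
        if g['team'] not in group or g['conference_game']:
            continue                               # non-conference games only
        played += 1
        # actual result, with a tie worth half a win
        actual += {'W': 1.0, 'T': 0.5, 'L': 0.0}[g['result']]
        # likely result, from the location-adjusted rating difference
        diff = g['adj_rating_diff']
        win, tie, loss = likelihoods(abs(diff))
        expected = (win + 0.5 * tie) / 100.0       # higher rated side's expectation
        if diff < 0:
            expected = 1.0 - expected              # this team is the lower rated side
        likely += expected
    if not played:
        return 0.0
    # the Actual Less Likely Winning Percentage difference, in percent
    return 100.0 * (actual - likely) / played
```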

In this table, there are three main things to look at:

The spread between the conference with the best Actual Less Likely Winning Percentage difference and the conference with the poorest.  This is a measure of the rating system's general "fairness" at rating conferences within a single system.  Thus for the NCAA RPI, the most underrated conference is the ACC, with a difference of 4.7% and the most overrated is the Southwestern (SWAC) with a difference of -8.2%, followed by the Southland with -4.3%.  As discussed above, I'm going to disregard the SWAC, as it is an outlier so much weaker than even the next conference from the bottom that the RPI can't come close to properly ranking its teams. The "High/Low Spread" between the ACC's Winning Percentage difference of 4.7% and the Southland's -4.3% is 9.0%.  This 9.0% is a measure of the general fairness of the NCAA RPI when it rates teams from different conferences in relation to each other.  The lower the number, the better.

The total amount by which all of the conferences have Winning Percentage differences either above or below 0%.  This is a measure of the amount by which the rating system misses perfect correlations between ratings and game results, totaled up for all the conferences.  The total is at the bottom of the table under the right-hand column (with the total excluding outlier SWAC).  For the NCAA RPI, the total is 59.3%.  This is another measure of the general fairness of the NCAA RPI when it rates teams from different conferences in relation to each other.  Again, the lower the number, the better.

The relationship between conference average rating and conference performance.  Whereas the previous two measures look at general fairness in rating conferences' teams in a single system, this looks to see whether the rating system has a pattern of discrimination related to conference strength.  The data in the above table provide the basis for this look; and the chart below uses the data (with SWAC excluded) to show whether there is a pattern of discrimination:

The chart shows the relationship between the conferences' average NCAA RPIs and the conferences' Actual Less Likely Winning Percentage differences.  The conferences are arranged from left to right in descending order of strength: the highest average NCAA RPI -- the strongest conference -- is on the left -- and the lowest average NCAA RPI -- the weakest conference -- is on the right.  The uneven orange line with the black markers represents the conferences' Actual Less Likely Winning Percentage differences.  The straight black line is a computer-generated straight trend line for the black data points.

The chart shows that teams from conferences with higher average NCAA  RPIs (on the left) tend to have higher Actual Less Likely Winning Percentage differences and thus perform better in relation to their ratings than teams from conferences with lower average ratings (on the right).  In other words, the NCAA RPI tends to underrate teams from conferences with higher average NCAA RPIs and to overrate teams from conferences with lower average NCAA RPIs.

In the chart, the overall extent of the RPI's discrimination is the spread between the high end of the trend line, on the left, and the low end, on the right.  On the chart, in the upper right, you will see a formula.  This is a formula for the trend line.  The formula tells the Actual Less Likely Winning Percentage difference at any point on the trend line.  When using the formula, for the strong conference extreme, on the left, x = 1; and for the weak conference extreme, on the right, x = 30.  Applying the formula using these x values, the trended Actual Less Likely Winning Percentage difference for the strongest conference is  2.9% and for the weakest is  -2.9%.  The difference between these two numbers -- the Trend Spread -- is 5.8%.  This too is a measure of the fairness of the NCAA RPI when it rates teams from different conferences, but rather than being only a measure of general fairness it is a measure of how much the NCAA RPI systematically discriminates based on conference strength.

b.  Performance for Regions.  The following table, in the second column, shows the average NCAA RPIs for each region's teams, arranged in order from best region average NCAA RPI to poorest.  To the right, it shows the regions' Actual and Likely Winning Percentages and the Actual Less Likely Winning Percentage differences.  The region Winning Percentage numbers are for inter-regional games only.

As with conferences, in this table, there are three main things to look at:

The spread between the region with the best Actual Less Likely Winning Percentage difference and the region with the poorest.  This is a measure of the rating system's general "fairness" at rating regions within a single system.  Thus for the NCAA RPI, the most underrated region is the West, with a difference of 5.1%, and the most overrated is the South, with a difference of -2.5%.  The "High/Low Spread" between the West's 5.1% and the South's -2.5% is 7.5% (not 7.6%, due to rounding of the underlying numbers).  This 7.5% is a measure of the general fairness of the NCAA RPI when it rates teams from different regions in relation to each other.  The lower the number, the better.

The total amount by which all of the regions have Winning Percentage differences either above or below 0%.  This is a measure of the amount by which the rating system misses perfect correlations between ratings and game results, totaled up for all the regions.  The total is at the bottom of the table under the right-hand column.  For the NCAA RPI, the total is 10.7%.  This is another measure of the general fairness of the NCAA RPI when it rates teams from different regions in relation to each other.  Again, the lower the number, the better.

The relationship between region average rating and region performance.  Whereas the previous two measures look at general fairness in rating regions' teams in a single system, this looks to see whether the rating system has a pattern of discrimination related to region strength.  The data in the above table provide the basis for this look; and the chart below uses the data to show whether there is a pattern of discrimination:

The chart shows the relationship between the regions' average NCAA RPIs and the regions' Actual Less Likely Winning Percentage differences.  The regions are arranged from left to right in descending order of strength: the highest average NCAA RPI -- the strongest region-- is on the left -- and the lowest average NCAA RPI -- the weakest region -- is on the right.  The uneven orange line with the black markers represents the regions' Actual Less Likely Winning Percentage differences.  The straight black line is a computer-generated straight trend line for the black data points.

In the chart, the overall extent of the NCAA RPI's discrimination is the spread between the high end of the trend line, on the left, and the low end, on the right.  On the chart, in the upper right, you will see a formula.  This is a formula for the trend line.  The formula tells the Actual Less Likely Winning Percentage difference at any point on the trend line.  When using the formula, for the strong region extreme, on the left, x = 1; and for the weak region extreme, on the right, x = 4.  Applying the formula using these x values, the trended Actual Less Likely Winning Percentage difference for the strongest region is  3.0% and for the weakest is  -1.7%.  The difference between these two numbers -- the Trend Spread -- is 4.7%.  This too is a measure of the fairness of the NCAA RPI when it rates teams from different regions, but rather than being only a measure of general fairness it is a measure of how much the NCAA RPI systematically discriminates based on region strength.

Looking at the chart, the West region is significantly underrated, and one might conclude that this is due to its having the best average NCAA RPI -- in other words, that the NCAA RPI has the same problem with regions as it has with conferences.  However, if you look at the data points for the regions from left to right, their Winning Percentage differences go up and down.  Given that there are only four data points, the chart leaves open the possibility that the variations in region performance may be due in part to region strength, but also, or instead, to some other factor.  I discuss this further on the RPI: Regional Issues page.

Observation About the Two Methods

If you compare the conference tables and charts for the two methods, they are similar, including in where the conferences' performances fit in relation to their NCAA RPI ratings, although the match is not identical.  If you compare the region tables and charts, they are very similar.  The fact that the two methods, in evaluating how the NCAA RPI rates the conferences and regions in relation to other conferences and regions, produce similar results strongly confirms that the NCAA RPI has a problem with conferences and regions.