Trang chủ‎ > ‎IT‎ > ‎Data Science - Python‎ > ‎

RMSE in Python, MATLAB or R

What is RMSE? What problem does it solve?

If you understand RMSE, asking for a library to do it for you is over-engineering. RMSE is a single line of python code at most 2 inches long.

RMSE answers the question: "How similar, on average, are the numbers in list1 to list2?". The two lists must be the same size. I want to "wash out the noise between any two given elements, wash out the size of the data collected, and get a single number feel for change over time".

Intuition and ELI5 for RMSE:

Imagine you are learning to throw darts at a dart board. Every day you practice for one hour. You want to figure out if you are getting better or getting worse. So every day you make 10 throws and measure the distance between the bullseye and where your dart hit.

You make a list of those numbers. Use the root mean squared error between the distances at day 1 and a list containing all zeros. Do the same on the 2nd and nth days. What you will get is a single number that hopefully decreases over time. When your RMSE number is zero, you hit bullseyes every time. If the number goes up, you are getting worse.

Example in calculating root mean squared error in python:

import numpy as np
d = [0.000, 0.166, 0.333]
p = [0.000, 0.254, 0.998]

print("d is: " + str(["%.8f" % elem for elem in d]))
print("p is: " + str(["%.8f" % elem for elem in p]))

def rmse(predictions, targets):
    return np.sqrt(((predictions - targets) ** 2).mean())

rmse_val = rmse(np.array(d), np.array(p))
print("rms error is: " + str(rmse_val))

Which prints:

d is: ['0.00000000', '0.16600000', '0.33300000']
p is: ['0.00000000', '0.25400000', '0.99800000']
rms error between lists d and p is: 0.387284994115

The mathematical notation:

enter image description here

The rmse done in small steps so it can be understood:

def rmse(predictions, targets):

    differences = predictions - targets                       #the DIFFERENCEs.

    differences_squared = differences ** 2                    #the SQUAREs of ^

    mean_of_differences_squared = differences_squared.mean()  #the MEAN of ^

    rmse_val = np.sqrt(mean_of_differences_squared)           #ROOT of ^

    return root_of_of_the_mean_of_the_differences_squared     #get the ^

How does every step of RMSE work:

Subtracting one number from another gives you the distance between them.

8 - 5 = 3         #distance between 8 and 5 is 3
-20 - 10 = -30    #distance between -20 and 10 is +30

If you multiply a number times itself, you get a positive:

3*3     = 9   = positive
-30*-30 = 900 = positive

Add them all up, but wait, then an array with many elements would have a larger error than a small array, so average them.

But wait, we squared them earlier to force them positive. Undo the damage with a square root!

That leaves you with a single number that represents, on average, the distance between every value of list1 to it's corresponding element value of list2.

If the RMSE value goes down over time we are happy because variance is decreasing.

Gotchas that can break this RMSE function:

If there are nulls or infinity in either input list, then output rmse value is is going to not make sense. There are three strategies to deal with nulls / missing values / infinities in either list: Ignore that component, zero it out or add a best guess or a uniform random noise to all timesteps. Each remedy has its pros and cons depending on what your data means. In general ignoring any component with a missing value is preferred, but this biases the RMSE toward zero making you think performance has improved when it really hasn't. Adding random noise on a best guess could be preferred if there are lots of missing values.

In order to guarantee relative correctness of the RMSE output, you must eliminate all nulls/infinites from the input.

sklearn.metrics has a mean_squared_error function. The RMSE is just the square root of whatever it returns.

from sklearn.metrics import mean_squared_error
from math import sqrt

rms = sqrt(mean_squared_error(y_actual, y_predicted))

n = len(predictions)
rmse = np.linalg.norm(predictions - targets) / np.sqrt(n)

return np.sqrt(np.mean((predictions-targets)**2))

For Keras:
def root_mean_squared_error(y_true, y_pred):
        return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1)) 

model.compile(optimizer = "rmsprop", loss = root_mean_squared_error, 
              metrics =["accuracy"])
(y - yhat)    % Errors
(y - yhat).^2   % Squared Error
mean((y - yhat).^2)   % Mean Squared Error
RMSE = sqrt(mean((y - yhat).^2));  % Root Mean Squared Error

An example in R to calculate RMSE and MAE:
# Function that returns Root Mean Squared Error
rmse <- function(error)
# Function that returns Mean Absolute Error
mae <- function(error)
# Example data
actual <- c(4, 6, 9, 10, 4, 6, 4, 7, 8, 7)
predicted <- c(5, 6, 8, 10, 4, 8, 4, 9, 8, 9)
# Calculate error
error <- actual - predicted
# Example of invocation of functions
# Example in a linear model
## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
rmse(lm.D9$residuals) # root mean squared error

Moreover, In R, there is a function named accuracy to measure the time series prediction performance:




Accuracy Measures For A Forecast Model

Returns range of summary measures of the forecast accuracy. If x is provided,
the function measures test set forecast accuracy based on x-f. If x is not provided,
the function only produces training set accuracy measures of the forecasts based
on f["x"]-fitted(f). All measures are defined and discussed in Hyndman and
Koehler (2006).


An object of class “forecast”, or a numerical vector containing forecasts.
It will also work with Arimaets and lm objects if x is omitted -- in
which case training set accuracy measures are returned.


Additional arguments depending on the specific method.


An optional numerical vector containing actual values of the same length
as object, or a time series overlapping with the times of f.


Indicator of which elements of x and f to test. If test is NULL,
all elements are used. Otherwise test is a numeric vector containing
the indices of the elements to use in the test.


An integer indicating the number of lag-1 differences to be used for
the denominator in MASE calculation. Default value is 1 for non-seasonal
series and 0 for seasonal series.


An integer indicating the number of seasonal differences to be used
for the denominator in MASE calculation. Default value is 0 for non-seasonal
series and 1 for seasonal series.


The measures calculated are:

  • ME: Mean Error

  • RMSE: Root Mean Squared Error

  • MAE: Mean Absolute Error

  • MPE: Mean Percentage Error

  • MAPE: Mean Absolute Percentage Error

  • MASE: Mean Absolute Scaled Error

  • ACF1: Autocorrelation of errors at lag 1.

By default, the MASE calculation is scaled using MAE of training set naive forecasts
for non-seasonal time series, training set seasonal naive forecasts for seasonal
time series and training set mean forecasts for non-time series data. If f is
a numerical vector rather than a forecast object, the MASE will not be returned
as the training data will not be available.