mlr::arsq gives 1.0 always and greater than mlr::rsq #2711

Closed
ManuKulamkombil opened this issue Jan 8, 2020 · 17 comments

@ManuKulamkombil
ManuKulamkombil commented Jan 8, 2020

Prework

  • Didn't find any duplicates on this.

Description

After training and predicting, arsq (adjusted R squared) always came out as 1, and was also greater than the rsq value. I suspect the formula used for arsq is different from the one I know. I tried it with custom R code and got the right answer. I then tried to extend it to an arsq.v2 on my local system with the required changes, and got an answer equal to rsq rather than the adjusted value. I am currently using mlr version 2.14.0.

Reproducible example

I learned that the formula given in the link

 1 - (1 - rsq) * (p / (n - p - 1L))

is not the same as I expected it to be. My understanding of ARSQ is

 1 - (1 - rsq) * ((n - 1) / (n - p - 1L))
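A quick numeric sketch of the difference (the values rsq = 0.48, n = 60, p = 3 are only illustrative):

rsq <- 0.48; n <- 60; p <- 3
1 - (1 - rsq) * (p / (n - p - 1))        # mlr's formula
#> [1] 0.9721429
1 - (1 - rsq) * ((n - 1) / (n - p - 1))  # the formula I expected
#> [1] 0.4521429

The first form comes out above rsq; the second comes out slightly below it, as an adjusted R squared should.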

The inbuilt arsq is defined identically to my arsq.v2 below, except that its formula multiplies by p / (n - p - 1L) instead of (n - 1) / (n - p - 1L).

I tried to change that as follows:

arsq.v2 = makeMeasure(id = "arsq.v2", minimize = FALSE, best = 1, worst = 0,
                   properties = c("regr", "req.pred", "req.truth"),
                   name = "Adjusted coefficient of determination",
                   note = "Defined as: 1 - (1 - rsq) * ((n - 1) / (n - p - 1L)). Adjusted R-squared is only defined for normal linear regression.",
                   fun = function(task, model, pred, feats, extra.args) {
                       n = length(pred$data$truth)
                       p = length(model$features)
                       if (n == p + 1) {
                           warning("Adjusted R-squared is undefined if the number observations is equal to the number of independent variables plus one.")
                           return(NA_real_)
                       }
                       1 - (1 - measureRSQ(pred$data$truth, pred$data$response)) * ((n - 1) / (n - p - 1L))
                   })

I compared both with this:

meas = mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2)); meas

Got the following results.

    rmse      mae      rsq     arsq  arsq.v2 
3.147790 1.187279 0.479620 1.000000 0.479620 

Then I tried arsq with a custom rsq:

preds = testPred$data$response
actual = testPred$data$truth
n = length(actual)            ## number of observations
p = length(model$features)    ## number of features
rss <- sum((preds - actual) ^ 2)  ## residual sum of squares
tss <- sum((actual - mean(actual)) ^ 2)  ## total sum of squares
rsq <- 1 - rss/tss; rsq

adj.r.squared = 1 - (1 - rsq) * ((n - 1)/(n-p-1)); adj.r.squared

Expected output

> actual = testPred$data$truth
> rss <- sum((preds - actual) ^ 2)  ## residual sum of squares
> tss <- sum((actual - mean(actual)) ^ 2)  ## total sum of squares
> rsq <- 1 - rss/tss; rsq
[1] 0.47962
> 
> adj.r.squared = 1 - (1 - rsq) * ((n - 1)/(n-p-1)); adj.r.squared
[1] 0.4594142
@pat-s
Member

pat-s commented Jan 8, 2020

Thanks for reporting.

Can you please add a reproducible example, preferably using the reprex package?
Your issue is somewhat extensive but also a bit confusing.

It's great that you found out how to make a custom measure.
The next step is to find a scientific reference for how "arsq" is calculated and check whether our implementation is wrong.

You also state that it "gives always 1.0". Can you give a reprex for this point as well?

@ManuKulamkombil
Author

# Setup ####
Install_And_Load <- function(packages) {
    k <- packages[!(packages %in% installed.packages()[,"Package"])];
    if(length(k))
    {install.packages(k, repos='https://cran.rstudio.com/');}
    
    for(package_name in packages)
    {suppressMessages(library(package_name,character.only=TRUE, quietly = TRUE));}
}

Install_And_Load(c("rapportools", 
                   "plyr" , "dplyr", "tidyr",
                   "plotly", "gganimate", "gifski", "png", "ggmosaic",
                   "tictoc", 
                   "pracma", "tis",
                   "purrr", "purrrlyr",
                   "cluster", "Rtsne", "mlr",
                   "kknn", "glmnet", "h2o", "caret",
                   "irace", "parallelMap", 
                   "parallel" , "mice",
                   # wordcloud
                   "wordcloud", 'wordcloud2', 'RColorBrewer',
                   'datarium'
))
#> Warning: package 'dplyr' was built under R version 3.5.3
#> Warning: package 'gganimate' was built under R version 3.5.3
#> Warning: package 'gifski' was built under R version 3.5.3
#> Warning: package 'ggmosaic' was built under R version 3.5.3
#> Warning: package 'pracma' was built under R version 3.5.3
#> Warning: package 'tis' was built under R version 3.5.3
#> Warning: package 'purrrlyr' was built under R version 3.5.3
#> Warning: package 'Rtsne' was built under R version 3.5.3
#> Warning: package 'mlr' was built under R version 3.5.3
#> Warning: package 'ParamHelpers' was built under R version 3.5.3
#> Warning: package 'kknn' was built under R version 3.5.3
#> Warning: package 'glmnet' was built under R version 3.5.3
#> Warning: package 'h2o' was built under R version 3.5.3
#> Warning: package 'irace' was built under R version 3.5.3
#> Warning: package 'parallelMap' was built under R version 3.5.3
#> Warning: package 'mice' was built under R version 3.5.3
#> Warning: package 'wordcloud' was built under R version 3.5.3
#> Warning: package 'wordcloud2' was built under R version 3.5.3
#> Warning: package 'datarium' was built under R version 3.5.3


# Load sample data ####
data("marketing", package = "datarium")

# Split data

set.seed(12345)
index <- sample(1:200, size = trunc(.7 * 200))
trainData <- marketing %>% filter(row_number() %in% index)
testData <- marketing %>% filter(!row_number() %in% index)

# Generate task ####

trainTask <- trainData %>%
    as.data.frame() %>%
    makeRegrTask(data = ., target = "sales", id = "Train data")

testTask <- testData %>%
    as.data.frame() %>%
    makeRegrTask(data = ., target = "sales", id = "Test data")

# Generate the learner ####
regr.lm.learner = makeLearner("regr.lm")

# Train the learner on trainData ####
model <- mlr::train(learner = regr.lm.learner, 
                                   task = trainTask)

# Predict on testData ####
testPred <- predict(model, task = testTask)


# Measure performance ####

# With custom arsq.v2
arsq.v2 = makeMeasure(id = "arsq.v2", minimize = FALSE, best = 1, worst = 0,
                      properties = c("regr", "req.pred", "req.truth"),
                      name = "Adjusted coefficient of determination",
                      note = "Defined as: 1 - (1 - rsq) * ((n - 1) / (n - p - 1L)). Adjusted R-squared is only defined for normal linear regression.",
                      fun = function(task, model, pred, feats, extra.args) {
                          n = length(pred$data$truth)
                          p = length(model$features)
                          if (n == p + 1) {
                              warning("Adjusted R-squared is undefined if the number observations is equal to the number of independent variables plus one.")
                              return(NA_real_)
                          }
                          1 - (1 - measureRSQ(pred$data$truth, pred$data$response)) * ((n - 1) / (n - p - 1L))
                      })


meas = mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2)); meas
#>      rmse       mae       rsq      arsq   arsq.v2 
#> 2.1012827 1.5839530 0.9021156 1.0000000 0.9021156


# finding the bug ####

# setting n and p 
n = length(testPred$data$truth); n # no of observations
#> [1] 60
p = length(model$features); p # no of features
#> [1] 3

# calculate arsq without any mlr 
preds = testPred$data$response
actual = testPred$data$truth
rss <- sum((preds - actual) ^ 2)  # residual sum of squares
tss <- sum((actual - mean(actual)) ^ 2)  # total sum of squares
r.squared <- 1 - rss/tss; r.squared
#> [1] 0.9021156
adj.r.squared = 1 - (1 - r.squared) * ((n - 1)/(n-p-1)); adj.r.squared
#> [1] 0.8968717


# calculate arsq using R.Square from mlr results

R.squared = meas[[3]]; R.squared
#> [1] 0.9021156
adj.r.squared = 1 - (1 - R.squared) * ((n - 1)/(n-p-1)); adj.r.squared
#> [1] 0.8968717

# using measures::ARSQ directly

measures::ARSQ(truth = testPred$data$truth, 
               response = testPred$data$response, 
               n, 
               p)
#> [1] 0.9947562

# reprex::reprex()

Created on 2020-01-09 by the reprex package (v0.3.0)
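For what it's worth, that 0.9947562 is exactly what the p / (n - p - 1) formula yields with the values above (a quick check; I am assuming measures::ARSQ implements that formula):

1 - (1 - 0.9021156) * (3 / (60 - 3 - 1))
#> [1] 0.9947562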

@ManuKulamkombil
Author

Basically two things:

  1. The definition of adjusted R squared in the mlr/measures library differs from the one given on Wikipedia.

mlr::arsq gives a value close to 1 and greater than mlr::rsq, whereas adjusted R squared should always be less than R squared. This alerted me. I checked its code and found it differs from the formula I know.

Ideally, arsq should be
adj.r.squared = 1 - (1 - r.squared) * ((n - 1)/(n-p-1))
but here (in mlr) it is
adj.r.squared = 1 - (1 - r.squared) * (p/(n-p-1)).

Let me know if I am wrong about this.

  2. I was unable to rectify it.

I created arsq.v2 (using makeMeasure) with an updated formula based on mlr::arsq, but it gives the same value as rsq, whereas the same formula outside of mlr gives a value less than rsq.

@pat-s
Member

pat-s commented Jan 12, 2020

Ok, thanks for the research! I'll take a closer look in the next days and see what we can do on our side.

For backward compatibility we can't just change the implementation, but would need to solve this via a new argument or similar (if we take action).

@berndbischl
Member

Sure we can change this, if this is a bug, and we should change it.
I am also a little bit confused here about why it is implemented the way it is.
This really should, and must, be correct.

@pat-s
Member

pat-s commented Jan 14, 2020

Sure we can change this, if this is a bug, and we should change it.

It looks like there may be multiple implementations floating around rather than only one? I haven't yet taken a look at how we did it in mlr3measures. If ours is clearly wrong then sure, we should just replace it.
If there are multiple "accepted" ways to do it, we should use an argument to allow both.

@adibender

adibender commented Jan 14, 2020

Looks like a weird mix-up between two definitions, plus a mix-up with regard to the definition of p (number of features / number of parameters / number of parameters - 1).

The two equivalent definitions in the German Wikipedia are

rsq - (1 - rsq) * (k / (n - p))
1 - (1 - rsq) * ((n - 1) / (n - p))

In the German Wikipedia, p equals the number of coefficients including the intercept (beta_0) and k = p - 1; in the English Wikipedia, p equals the number of coefficients minus the intercept (beta_0).
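A quick numeric check that the two forms agree when k = p - 1 (illustrative values only):

rsq <- 0.9; n <- 60; p <- 4; k <- p - 1
rsq - (1 - rsq) * (k / (n - p))
#> [1] 0.8946429
1 - (1 - rsq) * ((n - 1) / (n - p))
#> [1] 0.8946429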
So I think the formula is definitely wrong. However, I think there is more to consider.
The mlr code defines

p = length(model$features)

Are cases caught where the model is defined like lm(y ~ poly(x, degree = 3)) -> one feature, 3 parameters + intercept?

Also, in lm/summary.lm the adjusted r.squared is defined as

1 - (1 - rsq) * ((n - df.int) / rdf)

where df.int can be 0 if a model is specified without an intercept (lm(y ~ x - 1)),
and rdf, the residual degrees of freedom, can depend on the weights: if one of the weights is 0, then rdf automatically decreases by 1. So I think weights should be taken into account as well.
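A minimal check of that definition against base R, assuming an unweighted model (using the marketing data from the reprex above):

fit <- lm(sales ~ ., data = datarium::marketing)
s <- summary(fit)
n <- length(fit$residuals)
df.int <- attr(terms(fit), "intercept")  # 1 with intercept, 0 without
rdf <- fit$df.residual                   # residual degrees of freedom
all.equal(s$adj.r.squared, 1 - (1 - s$r.squared) * ((n - df.int) / rdf))
#> [1] TRUE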

@berndbischl
Member

Are cases caught where the model is defined like lm(y ~ poly(x, degree = 3)) -> one feature, 3 parameters + intercept?

That's actually not really possible in mlr, at least not directly: mlr does not have a formula interface. We do support specific learners for poly-regression though.

Should we maybe simply remove the measure? It really does not make sense from a broad ML perspective.

@adibender

Should we maybe simply remove the measure? It really does not make sense from a broad ML perspective.

I wouldn't mind. It only really makes sense when comparing models with different numbers of parameters in-sample, i.e. on training data (there is no need to adjust R^2 when evaluating on test data), and according to the help page it is only available for "normal" linear models, i.e. OLS. So how often does one compare the performance of different linear models on training data using mlr?

@berndbischl
Member

Exactly, I agree 100%.

We can still advise @ManuKulamkombil on how to produce a "local" measure for his purposes, as he seems to be the one user who needs the adjusted R^2 with mlr :)
But if I read his top post correctly, he already knows how to integrate his own local measure.

@berndbischl
Member

All agree?

@pat-s
Member

pat-s commented Jan 15, 2020

Ok, I'll remove it.

@ManuKulamkombil
Author

Exactly, I agree 100%.

We can still advise @ManuKulamkombil on how to produce a "local" measure for his purposes, as he seems to be the one user who needs the adjusted R^2 with mlr :)
But if I read his top post correctly, he already knows how to integrate his own local measure.

Agreed. But I was not successful in creating arsq correctly. I created arsq.v2, but it gave a value equal to rsq, not less than it.

@ManuKulamkombil
Author

Please suggest corrections to arsq.v2.

Thanks in advance

@adibender

@ManuKulamkombil Are you sure you need Adjusted R^2?

You evaluate the performance on test data

mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2))

Adjusted R^2 is only useful when comparing different models on training data.
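For example (a hypothetical sketch with the marketing data; the point is that adjusted R^2 penalizes the extra parameters when comparing nested OLS fits in-sample):

fit1 <- lm(sales ~ youtube, data = datarium::marketing)
fit2 <- lm(sales ~ youtube + facebook + newspaper, data = datarium::marketing)
summary(fit1)$adj.r.squared  # fewer parameters, smaller penalty
summary(fit2)$adj.r.squared  # more parameters, larger penalty on R^2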

@ManuKulamkombil
Author

@ManuKulamkombil Are you sure you need Adjusted R^2?

You evaluate the performance on test data

mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2))

Adjusted R^2 is only useful when comparing different models on training data.

I am using it on training data for comparing models in my work, but I did not show that in my example. I wanted to make a brief example, so I skipped that part.
So yes, I need adjusted r^2.

@adibender

adibender commented Jan 17, 2020

You need to provide the model object to the performance function:

trainPred <- predict(model, task = trainTask)
mlr::performance(trainPred, model = model, measures = list(mlr::rmse, mlr::mae, mlr::rsq, mlr::arsq, arsq.v2))

Otherwise model is set to NULL, so p = length(NULL$features) = 0; then (n - 1)/(n - p - 1) = 1, which is why arsq.v2 equals rsq, and p/(n - p - 1) = 0, which is why the inbuilt arsq is always exactly 1.
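A minimal sketch of the degenerate case, plugging in the reprex values (n = 60, rsq = 0.9021156, and p = 0 because no model was passed):

n <- 60; p <- 0; rsq <- 0.9021156
1 - (1 - rsq) * (p / (n - p - 1))        # inbuilt arsq: factor is 0, so the result is always 1
#> [1] 1
1 - (1 - rsq) * ((n - 1) / (n - p - 1))  # arsq.v2: factor is (n-1)/(n-1) = 1, so the result equals rsq
#> [1] 0.9021156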

@pat-s pat-s closed this as completed in bdeda0e Jan 19, 2020