mlr::arsq gives 1.0 always and greater than mlr::rsq #2711

Closed
ManuKulamkombil opened this issue Jan 8, 2020 · 17 comments

@ManuKulamkombil
ManuKulamkombil commented Jan 8, 2020

Prework

  • Didn't find any duplicates on this.

Description

After training and predicting, arsq (adjusted R squared) always came out as 1, and was also greater than the rsq value. I suspect the formula used for arsq is different from the one I know. I tried it with custom R code and got the right answer. I then tried to extend it to an arsq.v2 on my local system with the required changes, and got an answer equal to rsq rather than the adjusted value. I am currently using mlr version 2.14.0.

Reproducible example

I learned that the formula given in the link

 1 - (1 - rsq) * (p / (n - p - 1L))

is not the same as I expected it to be. My understanding of ARSQ is

 1 - (1 - rsq) * ((n - 1) / (n - p - 1L))
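A quick numeric sketch of the difference (the values rsq = 0.48, n = 60, p = 3 are only illustrative):

rsq <- 0.48; n <- 60; p <- 3
1 - (1 - rsq) * (p / (n - p - 1))        # mlr's formula
#> [1] 0.9721429
1 - (1 - rsq) * ((n - 1) / (n - p - 1))  # the formula I expected
#> [1] 0.4521429

The first form comes out above rsq; the second comes out slightly below it, as an adjusted R squared should.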

The inbuilt arsq is defined identically to my arsq.v2 below, except that its formula multiplies by p / (n - p - 1L) instead of (n - 1) / (n - p - 1L).

I tried to change that as follows:

arsq.v2 = makeMeasure(id = "arsq.v2", minimize = FALSE, best = 1, worst = 0,
                   properties = c("regr", "req.pred", "req.truth"),
                   name = "Adjusted coefficient of determination",
                   note = "Defined as: 1 - (1 - rsq) * ((n - 1) / (n - p - 1L)). Adjusted R-squared is only defined for normal linear regression.",
                   fun = function(task, model, pred, feats, extra.args) {
                       n = length(pred$data$truth)
                       p = length(model$features)
                       if (n == p + 1) {
                           warning("Adjusted R-squared is undefined if the number observations is equal to the number of independent variables plus one.")
                           return(NA_real_)
                       }
                       1 - (1 - measureRSQ(pred$data$truth, pred$data$response)) * ((n - 1) / (n - p - 1L))
                   })

I compared both with this:

meas = mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2)); meas

Got the following results.

    rmse      mae      rsq     arsq  arsq.v2 
3.147790 1.187279 0.479620 1.000000 0.479620 

Then I tried arsq with a custom rsq:

preds = testPred$data$response
actual = testPred$data$truth
n = length(actual)            ## number of observations
p = length(model$features)    ## number of features
rss <- sum((preds - actual) ^ 2)  ## residual sum of squares
tss <- sum((actual - mean(actual)) ^ 2)  ## total sum of squares
rsq <- 1 - rss/tss; rsq

adj.r.squared = 1 - (1 - rsq) * ((n - 1)/(n-p-1)); adj.r.squared

Expected output

> actual = testPred$data$truth
> rss <- sum((preds - actual) ^ 2)  ## residual sum of squares
> tss <- sum((actual - mean(actual)) ^ 2)  ## total sum of squares
> rsq <- 1 - rss/tss; rsq
[1] 0.47962
> 
> adj.r.squared = 1 - (1 - rsq) * ((n - 1)/(n-p-1)); adj.r.squared
[1] 0.4594142
@pat-s
Member

pat-s commented Jan 8, 2020

Thanks for reporting.

Can you please add a reproducible example, preferably using the reprex package?
Your issue is somewhat extensive but also a bit confusing.

It's great that you found out how to make a custom measure.
The next step is to find a scientific reference for how "arsq" is calculated and check whether our implementation is wrong.

You also state that it "gives always 1.0". Can you give a reprex for this point as well?

@ManuKulamkombil
Author

# Setup ####
Install_And_Load <- function(packages) {
    k <- packages[!(packages %in% installed.packages()[,"Package"])];
    if(length(k))
    {install.packages(k, repos='https://cran.rstudio.com/');}
    
    for(package_name in packages)
    {suppressMessages(library(package_name,character.only=TRUE, quietly = TRUE));}
}

Install_And_Load(c("rapportools", 
                   "plyr" , "dplyr", "tidyr",
                   "plotly", "gganimate", "gifski", "png", "ggmosaic",
                   "tictoc", 
                   "pracma", "tis",
                   "purrr", "purrrlyr",
                   "cluster", "Rtsne", "mlr",
                   "kknn", "glmnet", "h2o", "caret",
                   "irace", "parallelMap", 
                   "parallel" , "mice",
                   # wordcloud
                   "wordcloud", 'wordcloud2', 'RColorBrewer',
                   'datarium'
))
#> Warning: package 'dplyr' was built under R version 3.5.3
#> Warning: package 'gganimate' was built under R version 3.5.3
#> Warning: package 'gifski' was built under R version 3.5.3
#> Warning: package 'ggmosaic' was built under R version 3.5.3
#> Warning: package 'pracma' was built under R version 3.5.3
#> Warning: package 'tis' was built under R version 3.5.3
#> Warning: package 'purrrlyr' was built under R version 3.5.3
#> Warning: package 'Rtsne' was built under R version 3.5.3
#> Warning: package 'mlr' was built under R version 3.5.3
#> Warning: package 'ParamHelpers' was built under R version 3.5.3
#> Warning: package 'kknn' was built under R version 3.5.3
#> Warning: package 'glmnet' was built under R version 3.5.3
#> Warning: package 'h2o' was built under R version 3.5.3
#> Warning: package 'irace' was built under R version 3.5.3
#> Warning: package 'parallelMap' was built under R version 3.5.3
#> Warning: package 'mice' was built under R version 3.5.3
#> Warning: package 'wordcloud' was built under R version 3.5.3
#> Warning: package 'wordcloud2' was built under R version 3.5.3
#> Warning: package 'datarium' was built under R version 3.5.3


# Load sample data ####
data("marketing", package = "datarium")

# Split data

set.seed(12345)
index <- sample(1:200, size = trunc(.7 * 200))
trainData <- marketing %>% filter(row_number() %in% index)
testData <- marketing %>% filter(!row_number() %in% index)

# Generate task ####

trainTask <- trainData %>%
    as.data.frame() %>%
    makeRegrTask(data = ., target = "sales", id = "Train data")

testTask <- testData %>%
    as.data.frame() %>%
    makeRegrTask(data = ., target = "sales", id = "Test data")

# Generate the learner ####
regr.lm.learner = makeLearner("regr.lm")

# Train the learner on trainData ####
model <- mlr::train(learner = regr.lm.learner, 
                                   task = trainTask)

# Predict on testData ####
testPred <- predict(model, task = testTask)


# Measure performance ####

# With custom arsq.v2
arsq.v2 = makeMeasure(id = "arsq.v2", minimize = FALSE, best = 1, worst = 0,
                      properties = c("regr", "req.pred", "req.truth"),
                      name = "Adjusted coefficient of determination",
                      note = "Defined as: 1 - (1 - rsq) * ((n - 1) / (n - p - 1L)). Adjusted R-squared is only defined for normal linear regression.",
                      fun = function(task, model, pred, feats, extra.args) {
                          n = length(pred$data$truth)
                          p = length(model$features)
                          if (n == p + 1) {
                              warning("Adjusted R-squared is undefined if the number observations is equal to the number of independent variables plus one.")
                              return(NA_real_)
                          }
                          1 - (1 - measureRSQ(pred$data$truth, pred$data$response)) * ((n - 1) / (n - p - 1L))
                      })


meas = mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2)); meas
#>      rmse       mae       rsq      arsq   arsq.v2 
#> 2.1012827 1.5839530 0.9021156 1.0000000 0.9021156


# finding the bug ####

# setting n and p 
n = length(testPred$data$truth); n # no of observations
#> [1] 60
p = length(model$features); p # no of features
#> [1] 3

# calculate arsq without any mlr 
preds = testPred$data$response
actual = testPred$data$truth
rss <- sum((preds - actual) ^ 2)  # residual sum of squares
tss <- sum((actual - mean(actual)) ^ 2)  # total sum of squares
r.squared <- 1 - rss/tss; r.squared
#> [1] 0.9021156
adj.r.squared = 1 - (1 - r.squared) * ((n - 1)/(n-p-1)); adj.r.squared
#> [1] 0.8968717


# calculate arsq using R.Square from mlr results

R.squared = meas[[3]]; R.squared
#> [1] 0.9021156
adj.r.squared = 1 - (1 - R.squared) * ((n - 1)/(n-p-1)); adj.r.squared
#> [1] 0.8968717

# using measures::ARSQ directly

measures::ARSQ(truth = testPred$data$truth, 
               response = testPred$data$response, 
               n, 
               p)
#> [1] 0.9947562

# reprex::reprex()

Created on 2020-01-09 by the reprex package (v0.3.0)
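For what it's worth, that 0.9947562 is exactly what the p / (n - p - 1) formula yields with the values above (a quick check; I am assuming measures::ARSQ implements that formula):

1 - (1 - 0.9021156) * (3 / (60 - 3 - 1))
#> [1] 0.9947562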

@ManuKulamkombil
Author

Basically two things:

  1. The definition of adjusted R squared in the mlr/measures library differs from the one given on Wikipedia.

mlr::arsq gives a value close to 1 and greater than mlr::rsq, whereas adjusted R squared should always be less than R squared. This alerted me. I checked its code and found it differs from the formula I know.

Ideally, arsq should be
adj.r.squared = 1 - (1 - r.squared) * ((n - 1)/(n-p-1))
but here (in mlr) it is
adj.r.squared = 1 - (1 - r.squared) * (p/(n-p-1)).

Let me know if I am wrong about this.

  2. I was unable to rectify it.

I created arsq.v2 (using makeMeasure) with an updated formula based on mlr::arsq, but it gives the same value as rsq, whereas the same formula outside of mlr gives a value less than rsq.

@pat-s
Member

pat-s commented Jan 12, 2020

Ok, thanks for the research! I'll take a closer look in the next days and see what we can do on our side.

For backward compatibility we can't just change the implementation, but would need to solve this via a new argument or similar (if we take action).

@berndbischl
Member

Sure we can change this, if this is a bug, and we should change it.
I am also a little bit confused here about why it is implemented the way it is.
This really should, and must, be correct.

@pat-s
Member

pat-s commented Jan 14, 2020

Sure we can change this, if this is a bug, and we should change it.

It looks like there may be multiple implementations floating around rather than only one? I haven't yet taken a look at how we did it in mlr3measures. If ours is clearly wrong then sure, we should just replace it.
If there are multiple "accepted" ways to do it, we should use an argument to allow both.

@adibender

adibender commented Jan 14, 2020

Looks like a weird mix-up between two definitions, plus a mix-up with regard to the definition of p (number of features / number of parameters / number of parameters - 1).

The two equivalent definitions in the German Wikipedia are

rsq - (1 - rsq) * (k / (n - p))
1 - (1 - rsq) * ((n - 1) / (n - p))

In the German Wikipedia, p equals the number of coefficients including the intercept (beta_0) and k = p - 1; in the English Wikipedia, p equals the number of coefficients minus the intercept (beta_0).
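A quick numeric check that the two forms agree when k = p - 1 (illustrative values only):

rsq <- 0.9; n <- 60; p <- 4; k <- p - 1
rsq - (1 - rsq) * (k / (n - p))
#> [1] 0.8946429
1 - (1 - rsq) * ((n - 1) / (n - p))
#> [1] 0.8946429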
So I think the formula is definitely wrong. However, I think there is more to consider.
The mlr code defines

p = length(model$features)

Are cases caught where the model is defined like lm(y ~ poly(x, degree = 3)) -> one feature, 3 parameters + intercept?

Also, in lm/summary.lm the adjusted r.squared is defined as

1 - (1 - rsq) * ((n - df.int) / rdf)

where df.int can be 0 if a model is specified without an intercept (lm(y ~ x - 1)),
and rdf, the residual degrees of freedom, can depend on the weights: if one of the weights is 0, then rdf automatically decreases by 1. So I think weights should be taken into account as well.
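A minimal check of that definition against base R, assuming an unweighted model (using the marketing data from the reprex above):

fit <- lm(sales ~ ., data = datarium::marketing)
s <- summary(fit)
n <- length(fit$residuals)
df.int <- attr(terms(fit), "intercept")  # 1 with intercept, 0 without
rdf <- fit$df.residual                   # residual degrees of freedom
all.equal(s$adj.r.squared, 1 - (1 - s$r.squared) * ((n - df.int) / rdf))
#> [1] TRUE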

@berndbischl
Member

Are cases caught where the model is defined like lm(y ~ poly(x, degree = 3)) -> one feature, 3 parameters + intercept?

That's actually not really possible in mlr, at least not directly: mlr does not have a formula interface. We do support specific learners for poly-regression though.

Should we maybe simply remove the measure? It really does not make sense from a broad ML perspective.

@adibender

Should we maybe simply remove the measure? It really does not make sense from a broad ML perspective.

I wouldn't mind. It only really makes sense when comparing models with different numbers of parameters in-sample, i.e. on training data (there is no need to adjust R^2 when evaluating on test data), and according to the help page it is only available for "normal" linear models, i.e. OLS. So how often does one compare the performance of different linear models on training data using mlr?

@berndbischl
Member

Exactly, I agree 100%.

We can still advise @ManuKulamkombil on how to produce a "local" measure for his purposes, as he seems to be the one user who needs the adjusted R^2 with mlr :)
But if I read his top post correctly, he already knows how to integrate his own local measure.

@berndbischl
Member

All agree?

@pat-s
Member

pat-s commented Jan 15, 2020

Ok, I'll remove it.

@ManuKulamkombil
Author

Exactly, I agree 100%.

We can still advise @ManuKulamkombil on how to produce a "local" measure for his purposes, as he seems to be the one user who needs the adjusted R^2 with mlr :)
But if I read his top post correctly, he already knows how to integrate his own local measure.

Agreed. But I was not successful in creating arsq correctly. I created arsq.v2, but it gave a value equal to rsq, not less than it.

@ManuKulamkombil
Author

Please suggest corrections to arsq.v2.

Thanks in advance

@adibender

@ManuKulamkombil Are you sure you need Adjusted R^2?

You evaluate the performance on test data

mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2))

Adjusted R^2 is only useful when comparing different models on training data.
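For example (a hypothetical sketch with the marketing data; the point is that adjusted R^2 penalizes the extra parameters when comparing nested OLS fits in-sample):

fit1 <- lm(sales ~ youtube, data = datarium::marketing)
fit2 <- lm(sales ~ youtube + facebook + newspaper, data = datarium::marketing)
summary(fit1)$adj.r.squared  # fewer parameters, smaller penalty
summary(fit2)$adj.r.squared  # more parameters, larger penalty on R^2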

@ManuKulamkombil
Author

@ManuKulamkombil Are you sure you need Adjusted R^2?

You evaluate the performance on test data

mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2))

Adjusted R^2 is only useful when comparing different models on training data.

I am using it on training data for comparing models in my work, but I did not show that in my example. I wanted to make a brief example, so I skipped that part.
So yes, I need adjusted r^2.

@adibender

adibender commented Jan 17, 2020

You need to provide the model object to the performance function:

trainPred <- predict(model, task = trainTask)
mlr::performance(trainPred, model = model, measures = list(mlr::rmse, mlr::mae, mlr::rsq, mlr::arsq, arsq.v2))

Otherwise model is set to NULL, so p = length(NULL$features) = 0; then (n - 1)/(n - p - 1) = 1, which is why arsq.v2 equals rsq, and p/(n - p - 1) = 0, which is why the inbuilt arsq is always exactly 1.
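A minimal sketch of the degenerate case, plugging in the reprex values (n = 60, rsq = 0.9021156, and p = 0 because no model was passed):

n <- 60; p <- 0; rsq <- 0.9021156
1 - (1 - rsq) * (p / (n - p - 1))        # inbuilt arsq: factor is 0, so the result is always 1
#> [1] 1
1 - (1 - rsq) * ((n - 1) / (n - p - 1))  # arsq.v2: factor is (n-1)/(n-1) = 1, so the result equals rsq
#> [1] 0.9021156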

@pat-s pat-s closed this as completed in bdeda0e Jan 19, 2020