Fix task data #2107

Merged: 3 commits merged into master on Dec 13, 2017

@mb706 (Contributor) commented Dec 13, 2017

This fixes a few discrepancies I found in the task files in the data directory:

  • The "surv" tasks had type "regr". This was fixed in "Fix 2101 surv task type" (#2102), but the fix had not propagated to the builtin data. Affected: lung.task, wpbc.task.
  • The "surv" tasks still had the $censoring slot, which was removed from "surv" tasks in "Improve Survival stuff" (#1833). Affected: lung.task, wpbc.task.
  • Some of the classif and multilabel tasks were missing the class.distribution slot. Affected: bc.task.spatial, costiris.task, gunpoint.task, phoneme.task.
  • (minor) The names in yeast.task's task.desc were in a different order than in the other task.descs.
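
As a rough way to look at the slots mentioned in this list, the sketch below inspects a few builtin tasks. It assumes an installed mlr version that already contains these fixes. Using iris.task as the classification example is my choice and not part of this PR.

```r
library(mlr)

# Task type of the two survival tasks; "surv" is expected once the data files are fixed.
surv.types = vapply(list(lung = lung.task, wpbc = wpbc.task), getTaskType, character(1))
print(surv.types)

# The task description is a list, so a slot that was removed simply reads as NULL.
has.censoring = !is.null(getTaskDesc(lung.task)$censoring)
print(has.censoring)

# For a classification task, class.distribution holds the per-class counts.
print(getTaskDesc(iris.task)$class.distribution)
```

On an mlr version that predates the fix, the first two checks would be expected to show "regr" and a non-NULL censoring slot instead.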

The assumption I am making here is that changeData(task, getTaskData(task, functionals.as = "matrix")) should always be all.equal to the original task. I added a test that verifies this for the builtin data. Please don't merge this if this assumption is wrong.
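
A minimal sketch of how this round-trip assumption could be checked for every dataset shipped with mlr. This is not the test added in the PR: the helper name roundtripOk and the use of data(package = "mlr") to discover the builtin tasks are assumptions made for illustration.

```r
library(mlr)

# Hypothetical helper: TRUE if rebuilding the task from its own data reproduces it.
# An error during the rebuild is treated as a failed round trip.
roundtripOk = function(task) {
  tryCatch({
    rebuilt = changeData(task, getTaskData(task, functionals.as = "matrix"))
    isTRUE(all.equal(task, rebuilt))
  }, error = function(e) FALSE)
}

# Collect the names of the datasets shipped with mlr and load them
# into a scratch environment so the global environment stays untouched.
task.names = data(package = "mlr")$results[, "Item"]
task.env = new.env()
data(list = task.names, package = "mlr", envir = task.env)

# One logical per builtin task; any FALSE marks a data file that may have drifted.
ok = vapply(task.names, function(nm) roundtripOk(get(nm, envir = task.env)), logical(1))
print(ok[!ok])
```

Wrapping the comparison in isTRUE() matters because all.equal() returns a character vector describing the differences, rather than FALSE, when the objects differ.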

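The tasks were regenerated from the original data using the following script:

# Run in the 'data' subdir, with mlr loaded.
for (x in list.files()) {
  # each .rda file is expected to contain one object named after the file
  load(x)
  xname = gsub("\\.rda$", "", x)
  xdat = get(xname)
  # rebuild the task from its own data so that the task description is regenerated
  xdat2 = changeData(xdat, getTaskData(xdat, functionals.as = "matrix"))
  # skip files whose task is already up to date
  if (isTRUE(all.equal(xdat, xdat2))) next
  assign(xname, xdat2)
  save(list = xname, file = x)
}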

@larskotthoff (Member) commented

Thanks for fixing this, merging.

We should probably regenerate the data files as part of the Travis build so that things like this can't happen.

@larskotthoff larskotthoff merged commit fa29f45 into master Dec 13, 2017
@larskotthoff larskotthoff deleted the mb706_fix_task_data branch December 13, 2017 16:49
@mb706 mb706 mentioned this pull request Dec 13, 2017
zmjones pushed a commit that referenced this pull request Dec 19, 2017
* Fixing surv task data files for problem #2010

* renewing tasks that differed from what changeData() gives

* test for changed data format