forked from mlr-org/mlr
-
Notifications
You must be signed in to change notification settings - Fork 0
/
NEWS
310 lines (291 loc) · 13.3 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
mlr_2.4:
- WrappedModel printer was slightly improved
- ReampleResult now stores the runtime it took to resample in a slot
- getTaskFormula / getTaskFormulaAsString have new argument 'explicit.features'
- getTaskData now has recodeY = "drop.levels" which drops empty factor levels
- option fix.factors in makeLearner was renamed to fix.factors.prediction for clarity
- showHyperPars was removed. getParamSet does exactly the same thing
- 'resample' and 'benchmark' got the argument keep.pred,
setting it to FALSE allows to discard the prediction objects to save memory
- we had to slightly change how the mem usage is reported in tuning and feature selection
See TuneControl and FeatSelControl where it is documented what is done now.
- tuneIrace: allows to set the precision / digits within irace (using the argument 'digits'
in makeTuneControlIrace); default is maximum precision
- for plotting in general we try to introduce a "data layer", so the data can be generated independently
of the plotting first, into well-defined objects; these can then be plotted with mlr or custom code;
the naming scheme is always generate<Foo>Data and plot<Foo>
- getFilterValues is deprecated in favor of generateFilterValuesData
- plotFilterValues can now plot multiple filter methods using facetting
- plotROCRCurves has been rewritten to use ggplot2
- classif.ada: added "loss" hyperpar
- add missings properties to all ctree and cforest methods:
regr/classif for ctree, regr/classif/surv for cforest, and regr/classif for blackboost
- learner xgboost was removed, because the package is not on CRAN anymore, unfortunately
- reg.km: added param 'iso'
- classif.mda: added param 'start.method' and changed its default to 'lvq', added params 'sub.df', 'tot.df'
and 'criterion'
- classif.randomForest: 'sampsize' can now be an int vector (instead of a scalar)
- plotThreshVsPerf and plotLearningCurve now have param 'facet'
- new functions
-- getTaskSize
-- getNestedTuneResultsX, getNestedTuneResultsOptPathDf
-- tuneDesign
-- generateROCRCurvesData, generateFilterValuesData, generateLearningCurveData, plotLearningCurve,
generateThreshVsPerfData, plotThreshVsPerf
-- generateThreshVsPerfData accepts Prediction, ResampleResult, lists of ResampleResult,
and BenchmarkResult objects.
-- experimental ggvis functions: plotROCRCurvesGGVIS, plotLearningCurveGGVIS,
plotTuneMultiCritResultGGVIS, plotThreshVsPerfGGVIS, and plotFilterValuesGGVIS
- new learners:
-- classif.bst
-- classif.hdrda
-- classif.nodeHarvest
-- classif.pamr
-- classif.rFerns
-- classif.sparseLDA
-- regr.bst
-- regr.frbs
-- regr.nodeHarvest
-- regr.slim
- new measures:
-- brier
mlr_2.3:
- resample now returns an object of class ResampleResult (downward compatible)
to allow for a print method.
- resampling on features now supported for an arbitrary number of factor features
- mlr supports ViperCharts plots now
- ROC plot via ROCR can now be created automatically, before you had to call asROCRPrediction,
then construct the plots via ROCR your self. See plotROCRCurves
- all mlr measures now have slots "name" and "note"
- exported a few very simple "getters" for tasks, see below
- in makeLearner a probability predict.threshold can be set for classifiers, also see setPredictThreshold
- in the control objects for tuning and feature selection, the user can now enable threshold tuning
- in the control objects for tuning and feature selection, the user can now define his own logging function
- default console logging for tuneParams and selectFeatures is more informative, it displays time and memory info
- updated some properties of some learners
- Default arguments of classif.bartMachine, classif.randomForestSRC, regr.randomForestSRC and sur.randomForestSRC
have been changed to allow missing data support with default settings.
- externalized measure functions to be used on vectors.
- some minor bug fixes
- required basic learner packages are not loaded into the global namespace anymore, requireNamespace
is used internally instead. this ensures less name clashes and name shadowing
- resample passes dot arguments to the learner hyperpars
- new option "on.par.out.of.bounds" to disable out-of-bound checks for model parameters
- measures were slightly internally changed. they expose more properties (check ?Measure) and some
now unnecessary object slots were removed
- classif.lda and classif.qda now have hyperpar "predict.method"
- filterFeatures and makeFilterWrapper gain an argument for mandatory features
- plotLearnerPrediction has new option "err.size"
- classif.plsDA and cluster.DBscan for now removed because of problems with the underlying learning algorithm
- new aggregation test.join
- the following models now can handle factors and ordereds by extra dummy or int encoding:
classif.glmnet, regr.glmnet, surv.glmnet, surv.cvglmnet, surv.penalized,
surv.optimCoxBoostPenalty, surv.glmboost, surv.CoxBoost
- new functions
-- getTaskType, getTaskId, getTaskTargetNames
-- plotROCRCurves
-- plotViperCharts
-- measureSSE, measureMSE, measureRMSE, measureMEDSE, ...
-- PreprocWrapperCaret
-- setPredictThreshold
- new learners:
-- classif.bdk
-- classif.binomial
-- classif.extraTrees
-- classif.probit
-- classif.xgboost
-- classif.xyf
-- regr.bartMachine
-- regr.bcart
-- regr.bdk
-- regr.bgp
-- regr.bgpllm
-- regr.blm
-- regr.brnn
-- regr.btgp
-- regr.btgpllm
-- regr.btlm
-- regr.cubist
-- regr.elmNN
-- regr.extraTrees
-- regr.laGP
-- regr.xgboost
-- regr.xyf
-- surv.rpart
mlr_2.2:
- The web tutorial was MUCH improved!
- more example tasks and data sets
- Learners and tasks now support ordered factors as features.
The task description knows whether ordered factors are present and it is checked whether the learner
supports such a feature. We have set this property 'ordered' very conservatively, so very
few learners have it, where we are sure ordered inputs are handled correctly during training.
If you know of more models that support this, please inform us.
- basic R learners now have new slots: name (a descriptive name of the algorithm),
short.name (abbreviation that can be used in plots and tables) and note
(notes regarding slight changes for the mlr integration of the learner and such).
- makeLearner now supports some options regarding learner error handling and output which
could before only be set globally via configureMlr
- Additional arguments for imputation functions to allow a more fine-grain
control of dummy column creation
- imputeMin and imputeMax now subtract or add a multiple of the range of
the data from the minimum or to the maximum, respectively.
- cluster methods now have property 'prob' when they support fuzzy cluster membership probabilities,
and also then support predict.type = 'prob'. Everything basically works the same as for posterior
probabilities in classif.* methods.
- predict preserves the rownames of the input in its output
- fixed a bug in createDummyFeatures that caused an error when the data contained missing values.
- plotLearnerPrediction works for clustering and allows greyscale plots (for printing or articles)
- the whole object-oriented structure behind feature filtering was much
improved. Smaller changes in the signature of makeFilterWrapper and
filterFeatures have become necessary.
- fixed a bug in filter methods of the FSelector package that caused an error when variable names
contained accented letters
- filterFeatures can now be also applied to the result of getFilterValues
- We dropped the data.frame version of some preprocessing operations like mergeFactorLevelsBySize,
joinClassLevels and removeConstantFeatures for consistency. These now always require tasks as input.
- We support a pretty generic framework for stacking / super-learning now, see makeStackedLearner
- imbalancy correction + smote:
-- fix a bug in "smote" when only factor features are present
-- change to oversampling: sample new observations only (with replacement)
--- extension to smote algorithm (sampling): minority class observations in binary classification
are either chosen via sampling or alternatively, each minority class observation is used an
equal number of times
- made the getters for BenchmarkResult more consistent. These are now:
getBMRTaskIds, getBMRLearnerIds, getBMRPredictions, getBMRPerformances, getBMRAggrPerformances
getBMRTuneResults, getFeatSelResults, getBMRFilteredFeatures
The following methods do not work for BenchmarkResult anymore: getTuneResult, getFeatSelResult
- Removed getFilterResult, because it does the same as getFilteredFeatures
- new learners:
-- classif.bartMachine
-- classif.lqa
-- classif.randomForestSRC
-- classif.sda
-- regr.ctree
-- regr.plsr
-- regr.randomForestSRC
-- cluster.cmeans
-- cluster.DBScan
-- cluster.kmeans
-- cluster.FarthestFirst
-- surv.cvglmnet
-- surv.optimCoxBoostPenalty
- new filters:
-- variance
-- univariate
-- carscore
-- rf.importance, rf.min.depth
-- anova.test, kruskal.test
-- mrmr
- new functions
-- makeMulticlassWrapper
-- makeStackedLearner, getStackedBaseLearnerPredictions
-- joinClassLevels
-- summarizeColumns, summarizeLevels
-- capLargeValues, mergeFactorLevelsBySize
mlr_2.1:
- mlr now supports multi-criteria tuning
- mlr now supports cluster analysis (experimental)
- improve makeWeightedClassesWrapper: Hyperparams for class weighting are now supported, too.
- removed fix.factors option from randomForest, but added it in general to makeLearner,
so it now works for all learners.
Helps when feature factor levels where dropped in newdata prediction data.frames
- more consistent results for tuning algorithms and parameters with "trafos" :
we always return the optimal settings on the transformed scale, but in the opt.path in
the original scale.
- fix a bug when feature filtering resulted in a NoFeatureModel
- resample now returns a data.frame "err.mgs" or error messages that might have occurred during resampling
- stratified resampling for survival
- new learners:
-- classif.cforest
-- classif.glmnet
-- classif.plsdaCaret
-- regr.cforest
-- regr.glmnet
-- regr.svm
-- surv.cforest
-- cluster.SimpleKMeans
-- cluster.EM
-- cluster.XMeans
- new measures
-- bac
-- db, dunn, g1, g2, silhouette
- new functions
-- makeClusterTask
-- removeHyperPars
-- tuneParamsMultiCrit
-- makeTuneMultiCritControlGrid, makeTuneMultiCritControlRandom, makeTuneMultiCritControlNSGA2
-- plotTuneMultiCritResult
-- getFailureModelMsg
mlr_2.0:
- mlr now supports survival analysis models (experimental)
- mlr now supports cost-sensitive learning with example-specific costs (experimental)
- Some example tasks and data sets were added for simple access
- added FeatSelWrapper and getFeatSelResult
- performance functions now allows to compute multiple measures
- added multiclass.roc performance measure
- observation weights can now also be specified in the task
- added option on.learner.warning to configureMlr to suppress warnings in learners
- fixed a bug in stratified CV where elements where not distributed as evenly as possible
when the split number did not divide the number of observation
- added class.weights param for classif.svm
- add fix.factors.prediction option to randomForest
- generic standard error estimation in randomForest and BaggingWrapper
- added fixup.data option to task constructors, so basic data cleanup can be performed
- show.info is now an option in configureMlr
- learners now support taggable properties that can be queried and changed. also see below.
- listLearners(forTask) was unified
- removed tuning via R' optim method (makeTuneControlOptim), as the optimizers in there really make
no sense for tuning
- Grid search was improved so one does not have to discretize parameters manually anymore
(although this is still possible). Instead one now passes a 'resolution' argument. Internally we
now use ParamHelpers::generateGridDesign for this.
- toy tasks were added for convenient usage: iris.task, sonar.task, bh.task
they also also have corresponding resampling instances, so you directly start working, e.g., iris.rin
- new learners:
-- classif.knn
-- classif.IBk
-- classif.LiblineaRBinary
-- classif.LiblineaRLogReg
-- classif.LiblineaRMultiClass
-- classif.linDA
-- classif.plr
-- classif.plsDA
-- classif.rrlda
-- regr.crs
-- regr.IBk
-- regr.mob
-- surv.CoxBoost
-- surv.coxph
-- surv.glmboost
-- surv.glmnet
-- surv.penalized
-- surv.randomForestSRC
- new measures
-- multiauc
-- cindex
-- meancosts, mcp
- new functions
-- removeConstantFeatures, normalizeFeatures, dropFeatures, createDummyFeatures
-- getTaskNFeats
-- hasProperties, getProperties, setProperties, addProperties, removeProperties
-- showHyperPars
-- setId
-- listMeasures
-- isFailureModel
-- plotLearnerPrediction
-- plotThreshVsPerf
-- holdout, subsample, crossval, repcv, bootstrapOOB, bootstrapB632, bootstrapB632plus
-- listFilterMethods, getFilterValues, filterFeatures, makeFilterWrapper, plotFilterValues
-- benchmark
-- getPerformances, getAggrPerformances, getPredictions, getFilterResult, getTuneResult, getFeatSelResult
-- oversample, undersample, makeOversampleWrapper, makeUndersampleWrapper
-- smote, makeSmoteWrapper
-- downsample, makeDownsampleWrapper
-- makeWeightedClassesWrapper
-- makeTuneControlGenSA
-- makeModelMultiplexer, makeModelMultiplexerParamSet
-- makeCostSensTask, makeCostSensClassifWrapper, makeCostSensRegrWrapper, makeCostsSensWeightedPairsLearner
-- makeSurvTask
-- impute, reimpute, makeImputeWrapper, lots of impute<Method>, makeImputeMethod
mlr_1.1:
- Initial release to CRAN