This repository has been archived by the owner on Apr 8, 2021. It is now read-only.

error-minimizing pruner #24

Closed
avibryant opened this issue Dec 21, 2014 · 3 comments

@avibryant
Contributor

This is only really relevant to people building single-tree models, but you should be able to prune a single tree to minimize validation error.

@avibryant
Contributor Author

To elaborate a bit more on the steps that would be needed here:

  • Add a method to Tree that takes a Map[Int,T] holding the validation distribution for each leaf, keyed by leaf ID, along with an Error and a Voter.
  • The method should work its way recursively up the tree from the leaves, checking the following at each split:
    • Call the leaf training distributions TL and TR (for left and right) and the leaf validation distributions VL and VR (though the code should really generalize to any number of children).
    • Let E(TL,VL) denote the error object produced by comparing the training and validation distributions (in code this looks like error.create(tl, voter.combine(Some(vl)))).
    • We have semigroups for both distributions and errors; let + denote combining them.
    • Prune these leaves iff E(TL + TR, VL + VR) <= E(TL,VL) + E(TR,VR), that is, iff the merged leaf's validation error is no worse than the combined error of the separate leaves.
  • Once Tree has this method, Trainer needs a method that constructs the Map[Int,T] from the trainingData for each tree and then transforms the trees using the prune method.
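
As a rough illustration of the rule above, here is a self-contained sketch that does not use the real Tree, Error, or Voter types. Everything in it is a stand-in invented for the example: Node, Leaf, Split, Ops, prune, plusCounts, and misclassified are hypothetical names, Error and Voter are collapsed into a single error function, the semigroups are passed as plain functions, and a merged leaf simply reuses the smallest child ID. A split where any leaf has no validation entry is left unpruned, which is one possible choice and not something the steps above specify.

```scala
object PruneSketch {
  // Hypothetical stand-ins for the real tree types.
  sealed trait Node[T]
  final case class Leaf[T](id: Int, training: T) extends Node[T]
  final case class Split[T](children: List[Node[T]]) extends Node[T]

  // E(t, v), plus "+" for distributions and for errors (the two semigroups).
  final case class Ops[T, E](
      error: (T, T) => E,
      plusT: (T, T) => T,
      plusE: (E, E) => E
  )

  def prune[T, E](root: Node[T], validation: Map[Int, T], ops: Ops[T, E])(
      implicit ord: Ordering[E]): Node[T] = {
    // Returns the pruned node, plus its (training, validation) pair when the
    // result is a leaf that has validation data.
    def go(node: Node[T]): (Node[T], Option[(T, T)]) = node match {
      case Leaf(id, t) => (node, validation.get(id).map(v => (t, v)))
      case Split(children) =>
        // Bottom-up: prune subtrees first so merged leaves can merge again.
        val results = children.map(go)
        val kept = Split(results.map(_._1))
        val ids = results.collect { case (Leaf(id, _), _) => id }
        val dists = results.flatMap(_._2)
        // Assumption: only merge when every child is a leaf with validation data.
        if (dists.isEmpty || dists.size < children.size) (kept, None)
        else {
          val (t, v) = dists.reduce((a, b) =>
            (ops.plusT(a._1, b._1), ops.plusT(a._2, b._2)))
          val separate =
            dists.map { case (ti, vi) => ops.error(ti, vi) }.reduce(ops.plusE)
          // Prune iff E(sum of T, sum of V) <= sum of E(Ti, Vi).
          if (ord.lteq(ops.error(t, v), separate)) (Leaf(ids.min, t), Some((t, v)))
          else (kept, None)
        }
    }
    go(root)._1
  }

  // Example instance: label counts, with error = number of validation
  // instances whose label differs from the training majority label.
  type Counts = Map[String, Long]

  def plusCounts(a: Counts, b: Counts): Counts =
    (a.keySet ++ b.keySet)
      .map(k => k -> (a.getOrElse(k, 0L) + b.getOrElse(k, 0L)))
      .toMap

  def misclassified(t: Counts, v: Counts): Long = {
    val predicted = t.maxBy(_._2)._1 // assumes a non-empty training distribution
    v.collect { case (label, n) if label != predicted => n }.sum
  }

  val countOps: Ops[Counts, Long] = Ops[Counts, Long](misclassified, plusCounts, _ + _)
}
```

With these stand-ins, two sibling leaves that both predict the same label collapse into one leaf, since merging cannot increase the misclassification count, while siblings that predict different labels and are each right on validation stay split.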

@roban roban self-assigned this Feb 19, 2015
@roban
Contributor

roban commented Feb 26, 2015

In progress at #36

@roban
Contributor

roban commented Feb 26, 2015

Closed by #36

@roban roban closed this as completed Feb 26, 2015