-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Eigen Memory Trees (EMT)
An Eigen Memory Tree (EMT) is a memory based learning reduction that works on examples with simple label types. EMTs will remember previous training examples and use this memory to assign labels to future requested predictions. In this way EMT is similar in kind to other memory based learners such as nearest neighbor, however, unlike nearest neighbors models EMT is able to learn and grow incrementally while preserving O(log n) time complexity. The original EMT paper describing the reduction in detail can be found here.
Python based contextual bandit experiments using EMT can be found at: https://github.com/mrucker/emt_experiments. These experiments create contextual bandit learners from an EMT by remembering the rewards received when taking an action in a given context. At prediction time the EMT is queried for every available context action pair. The context action pair that EMT predicts will receive the highest reward becomes the greedy action.
In order to use the EMT reduction in VW all one needs to do is pass --emt
when initializing VW. In addition to this there a few other optional flags.
vw --emt --emt_tree <tree mems> --emt_leaf <leaf mems> --emt_scorer <scorer type> --emt_router <router type> --emt_initial <initial metric>
The first flag --emt
is a required indicator that simply tells VW to create an EMT reduction. The first argument, <tree mems>
, indicates the number of memories the tree is allowed to have before memories begin to be pruned. This is an optional argument whose default value, 0, means that the an EMT tree will never forget memories. The second argument, <leaf mems>
, indicates how many memories a leaf in the tree can have before a leaf will be split into two smaller leaves. Leaf size is a trade-off parameter. In general larger leaf sizes lead to more accurate predictions at the cost of more compute time. The next two arguments <scorer type>
and <router type>
control how the tree is searched. By default these are set to use self-consistent rank scorer and eigen routing. The final argument, <initial metric>
determines the metric used to initialize the scoring function. By default this uses cosine similarity. In general the default values should work well for most problems.
EMT works with simple label types:
[Label] [Importance] [Base]
-
[Label]
should be a scalar value. -
[Importance]
determines how large of an update step to take. -
[Base]
is unused by EMT. EMT calculates its own base internally.
EMT also works with several existing VW flags due to it reducing to a regression learner. We have found the following flags can improve performance.
--min_prediction 0 --max_prediction 3 --interactions <polymomial namespace interactions>
Due to the mechanism used by EMT to compare memories bounding the min and max prediction used by the reduction learner can greatly improve performance. Additionally, adding a few simple non-linear namespace interactions can also improve the performance of the scorer reduction learner. None of these flags are necessary, but if you are not getting desired results from EMT they are easy to test.
- Home
- First Steps
- Input
- Command line arguments
- Model saving and loading
- Controlling VW's output
- Audit
- Algorithm details
- Awesome Vowpal Wabbit
- Learning algorithm
- Learning to Search subsystem
- Loss functions
- What is a learner?
- Docker image
- Model merging
- Evaluation of exploration algorithms
- Reductions
- Contextual Bandit algorithms
- Contextual Bandit Exploration with SquareCB
- Contextual Bandit Zeroth Order Optimization
- Conditional Contextual Bandit
- Slates
- CATS, CATS-pdf for Continuous Actions
- Automl
- Epsilon Decay
- Warm starting contextual bandits
- Efficient Second Order Online Learning
- Latent Dirichlet Allocation
- VW Reductions Workflows
- Interaction Grounded Learning
- CB with Large Action Spaces
- CB with Graph Feedback
- FreeGrad
- Marginal
- Active Learning
- Eigen Memory Trees (EMT)
- Element-wise interaction
- Bindings
-
Examples
- Logged Contextual Bandit example
- One Against All (oaa) multi class example
- Weighted All Pairs (wap) multi class example
- Cost Sensitive One Against All (csoaa) multi class example
- Multiclass classification
- Error Correcting Tournament (ect) multi class example
- Malicious URL example
- Daemon example
- Matrix factorization example
- Rcv1 example
- Truncated gradient descent example
- Scripts
- Implement your own joint prediction model
- Predicting probabilities
- murmur2 vs murmur3
- Weight vector
- Matching Label and Prediction Types Between Reductions
- Zhen's Presentation Slides on enhancements to vw
- EZExample Archive
- Design Documents
- Contribute: