JSON
Markus Cozowicz edited this page Oct 22, 2016 · 17 revisions
The native C++ codebase can ingest JSON when --json is passed on the command line.
The following JSON format can be ingested into VW:
- Top-level properties are considered features for the default namespace.
- Top-level properties whose values are not features (that is, JSON objects) are considered namespaces.
- Features are JSON strings, integers, floats, booleans, or arrays of integers and/or floats.
- Top-level properties starting with _ are ignored, except if they match a special property (e.g. "_label", "_multi", "_text").
- Labels can be passed using the top-level "_label" property. This is also supported for multiline examples, but the label needs to be part of one of the multiline examples.
  - If the JSON value is a string, integer, or float, it is converted to a string and passed directly to the VW label parser.
  - If the JSON value is an object, the first property needs to match one of the JSON properties of SimpleLabel or ContextualBanditLabel.
- Special text handling through "_text": properties named "_text" are processed using string splitting and not string escaping (see sample below).
- Multiline examples as used by contextual bandits are specified by using the "_multi" property. Each entry itself is an example as described above and can optionally contain a label. The top-level properties are used for the optional shared example.
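The rules above can be sketched in a few lines of Python that build examples as dictionaries and serialize them. This is only an illustration: the helper names `simple_example` and `multiline_example` are hypothetical, and writing one JSON document per line is an assumption about how the input is laid out, not something this page states.

```python
import json


def simple_example(features, label=None):
    """Build a single example; `features` holds top-level features and namespaces."""
    example = dict(features)
    if label is not None:
        # "_label" is the special top-level property described above.
        example["_label"] = label
    return example


def multiline_example(shared, actions):
    """Build a multiline example; top-level properties form the shared example."""
    example = dict(shared)
    # Each "_multi" entry is itself an example and may carry its own "_label".
    example["_multi"] = [dict(action) for action in actions]
    return example


examples = [
    simple_example({"ns1": {"location": "New York"}, "ns2": {"f2": 3.4}}, label=1),
    multiline_example(
        {"UserAge": 15},
        [
            {"_text": "elections maine", "Source": "TV"},
            {"Source": "www", "topic": 4, "_label": "2:3:.3"},
        ],
    ),
]

# Assumption: one compact JSON document per line.
lines = [json.dumps(example, separators=(",", ":")) for example in examples]
for line in lines:
    print(line)
```

The printed lines could then be saved to a file and passed to VW together with --json.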
The C# layer can ingest:
- JSON strings
- JSON.NET's JsonReader
- C# objects serializable to the above JSON format using JSON.NET serialization rules, so JsonProperty annotations and the like are honored. This is particularly useful if one needs to score a given object, then serialize it to JSON and train from the JSON serialization, as it avoids deserialization in the scoring path.
| JSON | VW String |
|---|---|
| `{"f1":25,"f2":true,"_aux":"some ignored info"}` | `\| f1:25 f2` |
| `{"ns1":{"location":"New York"},"f2":[1,0.2,3]}` | `\|ns1 New_York \| :1 :.2 :3` |
| `{"ns1":{"location":"New York"},"ns2":{"f2":3.4},"_label":1}` | `1 \|ns1 New_York \|ns2 f2:3.4` |
| `{"ns1":{"location":"New York","f2":3.4},"_label":{"Label":2,"Weight":0.3}}` | `2 0.3 \|ns1 New_York f2:3.4` |
| `{"x":2,"_text":"elections US iowa"}` | `\| x:2 elections US iowa` |
| `{"UserAge":15,"_multi":[{"_text":"elections maine","Source":"TV"},{"Source":"www","topic":4,"_label":"2:3:.3"}]}` | `shared \| UserAge:15`<br>`\| elections maine SourceTV`<br>`2:3:.3 \| Sourcewww topic:4` |
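The difference between string escaping and the string splitting used for "_text" can be illustrated with a small Python sketch. It mirrors only what the table shows ("New York" becoming the single token New_York, and "elections US iowa" becoming three tokens). The function names are hypothetical and this is not the VW parser itself.

```python
def escape_string_value(value):
    """Regular string feature: keep it as one token by replacing spaces."""
    return value.replace(" ", "_")


def split_text_value(value):
    """"_text" property: split on whitespace so each word is its own feature."""
    return value.split()


escaped = escape_string_value("New York")
tokens = split_text_value("elections US iowa")

print(escaped)
print(" ".join(tokens))
```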