Remarks on benchmark problems #3

Open
gdalle opened this issue May 6, 2024 · 3 comments

@gdalle (Member) commented May 6, 2024

Interface

  • Document what the generate_* functions do
    • generate_maximizer does not return a differentiable layer
    • in particular, for generate_maximizer, document the signature (args and kwargs) of the returned closure
  • How to include losses in addition to CO layers?
    • A callable struct that combines model, CO layer and loss? Not ideal; better to leave the ingredients separate
  • Add a function for turnkey training? (see the sketch after this list)
    • Or a struct that stores the whole dataset
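
A minimal sketch of what turnkey training could look like with the ingredients kept separate: the statistical model, a differentiable CO layer, the loss and the data are passed as independent arguments. The name `train!` and its keyword arguments are assumptions for illustration, not an existing API.

```julia
# Hypothetical sketch, not the package API: turnkey training with the
# statistical model, the (differentiable) CO layer, the loss and the data
# kept as separate ingredients.
using Flux: Adam, setup, update!, gradient

function train!(model, layer, loss, dataset; epochs=10, optimiser=Adam(1e-3))
    opt_state = setup(optimiser, model)
    for _ in 1:epochs, (x, y_true) in dataset
        grads = gradient(model) do m
            theta = m(x)          # statistical model predicts costs
            y = layer(theta)      # differentiable wrapper around the CO maximizer
            loss(y, y_true)       # task loss on the decision
        end
        update!(opt_state, model, grads[1])
    end
    return model
end
```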

Getting data

  • DataDeps.jl
  • DataToolkit.jl
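
For example, registering a data dependency with DataDeps.jl could look like the following; the dataset name, message and URL are placeholders, not a real registration (a real one would also pin a checksum).

```julia
# Hypothetical registration: name, message and URL are placeholders.
using DataDeps

register(DataDep(
    "ShortestPathMaps",
    "Terrain maps and shortest-path labels used by the benchmark.",
    "https://example.com/shortest_path_maps.tar.gz";
    post_fetch_method=unpack,   # decompress the archive after download
))

# downloaded lazily on first access, cached afterwards
data_dir = datadep"ShortestPathMaps"
```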

Data sources

Problem meaning

  • Subset selections:
    • artificial split: top $k$ becomes 1) a linear model to get costs, followed by 2) an LP (see the sketch after this list)
    • the optimal statistical model is the identity (but the learner doesn't know that)
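
A rough sketch of that artificial split, where the LP's optimal vertex reduces to picking the $k$ largest predicted values; all names here are made up for illustration.

```julia
# Illustration of the artificial top-k split: a linear model predicts item
# values theta, then the "LP" layer selects the k largest entries.
using Random

function top_k_maximizer(theta::AbstractVector; k::Int)
    y = zeros(length(theta))
    y[partialsortperm(theta, 1:k; rev=true)] .= 1.0   # indicator of the k best items
    return y
end

rng = Xoshiro(0)
features = randn(rng, 20)     # features coincide with the true values: identity is optimal
W = randn(rng, 20, 20)        # linear statistical model to be learned
solution = top_k_maximizer(W * features; k=5)
```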

Varying instance sizes

  • Modify ShortestPathBenchmark to draw a random grid size from specified ranges of height and width, then see what you need in the interface to make it work (sketched below)
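
One possible shape for this; the field names and the generate_dataset signature are assumptions, not the current interface.

```julia
# Hypothetical sketch of a benchmark with varying instance sizes.
using Random

struct ShortestPathBenchmark
    height_range::UnitRange{Int}
    width_range::UnitRange{Int}
end

function generate_dataset(bench::ShortestPathBenchmark, n::Int; rng=Random.default_rng())
    return map(1:n) do _
        h = rand(rng, bench.height_range)
        w = rand(rng, bench.width_range)
        costs = rand(rng, h, w)             # per-cell traversal costs
        (; grid_size=(h, w), costs=costs)   # samples may have different sizes
    end
end

dataset = generate_dataset(ShortestPathBenchmark(8:16, 8:16), 100)
```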
gdalle changed the title from "Use DataDeps.jl or DataToolkit.jl to get datasets" to "Remarks on benchmark problems" on May 23, 2024
@gdalle (Member, Author) commented Jun 27, 2024

Don't define the training pipelines here. We can put them in an InferOpt package extension that relies on InferOptBenchmarks to train our pipelines.

Imagine we define these benchmarks for a comparison between InferOpt and a competitor. Each benchmark would provide:

  • data
  • encoder
  • optimizer
  • performance metric

InferOptBenchmarks should not depend on InferOpt. Maybe we should rename it to "DecisionFocusedLearningBenchmarks".

Take inspiration from https://github.com/JuliaSmoothOptimizers/OptimizationProblems.jl
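
For illustration, the four ingredients listed above could be bundled without any InferOpt dependency; the struct and function names below are assumptions, not an existing API.

```julia
# Hypothetical bundle of the four benchmark ingredients.
struct Benchmark{D,E,O,M}
    dataset::D      # data: vector of (features, true_solution) samples
    encoder::E      # statistical model mapping features to costs theta
    maximizer::O    # combinatorial optimizer mapping theta to a solution
    metric::M       # performance metric, e.g. an average optimality gap
end

# evaluate a (trained or untrained) encoder on the benchmark
function evaluate(b::Benchmark)
    predictions = [b.maximizer(b.encoder(x)) for (x, _) in b.dataset]
    targets = [y for (_, y) in b.dataset]
    return b.metric(predictions, targets)
end
```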

@gdalle (Member, Author) commented Jun 27, 2024

In the docs and tests of this package, use a black-box optimizer in the pipeline (no autodiff shenanigans) to learn without depending on InferOpt.
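
For instance, a test could fit a linear encoder with a gradient-free method such as Optim.jl's Nelder-Mead, never touching autodiff or InferOpt; the toy data and decision rule below are placeholders.

```julia
# Sketch of learning through the pipeline with a black-box optimizer only.
using Optim, Random
using LinearAlgebra: dot

rng = Xoshiro(0)
samples = [(randn(rng, 10), randn(rng, 10)) for _ in 1:50]   # (features, true costs) pairs, placeholder data

# stand-in for a CO maximizer: pick the single best item (maximization direction)
maximizer(theta) = (y = zeros(length(theta)); y[argmax(theta)] = 1.0; y)

# average optimality gap of the decisions induced by a linear encoder W
function average_gap(w::AbstractVector)
    W = reshape(w, 10, 10)
    gaps = [dot(c, maximizer(c)) - dot(c, maximizer(W * x)) for (x, c) in samples]
    return sum(gaps) / length(gaps)
end

result = optimize(average_gap, randn(rng, 100), NelderMead())   # gradient-free search
W_learned = reshape(Optim.minimizer(result), 10, 10)
```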

@gdalle (Member, Author) commented Aug 29, 2024

Today's meeting notes:

  • Dedicated struct for a sample instead of the dataset (no need to fake a vector type).
  • Add an rng to model generation, because Flux networks contain their parameters.
  • Possibly useful to pass the instance as a kwarg to the model. Flux doesn't allow it, but GraphNeuralNetworks.jl would, for example.
  • Avoid returning closures from generate_maximizer; use a callable struct instead (see the sketch after this list).
  • Keep every algorithm in the maximization direction.
  • theta and friends don't have to be arrays all the time.
  • Rename compute_gap to average_gap.
  • Avoid dispatching on AbstractArray; Nothing will error anyway.
  • Be careful with paths to the data.
  • For tests, don't worry about correctness of the InferOpt pipeline and learning.
    • Mostly basic checks on sizes and types.
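
A sketch combining three of these points (a callable struct instead of a closure, an instance keyword accepted by the maximizer, and an rng passed to model generation), with all names assumed for illustration rather than taken from the package interface.

```julia
# Hypothetical sketch, not the package interface.
using Flux, Random

# Callable struct instead of a closure: its fields are documented, it prints
# nicely, and it can be serialised.
struct GridMaximizer
    grid_size::Tuple{Int,Int}
end

function (m::GridMaximizer)(theta::AbstractMatrix; instance=nothing, kwargs...)
    # placeholder decision rule standing in for a real solver, kept in the
    # maximization direction: pick the best cell in each row
    y = zeros(size(theta))
    for i in axes(theta, 1)
        y[i, argmax(view(theta, i, :))] = 1.0
    end
    return y
end

# rng-aware model generation: the network's initial parameters are reproducible
function generate_statistical_model(input_dim::Int; rng=Random.default_rng())
    return Chain(Dense(input_dim => input_dim; init=Flux.glorot_uniform(rng)))
end
```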
