Remarks on benchmark problems #3

Open
gdalle opened this issue May 6, 2024 · 3 comments

@gdalle (Member) commented May 6, 2024

Interface

  • Document what the generate_* functions do
    • generate_maximizer does not return a differentiable layer
    • in particular, for generate_maximizer, document the signature (args and kwargs) of the returned closure
  • How to include losses in addition to CO layers?
    • A callable struct that combines model, CO layer and loss? Not ideal; better to leave the ingredients separate
  • Add a function for turnkey training? (see the sketch after this list)
    • Or a struct that stores the whole dataset
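
A minimal sketch of what turnkey training could look like with the ingredients kept separate: the statistical model, a differentiable CO layer, the loss and the data are passed as independent arguments. The name `train!` and its keyword arguments are assumptions for illustration, not an existing API.

```julia
# Hypothetical sketch, not the package API: turnkey training with the
# statistical model, the (differentiable) CO layer, the loss and the data
# kept as separate ingredients.
using Flux: Adam, setup, update!, gradient

function train!(model, layer, loss, dataset; epochs=10, optimiser=Adam(1e-3))
    opt_state = setup(optimiser, model)
    for _ in 1:epochs, (x, y_true) in dataset
        grads = gradient(model) do m
            theta = m(x)          # statistical model predicts costs
            y = layer(theta)      # differentiable wrapper around the CO maximizer
            loss(y, y_true)       # task loss on the decision
        end
        update!(opt_state, model, grads[1])
    end
    return model
end
```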

Getting data

  • DataDeps.jl
  • DataToolkit.jl
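
For example, registering a data dependency with DataDeps.jl could look like the following; the dataset name, message and URL are placeholders, not a real registration (a real one would also pin a checksum).

```julia
# Hypothetical registration: name, message and URL are placeholders.
using DataDeps

register(DataDep(
    "ShortestPathMaps",
    "Terrain maps and shortest-path labels used by the benchmark.",
    "https://example.com/shortest_path_maps.tar.gz";
    post_fetch_method=unpack,   # decompress the archive after download
))

# downloaded lazily on first access, cached afterwards
data_dir = datadep"ShortestPathMaps"
```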

Data sources

Problem meaning

  • Subset selections:
    • artificial split: top $k$ becomes 1) a linear model to get costs, followed by 2) an LP (see the sketch after this list)
    • the optimal statistical model is the identity (but the learner doesn't know that)
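
A rough sketch of that artificial split, where the LP's optimal vertex reduces to picking the $k$ largest predicted values; all names here are made up for illustration.

```julia
# Illustration of the artificial top-k split: a linear model predicts item
# values theta, then the "LP" layer selects the k largest entries.
using Random

function top_k_maximizer(theta::AbstractVector; k::Int)
    y = zeros(length(theta))
    y[partialsortperm(theta, 1:k; rev=true)] .= 1.0   # indicator of the k best items
    return y
end

rng = Xoshiro(0)
features = randn(rng, 20)     # features coincide with the true values: identity is optimal
W = randn(rng, 20, 20)        # linear statistical model to be learned
solution = top_k_maximizer(W * features; k=5)
```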

Varying instance sizes

  • Modify ShortestPathBenchmark to draw a random grid size from specified ranges of height and width, then see what you need in the interface to make it work (sketched below)
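
One possible shape for this; the field names and the generate_dataset signature are assumptions, not the current interface.

```julia
# Hypothetical sketch of a benchmark with varying instance sizes.
using Random

struct ShortestPathBenchmark
    height_range::UnitRange{Int}
    width_range::UnitRange{Int}
end

function generate_dataset(bench::ShortestPathBenchmark, n::Int; rng=Random.default_rng())
    return map(1:n) do _
        h = rand(rng, bench.height_range)
        w = rand(rng, bench.width_range)
        costs = rand(rng, h, w)             # per-cell traversal costs
        (; grid_size=(h, w), costs=costs)   # samples may have different sizes
    end
end

dataset = generate_dataset(ShortestPathBenchmark(8:16, 8:16), 100)
```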
gdalle changed the title from "Use DataDeps.jl or DataToolkit.jl to get datasets" to "Remarks on benchmark problems" on May 23, 2024
@gdalle (Member, Author) commented Jun 27, 2024

Don't define the training pipelines here. We can put them in an InferOpt package extension that relies on InferOptBenchmarks to train our pipelines.

Imagine we define these benchmarks for a comparison between InferOpt and a competitor. Each benchmark would provide:

  • data
  • encoder
  • optimizer
  • performance metric

InferOptBenchmarks should not depend on InferOpt. Maybe we should rename it to "DecisionFocusedLearningBenchmarks".

Take inspiration from https://github.com/JuliaSmoothOptimizers/OptimizationProblems.jl
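
For illustration, the four ingredients listed above could be bundled without any InferOpt dependency; the struct and function names below are assumptions, not an existing API.

```julia
# Hypothetical bundle of the four benchmark ingredients.
struct Benchmark{D,E,O,M}
    dataset::D      # data: vector of (features, true_solution) samples
    encoder::E      # statistical model mapping features to costs theta
    maximizer::O    # combinatorial optimizer mapping theta to a solution
    metric::M       # performance metric, e.g. an average optimality gap
end

# evaluate a (trained or untrained) encoder on the benchmark
function evaluate(b::Benchmark)
    predictions = [b.maximizer(b.encoder(x)) for (x, _) in b.dataset]
    targets = [y for (_, y) in b.dataset]
    return b.metric(predictions, targets)
end
```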

@gdalle (Member, Author) commented Jun 27, 2024

In the docs and tests of this package, use a black-box optimizer in the pipeline (no autodiff shenanigans) to learn without depending on InferOpt.
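
For instance, a test could fit a linear encoder with a gradient-free method such as Optim.jl's Nelder-Mead, never touching autodiff or InferOpt; the toy data and decision rule below are placeholders.

```julia
# Sketch of learning through the pipeline with a black-box optimizer only.
using Optim, Random
using LinearAlgebra: dot

rng = Xoshiro(0)
samples = [(randn(rng, 10), randn(rng, 10)) for _ in 1:50]   # (features, true costs) pairs, placeholder data

# stand-in for a CO maximizer: pick the single best item (maximization direction)
maximizer(theta) = (y = zeros(length(theta)); y[argmax(theta)] = 1.0; y)

# average optimality gap of the decisions induced by a linear encoder W
function average_gap(w::AbstractVector)
    W = reshape(w, 10, 10)
    gaps = [dot(c, maximizer(c)) - dot(c, maximizer(W * x)) for (x, c) in samples]
    return sum(gaps) / length(gaps)
end

result = optimize(average_gap, randn(rng, 100), NelderMead())   # gradient-free search
W_learned = reshape(Optim.minimizer(result), 10, 10)
```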

@gdalle (Member, Author) commented Aug 29, 2024

Today's meeting notes:

  • Dedicated struct for a sample instead of the dataset (no need to fake a vector type).
  • Add an rng to model generation, because Flux networks contain their parameters.
  • Possibly useful to pass the instance as a kwarg to the model. Flux doesn't allow it, but GraphNeuralNetworks.jl would, for example.
  • Avoid returning closures from generate_maximizer; use a callable struct instead (see the sketch after this list).
  • Keep every algorithm in the maximization direction.
  • theta and friends don't have to be arrays all the time.
  • Rename compute_gap to average_gap.
  • Avoid dispatching on AbstractArray; Nothing will error anyway.
  • Be careful with paths to the data.
  • For tests, don't worry about correctness of the InferOpt pipeline and learning.
    • Mostly basic checks on sizes and types.
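
A sketch combining three of these points (a callable struct instead of a closure, an instance keyword accepted by the maximizer, and an rng passed to model generation), with all names assumed for illustration rather than taken from the package interface.

```julia
# Hypothetical sketch, not the package interface.
using Flux, Random

# Callable struct instead of a closure: its fields are documented, it prints
# nicely, and it can be serialised.
struct GridMaximizer
    grid_size::Tuple{Int,Int}
end

function (m::GridMaximizer)(theta::AbstractMatrix; instance=nothing, kwargs...)
    # placeholder decision rule standing in for a real solver, kept in the
    # maximization direction: pick the best cell in each row
    y = zeros(size(theta))
    for i in axes(theta, 1)
        y[i, argmax(view(theta, i, :))] = 1.0
    end
    return y
end

# rng-aware model generation: the network's initial parameters are reproducible
function generate_statistical_model(input_dim::Int; rng=Random.default_rng())
    return Chain(Dense(input_dim => input_dim; init=Flux.glorot_uniform(rng)))
end
```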
