Neural OT notebook: add more 2d datasets #303

Merged
merged 2 commits into from
May 30, 2023

Conversation

@bamos
Contributor

bamos commented Feb 16, 2023

Follow-up to #219. I quickly tried adding some other 2D datasets, and the neural solver with non-convex potentials and the default options seems relatively stable and near-optimal! I'm documenting this result as a PR since it was easy to try in the existing neural OT notebook using the w2ot dataloaders. Would you like me to create new loaders in ott.problems.nn.dataset (using sklearn and replacing the w2ot ones) so we can merge these results into the bottom of the notebook here? Or shall we keep the notebook/dataset smaller and not merge this PR in? One downside is that training on all of these makes the notebook take ~1-2 hours to execute rather than ~30 minutes. But it could be nice to show (and test) the stability of OTT's neural solver with the default options, and we could also report the estimated W2 distance in these settings in the hope that people find it a useful baseline (e.g., papers like this).

If you're interested in merging this in, I'll do another pass and fill in the rest of the details, and then we can start a full review. I can also coordinate the updates to the data loader with #289.

image


These were first presented in generative modeling with OT maps:

image

I also show them in the amortizing convex conjugates paper:

image

@marcocuturi
Contributor

Thanks Brandon! I do think it's very nice to add other 2D examples in the notebook, I am sure they can be useful for research, thanks for this WIP! I took a quick look, and LGTM, but ping us when you feel you're closer to a version you want to submit.

@bamos
Contributor Author

bamos commented Feb 22, 2023

Ok, sounds good! I'll wait for #289 to get in first with some other data and updates to the neural solver, and will then finish this one to add these new datasets and updates to the notebook.

In addition to gaussian_mixture_samplers or uniform_mixture_samplers (in #289), I will probably add a function sklearn_samplers that will return these tasks, perhaps with a single argument name to be set to circles, moons, s-curve, or spiral. Let me know if you would prefer something else!

cc @michalk8

@bamos
Contributor Author

bamos commented Feb 22, 2023

And I can keep the sklearn dependency optional, following #304. I'm not sure what the best way of doing that is. Should I add sklearn to docs in pyproject.toml like here, and/or environment.yml like here?

@michalk8
Collaborator

> In addition to gaussian_mixture_samplers or uniform_mixture_samplers (in #289), I will probably add a function sklearn_samplers that will return these tasks, perhaps with a single argument name to be set to circles, moons, s-curve, or spiral. Let me know if you would prefer something else!

I'd avoid having sklearn as a dependency; I would just do this in the code once you decide to add it:

try:
    from sklearn... import ...
except ImportError:
    ...
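
A slightly fuller sketch of this guarded-import pattern, applied to the proposed `sklearn_samplers` function, might look like the following. Only the function name and the four dataset names come from the thread; the `batch_size` and `seed` parameters, the noise levels, the `_MAKERS` table, and the use of a 2D projection of `make_swiss_roll` for "spiral" are assumptions made for illustration.

```python
import itertools

try:
    from sklearn import datasets as sk_datasets
except ImportError:
    # scikit-learn stays an optional dependency: importing this module
    # still works, and the error is deferred until a sampler is requested.
    sk_datasets = None

# Hypothetical mapping from dataset name to a function (n, random_state) -> (n, 2) array.
_MAKERS = {
    "circles": lambda n, rs: sk_datasets.make_circles(
        n_samples=n, noise=0.05, factor=0.5, random_state=rs)[0],
    "moons": lambda n, rs: sk_datasets.make_moons(
        n_samples=n, noise=0.05, random_state=rs)[0],
    # make_s_curve and make_swiss_roll return 3D points; keep the x and z axes.
    "s-curve": lambda n, rs: sk_datasets.make_s_curve(
        n_samples=n, noise=0.05, random_state=rs)[0][:, [0, 2]],
    "spiral": lambda n, rs: sk_datasets.make_swiss_roll(
        n_samples=n, noise=0.05, random_state=rs)[0][:, [0, 2]],
}


def sklearn_samplers(name, batch_size=512, seed=0):
    """Return an endless iterator of (batch_size, 2) sample batches (assumed API)."""
    if sk_datasets is None:
        raise ImportError(
            "scikit-learn is required for the 2D toy datasets; "
            "install it with `pip install scikit-learn`."
        )
    if name not in _MAKERS:
        raise ValueError(f"Unknown dataset {name!r}; expected one of {sorted(_MAKERS)}.")
    make = _MAKERS[name]

    def batches():
        # A fresh random_state per step gives a new batch each time.
        for step in itertools.count():
            yield make(batch_size, seed + step)

    return batches()
```

Validating the name and the import eagerly, before the iterator is created, means a missing package is reported at the call site rather than on the first `next(...)`.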

> And I can keep the sklearn dependency optional, following #304. I'm not sure what the best way of doing that is. Should I add sklearn to docs in pyproject.toml like here, and/or environment.yml like here?

I think this can be kept as a silent dependency (i.e., not in pyproject.toml's extra requirements). The environment.yml is used to set up Binder (and should possibly contain the union of all imports used in the notebooks; I will look into it in the future).

@marcocuturi
Contributor

Hi @bamos! Should we push what's currently in the PR (except maybe the interrupted cell :) )?

@bamos
Contributor Author

bamos commented Apr 22, 2023

Sorry for the delays! I'll do a quick pass through it within a few days, and then we can merge it.

@bamos changed the title from "[WIP/tentative] neural OT notebook: add more 2d datasets" to "Neural OT notebook: add more 2d datasets" on Apr 24, 2023
@bamos
Contributor Author

bamos commented Apr 24, 2023

Hi @marcocuturi @michalk8, I just finalized the notebook here and it's ready for a review/merge. Can you take a look when you get a chance? Instead of adding the dataloaders to OTT, I kept the reference to the w2ot dataloaders in the notebook and added an error in case w2ot is not installed.
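
One way such a notebook guard could be written is sketched below. It is not taken from the merged notebook: the helper name `require_optional`, the error wording, and the assumption that the loaders are importable under the top-level module name `w2ot` are all hypothetical.

```python
import importlib.util


def require_optional(module_name, hint):
    """Raise a readable ImportError if a top-level optional module is missing.

    `module_name` must be a top-level name (no dots), because
    `find_spec` imports parent packages when given a dotted path.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"This section needs the optional package `{module_name}`. {hint}"
        )


# In the notebook, a call like this (module name assumed) would sit
# right before the cells that load the extra 2D datasets:
# require_optional("w2ot", "Install it before running the cells below.")
```

Checking with `find_spec` fails fast with a clear message, before any long training cell starts running.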

@codecov-commenter

codecov-commenter commented Apr 24, 2023

Codecov Report

Merging #303 (44f3a4f) into main (2abdd72) will decrease coverage by 0.55%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##             main     #303      +/-   ##
==========================================
- Coverage   88.75%   88.20%   -0.55%     
==========================================
  Files          52       51       -1     
  Lines        5657     5530     -127     
  Branches      864      831      -33     
==========================================
- Hits         5021     4878     -143     
- Misses        509      532      +23     
+ Partials      127      120       -7     

see 23 files with indirect coverage changes
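
The figures in the table above can be checked with a few lines of arithmetic. The sketch below assumes coverage is computed as hits divided by total lines, and that the displayed percentages are truncated (not rounded) to two decimals; the truncation is an inference from these particular numbers, not something stated in the report.

```python
import math

# Rows copied from the coverage table above.
main = {"hits": 5021, "misses": 509, "partials": 127, "lines": 5657}
pr = {"hits": 4878, "misses": 532, "partials": 120, "lines": 5530}


def coverage_percent(report):
    """Coverage as hits / lines, after checking that the row adds up."""
    total = report["hits"] + report["misses"] + report["partials"]
    if total != report["lines"]:
        raise ValueError("hits + misses + partials should equal lines")
    return 100.0 * report["hits"] / report["lines"]


def truncate2(value):
    """Drop everything past the second decimal place (assumed display rule)."""
    return math.floor(value * 100) / 100


main_cov = truncate2(coverage_percent(main))
pr_cov = truncate2(coverage_percent(pr))
print(main_cov, pr_cov, round(pr_cov - main_cov, 2))
```

Under these assumptions the computed values agree with the reported 88.75%, 88.20%, and the 0.55 point decrease.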

@marcocuturi merged commit 4ddf509 into ott-jax:main on May 30, 2023
@marcocuturi
Contributor

Thanks Brandon!

@michalk8 added the "enhancement" label (New feature or request) on May 31, 2023
michalk8 pushed a commit that referenced this pull request on Jun 27, 2024
* neural OT notebook: add more 2d datasets

* neural_dual: add comment about final 2d datasets and re-run