
maml_omniglot model with functorch #1179

Closed · 2 commits

Conversation

@zou3519 (Contributor) commented Sep 14, 2022

This is very similar to the maml_omniglot model already in torchbench, which this model was originally based on. We also load some sample inputs from there because the model takes the same inputs.

Test Plan:

  • python test.py -k TestBenchmark.test_functorch_maml_omniglot_train_cuda
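For context, here is a minimal sketch of what a functorch-based MAML training step looks like. This is illustrative only, not the exact code added in this PR; the names (net, x_spt, y_spt, x_qry, y_qry) and hyperparameters are assumptions.

    # Hypothetical sketch of a MAML inner/outer loop written with functorch
    # transforms; names and hyperparameters are illustrative, not taken from this PR.
    import torch
    import torch.nn.functional as F
    from functorch import make_functional, grad, vmap

    def maml_loss(net, x_spt, y_spt, x_qry, y_qry, inner_lr=0.4, inner_steps=1):
        # Turn the module into a pure function of (params, inputs).
        fnet, params = make_functional(net)

        def loss_fn(params, x, y):
            return F.cross_entropy(fnet(params, x), y)

        def adapt_and_eval(x_s, y_s, x_q, y_q):
            # Inner loop: a few SGD steps on one task's support set.
            adapted = params
            for _ in range(inner_steps):
                grads = grad(loss_fn)(adapted, x_s, y_s)
                adapted = tuple(p - inner_lr * g for p, g in zip(adapted, grads))
            # Outer objective: loss of the adapted parameters on the query set.
            return loss_fn(adapted, x_q, y_q)

        # vmap over the leading task (meta-batch) dimension of the inputs.
        per_task_loss = vmap(adapt_and_eval)(x_spt, y_spt, x_qry, y_qry)
        return per_task_loss.mean()

In a harness like torchbench, a loss like this would then be backpropagated into params and stepped with an ordinary meta-optimizer; the point of vmap is that the per-task inner loops run as one batched computation instead of a Python loop over tasks.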

@xuzhao9 (Contributor) left a comment


Overall LGTM. See inline comments about the batch size we should use.


    class Model(BenchmarkModel):
        task = OTHER.OTHER_TASKS
        DEFAULT_TRAIN_BSIZE = 1
@xuzhao9 (Contributor)

Is this the batch size used in the original training code? I am working on another version of maml/protonet that originates from fewshot (https://github.com/pytorch/benchmark/pull/1184/files#diff-9c209c92cea7ccad363e7d477e210a3120028678829b11086cd9772af97181eaR9), and it looks like the batch size should be larger than this.

@zou3519 (Contributor Author) commented Sep 20, 2022

I copied this from the existing maml_omniglot model:

    DEFAULT_TRAIN_BSIZE = 1
    DEFAULT_EVAL_BSIZE = 1

"batch_size" is a term that doesn't apply to the maml model in the traditional sense. The original maml omniglot (https://arxiv.org/pdf/1703.03400.pdf) was trained under 32 tasks (also referred to as meta batch-size) and done using both 1-shot and 5-shot.

The code provided in this PR uses 32 tasks and 5-shot. I can change DEFAULT_TRAIN_BSIZE to 32 instead and add a comment, if that sounds reasonable to you? (The first dimension of the inputs is the task dimension and has size 32.)
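To make the layout concrete, here is an illustrative sketch of the input shapes. The 32 tasks and 5-shot come from the discussion above; the 5-way setting and image size are assumptions based on the standard Omniglot setup, not stated in this thread.

    import torch

    # Illustrative shapes only: 32 tasks per meta-update (the "meta batch"),
    # assumed 5-way 5-shot, 28x28 grayscale Omniglot images.
    # Layout: [task, ways * shots, C, H, W] -- dim 0 is the task dimension.
    x_spt = torch.zeros(32, 5 * 5, 1, 28, 28)          # support images for inner-loop adaptation
    y_spt = torch.zeros(32, 5 * 5, dtype=torch.long)   # support labels
    # A DEFAULT_TRAIN_BSIZE of 32 would therefore count tasks, not individual images.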

@xuzhao9 (Contributor)

I see. The maml_omniglot and maml models were marked as "incorrect" in #328, so we created #1184 to try to use an actually correct implementation, but I haven't had time to get it working. I am okay with landing this PR for now, but I think we should investigate whether #1184 is a better implementation and, if so, rebase all the maml models onto fewshot.

@zou3519 (Contributor Author)

I will merge this for now. Happy to rebase this model if we do want to replace it with the one in the fewshot repo.

@facebook-github-bot (Contributor) commented:
@zou3519 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
