Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Use use_auth_token in all cases when loading from the HF Hub #3094

Merged
merged 17 commits into from
Aug 25, 2022

Conversation

sjrl
Copy link
Contributor

@sjrl sjrl commented Aug 24, 2022

Related Issues

  • fixes the issue of not being able to load private models from HF for various models throughout Haystack (e.g. FARMReader and Retriever)

Proposed Changes:

I've passed the use_auth_token to the HuggingFace methods that accept it (e.g. from_petrained() and pipeline).

How did you test it?

I was able to load a private Reader model from HF after the changes were made

from haystack.nodes import FARMReader
reader = FARMReader(model_name_or_path="private/model", use_gpu=True, use_auth_token=True)

Notes for the reviewer

use_auth_token is not consistently used throughout Haystack. This PR tries to use use_auth_token consistently in all places in Haystack when loading models from the HF hub.

Checklist

@sjrl sjrl marked this pull request as ready for review August 24, 2022 12:13
@sjrl sjrl requested review from a team as code owners August 24, 2022 12:13
@sjrl sjrl requested review from vblagoje and removed request for a team August 24, 2022 12:13
@sjrl sjrl changed the title fix: Pass use_auth_token to from_pretrained fix: Use use_auth_token in all cases when loading from the HF Hub Aug 24, 2022
Copy link
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good to me already. Just some small remarks to further improve. Further, I was wondering whether we could add a test case to make sure a problem with auth tokens is recognized faster in future. The auth token could be stored as a GitHub secret.

haystack/nodes/reader/farm.py Outdated Show resolved Hide resolved
haystack/pipelines/base.py Outdated Show resolved Hide resolved
haystack/utils/docker.py Outdated Show resolved Hide resolved
Copy link
Contributor

@ZanSara ZanSara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a change! That's no surprise no one wanted to do this before 😄 Looks good to me, except for the pipes, that need to be reverted. @vblagoje do you want to have a look too?

haystack/nodes/reader/table.py Show resolved Hide resolved
haystack/nodes/retriever/dense.py Outdated Show resolved Hide resolved
@sjrl
Copy link
Contributor Author

sjrl commented Aug 24, 2022

Further, I was wondering whether we could add a test case to make sure a problem with auth tokens is recognized faster in future. The auth token could be stored as a GitHub secret.

We could definitely do that. However, to completely test this I think we would need to have a private HF model for all cases we want to test (e.g. Reader, Retriever, Summarizer, AnswerGenerator, etc.).

@sjrl
Copy link
Contributor Author

sjrl commented Aug 24, 2022

Further, I was wondering whether we could add a test case to make sure a problem with auth tokens is recognized faster in future. The auth token could be stored as a GitHub secret.

We could definitely do that. However, to completely test this I think we would need to have a private HF model for all cases we want to test (e.g. Reader, Retriever, Summarizer, AnswerGenerator, etc.).

@julian-risch Is it alright if we open this as a new issue? It may be worth discussing how we would like to best test this across Haystack.

@julian-risch julian-risch self-requested a review August 24, 2022 15:45
Copy link
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now. 👍

@sjrl sjrl merged commit 0cf0568 into main Aug 25, 2022
@sjrl sjrl deleted the fix-auth_token branch August 25, 2022 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants