feat: optional transformers #5101

Merged
merged 26 commits into from
Jun 14, 2023

ZanSara
Contributor

@ZanSara ZanSara commented Jun 7, 2023

Proposed Changes:

Apparently PyScript (based on Pyodide) and PyTorch are currently incompatible: pyodide/pyodide#1625. However, we could still make some nice demos in PyScript with components that do not use transformers.

On the other hand, we don't want to remove transformers from the core dependencies, because that would overly restrict the number of nodes users can use out of the box.

This PR makes all imports of Torch and Transformers optional, but does NOT remove transformers from the list of core dependencies. This way, a casual user running pip install farm-haystack still gets transformers and pytorch, while a motivated power user who absolutely can't install transformers or torch can create a requirements.txt with the following content:

tokenizers
requests
pydantic
pandas
rank_bm25
scikit-learn>=1.0.0
lazy-imports==0.3.1
prompthub-py==4.0.0
platformdirs
tqdm
networkx
quantulum3
posthog
azure-ai-formrecognizer>=3.2.0b2
huggingface-hub>=0.5.0
tenacity
sseclient-py
more_itertools
boilerpy3
tiktoken>=0.3.2
jsonschema
canals==0.2.2
events
requests-cache<1.0.0
pillow
click

and then do:

pip install --no-deps farm-haystack && pip install -r requirements.txt to obtain a PyTorch-free version of Haystack.
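As a rough illustration of what "optional imports" means here, the sketch below defers the failure of a missing package from import time to the moment a component that needs it is created. It uses only the standard library and is not Haystack's actual implementation: `OptionalImport`, `TorchBackedNode`, and the install hint are hypothetical names chosen for this example.

```python
from typing import Optional


class OptionalImport:
    """Hypothetical helper: record an ImportError instead of raising it at import time."""

    def __init__(self, install_hint: str) -> None:
        self.install_hint = install_hint
        self.error: Optional[BaseException] = None

    def __enter__(self) -> "OptionalImport":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        # Swallow only import failures; any other exception propagates as usual.
        if exc_type is not None and issubclass(exc_type, ImportError):
            self.error = exc_value
            return True
        return False

    def check(self) -> None:
        """Raise a helpful error if the guarded import failed earlier."""
        if self.error is not None:
            raise ImportError(f"Optional dependency missing. {self.install_hint}") from self.error


# The module stays importable even when torch is not installed.
with OptionalImport("Run 'pip install torch' to use this node.") as torch_import:
    import torch  # noqa: F401


class TorchBackedNode:
    """Hypothetical component that needs torch only when it is instantiated."""

    def __init__(self) -> None:
        torch_import.check()
```

With this pattern, importing the module always succeeds, and only users who instantiate a torch-backed component without torch installed see an error, together with a hint on how to fix it.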

How did you test it?

  • CI
  • Local run of pip install --no-deps . && pip install -r requirements.txt && haystack
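One way to confirm that such a local install really is free of the heavy packages is to ask Python which of them can be found, without importing them. This is a minimal sketch under stated assumptions: the function name `installed` and the list of package names are choices made for this example, not part of the PR.

```python
import importlib.util
from typing import Iterable, List


def installed(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that can be found in this environment."""
    # find_spec locates a top-level module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is not None]


if __name__ == "__main__":
    # Assumed list of packages that a PyTorch-free install should not contain.
    heavy = ["torch", "transformers", "sentence_transformers"]
    found = installed(heavy)
    if found:
        print("Still installed:", ", ".join(found))
    else:
        print("None of the heavy packages are installed.")
```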

Notes for the reviewer

  1. Right now it contains changes from the PR mentioned above. Please hold off on reviewing until those are merged and this PR is rebased.
  2. Sorry, the PR is huge 😥 It's mostly just re-arranging imports though, so any functional change should be pointed out.

@github-actions github-actions bot added the type:documentation Improvements on the docs label Jun 7, 2023
@ZanSara ZanSara requested review from masci and removed request for vblagoje June 7, 2023 15:58
@coveralls
Collaborator

coveralls commented Jun 12, 2023

Pull Request Test Coverage Report for Build 5264914479

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 1977 unchanged lines in 32 files lost coverage.
  • Overall coverage increased (+0.2%) to 42.268%

Files with Coverage Reduction                              New Missed Lines   %
modeling/__init__.py                                       2                  66.67%
nodes/prompt/invocation_layer/handlers.py                  3                  94.34%
environment.py                                             5                  90.2%
nodes/prompt/prompt_model.py                               8                  80.85%
nodes/_json_schema.py                                      11                 87.2%
nodes/summarizer/transformers.py                           12                 84.38%
nodes/prompt/invocation_layer/hugging_face.py              16                 87.6%
nodes/translator/transformers.py                           17                 70.21%
nodes/prompt/invocation_layer/hugging_face_inference.py    19                 79.65%
nodes/image_to_text/transformers.py                        21                 36.73%

Totals
Change from base Build 5264489844: 0.2%
Covered Lines: 9528
Relevant Lines: 22542

💛 - Coveralls

@ZanSara ZanSara self-assigned this Jun 12, 2023
@julian-risch
Member

julian-risch commented Jun 13, 2023

@ZanSara Really looking forward to this PR! I tried out this branch and manually installed Haystack's dependencies including transformers but without pytorch, sentencepiece, sentence-transformers and huggingface-hub.

I was able to run examples/agent_multihop_qa.py. The only extra step was to run pip install Pillow, because the multimodal embedder wants to import it.


After that it worked as expected without any torch/cuda dependencies installed. 🎉

I'm not sure which of pytorch, sentencepiece, sentence-transformers and huggingface-hub usually installs Pillow. We should also make the Pillow import a lazy import. That can happen in a separate PR.

Once this PR here is merged, I will open one removing torch from Haystack's core dependencies.

@ZanSara ZanSara mentioned this pull request Jun 13, 2023
12 tasks
@ZanSara
Contributor Author

ZanSara commented Jun 13, 2023

Thanks for testing it! I'll update the PR description to include Pillow.

@ZanSara ZanSara requested review from julian-risch and removed request for masci June 14, 2023 07:53
Member

@julian-risch julian-risch left a comment

Looks very good to me and as I have already said earlier, I'm really looking forward to it! 👍 I have a few minor questions in the review comments. Once we have sorted them out, this PR will be ready to be merged.

haystack/nodes/audio/whisper_transcriber.py (outdated, resolved)
haystack/nodes/file_converter/pdf.py (outdated, resolved)
@@ -3,21 +3,25 @@
import logging
from pathlib import Path

import torch
from tqdm.auto import tqdm
import numpy as np
from PIL import Image
Member

Let's double check whether this needs a LazyImport too. I'm guessing it does.

Contributor Author

For now I'd just install it; Pillow is not huge. But in the spirit of making all dependencies somewhat optional in the future, we could move it out as well. Let's do it in another PR.

haystack/modeling/__init__.py (outdated, resolved)
haystack/utils/experiment_tracking.py (outdated, resolved)
Member

@julian-risch julian-risch left a comment

Ready to be merged now! ✅ Thanks for addressing the change requests so quickly. 👍

@ZanSara ZanSara merged commit 20c1f23 into main Jun 14, 2023
@ZanSara ZanSara deleted the optional-transformers branch June 14, 2023 10:00