Installing Haystack is really hard. It shouldn't be. This issue is about tracking efforts to remove dependencies from the core list and try-catch import statements.
Here is a list of all the dependencies we need to act on. Wrap the imports of these dependencies in try-except blocks aggressively, and add an error message that names the dedicated extra to install. The goal is to be able to run import haystack without a given dependency installed, and to have it fail only when a specific class that needs it is initialized.
Extract Transformers. The heaviest and hardest to extract. The original plan was to make a single extra, transformers, for all the packages involved. Packages to address are (likely) transformers[torch], huggingface-hub, sentence-transformers and transformers. However, we decided not to remove transformers from the core dependencies. Even HFInvocationLayer in PromptNode requires it, so the only thing one could do without transformers is PromptNode with OpenAI. Let's focus on other topics instead, such as making the torch installation optional.
quantulum3: We're using its parser in our TableReader to check if all cells contain at least one numerical value and that all values share the same unit. That functionality could become part of Haystack's utility functions, but it's more than just a few lines, so I would keep it as is.
The resulting list of core dependencies will look like:
"requests", # API calls
"pydantic", # for the primitive classes
"pandas", # basic dataframes handling
"rank_bm25", # for InMemoryDocumentStore
"scikit-learn", # tfidf, sklearnqueryclassifier and some other metrics -> might be removed later
"tqdm", # progress bars in model download and training scripts
"networkx", # graphs library
"posthog", # telemetry
"tenacity", # retry decorator
"jsonschema", # Schema validation
"more_itertools" # Utilities
The above happily installs in a couple of seconds. Let's get to it! 🚀
Steps
- tiktoken (the issue has been resolved): build: Remove tiktoken alternative #4991
- elasticsearch: #4667
- Extract Transformers. The heaviest and hardest to extract. Let's make a single extra transformers for all the packages involved. We decided not to remove transformers from the core dependencies. Even HFInvocationLayer in PromptNode requires it, so the only thing one could do without transformers is PromptNode with OpenAI. Let's focus on other topics instead, such as making the torch installation optional. Packages to address are (likely):
  - transformers[torch]: Related to Allow exclusion of torch GPU dependencies during installation #4233
  - huggingface-hub
  - sentence-transformers
  - transformers #5101
- metrics extra
  - scipy
  - seqeval
  - mlflow
  - rapidfuzz
- preprocessing and file-conversion
  - nltk
  - python-docx
  - langdetect
  - tika
- pandas: It's not only required for tables but also for pipeline evaluation. Eval results are stored in dataframes.
- azure-ai-formrecognizer

Other libraries to investigate
- mmh3: Remove mmh3 dependency #4847
- protobuf: Install protobuf through transformers[sentencepiece] extra #4988
- dill: build: Remove dill dependency #4985
- quantulum3: We're using its parser in our TableReader to check if all cells contain at least one numerical value and that all values share the same unit. That functionality could become part of Haystack's utility functions, but it's more than just a few lines, so I would keep it as is.
- tiktoken: needed for ~3 times faster tokenization, for example with PromptNode (see build: Remove tiktoken alternative #4991)