Strange dependencies? #16

Closed
flatsiedatsie opened this issue Mar 6, 2024 · 4 comments

flatsiedatsie commented Mar 6, 2024

I'm installing this via pip, and I'm amazed at the number of modules it pulls in. Does it really need this many?

pip3 install styletts2 -t lib --no-cache-dir --prefix "" --default-timeout=180 --upgrade

For example, why is it installing these?

Jinja2
google-cloud-core
google-auth
google-resumable-media
googleapis-common-protos
pytz
fonttools
huggingface-hub

The full list:

pytz, python-crfsuite, pydub, mpmath, gruut-lang-en, docopt, zipp, urllib3, tzlocal, typing-extensions, tqdm, threadpoolctl, tenacity, sympy, sniffio, six, safetensors, regex, PyYAML, pyparsing, pygments, pycparser, pyasn1, psutil, protobuf, platformdirs, pillow, packaging, orjson, numpy, num2words, networkx, mypy-extensions, munch, multidict, msgpack, mdurl, MarkupSafe, llvmlite, lazy-loader, kiwisolver, jsonpointer, joblib, jmespath, idna, gruut-ipa, greenlet, google-crc32c, fsspec, frozenlist, fonttools, filelock, exceptiongroup, einops, decorator, cycler, click, charset-normalizer, certifi, cachetools, Babel, audioread, attrs, async-timeout, annotated-types, yarl, typing-inspect, SQLAlchemy, soxr, scipy, rsa, requests, python-dateutil, pydantic-core, pyasn1-modules, numba, nltk, marshmallow, markdown-it-py, jsonpatch, jsonlines, jinja2, importlib-resources, googleapis-common-protos, google-resumable-media, einops-exts, contourpy, cffi, anyio, aiosignal, torch, soundfile, scikit-learn, rich, pydantic, pooch, matplotlib, huggingface-hub, google-auth, dateparser, dataclasses-json, botocore, aiohttp, torchaudio, tokenizers, s3transfer, librosa, langsmith, gruut, google-api-core, accelerate, transformers, langchain-core, google-cloud-core, boto3, langchain-text-splitters, langchain-community, google-cloud-storage, langchain, cached-path, styletts2

// It's done installing.. 900MB :-D
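One way to answer "why is it installing X?" is to ask the installed package metadata which distributions declare X as a requirement. The following is a minimal sketch using only the standard library's `importlib.metadata`; the helper names `requirement_name` and `who_requires` are made up for illustration, and environment markers and extras are ignored, so optional requirements are reported too.

```python
import re
from importlib import metadata


def requirement_name(requirement: str) -> str:
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    # Normalize like package indexes do: case-insensitive, -/_/. are equivalent.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def who_requires(target: str, search_path=None) -> list:
    """List installed distributions that declare `target` as a requirement.

    `search_path` is an optional list of directories, e.g. ["lib"] for a
    `pip install -t lib` target; by default sys.path is searched.
    """
    wanted = requirement_name(target)
    if search_path is None:
        dists = metadata.distributions()
    else:
        dists = metadata.distributions(path=list(search_path))
    parents = set()
    for dist in dists:
        name = dist.metadata["Name"]
        if not name:
            continue  # skip distributions with broken metadata
        for req in dist.requires or []:
            try:
                if requirement_name(req) == wanted:
                    parents.add(name)
            except ValueError:
                continue  # ignore requirement strings this sketch cannot parse
    return sorted(parents)


if __name__ == "__main__":
    for package in ("google-auth", "boto3", "huggingface-hub"):
        print(package, "<-", who_requires(package, search_path=["lib"]))
```

Run against the `lib` target directory from the command above, this should show which parent package is responsible for each surprising entry.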

@sidharthrajaram
Owner

Hey @flatsiedatsie, thanks for the note. A lot of these dependencies (especially the google modules and huggingface-hub) stem from AllenAI's cached-path package. It provides a fairly standardized way to cache model checkpoint paths. A good alternative would be a package with just the local caching functionality (in other words, cached-path without the support for S3/GCS/HF), but I haven't found one yet.

I'll keep an eye out for a way to replace cached-path if a good alternative comes up (or just re-implement the caching based on cached-path).
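For reference, the "local caching only" idea can be sketched with the standard library alone. This is not cached-path's implementation and not what styletts2 does; `cached_download` and `DEFAULT_CACHE_DIR` are hypothetical names, and the sketch skips things a real replacement would probably want (ETag checks, file locking, archive extraction).

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path

# Hypothetical default location; a real package would likely make this configurable.
DEFAULT_CACHE_DIR = Path.home() / ".cache" / "checkpoints"


def cached_download(url: str, cache_dir=DEFAULT_CACHE_DIR) -> Path:
    """Download `url` once and return the local path of the cached copy."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Use a hash of the URL as the file name so any URL maps to a safe name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    target = cache_dir / key
    if target.exists():
        return target  # cache hit: no network access

    # Download to a temporary name first so an interrupted download
    # never leaves a truncated file under the final name.
    partial = target.with_name(key + ".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target
```

The first call for a given checkpoint URL downloads the file; later calls return the same path straight from disk, with no third-party packages involved.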

@flatsiedatsie
Author

Ah, I see. Thanks for the explanation!

@sidharthrajaram
Owner

sidharthrajaram commented Mar 11, 2024

@flatsiedatsie Just wanted to follow up on this. After looking into it some more, I think replacing cached-path might be the right move eventually. It pulls in far too many dependencies, and I'm beginning to encounter conflicts due to boto3 dependencies (even though we're not even using S3 for loading checkpoints!).

Ideally, cached-path would make the GCS/S3/HF dependencies optional. I've opened an issue with them (allenai/cached_path#223); feel free to keep an eye on it.
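To illustrate what "optional" could look like at the code level: a common pattern is to import a backend only when a URL actually needs it, so users who only load local or HTTP checkpoints never need boto3 or the Google client installed. The sketch below is an assumption about how such a design might work, not cached-path's actual code; `OPTIONAL_BACKENDS` and `load_backend` are hypothetical names.

```python
import importlib
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to the optional module that handles it.
OPTIONAL_BACKENDS = {
    "s3": "boto3",
    "gs": "google.cloud.storage",
    "hf": "huggingface_hub",
}


def load_backend(url: str, backends=None):
    """Import and return the module needed for `url`, or None if none is needed."""
    backends = OPTIONAL_BACKENDS if backends is None else backends
    scheme = urlparse(url).scheme
    module_name = backends.get(scheme)
    if module_name is None:
        return None  # local path or plain http(s): nothing extra to import
    try:
        # Imported lazily, so a missing package only matters for this scheme.
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{scheme}:// URLs need the optional package providing "
            f"'{module_name}', which is not installed"
        ) from exc
```

With this shape, the heavy packages would move out of the required dependencies and into optional extras that users install only when they need that storage backend.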

@flatsiedatsie
Author

They are "happy to accept PR's" :-)


2 participants