build(deps): avoid version conflicts #636

Merged
merged 28 commits into from
May 24, 2023
Commits
b9fc265
Add new requirements/constraints .in files
qued May 23, 2023
df955cc
Use new files in make targets and setup
qued May 23, 2023
cff4c69
Manifest to make sure files are available for install
qued May 23, 2023
5f1fd53
Update all reqs according to the new rules
qued May 23, 2023
7e8a256
Use full paths
qued May 23, 2023
a450af8
Move all constraint-type pins to base-constraints
qued May 23, 2023
de9d386
Comment changes with re pip-compile
qued May 23, 2023
488d6dc
Redo constraints
qued May 23, 2023
d605d29
cache file should be noop as a requirements file
qued May 23, 2023
bc93c3d
update deps with constraints
qued May 23, 2023
1313679
Correct makefile
qued May 23, 2023
ce6d6c2
Add consistency checking script
qued May 23, 2023
3c6254d
shellcheck updates
qued May 23, 2023
46293bd
Add dependency check in ci
qued May 23, 2023
43abbc9
Make target for deps
qued May 23, 2023
cd5d834
Suppress pip output in script
qued May 23, 2023
cb15628
Update to give output and proper exit value
qued May 23, 2023
9b8bf34
Correct manifest
qued May 23, 2023
be7e8a5
Merge branch 'main' into build(deps)/avoid-version-conflicts
qued May 23, 2023
8bf98fc
shellcheck fix
qued May 23, 2023
3957973
fix ruff error
qued May 23, 2023
19d7c44
Add some comments
qued May 24, 2023
53d3a50
Add type hints to load_requirements
qued May 24, 2023
8a056d4
make sure CI catches this
qued May 24, 2023
d985d42
dep check can run in parallel actually
qued May 24, 2023
168488b
Change back to no conflicts
qued May 24, 2023
d538c0b
Simpler backup
qued May 24, 2023
8d068fe
update changelog
qued May 24, 2023
30 changes: 30 additions & 0 deletions .github/workflows/ci.yml
@@ -38,6 +38,36 @@ jobs:
        source .venv/bin/activate
        make install-ci

  check-deps:
    strategy:
      matrix:
        python-version: ["3.8","3.9","3.10"]
    runs-on: ubuntu-latest
    needs: setup
    steps:
    - uses: actions/checkout@v3
    - uses: actions/cache@v3
      id: virtualenv-cache
      with:
        path: .venv
        key: unstructured-${{ runner.os }}-${{ matrix.python-version }}-${{ hashFiles('requirements/*.txt') }}
    # NOTE(robinson) - This is a fallback in case the lint job does not find the cache.
    # We can take this out when we implement the fix in CORE-99
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}
    - name: Setup virtual environment (no cache hit)
      if: steps.virtualenv-cache.outputs.cache-hit != 'true'
      run: |
        python${{ matrix.python-version }} -m venv .venv
        source .venv/bin/activate
        make install-base-pip-packages
    - name: Check for dependency conflicts
      run: |
        source .venv/bin/activate
        make check-deps

  lint:
    strategy:
      matrix:
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -1,8 +1,9 @@
## 0.6.9-dev2
## 0.6.9

### Enhancements

* fast strategy for pdf now keeps element bounding box data
* setup.py refactor

### Features

12 changes: 12 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,12 @@
include requirements/base.in
include requirements/huggingface.in
include requirements/local-inference.in
include requirements/ingest-s3.in
include requirements/ingest-azure.in
include requirements/ingest-discord.in
include requirements/ingest-github.in
include requirements/ingest-gitlab.in
include requirements/ingest-reddit.in
include requirements/ingest-slack.in
include requirements/ingest-wikipedia.in
include requirements/ingest-google-drive.in
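
The manifest entries above ship the `.in` files with the source distribution, presumably so that `setup.py` can read them at install time (the commit list mentions a `load_requirements` helper). The helper itself is not shown in this part of the diff. The following is a minimal sketch of what such a loader might look like, assuming that option lines such as `-c constraints.in`, comments, and blank lines are skipped. `parse_requirements` is a hypothetical name introduced here for illustration.

```python
from pathlib import Path
from typing import List


def parse_requirements(text: str) -> List[str]:
    """Return requirement specifiers from the text of a pip-tools .in file.

    Assumption: option lines (starting with "-", e.g. `-c constraints.in`),
    comments, and blank lines are not install requirements and are skipped.
    """
    requirements: List[str] = []
    for raw_line in text.splitlines():
        # Drop a trailing inline comment, then surrounding whitespace.
        line = raw_line.split(" #", 1)[0].strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements


def load_requirements(file_path: str) -> List[str]:
    """Read a .in file from disk and return its requirement specifiers."""
    return parse_requirements(Path(file_path).read_text())
```

With a loader like this, `setup.py` could pass `load_requirements("requirements/base.in")` to `install_requires` and one `.in` file per extra to `extras_require`, which would explain why each file has to be included in the package.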
31 changes: 18 additions & 13 deletions Makefile
@@ -108,28 +108,28 @@ install-local-inference: install install-unstructured-inference install-detectro
## pip-compile: compiles all base/dev/test requirements
.PHONY: pip-compile
pip-compile:
pip-compile --upgrade -o requirements/base.txt
pip-compile --upgrade requirements/base.in
# Extra requirements for huggingface staging functions
pip-compile --upgrade --extra huggingface -o requirements/huggingface.txt
pip-compile --upgrade requirements/huggingface.in
# NOTE(robinson) - We want the dependencies for detectron2 in the requirements.txt, but not
# the detectron2 repo itself. If detectron2 is in the requirements.txt file, an order of
# operations issue related to the torch library causes the install to fail
pip-compile --upgrade requirements/dev.in
pip-compile --upgrade requirements/test.in
pip-compile --upgrade requirements/dev.in
pip-compile --upgrade requirements/build.in
pip-compile --upgrade --extra local-inference -o requirements/local-inference.txt
pip-compile --upgrade requirements/local-inference.in
# NOTE(robinson) - doc/requirements.txt is where the GitHub action for building
# sphinx docs looks for additional requirements
cp requirements/build.txt docs/requirements.txt
pip-compile --upgrade --extra=s3 --output-file=requirements/ingest-s3.txt requirements/base.txt setup.py
pip-compile --upgrade --extra=azure --output-file=requirements/ingest-azure.txt requirements/base.txt setup.py
pip-compile --upgrade --extra=discord --output-file=requirements/ingest-azure.txt requirements/base.txt setup.py
pip-compile --upgrade --extra=reddit --output-file=requirements/ingest-reddit.txt requirements/base.txt setup.py
pip-compile --upgrade --extra=github --output-file=requirements/ingest-github.txt requirements/base.txt setup.py
pip-compile --upgrade --extra=gitlab --output-file=requirements/ingest-gitlab.txt requirements/base.txt setup.py
pip-compile --upgrade --extra=slack --output-file=requirements/ingest-slack.txt requirements/base.txt setup.py
pip-compile --upgrade --extra=wikipedia --output-file=requirements/ingest-wikipedia.txt requirements/base.txt setup.py
pip-compile --upgrade --extra=google-drive --output-file=requirements/ingest-google-drive.txt requirements/base.txt setup.py
pip-compile --upgrade requirements/ingest-s3.in
pip-compile --upgrade requirements/ingest-azure.in
pip-compile --upgrade requirements/ingest-discord.in
pip-compile --upgrade requirements/ingest-reddit.in
pip-compile --upgrade requirements/ingest-github.in
pip-compile --upgrade requirements/ingest-gitlab.in
pip-compile --upgrade requirements/ingest-slack.in
pip-compile --upgrade requirements/ingest-wikipedia.in
pip-compile --upgrade requirements/ingest-google-drive.in

## install-project-local: install unstructured into your local python environment
.PHONY: install-project-local
@@ -198,6 +198,11 @@ version-sync:
check-coverage:
coverage report --fail-under=95

## check-deps: check consistency of dependencies
.PHONY: check-deps
check-deps:
scripts/consistent-deps.sh

##########
# Docker #
##########
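
The `check-deps` target delegates to `scripts/consistent-deps.sh`, which is not shown in this part of the diff. As an illustration only, and not necessarily what that script does, one way to check consistency is to compare the pins across the compiled `.txt` files and report any package pinned to different versions. `parse_pins` and `find_conflicts` are hypothetical names.

```python
import re
from typing import Dict, List, Tuple

# Matches compiled pins such as "urllib3==1.26.16" or "rfc3986[idna2008]==1.5.0".
PIN = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?==(\S+)")


def parse_pins(text: str) -> Dict[str, str]:
    """Map normalized package name to pinned version for one compiled file."""
    pins: Dict[str, str] = {}
    for line in text.splitlines():
        match = PIN.match(line.strip())
        if match:
            # Normalize names so "pdfminer.six" and "pdfminer-six" compare equal.
            name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
            pins[name] = match.group(2)
    return pins


def find_conflicts(files: Dict[str, str]) -> List[Tuple[str, Dict[str, str]]]:
    """Return packages that are pinned to different versions in different files.

    `files` maps a file name to the text of that compiled requirements file.
    """
    seen: Dict[str, Dict[str, str]] = {}
    for filename, text in files.items():
        for name, version in parse_pins(text).items():
            seen.setdefault(name, {})[filename] = version
    return [
        (name, by_file)
        for name, by_file in sorted(seen.items())
        if len(set(by_file.values())) > 1
    ]
```

A caller would read every `requirements/*.txt` file into the `files` mapping and exit non-zero when `find_conflicts` returns a non-empty list, so that CI fails on a conflict.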
6 changes: 3 additions & 3 deletions docs/requirements.txt
@@ -10,7 +10,7 @@ babel==2.12.1
# via sphinx
beautifulsoup4==4.12.2
# via furo
certifi==2022.12.7
certifi==2023.5.7
# via
# -r requirements/build.in
# requests
@@ -20,7 +20,7 @@ docutils==0.18.1
# via
# sphinx
# sphinx-rtd-theme
furo==2023.3.27
furo==2023.5.20
# via -r requirements/build.in
idna==3.4
# via requests
@@ -40,7 +40,7 @@ pygments==2.15.1
# sphinx
pytz==2023.3
# via babel
requests==2.30.0
requests==2.31.0
# via sphinx
snowballstemmer==2.2.0
# via sphinx
16 changes: 16 additions & 0 deletions requirements/base.in
@@ -0,0 +1,16 @@
-c "constraints.in"
argilla
chardet
lxml
msg_parser
nltk
openpyxl
pandas
pdfminer.six
pillow
pypandoc
python-docx
python-pptx
python-magic
markdown
requests
57 changes: 32 additions & 25 deletions requirements/base.txt
@@ -2,20 +2,20 @@
# This file is autogenerated by pip-compile with Python 3.8
# by the following command:
#
# pip-compile --output-file=requirements/base.txt
# pip-compile requirements/base.in
#
anyio==3.6.2
# via httpcore
argilla==1.6.0
# via unstructured (setup.py)
argilla==1.7.0
# via -r requirements/base.in
backoff==2.2.1
# via argilla
certifi==2022.12.7
certifi==2023.5.7
# via
# -c requirements/constraints.in
# httpcore
# httpx
# requests
# unstructured (setup.py)
cffi==1.15.1
# via cryptography
chardet==5.1.0
@@ -25,7 +25,9 @@ charset-normalizer==3.1.0
# pdfminer-six
# requests
click==8.1.3
# via nltk
# via
# nltk
# typer
commonmark==0.9.1
# via rich
cryptography==40.0.2
@@ -51,59 +53,59 @@ joblib==1.2.0
# via nltk
lxml==4.9.2
# via
# -r requirements/base.in
# python-docx
# python-pptx
# unstructured (setup.py)
markdown==3.4.3
# via unstructured (setup.py)
# via -r requirements/base.in
monotonic==1.6
# via argilla
msg-parser==1.2.0
# via unstructured (setup.py)
# via -r requirements/base.in
nltk==3.8.1
# via unstructured (setup.py)
# via -r requirements/base.in
numpy==1.23.5
# via
# argilla
# pandas
olefile==0.46
# via msg-parser
openpyxl==3.1.2
# via unstructured (setup.py)
# via -r requirements/base.in
packaging==23.1
# via argilla
pandas==1.5.3
# via
# -r requirements/base.in
# argilla
# unstructured (setup.py)
pdfminer-six==20221105
# via unstructured (setup.py)
# via -r requirements/base.in
pillow==9.5.0
# via
# -r requirements/base.in
# python-pptx
# unstructured (setup.py)
pycparser==2.21
# via cffi
pydantic==1.10.7
pydantic==1.10.8
# via argilla
pygments==2.15.1
# via rich
pypandoc==1.11
# via unstructured (setup.py)
# via -r requirements/base.in
python-dateutil==2.8.2
# via pandas
python-docx==0.8.11
# via unstructured (setup.py)
# via -r requirements/base.in
python-magic==0.4.27
# via unstructured (setup.py)
# via -r requirements/base.in
python-pptx==0.6.21
# via unstructured (setup.py)
# via -r requirements/base.in
pytz==2023.3
# via pandas
regex==2023.5.5
# via nltk
requests==2.30.0
# via unstructured (setup.py)
requests==2.31.0
# via -r requirements/base.in
rfc3986[idna2008]==1.5.0
# via httpx
rich==13.0.1
@@ -119,17 +121,22 @@ tqdm==4.65.0
# via
# argilla
# nltk
typing-extensions==4.5.0
typer==0.9.0
# via argilla
typing-extensions==4.6.0
# via
# pydantic
# rich
urllib3==2.0.2
# via requests
# typer
urllib3==1.26.16
# via
# -c requirements/constraints.in
# requests
wrapt==1.14.1
# via
# argilla
# deprecated
xlsxwriter==3.1.0
xlsxwriter==3.1.1
# via python-pptx
zipp==3.15.0
# via importlib-metadata
6 changes: 3 additions & 3 deletions requirements/build.txt
@@ -10,7 +10,7 @@ babel==2.12.1
# via sphinx
beautifulsoup4==4.12.2
# via furo
certifi==2022.12.7
certifi==2023.5.7
# via
# -r requirements/build.in
# requests
@@ -20,7 +20,7 @@ docutils==0.18.1
# via
# sphinx
# sphinx-rtd-theme
furo==2023.3.27
furo==2023.5.20
# via -r requirements/build.in
idna==3.4
# via requests
@@ -40,7 +40,7 @@ pygments==2.15.1
# sphinx
pytz==2023.3
# via babel
requests==2.30.0
requests==2.31.0
# via sphinx
snowballstemmer==2.2.0
# via sphinx
2 changes: 1 addition & 1 deletion requirements/cache.txt
@@ -1 +1 @@
a
# a
15 changes: 15 additions & 0 deletions requirements/constraints.in
@@ -0,0 +1,15 @@
####################################################################################################
# This file can house global constraints that aren't *direct* requirements of the package or any
# extras. Putting a dependency here will only affect dependency sets that contain them -- in other
# words, if something does not require a constraint, it will not be installed.
####################################################################################################
# NOTE(alan): Pinning to avoid conflicts with downstream ingest-s3
urllib3<1.27, >=1.25.4
# consistency with local-inference-pin
protobuf<3.21
# NOTE(robinson) - Required pins for security scans
jupyter-core>=4.11.2
wheel>=0.38.1
# NOTE(robinson) - The following pins are to address
# vulnerabilities in dependency scans
certifi>=2022.12.07
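
To illustrate how a constraint line restricts what pip-compile may pin, here is a small standard-library sketch that checks a version against a constraint such as `urllib3<1.27, >=1.25.4`. It is a simplification: it handles only plain numeric versions and the operators `<`, `<=`, `>`, `>=`, `==`, not the full PEP 440 grammar that pip implements. `as_tuple` and `satisfies` are hypothetical names.

```python
import re
from typing import Callable, Dict, Tuple

Version = Tuple[int, ...]

# One comparison clause, e.g. "<1.27" or ">= 1.25.4" (numeric versions only).
CLAUSE = re.compile(r"(<=|>=|==|<|>)\s*([0-9]+(?:\.[0-9]+)*)")

CHECKS: Dict[str, Callable[[Version, Version], bool]] = {
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "==": lambda a, b: a == b,
}


def as_tuple(version: str, width: int = 0) -> Version:
    """Turn '1.26.16' into (1, 26, 16), zero-padded to at least `width` parts."""
    parts = tuple(int(part) for part in version.split("."))
    return parts + (0,) * max(0, width - len(parts))


def satisfies(version: str, constraint: str) -> bool:
    """Return True if `version` meets every clause in `constraint`."""
    for op, bound in CLAUSE.findall(constraint):
        # Pad both sides to the same length so 1.27 compares equal to 1.27.0.
        width = max(len(version.split(".")), len(bound.split(".")))
        if not CHECKS[op](as_tuple(version, width), as_tuple(bound, width)):
            return False
    return True
```

Applied to the pins in this diff, `satisfies("2.0.2", "urllib3<1.27, >=1.25.4")` is false while `satisfies("1.26.16", "urllib3<1.27, >=1.25.4")` is true, which matches the change of the `urllib3` pin in `requirements/base.txt` from 2.0.2 to 1.26.16.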
6 changes: 3 additions & 3 deletions requirements/dev.in
@@ -1,7 +1,7 @@
-c constraints.in
-c base.txt
-c test.txt
jupyter
ipython
pip-tools
pre-commit
# NOTE(robinson) - Required pins for security scans
jupyter-core>=4.11.2
wheel>=0.38.1
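
Because `dev.in` constrains against `base.txt` and `test.txt`, those files have to be compiled before `dev.in`. The Makefile encodes this order by hand. As a sketch under that assumption, the order could also be derived from the `-c` lines themselves. `constraint_files` and `compile_order` are hypothetical names, and the helper assumes that `name.txt` is always compiled from `name.in`.

```python
from typing import Dict, List, Set


def constraint_files(in_text: str) -> List[str]:
    """Return the files named on `-c` lines of a .in file, quotes stripped."""
    names: List[str] = []
    for line in in_text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0] == "-c":
            names.append(parts[1].strip("\"'"))
    return names


def compile_order(in_files: Dict[str, str]) -> List[str]:
    """Order .in files so every compiled .txt exists before it is used with -c.

    `in_files` maps a .in file name to its text. Raises ValueError on a cycle.
    """
    ordered: List[str] = []
    visiting: Set[str] = set()

    def visit(name: str) -> None:
        if name in ordered:
            return
        if name in visiting:
            raise ValueError(f"circular constraint involving {name}")
        visiting.add(name)
        for dep in constraint_files(in_files[name]):
            # Only compiled outputs (.txt) create an ordering requirement;
            # constraints.in is a plain input and needs no compile step.
            if dep.endswith(".txt"):
                source = dep[:-4] + ".in"
                if source in in_files:
                    visit(source)
        visiting.discard(name)
        ordered.append(name)

    for name in sorted(in_files):
        visit(name)
    return ordered
```

Given the `dev.in` shown above together with `base.in` and `test.in`, this yields `base.in`, then `test.in`, then `dev.in`, the same relative order the updated `pip-compile` target uses for those three files.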