Commit

fix(build): nltk now requires explicit download of omw corpus (#136)
* bump nltk to 3.6.7
* added explicit download for omw-1.4 corpus in tox
* added explicit download for omw-1.4 corpus in docker container
* tweaked the clean targets in the Makefile so that clean-all also cleans up after a test run
m3mike authored Feb 7, 2022
1 parent 33c1903 commit 66689d9
Showing 4 changed files with 12 additions and 5 deletions.
3 changes: 2 additions & 1 deletion Dockerfile
@@ -58,7 +58,8 @@ RUN --mount=type=cache,target=/root/.cache \
cp -f ./docker/entrypoint.sh entrypoint.sh && \
# Download NLTK data
python3 -m nltk.downloader punkt && \
-python3 -m nltk.downloader wordnet
+python3 -m nltk.downloader wordnet && \
+python3 -m nltk.downloader omw-1.4

# Generate and Run Django migrations scripts, collectstatic app files
RUN python3 /tram/src/tram/manage.py makemigrations tram && \
7 changes: 6 additions & 1 deletion Makefile
@@ -74,10 +74,15 @@ clean: ## Clean up pycache files
find . -name '__pycache__' -type d -delete

.PHONY: clean-all
-clean-all: clean ## Clean up venv and tox if necessary, in addition to standard clean
+clean-all: clean-tox clean ## Clean up venv and tox if necessary, in addition to standard clean
find . \( -name ".tox" -o -name "$(VENV)" -o -name "*.egg-info" \) -type d -prune -exec rm -rf {} +
find ./src -name '*.egg' -delete
rm -f coverage.xml
rm -rf data/media

.PHONY: clean-tox
clean-tox: # Clean up tox if necessary
find . -name ".tox" -type d -prune -exec rm -rf {} +

.PHONY: venv-activate
venv-activate: venv ## Activate venv
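
For readers less familiar with `find ... -prune -exec rm -rf {} +`, the new `clean-tox` target can be approximated in Python as below. This is only an illustrative sketch, not part of the commit: the function name `remove_named_dirs` and its parameters are hypothetical, and the Makefile recipe remains the actual implementation.

```python
import os
import shutil
from pathlib import Path


def remove_named_dirs(root, names=(".tox",)):
    """Delete every real directory under `root` whose name is in `names`.

    Hypothetical helper that mirrors `find . -name ".tox" -type d -prune
    -exec rm -rf {} +`: a matching directory is removed and never descended into.
    """
    removed = []
    for current, dirs, _files in os.walk(root):  # top-down, so pruning works
        for name in list(dirs):
            target = Path(current) / name
            # `-type d` does not match symlinks, so skip them here as well.
            if name in names and not target.is_symlink():
                shutil.rmtree(target)
                dirs.remove(name)  # prune: os.walk will not enter this directory
                removed.append(target)
    return removed
```

Calling `remove_named_dirs(".")` from the repository root would correspond to `make clean-tox`, under the assumptions above.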
2 changes: 1 addition & 1 deletion requirements/requirements.txt
@@ -5,7 +5,7 @@ django-rest-framework==0.1.0
django==3.2.11
faker==6.5.0
gunicorn==20.1.0
-nltk==3.6.5
+nltk==3.6.7
pandas==1.2.3
pdfplumber==0.5.27
python-docx==0.8.10
5 changes: 3 additions & 2 deletions tox.ini
@@ -12,8 +12,9 @@ passenv = GITHUB_*
[testenv:tram]
description = Run Pytest
commands =
-python -c "import nltk; nltk.download('punkt')"
-python -c "import nltk; nltk.download('wordnet')"
+python3 -m nltk.downloader punkt
+python3 -m nltk.downloader wordnet
+python3 -m nltk.downloader omw-1.4
pytest --cov=src/ --cov=src/tram --cov=src/tram/tram/ml --cov=src/tram/tram/management/commands --cov-report=xml

[testenv:bandit]
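
The Dockerfile and tox.ini changes both fetch the same three NLTK resources (`punkt`, `wordnet`, `omw-1.4`) from the command line. As a rough sketch of the same idea from Python, the helper below downloads only what is missing. It is not part of this commit: `ensure_nltk_data` and `REQUIRED` are hypothetical names, and the data paths (`tokenizers/punkt`, `corpora/wordnet`, `corpora/omw-1.4`) are assumptions about where NLTK stores these resources.

```python
# Resource name -> assumed path inside the NLTK data directory.
REQUIRED = {
    "punkt": "tokenizers/punkt",
    "wordnet": "corpora/wordnet",
    "omw-1.4": "corpora/omw-1.4",
}


def ensure_nltk_data(required=REQUIRED, find=None, download=None):
    """Download each missing resource and return the names that were fetched.

    `find` and `download` default to `nltk.data.find` and `nltk.download`;
    they are injectable so the logic can be exercised without network access.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested with fakes

        find = find or nltk.data.find
        download = download or nltk.download

    fetched = []
    for name, path in required.items():
        try:
            find(path)  # raises LookupError when the resource is absent
        except LookupError:
            download(name)
            fetched.append(name)
    return fetched
```

In a test setup, `ensure_nltk_data()` could be called once before the suite runs; the explicit `python3 -m nltk.downloader` lines in this commit achieve the same result unconditionally.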
