Update Click, add .gitignore, format code, and use Poetry #8

Open · sumanthratna wants to merge 47 commits into master

Conversation

@sumanthratna (Contributor) commented Mar 13, 2020

I'm not sure whether this is a PR you want to merge; if it is, I can open another PR with the same changes against the dev_object_detection branch.

This PR:

  • removes directories that usually aren't committed to version control
  • formats code according to black (https://github.com/psf/black)
  • switches to Poetry for package management (for easy updating of dependencies and easy publishing to PyPI)
  • updates Click to 7.1.1 (this is necessary so new libraries such as pymana can use methods from PathFlow without experiencing version conflicts)
    • we'll need to update the Wiki because underscores have been changed to hyphens in Click command names (see the sketch after this list)
  • updates pandas to 1.x
  • rebuilds docs with Sphinx
  • adds testing with pytest
  • adds continuous integration with Travis CI
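
For reference, here's a minimal, stand-alone sketch (not code from this repo) of the Click 7.x behavior behind the Wiki note above: when a command name isn't passed explicitly, Click derives it from the function name and now replaces underscores with hyphens.

import click


@click.group()
def cli():
    """Toy CLI used only to illustrate the renaming."""


@cli.command()  # under Click >= 7.0 the generated command name is "preprocess-pipeline"
def preprocess_pipeline():
    """With older Click this would have been invoked as preprocess_pipeline."""
    click.echo("running preprocessing")


if __name__ == "__main__":
    cli()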

@sumanthratna changed the title from "Add .gitignore" to "Update Click, add .gitignore, format code, and use Poetry" on Mar 14, 2020
@jlevy44 (Owner) commented Mar 14, 2020

Nice!!!! Thanks for linting the repo and updating the docs! Also, I collapsed and PR'd the dev_object_detector branch. We can PR to master, or feel free to make branches. Generally, I'm hoping that all features added to the public release are fully functional, so I may further deprecate code that isn't.

A couple of questions: Have you verified that the docs work? Has any of the linting caused any of the modules to fail? And referring to your previous PR, I saw that you had attempted to change the orientation of the slide; does that have any impact on downstream functionality?

We may want to consider starting a unit-testing framework with continuous integration. It's been in the plans for a while, but I haven't gotten around to it. We could both iterate on the framework and incorporate it in pieces if you are interested.

In any case, I will probably accept this PR; I just want to test the code in your commits before proceeding and make sure it doesn't significantly conflict with the previous PR.

@jlevy44 (Owner) commented Mar 14, 2020

We should also make sure Poetry is compatible with conda. I haven't used Poetry before, but it seems great. Eventually this will be packaged with Docker (we already have a preliminary Dockerfile, but it's unreleased), and we just need to make sure it can handle more complex installs such as NVIDIA Apex.

@sumanthratna (Contributor, Author) commented Mar 14, 2020

I had a lot of those questions and was planning on bringing them up to you.

The docs have a pretty significant error: the types of parameters get combined with the parameter names. This is because of numpydoc. I have a commit ready that'll fix this within the next 2 minutes (hopefully).

I also wanted to test all of the methods, but since PathFlow doesn't have a testing framework set up, I haven't had a chance to do that yet. Obviously we should do some testing before merging; if you'd like, I can set up some basic testing using pytest, along the lines of the sketch below.
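
A minimal smoke-test sketch, assuming pytest and Click's built-in CliRunner; the pathflowai.cli_preprocessing import path and the preprocessing group name are placeholders for whichever PathFlow CLI module we wire up first.

from click.testing import CliRunner

from pathflowai.cli_preprocessing import preprocessing  # hypothetical import path


def test_cli_help_runs():
    # invoking --help exercises Click's command wiring without touching any data
    runner = CliRunner()
    result = runner.invoke(preprocessing, ["--help"])
    assert result.exit_code == 0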

As for the orientation of the slide, I don't expect it'll cause any errors, but we should figure that out with the testing.

Also, unfortunately, Poetry doesn't easily publish to conda (I think). However, I think I could write a script to automatically publish to conda for us. The same goes for Docker.

Also, Poetry doesn't allow custom install instructions (this is probably the biggest downside of Poetry for PathFlow). We could set up a script called pathflow-setup that installs apex and other libraries; a rough sketch is below. Then, in the documentation, we could mention that users should run pathflow-setup after installing.
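
A rough sketch of that idea, assuming the script is registered through Poetry's [tool.poetry.scripts] table (e.g. pathflow-setup = "pathflowai.post_install:main"); the module path and the bare-bones apex install are placeholders, not a final recipe.

"""Post-install helper exposed as the (hypothetical) pathflow-setup entry point."""
import subprocess
import sys
import tempfile


def main():
    # clone NVIDIA apex and install it with pip; any extra build flags
    # (CUDA/C++ extensions, etc.) would be added here once we settle on them
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.check_call(
            ["git", "clone", "https://github.com/NVIDIA/apex", f"{tmp}/apex"]
        )
        subprocess.check_call([sys.executable, "-m", "pip", "install", f"{tmp}/apex"])


if __name__ == "__main__":
    main()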

@jlevy44 (Owner) commented Mar 15, 2020

Were you building the docs using sphinx?

That sounds great. In the bin directory, I have a script called install_apex, but we can just merge this into pathflow-setup.

If you want to try the conda and Docker setups, that would be great; I can also help set those up. I think once we throw in a few pytest modules (we don't need to do the entire framework yet), we should be good to go for the merge.

@sumanthratna (Contributor, Author) commented:

Yup, the docs were built with sphinx. To build the docs:

cd docs
make html

I'm working on writing a development Makefile right now; I'll add it to this PR soon.

@jlevy44 (Owner) commented Mar 22, 2020

This is a really impressive PR that moves us closer to a more reproducible workflow.

I would say that before merging, we need the following minimal set of tests:

  1. Preprocess pipeline test (is a SQL file generated? Zarr? NPZ?); a pytest sketch follows below. In a future PR we can consider removing the dependence on Zarr and NPZ.
  2. Quick test of train_model (generate a pretrained model, no need to train, but if training, only store 100 patches).
  3. Preliminary tests of the visualization module (UMAP plot production with plotly, UMAP with an ImageNet-pretrained network with images overlaid; this can be done without training the model; we can discuss over Slack).
  4. The aforementioned CLIs work, with no Click errors or errors introduced by black.

To run the tests most efficiently, you can also subset the SQL database to its first 100 entries, which should make items 2 and 3 easier to accomplish. Some of the steps for performing these tasks are in the Wiki.
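
For item 1, something like this pytest sketch might be enough to start; the run_preprocess helper, the input directory, and the output filenames are placeholders for however the test ends up driving the pipeline, not the pipeline's actual artifact names.

from pathlib import Path

# run_preprocess is a hypothetical helper wrapping the PathFlow preprocessing
# CLI on the subsetted test slide; swap in the real invocation
from tests.utils import run_preprocess  # placeholder import


def test_preprocess_outputs(tmp_path: Path):
    basename = "TCGA-18-5592-01Z-00-DX1"  # test slide discussed below
    run_preprocess(input_dir="tests/inputs", output_dir=tmp_path, basename=basename)

    # assumed artifact names: a patch SQL database plus Zarr and NPY arrays
    assert (tmp_path / "patch_info.db").exists()
    assert (tmp_path / f"{basename}.zarr").exists()
    assert (tmp_path / f"{basename}_mask.npy").exists()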

# paths to the test slide's image and XML annotation, relative to input_dir
png_file = join(input_dir, basename + ".png")
xml_file = join(input_dir, basename + ".xml")

test_segmentation()
@jlevy44 (Owner):

Technically, you could test both classification and segmentation on the same dataset; I'm not sure if this is what you were going for here.

@jlevy44 (Owner):

Or even regression from patch-level labels featured in the SQL database. I'll try to get a new dataset over soon.

@sumanthratna (Contributor, Author):

I was planning on using TCGA-18-5592-01Z-00-DX1 for testing both segmentation and classification, like you suggested. The reason I'm splitting the test into two different methods is that some of the parameter names change (such as npy_mask vs. xml_file), roughly as in the sketch below.
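
A skeleton of that split, with a shared driver so the two tests differ only in how the annotation is passed; _run_preprocess and its keyword arguments are placeholders for the real preprocessing call.

# shared driver for both tests; the real version would invoke the PathFlow
# preprocessing CLI (placeholder signature and body)
def _run_preprocess(basename, **annotation_kwargs):
    ...


def test_segmentation():
    # segmentation passes a NumPy mask via npy_mask
    _run_preprocess("TCGA-18-5592-01Z-00-DX1", npy_mask="TCGA-18-5592-01Z-00-DX1_mask.npy")


def test_classification():
    # classification passes the XML annotation via xml_file
    _run_preprocess("TCGA-18-5592-01Z-00-DX1", xml_file="TCGA-18-5592-01Z-00-DX1.xml")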

I'm also planning on adding support for TCGA annotations in PathFlow, so a new dataset shouldn't be necessary.

"annotation",
"0",
"1",
"2",
@jlevy44 (Owner):

You may want to change these header names to reflect the final testing dataset.

from sqlite3 import connect as sql_connect

# open the patch-level SQL database and read every row from the "256" table
connection = sql_connect(odb)
cursor = connection.execute('SELECT * FROM "256";')
@jlevy44 (Owner):

One potential option here is to limit the number of patches before testing the classification and segmentation pipelines; for example, something like the sketch below.
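
A hedged sketch of that subsetting, reusing the "256" table name from the excerpt above; the database paths are placeholders.

from sqlite3 import connect as sql_connect

odb = "patch_info.db"  # placeholder: path to the full patch database

# copy only the first 100 patch rows into a small database used by the tests
connection = sql_connect(odb)
connection.execute("ATTACH DATABASE 'test_patch_info.db' AS subset;")  # placeholder path
connection.execute('CREATE TABLE subset."256" AS SELECT * FROM "256" LIMIT 100;')
connection.commit()
connection.close()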
