0.1.0 release #226

Merged
merged 27 commits into from
Nov 8, 2021

adamjstewart
Collaborator

source
------

TorchGeo can also be installed from source using the ``setup.py`` file and setuptools.
Collaborator Author

Direct invocation of setup.py is deprecated: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html

"program.log_dir=" + str(log_dir),
"trainer.fast_dev_run=1",
"experiment.task=" + task,
"program.overwrite=True",
"config_file=" + os.path.join("conf", "task_defaults", task + ".yaml"),
Collaborator Author

@calebrob6 without this line, the default configs aren't being loaded, so there might be something broken with default loading.

Member

The default config is loaded by name here -- https://github.com/microsoft/torchgeo/blob/main/train.py#L97.

Collaborator Author

Yes, but if you comment out this line, you'll see that it isn't loading the default config.

Member

It definitely is loading the default config; otherwise it would throw an error about not finding the default config.

Member

calebrob6 Nov 8, 2021

Because of the way configurations are merged, experiment.datamodule.root_dir was being overridden by the defaults in defaults.yaml (i.e. program.data_dir). I'm not sure it makes sense to have the test datasets be the default data for each dataset. The possible fixes here are:

  • do as you've done
  • remove the default mapping between program.data_dir and experiment.datamodule.root_dir in defaults.yaml
  • remove the default root_dirs that you added to all the datasets and pass those instead as arguments to program.data_dir
  • rewrite the configuration system :)
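
The override described above can be reproduced with a small sketch. This is not the code in train.py: it uses plain dictionaries and a hypothetical `merge` helper instead of the real configuration library, and it assumes that task defaults loaded by name are applied first with the global defaults merged on top. The paths and the `cyclone` task name are illustrative only.

```python
from copy import deepcopy
from typing import Any, Dict

Config = Dict[str, Any]


def merge(base: Config, override: Config) -> Config:
    """Return a copy of ``base`` with ``override`` merged in recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-ins for defaults.yaml and a task defaults file.
defaults = {
    "program": {"data_dir": "data"},
    # Assumed mapping: root_dir mirrors program.data_dir by default.
    "experiment": {"datamodule": {"root_dir": "data"}},
}
task_defaults = {
    "experiment": {
        "task": "cyclone",
        "datamodule": {"root_dir": "tests/data/cyclone"},
    }
}
cli = {"experiment": {"task": "cyclone"}}

# Case 1: task defaults are loaded by name, then the global config wins.
by_name = merge(task_defaults, merge(defaults, cli))

# Case 2: the same file is also passed as config_file, so it is merged
# over defaults.yaml before the final merge.
as_config_file = merge(task_defaults, merge(merge(defaults, task_defaults), cli))

print(by_name["experiment"]["datamodule"]["root_dir"])         # data
print(as_config_file["experiment"]["datamodule"]["root_dir"])  # tests/data/cyclone
```

Under these assumptions the task file is loaded in both cases, but its root_dir only survives when it is also passed as config_file, which would be consistent with both observations in the thread.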

Collaborator Author

Yeah, I realize I've been abusing the config system by modifying it to suit the tests. What if we:

  1. Keep doing this until we get the release out since the conf directory won't get installed anyway
  2. After the release is out, create a tests/conf directory for test-specific configuration settings
  3. Think about #227 ("Re-think how configs are handled in train.py") post-rebuttal when we have more time

We could also do 2 real quick, it wouldn't take that long.
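
Step 2 could look roughly like the sketch below. The `build_train_args` helper and the `tests/conf/<task>.yaml` layout are hypothetical and only illustrate the idea; the override strings are copied from the test snippet under review, with the config directory made a parameter.

```python
import os
from typing import List


def build_train_args(task: str, log_dir: str, conf_dir: str) -> List[str]:
    """Build key=value overrides like the ones passed to train.py in the test."""
    return [
        "program.log_dir=" + str(log_dir),
        "trainer.fast_dev_run=1",
        "experiment.task=" + task,
        "program.overwrite=True",
        # Point at a test-specific config instead of conf/task_defaults.
        "config_file=" + os.path.join(conf_dir, task + ".yaml"),
    ]


# Hypothetical layout: tests/conf/<task>.yaml holds the test-only settings.
args = build_train_args("cyclone", "/tmp/logs", os.path.join("tests", "conf"))
print(args[-1])
```

Keeping the test data paths in a separate directory would let the files under conf/task_defaults go back to describing real defaults.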

Member

Can you make a TODO here or somewhere that we'll remember?

Collaborator Author

I'll create an issue so we don't re-trigger the tests

  - name: Install pip dependencies
-   run: pip install .[tests]
+   run: |
+     pip install gdal tqdm  # TODO: these deps shouldn't be needed
Collaborator Author

@isaaccorley the indices tutorial imports things that aren't TorchGeo deps, does it need to?

Collaborator

isaaccorley Nov 8, 2021

I can revamp them, but requiring that notebooks only use torchgeo deps might be too restrictive. E.g., we may want to use visualization libraries in a tutorial without adding them as dependencies of torchgeo.

Member

The code definitely uses these two libraries.

  • tqdm is simple to include as a dependency, it will be indirectly included anyway, and is quite useful in notebooks and command line scripts
  • The gdal bit can be replaced quite easily with rasterio

Collaborator Author

Agreed, let's do this for the next release

- name: Run notebook checks
  env:
    MLHUB_API_KEY: ${{ secrets.MLHUB_API_KEY }}
Collaborator Author

Alternatively, we could use a different dataset that doesn't require an API key to download

Member

I'd rather not have an API key secret, and would rather use UCMerced instead of Cyclone. Let's put that on the 0.2 release path.

@@ -451,3 +451,6 @@ def validation_step(  # type: ignore[override]
         )

         self.log("val_loss", loss, on_step=False, on_epoch=True)
+
+    def test_step(self, *args: Any) -> None:  # type: ignore[override]
+        """No-op, does nothing."""
Collaborator Author

PyTorch Lightning 1.5 added a check that validates the trainer configuration. If you pass a DataModule that defines a test_dataloader but your trainer doesn't define a test_step, it raises an error.
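
The kind of check described here can be illustrated without Lightning. The sketch below is a simplified stand-in, not Lightning's actual implementation: `BaseModule`, `BaseDataModule`, `is_overridden`, and `verify_test_configuration` are all hypothetical names defined only for this example. It shows why a no-op test_step is enough to satisfy such a check.

```python
from typing import Any, List


class BaseModule:
    """Stand-in for a Lightning-style module (not the real class)."""

    def test_step(self, *args: Any) -> None:
        """Default hook; subclasses are expected to override it."""


class BaseDataModule:
    """Stand-in for a Lightning-style data module (not the real class)."""

    def test_dataloader(self) -> Any:
        """Default hook; returns nothing unless overridden."""
        return None


def is_overridden(name: str, instance: object, parent: type) -> bool:
    """Return True if the instance's class redefines ``parent.<name>``."""
    return getattr(type(instance), name) is not getattr(parent, name)


def verify_test_configuration(module: BaseModule, datamodule: BaseDataModule) -> None:
    """Raise if test data is provided but the module has no test_step."""
    has_loader = is_overridden("test_dataloader", datamodule, BaseDataModule)
    has_step = is_overridden("test_step", module, BaseModule)
    if has_loader and not has_step:
        raise RuntimeError("test dataloader provided but no test_step defined")


class DataModuleWithTest(BaseDataModule):
    def test_dataloader(self) -> List[int]:
        return [1, 2, 3]


class TaskWithoutTestStep(BaseModule):
    pass


class TaskWithNoOpTestStep(BaseModule):
    def test_step(self, *args: Any) -> None:
        """No-op, does nothing."""


# Passes: the no-op override counts as a defined test_step.
verify_test_configuration(TaskWithNoOpTestStep(), DataModuleWithTest())

try:
    verify_test_configuration(TaskWithoutTestStep(), DataModuleWithTest())
except RuntimeError as err:
    print(err)
```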

adamjstewart marked this pull request as ready for review November 8, 2021 03:03
adamjstewart merged commit 740d4f8 into main Nov 8, 2021
adamjstewart deleted the releases/v0.1 branch November 8, 2021 04:06
adamjstewart restored the releases/v0.1 branch November 8, 2021 04:06
adamjstewart added this to the 0.1.0 milestone Nov 20, 2021
yichiac pushed a commit to yichiac/torchgeo that referenced this pull request Apr 29, 2023
* 0.1.0 release

* Train deps needed for release testing

* Update development status

* setup.py should not be run directly

* Test more trainers

* Fix local docs build

* Update installation instructions

* Specify test data dir in config

* Fix tutorial docs

* Trainers should default to num_workers=0, download=False

* Correct location for root_dir

* Try different GDAL name

* Try again

* Various fixes to release tests

* Update pip installs in tutorials

* Fix some bugs

* Config file not being picked up

* Get back to 100% test coverage

* Added correct weight string to UCMerced

* yolo fix

* yolo fix pt 2

* yolo fix 2 pt. 1

* Simplify tests a bit

* Make the trainer notebook look stupid

* UCMerced should download by default in the trainers

* Revert

* Fix logo/author, include LICENSE in upload

Co-authored-by: Caleb Robinson <[email protected]>
Development

Successfully merging this pull request may close these issues.

0.1.0 release and publication
3 participants