Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUILD: Simplifying contributor dependencies #23522

Merged
merged 10 commits into from
Nov 11, 2018
Merged

BUILD: Simplifying contributor dependencies #23522

merged 10 commits into from
Nov 11, 2018

Conversation

datapythonista
Copy link
Member

Unifying dev and optional dependencies, moving them to their standard locations, and create script to generate the pip file, and make sure in the CI they are always sync

… locations, and create script to generate the pip file, and make sure in the CI they are always sync
@datapythonista datapythonista added Build Library building on various platforms CI Continuous Integration Dependencies Required and optional dependencies labels Nov 5, 2018
@pep8speaks
Copy link

Hello @datapythonista! Thanks for submitting the PR.

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM at a glance. Could maybe bike shed whether we want these at the top level or not (I don't have a strong preference).

True if the comparison fails, False otherwise
"""
with open(conda_fname) as conda_fd:
deps = yaml.load(conda_fd)['dependencies']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May as well do yaml.safe_load, even though this is "trusted" input :)

@TomAugspurger TomAugspurger added this to the 0.24.0 milestone Nov 5, 2018
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we resorting to a requirements.txt? is the environment.yaml not enough? you almost always want to simpy install via conda. having 2 ways to install is just confusing.

doc/source/contributing.rst Outdated Show resolved Hide resolved
doc/source/contributing.rst Show resolved Hide resolved
@jreback jreback removed this from the 0.24.0 milestone Nov 6, 2018
@datapythonista
Copy link
Member Author

@jreback having just a environment.yml was my first option too. But @TomAugspurger said he preferred to have also dependencies to install via pip. That's why I have the two files, environment.yml for conda users, and requirements.txt for pip users (and the script scripts/conda_to_pip.py to generate the pip file from the conda file, and validate in the CI that they have the same dependencies).

I'm happy to remove requirements.txt and the script if @TomAugspurger is ok with it.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One drawback of installing the env in one go is that you end up with two pandas versions

doc/source/contributing.rst Outdated Show resolved Hide resolved
doc/source/contributing.rst Show resolved Hide resolved
@datapythonista
Copy link
Member Author

My reasoning for the names of the files, is that I don't think anyone will install pandas from source unless it's for development. And when installing from pip or conda, the dependencies are obtained from setup.cfg.

Personally, I think it makes sense to have environment_dev.yml and requirements_dev.yml if we have environment.yml (or environment_prod.yml) for production, and same for requirements. But as I doubt those files won't be used, I think the current names are the simplest option.

I'd really like to keep environment.yml as this is the de-facto standard by conda, and I think the future is that in any project you will simply do conda env create to get the environment (and conda env update to update it). And I think it's very convenient when you start to be used to it, and users will sooner than later know about this standard. Knowing that this will install from environment.yml is to me like knowing that you have your vim options in ~/.vimrc, something that you're expected to know once you start using the tool more.

But let me know if you're not convinced, and you still prefer to make the changes.

@codecov
Copy link

codecov bot commented Nov 6, 2018

Codecov Report

Merging #23522 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #23522      +/-   ##
==========================================
- Coverage   92.24%   92.23%   -0.01%     
==========================================
  Files         161      161              
  Lines       51278    51288      +10     
==========================================
+ Hits        47299    47304       +5     
- Misses       3979     3984       +5
Flag Coverage Δ
#multiple 90.62% <ø> (-0.01%) ⬇️
#single 42.26% <ø> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/tseries/offsets.py 97.07% <0%> (-0.15%) ⬇️
pandas/core/dtypes/concat.py 96.26% <0%> (-0.09%) ⬇️
pandas/io/pytables.py 92.34% <0%> (-0.09%) ⬇️
pandas/io/parsers.py 95.55% <0%> (-0.07%) ⬇️
pandas/core/frame.py 97.03% <0%> (-0.01%) ⬇️
pandas/core/generic.py 96.81% <0%> (ø) ⬆️
pandas/core/window.py 96.4% <0%> (ø) ⬆️
pandas/core/series.py 93.7% <0%> (ø) ⬆️
pandas/core/indexes/base.py 96.45% <0%> (ø) ⬆️
pandas/core/arrays/datetimes.py 98.44% <0%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 383d052...6f30a04. Read the comment docs.

@TomAugspurger
Copy link
Contributor

I'd like to keep the requirements.txt around, but am not going to block things if others decide to remove it :)

doc/source/contributing.rst Outdated Show resolved Hide resolved
doc/source/contributing.rst Show resolved Hide resolved
@datapythonista
Copy link
Member Author

Made explicit the requirements file for conda in the docs.

I think pip is not the primary anywhere (the script creates the pip file from the conda file already), and in the docs is already clear that conda should be installed, and pip instructions are provided as an extra.

So, I think this should be all right now.

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Only outstanding question is I think whether to call it requirements.txt vs. requirements-dev.txt or whatever. I don't have a preference.

@jreback jreback added this to the 0.24.0 milestone Nov 11, 2018
@jreback
Copy link
Contributor

jreback commented Nov 11, 2018

I think we should name requirements-dev.txt I believe requirements.txt is what you need to install your project; ours includes lots of other things, so shouldn't confuse it.

@@ -1,3 +1,17 @@
NumPy
python-dateutil>=2.5.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above

scripts/conda_to_pip.py Outdated Show resolved Hide resolved
conda_fname : str
Path to the conda file with dependencies (e.g. `environment.yml`).
pip_fname : str
Path to the pip file with dependencies (e.g. `requirements.txt`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change the default here

scripts/conda_to_pip.py Outdated Show resolved Hide resolved
scripts/conda_to_pip.py Outdated Show resolved Hide resolved
@@ -0,0 +1,96 @@
#!/usr/bin/env python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe change this name to

generate_requirements-dev_from_environment or some more meaningful name

#!/usr/bin/env python
"""
Convert the conda environment.yml to the pip requirements.txt,
or check that they have the same packages (for the CI)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this script called anywhere right now? should add docs on how to do this in contributing.rst?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@datapythonista if you want to add to contributing.rst at some point would be good (IOW how to run the script), but not necessary for now

@jreback
Copy link
Contributor

jreback commented Nov 11, 2018

lgtm. ping on green.

@datapythonista
Copy link
Member Author

The script is called in code_checks.sh to make sure that the pip dependencies are synchronized with the conda ones, and it fails with a message asking to run the script if it's not the case.

I think the script is named requirements_dev.txt and not requirements-dev.txt now, but made the change, doesn't make a big difference.

I think at some point would be nice to have all the scripts explained in the documentation. But I'd leave that for another PR, as I think some clean up is needed, and personally I think we should merge the content of scripts into ci, as for what I know, it's mostly arbitrary which scripts go to one, and which go to the other (I guess there was a reason originally, but not sure if that applies anymore).

@jreback jreback merged commit 592fd64 into pandas-dev:master Nov 11, 2018
@jreback
Copy link
Contributor

jreback commented Nov 11, 2018

thanks!

thoo added a commit to thoo/pandas that referenced this pull request Nov 11, 2018
* upstream/master:
  BUILD: Simplifying contributor dependencies (pandas-dev#23522)
  BUG/REF: TimedeltaIndex.__new__ (pandas-dev#23539)
thoo added a commit to thoo/pandas that referenced this pull request Nov 12, 2018
…fixed

* upstream/master:
  DOC: Enhancing pivot / reshape docs (pandas-dev#21038)
  TST: Fix xfailing DataFrame arithmetic tests by transposing (pandas-dev#23620)
  BUILD: Simplifying contributor dependencies (pandas-dev#23522)
  BUG/REF: TimedeltaIndex.__new__ (pandas-dev#23539)
  BUG: Casting tz-aware DatetimeIndex to object-dtype ndarray/Index (pandas-dev#23524)
  BUG: Delegate more of Excel parsing to CSV (pandas-dev#23544)
  API: DataFrame.__getitem__ returns Series for sparse column (pandas-dev#23561)
  CLN: use float64_t consistently instead of double, double_t (pandas-dev#23583)
  DOC: Fix Order of parameters in docstrings (pandas-dev#23611)
  TST: Unskip some Categorical Tests (pandas-dev#23613)
  TST: Fix integer ops comparison test (pandas-dev#23619)
JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this pull request Nov 14, 2018
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms CI Continuous Integration Dependencies Required and optional dependencies
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants