Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Black pre-commit as automatic formatter #1768

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

negin513
Copy link
Contributor

Description of changes

Summary: This PR adds the capability to automatically format python codes using Black without changing git history using a pre-commit hook.


A while back, I have suggested using Black as a formatter for CTSM python codes to:

  1. Have a uniform python style/standard across different codes
  2. Accelerate our python developments and review process by automating the formatting.
  3. Improve Python code quality and adapt to modern SE practices.

We have discussed the best method of implementing this in #1471 and there were multiple suggestions. One of the concerns were messing with git history (#1577 (comment))

One way to automate the formatting of python codes without making any changes in the git history was using pre-commit hooks. This is similar to what modern python packages do for their developments (e.g. xarray ).

In this PR, I used this git feature and created a pre-commit hook to automate the formatting of any new python code that is being committed.

This is really easy to use and does not require any special changes. The only thing is that when the developer is committing a new change, it runs the formatter and gives some insight:

image

Please note that that pre-commit hook only modifies files that are changed and going to be committed.

However, using this we can make sure that our codes are always following a certain format.

Besides automatically running black on python codes, this pre-commit hook automatically checks and fixes:

1- trailing-whitespace : trims trailing whitespace.
2- check-yaml - checks yaml files for parseable syntax.
3- mixed-line-ending - replaces or checks mixed line ending.
4- end-of-file-fixer - ensures that a file is either empty, or ends with one newline.

Similar to xarray, I added a badge for black to readme.

Specific notes

Contributors other than yourself, if any: @ekluzek

CTSM Issues Fixed (include github issue #): #1471 and #1611

Are answers expected to change (and if so in what way)? No

Any User Interface Changes (namelist or namelist defaults changes)? No

Testing performed, if any:
tried submitting changes with this pre-commit.

@negin513
Copy link
Contributor Author

This does not change the interface for a user.

For our developers who want to run this pre-commit , please try this using the following:

pip install pre-commit (you only do this once.)

git clone https://github.com/ESCOMP/CTSM
cd CTSM
pre-commit install

So just one additional command after cloning the repository will activate this automatic checking and formatting.

@negin513 negin513 requested a review from ekluzek May 26, 2022 00:18
@ekluzek
Copy link
Collaborator

ekluzek commented Jun 2, 2022

Hmmm, this is interesting. The pip install didn't seem to work for me, the pip install seemed to work OK, but then I couldn't do pre-commit commands. But, I was able to get it to work by adding it to my conda environment. And I think that's what we want in the end anyway. So I think we might want to have a conda environment for pre-commit, or include it in the standard CTSM conda environment.

After that pre-commit failed in my ctsm conda environment, but I got it to work by creating a conda environment just for pre-commit. But, then I found that running pre-commit in the top directory wanted to change over 700 files (because of the trailing blank fixer and other things). I think the pre-commit normally would only be applied to files that were going to be committed, so normally it wouldn't do that. But, we probably want to run the pre-commit fixer on everything (which would be a good thing to do anyway), as a one-shot formatting step. I do like having a fixer for these type of things, but we should probably fix our code by the fixer standards first.

So let's talk more about this tomorrow, and let's meet more to discuss this. Thanks for working on this.

@ekluzek ekluzek added enhancement new capability or improved behavior of existing capability code health improving internal code structure to make easier to maintain (sustainability) labels Jun 2, 2022
@ekluzek
Copy link
Collaborator

ekluzek commented Jun 2, 2022

We talked about this at CTSM software this morning. To start with we think we should restrict the pre-commit hook to the python directory, until we get used to this workflow. I'm more nervous about FORTRAN code getting changed by formatters. The changes proposed here are good and ones I agree with, and shouldn't cause problems, but in principle for FORTRAN code I like the idea that the testing we've done when we commit is done with the code as committed. This would mean in practice if the pre-commit hook is triggered you should then run testing on the code before you do the second commit.

Another thing we thought we should do is to create a separate conda environment for the pre-commit hook. So part of this commit should be to add another conda environment for pre-commit. I checked and none of the default conda environments include pre-commit (which is as I would expect).

@ekluzek
Copy link
Collaborator

ekluzek commented Jun 3, 2022

Some hooks that I think we should look into:

check-added-large-files
check-executables-have-shebangs
check-shebang-scripts-are-executable
check-merge-conflict
check-xml

There's also a bunch of specific format and language linters and checkers. A perl checker could be interesting, but we are also trying to get away from perl and move to python.

Copy link
Collaborator

@ekluzek ekluzek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Add a conda environment file for using pre-commit
  • For now let's only apply the changes to the python subdirectory
  • Add some instructions for developers on how to use the pre-commit hook with the conda environment. A suggested place is under the doc directory in a README.developers file. It could go in the python directory for now until we have it work on everything.
  • We should run the pre-commit hooks we agree to on all our code before we bring this into main-dev. This probably means bringing Apply black to python directory #1577 to main-dev before or with this PR.

- id: mixed-line-ending
# black as python formatter
- repo: https://github.com/psf/black
rev: 22.3.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way that the run of black could use the ctsm_py conda environment? As well as the run of black to use the python/pyproject.toml file? That would keep the version and options for black consistent across all uses of it. the line-length is especially the thing that we want to make sure is consistent across all uses of black as well as consistent with the line length for pylint.

@ekluzek
Copy link
Collaborator

ekluzek commented Oct 17, 2022

@negin513 I'm assuming since you are in CISL now, that we should plan on bringing this one in ourselves. Let us know if you have any thoughts on that. I think this is close to being done and we should be able to take it on and finish it out.

@ekluzek ekluzek self-assigned this Jan 9, 2023
@wwieder wwieder added this to the CESM3 Answer changing freeze milestone Jun 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code health improving internal code structure to make easier to maintain (sustainability) enhancement new capability or improved behavior of existing capability
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants