Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Master/Main branch Python make tests fail at the end #1611

Closed
negin513 opened this issue Jan 22, 2022 · 6 comments
Closed

Master/Main branch Python make tests fail at the end #1611

negin513 opened this issue Jan 22, 2022 · 6 comments
Labels
closed: wontfix We won't fix this issue, because it would be too difficult and/or isn't important enough to fix enhancement new capability or improved behavior of existing capability

Comments

@negin513
Copy link
Contributor

negin513 commented Jan 22, 2022

Brief summary of bug

Hello everyone,
I've just noticed that on master branch without any changes or modifications by me, when I run make at the end it shows a fail/error message:
This is right after black reformatting suggestions.

33 files would be reformatted, 6 files would be left unchanged.
Makefile:45: recipe for target 'black' failed
make: *** [black] Error 1

When I am running the black line outside make (black --check --config pyproject.toml .) , it does not fail.

I don't think this is a major or critical issue at all, but just wanted to file this in case others have faced it and wanted to ensure I am not doing anything wrong.


On a separate note, the fsurdat_modifier tests fail when the user does not do git lfs pull, with the following error which was confusing. For me it takes a while to figure out what I was doing wrong: which was doing git lfs fetch --all instead of git lfs pull.

Traceback (most recent call last):
  File "/glade/scratch/negins/ctsm_fbsh/python/ctsm/test/test_sys_fsurdat_modifier.py", line 70, in test_allInfo
    fsurdat_modifier(self._cfg_file_path)
  File "/glade/scratch/negins/ctsm_fbsh/python/ctsm/modify_fsurdat/fsurdat_modifier.py", line 101, in fsurdat_modifier
    lnd_lon_1, lnd_lon_2, lnd_lat_1, lnd_lat_2, landmask_file)
  File "/glade/scratch/negins/ctsm_fbsh/python/ctsm/modify_fsurdat/modify_fsurdat.py", line 45, in init_from_file
    my_file = xr.open_dataset(fsurdat_in)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/api.py", line 572, in open_dataset
    store = opener(filename_or_obj, **extra_kwargs, **backend_kwargs)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 364, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 314, in __init__
    self.format = self.ds.data_model
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 373, in ds
    return self._acquire()
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 367, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 187, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "/glade/u/apps/ch/opt/python/3.7.9/gnu/9.1.0/pkg-library/20201220/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 205, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "netCDF4/_netCDF4.pyx", line 2357, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4/_netCDF4.pyx", line 1925, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -51] NetCDF: Unknown file format: b'/glade/scratch/negins/ctsm_fbsh/python/ctsm/test/testinputs/surfdata_5x5_amazon_16pfts_Irrig_CMIP6_simyr2000_c171214.nc'

I thought possible solutions around it might be:

  1. Running git lfs under the hood in ./run_ctsm_py_tests or as a part of Makefile to ensure these files are downloaded. (But what if the user does not have git lfs?)
  2. or Checking if the files exists in ./run_ctsm_py_tests: But I am not sure if this works because by default the files exist as a pointer in the ctsm repository.

Not sure if either of these solutions are good, but just some thoughts here.


General bug information

CTSM version you are using: ctsm5.1.dev071

Does this bug cause significantly incorrect results in the model's science? No

Details of bug

[Fill in details here.]

Important details of your setup / configuration so we can reproduce the bug

My commands for replicating this:

git clone  https://github.com/ESCOMP/CTSM.git  
cd CTSM
./manage_externals/checkout_externals
cd python/
git lfs pull
make

Definition of Done:
Add a check that "git lfs" is installed and working.

@ekluzek
Copy link
Collaborator

ekluzek commented Jan 22, 2022

@negin513 thanks for reporting this. It's good to see what the LFS error looks like.

The failure in the black target is my fault. The black target is going to fail until we bring #1577 in. I probably should have shown everyone what that looks like. And maybe should have left the black check off of "all" until we do have it working.

@negin513
Copy link
Contributor Author

@ekluzek It is absolutely fine. I just wanted to make sure that I am not doing anything wrong to break the tests before creating the new tag. I think this is a great feature that you added to the tests. 👍

@billsacks
Copy link
Member

I'd like to better understand what led to the git lfs problem. In my testing, as long as I have run 'git lfs install' on the machine I'm using (as a one-time thing), if I do a fresh clone of CTSM, I do not need to do 'git lfs pull', because the CTSM repo is now set up to do this pull automatically for netcdf files. It's possible that things would get confused for an existing clone that predates this change in lfs configuration, and you need to do a single git lfs pull in that case. Is it possible that you encountered this either (1) for a pre-existing clone, or (2) because you hadn't run 'git lfs install' before doing your git clone? If neither of these is the case, then I would be interested in seeing steps to reproduce this problem, and I can try to figure out what might be causing it.

@ekluzek
Copy link
Collaborator

ekluzek commented Jan 22, 2022

@billsacks I just tried this myself to better understand it. I hadn't ever run "git lfs install" on cheyenne, nor "git lfs pull". So when I tried to run the test (in a preexisting clone as I normally do), I got the error that @negin513 shows above.

After doing

git lfs install
git lfs pull

then the test works correctly. So in your asking of "1" or "2", I was both 1 and 2. And it sounds like @negin513 hadn't run "git lfs install" either, and might have done this in a preexisting clone, so likely both as well. Now, I'm curious if I checkout a new clone what happens, so I'll try that.

Now, that we've both done that we are unlikely to run into it again. But, I do wonder if it would be possible and easy to add a check if the above two had been run? Maybe there could be another target in the Makefile to "check_lfs" and just check to see if LFS is loaded? Or it could be part of the test above?

I have it setup on cheyenne now, but I might try it somewhere else and will have forgotten by then what the solution was. ;-) (only partially kidding). Anyone new will run into the same problem as well. Getting good error messages is always hard work, but it does really help if the error message can explain how to solve the error. My hope is that we'll be able to do something straight forward to catch this for developers, and give them the instructions of what to do.

@billsacks
Copy link
Member

I like the idea of adding a check. It could be a direct check of lfs like you suggest (I'm not sure off-hand how to check if you have installed lfs in your gitconfig, but maybe there is a way), or it could be a check of a sample file used as an indicator of whether lfs is working. For the latter, I would imagine putting a tiny netcdf file in the repository like confirm_lfs.nc. Then you could have some code that tries to do some netcdf operation on the file, and give a meaningful error message if it fails. Maybe you could do this in the Makefile, but I think it would be easier to do it in run_ctsm_py_tests: One of the first things that script could do would be to try to open that file as a netcdf dataset. This open would be in a try/except block, which catches an OSError; if caught, it would print a helpful message about git lfs before aborting.

I don't foresee having time to implement this myself in the near future, but I support this idea if someone else wants to take a crack at it.

@ekluzek ekluzek added next this should get some attention in the next week or two. Normally each Thursday SE meeting. enhancement new capability or improved behavior of existing capability labels Jan 26, 2022
@billsacks billsacks removed the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Feb 24, 2022
@ekluzek ekluzek added the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Jan 9, 2023
@billsacks
Copy link
Member

Closing this old issue as a wontfix... can revisit later if need be.

@billsacks billsacks closed this as not planned Won't fix, can't repro, duplicate, stale Feb 23, 2023
@billsacks billsacks added closed: wontfix We won't fix this issue, because it would be too difficult and/or isn't important enough to fix and removed next this should get some attention in the next week or two. Normally each Thursday SE meeting. labels Feb 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed: wontfix We won't fix this issue, because it would be too difficult and/or isn't important enough to fix enhancement new capability or improved behavior of existing capability
Projects
None yet
Development

No branches or pull requests

3 participants