[2.2.2 regression] Codespell tries and fails to decode and fix .git\objects files #2539

Closed
EwoutH opened this issue Oct 18, 2022 · 6 comments · Fixed by #2541

Comments

@EwoutH

EwoutH commented Oct 18, 2022

Using Codespell 2.2.2 on Python 3.10 in a Git repository, Codespell fails with a permission error while trying to modify Git objects:

PermissionError: [Errno 13] Permission denied: '.\\.git\\objects\\28\\719cdd2c5741cadfa2eede24b624249600f2aa'

This behaviour was not present in Codespell 2.2.1 (I tested that multiple times on multiple branches) so I think it's a 2.2.2 regression.

Full log:

(Py310) C:\Users\Ewout\Documents\GitHub\EMAworkbench>codespell . -w
.\CONTRIBUTING.md:24: wil ==> will, well
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\00\a64bbac844bee64e9747d950e769870ac01fc8
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\0c\d404eaacd02cd51a30438e14ecbadba2bc25d7
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\0e\221584b2ca3efb1d0c227644f863571a195ce5
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\0e\b5063aece7ad0f777839db61502047e92b982a
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\0e\b88e5cd03acfd5ace58bf042313c7f22fbffda
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\0e\dd7ff99fb009e57273889f9ec19ef612b4d271
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\0f\eb1bea2475bb2b103d2499f28a4411154d336c
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\17\ec18a5d54ae157814d94b5adb70a2545af6e37
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\18\9acca87135099b8ec590204b8141b131b69796
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\19\cba58fe9824793572e1e06a45ae98bc0fa8b06
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\1f\8d3993763dd7f55d06edaaf7bef3750a8750c6
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\25\5e3d6d9639dfe6fd4e797e1c63d59ba0522c2d
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\27\471e6e6b17405d4bcac965e226e65bda20f70a
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\27\f2cf2ad8af4f240709fe46980adf2718ef85d6
WARNING: Trying next encoding iso-8859-1
WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\28\719cdd2c5741cadfa2eede24b624249600f2aa
WARNING: Trying next encoding iso-8859-1
FIXED: .\.git\objects\28\719cdd2c5741cadfa2eede24b624249600f2aa
Traceback (most recent call last):
  File "C:\Users\Ewout\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Ewout\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Ewout\.virtualenvs\Py310\Scripts\codespell.exe\__main__.py", line 7, in <module>
  File "C:\Users\Ewout\.virtualenvs\Py310\lib\site-packages\codespell_lib\_codespell.py", line 767, in _script_main
    return main(*sys.argv[1:])
  File "C:\Users\Ewout\.virtualenvs\Py310\lib\site-packages\codespell_lib\_codespell.py", line 910, in main
    bad_count += parse_file(
  File "C:\Users\Ewout\.virtualenvs\Py310\lib\site-packages\codespell_lib\_codespell.py", line 760, in parse_file
    with codecs.open(filename, 'w', encoding=encoding) as f:
  File "C:\Users\Ewout\AppData\Local\Programs\Python\Python310\lib\codecs.py", line 905, in open
    file = builtins.open(filename, mode, buffering)
PermissionError: [Errno 13] Permission denied: '.\\.git\\objects\\28\\719cdd2c5741cadfa2eede24b624249600f2aa'
@DimitriPapadopoulos
Collaborator

While I can reproduce this issue, I wonder whether it is really a regression. Indeed, I had similar errors with former versions of codespell, see #2189.

When just reading files, without -w:

$ mkdir -p foo/bar
$ touch foo/bar/file
$ chmod a= foo/bar/file
$ cd foo
$ codespell
Traceback (most recent call last):
  File "/home/username/.local/bin/codespell", line 8, in <module>
    sys.exit(_script_main())
  File "/my/path/codespell/codespell_lib/_codespell.py", line 767, in _script_main
    return main(*sys.argv[1:])
  File "/my/path/codespell/codespell_lib/_codespell.py", line 910, in main
    bad_count += parse_file(
  File "/my/path/codespell/codespell_lib/_codespell.py", line 651, in parse_file
    text = is_text_file(filename)
  File "/my/path/codespell/codespell_lib/_codespell.py", line 509, in is_text_file
    with open(filename, mode='rb') as f:
PermissionError: [Errno 13] Permission denied: './bar/file'
$ 

When attempting to fix files, with -w:

$ codespell -w
Traceback (most recent call last):
  File "/home/username/.local/bin/codespell", line 8, in <module>
    sys.exit(_script_main())
  File "/my/path/codespell/codespell_lib/_codespell.py", line 767, in _script_main
    return main(*sys.argv[1:])
  File "/my/path/codespell/codespell_lib/_codespell.py", line 910, in main
    bad_count += parse_file(
  File "/my/path/codespell/codespell_lib/_codespell.py", line 651, in parse_file
    text = is_text_file(filename)
  File "/my/path/codespell/codespell_lib/_codespell.py", line 509, in is_text_file
    with open(filename, mode='rb') as f:
PermissionError: [Errno 13] Permission denied: './bar/file'
$ 

Could it be that codespell used to skip .git, and now does not skip .git, for some reason?

@DimitriPapadopoulos
Collaborator

DimitriPapadopoulos commented Oct 18, 2022

As for the following warnings, they are indeed new and the result of 900f186, yet they are expected. It's just that the encoding detection code used to have a bug, and this bug has been fixed by 900f186.

WARNING: Decoding file using encoding=utf-8 failed: .\.git\objects\00\a64bbac844bee64e9747d950e769870ac01fc8
WARNING: Trying next encoding iso-8859-1
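
The fallback those warnings describe can be sketched roughly as follows. This is a minimal illustration, not codespell's actual code: `decode_with_fallback` is a hypothetical helper, and the encoding order is simply the one visible in the log. Because ISO-8859-1 assigns a character to every byte value, the second attempt cannot fail, so even binary Git objects end up being treated as text.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8", "iso-8859-1")):
    """Return (text, encoding) for the first encoding that decodes data.

    Hypothetical helper for illustration; not codespell's implementation.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            # Corresponds to the "Trying next encoding" step in the log.
            continue
    raise ValueError("none of the encodings could decode the data")


# Arbitrary binary bytes that are not valid UTF-8.
blob = b"\x9c\x01\xff\xfe"
text, used = decode_with_fallback(blob)
print(used)
```

With this input the UTF-8 attempt raises and the ISO-8859-1 attempt succeeds, which is why binary files are better excluded before decoding is attempted at all.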

@EwoutH
Author

EwoutH commented Oct 18, 2022

So I should probably let codespell ignore my .git folder from this point onwards? Or is the error still undesired behaviour?

@peternewman
Collaborator

Does just running codespell work, @EwoutH, when in the right directory?

We should ignore hidden files already as per e.g.:
#2539 (comment)

Do the tests pass on Windows for you?

I suspect it's a bug somewhere, but I don't know if it's our hidden directory checking, or our Windows behaviour or something else...

@DimitriPapadopoulos
Collaborator

DimitriPapadopoulos commented Oct 18, 2022

You should certainly add skip = ./.git to your configuration file. As far as I know, codespell has never skipped .git by itself.
Edit: My mistake, codespell does seem to skip hidden files. I wonder why I had, or thought I had, to add skip = ./.git in some projects:

$ mkdir .git
$ touch .git/file
$ chmod a= .git/file
$ codespell -w
$ 

I still wonder why you cannot reproduce this issue with prior versions of codespell. I seem to be able to trigger this error with codespell 2.2.0:

$ mkdir -p foo/bar
$ touch foo/bar/file
$ chmod a= foo/bar/file
$ cd foo
$ codespell --version
2.2.0
$ codespell
Traceback (most recent call last):
  File "/home/username/.local/bin/codespell", line 8, in <module>
    sys.exit(_script_main())
  File "/my/path/codespell/codespell_lib/_codespell.py", line 767, in _script_main
    return main(*sys.argv[1:])
  File "/my/path/codespell/codespell_lib/_codespell.py", line 910, in main
    bad_count += parse_file(
  File "/my/path/codespell/codespell_lib/_codespell.py", line 651, in parse_file
    text = is_text_file(filename)
  File "/my/path/codespell/codespell_lib/_codespell.py", line 509, in is_text_file
    with open(filename, mode='rb') as f:
PermissionError: [Errno 13] Permission denied: './bar/file'
$ 

@DimitriPapadopoulos
Collaborator

DimitriPapadopoulos commented Oct 19, 2022

OK, it appears codespell skips hidden files but does not skip hidden directories reliably. That has always been the case, hence the need to explicitly skip .git in the configuration file.

$ echo 'errror' > .hidden_file
$ 
$ mkdir -p .hidden_directory/a/b/c
$ echo 'errror' > .hidden_directory/file
$ echo 'errror' > .hidden_directory/a/file
$ echo 'errror' > .hidden_directory/a/b/file
$ echo 'errror' > .hidden_directory/a/b/c/file
$ 
$ codespell 
./.hidden_directory/a/file:1: errror ==> error
./.hidden_directory/a/b/file:1: errror ==> error
./.hidden_directory/a/b/c/file:1: errror ==> error
$ 

As you can see, it's all a question of depth. Hidden files and files directly under a hidden directory are skipped. Files in subdirectories of hidden directories are not skipped. This is not a 2.2.2 regression; it is an existing bug that simply was not known. I guess you were lucky not to hit the bug before.

We will have to modify the call to os.walk and make sure we use the default topdown=True, so that hidden directories can be pruned in place.

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, [...]
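
The pruning described in that quote could look roughly like the sketch below. It is not the actual codespell change: `walk_visible_files` is a hypothetical name, and "hidden" is assumed to mean a name starting with a dot, as in the examples above.

```python
import os
import tempfile


def walk_visible_files(top):
    """Yield file paths under top, skipping hidden files and directories.

    Hypothetical helper; "hidden" is assumed to mean a leading dot.
    """
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment edits the very list os.walk uses, so hidden
        # directories are never descended into, at any depth.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            if not name.startswith("."):
                yield os.path.join(root, name)


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, ".hidden_directory", "a", "b", "c"))
    os.makedirs(os.path.join(top, "visible"))
    for rel in (".hidden_file", ".hidden_directory/a/b/c/file", "visible/file"):
        with open(os.path.join(top, *rel.split("/")), "w") as f:
            f.write("errror\n")
    found = sorted(os.path.relpath(p, top) for p in walk_visible_files(top))
    print(found)
```

Only the file under visible is reported. Rebinding the name instead (dirs = [...]) would leave the list that os.walk holds untouched, so the slice assignment is what makes the pruning effective.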
