list duplicate test names with patchcheck #60283
See also bpo-16056 for the current list of duplicate test names in
The attached patch improves patchcheck.py to list duplicate test
An example of patchcheck output with the patch applied:
==================
$ make patchcheck
./python ./Tools/scripts/patchcheck.py
Getting the list of files that have been added/changed ... 1 file
Fixing whitespace ... 0 files
Fixing C file whitespace ... 0 files
Fixing docs whitespace ... 0 files
Duplicate test names ... 1 test:
TestErrorHandling.test_get_only in file Lib/test/test_heapq.py
Docs modified ... NO
Misc/ACKS updated ... NO
Misc/NEWS updated ... NO
configure regenerated ... not needed
pyconfig.h.in regenerated ... not needed
Did you run the test suite?
================== |
Nice feature to do without adding a dependency on a lint tool! |
I would like to see this written in a way that would let one run it globally or on a single file independent of a patch (e.g. an independent script from which patchcheck could import certain functions). Or is that what you explicitly didn't want Éric? :)
This would let one do a report or global check as was done for bpo-16056. It would also make it a bit easier to check manually that the script is checking for duplicates correctly.
Also, some suggestions:
+def testmethod_names(code, name=[]):
It might be clearer to use the name=None form.
+ test_files = [fn for fn in python_files if
Are you getting the test files in test/ subdirectories of subpackages? I think checking that the file name starts with "test_" might be sufficient to get all test files.
+ if name[-1].startswith('test_'):
I believe 'test' is the prefix that unittest uses. I'm pretty sure we have some tests that don't start with 'test_'. |
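The name=None suggestion guards against Python's shared mutable default argument. A minimal sketch of the difference follows; the function names are hypothetical and are not taken from the patch:

```python
def collect_shared(item, names=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `names` appends to the same list.
    names.append(item)
    return names


def collect_fresh(item, names=None):
    # A new list is created on each call that omits `names`.
    if names is None:
        names = []
    names.append(item)
    return names


first = collect_shared("test_a")
second = collect_shared("test_b")
print(first is second)           # both calls returned the same list object
print(collect_fresh("test_a"))
print(collect_fresh("test_b"))
```

With a recursive generator that extends `name` on the way down, the None form keeps state from leaking between top-level calls.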
+1
IIRC those are just test helpers that are not executed directly. |
Here are a couple of examples of test method names that don't begin with "test_":
def testLoadTk(self):
def testLoadTkFailure(self):
http://hg.python.org/cpython/file/f1094697d7dc/Lib/tkinter/test/test_tkinter/test_loadtk.py#l9 |
sqlite3 tests use CheckThing style (urgh). |
I thought you were talking about test files. I still don't see why we should look only for test_* methods: any class might contain duplicate method names, so they should all be checked. |
Oh, I see why you said that then. To find the test files themselves, this logic was used in the patch:
+ fn.startswith(os.path.join('Lib', 'test'))]
Regarding your question for the general case, I'm not sure if there is ever a use case for duplicate method names. Is there? |
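To make the two file-selection strategies concrete, here is a small sketch comparing the path-prefix test quoted from the patch with the basename test suggested earlier in the thread. The helper names are hypothetical, and the sample paths are ones mentioned in this issue:

```python
import os


def under_lib_test(path):
    # Selection quoted from the patch: the path starts with Lib/test.
    return path.startswith(os.path.join("Lib", "test"))


def named_like_test(path):
    # Alternative suggested in the thread: the file name starts with "test_".
    return os.path.basename(path).startswith("test_")


paths = [
    os.path.join("Lib", "test", "test_heapq.py"),
    os.path.join("Lib", "tkinter", "test", "test_tkinter", "test_loadtk.py"),
    os.path.join("Lib", "heapq.py"),
]
for path in paths:
    print(path, under_lib_test(path), named_like_test(path))
```

The prefix test misses test files that live in test/ subdirectories of subpackages, which is the concern raised above.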
Note that using the module code object to find duplicates does not
Duplicates are extensively used within the std lib:
Running find_duplicate_test_names.py, the initial script from issue
if not name[-1].startswith('<'):
    yield '.'.join(name)
prints 347 (on a total of 1368 std lib .py files) duplicate
To eliminate module level functions (but not nested functions), the
if len(name) > 2 and not name[-1].startswith('<'):
    yield '.'.join(name)
and lists 188 duplicate nested functions, methods or classes. In |
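The code-object approach discussed here can be sketched roughly as follows. This is not the attached script. It is a minimal illustration that compiles a source string, walks the nested code objects in co_consts, and reports dotted names that occur more than once. The function names and the sample source are assumptions made for the example:

```python
import collections
import types


def code_names(code, prefix=()):
    """Yield the dotted name of every code object nested inside `code`."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            name = prefix + (const.co_name,)
            # Skip compiler-generated names such as <lambda> or <genexpr>.
            if not name[-1].startswith('<'):
                yield '.'.join(name)
            yield from code_names(const, name)


def duplicate_code_names(source, filename='<string>'):
    """Return the sorted dotted names defined more than once in `source`."""
    code = compile(source, filename, 'exec')
    counts = collections.Counter(code_names(code))
    return sorted(name for name, count in counts.items() if count > 1)


SAMPLE = (
    "class T:\n"
    "    def test_a(self):\n"
    "        return 1\n"
    "    def test_a(self):\n"
    "        return 2\n"
    "    def test_b(self):\n"
    "        return 3\n"
)
print(duplicate_code_names(SAMPLE))
```

Compiling never executes the file, but files that fail to compile have to be reported separately.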
Using the python class browser (pyclbr.py) in conjunction with the |
It doesn't necessarily have to be limited to methods; anything duplicated might turn out to be a bug. If the script doesn't mix scopes there shouldn't be too many false positives, and if there are, it shouldn't be a big deal if they are reported on the changed file by
Nothing that can't be done in a more elegant way afaict. It might make sense for variables though, where you have e.g.:
foo = do_something(x)
foo = do_something_more(foo) |
Property getter, setter, and deleter methods do have the same name. |
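A short illustration of why properties are a legitimate case of repeated names within one class body (the class is hypothetical):

```python
class Temperature:
    def __init__(self):
        self._celsius = 0.0

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # Same name as the getter on purpose: the decorator returns a new
        # property object that is rebound to the name `celsius`.
        self._celsius = float(value)


t = Temperature()
t.celsius = 21
print(t.celsius)
```

A duplicate checker therefore needs some rule for decorated redefinitions, or an ignore list, to avoid flagging this pattern.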
Also Lib/test/test_smtplib.py test method names start with 'test' |
For informational purposes, here is where unittest defaults to the prefix "test" for finding test methods: http://hg.python.org/cpython/file/f11649b21603/Lib/unittest/loader.py#l48 sqlite3 is able to use "Check" because it manages its own test discovery. For example-- http://hg.python.org/cpython/file/f11649b21603/Lib/sqlite3/test/regression.py#l306 |
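The default prefix can be observed directly on unittest.TestLoader. The sketch below uses a hypothetical Sample class. Setting testMethodPrefix to "Check" is shown only to illustrate the attribute and is not necessarily how sqlite3 performs its own discovery:

```python
import unittest


class Sample(unittest.TestCase):
    def test_default(self):
        pass

    def testCamelCase(self):
        pass

    def CheckSomething(self):
        pass


default_loader = unittest.TestLoader()
print(default_loader.testMethodPrefix)
print(default_loader.getTestCaseNames(Sample))

# A loader configured with a different prefix picks up other methods.
check_loader = unittest.TestLoader()
check_loader.testMethodPrefix = "Check"
print(check_loader.getTestCaseNames(Sample))
```

Both test_default and testCamelCase are collected by the default loader, so a checker that filters on 'test_' would skip methods that unittest does run.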
The attached script, named duplicate_code_names.py, takes a file
The script output on the whole std lib (see the result in the
$ time ./python Tools/scripts/duplicate_code_names.py $(find Lib -name "*py") > std_lib_duplicates.txt
Lib/test/badsyntax_future4.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future4.py, line 3)
Lib/test/badsyntax_future6.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future6.py, line 3)
Lib/test/badsyntax_future3.py: compile error: future feature rested_snopes is not defined (badsyntax_future3.py, line 3)
Lib/test/badsyntax_future9.py: compile error: not a chance (badsyntax_future9.py, line 3)
Lib/test/bad_coding.py: compile error: unknown encoding for 'Lib/test/bad_coding.py': uft-8
Lib/test/badsyntax_future8.py: compile error: future feature * is not defined (badsyntax_future8.py, line 3)
Lib/test/badsyntax_3131.py: compile error: invalid character in identifier (badsyntax_3131.py, line 2)
Lib/test/badsyntax_future7.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future7.py, line 3)
Lib/test/bad_coding2.py: compile error: encoding problem for 'Lib/test/bad_coding2.py': utf-8
Lib/test/badsyntax_pep3120.py: compile error: invalid or missing encoding declaration for 'Lib/test/badsyntax_pep3120.py'
Lib/test/badsyntax_future5.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future5.py, line 4)
Lib/lib2to3/tests/data/different_encoding.py: compile error: invalid syntax (different_encoding.py, line 3)
Lib/lib2to3/tests/data/py2_test_grammar.py: compile error: invalid token (py2_test_grammar.py, line 31)
Lib/lib2to3/tests/data/bom.py: compile error: invalid syntax (bom.py, line 2)
Lib/lib2to3/tests/data/crlf.py: compile error: invalid syntax (crlf.py, line 1)
Lib/__phello__.foo.py: __phello__.foo not a valid module name
real 6m14.854s
FWIW running the same command with python 3.2 takes about 2.5 |
duplicate_code_names_2.py uses tokenize to print duplicate code names
$ ./duplicate_code_names_2.py --ignore ignored_duplicates .
Duplicate function or class names:
./Lib/test/_test_multiprocessing.py:3047 _TestProcess
./Lib/test/test_os.py:1290 Win32ErrorTests
Duplicate method names: |
The following issues have been entered for all the above duplicate bpo-19112, bpo-19113, bpo-19114, bpo-19115, bpo-19116, except the following which should be added to ignored_duplicates:
|
FWIW testtools rejects test suites with duplicate test ids; I'm considering adding that feature into unittest itself. We'd need an option to make it warn rather than error I think, but if we did that we wouldn't need a separate script at all. |
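As a rough idea of what an id-based check could look like on top of plain unittest, the sketch below flattens a suite and reports test ids that appear more than once. The helper names are hypothetical, and this is not testtools' or unittest's implementation:

```python
import collections
import unittest


def iter_tests(suite):
    # Flatten nested suites into individual test cases.
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            yield from iter_tests(item)
        else:
            yield item


def duplicate_test_ids(suite):
    counts = collections.Counter(test.id() for test in iter_tests(suite))
    return sorted(test_id for test_id, n in counts.items() if n > 1)


class Sample(unittest.TestCase):
    def test_one(self):
        pass

    def test_two(self):
        pass


suite = unittest.TestSuite()
suite.addTest(Sample("test_one"))
suite.addTest(unittest.TestSuite([Sample("test_one"), Sample("test_two")]))
print(duplicate_test_ids(suite))
```

Note that a method redefined inside a class body replaces the earlier definition before any test id exists, so a runtime check of this kind sees only the surviving method.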
Upgrading the script to account for Python changes. This is now duplicate_code_names_3.py
$ ./python ./duplicate_code_names_3.py --ignore ignored_duplicates Lib/test
Duplicate method names:
Lib/test/test_dataclasses.py:1406 TestCase.test_helper_asdict_builtin_containers
Lib/test/test_dataclasses.py:1579 TestCase.test_helper_astuple_builtin_containers
Lib/test/test_dataclasses.py:700 TestCase.test_not_tuple
Lib/test/test_dataclasses.py:3245 TestReplace.test_recursive_repr_two_attrs
Lib/test/test_genericclass.py:161 TestClassGetitem.test_class_getitem
Lib/test/test_gzip.py:764 TestCommandLine.test_compress_infile_outfile
Lib/test/test_heapq.py:376 TestErrorHandling.test_get_only
Lib/test/test_importlib/test_util.py:755 PEP3147Tests.test_source_from_cache_path_like_arg
Lib/test/test_logging.py:328 BuiltinLevelsTest.test_regression_29220
Lib/test/test_sys_setprofile.py:363 ProfileSimulatorTestCase.test_unbound_method_invalid_args
Lib/test/test_sys_setprofile.py:354 ProfileSimulatorTestCase.test_unbound_method_no_args
Lib/test/test_utf8_mode.py:198 UTF8ModeTests.test_io_encoding
False positives have been removed from the output of the above command, so all the above methods are effectively duplicates that must be fixed. |
This script should be part of Python and run in the pre-commit CI like Travis CI! |
Agreed, making duplicate method definitions a CI failure is the desired end state once our test suite is cleaned up and it doesn't have false positives. FYI - pylint also implements this check quite reliably as function-redefined via its pylint.checkers.base.BasicErrorChecker._check_redefinition() method. https://github.com/PyCQA/pylint/blob/2.2/pylint/checkers/base.py#L843 |
Thanks for the link, Gregory. I will write a script based on ast and check its output against pylint and against the current script based on tokenize. The travis() function of Tools/scripts/patchcheck.py may be modified to import this script and run it only on files modified by the PR. This may allow the pre-commit duplicate check to be installed without waiting for the python test suite to be cleaned, if the existing duplicates are temporarily added to the ignored_duplicates file (assuming an issue has been entered for each one of those existing duplicates with a note saying to remove the entry in ignored_duplicates when the issue is fixed). Indeed, issues bpo-19113 and bpo-19119 are still open 5 years after they were entered. |
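An ast-based check along these lines might look like the sketch below. It is an illustration only, not the script from the PR. The function name is hypothetical, and skipping every decorated redefinition is a simplifying assumption to keep property setters and similar patterns out of the report:

```python
import ast


def duplicate_methods(source, filename="<string>"):
    """Return sorted (lineno, 'Class.method') pairs for redefined methods."""
    found = []
    tree = ast.parse(source, filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.ClassDef):
            continue
        seen = set()
        for item in node.body:
            if not isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            # Decorated redefinitions (property setters, overloads, ...)
            # are skipped to limit false positives.
            if item.name in seen and not item.decorator_list:
                found.append((item.lineno, f"{node.name}.{item.name}"))
            seen.add(item.name)
    return sorted(found)


SAMPLE = '''\
class T:
    def test_a(self):
        return 1

    def test_a(self):
        return 2

    @property
    def value(self):
        return 0

    @value.setter
    def value(self, new):
        pass
'''
print(duplicate_methods(SAMPLE))
```

Only direct children of a class body are inspected, so scopes are not mixed. Methods defined under an `if` inside the class body are not seen by this sketch.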
PR 12886 adds a check on duplicate method definitions to the travis() function of patchcheck.py. False positives must be entered in Tools/scripts/duplicates_ignored.txt. The existing duplicates have been entered in this file (with the corresponding bpo issue number in a comment) and no duplicates are currently found by duplicate_meth_defs.py when run with '--ignore duplicates_ignored.txt'. It is expected that these entries will be removed from this file when their issues are closed. |
Should the unittest module grow a feature to scan for duplicate methods? I imagine that duplicate methods are a common problem. Possibly, inheriting from unittest can be accompanied by a metaclass that has __prepare__ with special dictionary that detects and warns about duplicates. |
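The metaclass idea can be sketched as follows. This is a minimal illustration with hypothetical class names, not a proposed unittest API. Restricting the warning to callables is an assumption made so that property objects, which are not callable, pass through:

```python
import warnings


class DuplicateWarningDict(dict):
    def __setitem__(self, key, value):
        # Warn when a callable is rebound to a name already in the class body.
        if key in self and callable(value):
            warnings.warn(f"duplicate definition of {key!r}", stacklevel=2)
        super().__setitem__(key, value)


class DuplicateCheckMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body is executed with this mapping as its namespace.
        return DuplicateWarningDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        return super().__new__(mcls, name, bases, dict(namespace), **kwargs)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    class Sample(metaclass=DuplicateCheckMeta):
        def test_a(self):
            return 1

        def test_a(self):
            return 2

print([str(w.message) for w in caught])
```

Because the check runs while the class body executes, it catches the redefinition at import time, before the second definition silently replaces the first.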
The open proposed PR for this issue has been languishing unreviewed for several months now. Since the proposal is really a request to change our development process, I'm nosying Brett and Łukasz (3.9 RM). In any case, if we decide to add this to our CI, I think we should only start with the master branch, so I'm closing the 3.7 and 2.7 backport PRs. |
Superseded by python/core-workflow#505 |
Isn’t that repo for tooling and meta issues? IMO this should stay in the CPython tracker for visibility. |
Yes, and CI issues. Most relevant comments in this thread are about running these checks in CI. So it belongs in the core-workflow repo. |
Oh, in my opinion the tool / config should be available for devs to run locally first, then also run from CI! |
OK, I'm reopening the ticket. |
…09365)
* gh-60283: Check for redefined test names in CI (GH-109161)
(cherry picked from commit 3cb9a8e)
Co-authored-by: Hugo van Kemenade
Co-authored-by: Alex Waygood
Co-authored-by: Adam Turner
* Update exclude list for 3.12
* Explicitly exclude files which failed to lint/parse
* Sort to avoid future merge conflicts