Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

list duplicate test names with patchcheck #60283

Closed
xdegaye mannequin opened this issue Sep 28, 2012 · 36 comments · Fixed by #109161
Closed

list duplicate test names with patchcheck #60283

xdegaye mannequin opened this issue Sep 28, 2012 · 36 comments · Fixed by #109161
Labels
3.9 only security fixes easy stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@xdegaye
Copy link
Mannequin

xdegaye mannequin commented Sep 28, 2012

BPO 16079
Nosy @rhettinger, @gpshead, @vstinner, @rbtcollins, @ned-deily, @ezio-melotti, @merwok, @cjerdonek, @xdegaye, @ambv, @miss-islington
PRs
  • bpo-16079: fix duplicate test method name in test_gzip. #12827
  • [3.7] bpo-16079: fix duplicate test method name in test_gzip. (GH-12827) #12828
  • bpo-16079: Add the duplicate_meth_defs.py tool as a pre-commit check on Travis #12886
  • [2.7] bpo-16079: Add the duplicate_meth_defs.py tool (GH-12886) #12940
  • [3.7] bpo-16079: Add the duplicate_meth_defs.py tool as a pre-commit check on Travis (GH-12886) #12950
  • Files
  • duplicate_test_names.patch
  • duplicate_code_names.py
  • std_lib_duplicates.txt
  • duplicate_code_names_2.py
  • ignored_duplicates
  • duplicate_code_names_3.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2012-09-28.10:15:37.662>
    labels = ['easy', 'tests', 'type-bug', 'library', '3.9']
    title = 'list duplicate test names with patchcheck'
    updated_at = <Date 2019-11-27.18:45:45.910>
    user = 'https://github.com/xdegaye'

    bugs.python.org fields:

    activity = <Date 2019-11-27.18:45:45.910>
    actor = 'brett.cannon'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)', 'Tests']
    creation = <Date 2012-09-28.10:15:37.662>
    creator = 'xdegaye'
    dependencies = []
    files = ['27326', '27375', '27376', '31891', '31892', '48265']
    hgrepos = []
    issue_num = 16079
    keywords = ['patch', 'easy']
    message_count = 31.0
    messages = ['171428', '171499', '171505', '171509', '171511', '171512', '171513', '171519', '171536', '171544', '171548', '171558', '171559', '171576', '171734', '198525', '198586', '229625', '340168', '340219', '340221', '340252', '340286', '340295', '340323', '340584', '340585', '340687', '340785', '340786', '351668']
    nosy_count = 11.0
    nosy_names = ['rhettinger', 'gregory.p.smith', 'vstinner', 'rbcollins', 'ned.deily', 'ezio.melotti', 'eric.araujo', 'chris.jerdonek', 'xdegaye', 'lukasz.langa', 'miss-islington']
    pr_nums = ['12827', '12828', '12886', '12940', '12950']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue16079'
    versions = ['Python 3.9']

    Linked PRs

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Sep 28, 2012

    See also bpo-16056 for the current list of duplicate test names in
    the std lib.

    The attached patch improves patchcheck.py to list duplicate test
    names when running 'make patchcheck'. This patch to the default
    branch can also be applied asis to the 2.7 branch.

    An example of patchcheck output with the patch applied:

    ==================

    $ make patchcheck
    ./python ./Tools/scripts/patchcheck.py
    Getting the list of files that have been added/changed ... 1 file
    Fixing whitespace ... 0 files
    Fixing C file whitespace ... 0 files
    Fixing docs whitespace ... 0 files
    Duplicate test names ... 1 test:
      TestErrorHandling.test_get_only in file Lib/test/test_heapq.py
    Docs modified ... NO
    Misc/ACKS updated ... NO
    Misc/NEWS updated ... NO
    configure regenerated ... not needed
    pyconfig.h.in regenerated ... not needed

    Did you run the test suite?

    ==================

    @xdegaye xdegaye mannequin added type-feature A feature request or enhancement stdlib Python modules in the Lib dir labels Sep 28, 2012
    @merwok
    Copy link
    Member

    merwok commented Sep 28, 2012

    Nice feature to do without adding a dependency on a lint tool!

    @cjerdonek
    Copy link
    Member

    I would like to see this written in a way that would let one run it globally or on a single file independent of a patch (e.g. an independent script from which patchcheck could import certain functions). Or is that what you explicitly didn't want Éric? :)

    This would let one do a report or global check as was done for bpo-16056. It would also make it a bit easier to check manually that the script is checking for duplicates correctly.

    Also, some suggestions:

    +def testmethod_names(code, name=[]):

    It might be clearer to use the name=None form.

    + test_files = [fn for fn in python_files if
    + fn.startswith(os.path.join('Lib', 'test'))]

    Are you getting the test files in test/ subdirectories of subpackages? I think checking that the file name starts with "test_" might be sufficient to get all test files.

    + if name[-1].startswith('test_'):

    I believe 'test' is the prefix that unittest uses. I'm pretty sure we have some tests that don't start with 'test_'.

    @ezio-melotti
    Copy link
    Member

    I would like to see this written in a way that would let one
    run it globally or on a single file independent of a patch

    +1
    It can be added to Tools/scripts and imported by patchcheck.

    I'm pretty sure we have some tests that don't start with 'test_'.

    IIRC those are just test helpers that are not executed directly.
    OTOH I don't see why looking for test_*, every py file might contain duplicate names so they should all be checked.

    @cjerdonek
    Copy link
    Member

    Here are a couple examples of test method names that don't begin with "test_":

        def testLoadTk(self):
        def testLoadTkFailure(self):

    http://hg.python.org/cpython/file/f1094697d7dc/Lib/tkinter/test/test_tkinter/test_loadtk.py#l9

    @merwok
    Copy link
    Member

    merwok commented Sep 28, 2012

    sqlite3 tests use CheckThing style (urgh).

    @ezio-melotti
    Copy link
    Member

    Here are a couple examples of test method names that don't begin with "test_":

    I thought you were talking about test files. I still don't see why looking for test_* methods, every class might contain duplicate method names, so they should all be checked.

    @cjerdonek
    Copy link
    Member

    I thought you were talking about test files.

    Oh, I see why you said that then. To find the test files themselves, this logic was used in the patch:

    + fn.startswith(os.path.join('Lib', 'test'))]

    Regarding your question for the general case, I'm not sure if there is ever a use case for duplicate method names. Is there?

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Sep 28, 2012

    Note that using the module code object to find duplicates does not
    allow for selecting among the different code types: function, nested
    function, method or class.

    Duplicates are extensively used within the std lib:

    Running find_duplicate_test_names.py, the initial script from issue
    16056, on the whole std lib instead of just Lib/test, after updating
    the script to list all the duplicates (except <lambda>, <genexp>,
    ...) with:

        if not name[-1].startswith('<'):
            yield '.'.join(name)

    prints 347 (on a total of 1368 std lib .py files) duplicate
    functions, methods or classes.

    To eliminate module level functions (but not nested functions), the
    script is run now with the following change:

        if len(name) > 2 and not name[-1].startswith('<'):
            yield '.'.join(name)

    and lists 188 duplicate nested functions, methods or classes. In
    this list there are 131 duplicates in .py files located in the
    subdirectory of a "test" directory.

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Sep 28, 2012

    Using the python class browser (pyclbr.py) in conjunction with the
    search for duplicates in the module code object would allow to
    restrict the listing of duplicates to functions and methods or even
    just to methods (depending on the feature requirements), without
    listing the duplicate classes and duplicate nested functions. With
    an associated performance cost.

    @ezio-melotti
    Copy link
    Member

    It doesn't necessary have to be limited to methods, anything duplicate might turn out to be a bug. If the script doesn't mix scopes there shouldn't be too many false positives, and if they are it shouldn't be a big deal if they are reported on the changed file by make patchcheck.

    I'm not sure if there is ever a use case for duplicate
    method names. Is there?

    Nothing that can't be done in a more elegant way afaict.

    It might make sense for variables though, where you have e.g.:

    foo = do_something(x)
    foo = do_something_more(foo)

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Sep 29, 2012

    I'm not sure if there is ever a use case for duplicate method
    names. Is there?

    property getter, setter, and deleter methods do have the same name.

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Sep 29, 2012

    Here are a couple examples of test method names that don't begin
    with "test_":

    def testLoadTk(self):
    def testLoadTkFailure(self):
    

    Also Lib/test/test_smtplib.py test method names start with 'test'
    instead of 'test_' although the 'Regression tests package for Python'
    documentation states: "The test methods in the test module should
    start with test_".

    @cjerdonek
    Copy link
    Member

    For informational purposes, here is where unittest defaults to the prefix "test" for finding test methods:

    http://hg.python.org/cpython/file/f11649b21603/Lib/unittest/loader.py#l48

    sqlite3 is able to use "Check" because it manages its own test discovery. For example--

    http://hg.python.org/cpython/file/f11649b21603/Lib/sqlite3/test/regression.py#l306

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Oct 1, 2012

    The attached script, named duplicate_code_names.py, takes a file
    name list as argument and prints duplicate code names found in these
    files ordered by function, class, method and nested class or
    function.

    The script output on the whole std lib (see the result in the
    attached file std_lib_duplicates.txt):

    $ time ./python Tools/scripts/duplicate_code_names.py $(find Lib -name "*py") > std_lib_duplicates.txt
    Lib/test/badsyntax_future4.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future4.py, line 3)
    Lib/test/badsyntax_future6.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future6.py, line 3)
    Lib/test/badsyntax_future3.py: compile error: future feature rested_snopes is not defined (badsyntax_future3.py, line 3)
    Lib/test/badsyntax_future9.py: compile error: not a chance (badsyntax_future9.py, line 3)
    Lib/test/bad_coding.py: compile error: unknown encoding for 'Lib/test/bad_coding.py': uft-8
    Lib/test/badsyntax_future8.py: compile error: future feature * is not defined (badsyntax_future8.py, line 3)
    Lib/test/badsyntax_3131.py: compile error: invalid character in identifier (badsyntax_3131.py, line 2)
    Lib/test/badsyntax_future7.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future7.py, line 3)
    Lib/test/bad_coding2.py: compile error: encoding problem for 'Lib/test/bad_coding2.py': utf-8
    Lib/test/badsyntax_pep3120.py: compile error: invalid or missing encoding declaration for 'Lib/test/badsyntax_pep3120.py'
    Lib/test/badsyntax_future5.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future5.py, line 4)
    Lib/lib2to3/tests/data/different_encoding.py: compile error: invalid syntax (different_encoding.py, line 3)
    Lib/lib2to3/tests/data/py2_test_grammar.py: compile error: invalid token (py2_test_grammar.py, line 31)
    Lib/lib2to3/tests/data/bom.py: compile error: invalid syntax (bom.py, line 2)
    Lib/lib2to3/tests/data/crlf.py: compile error: invalid syntax (crlf.py, line 1)
    Lib/__phello__.foo.py: __phello__.foo not a valid module name

    real 6m14.854s
    user 6m14.455s
    sys 0m0.392s

    FWIW running the same command with python 3.2 takes about 2.5
    minutes instead of more than 6 minutes (importlib ?).

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Sep 28, 2013

    duplicate_code_names_2.py uses tokenize to print duplicate code names
    within the same scope, excluding property setter/getter/deleter
    duplicates, excluding duplicates of nested classes or functions, and
    ignoring duplicates listed in a file (run with --help for more
    details). With the attached ignored_duplicates file, it prints the
    following output on the root of the current default branch (in about 1
    mn on an old laptop):

    $ ./duplicate_code_names_2.py --ignore ignored_duplicates .
    Duplicate function or class names:
    ./Lib/test/_test_multiprocessing.py:3047 _TestProcess
    ./Lib/test/test_os.py:1290 Win32ErrorTests

    Duplicate method names:
    ./Lib/ctypes/test/test_functions.py:316 FunctionTestCase.test_errors
    ./Lib/distutils/tests/test_cmd.py:80 CommandTestCase.test_ensure_string_list
    ./Lib/lib2to3/tests/test_fixers.py:1467 Test_dict.test_14
    ./Lib/lib2to3/tests/test_fixers.py:1472 Test_dict.test_15
    ./Lib/lib2to3/tests/test_fixers.py:1477 Test_dict.test_17
    ./Lib/lib2to3/tests/test_fixers.py:1482 Test_dict.test_18
    ./Lib/lib2to3/tests/test_fixers.py:1487 Test_dict.test_19
    ./Lib/test/test_complex.py:104 ComplexTest.test_truediv
    ./Lib/test/test_dis.py:250 DisTests.test_big_linenos
    ./Lib/test/test_dis.py:294 DisTests.test_dis_object
    ./Lib/test/test_ftplib.py:537 TestFTPClass.test_mkd
    ./Lib/test/test_heapq.py:366 TestErrorHandling.test_get_only
    ./Lib/test/test_import.py:255 ImportTests.test_import_name_binding
    ./Lib/test/test_regrtest.py:210 ParseArgsTestCase.test_findleaks
    ./Lib/test/test_smtplib.py:249 DebuggingServerTests.testNotImplemented
    ./Lib/test/test_webbrowser.py:161 OperaCommandTest.test_open_new
    ./Lib/unittest/test/testmock/testmock.py:1381 MockTest.test_attribute_deletion
    ./Lib/xml/dom/minidom.py:379 Attr._get_name
    ./Mac/Tools/Doc/setup.py:123 DocBuild.makeHelpIndex

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Sep 29, 2013

    The following issues have been entered for all the above duplicate
    names found by duplicate_code_names_2.py:

    bpo-19112, bpo-19113, bpo-19114, bpo-19115, bpo-19116,
    bpo-19117, bpo-19118, bpo-19119, bpo-19122, bpo-19123,
    bpo-19125, bpo-19126, bpo-19127, bpo-19128

    except the following which should be added to ignored_duplicates:

    ./Lib/test/test_os.py:1290 Win32ErrorTests
    

    @rbtcollins
    Copy link
    Member

    FWIW testtools rejects test suites with duplicate test ids; I'm considering adding that feature into unittest itself. We'd need an option to make it warn rather than error I think, but if we did that we wouldn't need a separate script at all.

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Apr 13, 2019

    Upgrading the script to account for Python changes. This is now duplicate_code_names_3.py

    $ ./python ./duplicate_code_names_3.py --ignore ignored_duplicates Lib/test
    Duplicate method names:
    Lib/test/test_dataclasses.py:1406 TestCase.test_helper_asdict_builtin_containers
    Lib/test/test_dataclasses.py:1579 TestCase.test_helper_astuple_builtin_containers
    Lib/test/test_dataclasses.py:700 TestCase.test_not_tuple
    Lib/test/test_dataclasses.py:3245 TestReplace.test_recursive_repr_two_attrs
    Lib/test/test_genericclass.py:161 TestClassGetitem.test_class_getitem
    Lib/test/test_gzip.py:764 TestCommandLine.test_compress_infile_outfile
    Lib/test/test_heapq.py:376 TestErrorHandling.test_get_only
    Lib/test/test_importlib/test_util.py:755 PEP3147Tests.test_source_from_cache_path_like_arg
    Lib/test/test_logging.py:328 BuiltinLevelsTest.test_regression_29220
    Lib/test/test_sys_setprofile.py:363 ProfileSimulatorTestCase.test_unbound_method_invalid_args
    Lib/test/test_sys_setprofile.py:354 ProfileSimulatorTestCase.test_unbound_method_no_args
    Lib/test/test_utf8_mode.py:198 UTF8ModeTests.test_io_encoding

    False positives have been removed from the output of the above command and so, all the above methods are effectively duplicates that must be fixed.

    @gpshead gpshead added easy 3.7 (EOL) end of life 3.8 (EOL) end of life tests Tests in the Lib/test dir labels Apr 14, 2019
    @gpshead
    Copy link
    Member

    gpshead commented Apr 14, 2019

    New changeset cd46655 by Gregory P. Smith in branch 'master':
    bpo-16079: fix duplicate test method name in test_gzip. (GH-12827)
    cd46655

    @gpshead gpshead added type-bug An unexpected behavior, bug, or error and removed type-feature A feature request or enhancement labels Apr 14, 2019
    @miss-islington
    Copy link
    Contributor

    New changeset 9f9e029 by Miss Islington (bot) in branch '3.7':
    bpo-16079: fix duplicate test method name in test_gzip. (GH-12827)
    9f9e029

    @vstinner
    Copy link
    Member

    This script should be part of Python and run in the pre-commit CI like Travis CI!

    @gpshead
    Copy link
    Member

    gpshead commented Apr 15, 2019

    Agreed, making duplicate method definitions a CI failure is the desired end state once our test suite is cleaned up and it doesn't have false positives.

    FYI - pylint also implements this check quite reliably as function-redefined via its pylint.checkers.base.BasicErrorChecker._check_redefinition() method.

    https://github.com/PyCQA/pylint/blob/2.2/pylint/checkers/base.py#L843

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Apr 16, 2019

    Thanks for the link Gregory. I will write a script based on ast and check its output against pylint and against the current script based on tokenize.

    The travis() function of Tools/scripts/patchcheck.py may be modified to import this script and run it only on files modified by the PR. This may allow the pre-commit duplicate check to be installed without waiting for the python test suite to be cleaned if the existing duplicates are temporarily added to the ignored_duplicates file (assuming an issue has been entered for each one of those existing duplicates with a note saying to remove the entry in ignored_duplicates when the issue is fixed). Indeed, issues bpo-19113 and bpo-19119 are still open after they have been entered 5 years ago.

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Apr 20, 2019

    List of issues entered for all the current duplicate method definitions:

    bpo-19113 bpo-19119 bpo-36678 bpo-36679
    bpo-36680 bpo-36681 bpo-36682 bpo-36683

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Apr 20, 2019

    PR 12886 adds a check on duplicate method definitions to the travis() function of patchcheck.py.

    False positives must be entered to Tools/scripts/duplicates_ignored.txt.

    The existing duplicates have been entered to this file (with the corresponding bpo issue number in comment) and no duplicates are found currently by duplicate_meth_defs.py when run with '--ignore duplicates_ignored.txt'. It is expected that these entries would be removed from this file when their issues are closed.

    @rhettinger
    Copy link
    Contributor

    Should the unittest module grow a feature to scan for duplicate methods? I imagine that duplicate methods are a common problem.

    Possibly, inheriting from unittest can be accompanied by a metaclass that has __prepare__ with special dictionary that detects and warns about duplicates.

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Apr 24, 2019

    List of issues entered for all the current duplicate method definitions in 2.7:

    bpo-19113 bpo-36711 bpo-36712 bpo-36713

    @xdegaye
    Copy link
    Mannequin Author

    xdegaye mannequin commented Apr 24, 2019

    Not sure the unittest module is the right place to implement these checks.
    The following issues deal with duplicates that are not unittest methods:

    bpo-19127 bpo-19128 bpo-36711

    @ned-deily
    Copy link
    Member

    The open proposed PR for this issue has been languishing unreviewed for several months now. Since the proposal is really a request to change our development process, I'm nosying Brett and Łukasz (3.9 RM). In any case, if we would decide to add this to our CI, I thine we should only start with the master branch so I'm closing the 3.7 and 2.7 backport PRs.

    @ned-deily ned-deily added 3.9 only security fixes and removed 3.7 (EOL) end of life 3.8 (EOL) end of life labels Sep 10, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @erlend-aasland
    Copy link
    Contributor

    Superseded by python/core-workflow#505

    @erlend-aasland erlend-aasland closed this as not planned Won't fix, can't repro, duplicate, stale Jun 27, 2023
    @merwok
    Copy link
    Member

    merwok commented Jun 27, 2023

    Isn’t that repo for tooling and meta issues?

    IMO this should stay in the CPython tracker for visibility.

    @erlend-aasland
    Copy link
    Contributor

    Isn’t that repo for tooling and meta issues?

    Yes, and CI issues. Most relevant comments in this thread are about running these checks in CI. So it belongs in the core-workflow repo.

    @merwok
    Copy link
    Member

    merwok commented Jun 28, 2023

    Oh, in my opinion the tool / config should be available for devs to run locally first, then also run from CI!
    That’s why I see this as a normal CPython ticket.

    @erlend-aasland
    Copy link
    Contributor

    Ok, I reopen the ticket.

    hugovk added a commit that referenced this issue Sep 12, 2023
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Sep 13, 2023
    (cherry picked from commit 3cb9a8e)
    
    Co-authored-by: Hugo van Kemenade <[email protected]>
    Co-authored-by: Alex Waygood <[email protected]>
    Co-authored-by: Adam Turner <[email protected]>
    hugovk added a commit to hugovk/cpython that referenced this issue Sep 13, 2023
    Co-authored-by: Alex Waygood <[email protected]>
    Co-authored-by: Adam Turner <[email protected]>
    (cherry picked from commit 3cb9a8e)
    vstinner pushed a commit to vstinner/cpython that referenced this issue Sep 13, 2023
    Yhg1s pushed a commit that referenced this issue Sep 14, 2023
    …09365)
    
    * gh-60283: Check for redefined test names in CI (GH-109161)
    (cherry picked from commit 3cb9a8e)
    
    Co-authored-by: Hugo van Kemenade <[email protected]>
    Co-authored-by: Alex Waygood <[email protected]>
    Co-authored-by: Adam Turner <[email protected]>
    
    * Update exclude list for 3.12
    
    * Explicitly exclude files which failed to lint/parse
    
    * Sort to avoid future merge conflicts
    
    ---------
    
    Co-authored-by: Hugo van Kemenade <[email protected]>
    Co-authored-by: Alex Waygood <[email protected]>
    Co-authored-by: Adam Turner <[email protected]>
    hugovk added a commit that referenced this issue Sep 15, 2023
    )
    
    Co-authored-by: Alex Waygood <[email protected]>
    Co-authored-by: Adam Turner <[email protected]>
    (cherry picked from commit 3cb9a8e)
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes easy stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    Successfully merging a pull request may close this issue.

    10 participants