Case sets #262

Open
jlorieau opened this issue Feb 23, 2022 · 7 comments
Labels
question Further information is requested

Comments

@jlorieau

I'm relatively new to pytest cases, so this feature may already be implemented in some way that I have not found.

I have a series of data test cases in a root file for multiple datasets. These include reference datasets and datasets that have been modified in some specific way by an external program:

cases.py:

def data_reference(): ...
def data_procedure1a(): ...
def data_procedure1b(): ...
def data_procedure2(): ...

For individual cases, I can import these pretty easily:

test.py

@parametrize_with_cases('dataset', prefix='data_', cases='cases')

However, I have many tests that require modifying the reference by a procedure implemented in my program, and comparing to the data from a different data test case.

Option 1
So far, I add the following to cases.py:

cases.py

def case_procedure1a():
    return data_reference(), data_procedure1a()

def case_procedure1b():
    return data_reference(), data_procedure1b()

Then I call this test case for a test that requires the matching of 2 datasets:

test.py

@parametrize_with_cases('reference,modified', prefix='case_', cases='cases', glob='*procedure1*')

Developing test cases for all possible pairings gets pretty unwieldy quickly.

Option 2
An alternative would be to import my test cases and parameterize them:

test.py

import pytest
from .cases import data_reference, data_procedure1a, data_procedure1b

@pytest.mark.parametrize('reference,modified',
    ((data_reference(), data_procedure1a()),
     (data_reference(), data_procedure1b())))

However, this approach requires importing the test case functions, and it directly couples my test code to the case code.

I wonder whether there is a procedure to test data test case pairs or triplets in which I could reference these by id, tag or something else. I suspect that this may require some additional coding, which is why this might be a feature request.

@smarie
Owner

smarie commented Mar 4, 2022

Hi @jlorieau , thanks for the feedback !

The usual way to solve this kind of problem is to work "bottom-up". So:

  • start with the test function and write the names of the arguments that are needed. For example test_foo(orig_data, mod_data)

  • then find out if the two arguments are independent or not.

    • If they are, each will be handled by a specific @parametrize[_with_cases], or by a specific parametrized fixture

    • If they are not, does one depend on the other (i.e. is it built from the other), or are they actual "pairs"?

      • If they are actual pairs: use a single tuple parameter, or a tuple-parametrized fixture.
      • If one depends on the other, for example mod_data depends on orig_data: create fixture dependencies, as in the code that follows.
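from pytest_cases import fixture, parametrize, parametrize_with_cases

@fixture
@parametrize_with_cases("d", prefix="data_", cases="cases")
def orig_data(d):
    yield d

@fixture
def mod_data(orig_data):
    # Depends on the orig_data fixture; proc is the procedure under test,
    # assumed to be defined elsewhere in your program
    mod_d = proc(orig_data)
    yield mod_d


def test_foo(orig_data, mod_data):
    ...  # compare orig_data and mod_data here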

You can now even parametrize the procedures used in the mod_data fixture:

@fixture
@parametrize_with_cases("proc", prefix="proc_", cases="cases")
def mod_data(orig_data, proc):
    yield proc(orig_data)

Does that help?

@jlorieau
Author

jlorieau commented Mar 7, 2022

Thank you @smarie for the response. That suggestion is helpful, but I'm concerned that it may be unwieldy for a large set of case comparisons. The alternative approach I took, which may be helpful to other users but may not align with the scope of pytest-cases, is as follows:

from itertools import product, chain

import pytest
from pytest_cases import get_all_cases

def parametrize_casesets(*globs, cases=None, prefix='data_') -> tuple:
    """Convert a series of case globs into a set of cases for parametrization.
    """
    # Convert globs to functions
    funcs = []
    dummy = lambda: None
    for glob in globs:
        glob_funcs = map(lambda glob: get_all_cases(dummy, cases=cases,
                                                    prefix=prefix, glob=glob),
                         glob if not isinstance(glob, str) else (glob,))
        glob_funcs = chain.from_iterable(glob_funcs)
        funcs.append(glob_funcs)

    # Create a generator for the product of these
    return tuple(tuple(f() for f in prod) if len(prod) > 1 else prod[0]()
                 for prod in product(*funcs))

Then these are used with parametrize as follows:

@pytest.mark.parametrize('reference, modified',
                         parametrize_casesets('*reference_1d',
                                              '*procA_1d',
                                              cases='...cases',
                                              prefix='data_') +
                         parametrize_casesets('*reference_2d',
                                              '*procA_2d',
                                              cases='...cases',
                                              prefix='data_') +
                         parametrize_casesets('*reference_3d',
                                              '*procA_3d',
                                              cases='...cases',
                                              prefix='data_'))
def test_procA(reference, modified):
    """Test processing with procA"""
    # Modify reference with procA
    new_modified = procA(reference)

    # Test that new_modified and modified match

I might have a large number of modified datasets that need to be compared to a few references:

reference_1d vs set1_procA_1d
                set2_procA_1d
                set3_procA_1d
reference_2d vs set1_procA_2d
                set2_procA_2d
reference_3d vs set1_procA_3d
                set2_procA_3d
                set3_procA_3d

For this example, I have 8 comparisons for procA, but I could add more data cases without modifying my test code.
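To make the count concrete, here is a minimal plain-Python sketch of the expansion described above: match case ids against a glob pair, take the product within each dimension, then concatenate the results. The list `case_ids` and the helper `pair_ids` are hypothetical stand-ins for the `data_*` functions in cases.py, and `fnmatch` is only an approximation of how pytest-cases interprets its `glob` argument.

```python
from fnmatch import fnmatch
from itertools import product

# Hypothetical case ids, standing in for the data_* case functions
case_ids = [
    "reference_1d", "set1_procA_1d", "set2_procA_1d", "set3_procA_1d",
    "reference_2d", "set1_procA_2d", "set2_procA_2d",
    "reference_3d", "set1_procA_3d", "set2_procA_3d", "set3_procA_3d",
]


def pair_ids(ref_glob, mod_glob, ids=case_ids):
    """Return every (reference, modified) id pair matching the two globs."""
    refs = [i for i in ids if fnmatch(i, ref_glob)]
    mods = [i for i in ids if fnmatch(i, mod_glob)]
    return list(product(refs, mods))


# Union of the per-dimension cross-products
pairs = []
for dim in ("1d", "2d", "3d"):
    pairs += pair_ids(f"*reference_{dim}", f"*procA_{dim}")

print(len(pairs))
```

Adding another id such as "set4_procA_1d" to `case_ids` would add one more pair without touching the loop, which is the property the test code relies on.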

I don't think my current implementation of parametrize_casesets is great, particularly with how the product is implemented and the return value, but it's closer to the implementation I'm looking for.

Feel free to close this as 'wontfix' if you think this functionality might be outside the scope of pytest-cases. Thanks again.

@smarie
Owner

smarie commented Mar 7, 2022

Hi @jlorieau , thanks for the feedback ! I think I now understand better what you were looking for.

This should be the equivalent for your custom code:

(EDITED, see other posts below)

from pytest_cases import fixture, parametrize, parametrize_with_cases

@fixture
@parametrize_with_cases("d", prefix="data_", cases="cases", glob="*reference_1d")
def data_1d(d):
    yield d

@fixture
@parametrize_with_cases("d", prefix="data_", cases="cases", glob="*procA_1d")
def mod_1d(d):
    yield d


@fixture
@parametrize_with_cases("d", prefix="data_", cases="cases", glob="*reference_2d")
def data_2d(d):
    yield d

@fixture
@parametrize_with_cases("d", prefix="data_", cases="cases", glob="*procA_2d")
def mod_2d(d):
    yield d


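# Each tuple pairs a reference fixture with its modified counterpart
fixture_pairs = [(data_1d, mod_1d), (data_2d, mod_2d)]

@parametrize("reference,modified", fixture_pairs)
def test_procA(reference, modified):
    assert reference == modified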

This can be made generic for n dimensions:

from pytest_cases import fixture, parametrize, parametrize_with_cases

fixture_pairs = []

for dim in range(1, 3):
    data_fix_name = f"data_{dim}d"
    mod_fix_name = f"mod_{dim}d"

    @fixture(name=data_fix_name)
    @parametrize_with_cases("d", prefix="data_", cases="cases", glob=f"*reference_{dim}d")
    def data_nd(d):
        yield d

    @fixture(name=mod_fix_name)
    @parametrize_with_cases("d", prefix="data_", cases="cases", glob=f"*procA_{dim}d")
    def mod_nd(d):
        yield d

    # Register each fixture under its own module-level name. Otherwise the
    # symbols data_nd and mod_nd are overridden by the next loop iteration.
    # See https://github.com/pytest-dev/pytest/issues/2424
    globals()[data_fix_name] = data_nd
    globals()[mod_fix_name] = mod_nd

    fixture_pairs.append((data_nd, mod_nd))


@parametrize("reference,modified", fixture_pairs)
def test_procA(reference, modified):
    assert reference == modified
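
The globals() assignments in the loop are easier to follow with a small model of the problem. The sketch below is plain Python with no pytest involved. It assumes, as a simplification, that discovery works by scanning a module's attributes for tagged functions; `collect`, `make_fixtures` and the `fixture_name` attribute are hypothetical names invented for this illustration, not pytest internals.

```python
def collect(namespace):
    """Mimic attribute-scanning discovery: names of tagged functions found."""
    return sorted({obj.fixture_name for obj in namespace.values()
                   if callable(obj) and hasattr(obj, "fixture_name")})


def make_fixtures(register_each):
    namespace = {}  # stands in for the module's globals()
    for dim in (1, 2):
        def data_nd():
            return None
        data_nd.fixture_name = f"data_{dim}d"

        # A `def` inside a loop rebinds the same symbol on every iteration
        namespace["data_nd"] = data_nd
        if register_each:
            # The globals()[name] = func trick: one symbol per function
            namespace[data_nd.fixture_name] = data_nd
    return collect(namespace)


without_trick = make_fixtures(register_each=False)
with_trick = make_fixtures(register_each=True)
print(without_trick, with_trick)
```

In this model, only the function from the last iteration stays reachable unless each one is also stored under its own name.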

Let me know if that works for you !

Note: this is actually a feature of @parametrize: its argvalues can contain fixture references (implicitly, or explicitly using fixture_ref), and if you use several fixture references in a tuple, the cross-product of their parameters is generated.

@jlorieau
Author

jlorieau commented Mar 8, 2022

Great thank you!

@smarie
Owner

smarie commented Mar 9, 2022

You're welcome @jlorieau. Can you confirm that the first option works? And/or maybe the second?

@smarie
Owner

smarie commented Mar 21, 2022

I tried today and can confirm that the first option works, but not the second. I updated the examples; there were a few typos.

@smarie
Owner

smarie commented Mar 21, 2022

I made the second one work too, just in case you're still interested in this union of cross-products of parametrized fixtures. I edited the example above.
Let me know @jlorieau !

@smarie smarie added the question Further information is requested label Mar 21, 2022