Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to use django_db mark at session fixture? #514

Open
ludbek opened this issue Sep 14, 2017 · 26 comments · May be fixed by #972
Open

How to use django_db mark at session fixture? #514

ludbek opened this issue Sep 14, 2017 · 26 comments · May be fixed by #972

Comments

@ludbek
Copy link

ludbek commented Sep 14, 2017

I have a situation where I need to load huge db fixture which is created by a function. The fixture is needed in all api tests. So I made a session fixture at conftest.py which would do it. But the problem is pytest throws following exception even though I have marked django_db:

E Failed: Database access not allowed, use the "django_db" mark to enable it.
Below is my code snippet.

from permission.helpers import update_permissions

pytestmark = [
        pytest.mark.django_db(transaction = True),]

@pytest.fixture(scope="session", autouse = True)
def permission(request):
        load_time_consuming_db_fixture()
@cybergrind
Copy link

@ludbek we've also missed such feature and created plugin for this feature:
https://github.com/tipsi/pytest-tipsi-django (usage: https://github.com/tipsi/pytest-tipsi-django/blob/master/test_django_plugin/app/tests/test_transactions.py)
In conjunction with:
https://github.com/tipsi/pytest-tipsi-testing
It gives you the ability to nest transactions and correct execution/shutdown order.

@ludbek
Copy link
Author

ludbek commented Nov 6, 2017

@cybergrind Thanks for replying. I will definitely check it out and let you know how it went.

@paultiplady
Copy link

This has been the biggest pain point for me coming from Django's UnitTest class-based tests, to pytest-django -- in Django we use setUpTestData to run expensive DB operations once (equivalent to session-scoped pytest fixtures), and then there's a cunning trick to run obj.refresh_from_db() in the setUp to refresh the class references.

Even if I'm just creating one DB model instance, and reloading it every TC, this is almost always faster than creating in each test case.

It would be great if we could get the session-scoped fixtures from pytest-tipsi-django merged upstream, if that's a possibility; it took a bit of digging to find this issue and solution.

@cybergrind
Copy link

hey @paultiplady

I'm not sure that approach from pytest-tipsi-djanjo fits the usual testing model for pytest. The most noticeable difference is the finishing on fixtures: currently pytest doesn't explicitly finish unnecessary fixtures with a wider scope. So you need explicitly finish transactions in particular order and in general, this may cause very different effects (currently pytest may keep fixtures active even if active test and its fixtures don't require it at all).

We had to change tests in our project to this approach because we need to test some big scenarios sometimes and we've replaced existing manual transaction management in huge tests with slightly better fixtures, but it still requires attention on order tests.

Right now I can see only one solution for that: putting some kind of FAQ into documentation.

@paultiplady
Copy link

paultiplady commented Jan 5, 2018

Thanks for the additional detail @cybergrind. I've dug into this a little bit more, but have run out of time for today -- here's where I've got to, I'd appreciate a sanity-check on whether this approach is useful or not, since I'm not that familiar with the pytest internals.

Also I don't understand what you mean by "pytest doesn't explicitly finish unnecessary fixtures with a wider scope", could you expand on that a bit more please? Is that referring to finalizers? That might affects what I've written below.

The pytest-django plugin uses the django_db mark, which gets handled in _django_db_marker in plugin.py (https://github.com/pytest-dev/pytest-django/blob/master/pytest_django/plugin.py#L375), calling in to the function-scoped db fixture (https://github.com/pytest-dev/pytest-django/blob/master/pytest_django/fixtures.py#L142). This fixture instantiates a Django TestCase, and calls its _pre_setup function (and enqueues the _post_teardown).

I can see a couple of options:

I'm wondering if we could extend _django_db_marker to optionally do class- or session-
scoped setup as well, which would do essentially the same thing as db, but calling the equivalent of a cls.setUpTestData or a function passed in the mark kwargs instead.

For class-scoped, and I'm assuming for session-scoped as well, the expected behaviour would be to roll back the DB afterwards, so we basically need scoped fixtures that trigger in the right order, and that each set up their own atomic transaction. I believe that means this needs to be a modification to the db fixture, rather than a separate fixture that runs beside it.

That way we'd correctly trigger any class-/session-level setup that is specified, and that setup would get called once per class/session. I believe a modification to the mark itself is required because if you just set up a session-scoped function that manually triggers django_db_blocker.unblock(), that seems to happen after the django_db mark has set up the first transaction.

This might look something like this (in plugin.py _django_db_marker()):

    if marker:
        if marker.session_db:
            getfixturevalue(request, 'db_session')
            
        if marker.class_db:
            getfixturevalue(request, 'db_class')
            
        validate_django_db(marker)
        if marker.transaction:
            getfixturevalue(request, 'transactional_db')
        else:
            getfixturevalue(request, 'db')

Is this crazy talk, or is this thread worth exploring further?

@cybergrind
Copy link

Regarding finalization: https://github.com/tipsi/pytest-tipsi-testing/blob/master/tests/test_finalization.py

This test doesn't work without explicit finalization, same as non-function level database fixtures. And this is about pytest implementation, so there is nothing to do in pytest-django to fix it.

@MRigal
Copy link

MRigal commented Apr 11, 2019

Duplicate of #388 and #33

@blueyed
Copy link
Contributor

blueyed commented Apr 11, 2019

Thanks, closing.

@blueyed blueyed closed this as completed Apr 11, 2019
@paultiplady
Copy link

#33 is closed, and #388 contains no meaningful discussion (unlike this one). Seems odd to close this one @blueyed , if anything I'd suggest closing #388 and make this one the canonical ticket for this problem.

@paultiplady
Copy link

👍 thanks!

@mkokotovich
Copy link

I would also very much appreciate this functionality.

@blueyed
Copy link
Contributor

blueyed commented May 31, 2019

@mkokotovich
How much? Enough to make it happen yourself? ;)

Anyway, the main / fundamental problem here (from the original comment) is already that the DB is reset during tests, so there is no trivial way to have a session scoped fixture like that.

What might work though is something along:

@pytest.fixture(scope="session")
def django_db_setup(django_db_setup, django_db_blocker):
    with django_db_blocker.unblock():
        with transaction.atomic():  # XXX: could/should use `TestCase_enter_atomics` for multiple dbs
            load_time_consuming_db_fixture()
            yield

The idea being to wrap everything into an additional atomic block. This is untested however, and you might need to use TransactionTestCase actually for this.

@blueyed
Copy link
Contributor

blueyed commented May 31, 2019

@paultiplady
Your comment sounds good - i.e. it is worth further exploration AFAICS (see also my previous comment).

@cybergrind
Copy link

@blueyed we're using such approach for more than a year (wrap everything into an additional atomic block) it works pretty well.
But you cannot just drop-in something like that because pytest doesn't have the determenistic where session-level (and other higher than test) will be closed so it will require tracking of dependencies of db transactions regardless they require each other or not directly.
So you need explicitly track transactions stack and close nested transactions before next test. In can be done in such a way: https://github.com/tipsi/pytest-tipsi-django/blob/master/pytest_tipsi_django/django_fixtures.py#L46

@sinjihn-4th-valley
Copy link

sorry I want to clarify since I believe I'm running into the same issue:

If I want to create a fixture which relies on creating a new db object, I thought I could do

@pytest.mark.django_db(transaction=True)
@pytest.fixture(scope="session")
def object():
    object = Object.create_object(params)
    yield object
    // or alternatively
    object = mixer.blend('myObject')
    yield object

However, I receive the following error on test case run: Database access not allowed, use the "django_db" mark, or the "db" or "transactional_db" fixtures to enable it.

@paultiplady
Copy link

Just checking back in here -- still an issue, and I haven't had time to dig in further. However there has been a bit of movement upstream - there's now a wrap_testdata decorator for setUpTestData that handles auto-reloading the model instances. This was added in Django 3.2, and there's a library to backport it:

https://github.com/charettes/django-testdata

Any solution to this problem should probably crib the wrap_testdata implementation to memoize the model instances. However I think the hard bit is hooking the fixtures in to the right bit of the pytest-django DB machinery, and wrap_testdata probably doesn't help with that.

@paultiplady
Copy link

I took a quick look at this and actually I was able to find a partial solve to the issue above. @blueyed please take a look and let me know if you think the approach in #972 is worth progressing. (I currently haven't spent time polishing it, but am happy to do so if it looks acceptable).

I did try to solve the more general problem of wiring up session fixtures, but I think it's substantially harder. I'll leave some notes in the PR explaining my findings after playing around with possible fixture APIs to solve the generic case.

@bluetech
Copy link
Member

Hi @paultiplady,

Replying to your PR #972 here, to keep the discussions in one place.

The way pytest-django works is that it (conceptually) wraps each pytest-django test in its own django TestCase case and executes that. This nullifies the purpose of setUpTestData, which is to share setup overhead of DB objects across tests -- pytest-django executes it for each test.

I think we should pursue the more generalized feature, and I think it can be a killer feature of pytest-django: the setUpTestData functionality with proper rollbacks, but with arbitrary scopes and using the much nicer pytest fixture infrastructure.

A previous attempt at this functionality is #258. Just yesterday I began playing with it again. First thing I needed to do is completely disable Django's setUpTestData stuff, since it conflicts with anything we might do (namely, it closes all DB connections which causes troubles when wrapped in a pytest-django transaction). Commit here: bluetech@80cacd9 I might just bring commit it to master to make any future work easier.

Next, there's the question of the API. As others said, the most intuitive API would be

@pytest.fixture(scope="module")
def items():
    item1 = Item.objects.create()
    item2 = Item.objects.create()
    return item1

where everything created is available in the scope and is automatically rolled back at the end of the scope. I think however that it is not reasonable to start wrapping all fixtures in transactions, so I think the fixture needs to opt-in to it:

@pytest.fixture(scope="module")
def items(django_test_data):
    item1 = Item.objects.create()
    item2 = Item.objects.create()
    return item1

Turns it is not hard to implement - here it is: bluetech@fcf5ef4 I tested it with just some simple scenarios, and it works as you would expect. It still needs various usage checks like it can't be used with transactional tests etc. but that can be done I think.

One problem I hit right off the bat is multi-db (for which pytest-django recently added experimental support). Currently, pytest-django implements multi-db (=> which databases to use) at the test level, i.e. every test can specify which databases it wants. But the django_test_data transactions need to know the databases as well. For Django's setUpTestData this all works out because the transaction is at the class level, and databases is also defined on the class level. This is something I need to think about.

Another thing is the new Django 3.2 TestData stuff, which does some voodoo magic so that the underlying DB data persists across the scope, but each test gets fresh ORM objects, which don't leak across tests. I think this is not an essential feature, we can think about how/if it fits with pytest-django later.

Anyway, I am very interested in getting something for this in pytest-django, it's the biggest pain point I have.

@paultiplady
Copy link

Sounds good @bluetech -- I'm happy to discard my WIP if someone with more know-how of the internals is pushing this thread forwards.

A couple thoughts while this is fresh in my mind:

The way pytest-django works is that it (conceptually) wraps each pytest-django test in its own django TestCase case and executes that. This nullifies the purpose of setUpTestData, which is to share setup overhead of DB objects across tests -- pytest-django executes it for each test.

I had experimented with adding a class-scoped fixture that gets requested by _django_db_helper, and is used to stash setUpTestData state between test cases - perhaps this would resolve the per-test instantiation problem. Essentially adding back a cls that is class-scoped for the test case classmethods to use.

The advantage of the approach in my PR is that it does not incur any extra DB queries over the Django TestCase.setUpTestData approach, whereas I believe the more generic solution incurs an extra rollback query per test. It's probably not the end of the world to add a single query per test of overhead, but just want to note that hooking in to the existing lifecycle classmethods does let us avoid doing extra DB work since setUpTestData gets called inside the existing per-test transaction. I think that both solutions could actually coexist if the extra query was found to be problematic for some cases (depending on how much of the setUpClass you need to remove...)

I think we should pursue the more generalized feature, and I think it can be a killer feature of pytest-django: the setUpTestData functionality with proper rollbacks, but with arbitrary scopes and using the much nicer pytest fixture infrastructure.

Sounds good to me. The more flexible session-scoped fixture (even if it does incur an extra query per test) would be good enough for me to remove django.testcase.TestCase from my tests, I think.

One concern - with the approach here:

@pytest.fixture(scope="module")
def items(django_test_data):
    item1 = Item.objects.create()
    item2 = Item.objects.create()
    return item1

How does pytest handle the references to item1? I believe you're going to initialize it once (when loading the module), and pass that same object ref into each test function. So if test1 modifies item1, then how do we clear those in-memory changes for test2? (The DB state will get rolled back, but the in-memory object instance also needs to be reset).

This is the problem that TestData solves by memoizing the model instance and replacing cls.item1 with a descriptor that returns the memoized copy instead of the actual underlying model instance. Is there a pytest-ish way to intercept what is being yielded from the fixture and wrap it? I think that maybe the old PR's idea of creating a fixture-decorator could be a possible solution here.

Basically i think you might need to find a nicer sugar for logic like:

@pytest.fixture(scope="module")
def items(django_test_data):
    item1 = Item.objects.create()
    item2 = Item.objects.create()
    return TestData(item1)

Anyway, this is just based on a code read, I could easily be wrong about this. Something like this test case should catch it though if it is a problem: https://github.com/pytest-dev/pytest-django/pull/972/files#diff-82fbc96aa3f082d4667c09a2a35aec050c73120de9174e1f37bef26ef9cd3115R351-R363

@bluetech
Copy link
Member

bluetech commented Dec 5, 2021

@paultiplady your proposal would definitely be more "bulletproof" because it follows what Django does which is certain to go smoother than trying to extend it.

I think the more generic approach would be more natural in pytest, and less constraining than having to work on a class level only. However, it's possible it won't pan out, in which case, we should go with your proposal, so if you'd like, I urge you to try it and see if you can make it work well.

whereas I believe the more generic solution incurs an extra rollback query per test

The more generic solution would incur an extra rollback per scope it's applied to, e.g. the items example above would add an additional rollback per module in which it is effective. On the other hand, it would remove the current per-class overhead (it is already removed in the latest release, actually).

How does pytest handle the references to item1?

In my current POC it will be shared in the scope. I believe a TestData-equivalent feature could be added with some further pytest magic, though I haven't thought of it yet.

BTW, I've ran into another problem with my approach, posted a question about it to the Django internals forum: https://forum.djangoproject.com/t/why-does-django-close-db-connections-between-test-classes/10782

@MichaelSnowden
Copy link

This worked for me: https://pytest-django.readthedocs.io/en/latest/database.html#populate-the-database-with-initial-test-data

@henribru
Copy link

henribru commented May 1, 2022

The solution given at https://pytest-django.readthedocs.io/en/latest/database.html#populate-the-database-with-initial-test-data seems to interact weirdly with multi-database support. I end up with this warning if any of my tests use multiple databases:

    @contextmanager
    def _nodb_cursor(self):
        try:
            with super()._nodb_cursor() as cursor:
                yield cursor
        except (Database.DatabaseError, WrappedDatabaseError):
>           warnings.warn(
                "Normally Django will use a connection to the 'postgres' database "
                "to avoid running initialization queries against the production "
                "database when it's not needed (for example, when running tests). "
                "Django was unable to create a connection to the 'postgres' database "
                "and will use the first PostgreSQL database instead.",
                RuntimeWarning
            )
E           RuntimeWarning: Normally Django will use a connection to the 'postgres' database to avoid running initialization queries against the production database when it's not needed (for example, when running tests). Django was unable to create a connection to the 'postgres' database and will use the first PostgreSQL database instead.

../../../.cache/pypoetry/virtualenvs/.../lib/python3.10/site-packages/django/db/backends/postgresql/base.py:304: RuntimeWarning

During handling of the above exception, another exception occurred:

    def teardown_database() -> None:
        with django_db_blocker.unblock():
            try:
                teardown_databases(db_cfg, verbosity=request.config.option.verbose)
            except Exception as exc:
>               request.node.warn(
                    pytest.PytestWarning(
                        "Error when trying to teardown test databases: %r" % exc
                    )
                )
E               pytest.PytestWarning: Error when trying to teardown test databases: RuntimeWarning("Normally Django will use a connection to the 'postgres' database to avoid running initialization queries against the production database when it's not needed (for example, when running tests). Django was unable to create a connection to the 'postgres' database and will use the first PostgreSQL database instead.")

../../../.cache/pypoetry/virtualenvs/.../lib/python3.10/site-packages/pytest_django/fixtures.py:133: PytestWarning

Though it's worth noting that my "two" databases are really just two slightly different ways of connecting to a single database. All the settings, including the name, are the same, one just has a different isolation level. Maybe this use case isn't 100% supported by the current multi-database support? It does seem to work fine as long as I don't override django_db_setup to create initial data though.

@danialmmllc
Copy link

danialmmllc commented Oct 19, 2022

sorry I want to clarify since I believe I'm running into the same issue:

If I want to create a fixture which relies on creating a new db object, I thought I could do

@pytest.mark.django_db(transaction=True)
@pytest.fixture(scope="session")
def object():
    object = Object.create_object(params)
    yield object
    // or alternatively
    object = mixer.blend('myObject')
    yield object

However, I receive the following error on test case run: Database access not allowed, use the "django_db" mark, or the "db" or "transactional_db" fixtures to enable it.

@sinjihn-4th-valley Adding db fixture should solve this issue:

@pytest.fixture(scope="session")
def object(db):
    ....

@sterliakov
Copy link

I'm currently solving this with pytest-subtests, grouping similar tests into one function with several subtests.test groups. This works well, but leads to long and hardly readable test methods. The suggestion from "Populate db with initial data" works too in general, but is inapplicable in my case (most time is spent in postprocessing to generate new object from relevant S3 data, and this generation is an important part of a test suite).

@HansBambel
Copy link

HansBambel commented Jun 26, 2023

This worked for me: https://pytest-django.readthedocs.io/en/latest/database.html#populate-the-database-with-initial-test-data

To pile onto this: When creating, for example, some standard users and then writing a test where a user is deleted, the users are still available in the next tests as the database is rolled back after the test.

Also note that every test has the standard users irrespective of the fixture being called or not. (See that no test calls create_users).

@pytest.fixture(scope="session", autouse=True)
def create_users(django_db_setup, django_db_blocker, django_user_model):
    """Create some users. They are available in all tests."""
    with django_db_blocker.unblock():
        for i in range(10):
            django_user_model.objects.create_user(username=f"user{i}", password="password")

def test_count_users(django_user_model):
    assert django_user_model.objects.count() == 10

def test_delete_user(django_user_model):
    django_user_model.objects.first().delete()
    assert django_user_model.objects.count() == 9

def test_count_users_after_delete(django_user_model):
    assert django_user_model.objects.count() == 10

This made it easier for me to understand. Basically, every test has the full pre-populated database.
Note that using autouse=True is not needed, but I find it easier for other members to see that every test has this fixture.

@Stephane-Ag
Copy link

Stephane-Ag commented Apr 11, 2024

I can confirm that using django_db_blocker.unblock() is a good working solution. As mentioned here and in the answer right above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.