Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List resource not updated between requests #2602

Merged
merged 1 commit into from
Feb 26, 2015
Merged

Conversation

dbrgn
Copy link
Contributor

@dbrgn dbrgn commented Feb 25, 2015

The Problem

This is the test I used:

import pytest
from django.contrib.auth import get_user_model
from rest_framework.test import APIClient
from reports import models

@pytest.mark.django_db
def test_get_200():
    user = get_user_model().objects.create(is_superuser=True, is_staff=True, is_active=True)
    client = APIClient()
    client.force_authenticate(user)

    models.Report.objects.create()
    response = client.get('/api/v1/reports')
    print(response.content)

    print('*' * 20)

    for _ in range(3):
        models.Report.objects.create()

    client2 = APIClient()
    client2.force_authenticate(user)
    response = client2.get('/api/v1/reports')
    print(response.content)

    assert 0

So two independent APIClients are created. They both request /reports. In between the two requests, more Report instances are created. But they don't show up in the second response.

Output:

danilo@adalbert:fr-frontend$ py.test test_foo.py
================================================= test session starts =================================================
platform linux2 -- Python 2.7.9 -- py-1.4.26 -- pytest-2.6.4
plugins: cache, django
collected 1 items 

test_foo.py F

====================================================== FAILURES =======================================================
____________________________________________________ test_get_200 _____________________________________________________
test_foo.py:29: in test_get_200
    assert 0
E   assert 0
------------------------------------------------ Captured stdout call -------------------------------------------------
[{"id":1,"uri":"http://testserver/api/v1/reports/1","name":"","created":"2015-02-25T13:59:51.990526Z","clientName":"","managerName":"","managerEmail":"","logo":null,"users":[],"design":null,"views":"{}"}]
********************
[{"id":1,"uri":"http://testserver/api/v1/reports/1","name":"","created":"2015-02-25T13:59:51.990526Z","clientName":"","managerName":"","managerEmail":"","logo":null,"users":[],"design":null,"views":"{}"}]
============================================== 1 failed in 5.86 seconds ===============================================

The Bug

The issue does not occur in DRF 3.0.0, but all versions 3.0.1-3.0.5 are affected.

With git bisect I could track down the commit that introduced this bug: 8d6b0b1 The corresponding pull request is #2241.

Resolution Attempts

First I simply tried to revert the commit onto 3.0.5... But that caused a conflict:

<<<<<<< HEAD
        # Dealing with nested relationships, data can be a Manager,
        # so, first get a queryset from the Manager if needed
        iterable = data.all() if isinstance(data, models.Manager) else data
=======
        iterable = data.all() if (hasattr(data, 'all')) else data
>>>>>>> parent of 8d6b0b1... Update serializers.py
        return [
            self.child.to_representation(item) for item in iterable
        ]

I'm not sure what the correct solution is, as I don't understand this code yet. I just know that the bug is gone if I replace the conflicting line with iterable = data.all() if (hasattr(data, 'all')) else data.

CC @tomchristie @IvanAlegre

dbrgn added a commit to dbrgn/django-rest-framework that referenced this pull request Feb 25, 2015
@dbrgn
Copy link
Contributor Author

dbrgn commented Feb 25, 2015

After a comment by @tomchristie in IRC, I added back the all() call.

@@ -562,7 +562,7 @@ def to_representation(self, data):
"""
# Dealing with nested relationships, data can be a Manager,
# so, first get a queryset from the Manager if needed
iterable = data.all() if isinstance(data, models.Manager) else data
iterable = data.all() if hasattr(data, 'all') else data
return [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is the proper fix.
I wonder if we shouldn't simply replace models.Manager by models.QuerySet

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think isinstance(data, (models.Manager, models.QuerySet)) may be what we want.
We do need it for manager (as it's not an iterable by itself) and we also need it for queryset (as they get cached).

@xordoquy
Copy link
Collaborator

Mind pasting the related view part ? This is the most important part that would explain why this possibly fails.

@tomchristie
Copy link
Member

Nice work on debugging this.
It would be really good to have the test case in here too, of course.

@dbrgn
Copy link
Contributor Author

dbrgn commented Feb 25, 2015

Sure. This is the view:

class ReportViewSet(viewsets.ModelViewSet):
    queryset = models.Report.objects.all()
    serializer_class = serializers.ReportSerializer
    permission_classes = [permissions.AllowAny]

    def _get_filtered_queryset(self):
        """
        A queryset that filters by access permissions.
        """
        if not self.request.user.is_superuser:
            return self.queryset.filter(users=self.request.user)
        return self.queryset

    def list(self, request):
        """
        Use the filtered queryset for this endpoint.
        """
        with monkeypatch_attribute(self, 'get_queryset', self._get_filtered_queryset):
            response = super(ReportViewSet, self).list(request)
        return response

And the serializer:

class ReportSerializer(serializers.HyperlinkedModelSerializer):
    name = serializers.ModelField(models.Report()._meta.get_field('name'), required=True)
    users = serializers.HyperlinkedRelatedField(
            many=True, read_only=True,
            source='reportuser_set', view_name='reportuser-detail')

    class Meta:
        model = models.Report
        fields = (
            'id', 'uri', 'name', 'created',
            'client_name', 'manager_name', 'manager_email',
            'users', 'design',
            'views',
        )
        read_only_fields = ('id', 'uri', 'users', 'created', 'design')

@xordoquy
Copy link
Collaborator

        return self.queryset.filter(users=self.request.user)

At this point you have a Queryset instead of a Manager.

@dbrgn
Copy link
Contributor Author

dbrgn commented Feb 25, 2015

@xordoquy there we go, thanks. So would this work?

iterable = data.all() if isinstance(data, (models.Manager, models.QuerySet)) else data

@dbrgn
Copy link
Contributor Author

dbrgn commented Feb 25, 2015

Doesn't really make sense though, does it? The goal of manager.all() is to get a QuerySet.

@tomchristie
Copy link
Member

Doesn't really make sense though, does it? The goal of manager.all() is to get a QuerySet.

I'm not sure what you mean - that bit of code gets triggered for relationships where the attribute is a manager, eg a field name groups on a User instance. You'd normally do user.groups.all() to get the iterable of each which is what this bit of code is for.

@tomchristie
Copy link
Member

So would this work?

Looks right to me, yes.

dbrgn added a commit to dbrgn/django-rest-framework that referenced this pull request Feb 25, 2015
@dbrgn
Copy link
Contributor Author

dbrgn commented Feb 25, 2015

OK, commit was amended.

Unfortunately I don't know where to put the test and how to implement it. And I really don't have time now and in the coming weeks. So I hope someone else could write a quick test for this :)

@xordoquy
Copy link
Collaborator

QuerySet should be imported from django.db.models.query

@dbrgn
Copy link
Contributor Author

dbrgn commented Feb 25, 2015

Done.

@tomchristie tomchristie added this to the 3.1.0 Release milestone Feb 26, 2015
tomchristie added a commit that referenced this pull request Feb 26, 2015
List resource not updated between requests
@tomchristie tomchristie merged commit d219fc0 into encode:master Feb 26, 2015
@dbrgn dbrgn deleted the fix-1602 branch February 26, 2015 15:55
@tomchristie tomchristie mentioned this pull request Jun 25, 2015
@tomchristie
Copy link
Member

The bug here is in fact in the user code (because it's easy to get tripped up by Django's queryset caching when combined with setting a queryset class attribute)

Compare this...

def _get_filtered_queryset(self):
    """
    A queryset that filters by access permissions.
    """
    if not self.request.user.is_superuser:
        return self.queryset.filter(users=self.request.user)
    return self.queryset

With REST framework's implementation...

def get_queryset(self):
    queryset = self.queryset
    if isinstance(queryset, QuerySet):
        # Ensure queryset is re-evaluated on each request.
        queryset = queryset.all()
    return queryset

Note that we always re-evaluate the queryset using .all() with every call to get_queryset.
If the user code was instead this...

def _get_filtered_queryset(self):
    queryset = self.get_queryset()
    if not self.request.user.is_superuser:
        return queryset.filter(users=self.request.user)
    return queryset

Then the problem would not manifest itself.

Very easy to get tripped up by, and not obvious how we can protect against it.

@dbrgn
Copy link
Contributor Author

dbrgn commented Jul 10, 2015

@tomchristie oh, thanks for that note.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants