ListSerializer.to_representation does not respect prefetches #2704

aleontiev · 2015-03-17T05:09:37Z

Update: Related to #2727 - @tomchristie

I noticed some undesirable behavior around ListSerializer's implementation of to_representation in the context of relations prefetched with Django 1.7's Prefetch object.

I am talking about this block of code:

        # Dealing with nested relationships, data can be a Manager,
        # so, first get a queryset from the Manager if needed
        iterable = data.all() if isinstance(data, (models.Manager, query.QuerySet)) else data
        return [
            self.child.to_representation(item) for item in iterable
        ]

To demonstrate the problem, consider a request made to fetch all auth.Users and their related auth.Groups matching a certain filter. This can be done with Prefetch like so:

>>> users_with_test_groups = User.objects.all().prefetch_related(Prefetch('groups', queryset=Group.objects.filter(name__icontains='test')))

Let's look at the first user and their prefetched groups:

>>> user = users_with_test_groups[0]
>>> user.groups.all()
[<Group: test>]

... so far, so good. Now let's add DRF into the mix; I have a UserSerializer and a related GroupSerializer:

class GroupSerializer(ModelSerializer):
    ...

class UserSerializer(ModelSerializer):
    groups = GroupSerializer(many=True)
    ...

If I call UserSerializer(users_with_test_groups, many=True).data, I expect to see my first user returned with only those related groups that contain test. But I actually see ALL groups related to that user!

This is because calling .all() on a filtered managed queryset will re-evaluate that queryset as if it has no filters:

>>> user.groups.all()
[<Group: test>]
>>> user.groups.all().all()
[<Group: student>, <Group: test>]

This took a while to figure out and seems like bizarre behavior. Why is calling .all() on a queryset necessary when you can just iterate over it?

As a workaround, I have started using a custom ListSerializer that never calls .all(). Interested to hear if anybody else has run into this or tried using DRF Serializers together with the Prefetch object.

xordoquy · 2015-03-17T08:55:09Z

Closing this as duplicate of #2442.

aleontiev · 2015-03-17T12:36:14Z

@xordoquy this is actually a distinct issue that only comes up when you use the Prefetch object along with a queryset to filter the relation.

My use case is providing a flexible GET API that provides filtering on secondary related resources.

xordoquy · 2015-03-17T12:43:53Z

Will take time to investigate this further then

xordoquy · 2015-03-17T12:47:08Z

> This took a while to figure out and seems like bizarre behavior. Why is calling .all() on a queryset necessary when you can just iterate over it?

This is to force the QS reevaluation when someone writes the view class such as:

class View(...):
    queryset = User.objects.all()

This means the queryset is created once, at class declaration, so without forcing the re-evaluation its cached results would be reused across requests instead of being fetched on a per-request basis.
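To make the concern concrete, here is a toy model in plain Python. It does not use Django's real classes: `ToyQuerySet`, `FAKE_TABLE`, and the `View` methods are hypothetical names invented for illustration. It only assumes that a queryset is lazy, caches its results on first iteration, and that `.all()` returns a fresh, unevaluated copy.

```python
# Hypothetical stand-in for a database table.
FAKE_TABLE = ["alice"]


class ToyQuerySet:
    """Minimal imitation of a lazy queryset with a result cache."""

    def __init__(self):
        self._result_cache = None  # filled on first iteration

    def all(self):
        # Like a clone: a new queryset with no cached results.
        return ToyQuerySet()

    def __iter__(self):
        if self._result_cache is None:
            self._result_cache = list(FAKE_TABLE)  # the simulated query
        return iter(self._result_cache)


class View:
    queryset = ToyQuerySet()  # created once, at class declaration

    def list_without_all(self):
        return list(self.queryset)

    def list_with_all(self):
        return list(self.queryset.all())


first = View().list_without_all()   # evaluates and caches the shared queryset
FAKE_TABLE.append("bob")            # a row is added between "requests"
stale = View().list_without_all()   # reuses the shared cache
fresh = View().list_with_all()      # the clone runs the simulated query again
```

In this sketch `stale` is `['alice']` while `fresh` is `['alice', 'bob']`, which is the staleness that the forced `.all()` is meant to avoid.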

xordoquy · 2015-03-17T12:49:44Z

@tomchristie do we really need the queryset reevaluation?
It definitely helps newcomers who don't fully understand querysets, but I'm a bit afraid we'll get a couple of similar issues, such as this one, at some point.

aleontiev · 2015-03-17T13:08:45Z

This isn't a major blocker right now (e.g. we've been able to work around it with a custom ListSerializer and/or custom relation field that proxies a serializer)

However, perhaps there should be a setting that allows you to toggle this behavior, which can be on by default for newcomers per your example?

tomchristie · 2015-03-17T15:29:19Z

> @tomchristie do we really need the queryset reevaluation ?

@xordoquy Sorry, which one specifically?

xordoquy · 2015-03-17T15:59:39Z

@tomchristie the one we have in serializers:

        # Dealing with nested relationships, data can be a Manager,
        # so, first get a queryset from the Manager if needed
        iterable = data.all() if isinstance(data, (models.Manager, query.QuerySet)) else data
        return [
            self.child.to_representation(item) for item in iterable
        ]

The more I think about it, the more I believe it is encouraging bad patterns (i.e. querysets evaluated where they shouldn't be).

tomchristie · 2015-03-17T16:13:21Z

We certainly need it for the Manager case. I expect we do need it for the queryset case too, but it'd be worth looking at the history for that line.

kevin-brown · 2015-03-17T16:58:59Z

> This is because calling .all() on a filtered managed queryset will re-evaluate that queryset as if it has no filters:

Is this something that's documented? It seems strange that calling all() on a queryset multiple times will perform different queries.

xordoquy · 2015-03-17T17:03:18Z

@kevin-brown adding .all() creates a new queryset, which by default isn't evaluated, hence the second query. In most cases .all().all() won't add an additional DB request, since the first queryset won't get a chance to be evaluated.

kevin-brown · 2015-03-17T17:05:00Z

My question is specifically about the different query. I understand that it should be re-evaluated, which would be fine if it triggered the same query, but I don't understand why it's being evaluated using a different query.

I'm specifically referencing this

>>> user.groups.all()
[<Group: test>]
>>> user.groups.all().all()
[<Group: student>, <Group: test>]

xordoquy · 2015-03-17T17:14:46Z

@kevin-brown you probably have a point there. It should have been the very same request played another time. Probably an issue on the Django part.

aleontiev · 2015-03-17T18:03:24Z

@kevin-brown @xordoquy this does seem like a Django bug / undesired behavior related to the way querysets are cloned, which is happening during calls to .all. (https://github.com/django/django/blob/stable/1.7.x/django/db/models/query.py#L953)

The prefetch context is lost during cloning; also note that calling .all and then re-evaluating an evaluated queryset will issue another query:

>>> from django.db.models import Prefetch
>>> from django.contrib.auth.models import User, Group
>>> from django.db import connection
>>> user = User.objects.create(username='[email protected]', email='[email protected]')
>>> user.groups = [Group.objects.create(name='test1'), Group.objects.create(name='test2')]
>>> connection.queries = []
>>> users = User.objects.filter(pk=user.pk).only('id').prefetch_related(Prefetch('groups', queryset=Group.objects.filter(name='test1').only('name')))
>>> user = users[0]
>>> connection.queries
[{u'sql': u'SELECT "auth_user"."id" FROM "auth_user" WHERE "auth_user"."id" = 1435 LIMIT 1', u'time': u'0.001'}, {u'sql': u'SELECT ("auth_user_groups"."user_id") AS "_prefetch_related_val_user_id", "auth_group"."id", "auth_group"."name" FROM "auth_group" INNER JOIN "auth_user_groups" ON ( "auth_group"."id" = "auth_user_groups"."group_id" ) WHERE ("auth_group"."name" = \'test1\' AND "auth_user_groups"."user_id" IN (1435))', u'time': u'0.001'}]
>>> len(connection.queries)
2
>>> user_groups = user.groups.all()
>>> user_groups
[<Group: test1>]
>>> len(connection.queries)
2
>>> user_groups.all()
[<Group: test1>, <Group: test2>]
>>> len(connection.queries)
3
>>> user_groups.all().all()
[<Group: test1>, <Group: test2>]
>>> len(connection.queries)
4
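The session above can be imitated with a toy model in plain Python. This is a sketch of one reading of the behavior, not Django's actual implementation: `ToyQuerySet`, `toy_prefetch`, and `QUERY_LOG` are hypothetical names. The assumption is that the prefetch attaches the filtered rows as cached results on an otherwise unfiltered relation queryset, and that cloning via `.all()` keeps the query but drops that cache.

```python
QUERY_LOG = []  # records each simulated database hit


class ToyQuerySet:
    def __init__(self, rows):
        self._rows = rows          # what running the query would return
        self._result_cache = None  # set on evaluation or by a prefetch

    def all(self):
        # Cloning keeps the query but not the cached results.
        return ToyQuerySet(self._rows)

    def __iter__(self):
        if self._result_cache is None:
            QUERY_LOG.append("SELECT")
            self._result_cache = list(self._rows)
        return iter(self._result_cache)


def toy_prefetch(relation_rows, keep):
    """Attach filtered results to an otherwise unfiltered relation queryset."""
    qs = ToyQuerySet(relation_rows)
    qs._result_cache = [row for row in relation_rows if keep(row)]
    return qs


user_groups = toy_prefetch(["test1", "test2"], keep=lambda name: name == "test1")

prefetched = list(user_groups)                 # served from the cache
queries_after_prefetched = len(QUERY_LOG)

after_all = list(user_groups.all())            # clone re-queries, filter is gone
queries_after_all = len(QUERY_LOG)

after_all_all = list(user_groups.all().all())  # only the last clone is evaluated
queries_after_all_all = len(QUERY_LOG)
```

Under these assumptions the simulated query count goes 0, 1, 2, mirroring the 2, 3, 4 progression in the real session: reading the prefetched result costs nothing, and each evaluated clone costs one unfiltered query.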

tomchristie · 2015-06-24T09:26:27Z

Have closed #2727 as a duplicate of this, although it's possible that we could state the issue more broadly, as it's probably not just prefetch_related that's at issue, but anything else that can be lost when .all() is called on the queryset.

tomchristie · 2015-06-24T09:28:28Z

TODO to progress this issue:

Issue a pull request with iterable = data.all() if isinstance(data, models.Manager) else data (note that QuerySet is removed from the isinstance check).
Do any tests fail?
Check the blame/history on that line - when was it last modified and when was QuerySet added? What was the rationale at the time?

Note that #2727 includes a very trivial example for demonstrating the behavior that doesn't rely on prefetch_related... #2727 (comment)
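The effect of the proposed one-line change can be sketched with toy classes in plain Python. `ToyManager`, `ToyQuerySet`, `to_iterable_old`, and `to_iterable_new` are hypothetical names, not DRF or Django APIs, and the sketch assumes that `.all()` on a queryset returns a clone without cached results.

```python
class ToyQuerySet:
    def __init__(self, rows, cached=None):
        self._rows = rows             # what running the query would return
        self._result_cache = cached   # e.g. results attached by a prefetch

    def all(self):
        return ToyQuerySet(self._rows)  # clone drops cached results

    def __iter__(self):
        if self._result_cache is None:
            self._result_cache = list(self._rows)
        return iter(self._result_cache)


class ToyManager:
    """Not iterable itself; it must be turned into a queryset first."""

    def __init__(self, rows):
        self._rows = rows

    def all(self):
        return ToyQuerySet(self._rows)


def to_iterable_old(data):
    # Existing check: both managers and querysets get .all() called on them.
    return data.all() if isinstance(data, (ToyManager, ToyQuerySet)) else data


def to_iterable_new(data):
    # Proposed check: only managers get .all(); querysets are iterated as-is.
    return data.all() if isinstance(data, ToyManager) else data


prefetched = ToyQuerySet(["student", "test"], cached=["test"])
old_result = list(to_iterable_old(prefetched))
new_result = list(to_iterable_new(prefetched))
manager_result = list(to_iterable_new(ToyManager(["student", "test"])))
```

In this model the old check yields `['student', 'test']` for the prefetched queryset, while the new check preserves `['test']` and still handles the manager case.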

jpadilla · 2015-06-25T02:04:50Z

Making that change keeps all tests passing.
This was originally changed in "List resource not updated between requests" #2602.

Working on #3076 to hopefully progress this issue.

Progressing #2704

tomchristie · 2015-06-25T13:51:59Z

Closed by #3076.

aleontiev · 2015-06-25T15:19:55Z

Thanks for the fix guys :)

tomchristie · 2015-06-25T15:37:30Z

😄

xordoquy closed this as completed Mar 17, 2015

xordoquy reopened this Mar 17, 2015

xordoquy mentioned this issue Mar 19, 2015

ListSerializer.to_representation cuts data and KeyError is raised if the queryset used has been modified #2727

Closed

tomchristie added the Needs design decision label Mar 23, 2015

MattBlack85 mentioned this issue Apr 20, 2015

Unable to modify objects in Serializer(instance=objects, many=True) if objects is a queryset #2841

Closed

tomchristie mentioned this issue Jun 23, 2015

Serializers many to many field not reflecting edits made in PUT/PATCH if prefetch_related used #2442

Closed

tomchristie mentioned this issue Jun 25, 2015

Progressing #2704 #3076

Merged

tomchristie added a commit that referenced this issue Jun 25, 2015

Merge pull request #3076 from jpadilla/issues/2704

df7c114

Progressing #2704

tomchristie closed this as completed Jun 25, 2015

tomchristie added this to the 3.1.4 Release milestone Jun 25, 2015

tomchristie added Bug and removed Needs design decision labels Jun 25, 2015

tomchristie modified the milestones: 3.1.4 Release, 3.2.0 Release Jul 30, 2015

pyup-bot mentioned this issue Oct 22, 2016

Pin djangorestframework to latest version 3.5.1 wooyek/django-website-pro#16

Merged

pyup-bot mentioned this issue Oct 30, 2016

Pin djangorestframework to latest version 3.5.1 Cyberbyte-Software/Sensor-Portal#49

Merged

pyup-bot mentioned this issue Feb 26, 2017

Pin djangorestframework to latest version 3.5.4 maidstone-hackspace/maidstone-hackspace-website#39

Merged

This was referenced Mar 9, 2017

Pin djangorestframework to latest version 3.6.0 getpatchwork/patchwork#87

Closed

Pin djangorestframework to latest version 3.6.1 getpatchwork/patchwork#88

Closed

Pin djangorestframework to latest version 3.6.2 getpatchwork/patchwork#89

Closed

pyup-bot mentioned this issue Jun 7, 2017

Pin djangorestframework to latest version 3.6.3 srtab/alexandriadocs#39

Closed

pyup-bot mentioned this issue Sep 19, 2017

Pin djangorestframework to latest version 3.6.4 founders4schools/django-donations#21

Merged

This was referenced Oct 6, 2017

Pin djangorestframework to latest version 3.7.0 founders4schools/django-donations#29

Closed

Pin djangorestframework to latest version 3.7.0 getpatchwork/patchwork#125

Closed

Pin djangorestframework to latest version 3.7.0 javipalanca/ojoalplato#65

Closed

This was referenced Nov 14, 2017

Pin djangorestframework to latest version 3.7.3 adfinis/timed-backend#152

Closed

Pin djangorestframework to latest version 3.7.3 adfinis/timed-backend#168

Closed

This was referenced Dec 10, 2017

Pin djangorestframework to latest version 3.7.3 javipalanca/ojoalplato#141

Closed

Pin djangorestframework to latest version 3.7.3 javipalanca/ojoalplato#195

Closed
