Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix DISTINCT for Oracle databases #2935

Closed
wants to merge 6 commits into from
Closed

fix DISTINCT for Oracle databases #2935

wants to merge 6 commits into from

Conversation

trollknurr
Copy link
Contributor

fix error "DatabaseError: ORA-00932: inconsistent datatypes: expected - got NCLOB" for oracle db

@tomchristie
Copy link
Member

Presumably this fix means we'll just have silently broken behavior, by not including the required distinct?

@jpadilla
Copy link
Member

For more background on this. From docs:

LOB columns may not be used in a SELECT DISTINCT list. This means that attempting to use the QuerySet.distinct method on a model that includes TextField columns will result in an error when run against Oracle. As a workaround, use the QuerySet.defer method in conjunction with distinct() to prevent TextField columns from being included in the SELECT DISTINCT list.

@trollknurr
Copy link
Contributor Author

@tomchristie DRF 2.4 works fine without distinct.
https://github.com/tomchristie/django-rest-framework/blob/version-2.4.x/rest_framework/filters.py#L106
this fix was figured out during migration from DRF 2.x to 3.x

@trollknurr
Copy link
Contributor Author

@jpadilla QuerySet defer(field_name) method will not suit in case we want to search on this field_name

@tomchristie
Copy link
Member

DRF 2.4 works fine without distinct.

I believe that the distinct was added due to a bug being found since the 3.0 release. I've no reason to believe that it wasn't also present in 2.x, but it only manifests with certain lookups.

(Someone would need to track down the issue, I don't remember exactly, but git blame would probably point in the right direction)

@kevin-brown
Copy link
Member

I believe that the distinct was added due to a bug being found since the 3.0 release.

Confirmed.

@tomchristie
Copy link
Member

Right - so I'd assume merging this fix would also mean that we'd be silently breaking #2535 for users of Oracle? I assume there's no good solution here?

@trollknurr
Copy link
Contributor Author

That's right, i broked #2535. Here updated fix.
Using set give one more query overheat, but Oracle users want to use SearchFilter

queryset = queryset.distinct()
else:
pk_list = queryset.values_list('pk', flat=True)
queryset = view.model.objects.filter(pk__in=set(pk_list))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using view.model.objects here feels a little bit brittle. I think we can avoid needing that. Does something like the following make sense?...

if settings.DATABASES[queryset.db]["ENGINE"] != "django.db.backends.oracle":
    return queryset.filter(reduce(operator.or_, or_queries)).distinct()
else:
    pk_list = queryset.filter(reduce(operator.or_, or_queries)).values_list('pk', flat=True)
    return queryset.filter(pk_in=set(pk_list))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note here: Since when is the model attribute back on views? I thought that was removed.

@trollknurr
Copy link
Contributor Author

Avoided using view.model.objects & little enchancements

@@ -100,13 +101,17 @@ def filter_queryset(self, request, queryset, view):

orm_lookups = [self.construct_search(six.text_type(search_field))
for search_field in search_fields]

and_queries = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the motivation here - feels more obscure after this change rather than more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't sure, if calling queryset.filter(pk_in=set(pk_list)) will make duplications go away, as they pk still will be in pk_list. I want to filter original queryset by pk_list.
When we call filter with multiple kwargs or several times on one queryset it is equal to logical and.

@trollknurr
Copy link
Contributor Author

@tomchristie updated

and_queries.append(reduce(operator.or_, or_queries))

if and_queries:
if settings.DATABASES[queryset.db]["ENGINE"] == "django.db.backends.oracle":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll want a brief comment here regarding needing this behavior for oracle and distinct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomchristie According to Oracle DB limits there is no capability to make a DISTINT on *LOB. There is a need to use SearchFilter with TextField (for example). That's why i pull this workaround for Oracle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarification: I was meaning that we should have a code comment for this.

@jpadilla
Copy link
Member

jpadilla commented Jun 2, 2015

@xordoquy thinking we can also milestone this for next release as it seems to be 99% completed.

@xordoquy
Copy link
Collaborator

xordoquy commented Jun 2, 2015

I won't have the time to review it.
I'll try to get the release out by tomorrow evening which doesn't leave much time.

@trollknurr
Copy link
Contributor Author

@tomchristie updated

for search_term in self.get_search_terms(request):
or_queries = [models.Q(**{orm_lookup: search_term})
for orm_lookup in orm_lookups]
queryset = queryset.filter(reduce(operator.or_, or_queries)).distinct()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer it if we didn't introduce the and_queries. We should keep each pull request as absolutely minimal as possible. Seems like there's two different changes here - the Oracle support as one, and the change in the filtering style as another. Let's just adopt the first of those two.
Apologies for the slow progress on this, but thanks for getting it nearly there! :)

@trollknurr
Copy link
Contributor Author

@tomchristie i'm not sure about failed test - is it my fault?

@@ -152,7 +163,7 @@ def remove_invalid_fields(self, queryset, fields, view):
field.source or field_name
for field_name, field in serializer_class().fields.items()
if not getattr(field, 'write_only', False)
]
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need to be indented another level.

This is what is causing Flake8 to fail by the way.

@tomchristie
Copy link
Member

Superseeded by #3103.

@tomchristie tomchristie closed this Jul 2, 2015
@xordoquy xordoquy added this to the 3.2.0 Release milestone Aug 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants