Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed Up Dataset List Request #6759

Merged
merged 18 commits into from
Jan 24, 2023
Merged

Speed Up Dataset List Request #6759

merged 18 commits into from
Jan 24, 2023

Conversation

fm3
Copy link
Member

@fm3 fm3 commented Jan 18, 2023

  • Slightly speeds up the dataset list request by moving the filter logic to composed SQL.
  • An optional boolean parameter compact can be used to get a more compact response with less information. This speeds up the route significantly more, as the complete response is constructed with just one SQL query.
  • the optional isEditable GET parameter in the dataset list request has been removed (it was, to our best knowledge, unused)
  • The front-end does not yet use this new compact version, but this is planned, cc @philippotto
  • Note that the work by @normanrz in the now superseded [Reference] Dataset list query #6766 paved the way for this PR :)
  • Also slipped in here: bugfix for folder access query (make visible folders that include public datasets)
  • Also slipped in here: bugfix for annotation list total count header

URL of deployed dev instance (used for testing):

Steps to test:

  • query http://localhost:9000/api/datasets?compact=true
  • should be fast (compared to no compact=true)
  • should still work with filters like isActive, searchQuery, folderId, … (except isEditable, that one has been removed)
  • isEditable boolean field should be correct (that one needs most scrutiny I’d say)

Issues:


@fm3 fm3 self-assigned this Jan 18, 2023
@fm3 fm3 marked this pull request as ready for review January 23, 2023 12:22
@fm3 fm3 changed the title [WIP] speed up dataset list request Speed Up Dataset List Request Jan 23, 2023
@fm3 fm3 requested a review from normanrz January 23, 2023 12:25
Copy link
Member

@normanrz normanrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good stuff. Looking forward to seeing the perf improvements!

I am wondering if a complex query like this (or parts of it) might be better placed in a view for reusability and debuggability.

app/models/binary/DataSet.scala Outdated Show resolved Hide resolved
(
(u.isAdmin AND u._organization = d._organization) OR
u.isDatasetManager OR
d._uploader = u._id OR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may need to revise that in the future. There needs to be a way to revoke access to a dataset even for the person that happens to have uploaded the data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you agree we can just remove this from the relevant places now. In fact, to be allowed to upload (to a folder), you need to be admin/dataset manager or folder team manager already. So this change will not affect many (and there it may actually be intended)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #6780 – I decided not to do it right away because I need to double-check that this does not break the uploading process itself

d.isUsable,
d.displayName,
d.created,
COALESCE(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw this could also be implemented as a view: a view of all dataset+user combinations with an attribute whether the user can read/write the dataset. The benefit of a view would be that we can easily examine it with pgadmin et al and it could potentially be used by multiple queries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about the views. However, we need to figure out a way to handle them during migrations. The existing userInfos view is already frequently a source of trouble (needs to be dropped + recreated in evolutions). I would not like to have to do this for two more views all the time 😬 Maybe there is a way to get around this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the problem with dropping/recreating?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a conceptual problem, but time and again we wrote evolutions with only their semantics in mind, then saw them fail because the view handling was not yet included. It of course makes the evolutions a lot bigger as well. None of this totally blocks this, but I would only go for it if we see concrete benefit from creating more views

@fm3
Copy link
Member Author

fm3 commented Jan 24, 2023

Another question: I spoke to @jstriebel and they said the current webknossos-libs don’t use any of the fields that "compact" removes. So in theory, we could just breaking-change the list query once the front-end uses only the compact version. However, this would break the versioning in https://docs.webknossos.org/webknossos/rest_api.html (even more). What are your feelings on this? Do you know if there are still users querying the Rest API directly, without our libs?
Related #5840

@normanrz
Copy link
Member

What are your feelings on this? Do you know if there are still users querying the Rest API directly, without our libs?

I would mention it in migration guide and break it. I'd say that the REST API is only officially supported through our official libraries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants