Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add import taxonomy endpoint #112

Merged

Conversation

rpenido
Copy link
Contributor

@rpenido rpenido commented Nov 1, 2023

Description

Add a rest api endpoint to import taxonomies

Supporting Information

Testing instructions

  • Please ensure that the tests cover the expected behavior of the view

Private-ref: FAL-3532

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Nov 1, 2023
@openedx-webhooks
Copy link

openedx-webhooks commented Nov 1, 2023

Thanks for the pull request, @rpenido! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

  • supporting documentation
  • Open edX discussion forum threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

@rpenido rpenido force-pushed the rpenido/fal-3532-import-taxonomy-bare-bones branch from d552b92 to cb30030 Compare November 1, 2023 21:06
@rpenido rpenido force-pushed the rpenido/fal-3532-import-taxonomy-bare-bones branch from cb30030 to c1d5944 Compare November 1, 2023 21:22
openedx_tagging/core/tagging/rest_api/v1/serializers.py Outdated Show resolved Hide resolved
openedx_tagging/core/tagging/rest_api/v1/serializers.py Outdated Show resolved Hide resolved
taxonomy_description=taxonomy_description,
file=file,
parser_format=parser_format,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. the import/export tags API was designed to use async celery tasks so it could run in the background (in case there's a lot of tags to import).

@bradenmacdonald do you want us to use these async tasks here, or are we doing all imports synchronously?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect synchronous will be totally fine for the short term, but we might as well do it async since we've done most of the work already. How much work will that add to make the REST API layer async too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do we mean to make the call async here?

  1. The client makes the API call, gets the response "Job running.." immediately, and later is notified (websocket or pooling) that the job is done and the results
  2. The client makes the API call, and the REST API calls the Python API async but waits for the result before sending the response to the client (the HTTP request will be "sync", but we will use the celery task)

The 2 is simple (but I don't know if we have some gain that way).
1 size depends on what we want to accomplish. Will we just show a toast, or we want a page listing the job history?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant (1). But with simple polling for now, no need for websockets or anything fancy. For (2) it actually ties up much more server resources than just directly doing the import synchronously.

If that's going to take a while to implement though we can just to sync for now.

Copy link
Contributor Author

@rpenido rpenido Nov 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!
There is only one problem in our flow:
We need to create the Taxonomy before calling the import.

If the import fails, it is easy to remove the created taxonomy in the sync call.

Doing async, we can create the taxonomy, return the id to the client, and make calls waiting for the results (the result is associated with the taxonomy). But I need to figure out how to delete the freshly created taxonomy after a failed attempt. Would you happen to have any ideas on this?

This is not urgent for this task, but we will need to handle it if we want to go async sometime in the future. One approach is to change the import flow in the front-end: we first create the taxonomy and always call import inside it (then we don't need to handle the delete). This will impact our designs and this task a bit.

Another approach is to handle the create+import sync and use async for importing on an already created taxonomy. But it will also impact the UX, having different flows.

I think the appropriate solution will be to refactor the import to let it create the taxonomy and change the results to be tied to some kind of "job-id", and not the taxonomy itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the right approach is to create an additional async task which creates the taxonomy, then synchronously calls the existing import task (which is a celery task so can be called either sync or async, but since we're already in an async task, we call it sync). If that succeeds, it returns the taxonomy ID, and if it fails, it deletes the taxonomy and returns an error message.

The main thing is that this "wrapper" task which does create+import returns an import ID not a taxonomy ID. And then the frontend polls the import ID until it gets either a success or failure message. Only then does it get the taxonomy ID.


But that's all getting too complicated. Just do it synchronously for now an we'll make it async if it's too slow in practice. Explicitly put a comment in the REST API for the import that "this is an unstable API and may change if we make it async."

@rpenido rpenido force-pushed the rpenido/fal-3532-import-taxonomy-bare-bones branch from 326cde6 to f030767 Compare November 3, 2023 15:04
@rpenido rpenido marked this pull request as ready for review November 6, 2023 13:48
@rpenido
Copy link
Contributor Author

rpenido commented Nov 6, 2023

I think this is ready for a final review @pomegranited !

But I prefer to wait for the development of the edx-platform part before merging.

@rpenido rpenido force-pushed the rpenido/fal-3532-import-taxonomy-bare-bones branch from f6b6854 to 3f1e08e Compare November 7, 2023 13:18
@rpenido rpenido force-pushed the rpenido/fal-3532-import-taxonomy-bare-bones branch from 3f1e08e to a50e08f Compare November 7, 2023 13:30
@rpenido rpenido force-pushed the rpenido/fal-3532-import-taxonomy-bare-bones branch from 77290f6 to e975107 Compare November 7, 2023 13:44
@rpenido rpenido force-pushed the rpenido/fal-3532-import-taxonomy-bare-bones branch from e975107 to e1b1789 Compare November 7, 2023 13:44
Copy link
Contributor

@pomegranited pomegranited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good, and the tests look great @rpenido! Comments inline.

openedx_tagging/core/tagging/rest_api/v1/views.py Outdated Show resolved Hide resolved
openedx_tagging/core/tagging/rest_api/v1/views.py Outdated Show resolved Hide resolved
openedx_tagging/core/tagging/rest_api/v1/views.py Outdated Show resolved Hide resolved
import_success = import_tags(taxonomy, file, parser_format)

if import_success:
return HttpResponse(status=200)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, can we return the Taxonomy on success?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I think we will need to revisit this in the future.

I think we should return the Taxonomy for now, but we will probably want to return the tag list for this taxonomy (at least the first page of it) after the import.

We will have a cleaner view when implementing the Import inside the Taxonomy Details.

And just to make sure: the import inside the Taxonomy Details is not part of this task, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpenido

We will have a cleaner view when implementing the Import inside the Taxonomy Details.

Agreed -- ok to just return Taxonomy for now.

And just to make sure: the import inside the Taxonomy Details is not part of this task, right?

Correct, importing tags to an existing taxonomy is covered by openedx/modular-learning#140

if import_success:
return HttpResponse(status=200)
else:
import_error = get_last_import_log(taxonomy)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. This works fine in a serial environment, but if we've got multiple imports happening at the same time, this won't necessarily be the import log we're interested in here. Really, the only way to get that is if we have the task ID after import.

But I don't know if it's worth refactoring import_tags to support this, so: could you add a comment here to warn us? Then we'll know where to look if users report problems.

Copy link
Contributor Author

@rpenido rpenido Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not as alarming as I made it sound:
1- It only happens on update. If we do a mass import/create we will have new ids, so no problem.
2- I did another look at the code and realized that we prevent multiple imports of the same taxonomy simultaneously. So, the concurrent problem may not occur at all, right?

openedx_tagging/core/tagging/rules.py Outdated Show resolved Hide resolved
@rpenido rpenido requested a review from pomegranited November 8, 2023 15:21
@rpenido
Copy link
Contributor Author

rpenido commented Nov 8, 2023

ToDo: Bump version before merge

Copy link
Contributor

@pomegranited pomegranited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 This looks great @rpenido ! Ready for your review @bradenmacdonald .

Note: needs a version bump before merging.

  • I tested this
  • I read through the code
  • I checked for accessibility issues N/A
  • Includes documentation

def setUp(self):
ImportTaxonomyMixin.setUp(self)

self.taxonomy = Taxonomy.objects.create(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: it's not a big deal, but I generally would prefer that we use the API methods api.create_taxonomy(...) and api.add_tag_to_taxonomy(...) instead of using the models directly. In this case for example, add_tag_to_taxonomy does a lot of additional validation of the created objects, which is good to have, but isn't happening here if we just create the models directly.

The only major exception is in test_models.py where we're actually testing the models; in that case obviously we should use the models directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even in the setUp @bradenmacdonald?

A breaking change will prevent the tests from running and be trickier to diagnose, don't you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that's the thing - the API is not supposed to have breaking changes in general. But we may change the internal models. So if this test isn't specifically testing the internal models, it should use the API which should be more stable. But again, don't worry about it for now. What you have is fine.

@bradenmacdonald
Copy link
Contributor

@pomegranited From a very quick look, this seems good to me. You're a CC for this repo so I'm happy for you to approve and merge it, unless you want me to do a second review?

@pomegranited
Copy link
Contributor

@bradenmacdonald Could you do a 2nd review? I'm happy to merge this when it's ready.

openedx_tagging/core/tagging/rest_api/v1/serializers.py Outdated Show resolved Hide resolved
openedx_tagging/core/tagging/rest_api/v1/serializers.py Outdated Show resolved Hide resolved
@@ -215,9 +240,6 @@ def export(self, request, **_kwargs) -> HttpResponse:
Export a taxonomy.
"""
taxonomy = self.get_object()
perm = "oel_tagging.export_taxonomy"
if not request.user.has_perm(perm, taxonomy):
raise PermissionDenied("You do not have permission to export this taxonomy.")
Copy link
Contributor

@bradenmacdonald bradenmacdonald Nov 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true that "if you can view the taxonomy, you can export it", but are we still checking that the user has "view" permission here? I don't see it. (Though it looks like it's covered by the tests. I'm just wondering where that check is happening).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests/openedx_tagging/core/tagging/test_views.py Outdated Show resolved Hide resolved
@bradenmacdonald
Copy link
Contributor

@pomegranited It looks good to me! Just saw some minor things to clean up.

Co-authored-by: Braden MacDonald <[email protected]>
@pomegranited
Copy link
Contributor

Thanks for fixing those docstrings @rpenido ! Could you push a version bump when you're done addressing @bradenmacdonald 's review, and I'll get this merged.

@rpenido
Copy link
Contributor Author

rpenido commented Nov 9, 2023

Thanks for fixing those docstrings @rpenido ! Could you push a version bump when you're done addressing @bradenmacdonald 's review, and I'll get this merged.

Done c4fe95d!

@pomegranited pomegranited merged commit 579790d into openedx:main Nov 10, 2023
7 checks passed
@openedx-webhooks
Copy link

@rpenido 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.

@pomegranited pomegranited deleted the rpenido/fal-3532-import-taxonomy-bare-bones branch November 10, 2023 02:09
@pomegranited
Copy link
Contributor

Version 0.3.3 released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
open-source-contribution PR author is not from Axim or 2U
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants