import/export Taxonomy API functions #58

ChrisChV · 2023-06-19T19:39:31Z

Description

Implements import_tags and export_tags functions

Supporting information

Merge after #57.

Closes openedx/modular-learning#64

Testing instructions

Ensure that the tests cover the expected behaviour

openedx-webhooks · 2023-06-19T19:39:36Z

Thanks for the pull request, @ChrisChV! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

supporting documentation
Open edX discussion forum threads
timeline information ("this must be merged by XX date", and why that is)
partner information ("this is a course on edx.org")
any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

pomegranited

This is looking really good @ChrisChV ! I added one comment about test coverage -- do you have time to address that as part of this task? Would be great if we could keep the coverage for this module at 100%.

But what's here LGTM: 👍

I tested this by running the tests
I read through the code and tests with care
~~I checked for accessibility issues~~ N/A
~~Includes documentation~~ N/A

pomegranited · 2023-06-20T07:43:20Z

tests/openedx_tagging/core/tagging/test_api.py

@@ -205,3 +211,124 @@ def test_tag_object_invalid_tag(self):
                "course",
            )
        assert "Invalid object tag for taxonomy" in str(exc.exception)
+
+    def test_import_tags_csv(self):


There are some lines missing from the test coverage for import and export -- could you add tests for them too?

Name                                  Stmts   Miss Branch BrPart  Cover   Missing
openedx_tagging/core/tagging/api.py     145     11     64      9    90%   185, 191, 206, 212->224, 215, 221-222, 224->exit, 287-289, 304, 310, 340->exit

Sure, thanks!

bradenmacdonald · 2023-06-20T22:49:09Z

@ChrisChV Until @pomegranited 's PR is merged, can you please change the "merge target" of this PR from main to her branch? Then it will only show the changes from your new commit. After her PR merges you can change it back to main.

bradenmacdonald · 2023-06-20T22:55:41Z

openedx_tagging/core/tagging/api.py

+    if taxonomy.allow_free_text:
+        raise ValueError(
+            _(
+                f"Invalid taxonomy ({taxonomy.id}): You can't import free-from tags taxonomies"


This wording is a bit confusing. I'd suggest: "You cannot import into a free-form taxonomy."

bradenmacdonald · 2023-06-20T22:58:34Z

openedx_tagging/core/tagging/api.py

+                    )
+                )
+            tags_data = list(csv_reader)
+        else:


Instead of using if format not in TaxonomyDataFormat.__members__.values(): above and assuming here that there are only two valid options, it's better to use

    elif format == TaxonomyDataFormat.JSON:
        ...
    else:
        raise ValueError(_(f"Invalid format: {format}"))

@bradenmacdonald I think this is valid on the import. In the export I have left the verification at the beginning before loading the tags
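For readers following the thread, here is a minimal sketch of the dispatch shape being suggested. It assumes `TaxonomyDataFormat` is an `Enum` with `CSV` and `JSON` members (the member values are guesses), and `parse_tags` is a hypothetical helper name, not the function in this PR.

```python
import csv
import io
import json
from enum import Enum


class TaxonomyDataFormat(Enum):
    # Assumed members and values; the real enum in the PR may differ.
    CSV = "CSV"
    JSON = "JSON"


def parse_tags(data: str, format: TaxonomyDataFormat) -> list:
    """Hypothetical helper: return a list of tag dicts parsed from `data`."""
    if format == TaxonomyDataFormat.CSV:
        return list(csv.DictReader(io.StringIO(data)))
    elif format == TaxonomyDataFormat.JSON:
        return json.loads(data)["tags"]
    else:
        # Anything unexpected fails loudly instead of being treated as JSON.
        raise ValueError(f"Invalid format: {format}")
```

With an explicit `else`, adding a third enum member later cannot silently fall into the JSON branch.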

bradenmacdonald · 2023-06-20T23:02:57Z

openedx_tagging/core/tagging/api.py

+        if tag_id not in new_tags and tag_id not in updated_tags:
+            try:
+                # Update tag
+                tag_instance = Tag.objects.get(external_id=tag_id)


You also need to use taxonomy=taxonomy here, or you can have cross-taxonomy bugs.

Oh wow.. I should have seen that!
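A small illustration of the cross-taxonomy problem, using plain Python objects as a stand-in for the ORM (the `Tag` dataclass, `TAGS` list, and `get_tag` helper are hypothetical and only mimic a `.get()` lookup):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    taxonomy_id: int
    external_id: str
    value: str


# Two taxonomies that happen to reuse the same external_id.
TAGS = [
    Tag(taxonomy_id=1, external_id="t1", value="Bacteria"),
    Tag(taxonomy_id=2, external_id="t1", value="Blue"),
]


def get_tag(external_id, taxonomy_id=None):
    """Mimic a `.get()` lookup: exactly one match or an error."""
    matches = [
        t for t in TAGS
        if t.external_id == external_id
        and (taxonomy_id is None or t.taxonomy_id == taxonomy_id)
    ]
    if len(matches) != 1:
        raise LookupError(f"expected 1 tag, found {len(matches)}")
    return matches[0]
```

Here `get_tag("t1")` is ambiguous and raises, while `get_tag("t1", taxonomy_id=2)` returns the tag from the intended taxonomy. In the real code the equivalent fix is passing `taxonomy=taxonomy` to the query.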

bradenmacdonald · 2023-06-20T23:03:41Z

openedx_tagging/core/tagging/api.py

+                    # if there is no parent in the data import
+                    tag_instance.parent = None
+                updated_tags.append(tag_id)
+            except ObjectDoesNotExist:


Better to use Tag.DoesNotExist instead of the generic base class.
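To illustrate why the narrower exception is safer, here is a sketch with stand-in classes that mirror how each Django model gets its own `DoesNotExist` subclass of `ObjectDoesNotExist`. The `load_parent_taxonomy` and `update_or_create` functions are hypothetical.

```python
class ObjectDoesNotExist(Exception):
    """Stand-in for django.core.exceptions.ObjectDoesNotExist."""


class Tag:
    class DoesNotExist(ObjectDoesNotExist):
        pass


class Taxonomy:
    class DoesNotExist(ObjectDoesNotExist):
        pass


def load_parent_taxonomy():
    # Hypothetical failure from unrelated code inside the same try block.
    raise Taxonomy.DoesNotExist("taxonomy missing")


def update_or_create(catch):
    try:
        load_parent_taxonomy()
    except catch:
        # Wrong conclusion if `catch` is broad enough to swallow other models' errors.
        return "created new tag"
```

Catching `ObjectDoesNotExist` treats the missing taxonomy as a missing tag, whereas catching `Tag.DoesNotExist` lets the unrelated error propagate.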

bradenmacdonald · 2023-06-20T23:05:03Z

openedx_tagging/core/tagging/api.py

+            return tag_instance
+        else:
+            # Returns the created/updated tag from history
+            return Tag.objects.get(external_id=tag_id)


Same thing here, needs taxonomy=taxonomy. The same external_id can appear in multiple taxonomies.

BTW if you add something like this to the Taxonomy class:

    @property
    def tags(self):
        return Tag.objects.filter(taxonomy=self)

then this code can become return taxonomy.tags.get(external_id=tag_id) 😎

There's already taxonomy.tag_set, so I don't think we need another property?

Oh right, duh... not sure why I forgot that. But yes, use that :)

ChrisChV · 2023-06-21T17:09:09Z

@ChrisChV Until @pomegranited 's PR is merged, can you please change the "merge target" of this PR from main to her branch? Then it will only show the changes from your new commit. After her PR merges you can change it back to main.

@bradenmacdonald I have not found a way to change openedx/openedx-learning to open-craft/openedx-learning. I created open-craft#1 on open-craft/openedx-learning. Do you know a way to change the "merge target" between repos?
Personally, in these cases I do not like to have two PRs, since you can lose the thread of the reviews.

bradenmacdonald · 2023-06-21T17:36:03Z

@ChrisChV Oh, I see. In that case, what you'd have to do is push Jill's branch to the openedx fork if you can; if not, just leave it how you have it and link to the commit. That's fine.

ChrisChV · 2023-06-22T18:30:00Z

@ormsbee This is ready for your review

ChrisChV · 2023-06-27T21:56:17Z

@ormsbee I merged the changes of #57. Now the review is easier and it's ready

pomegranited

@ChrisChV I found a couple of nits, but otherwise good 👍

I tested this by running the tests locally and ensuring 100% coverage.
I read through the code and tests, and ensured that the tests cover our use cases.
~~I checked for accessibility issues~~ N/A
Includes documentation
Commit structure follows OEP-0051

pomegranited · 2023-06-28T06:13:29Z

openedx_tagging/core/tagging/api.py

 from django.utils.translation import gettext_lazy as _

 from .models import ObjectTag, Tag, Taxonomy

+csv_fields = ['id', 'name', 'parent_id', 'parent_name']


nit: would like to make it clear that this is a constant, and it's not part of the externally-exportable python api:

Suggested change

-csv_fields = ['id', 'name', 'parent_id', 'parent_name']
+_CSV_FIELDS = ['id', 'name', 'parent_id', 'parent_name']
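As a sketch of how a module-private constant like this might be used, the snippet below checks a CSV header against `_CSV_FIELDS` before reading rows. `read_tag_rows` is a hypothetical helper, and treating all four columns as required is an assumption, not something the PR states.

```python
import csv
import io

_CSV_FIELDS = ['id', 'name', 'parent_id', 'parent_name']


def read_tag_rows(text):
    """Hypothetical helper: parse CSV text, checking the header first."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    # Assumption: every field in _CSV_FIELDS must be present in the header.
    missing = [f for f in _CSV_FIELDS if f not in header]
    if missing:
        raise ValueError(f"Missing CSV fields: {missing}")
    return list(reader)
```

The leading underscore signals that callers outside the module should not import the list directly.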

pomegranited · 2023-06-28T06:20:32Z

tests/openedx_tagging/core/tagging/test_api.py

+        tagging_api.resync_object_tags([object_tag])
+        object_tag = ObjectTag.objects.get(object_id=object_id)
+        self.assertEqual(object_tag.tag.value, 'Bacteria')
+        self.assertEqual(object_tag._value, 'Bacteria')


nit: pylint doesn't like this, but I agree it's necessary for the test. So could you please add this here and in the two other mentions below?

Suggested change

-self.assertEqual(object_tag._value, 'Bacteria')
+self.assertEqual(object_tag._value, 'Bacteria')  # pylint: disable=protected-access

pomegranited · 2023-06-28T06:27:10Z

openedx_tagging/core/tagging/api.py

+        if replace:
+            taxonomy.tag_set.exclude(external_id__in=updated_tags).delete()
+
+        resync_object_tags(ObjectTag.objects.filter(taxonomy=taxonomy))


This might be a large operation.. so I don't think it should sit under the same atomic operation. Can you bump it out a level?

Suggested change

-        resync_object_tags(ObjectTag.objects.filter(taxonomy=taxonomy))
+    resync_object_tags(ObjectTag.objects.filter(taxonomy=taxonomy))

pomegranited · 2023-06-28T06:28:11Z

openedx_tagging/core/tagging/api.py

+        json_result = {
+            'name': taxonomy.name,
+            'description': taxonomy.description,
+            'tags': result


nit:

Suggested change

-'tags': result
+'tags': result,

ormsbee

I started writing per-line reviews, but then I thought it best to step back a bit.

Import is going to be one of those places where the complexity is going to grow a lot as time goes on. We already know that we're going to have to support a variety of modifications in the future–consolidating tags, renaming tags, etc.

We're clearly not going to implement that functionality yet. But the fact that this is going to grow a lot means that we should be really disciplined and explicit about the stages of data transformation that we have here. We also have to keep the code as painfully simple and easy to test as possible.

What I expect to see as a top level import function would be something fairly short that calls out to other classes/functions to implement something like the following steps, where the output of one step is the input into the next:

Parse the raw input into a DSL of actions (e.g. CreateOrUpdate, Delete, Rename, etc.).
Validate that the parsed set of actions is internally consistent (e.g. we're not adding a Tag as a parent of itself).
Validate that the actions make sense against the Taxonomy data being operated on (requires reading data from our Taxonomy).
Create a plan for the operations we'll need to do (because preview functionality is really nice to have when there's no revert button).
Execute that plan.

Each of those steps should be separately testable. We should be able to have a CSVParser and a JSONParser and have the inputs and outputs of their tests only be the raw data and the actions they generate.

We should be able to have a set of actions, and test the function that validates them.

When we chain things together, we should be able to set up a database state and a list of DSL statements and test the plan that we generate.

Right now, all of these concerns are mixed together in import_tags. This will make it harder to debug as time goes on, and encourage the use of side-effects to effectively "message" later parts of the pipeline with dicts built earlier on.

These may not seem like immediate problems, but if we make it so that the easiest way to add new features is to add a couple dozen lines to import_tags and make a new 50 line inner function there, things are going to get very confusing very quickly.
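The staged pipeline described in this review could look roughly like the sketch below. It is only an illustration under simplifying assumptions: a single `CreateOrUpdate` action, a plain dict (`existing`, mapping tag id to name) standing in for the Taxonomy data, and parent links that are validated but not stored. All names here are hypothetical and are not taken from this PR.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CreateOrUpdate:
    tag_id: str
    name: str
    parent_id: Optional[str] = None


def parse_csv(text):
    """Step 1: raw CSV text -> list of actions (no database access)."""
    return [
        CreateOrUpdate(row["id"], row["name"], row.get("parent_id") or None)
        for row in csv.DictReader(io.StringIO(text))
    ]


def validate_actions(actions):
    """Step 2: internal consistency only, e.g. no tag is its own parent."""
    errors = []
    seen = set()
    for a in actions:
        if a.parent_id == a.tag_id:
            errors.append(f"{a.tag_id}: tag cannot be its own parent")
        if a.tag_id in seen:
            errors.append(f"{a.tag_id}: duplicate id")
        seen.add(a.tag_id)
    return errors


def validate_against_taxonomy(actions, existing):
    """Step 3: check actions against current data (`existing`: id -> name)."""
    known = set(existing) | {a.tag_id for a in actions}
    return [
        f"{a.tag_id}: unknown parent {a.parent_id}"
        for a in actions
        if a.parent_id is not None and a.parent_id not in known
    ]


def make_plan(actions, existing):
    """Step 4: describe what would change, without changing anything."""
    plan = []
    for a in actions:
        if a.tag_id not in existing:
            plan.append(("create", a))
        elif existing[a.tag_id] != a.name:
            plan.append(("rename", a))
    return plan


def execute(plan, existing):
    """Step 5: apply the plan; here the 'database' is just a dict."""
    for _, action in plan:
        existing[action.tag_id] = action.name
    return existing


def import_tags(text, existing):
    """Short top-level function that only chains the stages together."""
    actions = parse_csv(text)
    errors = validate_actions(actions) or validate_against_taxonomy(actions, existing)
    if errors:
        raise ValueError("; ".join(errors))
    return execute(make_plan(actions, existing), existing)
```

Each stage takes plain data in and returns plain data out, so a parser test needs only raw text and expected actions, and a planner test needs only a starting dict and a list of actions.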

ormsbee · 2023-06-29T06:26:22Z

openedx_tagging/core/tagging/api.py

+
+    updated_tags = []
+
+    def create_update_tag(tag):


Please try to avoid inner functions this large and complex. The create_update_tag function has interesting logic but it's harder to test separately because it's nested in here.

ChrisChV · 2023-07-14T19:26:46Z

Closed in favor of #64

openedx-webhooks · 2023-07-14T19:26:51Z

@ChrisChV Even though your pull request wasn’t merged, please take a moment to answer a two-question survey so we can improve your experience in the future.

openedx-webhooks added the open-source-contribution label (PR author is not from Axim or 2U) Jun 19, 2023

pomegranited approved these changes Jun 20, 2023


ChrisChV force-pushed the chris/taxonomy-import-export branch 2 times, most recently from 442cb64 to c2e5099 Compare June 20, 2023 17:04

mphilbrick211 added the needs test run label (Author's first PR to this repository, awaiting test authorization from Axim) Jun 20, 2023

bradenmacdonald reviewed Jun 20, 2023


ChrisChV force-pushed the chris/taxonomy-import-export branch from a10c6f6 to c0e5a48 Compare June 21, 2023 16:57

ChrisChV force-pushed the chris/taxonomy-import-export branch from c0e5a48 to ebda011 Compare June 21, 2023 18:29

ChrisChV force-pushed the chris/taxonomy-import-export branch from ebda011 to 455e783 Compare June 22, 2023 20:55

pomegranited mentioned this pull request Jun 25, 2023

import/export Taxonomy API functions open-craft/openedx-learning#1

Closed

e0d removed the needs test run label (Author's first PR to this repository, awaiting test authorization from Axim) Jun 26, 2023

ChrisChV added 3 commits June 27, 2023 16:47

feat: import/export Taxonomy API functions

d7538c7

fix: Nits on import/export functions and on tests

577f44b

chore: Improving 'reaplce' functionality of 'import_tags'

e5b1698

ChrisChV force-pushed the chris/taxonomy-import-export branch from a043b41 to e5b1698 Compare June 27, 2023 21:50

pomegranited approved these changes Jun 28, 2023


ormsbee requested changes Jun 29, 2023


pomegranited mentioned this pull request Jul 10, 2023

Taxonomy view/management REST APIs #63

Merged

1 task

ChrisChV closed this Jul 14, 2023
