Bulk organization user creation/addition feature #3651

teovin · 2024-11-07T15:55:40Z

This feature adds a new button to the organization user management view (/manage/organization-users):

Upon clicking, the bulk upload form comes up:

The user selects the organization, the affiliation expiration, and a CSV file. The form cannot be submitted without an organization selection or with a file whose extension is not .csv.

The CSV is also rejected if it does not include all of the column headers, contains no users at all, is missing the email for any user, contains duplicate users, or contains an invalid email address.
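The validation rules above could be sketched roughly as follows. This is a minimal standalone illustration, not the PR's actual form code: the column names (`first_name`, `last_name`, `email`), the `validate_csv` helper, the error wording, and the simple regex (a stand-in for a real email validator) are all assumptions.

```python
import csv
import io
import re

# Hypothetical column names; the real form's headers may differ.
REQUIRED_HEADERS = {"first_name", "last_name", "email"}
# Deliberately simple stand-in for a real email validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_csv(text):
    """Return (user_data, errors); user_data is keyed by lowercased email."""
    reader = csv.DictReader(io.StringIO(text))
    headers = set(reader.fieldnames or [])
    if not REQUIRED_HEADERS <= headers:
        missing = ", ".join(sorted(REQUIRED_HEADERS - headers))
        return {}, [f"CSV is missing required column(s): {missing}"]

    user_data, errors = {}, []
    for row in reader:
        line = reader.line_num  # line of the source file this row ended on
        raw_email = (row.get("email") or "").strip()
        email = raw_email.lower()
        if not email:
            errors.append(f"Line {line}: email is required")
        elif not EMAIL_RE.match(email):
            errors.append(f"Line {line}: {raw_email} is not a valid email address")
        elif email in user_data:
            errors.append(f"Line {line}: {raw_email} is a duplicate")
        else:
            user_data[email] = row

    if not user_data and not errors:
        errors.append("CSV does not contain any users")
    return user_data, errors
```

Collecting every error before returning, rather than stopping at the first one, lets the person uploading fix the whole file in one pass.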

The affiliation field can be toggled just like in the single user flow.

Once the form is submitted, users that don't exist will be created; those that do exist will have their organization affiliation updated. New users will receive the new user email for account activation. Existing users will receive the "user added to organization" email.

A generic success message will be displayed at the end if there are no validation errors:

If some of the users are processed but others cannot be because they are admins or registrars, the message below will be displayed:

If all users in the CSV are admins or registrars, the below error message will be displayed:

This PR also fixes a bug where the existing user email wasn't displaying the organization name.

Disclaimer: Any behavior and the wording here are open to feedback. This was my initial take on it :)

EDIT:

As of Nov 20th, I put this feature behind a feature flag so we can do some more testing on it in stage before a possible production deployment.

codecov · 2024-11-07T16:42:13Z

Codecov Report

Attention: Patch coverage is 91.11111% with 12 lines in your changes missing coverage. Please review.

Project coverage is 69.32%. Comparing base (1c1fd71) to head (cf5882d).
Report is 68 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| perma_web/perma/views/user_management.py | 83.72% | 7 Missing ⚠️ |
| perma_web/perma/forms.py | 94.50% | 5 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #3651      +/-   ##
===========================================
+ Coverage    69.01%   69.32%   +0.31%     
===========================================
  Files           54       54              
  Lines         7478     7623     +145     
===========================================
+ Hits          5161     5285     +124     
- Misses        2317     2338      +21

christiansmith · 2024-11-12T17:18:25Z

Reviewing this with @bensteinberg (running locally), we noticed a few minor details with the CSV handling form.

- The CSV upload fails without a header row or with incorrect column names.
  1. We could provide a template for users to download and edit.
  2. We could also handle the case of a missing header row and validate the structure of the provided data rows.
- When there is a validation error, instructions should remain on the screen along with the error message.
- If a malformed CSV is uploaded, it would be good to have better feedback for the user than the current stack trace.
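One way to replace the stack trace with a readable error is sketched below, under assumptions: the helper name `read_csv_rows` and the messages are hypothetical, and inside a Django form the `ValueError` would presumably be raised as a form validation error instead.

```python
import csv
import io


def read_csv_rows(raw_bytes):
    """Decode and parse uploaded CSV bytes, raising ValueError with a readable message."""
    try:
        # utf-8-sig also strips the byte-order mark that some spreadsheet exports add.
        text = raw_bytes.decode("utf-8-sig")
    except UnicodeDecodeError:
        raise ValueError("The file could not be read as UTF-8 text.") from None

    try:
        # strict=True makes the csv module raise csv.Error on bad quoting
        # instead of silently guessing what was meant.
        return list(csv.reader(io.StringIO(text, newline=""), strict=True))
    except csv.Error as exc:
        raise ValueError(f"The file is not a valid CSV: {exc}") from None
```

The caller can then show the `ValueError` message next to the upload instructions rather than letting the exception propagate.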

bensteinberg

As discussed, @christiansmith and I just looked at this together. I see this is added to the test for permissions, but otherwise has no tests.

Also, we were just discussing this with @rebeccacremona, who pointed out that we might prefer to assemble all the work and run it in a single database query, rather than going line-by-line. It might also make sense to fail if there are any validation errors, rather than create some users and not others.
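To illustrate the batching idea, here is a rough sketch that uses the standard library's `sqlite3` as a stand-in for the Django ORM, where the analogous calls would be `filter(email__in=...)` and `bulk_create`. The `users` table and the `add_users_in_bulk` helper are hypothetical, not the PR's code.

```python
import sqlite3


def add_users_in_bulk(conn, emails):
    """Insert emails not yet in `users`, using one lookup and one batched insert."""
    emails = list(dict.fromkeys(emails))  # drop duplicates, keep order
    if not emails:
        return [], []

    # One SELECT ... IN (...) finds every existing user,
    # instead of issuing one query per CSV row.
    placeholders = ",".join("?" for _ in emails)
    found = {
        row[0]
        for row in conn.execute(
            f"SELECT email FROM users WHERE email IN ({placeholders})", emails
        )
    }
    existing = [e for e in emails if e in found]
    new = [e for e in emails if e not in found]

    # One transaction around a batched insert:
    # either all new users are created or none are.
    with conn:
        conn.executemany("INSERT INTO users (email) VALUES (?)", [(e,) for e in new])
    return existing, new
```

Very large uploads would need the lookup split into chunks, since databases cap the number of bound parameters per statement.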

teovin · 2024-11-15T19:02:38Z

@christiansmith @bensteinberg I pushed a couple of commits for the desired behavior and also added a test. I am checking with Clare about the point quoted below to see what her preference would be. My initial approach was to process the good data and ignore the rest, on the idea that something is better than nothing where that makes sense. I will check with Becky on the point of running a single db query.

> It might also make sense to fail if there are any validation errors, rather than create some users and not others.

teovin · 2024-11-18T15:36:49Z

@christiansmith @bensteinberg I have made some changes to the PR since you last looked, and I'd appreciate it if you could look again. Clare preferred to process the valid users and notify the requestor about those that were invalid, and I updated the PR description and UI messaging to reflect that.

rebeccacremona

This is looking great! This feature has been a lot of hard work, and it shows!

I noticed one tiny bug: if the CSV includes email addresses of existing users and uses capital letters, we will not find them and will try to create a new user with the capitalized email address (which will fail).

I think that bug is worth fixing before merging. But that is the only one I saw!
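The bug can be illustrated with a small standalone sketch. The `EXISTING_USERS` set and the `split_new_and_existing` helper are hypothetical, and the sketch assumes stored emails are lowercase. It keeps the address as typed in a separate field, in the spirit of the later "prevent downcasing raw email" commit.

```python
# Hypothetical stand-in for the users table; assumes stored emails are lowercase.
EXISTING_USERS = {"ada@example.com"}


def split_new_and_existing(raw_emails):
    """Partition CSV emails into existing and new users, matching case-insensitively."""
    existing, new = [], []
    for raw_email in raw_emails:
        raw_email = raw_email.strip()
        email = raw_email.lower()  # normalized form, used for the lookup
        target = existing if email in EXISTING_USERS else new
        # Keep the address as typed alongside the normalized one.
        target.append({"email": email, "raw_email": raw_email})
    return existing, new
```

Without the `.lower()` call, `Ada@Example.com` would land in the `new` list even though the account already exists.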

I also left some thoughts on readability, a few tweaks a person might consider making, a few things we might want to think about in the future, etc.

Please take or leave that feedback 🙂 I don't think any of it, even if it's something we want to do, should stand in the way of this being merged 🙂.

rebeccacremona · 2024-12-02T17:23:11Z

perma_web/perma/forms.py

+
+        # validate the rows
+        seen = set()
+        row_count = 0


I noticed something fun. If you want, you actually don't need the `seen` set or the `row_count` integer! `if email in self.user_data` is the same as `if email in seen`, and `row_count` is the same as `len(self.user_data)` 🙂 But if you prefer the aesthetics of `seen` and `row_count`, no objections here!
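A tiny standalone sketch of this point, with made-up rows and a plain `user_data` dict standing in for `self.user_data` (assumed here to be a dict keyed by email):

```python
rows = [
    {"email": "a@example.com"},
    {"email": "b@example.com"},
    {"email": "a@example.com"},  # duplicate of the first row
]

user_data = {}  # stands in for self.user_data
duplicates = []
for row in rows:
    email = row["email"]
    if email in user_data:  # does the same job as `email in seen`
        duplicates.append(email)
        continue
    user_data[email] = row

row_count = len(user_data)  # rows accepted so far; no separate counter needed
```

Dict membership tests are hash lookups, just like set membership, so dropping the extra set costs nothing.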

rebeccacremona · 2024-12-02T17:43:27Z

perma_web/perma/forms.py

+                        expires_at=expires_at
+                    )
+                )
+


I regret I am having a little trouble reading this section. I think it is partly because of variable names.

created_user_affiliations is a list of affiliation objects (which makes sense to me), but the similarly named updated_user_affiliations is a list of user objects.

Similarly with preexisting_affiliations_set and all_user_affiliations: both are sets, and both are sets of user objects.

I am confused about whether updated_user_affiliations and all_user_affiliations are in fact needed. Is this true? updated_user_affiliations is only used to make all_user_affiliations, and all_user_affiliations is the same as set(self.updated_users.values())? If that's true, I think this would be easier to read if you used that directly.

rebeccacremona · 2024-12-02T17:49:39Z

perma_web/perma/forms.py

+
+            # create or update the affiliations of existing users
+            # affiliations that already exist
+            preexisting_affiliations = (UserOrganizationAffiliation.objects.filter(user__in=updated_user_affiliations,


This is just a thought about readability. Do you think it would be easier to read if, right here after you get the existing affiliation objects, you did the update? Then, you could do the calculations to figure out if any new affiliation objects need to be created, and then create them, in its own section with its own comment. I think that might be clearer than doing them together, but you may disagree 🙂

rebeccacremona · 2024-12-02T17:56:36Z

perma_web/perma/forms.py

+            preexisting_affiliations = (UserOrganizationAffiliation.objects.filter(user__in=updated_user_affiliations,
+                                                                                   organization=organization))
+
+            preexisting_affiliations_set = set(affiliation.user for affiliation in preexisting_affiliations)


Ah, here's a good Django + database thing to know about! Since you are iterating through the UserOrganizationAffiliation queryset and fetching each object's related user object, this will make one database query per object. Django includes a utility for situations like this, select_related: https://docs.djangoproject.com/en/5.1/ref/models/querysets/#select-related. If you call UserOrganizationAffiliation.objects.filter(...).select_related('user'), then Django will get all those user objects in one single query!
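To see why this matters, here is a rough standalone sketch that uses `sqlite3` rather than Django. The tables, the `run` helper, and the `queries` log are all hypothetical; the JOIN in the second function is roughly the kind of SQL that `select_related('user')` produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE affiliations (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
    INSERT INTO users (id, email) VALUES
        (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO affiliations (user_id) VALUES (1), (2), (3);
""")

queries = []  # log of every statement issued through run()


def run(sql, params=()):
    """Execute a query and record it, so the two approaches can be compared."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def emails_one_query_per_row():
    # Like iterating a queryset and touching affiliation.user each time:
    # one query for the affiliations, then one more per related user.
    emails = set()
    for (user_id,) in run("SELECT user_id FROM affiliations"):
        emails.add(run("SELECT email FROM users WHERE id = ?", (user_id,))[0][0])
    return emails


def emails_with_join():
    # A single JOIN fetches the affiliations and their users together.
    rows = run(
        "SELECT users.email FROM affiliations "
        "JOIN users ON users.id = affiliations.user_id"
    )
    return {email for (email,) in rows}
```

With three affiliations, the first function issues four queries and the second issues one; the gap grows linearly with the size of the CSV.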

rebeccacremona · 2024-12-02T18:01:52Z

perma_web/perma/views/user_management.py

+                        email_new_user(*args, obj, email_template, extra_context)
+                    else:
+                        send_user_email(obj.raw_email, email_template, extra_context)
+                except Exception as e:


Thanks for adding this error handling!

As mentioned, I am a little worried that, when we are actually calling out to the Mailgun API and sending emails, doing this inline will be too slow, so that users who upload large CSVs see a 502 despite the request succeeding (as happens with large link batches). I think we might need to make this async.

Possibly to be discussed!! Even if we do, that definitely doesn't have to be part of this PR!!
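As a rough illustration of what "async" could mean here, the sketch below hands each send to a background worker so the request can return right away. It is only an assumption about one possible shape: `send_email` is a hypothetical stand-in for the real mail call, and a production setup would more likely use a proper task queue than an in-process thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

# A single in-process background worker, for illustration only.
_executor = ThreadPoolExecutor(max_workers=1)


def send_email(address):
    """Hypothetical stand-in for the real (slow) call to the mail provider."""
    return f"sent to {address}"


def queue_emails(addresses):
    """Schedule one send per address and return immediately with the futures."""
    return [_executor.submit(send_email, address) for address in addresses]
```

The view can respond as soon as `queue_emails` returns. Failures then surface later through each future's `exception()`, so they would need to be logged rather than shown in the response.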

teovin · 2024-12-02

I will test in stage to see at what point we start seeing failures. Depending on that, I can add a limit to the rows.

teovin and others added 2 commits November 6, 2024 15:16

add first draft

4b45cd9

Merge branch 'develop' into bulk-org-user-creation

2741dd6

lint fix

125d965

teovin marked this pull request as ready for review November 7, 2024 17:06

teovin requested a review from a team as a code owner November 7, 2024 17:06

teovin requested review from bensteinberg and removed request for a team November 7, 2024 17:06

update language, make names optional

741405b

bensteinberg reviewed Nov 12, 2024

teovin added 2 commits November 14, 2024 13:49

show help text along with errors, validate the header rows

df20fef

add test

83f4498

teovin added 6 commits November 15, 2024 17:23

clean up the queries, update success message per clare

6c88171

add messaging scenario where all users in csv are invalid

0e60202

move messages into helper functions

0f09756

reorganize emailing pieces

d41aa7e

DRY the messaging helpers

420ae6a

DRY the test a bit

808e525

minor text update

cd4a571

teovin requested a review from bensteinberg November 19, 2024 14:27

teovin added 7 commits November 20, 2024 08:52

rename vars

0f4251c

put feature behind flag

2431e07

Merge remote-tracking branch 'origin' into bulk-org-user-creation

d9ef26e

rebase and readd the migration

6488bfe

pass user raw_email for when sending emails to updated users

18e1fa2

validate form for no data and duplicate emails cases as well

4e76c75

DRY the validation

eace4cb

Merge branch 'develop' into bulk-org-user-creation

41d7dcc

bensteinberg requested review from rebeccacremona and removed request for bensteinberg November 22, 2024 15:41

teovin added 3 commits November 22, 2024 10:57

tweak to language

afb6a4d

update test lang as well

eb97ff2

further dry the validation

2d59d57

rebeccacremona added the no-nudge label Nov 22, 2024

teovin added 3 commits November 25, 2024 17:18

review changes

3bed51f

lint

1cbee33

test fix

c5483e2

rebeccacremona requested changes Dec 2, 2024

teovin added 4 commits December 2, 2024 14:38

lowercase email to prevent creating new user with same email

2b4e104

use django EmailValidator class to validate emails

4ce15de

show duplicate user in error message

acd0457

rename var

92e06de

rebeccacremona reviewed Dec 2, 2024

prevent downcasing raw email

cf5882d

rebeccacremona approved these changes Dec 4, 2024

teovin merged commit 44802ea into harvard-lil:develop Dec 4, 2024
2 checks passed
