-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aggregate Wrapped user data #874
base: master
Are you sure you want to change the base?
Conversation
…sion to be included in wrapped
[diff-counting] Significant lines: 524. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a couple comments
officeHourCounts[answererId] = new Map<string, number>(); | ||
taCounts[answererId] = new Map<string, number>(); | ||
// Checking if ta already showed up as student and now as an answerer | ||
} else if (userStats[answererId] && userStats[answererId]?.timeHelpingStudents === undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this else if
redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of the case where while iterating through questions, a user who is actually a ta comes up first as the asker, so they get an instance created without the timeHelpingStudents (as undefined). Then if they come up later as an answerer, indicating that they are a ta, the if statement only checks and gives all the fields if the ta doesn't already have an instance, so I think they wouldn't actually get the timeHelpingStudents field without this extra check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh i see
src/scripts/wrapped-fa24.ts
Outdated
stats.favTaId = Array.from(taCounts[userId].entries()).reduce((a, b) => a[1] < b[1] ? b : a)[0]; | ||
} | ||
stats.numVisits = officeHourCounts[userId].size; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea for this I think it's ok if not every session has TAs assigned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to answer your question, let's use most common answererId
for a student as a proxy for their favorite TA since there's many courses that don't assign TAs to office hours in QMI and work backwards from there to determine favorite class and day of the week since class should correspond with favorite TA. We can get class pretty easily from sessionId, but maybe we run a separate script to label each session with a day of the week – we'll leave day of the week for later.
src/scripts/wrapped-fa24.ts
Outdated
@@ -13,6 +13,9 @@ console.log('Firebase admin initialized!'); | |||
const db = admin.firestore(); | |||
|
|||
// Firestore Timestamps for the query range | |||
// EDIT: possibly make these dates as new constants in constants.ts, | |||
// would make it easier to edit for other years | |||
|
|||
const startDate = admin.firestore.Timestamp.fromDate(new Date('2023-08-20')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remember to change these dates to sp24 and fa24
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, now that the functionality is more or less there I think it'll be helpful to start thinking about code style and readability – which should also make verification a bit easier as we get into the last few weeks of development.
- Separating concerns into helper functions, for example async functions handling processSessions(answererId, sessionId) and updateWrappedDocuments().
await
can sometimes be inefficient when repeatedly called, and we should start thinking a bit more about performance for this script. It might be better to do something like the following:const sessionPromises = questionsSnapshot.docs.map(doc => sessionsRef.doc(doc.sessionId).get());
const sessionDocs = await Promise.all(sessionPromises);
Of course, this is made difficult when we're (more intuitively) iterating through questions, which each have their own sessionId, so one possibility would be to do some pre-processing to group questions with the samesessionId
together beforehand so we can do this type of batch processing.- Using a
for...of
loop is generally better for async operations. So for the loop at the end where we useuserStats
to update theWrapped
collection, you might do something likefor (const [userId, stats] of Object.entries(userStats)) {
. - For all error scenarios, especially since we'll be rolling this out to production soon, let's collect all
userIds
in a separate list and log those out at the end so we can investigate any issues more easily.
Try to refactor the code and make these changes incrementally, testing as you go along.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple small things to improve code maintainability
src/scripts/wrapped-fa24.ts
Outdated
If a user is only a student, they need to have at least one OH visit. | ||
If a user is a TA, they need to have at least one TA session AND at least one OH visit as a student. | ||
*/ | ||
if ((stats.numVisits > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a confusing conditional. Let's make it something like:
const hasVisits = stats.numVisits > 0;
const isTaActive = stats.timeHelpingStudents !== undefined || (TAsessions[userId]?.length > 0);
const hasFavoriteTa = stats.favTaId !== "";
if (hasVisits && isTaActive && hasFavoriteTa) {
src/scripts/wrapped-fa24.ts
Outdated
} | ||
} | ||
|
||
if (!officeHourSessions[askerId]?.includes(sessionId)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might lead to a weird edge case error if officeHourSessions[askerId]
is undefined since we're using optional chaining (?.
). Let's make this:
officeHourSessions[askerId] = officeHourSessions[askerId] || []; if (!officeHourSessions[askerId].includes(sessionId)) { officeHourSessions[askerId].push(sessionId); }
stats.favTaId = Array.from(taCounts[userId].entries()).reduce((a, b) => a[1] < b[1] ? b : a)[0]; | ||
} | ||
|
||
if (stats.favTaId && stats.favTaId !== "") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this block of code getting the user's favorite class and title seems correct and precise to me, but is it a bit inefficient? For instance, if we've already identified the favorite TA, there should only be one class/title and we don't need to do all this ordering. Idk, there might just be some edge cases I haven't thought of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mainly included this extra if statement because I wasn't sure if it was possible for a user to encounter the same TA in two classes. But if we can confirm that a student always has one class in common with the favorite TA and is a guaranteed precondition, I definitely agree that this isn't needed.
Summary
Test Plan
Notes
Breaking Changes
None