-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cache in getGoogleDriveObject, improved fixParents #10588
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍 left 2 coms
connectorId?: number, | ||
memoizationKey?: string | number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would put that in a cache object => clearer readability, if cache / if not cache, + obvious that memoization&connectorId are here for the sole purpose of caching
connectorId?: number, | |
memoizationKey?: string | number | |
cache?: { connectorId: number, | |
memoizationKey: string | number } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definitely 💯
@@ -33,7 +32,9 @@ async function getFileParents( | |||
while (currentObject.parent) { | |||
const parent = await getGoogleDriveObject( | |||
authCredentials, | |||
currentObject.parent | |||
currentObject.parent, | |||
connectorId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes i think with a cache object (and a global object argument) it would be more understandable
getGdriveObject({ auth, driveFileId: toto, cache: { blah}})
logger: Logger, | ||
execute: boolean = true | ||
) { | ||
export async function fixParents({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this function supposed to be removed after?
I assume yes; let's add a TODO(nodes-core) clean up after project
(if not, we need to explain in detail what this does in a comment and why we had to resort to such extremities)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe after we just remove the call in activities/garbage collector (but in that case would still need to explain what we do & why in a com)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep it, at least for cli. For GC if everything is ok we can at some point remove it too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched it to dry run when running from the garbage collect, this will just log if we have inconsistent parent ids (not checking from google to avoid load)
Still the cli may be necessary until we're sure everything is fixed in the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we keep it then we need a good comment here because the function doesn't speak for itself
await upsertDataSourceFolder({ | ||
dataSourceConfig, | ||
folderId: file.dustFileId, | ||
parents: parents.map((p) => getInternalId(p)), | ||
parentId: dustParentId, | ||
title: file.name ?? "", | ||
mimeType: MIME_TYPES.GOOGLE_DRIVE.FOLDER, | ||
sourceUrl: getSourceUrlForGoogleDriveFiles(file), | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to update the table under the file, if it's a spreadsheet file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks 👍 Pre-approving, left a nit, need to comment fix-parents too imo if we leave it in the cli
logger: Logger, | ||
execute: boolean = true | ||
) { | ||
export async function fixParents({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we keep it then we need a good comment here because the function doesn't speak for itself
60 * 10 * 1000 | ||
); | ||
|
||
export async function getGoogleDriveObject( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: here what I had suggested was to move to an object argument, so we see in the calls that we are providing cache keys
Description
Tests
Tested locally with small ds. Need to do some dryrun on small connectors
Risk
Messing up things in ds, would require a full sync
Deploy Plan
deploy connectors