-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(ingest/gc): reduce logging, remove unnecessary sleeps #12238
Conversation
@@ -265,13 +267,17 @@ def keep_last_n_dpi( | |||
self.report.report_failure( | |||
f"Exception while deleting DPI: {e}", exc=e | |||
) | |||
if deleted_count_last_n % self.config.batch_size == 0: | |||
if ( | |||
deleted_count_last_n % self.config.batch_size == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
if deleted_count_last_n > 0: | ||
logger.info(f"Deleted {deleted_count_last_n} DPIs from {job.urn}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we are repeating code here, how about refactoring both ifs to
if deleted_count_last_n > 0: | |
logger.info(f"Deleted {deleted_count_last_n} DPIs from {job.urn}") | |
if deleted_count_last_n > 0: | |
logger.info(f"Deleted {deleted_count_last_n} DPIs from {job.urn}") | |
if deleted_count_last_n % self.config.batch_size == 0 and self.config.delay: | |
logger.info(f"Sleeping for {self.config.delay} seconds") | |
time.sleep(self.config.delay) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job, left some minor suggestions.
for flow in self.get_data_flows(): | ||
dataFlows[flow.urn] = flow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might consider using dictionary comprehension to reduce indentations:
dataFlows = {flow.urn: flow for flow in self.get_data_flows()}
Deleted 0 ....
Checklist