-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Properly check the existence of missing mapped TIs #25788
Conversation
db598ad
to
51af409
Compare
904bcb9
to
bb06806
Compare
c536b31
to
8516d24
Compare
8516d24
to
07249ad
Compare
total_length = ( | ||
task.parse_time_mapped_ti_count | ||
or task.run_time_mapped_ti_count(self.run_id, session=session) | ||
or 0 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This induces an unnecessary query of parse_time_mapped_ti_count
is 0. Reading how run_time_mapped_ti_count
is used, some refactoring might be a good idea. I’ll do it after this PR is done.
219ccfb
to
c2dc5e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few very small corrections; this should be good otherwise.
cc38cb2
to
da6f608
Compare
The previous implementation of missing indexes was not right. Missing indexes were being checked everytime that `task_instance_scheduling_decision` was called. The missing tasks should only be revised after expanding of last resort for mapped tasks have been done. If we find that a task is in schedulable state and has already been expanded, we revise their indexes and ensure they are complete. Missing indexes are marked as removed. This implementation allows the revision to be done in one place
Co-authored-by: Tzu-ping Chung <[email protected]>
Co-authored-by: Tzu-ping Chung <[email protected]>
da6f608
to
23a546e
Compare
Also a chapter was added to recommend taking a backup before the migration. Based on discussions and user input from apache#25866, apache#24526 Closes: apache#24526 Improve cleanup of temporary files in CI (apache#25957) After recent change in Paralell execution, we start to have infrequent "no space left on device" message - likely caused by the /tmp/ generated files clogging the filesystem from multiple runs. We could fix it by simply running cleanup after parallel job always, but this is not good due to diagnostics needed when debugging parallel runs locally so we need to have a way to skip /tmp files deletion. This PR fixes the problem twofold: * cleanup breeze instructions which is run at the beginning of every job cleans also /tmp file * the parallel jobs cleans after themselvs unless skipped. Properly check the existence of missing mapped TIs (apache#25788) The previous implementation of missing indexes was not correct. Missing indexes were being checked every time that `task_instance_scheduling_decision` was called. The missing tasks should only be revised after expanding of last resort for mapped tasks have been done. If we find that a task is in schedulable state and has already been expanded, we revise its indexes and ensure they are complete. Missing indexes are marked as removed. This implementation allows the revision to be done in one place Co-authored-by: Tzu-ping Chung <[email protected]> Fix dataset_event_manager resolution (apache#25943) Appears `__init__` is not invoked as part of `_run_raw_task` due to the way TI is refreshed from db. Centralize dataset manager instantiation instead. Fix unhashable issue with secrets.backend_kwargs and caching (apache#25970) Resolves apache#25968 Fix response schema for list-mapped-task-instance (apache#25965) update areActiveRuns, fix states (apache#25962)
The previous implementation of missing indexes was not right. Missing indexes
were being checked everytime that
task_instance_scheduling_decision
was called.The missing tasks should only be revised after expanding of last resort for mapped tasks have been done. If we find that a task is in schedulable state and has already been expanded, we revise their indexes and ensure they are complete. Missing indexes are marked as removed.
This implementation allows the revision to be done in one place
Closes: #25681