Add support for serial ngen jobs #278

Closed


robertbartel
Contributor

  • Update NGENRequest to support request of either serial or parallel jobs
  • Add support for performing serial NextGen execution to ngen Docker image entrypoint
  • Update Launcher scheduler class to support running serial NextGen jobs
  • Fix bug in Launcher related to handling of NgenCalibrationRequest
  • Update both dataservice and partitionerservice to handle serial NextGen jobs properly when encountered, since these jobs never need to await partitioning

The branch for this PR is based on the one for #277; as such, that PR blocks the review of this one.

Fixing logic where pip installs the updated packages (to avoid an
entire image rebuild) so that dependencies are ignored, since
installing them was the slow part.
Adding type GET_SERIALIZED_FORM to get the entire serialized state of
a dataset.
Refactor to ensure it takes advantage of connection handling via its
async context manager logic.
Add new function to get serialized dataset details to
DatasetExternalClient.
Fix failure to acquire a session, and wrap the related calls in a try
block to catch and log exceptions.
Fixing bug in response to LIST_FILES query, where the reason text was
not being assembled correctly.
Adding optional offset and length params to abstract interface
definition of get_data, and adding get_file_stat abstract method.
Adding support for querying about dataset items.
Updating to support indicating start and size of data for partial
transfers.
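As a rough illustration of the partial-transfer interface described in the commits above, here is a minimal sketch. Only the names get_data and get_file_stat and the optional offset and length params come from this PR; the class names, the other parameters, and the shape of the stat result are hypothetical stand-ins, not the actual DMOD signatures.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class SketchDatasetManager(ABC):
    """Hypothetical stand-in for the abstract interface; not the DMOD class."""

    @abstractmethod
    def get_data(self, dataset_name: str, item_name: str,
                 offset: Optional[int] = None,
                 length: Optional[int] = None) -> bytes:
        """Return all of an item, or only `length` bytes starting at `offset`."""

    @abstractmethod
    def get_file_stat(self, dataset_name: str, item_name: str) -> Dict[str, int]:
        """Return basic details about an item (assumed here: just its size)."""


class InMemoryManager(SketchDatasetManager):
    """Toy implementation backed by a dict, for illustration only."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[str, str], bytes] = {}

    def add(self, dataset_name: str, item_name: str, data: bytes) -> None:
        self._items[(dataset_name, item_name)] = data

    def get_data(self, dataset_name, item_name, offset=None, length=None):
        data = self._items[(dataset_name, item_name)]
        start = 0 if offset is None else offset
        # With no length given, read through to the end of the item
        end = None if length is None else start + length
        return data[start:end]

    def get_file_stat(self, dataset_name, item_name):
        return {"size": len(self._items[(dataset_name, item_name)])}
```

A client could call get_file_stat first to learn the item size, then request the item in fixed-size chunks by stepping offset.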
Optimizing the reloading of datasets on object store manager startup.
Accounting for changes to get_data parameters and implementing
get_file_stat.
Updating service to respond to GET_DATASET_ITEMS queries and requests
for data with an offset start (i.e. partials).
Adding initial second view to display and manage existing datasets,
along with navigation functionality to toggle between the "manage" and
"create" views.
Sending serialized datasets to HTML template as list/array rather than
dict/map.
Implementing layout and initial details-viewing behavior for dataset
management GUI view.
Updating dmod.client to depend on dmod.communication>=0.11.0.
Updating dmod.requestservice to depend on dmod.communication>=0.11.0.
Update to new dmod.communication (0.11.0) and dmod.scheduler (0.10.0).
Updating to depend on new dmod.communication 0.11.0.
Updating to depend on dmod.scheduler 0.10.0.
Updating Dockerfiles for data-service and py-sources to explicitly use
"latest" tag in FROM statements building off another DMOD internal
Docker image.
- Add properties for whether serial or parallel exec was requested
- Have data requirements and lazy init of partition config itself
  consider whether partition config is necessary, which depends on if
  parallel execution was requested
- Updating Launcher class to account for serial ngen execution by not
  passing any partitioning config dataset when there is only 1 cpu
- Fixing bug in Launcher where NgenCalibrationRequest jobs were not
  processed correctly
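The request-side changes in the list above can be sketched roughly as follows. This is an assumption-laden illustration: the class name, the property names, and the string labels for data requirements are hypothetical, and the only behavior taken from this PR is that a cpu count of 1 means serial execution and that a partition config is only needed for parallel execution.

```python
from typing import List, Optional


class SketchNgenRequest:
    """Hypothetical stand-in for NGENRequest; property names are assumptions."""

    def __init__(self, cpu_count: int,
                 partition_config_data_id: Optional[str] = None) -> None:
        if cpu_count < 1:
            raise ValueError("cpu_count must be at least 1")
        self.cpu_count = cpu_count
        self.partition_config_data_id = partition_config_data_id

    @property
    def use_serial_ngen(self) -> bool:
        # Serial execution is indicated by a cpu count of 1
        return self.cpu_count == 1

    @property
    def use_parallel_ngen(self) -> bool:
        return not self.use_serial_ngen

    @property
    def data_requirements(self) -> List[str]:
        # Labels are illustrative; real requirements are richer objects
        requirements = ["forcing", "hydrofabric", "realization_config"]
        if self.use_parallel_ngen:
            # Only parallel jobs need a partition config dataset
            requirements.append("partition_config")
        return requirements
```

Deriving both properties from the cpu count keeps them from disagreeing, which is one reason to avoid storing a separate serial/parallel flag.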
Updating manager logic so that single-cpu (i.e., serial exec) jobs
found in the AWAITING_PARTITIONING step are moved to
AWAITING_ALLOCATION without a partition config being generated.
Updating to new dmod.communication (0.12.0) and dmod.scheduler (0.11.0).
Updating service to move to AWAITING_ALLOCATION step instead of
AWAITING_PARTITIONING step after data checks are successful, if serial
execution is indicated by a job cpu count of 1.
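The step handling described in the two service commits above might look something like this sketch. The step names AWAITING_PARTITIONING and AWAITING_ALLOCATION come from this PR; the enum, the function names, and the assumption that a partitioned job also proceeds to AWAITING_ALLOCATION are hypothetical.

```python
from enum import Enum
from typing import Tuple


class SketchJobStep(Enum):
    """Hypothetical subset of job steps; not the real DMOD enum."""
    AWAITING_PARTITIONING = "awaiting_partitioning"
    AWAITING_ALLOCATION = "awaiting_allocation"


def is_serial(cpu_count: int) -> bool:
    # Serial execution is indicated by a job cpu count of 1
    return cpu_count == 1


def step_after_data_check(cpu_count: int) -> SketchJobStep:
    """Data service side: pick the next step once data checks succeed."""
    if is_serial(cpu_count):
        return SketchJobStep.AWAITING_ALLOCATION
    return SketchJobStep.AWAITING_PARTITIONING


def handle_awaiting_partitioning(cpu_count: int) -> Tuple[SketchJobStep, bool]:
    """Partitioner side: return (next step, whether to generate a partition config).

    A serial job that still ends up in AWAITING_PARTITIONING is passed along
    without a config being generated. Moving parallel jobs to
    AWAITING_ALLOCATION afterwards is an assumption of this sketch.
    """
    return SketchJobStep.AWAITING_ALLOCATION, not is_serial(cpu_count)
```

Handling the serial case in both services is defensive: the data service should normally skip partitioning, but the partitioner still copes if such a job reaches it.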
Updating to new dmod.communication (0.12.0) and dmod.scheduler (0.11.0).
Updating to new dmod.communication (0.12.0).
@robertbartel
Contributor Author

Closing; replaced by #301, which isolates the relevant changes.

Labels
maas MaaS Workstream