
[ML] Update the number of allocations per nlp process #86277

Merged: 7 commits into elastic:master from the update-num-allocations branch on May 5, 2022

Conversation

davidkyle (Member) commented:

This is the Java side of elastic/ml-cpp#2258; while the two PRs are out of sync, the naming changes cause internal breakages.

Adds a method to DeploymentManager to update the number of allocations per process, as implemented in elastic/ml-cpp#2258.

Also, PyTorchResults now has an error type rather than treating the error as a special case of the inference result, and the test mutes from #86263 are reverted.
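
To make the shape of the change concrete, here is a minimal, self-contained Java sketch of an "update the number of allocations" entry point on a deployment manager. Everything in it (the class name NlpDeploymentManagerSketch, the ControlMessage and ThreadSettings records, and the ProcessChannel interface) is an illustrative assumption for this sketch, not the actual DeploymentManager API.

import java.util.function.Consumer;

// Minimal sketch only: every type here is an illustrative stand-in,
// not the real Elasticsearch DeploymentManager API.
public class NlpDeploymentManagerSketch {

    /** Hypothetical settings reported back by the native process after an update. */
    public record ThreadSettings(int numAllocations, int threadsPerAllocation) {}

    /** Hypothetical control message written to the pytorch process. */
    public record ControlMessage(String type, int numAllocations) {}

    /** Stand-in for the channel that delivers control messages to the native process. */
    public interface ProcessChannel {
        void writeControlMessage(ControlMessage message,
                                 Consumer<ThreadSettings> onResponse,
                                 Consumer<Exception> onFailure);
    }

    private final ProcessChannel channel;

    public NlpDeploymentManagerSketch(ProcessChannel channel) {
        this.channel = channel;
    }

    /**
     * Ask the native process to change how many allocations it runs, and report
     * the resulting thread settings (or a failure) back to the caller.
     */
    public void updateNumAllocations(int numAllocations,
                                     Consumer<ThreadSettings> onResponse,
                                     Consumer<Exception> onFailure) {
        if (numAllocations <= 0) {
            onFailure.accept(new IllegalArgumentException("numAllocations must be positive"));
            return;
        }
        channel.writeControlMessage(
            new ControlMessage("update_num_allocations", numAllocations),
            onResponse,
            onFailure
        );
    }
}

The real change plumbs the response through an ActionListener<ThreadSettings>, as the diff excerpt below shows; the callback pair here is just a dependency-free stand-in for that listener.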

davidkyle added the :ml Machine learning and v8.3.0 labels on Apr 28, 2022
elasticmachine added the Team:ML (Meta label for the ML team) label on Apr 28, 2022
elasticmachine (Collaborator) commented:

Pinging @elastic/ml-core (Team:ML)

dimitris-athanasiou (Contributor) left a comment:

Looks good. Just a minor suggestion for a potential method extraction.

The inline comments below were left on this excerpt from the diff:

    TimeValue timeout,
    ActionListener<ThreadSettings> listener
) {
    var processContext = getProcessContext(task, listener::onFailure);
dimitris-athanasiou (Contributor) commented on this excerpt:

There seems to be another method we could extract here, something like executePyTorchAction, which takes an AbstractPyTorchAction. We could then reuse it when firing either the inference action or the control message action; it would handle getting the process context and the try-catch around running the action.

I might be missing something that makes it impossible to do this. In any case, just a thought.
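
For what it's worth, a rough sketch of the extraction being suggested. ProcessContext and the simplified AbstractPyTorchAction below are stand-ins so the example compiles on its own; only the shape of the helper is the point.

// Sketch of the suggested extraction: inference actions and control-message actions
// share one helper that resolves the process context and guards execution.
// ProcessContext and AbstractPyTorchAction are simplified stand-ins here.
public class PyTorchActionExecutorSketch {

    /** Stand-in for the per-deployment process context. */
    public static class ProcessContext {}

    /** Simplified stand-in for the real AbstractPyTorchAction. */
    public abstract static class AbstractPyTorchAction {
        abstract void run(ProcessContext context) throws Exception;
        abstract void onFailure(Exception e);
    }

    private final ProcessContext processContext;

    public PyTorchActionExecutorSketch(ProcessContext processContext) {
        this.processContext = processContext;
    }

    /** Shared path: look up the context, then run the action inside a try-catch. */
    public void executePyTorchAction(AbstractPyTorchAction action) {
        ProcessContext context = processContext; // the real code would resolve this per task
        if (context == null) {
            action.onFailure(new IllegalStateException("process context is not available"));
            return;
        }
        try {
            action.run(context);
        } catch (Exception e) {
            action.onFailure(e);
        }
    }
}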

davidkyle (Member, Author) replied:

++ yeah there was an opportunity to refactor here

davidkyle force-pushed the update-num-allocations branch from dee9ed1 to f950f4c on May 5, 2022 at 09:17
dimitris-athanasiou (Contributor) left a comment:

LGTM

davidkyle merged commit 6318be5 into elastic:master on May 5, 2022
davidkyle deleted the update-num-allocations branch on May 5, 2022 at 11:03
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jun 8, 2022
During elastic#86277 an error was introduced in parsing of pytorch
thread settings. This commit fixes the issue.
dimitris-athanasiou added a commit that referenced this pull request Jun 9, 2022
During #86277 an error was introduced in parsing of pytorch
thread settings. This commit fixes the issue.
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jun 9, 2022
During elastic#86277 an error was introduced in parsing of pytorch
thread settings. This commit fixes the issue.
elasticsearchmachine pushed a commit that referenced this pull request Jun 9, 2022
During #86277 an error was introduced in parsing of pytorch
thread settings. This commit fixes the issue.
Labels: :ml Machine learning, >non-issue, Team:ML (Meta label for the ML team), v8.3.0
Participants: 3