
Avoid rate limiter resource count update from iterating over all instances #208

Merged: 7 commits into main on Jun 15, 2023

Conversation

kthui
Contributor

@kthui kthui commented May 24, 2023

This PR optimizes the rate limiter's logic for updating the max resource count after an instance is added or removed. The previous logic iterated over all instances on every add or remove. With the new logic, adding an instance only compares the new instance's resource limit against the current max resource, and updates the max if the new instance has a higher limit. Removing an instance compares the removed instance's resource limit against the current max: if the max is strictly greater, no update is performed; if the removed instance's limit is greater than or equal to the max, the max is discarded and recomputed, which iterates over the remaining instances once.
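The add/remove bookkeeping described above can be sketched as follows. This is a minimal illustration of the technique, not the actual Triton rate limiter code; the class and method names (`ResourceTracker`, `AddInstance`, `RemoveInstance`, `MaxLimit`) are hypothetical, and a single integer limit stands in for the real per-resource counts.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sketch: track the max resource limit across instances
// without rescanning all instances on every add/remove.
class ResourceTracker {
 public:
  // Adding is O(1): only compare the new instance against the current max.
  void AddInstance(int resource_limit)
  {
    limits_.push_back(resource_limit);
    max_limit_ = std::max(max_limit_, resource_limit);
  }

  // Removing is O(1) when the removed instance's limit is below the max.
  // Only if the removed instance may have held the max do we recompute
  // with a single pass over the remaining instances.
  void RemoveInstance(int resource_limit)
  {
    auto it = std::find(limits_.begin(), limits_.end(), resource_limit);
    if (it != limits_.end()) {
      limits_.erase(it);
    }
    if (resource_limit >= max_limit_) {
      max_limit_ = 0;
      for (int limit : limits_) {
        max_limit_ = std::max(max_limit_, limit);
      }
    }
  }

  int MaxLimit() const { return max_limit_; }

 private:
  std::vector<int> limits_;
  int max_limit_ = 0;
};
```

The key design point is that the common cases (adding any instance, removing a non-max instance) no longer touch the full instance list; the one-pass recompute only happens when the departing instance could have defined the max.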

Server PR: triton-inference-server/server#5885

@kthui kthui changed the title Avoid rate limiter resource count from iterating over all instances Avoid rate limiter resource count update from iterating over all instances May 24, 2023
@kthui kthui marked this pull request as ready for review May 24, 2023 23:01
src/rate_limiter.cc
```diff
-  resource_manager_->AddModelInstance(pair_it.first->second.get());
-  const auto& status = resource_manager_->UpdateResourceLimits();
+  const auto& status =
+      resource_manager_->AddModelInstance(pair_it.first->second.get());
   if (!status.IsOk()) {
     resource_manager_->RemoveModelInstance(pair_it.first->second.get());
```
Contributor

@rmccorm4 rmccorm4 May 30, 2023


@kthui it looks like we throw away the status returned by RemoveModelInstance, which in turn calls several other methods that can return detailed status errors.

Can we propagate the error statuses back up via LOG_ERROR in the places where we call this method? It looks like we call it in a few places.
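The pattern the reviewer asks for, logging a status that would otherwise be silently discarded, could look like the sketch below. This is a hedged illustration only: the `Status` struct, `LOG_ERROR` macro, and `RemoveModelInstance` free function are stand-ins, not the actual Triton core types or logging API.

```cpp
#include <iostream>
#include <string>

// Stand-in status type (the real Triton Status type differs).
struct Status {
  bool ok;
  std::string msg;
  bool IsOk() const { return ok; }
  const std::string& Message() const { return msg; }
};

// Stand-in logging macro; the real codebase has its own logging facility.
#define LOG_ERROR(MSG) (std::cerr << "error: " << (MSG) << std::endl)

// Stand-in for the real RemoveModelInstance method, which can fail for
// several reasons and returns a detailed status.
Status RemoveModelInstance()
{
  return Status{false, "instance not found"};
}

void RollbackInstance()
{
  // Instead of throwing the status away, surface it in the error log so
  // the detailed failure reason is not lost.
  const Status status = RemoveModelInstance();
  if (!status.IsOk()) {
    LOG_ERROR(status.Message());
  }
}
```

Since the rollback path cannot itself propagate an error to the caller, logging at each call site is a reasonable way to keep the detailed status visible.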

Contributor Author

@kthui kthui May 31, 2023


@kthui kthui requested a review from rmccorm4 May 31, 2023 01:36
@kthui kthui force-pushed the jacky-rate-limiter-resource branch 2 times, most recently from 64f80d6 to 6460553 on June 1, 2023 02:30
src/rate_limiter.cc
@kthui kthui force-pushed the jacky-rate-limiter-resource branch 2 times, most recently from 88f7622 to 8fa3e9a on June 2, 2023 18:20
@kthui kthui requested a review from tanmayv25 June 2, 2023 18:27
@kthui kthui force-pushed the jacky-rate-limiter-resource branch 2 times, most recently from afb3c1b to b12ea16 on June 10, 2023 00:14
@kthui kthui force-pushed the jacky-rate-limiter-resource branch from b12ea16 to e664d04 on June 10, 2023 00:24
@kthui kthui merged commit be341bd into main Jun 15, 2023
@kthui kthui deleted the jacky-rate-limiter-resource branch February 13, 2024 17:45
3 participants