[API] Updates ml.start_trained_model_deployment, adds body
picandocodigo committed Mar 3, 2025
1 parent d361400 commit 60c5f83
Showing 1 changed file with 29 additions and 13 deletions.
@@ -15,33 +15,49 @@
# specific language governing permissions and limitations
# under the License.
#
- # Auto generated from build hash f284cc16f4d4b4289bc679aa1529bb504190fe80
- # @see https://github.com/elastic/elasticsearch/tree/main/rest-api-spec
+ # Auto generated from commit f284cc16f4d4b4289bc679aa1529bb504190fe80
+ # @see https://github.com/elastic/elasticsearch-specification
#
module Elasticsearch
module API
module MachineLearning
module Actions
# Start a trained model deployment.
# It allocates the model to every machine learning node.
#
- # @option arguments [String] :model_id The unique identifier of the trained model. (*Required*)
- # @option arguments [String] :cache_size A byte-size value for configuring the inference cache size. For example, 20mb.
- # @option arguments [String] :deployment_id The Id of the new deployment. Defaults to the model_id if not set.
- # @option arguments [Integer] :number_of_allocations The total number of allocations this model is assigned across machine learning nodes.
- # @option arguments [Integer] :threads_per_allocation The number of threads used by each model allocation during inference.
+ # @option arguments [String] :model_id The unique identifier of the trained model. Currently, only PyTorch models are supported. (*Required*)
+ # @option arguments [Integer, String] :cache_size The inference cache size (in memory outside the JVM heap) per node for the model.
+ #  The default value is the same size as the +model_size_bytes+. To disable the cache,
+ #  +0b+ can be provided.
+ # @option arguments [String] :deployment_id A unique identifier for the deployment of the model.
+ # @option arguments [Integer] :number_of_allocations The number of model allocations on each node where the model is deployed.
+ #  All allocations on a node share the same copy of the model in memory but use
+ #  a separate set of threads to evaluate the model.
+ #  Increasing this value generally increases the throughput.
+ #  If this setting is greater than the number of hardware threads
+ #  it will automatically be changed to a value less than the number of hardware threads.
+ #  If adaptive_allocations is enabled, do not set this value, because it’s automatically set. Server default: 1.
# @option arguments [String] :priority The deployment priority.
- # @option arguments [Integer] :queue_capacity Controls how many inference requests are allowed in the queue at a time.
- # @option arguments [Time] :timeout Controls the amount of time to wait for the model to deploy.
- # @option arguments [String] :wait_for The allocation status for which to wait (options: starting, started, fully_allocated)
+ # @option arguments [Integer] :queue_capacity Specifies the number of inference requests that are allowed in the queue. After the number of requests exceeds
+ #  this value, new requests are rejected with a 429 error. Server default: 1024.
+ # @option arguments [Integer] :threads_per_allocation Sets the number of threads used by each model allocation during inference. This generally increases
+ #  the inference speed. The inference process is a compute-bound process; any number
+ #  greater than the number of available hardware threads on the machine does not increase the
+ #  inference speed. If this setting is greater than the number of hardware threads
+ #  it will automatically be changed to a value less than the number of hardware threads. Server default: 1.
+ # @option arguments [Time] :timeout Specifies the amount of time to wait for the model to deploy. Server default: 20s.
+ # @option arguments [String] :wait_for Specifies the allocation status to wait for before returning. Server default: started.
# @option arguments [Hash] :headers Custom HTTP headers
+ # @option arguments [Hash] :body request body
#
- # @see https://www.elastic.co/guide/en/elasticsearch/reference/current/start-trained-model-deployment.html
+ # @see https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-start-trained-model-deployment
#
def start_trained_model_deployment(arguments = {})
request_opts = { endpoint: arguments[:endpoint] || 'ml.start_trained_model_deployment' }

- defined_params = [:model_id].each_with_object({}) do |variable, set_variables|
+ defined_params = [:model_id].inject({}) do |set_variables, variable|
set_variables[variable] = arguments[variable] if arguments.key?(variable)
+ set_variables
end
request_opts[:defined_params] = defined_params unless defined_params.empty?

@@ -50,7 +66,7 @@ def start_trained_model_deployment(arguments = {})
arguments = arguments.clone
headers = arguments.delete(:headers) || {}

- body = nil
+ body = arguments.delete(:body)

_model_id = arguments.delete(:model_id)


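A minimal usage sketch for the updated method, exercising the new body parameter. The connection details and model ID are hypothetical, and adaptive_allocations is one example of a body the endpoint accepts (see the operation docs linked in the diff above); it pairs with the number_of_allocations note in the new parameter docs:

require 'elasticsearch'

client = Elasticsearch::Client.new(host: 'http://localhost:9200') # hypothetical connection

client.ml.start_trained_model_deployment(
  model_id: 'my-pytorch-model', # required; must be an already-imported PyTorch model
  wait_for: 'started',          # return once allocations have started (server default)
  timeout: '30s',
  body: {
    # When adaptive_allocations is enabled, number_of_allocations should not
    # be set (per the parameter docs above).
    adaptive_allocations: {
      enabled: true,
      min_number_of_allocations: 1,
      max_number_of_allocations: 4
    }
  }
)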