diff --git a/docs/source/features/packaging_output_models.md b/docs/source/features/packaging_output_models.md
index 396ec6d5e..da257ad5e 100644
--- a/docs/source/features/packaging_output_models.md
+++ b/docs/source/features/packaging_output_models.md
@@ -1,7 +1,7 @@
 # Packaging Olive artifacts
 
 ## What is Olive Packaging
-Olive will output multiple candidate models based on metrics priorities. It also can package output artifacts when the user requires. Olive packaging can be used in different scenarios. There are 3 packaging types: `Zipfile`, `AzureMLModels` and `AzureMLData`.
+Olive will output multiple candidate models based on metric priorities. It can also package output artifacts when the user requires it. Olive packaging can be used in different scenarios. There are 4 packaging types: `Zipfile`, `AzureMLModels`, `AzureMLData` and `AzureMLDeployment`.
 
 ### Zipfile
 Zipfile packaging will generate a ZIP file which includes 3 folders: `CandidateModels`, `SampleCode` and `ONNXRuntimePackages`, and a `models_rank.json` file in the `output_dir` folder (from Engine Configuration):
@@ -107,6 +107,9 @@ and for CPU, the best execution provider is CPUExecutionProvider, so the first r
 
 Olive will also upload model configuration file, inference config file, metrics file and model info file to the Azure ML.
 
+### AzureMLDeployment
+AzureMLDeployment packaging will [package](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-package-models?view=azureml-api-2&tabs=sdk) the top-ranked model across all output models to the Azure ML workspace, create an endpoint with the configured name if one doesn't exist, and then deploy the output model to that endpoint.
+
 ## How to package Olive artifacts
 Olive packaging configuration is configured in `PackagingConfig` in Engine configuration. `PackagingConfig` can be a single packaging configuration. Alternatively, if you want to apply multiple packaging types, you can also define a list of packaging configurations.
 
@@ -138,12 +141,52 @@ If not specified, Olive will not package artifacts.
         The version for this data asset. This is `1` by default.
       * `description [str]`
         The description for this data asset. This is `None` by default.
+    * `AzureMLDeployment`
+      * `model_name [str]`:
+        The model name used when registering your output model to your Azure ML workspace. `olive-deployment-model` by default.
+      * `model_version [int | str]`:
+        The model version used when registering your output model to your Azure ML workspace. Please note that if a model with the same name and version already exists in your workspace, it will be overwritten by this registration. `1` by default.
+      * `model_description [str]`:
+        The description for this model registration. This is `None` by default.
+      * `model_package [ModelPackageConfig]`:
+        The configurations for model packaging.
+        * `target_environment [str]`:
+          The name for the environment created by Olive. `olive-target-environment` by default.
+        * `target_environment_version [str]`:
+          The version for the environment created by Olive. Please note that if an environment with the same name already exists in your workspace, the new environment version will be the existing version plus 1 and this `target_environment_version` will be ignored. `None` by default.
+        * `inferencing_server [InferenceServerConfig]`:
+          * `type [str]`:
+            The targeted inferencing server type, either `AzureMLOnline` or `AzureMLBatch`.
+          * `code_folder [str]`:
+            The folder path to your scoring script.
+          * `scoring_script [str]`:
+            The scoring script name (see the example script after this list).
+        * `base_environment_id [str]`:
+          The base environment id that will be used for Azure ML packaging, in the format `azureml:<environment-name>:<version>`.
+        * `model_configurations [ModelConfigurationConfig]`:
+          The model configuration for packaging: `mode` can be `download` (default) or `copy`, and `mount_path` is an optional relative mount path. `None` by default.
+        * `environment_variables [dict]`:
+          Environment variables that are required for the package to run, but not necessarily known at environment creation time. `None` by default.
+      * `deployment_config [DeploymentConfig]`:
+        The deployment configuration.
+        * `endpoint_name [str]`:
+          The endpoint name for the deployment. If the endpoint doesn't exist, Olive will create one with this name. `olive-default-endpoint` by default.
+        * `deployment_name [str]`:
+          The name of the deployment. `olive-default-deployment` by default.
+        * `instance_type [str]`:
+          The Azure compute SKU. `ManagedOnlineDeployment` only. `None` by default.
+        * `compute [str]`:
+          The compute target for the batch inference operation. `BatchDeployment` only. `None` by default.
+        * `instance_count [int]`:
+          The number of instances the inferencing will run on. `1` by default.
+        * `mini_batch_size [int]`:
+          The size of the mini-batch passed to each batch invocation. `BatchDeployment` only. `10` by default.
+        * `extra_config [dict]`:
+          Extra configurations for the deployment. `None` by default.
   * `include_sample_code [bool]`:
     Whether or not to include sample code in zip file. Defaults to True
   * `include_runtime_packages [bool]`:
     Whether or not to include runtime packages (like onnxruntime) in zip file. Defaults to True
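+
+The `code_folder` and `scoring_script` settings point at a standard Azure ML inference entry script with `init()` and `run()` functions. Below is a minimal sketch of such a script; the model file name, input key and dtype are illustrative and depend on your packaged model, and it assumes `onnxruntime` and `numpy` are available in the base environment (`AZUREML_MODEL_DIR` is the environment variable Azure ML sets to the model mount path):
+
+```
+# score.py - minimal sketch of an Azure ML scoring entry script
+import json
+import os
+
+import numpy as np
+import onnxruntime as ort
+
+session = None
+
+
+def init():
+    # Called once when the deployment starts; load the registered model.
+    # "model.onnx" is illustrative and depends on your packaged model.
+    global session
+    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.onnx")
+    session = ort.InferenceSession(model_path)
+
+
+def run(raw_data):
+    # Called per request; input/output handling is model specific.
+    data = np.asarray(json.loads(raw_data)["data"], dtype=np.float32)
+    outputs = session.run(None, {session.get_inputs()[0].name: data})
+    return outputs[0].tolist()
+```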
 
-You can add `PackagingConfig` to Engine configurations. e.g.:
+You can add different types of `PackagingConfig` as a list to Engine configurations, e.g.:
 
 ```
 "engine": {
@@ -170,9 +213,21 @@ You can add `PackagingConfig` to Engine configurations. e.g.:
         {
             "type": "AzureMLData",
             "name": "OutputModels"
+        },
+        {
+            "type": "AzureMLDeployment",
+            "config": {
+                "model_package": {
+                    "inferencing_server": {
+                        "type": "AzureMLOnline",
+                        "code_folder": "code",
+                        "scoring_script": "score.py"
+                    },
+                    "base_environment_id": "azureml:olive-aml-packaging:1"
+                }
+            }
         }
     ],
-    "clean_cache": true,
     "cache_dir": "cache"
 }
 ```
diff --git a/olive/engine/footprint.py b/olive/engine/footprint.py
index ee8497482..1a162a250 100644
--- a/olive/engine/footprint.py
+++ b/olive/engine/footprint.py
@@ -101,7 +101,7 @@ def create_pareto_frontier(self, output_model_num: int = None) -> Optional["Foot
             logger.info("Output all %d models", len(self.nodes))
             return self._create_pareto_frontier_from_nodes(self.nodes)
         else:
-            topk_nodes = self._get_top_ranked_nodes(output_model_num)
+            topk_nodes = self.get_top_ranked_nodes(output_model_num)
             logger.info("Output top ranked %d models based on metric priorities", len(topk_nodes))
             return self._create_pareto_frontier_from_nodes(topk_nodes)
 
@@ -298,7 +298,7 @@ def _mark_pareto_frontier(self):
                 self.nodes[k].is_pareto_frontier = cmp_flag
         self.is_marked_pareto_frontier = True
 
-    def _get_top_ranked_nodes(self, k: int) -> List[FootprintNode]:
+    def get_top_ranked_nodes(self, k: int) -> List[FootprintNode]:
         footprint_node_list = self.nodes.values()
         sorted_footprint_node_list = sorted(
             footprint_node_list,
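
The rename above promotes the top-k helper to `Footprint`'s public API so the packaging generator can reuse it. A minimal sketch of a call site (assumes `Footprint` objects produced by an Olive run; `best_nodes_per_accelerator` is a hypothetical helper name, mirroring `_get_best_candidate_node` later in this patch):

```
from typing import Dict, List

from olive.engine.footprint import Footprint, FootprintNode


def best_nodes_per_accelerator(pf_footprints: Dict["AcceleratorSpec", Footprint]) -> List[FootprintNode]:
    """Collect each accelerator's single best node by metric priorities."""
    return [next(iter(fp.get_top_ranked_nodes(1))) for fp in pf_footprints.values() if fp.nodes]
```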
diff --git a/olive/engine/packaging/packaging_config.py b/olive/engine/packaging/packaging_config.py
index c67169b79..21f640c10 100644
--- a/olive/engine/packaging/packaging_config.py
+++ b/olive/engine/packaging/packaging_config.py
@@ -15,6 +15,7 @@ class PackagingType(str, Enum):
     Zipfile = "Zipfile"
     AzureMLModels = "AzureMLModels"
     AzureMLData = "AzureMLData"
+    AzureMLDeployment = "AzureMLDeployment"
 
 
 class CommonPackagingConfig(ConfigBase):
@@ -35,10 +36,59 @@ class AzureMLModelsPackagingConfig(CommonPackagingConfig):
     description: Optional[str] = None
 
 
+class InferencingServerType(str, Enum):
+    AzureMLOnline = "AzureMLOnline"
+    AzureMLBatch = "AzureMLBatch"
+
+
+class InferenceServerConfig(ConfigBase):
+    type: InferencingServerType
+    code_folder: str
+    scoring_script: str
+
+
+class AzureMLModelModeType(str, Enum):
+    download = "download"
+    copy = "copy"
+
+
+class ModelConfigurationConfig(ConfigBase):
+    mode: AzureMLModelModeType = AzureMLModelModeType.download
+    mount_path: Optional[str] = None  # Relative mount path
+
+
+class ModelPackageConfig(ConfigBase):
+    target_environment: str = "olive-target-environment"
+    target_environment_version: Optional[str] = None
+    inferencing_server: InferenceServerConfig
+    base_environment_id: str
+    model_configurations: Optional[ModelConfigurationConfig] = None
+    environment_variables: Optional[dict] = None
+
+
+class DeploymentConfig(ConfigBase):
+    endpoint_name: str = "olive-default-endpoint"
+    deployment_name: str = "olive-default-deployment"
+    instance_type: Optional[str] = None
+    compute: Optional[str] = None
+    instance_count: int = 1
+    mini_batch_size: int = 10  # AzureMLBatch only
+    extra_config: Optional[dict] = None
+
+
+class AzureMLDeploymentPackagingConfig(CommonPackagingConfig):
+    model_name: str = "olive-deployment-model"
+    model_version: Union[int, str] = "1"
+    model_description: Optional[str] = None
+    model_package: ModelPackageConfig
+    deployment_config: DeploymentConfig = DeploymentConfig()
+
+
 _type_to_config = {
     PackagingType.Zipfile: ZipfilePackagingConfig,
     PackagingType.AzureMLModels: AzureMLModelsPackagingConfig,
     PackagingType.AzureMLData: AzureMLDataPackagingConfig,
+    PackagingType.AzureMLDeployment: AzureMLDeploymentPackagingConfig,
 }
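
For reference, a minimal sketch of how these new config classes compose once this patch is applied (all values are illustrative, `code_folder`/`scoring_script` must exist on disk, and the construction pattern mirrors the unit test later in this patch):

```
from olive.engine.packaging.packaging_config import (
    AzureMLDeploymentPackagingConfig,
    DeploymentConfig,
    InferenceServerConfig,
    InferencingServerType,
    ModelPackageConfig,
    PackagingConfig,
    PackagingType,
)

# Illustrative values only.
deployment_packaging = AzureMLDeploymentPackagingConfig(
    model_name="my-olive-model",
    model_version="1",
    model_package=ModelPackageConfig(
        inferencing_server=InferenceServerConfig(
            type=InferencingServerType.AzureMLOnline,
            code_folder="code",
            scoring_script="score.py",
        ),
        base_environment_id="azureml:olive-aml-packaging:1",
    ),
    deployment_config=DeploymentConfig(endpoint_name="my-endpoint"),
)

packaging_config = PackagingConfig()
packaging_config.type = PackagingType.AzureMLDeployment
packaging_config.config = deployment_packaging
```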
diff --git a/olive/engine/packaging/packaging_generator.py b/olive/engine/packaging/packaging_generator.py
index 8abfb3932..fe586df13 100644
--- a/olive/engine/packaging/packaging_generator.py
+++ b/olive/engine/packaging/packaging_generator.py
@@ -18,8 +18,8 @@
 from olive.common.utils import copy_dir, retry_func, run_subprocess
 from olive.engine.packaging.packaging_config import (
-    AzureMLDataPackagingConfig,
-    AzureMLModelsPackagingConfig,
+    AzureMLDeploymentPackagingConfig,
+    InferencingServerType,
     PackagingConfig,
     PackagingType,
 )
@@ -49,7 +49,233 @@ def generate_output_artifacts(
         return
     packaging_config_list = packaging_configs if isinstance(packaging_configs, list) else [packaging_configs]
     for packaging_config in packaging_config_list:
-        _package_candidate_models(packaging_config, output_dir, footprints, pf_footprints, azureml_client_config)
+        if packaging_config.type == PackagingType.AzureMLDeployment:
+            _package_azureml_deployment(packaging_config, footprints, pf_footprints, azureml_client_config)
+        else:
+            _package_candidate_models(packaging_config, output_dir, footprints, pf_footprints, azureml_client_config)
+
+
+def _package_azureml_deployment(
+    packaging_config: AzureMLDeploymentPackagingConfig,
+    footprints: Dict["AcceleratorSpec", "Footprint"],
+    pf_footprints: Dict["AcceleratorSpec", "Footprint"],
+    azureml_client_config: "AzureMLClientConfig" = None,
+):
+    from azure.ai.ml.entities import (
+        AzureMLBatchInferencingServer,
+        AzureMLOnlineInferencingServer,
+        BaseEnvironment,
+        BatchDeployment,
+        BatchEndpoint,
+        CodeConfiguration,
+        ManagedOnlineDeployment,
+        ManagedOnlineEndpoint,
+        ModelConfiguration,
+        ModelPackage,
+    )
+    from azure.core.exceptions import ResourceExistsError, ResourceNotFoundError, ServiceResponseError
+
+    config: AzureMLDeploymentPackagingConfig = packaging_config.config
+    if config.export_in_mlflow_format:
+        logger.warning("Exporting model in MLflow format is not supported for AzureML endpoint packaging.")
+
+    try:
+        # Get best model from footprint
+        best_node = _get_best_candidate_node(pf_footprints, footprints)
+
+        with tempfile.TemporaryDirectory() as temp_dir:
+            tempdir = Path(temp_dir)
+
+            model_config = best_node.model_config
+            _save_model(
+                model_config["config"].get("model_path", None),
+                model_config["type"],
+                model_config,
+                tempdir,
+                model_config["config"].get("inference_settings", None),
+                False,
+            )
+
+            # Register model to AzureML
+            _upload_to_azureml_models(
+                azureml_client_config,
+                tempdir,
+                config.model_name,
+                config.model_version,
+                config.model_description,
+                False,
+            )
+
+            ml_client = azureml_client_config.create_client()
+
+            # AzureML package config
+            model_package_config = config.model_package
+
+            code_folder = Path(model_package_config.inferencing_server.code_folder)
+            assert code_folder.exists(), f"Code folder {code_folder} does not exist."
+
+            scoring_script = code_folder / model_package_config.inferencing_server.scoring_script
+            assert scoring_script.exists(), f"Scoring script {scoring_script} does not exist."
+
+            code_configuration = CodeConfiguration(
+                code=model_package_config.inferencing_server.code_folder,
+                scoring_script=model_package_config.inferencing_server.scoring_script,
+            )
+
+            inferencing_server = None
+            if model_package_config.inferencing_server.type == InferencingServerType.AzureMLOnline:
+                inferencing_server = AzureMLOnlineInferencingServer(code_configuration=code_configuration)
+            elif model_package_config.inferencing_server.type == InferencingServerType.AzureMLBatch:
+                inferencing_server = AzureMLBatchInferencingServer(code_configuration=code_configuration)
+
+            model_configuration = None
+            if model_package_config.model_configurations:
+                model_configuration = ModelConfiguration(
+                    mode=model_package_config.model_configurations.mode,
+                    mount_path=model_package_config.model_configurations.mount_path,
+                )
+
+            base_environment_source = BaseEnvironment(
+                type="EnvironmentAsset", resource_id=model_package_config.base_environment_id
+            )
+
+            package_request = ModelPackage(
+                target_environment=model_package_config.target_environment,
+                inferencing_server=inferencing_server,
+                base_environment_source=base_environment_source,
+                target_environment_version=model_package_config.target_environment_version,
+                model_configuration=model_configuration,
+                environment_variables=model_package_config.environment_variables,
+            )
+
+            # invoke model package operation
+            model_package = retry_func(
+                func=ml_client.models.package,
+                kwargs={"name": config.model_name, "version": config.model_version, "package_request": package_request},
+                max_tries=azureml_client_config.max_operation_retries,
+                delay=azureml_client_config.operation_retry_interval,
+                exceptions=ServiceResponseError,
+            )
+
+            logger.info(
+                "Target environment created successfully: name: %s, version: %s",
+                model_package_config.target_environment,
+                model_package_config.target_environment_version,
+            )
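+
+            # The packaged environment returned by ml_client.models.package is
+            # what gets passed as `environment=` to the deployment created below.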
+
+            # Deploy model package
+            deployment_config = config.deployment_config
+
+            # Get endpoint
+            try:
+                endpoint = retry_func(
+                    ml_client.online_endpoints.get,
+                    [deployment_config.endpoint_name],
+                    max_tries=azureml_client_config.max_operation_retries,
+                    delay=azureml_client_config.operation_retry_interval,
+                    exceptions=ServiceResponseError,
+                )
+                logger.info(
+                    "Endpoint %s already exists. The scoring_uri is: %s",
+                    deployment_config.endpoint_name,
+                    endpoint.scoring_uri,
+                )
+            except ResourceNotFoundError:
+                logger.info("Endpoint %s does not exist. Creating a new endpoint...", deployment_config.endpoint_name)
+                if model_package_config.inferencing_server.type == InferencingServerType.AzureMLOnline:
+                    endpoint = ManagedOnlineEndpoint(
+                        name=deployment_config.endpoint_name,
+                        description="this is an endpoint created by Olive automatically",
+                    )
+                elif model_package_config.inferencing_server.type == InferencingServerType.AzureMLBatch:
+                    endpoint = BatchEndpoint(
+                        name=deployment_config.endpoint_name,
+                        description="this is an endpoint created by Olive automatically",
+                    )
+
+                endpoint = retry_func(
+                    ml_client.online_endpoints.begin_create_or_update,
+                    [endpoint],
+                    max_tries=azureml_client_config.max_operation_retries,
+                    delay=azureml_client_config.operation_retry_interval,
+                    exceptions=ServiceResponseError,
+                ).result()
+                logger.info(
+                    "Endpoint %s created successfully. The scoring_uri is: %s",
+                    deployment_config.endpoint_name,
+                    endpoint.scoring_uri,
+                )
+
+            deployment = None
+            extra_config = deployment_config.extra_config or {}
+            if model_package_config.inferencing_server.type == InferencingServerType.AzureMLOnline:
+                deployment = ManagedOnlineDeployment(
+                    name=deployment_config.deployment_name,
+                    endpoint_name=deployment_config.endpoint_name,
+                    environment=model_package,
+                    instance_type=deployment_config.instance_type,
+                    instance_count=deployment_config.instance_count,
+                    **extra_config,
+                )
+
+            elif model_package_config.inferencing_server.type == InferencingServerType.AzureMLBatch:
+                deployment = BatchDeployment(
+                    name=deployment_config.deployment_name,
+                    endpoint_name=deployment_config.endpoint_name,
+                    environment=model_package,
+                    compute=deployment_config.compute,
+                    mini_batch_size=deployment_config.mini_batch_size,
+                    **extra_config,
+                )
+            deployment = retry_func(
+                ml_client.online_deployments.begin_create_or_update,
+                [deployment],
+                max_tries=azureml_client_config.max_operation_retries,
+                delay=azureml_client_config.operation_retry_interval,
+                exceptions=ServiceResponseError,
+            ).result()
+            logger.info("Deployment %s created successfully", deployment.name)
+
+    except ResourceNotFoundError:
+        logger.exception(
+            "Failed to package AzureML deployment. The resource is not found. Please check the exception details."
+        )
+        raise
+    except ResourceExistsError:
+        logger.exception(
+            "Failed to package AzureML deployment. The resource already exists. Please check the exception details."
+        )
+        raise
+    except Exception:
+        logger.exception("Failed to package AzureML deployment. Please check the exception details.")
+        raise
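+
+
+# Ranking note for _get_best_candidate_node below: each accelerator contributes
+# its top node, then nodes are sorted by the metric tuple; metrics whose
+# cmp_direction is not 1 are negated so one descending sort picks the winner.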
+def _get_best_candidate_node(
+    pf_footprints: Dict["AcceleratorSpec", "Footprint"], footprints: Dict["AcceleratorSpec", "Footprint"]
+):
+    objective_dict = next(iter(pf_footprints.values())).objective_dict
+    top_nodes = []
+    for accelerator_spec, pf_footprint in pf_footprints.items():
+        footprint = footprints[accelerator_spec]
+        if pf_footprint.nodes and footprint.nodes:
+            top_nodes.append(next(iter(pf_footprint.get_top_ranked_nodes(1))))
+    return next(
+        iter(
+            sorted(
+                top_nodes,
+                key=lambda x: tuple(
+                    (
+                        x.metrics.value[metric].value
+                        if x.metrics.cmp_direction[metric] == 1
+                        else -x.metrics.value[metric].value
+                    )
+                    for metric in objective_dict
+                ),
+                reverse=True,
+            )
+        )
+    )
 
 
 def _package_candidate_models(
@@ -100,7 +326,12 @@ def _package_candidate_models(
                 _copy_configurations(model_dir, footprint, model_id)
                 _copy_metrics(model_dir, input_node, node)
                 model_path = _save_model(
-                    pf_footprint, model_id, model_dir, inference_config, export_in_mlflow_format
+                    pf_footprint.get_model_path(model_id),
+                    pf_footprint.get_model_type(model_id),
+                    pf_footprint.get_model_config(model_id),
+                    model_dir,
+                    inference_config,
+                    export_in_mlflow_format,
                 )
 
             model_info_list = []
@@ -110,9 +341,18 @@ def _package_candidate_models(
             _copy_model_info(model_dir, model_info)
 
             if packaging_type == PackagingType.AzureMLModels:
-                _upload_to_azureml_models(azureml_client_config, model_dir, model_name, config)
+                _upload_to_azureml_models(
+                    azureml_client_config,
+                    model_dir,
+                    model_name,
+                    config.version,
+                    config.description,
+                    export_in_mlflow_format,
+                )
             elif packaging_type == PackagingType.AzureMLData:
-                _upload_to_azureml_data(azureml_client_config, model_dir, model_name, config)
+                _upload_to_azureml_data(
+                    azureml_client_config, model_dir, model_name, config.version, config.description
+                )
 
             model_rank += 1
 
@@ -125,7 +365,9 @@
 def _upload_to_azureml_models(
     azureml_client_config: "AzureMLClientConfig",
     model_path: Path,
     model_name: str,
-    config: AzureMLModelsPackagingConfig,
+    version: Union[int, str],
+    description: str,
+    export_in_mlflow_format: bool,
 ):
     """Upload model to AzureML workspace Models."""
     from azure.ai.ml.constants import AssetTypes
@@ -135,10 +377,10 @@
     ml_client = azureml_client_config.create_client()
     model = Model(
         path=model_path,
-        type=AssetTypes.MLFLOW_MODEL if config.export_in_mlflow_format else AssetTypes.CUSTOM_MODEL,
+        type=AssetTypes.MLFLOW_MODEL if export_in_mlflow_format else AssetTypes.CUSTOM_MODEL,
         name=model_name,
-        version=str(config.version),
-        description=config.description,
+        version=str(version),
+        description=description,
     )
     retry_func(
         ml_client.models.create_or_update,
@@ -150,7 +392,11 @@
 def _upload_to_azureml_data(
-    azureml_client_config: "AzureMLClientConfig", model_path: Path, model_name: str, config: AzureMLDataPackagingConfig
+    azureml_client_config: "AzureMLClientConfig",
+    model_path: Path,
+    model_name: str,
+    version: Union[int, str],
+    description: str,
 ):
     """Upload model as Data to AzureML workspace Data."""
     from azure.ai.ml.constants import AssetTypes
@@ -161,9 +407,9 @@
     data = Data(
         path=str(model_path),
         type=AssetTypes.URI_FILE if model_path.is_file() else AssetTypes.URI_FOLDER,
-        description=config.description,
+        description=description,
         name=model_name,
-        version=str(config.version),
+        version=str(version),
     )
     retry_func(
         ml_client.data.create_or_update,
@@ -230,15 +476,14 @@ def _copy_metrics(model_dir: Path, input_node: "FootprintNode", node: "Footprint
 
 
 def _save_model(
-    pf_footprint: "Footprint",
-    model_id: str,
+    model_path: str,
+    model_type: str,
+    model_config: Dict,
     saved_model_path: Path,
     inference_config: Dict,
     export_in_mlflow_format: bool,
 ):
-    model_path = pf_footprint.get_model_path(model_id)
     model_resource_path = create_resource_path(model_path) if model_path else None
-    model_type = pf_footprint.get_model_type(model_id)
 
     if model_type.lower() == "onnxmodel":
         with tempfile.TemporaryDirectory(dir=saved_model_path, prefix="olive_tmp") as model_tempdir:
@@ -251,7 +496,6 @@
             elif temp_resource_path.type == ResourceType.LocalFolder:
                 # if model_path is a folder, save all files in the folder to model_dir / file_name
                 # file_name for .onnx file is model.onnx, otherwise keep the original file name
-                model_config = pf_footprint.get_model_config(model_id)
                 onnx_file_name = model_config.get("onnx_file_name")
                 onnx_model = ONNXModelHandler(temp_resource_path, onnx_file_name)
                 model_name = Path(onnx_model.model_path).name
diff --git a/test/unit_test/engine/packaging/code/score.py b/test/unit_test/engine/packaging/code/score.py
new file mode 100644
index 000000000..862c45ce3
--- /dev/null
+++ b/test/unit_test/engine/packaging/code/score.py
@@ -0,0 +1,4 @@
+# -------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+# --------------------------------------------------------------------------
diff --git a/test/unit_test/engine/packaging/test_packaging_generator.py b/test/unit_test/engine/packaging/test_packaging_generator.py
index 86b27fb46..f9d69c3ae 100644
--- a/test/unit_test/engine/packaging/test_packaging_generator.py
+++ b/test/unit_test/engine/packaging/test_packaging_generator.py
@@ -17,7 +17,12 @@
 from olive.engine.footprint import Footprint, FootprintNode
 from olive.engine.packaging.packaging_config import (
     AzureMLDataPackagingConfig,
+    AzureMLDeploymentPackagingConfig,
     AzureMLModelsPackagingConfig,
+    DeploymentConfig,
+    InferenceServerConfig,
+    InferencingServerType,
+    ModelPackageConfig,
     PackagingConfig,
     PackagingType,
 )
@@ -328,6 +333,158 @@ def test_generate_azureml_data(mock_create_resource_path, mock_retry_func):
+@patch("olive.engine.packaging.packaging_generator.retry_func")
+@pytest.mark.parametrize(
+    "inferencing_server_type",
+    [InferencingServerType.AzureMLOnline, InferencingServerType.AzureMLBatch],
+)
+def test_azureml_deployment(mock_retry_func, inferencing_server_type):
+    from azure.ai.ml.constants import AssetTypes
+    from azure.ai.ml.entities import (
+        AzureMLBatchInferencingServer,
+        AzureMLOnlineInferencingServer,
+        BaseEnvironment,
+        BatchDeployment,
+        CodeConfiguration,
+        ManagedOnlineDeployment,
+        Model,
+        ModelPackage,
+    )
+    from azure.core.exceptions import ServiceResponseError
+
+    # setup
+    model_id = "model_id"
+    model_name = "olive_model"
+    model_version = "1"
+    model_path = "fake_model_file"
+    endpoint_name = "package-test-endpoint"
+    deployment_name = "package-test-deployment"
+    code_folder = str(Path(__file__).parent / "code")
+    scoring_script = "score.py"
+    base_environment_id = "base_environment_id"
+    target_environment = "target_environment"
+    target_environment_version = "1"
+
+    footprints = get_footprints(model_id, model_path)
+    azureml_client_config = Mock(max_operation_retries=3, operation_retry_interval=5)
+    ml_client_mock = Mock()
+    azureml_client_config.create_client.return_value = ml_client_mock
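+
+    # Build the entities the generator is expected to construct, so the
+    # retry_func call arguments asserted below have known counterparts.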
+    model = Model(
+        path=model_path,
+        name=model_name,
+        version=model_version,
+        type=AssetTypes.CUSTOM_MODEL,
+    )
+
+    code_configuration = CodeConfiguration(
+        code=code_folder,
+        scoring_script=scoring_script,
+    )
+    model_package_mock = Mock()
+
+    inferencing_server = None
+    if inferencing_server_type == InferencingServerType.AzureMLOnline:
+        inferencing_server = AzureMLOnlineInferencingServer(code_configuration=code_configuration)
+        deployment = ManagedOnlineDeployment(
+            name=deployment_name,
+            endpoint_name=endpoint_name,
+            environment=model_package_mock,
+        )
+    elif inferencing_server_type == InferencingServerType.AzureMLBatch:
+        inferencing_server = AzureMLBatchInferencingServer(code_configuration=code_configuration)
+        deployment = BatchDeployment(
+            name=deployment_name,
+            endpoint_name=endpoint_name,
+            environment=model_package_mock,
+        )
+
+    base_environment_source = BaseEnvironment(type="EnvironmentAsset", resource_id=base_environment_id)
+
+    package_request = ModelPackage(
+        target_environment=target_environment,
+        inferencing_server=inferencing_server,
+        base_environment_source=base_environment_source,
+        target_environment_version=target_environment_version,
+        model_configuration=None,
+        environment_variables=None,
+    )
+
+    # Make every patched retry_func call return the package mock; the generator
+    # uses it as the model package and as the deployment environment.
+    mock_retry_func.return_value = model_package_mock
+
+    inference_server_config = InferenceServerConfig(
+        type=inferencing_server_type,
+        code_folder=code_folder,
+        scoring_script=scoring_script,
+    )
+
+    model_package_config = ModelPackageConfig(
+        target_environment=target_environment,
+        target_environment_version=target_environment_version,
+        inferencing_server=inference_server_config,
+        base_environment_id=base_environment_id,
+    )
+
+    deployment_config = DeploymentConfig(
+        endpoint_name=endpoint_name,
+        deployment_name=deployment_name,
+    )
+
+    config = AzureMLDeploymentPackagingConfig(
+        model_name=model_name,
+        model_version=model_version,
+        model_package=model_package_config,
+        deployment_config=deployment_config,
+    )
+    packaging_config = PackagingConfig()
+    packaging_config.type = PackagingType.AzureMLDeployment
+    packaging_config.config = config
+
+    # execute
+    generate_output_artifacts(
+        packaging_config, footprints, footprints, output_dir=Path("output"), azureml_client_config=azureml_client_config
+    )
+
+    # assert
+    assert mock_retry_func.call_once_with(
+        ml_client_mock.models.create_or_update,
+        [model],
+        max_tries=azureml_client_config.max_operation_retries,
+        delay=azureml_client_config.operation_retry_interval,
+        exceptions=ServiceResponseError,
+    )
+
+    assert mock_retry_func.call_once_with(
+        ml_client_mock.models.package,
+        kwargs={"name": model_name, "version": model_version, "package_request": package_request},
+        max_tries=azureml_client_config.max_operation_retries,
+        delay=azureml_client_config.operation_retry_interval,
+        exceptions=ServiceResponseError,
+    )
+
+    assert mock_retry_func.call_once_with(
+        ml_client_mock.online_endpoints.get,
+        [endpoint_name],
+        max_tries=azureml_client_config.max_operation_retries,
+        delay=azureml_client_config.operation_retry_interval,
+        exceptions=ServiceResponseError,
+    )
+
+    assert mock_retry_func.call_once_with(
+        ml_client_mock.online_deployments.begin_create_or_update,
+        [deployment],
max_tries=azureml_client_config.max_operation_retries, + delay=azureml_client_config.operation_retry_interval, + exceptions=ServiceResponseError, + ) + + def get_footprints(model_id, model_path): acc_spec = AcceleratorSpec(accelerator_type="cpu", execution_provider="CPUExecutionProvider") model_config = {"config": {"model_path": model_path}, "type": "ONNXModel"}