Deploy Pipelines via the Wallaroo SDK
Deploy a Pipeline
When a pipeline step is added or removed, the pipeline must be deployed through the wallaroo.pipeline.Pipeline.deploy(deployment_config, wait_for_status) method. This allocates resources to the pipeline from the Kubernetes environment and makes it available to accept data for inference requests. For full details on pipeline deployment configurations, see Wallaroo SDK Essentials Guide: Pipeline Deployment Configuration.
Deploy a Pipeline Parameters
The method wallaroo.pipeline.Pipeline.deploy takes the following parameters.
| Parameter | Type | Description |
|---|---|---|
| deployment_config | wallaroo.deployment_config.DeploymentConfig (Optional) | The number of resources to allocate to the cluster such as the number of cpus, amount of ram, etc, along with how many replicas, autoscaling, and other settings. For complete details, see Wallaroo SDK Essentials Guide: Pipeline Deployment Configuration. |
| wait_for_status | Boolean (Optional Default: True) | If True, the Python script waits until the pipeline either finishes deploying successfully or reaches the timeout before continuing. If False, the Python script continues without waiting. |
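For example, the following minimal sketch deploys a pipeline with an explicit deployment configuration and waits for the deployment to complete. The pipeline name, the previously uploaded model object sample_model, and the resource values are illustrative assumptions.

import wallaroo

wl = wallaroo.Client()

# build a pipeline and add a previously uploaded model as its step
# (sample_model is a hypothetical model object)
pipeline = wl.build_pipeline("sample-pipeline")
pipeline.add_model_step(sample_model)

# allocate 1 replica with 1 CPU and 1 Gi of memory to the deployment
deploy_config = (wallaroo.DeploymentConfigBuilder()
                 .replica_count(1)
                 .cpus(1)
                 .memory("1Gi")
                 .build())

# deploy and wait until the pipeline reports success or the timeout is reached
pipeline.deploy(deployment_config=deploy_config, wait_for_status=True)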
Deploy Pipelines Asynchronously
By default, wallaroo.pipeline.Pipeline.deploy sets wait_for_status=True. In this situation, the Python code pauses and displays the deployment progress until either:
- The pipeline successfully deploys, at which point the Python code continues.
- The pipeline deployment reaches its timeout.
- The pipeline errors out for one or more reasons. In these situations:
- The Python code is stopped if the exception is not caught.
- An error is displayed.
If wait_for_status=False, the Python code does not wait for the pipeline deployment to succeed or fail, but proceeds to the next instruction without pausing.
In these situations, before performing an inference request, use pipeline status to verify that the pipeline status is Running.
For example, the following code:
- Deploys three pipelines with wait_for_status=False.
- Before performing an inference request, verifies that the status of the pipeline sample_pipeline_1 is status='Running'.
for pipe, deploy_config in [(sample_pipeline_1, deployment_config_1), (sample_pipeline_2, deployment_config_2), (sample_pipeline_3, deployment_config_3)]:
    pipe.deploy(deployment_config=deploy_config, wait_for_status=False)

# perform other code

# check the pipeline status before performing an inference
if sample_pipeline_1.status()['status'] == 'Running':
    sample_pipeline_1.infer(sample_data)
Deploy a Pipeline Deployment Defaults
Deployment configurations default to the following*.
| Runtime | CPUs | Memory | GPUs |
|---|---|---|---|
| Wallaroo Native Runtime** | 4 | 3 Gi | 0 |
| Wallaroo Containerized Runtime*** | 2 | 1 Gi | 0 |
*: For Kubernetes limits and requests.
**: Resources are always allocated for the Wallaroo Native Runtime engine even if there are no Wallaroo Native Runtimes included in the deployment, so it is recommended to decrease these resources when pipelines use Containerized Runtimes.
***: Resources for Wallaroo Containerized Runtimes only apply when a Wallaroo Containerized Runtime is part of the deployment.
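As a hedged sketch of adjusting these defaults when a deployment includes a Wallaroo Containerized Runtime, the following reduces the Native Runtime engine resources and assigns resources to the containerized model. The model object containerized_model and the resource values are illustrative, and the sidekick_cpus and sidekick_memory builder calls are assumed here to set the per-model resources for containerized runtimes.

import wallaroo

# reduce the Native Runtime engine resources since only a containerized
# runtime is deployed (containerized_model is a hypothetical model object)
deployment_config = (wallaroo.DeploymentConfigBuilder()
                     .replica_count(1)
                     .cpus(0.5)
                     .memory("1Gi")
                     .sidekick_cpus(containerized_model, 1)
                     .sidekick_memory(containerized_model, "1Gi")
                     .build())

pipeline.deploy(deployment_config=deployment_config)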
Deploy a Pipeline with a New Deployment Configuration
Pipelines do not need to be undeployed to deploy new pipeline versions or pipeline deployment configurations. For example, the following pipeline is deployed, new pipeline steps are set, and the pipeline deploy command is issued again. This creates a new version of the pipeline and updates the deployed pipeline with the new configuration.
# clear all steps
pipeline.clear()
# set modelA as the step
pipeline.add_model_step(modelA)
# deploy the pipeline - the version is saved and the resources allocated to the pipeline
pipeline.deploy()
# clear the steps - this configuration is only stored in the local SDK session until the deploy or create_version command is given
pipeline.clear()
# set modelB as the step
pipeline.add_model_step(modelB)
# deploy the pipeline - the pipeline configuration is saved and the pipeline deployment updated without significant downtime
pipeline.deploy()
Model Deployment Architecture Inheritance
Deployment configurations inherit the model’s architecture setting. This is set during model upload by specifying the arch parameter. Models uploaded to Wallaroo default to the x86 architecture.
The following model operations inherit the model’s architecture setting.
- Model Deployment: Model deployment and the model deployment configuration inherit the model’s architecture. No specification of the architecture is required for model deployment.
- Pipeline Publishing: The Wallaroo engine used when a pipeline is containerized and published to an Open Container Initiative (OCI) Registry inherits the model’s architecture setting.
The following example shows uploading a model with the architecture set to ARM, and how the deployment inherits that architecture without additional deployment configuration changes. For this example, an ONNX model is uploaded.
import wallaroo
from wallaroo.framework import Framework

# wl is the Wallaroo client returned from wallaroo.Client()
housing_model_control_arm = (wl.upload_model(model_name_arm,
                                             model_file_name,
                                             framework=Framework.ONNX,
                                             arch=wallaroo.engine_config.Architecture.ARM)
                             .configure(tensor_fields=["tensor"])
                             )
display(housing_model_control_arm)
| Name | house-price-estimator-arm |
|---|---|
| Version | 163ff0a9-0f1a-4229-bbf2-a19e4385f10f |
| File Name | rf_model.onnx |
| SHA | e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6 |
| Status | ready |
| Image Path | None |
| Architecture | arm |
| Acceleration | None |
| Updated At | 2024-04-Mar 20:34:00 |
Note that in the deployment configuration settings, no architecture is specified. When pipeline_arm is displayed, the arch setting shows that it inherited the model’s arch setting.
pipeline_arm = wl.build_pipeline(arm_pipeline_name)
# set the model step with the ARM targeted model
pipeline_arm.add_model_step(housing_model_control_arm)
#minimum deployment config for this model
deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(1).memory("1Gi").build()
pipeline_arm.deploy(deployment_config = deploy_config)
Waiting for deployment - this will take up to 45s .......... ok
display(pipeline_arm)
| name | architecture-demonstration-arm |
|---|---|
| created | 2024-03-04 20:34:08.895396+00:00 |
| last_updated | 2024-03-04 21:52:01.894671+00:00 |
| deployed | True |
| arch | arm |
| accel | None |
| tags | |
| versions | 55d834b4-92c8-4a93-b78b-6a224f17f9c1, 98821b85-401a-4ab5-af8e-1b3126727069, 74571863-9eb0-47aa-8b5a-3bdaa7aa9f03, b72fb0db-e4b4-4936-a7cb-3d0fb7827a6f, 3ae70818-10f3-4f61-a998-dee5e2f00daf |
| steps | house-price-estimator-arm |
| published | True |
Deploy Current Pipeline Version
By default, deploying a Wallaroo pipeline will deploy the most current version. For example:
sample_pipeline = wl.build_pipeline("test-pipeline")
sample_pipeline.add_model_step(model)
sample_pipeline.deploy()
sample_pipeline.status()
{'status': 'Running',
'details': None,
'engines': [{'ip': '10.12.1.65',
'name': 'engine-778b65459-f9mt5',
'status': 'Running',
'reason': None,
'pipeline_statuses': {'pipelines': [{'id': 'imdb-pipeline',
'status': 'Running'}]},
'model_statuses': {'models': [{'name': 'embedder-o',
'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d',
'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4',
'status': 'Running'},
{'name': 'smodel-o',
'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19',
'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650',
'status': 'Running'}]}}],
'engine_lbs': [{'ip': '10.12.1.66',
'name': 'engine-lb-85846c64f8-ggg2t',
'status': 'Running',
'reason': None}]}
Deploy Previous Pipeline Version
Pipeline versions are deployed with the method wallaroo.pipeline_variant.deploy(deployment_name, model_configs, config: Optional[wallaroo.deployment_config.DeploymentConfig]). Note that the deployment_name and model_configs are required. The model_configs are retrieved with the wallaroo.pipeline_variant.model_configs() method.
The following demonstrates retrieving a previous version of a pipeline, deploying it, and retrieving the deployment status.
pipeline_version = pipeline.versions()[7]
display(pipeline_version.name())
display(pipeline_version)
pipeline_version.deploy("houseprice-estimator", pipeline_version.model_configs())
display(pipeline.status())
| name | houseprice-estimator |
|---|---|
| version | 92f2b4f3-494b-4d69-895f-9e767ac1869d |
| creation_time | 2023-11-Sep 20:49:17 |
| last_updated_time | 2023-11-Sep 20:49:17 |
| deployed | False |
| tags | |
| steps | house-price-rf-model |
{'status': 'Running',
'details': [],
'engines': [{'ip': '10.244.3.211',
'name': 'engine-578dc7cdcf-qkx5n',
'status': 'Running',
'reason': None,
'details': [],
'pipeline_statuses': {'pipelines': [{'id': 'houseprice-estimator',
'status': 'Running'}]},
'model_statuses': {'models': [{'name': 'house-price-rf-model',
'version': '616c2306-bf93-417b-9656-37bee6f14379',
'sha': 'e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6',
'status': 'Running'}]}}],
'engine_lbs': [{'ip': '10.244.4.243',
'name': 'engine-lb-584f54c899-2rtvg',
'status': 'Running',
'reason': None,
'details': []}],
'sidekicks': []}
Pipeline Status
Once complete, the pipeline status() command will show 'status':'Running'.
Pipeline deployments can be modified to enable autoscaling, allowing pipelines to allocate more or fewer resources based on need, by specifying the deployment_config optional parameter when the pipeline is deployed. If this optional parameter is not passed, the deployment uses the default values. For more information, see Manage Pipeline Deployment Configuration.
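As a hedged sketch, the following shows one way an autoscaling deployment configuration might be built and applied. The replica_autoscale_min_max and autoscale_cpu_utilization settings and their values are illustrative assumptions; consult the Pipeline Deployment Configuration guide for the supported options.

import wallaroo

# autoscale between 0 and 3 replicas based on CPU utilization
# (values shown are illustrative only)
autoscale_config = (wallaroo.DeploymentConfigBuilder()
                    .replica_autoscale_min_max(maximum=3, minimum=0)
                    .autoscale_cpu_utilization(75)
                    .cpus(1)
                    .memory("1Gi")
                    .build())

imdb_pipeline.deploy(deployment_config=autoscale_config)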
In the following example, the pipeline imdb-pipeline that contains two steps will be deployed with default deployment configuration:
imdb_pipeline.status
<bound method Pipeline.status of {'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}, {'ModelInference': {'models': [{'name': 'smodel-o', 'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19', 'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650'}]}}]"}>
imdb_pipeline.deploy()
Waiting for deployment - this will take up to 45s ...... ok
imdb_pipeline.status()
{'status': 'Running',
'details': None,
'engines': [{'ip': '10.12.1.65',
'name': 'engine-778b65459-f9mt5',
'status': 'Running',
'reason': None,
'pipeline_statuses': {'pipelines': [{'id': 'imdb-pipeline',
'status': 'Running'}]},
'model_statuses': {'models': [{'name': 'embedder-o',
'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d',
'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4',
'status': 'Running'},
{'name': 'smodel-o',
'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19',
'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650',
'status': 'Running'}]}}],
'engine_lbs': [{'ip': '10.12.1.66',
'name': 'engine-lb-85846c64f8-ggg2t',
'status': 'Running',
'reason': None}]}
Manage Pipeline Deployment Configuration
For full details on pipeline deployment configurations, see Wallaroo SDK Essentials Guide: Pipeline Deployment Configuration.
Troubleshooting Pipeline Deployment
If you deploy more pipelines than your environment can handle, or if you deploy more pipelines than your license allows, you may see an error like the following:
LimitError: You have reached a license limit in your Wallaroo instance. In order to add additional resources, you can remove some of your existing resources. If you have any questions contact us at community@wallaroo.ai: MAX_PIPELINES_LIMIT_EXCEEDED
Undeploy any unnecessary pipelines either through the SDK or through the Wallaroo Pipeline Dashboard, then attempt to redeploy the pipeline in question again.
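As a rough sketch of freeing resources from the SDK before retrying, the following lists the pipelines in the instance and undeploys one that is no longer needed; the pipeline name "old-test-pipeline" is hypothetical.

import wallaroo

wl = wallaroo.Client()

# review the pipelines in the instance and undeploy any that are not needed
for pipeline in wl.list_pipelines():
    if pipeline.name() == "old-test-pipeline":
        pipeline.undeploy()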
Undeploy a Pipeline
When a pipeline is not currently needed, it can be undeployed and its resources returned to the Kubernetes environment. To undeploy a pipeline, use the pipeline undeploy() command.
In this example, the aloha_pipeline will be undeployed:
aloha_pipeline.undeploy()
{'name': 'aloha-test-demo', 'create_time': datetime.datetime(2022, 3, 29, 20, 34, 3, 960957, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'aloha-2', 'version': 'a8e8abdc-c22f-416c-a13c-5fe162357430', 'sha': 'fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520'}]}}]"}
Get Pipeline URL Endpoint
The Pipeline URL Endpoint, or Pipeline Deploy URL, is used to submit data to a pipeline for an inference. It is retrieved through the pipeline’s _deployment._url() method.
In this example, the pipeline URL endpoint for the pipeline ccfraud_pipeline will be displayed:
ccfraud_pipeline._deployment._url()
'http://engine-lb.ccfraud-pipeline-1:29502/pipelines/ccfraud-pipeline'